29 October 2010

Thinking before typing (MySQL spatial indices gotcha)

"SQL, Lisp, and Haskell are the only programming languages that I’ve seen where one spends more time thinking than typing." --Philip Greenspun


MySQL, I just learned, has a geometry data type and supports R-tree indices, which could be very helpful for a new project I'm exploring. I wanted to combine a geometric lookup with a temporal lookup, but discovered that the naive way of doing so has some pitfalls.

Here was idea #1:

create table gt4 (g point not null, t datetime not null, spatial index(g), index(t), index(t,g));

If I put a few sample values into that table:

insert into gt4 (g,t) values (GeomFromText('Point(8 8)'), '2008-01-01 11:55');
insert into gt4 (g,t) values (GeomFromText('Point(81 80)'), '2009-01-01 11:55');
insert into gt4 (g,t) values (GeomFromText('Point(1 0)'), '2008-01-01 11:55');

...it worked as expected:

> select t, AsText(g) from gt4 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
+------+------------+
| t | AsText(g) |
+------+------------+
| 10 | POINT(8 8) |
| 10 | POINT(1 0) |
+------+------------+
2 rows in set (0.00 sec)


But, if I were to make a very small change---which is to say, reversing the arguments of the last index()...


create table gt5 (g point not null, t datetime not null, spatial index(g), index(t), index(g,t));

If we reinsert the same values, things seem to work:

insert into gt5 (g,t) values (GeomFromText('Point(8 8)'), '2008-01-01 11:55');
insert into gt5 (g,t) values (GeomFromText('Point(81 80)'), '2009-01-01 11:55');
select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
+---------------------+------------+
| t | AsText(g) |
+---------------------+------------+
| 2008-01-01 11:55:00 | POINT(8 8) |
+---------------------+------------+
1 row in set (0.00 sec)

But if we add that last row...


insert into gt5 (g,t) values (GeomFromText('Point(1 0)'), '2009-01-01 11:55');


Then select...


> select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
Empty set (0.00 sec)


Say what?

MySQL uses a different index in each case. For table gt4, it picks the spatial index:


mysql> show create table gt4\G
*************************** 1. row ***************************
Table: gt4
Create Table: CREATE TABLE `gt4` (
`g` point NOT NULL,
`t` int(11) DEFAULT NULL,
SPATIAL KEY `g` (`g`),
KEY `t` (`t`),
KEY `t_2` (`t`,`g`(25))
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)


mysql> explain select t, AsText(g) from gt4 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gt4
type: range
possible_keys: g
key: g
key_len: 34
ref: NULL
rows: 2
Extra: Using where
1 row in set (0.00 sec)



Whereas for gt5, it uses the combined index:


mysql> show create table gt5\G
*************************** 1. row ***************************
Table: gt5
Create Table: CREATE TABLE `gt5` (
`g` point NOT NULL,
`t` datetime NOT NULL,
SPATIAL KEY `g` (`g`),
KEY `t` (`t`),
KEY `g_2` (`g`(25),`t`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> explain select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gt5
type: range
possible_keys: g,g_2
key: g_2
key_len: 27
ref: NULL
rows: 1
Extra: Using where
1 row in set (0.00 sec)


The documentation explains that combined indices concatenate the columns together. Apparently, this doesn't play nicely with geometry queries.

No comments:

Post a Comment

About Me

blog at barillari dot org Older posts at http://barillari.org/blog