1

I have created a mysql table which has the crime count, Crime description, Crime Category and address of crime. I have created some reports over this table. The user wants to have a search by address filter in the report. so we are going to be using a where clause on table and have a condition over street.

The problem is that street address is quite a large string and searching/filtering the table over address when the table is already quite big will take a lot of time. I tried using some hashing like md5(streetaddress) but that did not help either. The query become very slow with this kind of where clause

example

select * from crimedata where streetaddress = "41 BENNETT RD Watertown  Massachusetts United States"

Will indexing the streetaddress help in this case or should I use some kind of hashing to make this kind of string search faster in the table?

Shah

Arif
  • 377
  • 2
  • 5
  • 21
  • 1
    Yes, indexing will definitely make a difference – Peter van der Wal Oct 14 '14 at 18:26
  • You say that such queries "will take a lot of time." DO they take a lot of time, or are you simply assuming they will? MySQL is fairly good at this kind of lookup. Hashing isn't worth the trouble, and indexing is. Your real problem will be the exact match required for this kind of `WHERE` clause. You might consider FULLTEXT searching, about which you can read. – O. Jones Oct 14 '14 at 18:28

3 Answers3

0

Adding an index on streetaddress will help a bit but limited.

You may want to consider changing your storage engine to something that supports fulltext search. An example is Mroonga

NOTE: I am not associated with Mroonga. I just had a chance to use the library before and found that it does provide improvement in text search.

gmarintes
  • 1,288
  • 12
  • 16
  • Both InnoDB and MyISAM support full-text indexing in MySQL 5.6, though using MyISAM for anything critical is probably a very bad idea. – tadman Oct 14 '14 at 18:43
0

You could try properly normalizing your data, where addresses are stored in one table and referenced by ID in another.

Your query should look like?:

SELECT ... FROM crimedata WHERE address_id=?

Where that ? is a placeholder for the ID of the address you fetch from the other table.

As always, anything that shows up repeatedly in a WHERE clause as a condition is a strong candidate for being indexed.

tadman
  • 208,517
  • 23
  • 234
  • 262
0

I would take a step back and see if you are attacking the problem in a way that is going to scale.

I would look at using geospatial information to do your queries on then use the street address as an output display parameter.

If you use the GIS object to store things like a point then you'll be able to do radius searches and bounding box queries in the future.

Your coding would change when someone enters in a street address to convert to either lat/long or point. Then when doing searches it will go much quicker since you won't be doing full text searches. It will give you the ability to call mapping API to show the address or place location on public mapping services.

http://mysqlserverteam.com/mysql-5-7-and-gis-an-example/

[Yes, of course scaling something like this out to a global scale would take out of the realm of databases into bigdata world]

Jeff Muzzy
  • 94
  • 6