2

I want to share a db of information (eg, geonames.db.94y.info, a custom version of the geonames db) usefully, but safely. Is there a good way to do this?:

  • People shouldn't be forced to download the db to get useful information.

  • My current solution is to allow arbitrary SELECT queries with a max CPU time of 5s.

  • SELECT queries let you get SOME useful information, like the world's 50 most populated "cities", and the total population in all of the world's cities:

http://7cdf2e8dd01ab370c96c4d3daf67c5fb.geonames.db.94y.info/ http://d6da02fc16ae6e5afbf2a4c60b1e4e5c.geonames.db.94y.info/

  • We see that the most populated city (Sao Paulo, 9839066) has about 1.19% of the world's urban population (829523357). We can even write an (inefficient) query for this:

http://245bd4d07c2c5179dd37d8f7520ee7cf.geonames.db.94y.info/

  • How about the top 2 cities combined? What percentage of the world's urban population are they? Top 3 combined? Top n combined?

  • For this and more complex examples, we need a PL, or perhaps something more. While I'm OK w/ giving people readonly db access, I'm not sure running arbitrary PL commands would be safe.

  • Note: I realize MySQL can do "accumulative" queries like this, but:

    • It's not standard SQL

    • I'm using SQLite3

    • It requires multiple queries, but I only allow one query at a time

    • I'm sure there are examples that even MySQL couldn't handle.

  • Basically, I want to let people get whatever info they want out of the db w/o having to download it, provided they don't change the data or eat up too much CPU time.

Is there a known solution here? I know about Drupal and other CMSs, but I don't think they can do this sort of thing?

[Note: I'm using geonames.db as an example, but it's in testing, so don't rely on the data/results]

  • As a general rule you should validate any input from a user before running a query against your database. Off the top of my head without parsing the input queries I don't think you want to allow arbitrary queries unless your users are all trusted (e.g. all admins for example), but that's still a security violation. – Yzmir Ramirez Jan 16 '11 at 21:32
  • Well, I only allow SELECT queries and only accept certain characters in the query. I agree allowing arbitrary SELECT queries isn't ideal, but the sad part is that it's not even sufficient. I'm trading security for convenience and failing at both. –  Jan 17 '11 at 03:19

0 Answers0