
I'm writing an app that needs to handle more than 15,000 photos, and I want to store their EXIF and IPTC attributes in a database.

My initial approach is to use MySQL and create a table to store all the attributes, as suggested here.

However, most of the photos have up to 250 attributes. With 15,000 photos, that means up to 3.75 million rows, so almost 4 million. And this is only the beginning (I expect more photos in the future).
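A minimal sketch of that one-row-per-attribute layout, to make the row count concrete. It is an illustration under assumptions, not a recommended schema: SQLite in memory stands in for MySQL so the snippet runs anywhere, and the table names, column names, and the `EXIF:`/`IPTC:` attribute prefixes are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (
        id       INTEGER PRIMARY KEY,
        filename TEXT NOT NULL
    );
    CREATE TABLE photo_attributes (
        photo_id INTEGER NOT NULL REFERENCES photos(id),
        name     TEXT NOT NULL,   -- e.g. 'EXIF:Model' (hypothetical naming)
        value    TEXT,
        PRIMARY KEY (photo_id, name)
    );
    -- Index so a lookup by attribute name and value avoids a full table scan.
    CREATE INDEX idx_attr_name_value ON photo_attributes (name, value);
""")

def add_photo(filename, attributes):
    """Insert one photo plus one row per attribute; return the photo id."""
    cur = conn.execute("INSERT INTO photos (filename) VALUES (?)", (filename,))
    photo_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO photo_attributes (photo_id, name, value) VALUES (?, ?, ?)",
        [(photo_id, name, value) for name, value in attributes.items()],
    )
    return photo_id

def find_photos(name, value):
    """Return filenames of photos whose attribute `name` equals `value`."""
    rows = conn.execute(
        "SELECT p.filename FROM photos p "
        "JOIN photo_attributes a ON a.photo_id = p.id "
        "WHERE a.name = ? AND a.value = ? ORDER BY p.filename",
        (name, value),
    )
    return [row[0] for row in rows]

add_photo("beach.jpg", {"EXIF:Model": "Canon EOS 5D", "IPTC:City": "Valencia"})
add_photo("street.jpg", {"EXIF:Model": "Nikon D90", "IPTC:City": "Valencia"})

# Upper bound from the figures above: 15,000 photos x 250 attributes each.
estimated_rows = 15_000 * 250
```

Here `find_photos("IPTC:City", "Valencia")` matches both sample photos, and `estimated_rows` is the upper bound on rows in `photo_attributes` for the current collection.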

I wonder whether MySQL would be OK in this scenario, or whether I should move to a NoSQL approach like MongoDB.

Please also note that I need to make the database searchable.

Thanks in advance.

jävi
  • 4 million rows is not large in some contexts. You haven't specified your hardware specs or latency requirements (overnight batches vs instant gratification). Have you considered if you really need every attribute? Do you have a database machine to build a test database with and do performance testing? – patrickmdnet Dec 27 '12 at 20:18
  • I'm afraid I don't have the machine specs yet. And yes, I want all the attributes. I'm not a database expert, but 4 million rows seems like a lot of data to me for a simple app. Is it not? – jävi Dec 27 '12 at 21:08
  • Not if you need all the data. – Robert Harvey Dec 28 '12 at 18:51

1 Answer


If you're a .NET developer, RavenDB is ideal for your scenario. It can easily handle that volume on very modest hardware, and it has outstanding search capabilities thanks to its internal use of the Lucene search engine.

The photos themselves would be stored as attachments, while the attributes would be part of the document.
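As a rough illustration of "attributes as part of the document", one photo could map to a single document like the one below. This is a sketch under assumptions, not RavenDB's API: the field names, the `exif`/`iptc` grouping, and the id value are all hypothetical, and plain Python with `json` is used only to show the shape.

```python
import json

# Hypothetical document shape: one document per photo, attributes nested inside.
photo_doc = {
    "id": "photos/1",  # illustrative id, not generated by any database
    "filename": "beach.jpg",
    "exif": {"Model": "Canon EOS 5D", "FNumber": "2.8"},
    "iptc": {"City": "Valencia", "Keywords": ["beach", "summer"]},
}

def count_attributes(doc):
    """Count the top-level EXIF and IPTC attributes stored in one document."""
    return sum(len(doc.get(group, {})) for group in ("exif", "iptc"))

# The whole photo, with all its attributes, round-trips as one JSON document.
serialized = json.dumps(photo_doc, sort_keys=True)
restored = json.loads(serialized)
```

With this shape, a photo with 250 attributes is still one document rather than 250 rows, so 15,000 photos means 15,000 documents.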

Even if you're not a .NET developer, RavenDB can be used over HTTP/REST from any language. It's just much easier with the native .NET client.

Matt Johnson-Pint
  • Are you recommending RavenDB because it is a document-oriented database? What precludes the use of an ordinary SQL database? – Robert Harvey Dec 28 '12 at 18:49
  • 1) Volume - Raven can scale to this size without expensive hardware. 2) Search - Robust full-text searching on any of the attributes, including partial string matches and suggested search results; very Google-like search capabilities are baked in. 3) The OP expects more photos on a regular basis. SQL blocks reads while updating indexes during the write. Raven does indexing in the background, optimizing for fast reads. 4) I'm biased. Mongo or Couch might be OK also. I like Raven. :) – Matt Johnson-Pint Dec 28 '12 at 19:00
  • Actually, I'm a Rails developer. Do you think Mongo would work as well as Raven? – jävi Dec 29 '12 at 11:44
  • MongoDB and CouchDB would both handle the scale, but they do not have full-text search built in. I understand that there are add-ons that can provide that functionality. Personally, I think Raven is vastly superior to either, but Mongo is a very mature product that will likely fit your needs. Perhaps a MongoDB person can chime in here to address your specifics. You can still use Raven if you want, but there is no native client for Rails, so you would have to use the REST API. – Matt Johnson-Pint Dec 29 '12 at 15:10