0

Currently we use Oracle for storing images in the application, but we expect to see a lot of images and videos. We would like to move away from Oracle so that we can shard easily and achieve high throughput. Any recommendations?

Has anyone tried using NoSQL databases such as Couchbase or MongoDB for this purpose? Are they optimized for it?

I see that Cloudinary uses Amazon S3 for this purpose. But because of privacy concerns, I am looking for something that can be deployed in our own datacenter.

4 Answers

2

From your problem description, I can't see any indication for or against a NoSQL database.

Having media like pictures, sound, or video in a database means just having a large uninterpreted binary object. Uninterpreted means: the database can store and deliver the binary, but it can't analyze it for its properties, use it as a basis for queries, and the like (which is what databases are made for).

Both relational and non-relational databases provide data types for that kind of BLOB. The features in which they differ are, for example,

  • tabular vs. tree-structured data - not applicable to the BLOB, as it will be a single attribute, no matter how large it becomes,

  • different sorts of transaction logic (CAP theorem), which the BLOB question doesn't touch.

So I'm afraid your architecture will need to be decided on a much broader basis than just your media data. What are your data structures? What are your query and update scenarios?

TAM
  • Thanks for the reply. Currently we do have a relational database which stores some metadata about images. This relational DB stores information about a million other things as well, so it is really hard to move away from it for now. What I am really looking for is an optimized file storage mechanism that can be scaled easily for high volume and used in conjunction with my current RDB. Since this is only for storing images, transactions are not a big concern. A 2PC protocol is fine. – user3731623 May 14 '16 at 03:20
  • Have you looked at solutions like Nuxeo? – Laurent Doguin May 16 '16 at 22:06
2

What I see people do with Couchbase is store all of the metadata about the image in a JSON document in Couchbase, but host the image itself in something optimized for files. You get the best of both worlds. In the kind of use case you mention, in my experience a NoSQL database will be much better than a relational database.

Having managed very large relational and NoSQL databases with blobs in them, IMO it is a terrible idea in most cases, regardless of the database type. So I wrote up this blog post for just such a situation.
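
The split described above (metadata in the database, image bytes in a store built for files) can be sketched roughly as follows. This is only an illustration under stated assumptions: `sqlite3` stands in for whatever database holds the metadata, a local directory stands in for the file or object store, and the function names, table layout, and SHA-256 naming scheme are all hypothetical choices, not anything Couchbase or Oracle prescribes.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_db() -> sqlite3.Connection:
    """Create an in-memory metadata table (stand-in for the real database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE images ("
        "id TEXT PRIMARY KEY, filename TEXT, size_bytes INTEGER, path TEXT)"
    )
    return db


def store_image(db: sqlite3.Connection, root: Path, data: bytes, filename: str) -> str:
    """Write the bytes to the file store and record only metadata in the database."""
    digest = hashlib.sha256(data).hexdigest()
    # Fan out into subdirectories so no single directory grows too large.
    path = root / digest[:2] / digest[2:4] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # identical content is stored only once
        path.write_bytes(data)
    db.execute(
        "INSERT OR REPLACE INTO images (id, filename, size_bytes, path) "
        "VALUES (?, ?, ?, ?)",
        (digest, filename, len(data), str(path)),
    )
    db.commit()
    return digest


def load_image(db: sqlite3.Connection, image_id: str) -> bytes:
    """Look up the location in the database, then read the bytes from the file store."""
    row = db.execute("SELECT path FROM images WHERE id = ?", (image_id,)).fetchone()
    if row is None:
        raise KeyError(image_id)
    return Path(row[0]).read_bytes()


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    db = open_db()
    image_id = store_image(db, root, b"not really an image", "example.png")
    print(image_id, len(load_image(db, image_id)))
```

The database row stays small and queryable, while the bytes live wherever file storage is cheapest to scale. In a real deployment the `path` column would more likely hold an object-store key than a local path.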

NoSQLKnowHow
  • Thanks for the reply. Actually, I read your blog post before posting this question. It is a very interesting read. Currently we do have a relational database which stores some metadata about images. This relational DB stores information about a million other things, so it is really hard to move away from it for now. What I am really looking for is an optimized file storage mechanism that can be scaled easily for high volume and used in conjunction with my current RDB. – user3731623 May 14 '16 at 03:16
0

As you are looking for a private deployment in your own data center, you may consider MongoDB or OpenStack Swift.

I have seen people using MongoDB GridFS (https://docs.mongodb.com/manual/core/gridfs/) for storing images and videos. The advantages of using GridFS:

  1. You can use a MongoDB replica set for fault tolerance and high availability.
  2. You can access a portion of a large file without loading the whole file into memory. Because MongoDB splits each file into small chunks (255 kB by default), a video can be streamed chunk by chunk.
  3. You can scale using MongoDB sharding.
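
To make point 2 concrete, here is a small sketch of the chunk arithmetic behind a ranged read. It is not the GridFS API and does not talk to MongoDB; the in-memory dict and the function names are hypothetical stand-ins that only show why reading a slice of a large file touches a few chunks rather than the whole file. The 255 kB chunk size matches the GridFS default.

```python
from typing import Dict

CHUNK_SIZE = 255 * 1024  # 261,120 bytes, the GridFS default chunk size


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[int, bytes]:
    """Split a file into numbered chunks (stand-in for one stored document per chunk)."""
    return {
        n: data[offset:offset + chunk_size]
        for n, offset in enumerate(range(0, len(data), chunk_size))
    }


def chunks_for_range(start: int, length: int, chunk_size: int = CHUNK_SIZE) -> range:
    """Return the chunk numbers that cover bytes [start, start + length)."""
    if length <= 0:
        return range(0)
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return range(first, last + 1)


def read_range(chunks: Dict[int, bytes], start: int, length: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Read a byte range by fetching only the chunks that overlap it."""
    needed = chunks_for_range(start, length, chunk_size)
    if not needed:
        return b""
    # Chunks past the end of the file simply do not exist and are skipped.
    joined = b"".join(chunks[n] for n in needed if n in chunks)
    offset = start - needed.start * chunk_size
    return joined[offset:offset + length]


if __name__ == "__main__":
    video = bytes(i % 251 for i in range(1_000_000))
    chunks = split_into_chunks(video)
    part = read_range(chunks, 500_000, 1_000)
    print(len(chunks), list(chunks_for_range(500_000, 1_000)), part == video[500_000:501_000])
```

In a real deployment the MongoDB driver does this bookkeeping for you; the point is only that a seek into the middle of a video maps to a small, predictable set of chunk lookups.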

OpenStack Swift is a highly available, distributed, eventually consistent object/blob store comparable to Amazon S3, which you can deploy in your own data center. Swift is used by many companies; for example, Rackspace's Cloud Files runs on Swift. See the Swift documentation: http://docs.openstack.org/developer/swift/

Pranab Sharma
0

S3 has a very strong commitment to privacy. What are your concerns regarding S3? Also, which datacenter are you planning to move to from Oracle's storage?

Itay Taragano