4

Which is more expensive in terms of resources and efficiency: a file read/write operation or a database read/write operation?

I'm using MongoDB with Python. I'll be performing about 100k requests on the db/file per minute. Also, there are about 15,000 documents in the database/file.

Which would be faster? Thanks in advance.

Deduplicator
Hello James! See this question, maybe it will help you ;) http://stackoverflow.com/questions/2954957/mongodb-vs-couchdb-speed-optimization – Edward83 Nov 17 '10 at 23:20

4 Answers

6

It depends. If you need to read the data sequentially, a file might be faster; if you need to read it in random order, a database has a better chance of being optimized for your needs.

(After all, a database reads its records from a file as well, but it has an internal structure and algorithms to enhance performance; it can use memory more intelligently and do a lot of work in the background so that results come back faster.)

For intensive random reads, I would go with the database option.
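To make the sequential-versus-random point concrete, here is a minimal Python sketch, assuming the documents live in a flat file with one JSON document per line. The file layout, the `_id` and `value` fields, and all function names are made up for illustration. It contrasts scanning the file from the top with seeking directly to a record through a prebuilt byte-offset index, which is roughly the kind of structure a database maintains for you.

```python
import json


def write_records(path, n):
    """Write n JSON documents, one per line, to a flat file (hypothetical layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for i in range(n):
            f.write(json.dumps({"_id": i, "value": i * 2}) + "\n")


def find_sequential(path, wanted_id):
    """Scan the file from the top until the wanted document appears."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)
            if doc["_id"] == wanted_id:
                return doc
    return None


def build_offset_index(path):
    """Map each _id to the byte offset at which its line starts."""
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            index[json.loads(line)["_id"]] = offset
    return index


def find_indexed(path, index, wanted_id):
    """Seek straight to the document using the prebuilt index."""
    offset = index.get(wanted_id)
    if offset is None:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.readline())
```

With 15,000 documents, `find_sequential` may parse every line before it reaches a record near the end, while `find_indexed` reads a single line after one seek. The index has to be rebuilt or updated whenever the file changes, which is the bookkeeping a database does on your behalf.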

Dani
3

There are too many factors to offer a concrete answer, but here's a list for you to consider:

  1. Disk bandwidth
  2. Disk latency
  3. Disk cache
  4. Network bandwidth
  5. MongoDB cluster size
  6. Volume of MongoDB client activity (the disk only has one "client" unless your machine is busy with other workloads)
Marcelo Cantos
1

Reading from a database can be more efficient, because you can access records directly and make use of indexes and the like. With normal flat files you basically have to read them sequentially. (Mainframes support direct-access files, but these are sort of halfway between flat files and databases.)

If you are in a multi-user environment, you must make sure that your data remains consistent even if multiple users attempt updates at the same time. With flat files, you have to lock the file for all but one user until she has finished her update, and then lock it again for the next user. Databases can lock at the row level.
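As a rough illustration of whole-file locking, here is a minimal sketch, assuming a POSIX system (the `fcntl` module is not available on Windows) and a hypothetical file that holds a single integer counter. The function name and the file contents are made up for the example; the point is only that every writer has to wait for the one lock that covers the entire file.

```python
import fcntl


def locked_increment(path):
    """Read-modify-write a counter file while holding an exclusive lock.

    Assumes the file already exists and holds an integer (or is empty).
    """
    with open(path, "r+") as f:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            value = int(f.read() or 0)
            f.seek(0)
            f.write(str(value + 1))
            f.truncate()
            f.flush()  # make the update visible before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
        return value + 1
```

Because the lock covers the whole file, two writers that touch unrelated records still serialize behind each other, which is the cost that finer-grained locking in a database avoids.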

You can make a file based system as efficient as a database, but that effort amounts to writing a database system yourself.

M.A.K. Ripon
0

If caching is not used, sequential I/O is faster with files by definition: databases ultimately use files too, but the data has to pass through more layers before it hits the file. If you want to query the data, though, a database is more efficient, because with files you would have to implement the querying yourself. For your task I recommend researching clustering for the different databases; they can scale to your rate.
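Whichever option you lean toward, it is worth measuring against the rate from the question. Here is a minimal sketch of a timing harness; the helper names are hypothetical, and `operation` stands for whatever single read or write you want to test (a file lookup, a MongoDB query, and so on). It is a single-threaded measurement, so treat the number as a rough guide rather than a benchmark.

```python
import time

REQUESTS_PER_MINUTE = 100_000  # figure taken from the question
REQUIRED_PER_SECOND = REQUESTS_PER_MINUTE / 60  # roughly 1,667 per second


def measure_rate(operation, iterations=10_000):
    """Call operation() repeatedly and return the observed calls per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


def meets_target(operation, iterations=10_000):
    """Return True if the measured rate reaches the required rate."""
    return measure_rate(operation, iterations) >= REQUIRED_PER_SECOND
```

For example, `meets_target(lambda: my_lookup(42))` would check a hypothetical `my_lookup` function against the 100k-per-minute target on the machine where it runs.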

Andrey