To serve millions of files out of a single directory, to let hundreds of endpoints connect to the same drive, and for some other reasons (to avoid Gluster/NFS and other filesystem-based networking solutions), I want to evaluate the possibility of building a filesystem backed by MongoDB (or any other database).

Basically, it would work like a FUSE filesystem: every single file is kept in MongoDB GridFS. In theory, I run:

mount mongodbfs /mountPoint mongodb://localhost

Then when I run touch /mountPoint/test.txt, this file is inserted into MongoDB. The filesystem will also store uid/gid and permissions with the file, so we can throw hundreds of servers at it and no useradd will be necessary. I'm not planning to include every filesystem feature, just the ones we need.
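To make the idea concrete, here is a minimal sketch of how a touch could turn into a stored document that carries uid/gid and permissions. It uses a plain in-memory dict as a stand-in for the database, so nothing here is MongoDB or GridFS API; the names FileDoc, DocStoreFS, touch and getattr are hypothetical and chosen only for illustration.

```python
import stat
import time
from dataclasses import dataclass, field


@dataclass
class FileDoc:
    """One file or directory, shaped like a document a database could hold."""
    path: str
    mode: int          # file type and permission bits, as in st_mode
    uid: int
    gid: int
    data: bytes = b""
    mtime: float = field(default_factory=time.time)


class DocStoreFS:
    """Hypothetical in-memory stand-in for a database-backed filesystem."""

    def __init__(self):
        # Keyed by full path; in a real database this would be an indexed field.
        self.docs = {"/": FileDoc("/", stat.S_IFDIR | 0o755, 0, 0)}

    def touch(self, path, uid, gid, perms=0o644):
        """Create an empty regular file, or refresh mtime if it already exists."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.docs or not stat.S_ISDIR(self.docs[parent].mode):
            raise FileNotFoundError(parent)
        doc = self.docs.get(path)
        if doc is None:
            doc = FileDoc(path, stat.S_IFREG | perms, uid, gid)
            self.docs[path] = doc
        else:
            doc.mtime = time.time()
        return doc

    def getattr(self, path):
        """Return stat-like attributes straight from the stored document."""
        doc = self.docs[path]
        return {
            "st_mode": doc.mode,
            "st_uid": doc.uid,
            "st_gid": doc.gid,
            "st_size": len(doc.data),
            "st_mtime": doc.mtime,
        }


if __name__ == "__main__":
    fs = DocStoreFS()
    fs.touch("/test.txt", uid=1000, gid=1000)
    print(fs.getattr("/test.txt"))
```

Because ownership lives in the document rather than in each server's /etc/passwd, every mount would see the same numeric uid/gid without any local user setup. Whether lookups by path stay fast with millions of entries depends entirely on the real backend's indexing, which this sketch does not model.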

My question is: how do I start finding resources, books, links, people, and developers who could help me implement this, at least as a proof of concept? Is it feasible? What timeline should I expect for such an undertaking?

Please only consider the case of a gazillion small files and folders.

PS: after a few days of research, I think this is the direction I'm heading: http://www.ibm.com/developerworks/library/l-sc12.html and http://www.flipcode.com/archives/Programming_a_Virtual_File_System-Part_I.shtml

PS2: I'm aware of the difficulty of this undertaking. However, we're willing to set aside a serious budget and form a serious team to implement it, but only after we make sure that this isn't a black hole (thus the question).

Devrim
  • Is this project similar to what you want https://github.com/mikejs/gridfs-fuse ? –  Mar 05 '11 at 22:56
  • it is similar, but we want it without FUSE. – Devrim Mar 05 '11 at 23:00
  • I'm sure this is an entirely original idea that has never once been attempted before, because all other filesystem designers in history have just been stupid, including the 5,000 man-year project at Microsoft that failed. Good luck with this! – Chopper3 Mar 07 '11 at 16:03
  • @Chopper - that's some painful truths right there. – mfinni Mar 07 '11 at 16:20
  • Also - @Devrim since your question is "how do i start writing this", this should be migrated to StackOverflow, where you will no doubt get even better reasons that this won't be easy or even feasible. – mfinni Mar 07 '11 at 16:22
  • What is your objection to fuse? Performance? – pjc50 Mar 07 '11 at 16:38
  • @pjc50 yes. this needs to be close to a native FS – Devrim Mar 07 '11 at 16:53
  • @Chopper i'm sorry i gotta do this :) here is another 5.000 man-year project from microsoft that has failed: http://www.youtube.com/watch?v=3oGFogwcx-E – Devrim Mar 07 '11 at 17:03
  • I've got to point this out: you're implementing a filesystem backend for MongoDB and you're worried about FUSE overhead? FUSE is ultra-cheap compared to a MongoDB access. – Zan Lynx Mar 07 '11 at 18:30
  • @Zan not necessarily if the connection is persisted per mount. Mongo accepts 20k connections which means 20k endpoints/mounts, now let's imagine this with NFS and reconsider with FUSE. – Devrim Mar 08 '11 at 04:48
  • Yet again, we have to ask: have you benchmarked this and confirmed that FUSE is the slow spot? – pjc50 Mar 14 '11 at 14:34
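
The benchmarking question raised in the comments could be approached with a small script along these lines. It is only a rough sketch: it times creating, stat-ing, and removing many empty files in whatever directory it is given, so the same run can be pointed at a native filesystem and at a FUSE mount for comparison. The function name bench_small_files and the file naming scheme are made up for this example, and the numbers it prints say nothing about caching effects or concurrent clients.

```python
import os
import tempfile
import time


def bench_small_files(directory, count=1000):
    """Time create, stat, and remove for `count` empty files in `directory`."""
    paths = [os.path.join(directory, f"bench_{i}.tmp") for i in range(count)]
    timings = {}

    start = time.perf_counter()
    for p in paths:
        with open(p, "wb"):
            pass  # create an empty file, similar to touch
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    for p in paths:
        os.stat(p)
    timings["stat"] = time.perf_counter() - start

    start = time.perf_counter()
    for p in paths:
        os.remove(p)
    timings["remove"] = time.perf_counter() - start

    return timings


if __name__ == "__main__":
    # A temporary directory is used here; pass a mount point to test it instead.
    with tempfile.TemporaryDirectory() as d:
        for op, seconds in bench_small_files(d).items():
            print(f"{op}: {seconds:.4f}s for 1000 files")
```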

1 Answer

Your most frequent piece of advice here is going to be "Use FUSE". This is excellent advice, and you would do well to heed it (as Sciurus pointed out, there's already gridfs-fuse, which is pretty close to what you want).

That said, if you want to take the long, hard road of pain and suffering (writing your own filesystem), you almost certainly want to take an operating systems course at a local university, or look at some online course materials ("write a simple FS" is usually a small project; the resulting filesystems typically suck because they're academic toys).
Follow that up with Linux File Systems (Moshe Bar) and a thorough reading of some simple filesystem drivers to see the basic skeleton of what you'll need to do.

As far as a timeline goes, if you're a decent coder you can write a basic filesystem in a few days to a week (but it will SUCK). I wouldn't even guess how long it would take to write a GOOD filesystem -- UFS/FFS (the BSD filesystem) has been under continuous development since at least the late 1970s/early 1980s, and improvements/enhancements/bug fixes still pop up occasionally. Sun/Oracle's ZFS has gone through over 20 iterations in its relatively short (6-year) life, though admittedly much of that is related to volume management capabilities.

voretaq7
  • If you are trying to write a filesystem, interacting with the VFS layer is pretty much a foregone conclusion -- FUSE takes care of much of that work for you :) – voretaq7 Mar 07 '11 at 17:53
  • at a cost that we're trying to avoid. but point taken - you think a non-fuse implementation wouldn't be a feasible option. – Devrim Mar 08 '11 at 05:03
  • Not so much "infeasible" as "a lot more work". Note that for performance you may be able to parallelize the mongo-to-FUSE interface (threads?) -- someone more familiar with FUSE may be able to tell you more... – voretaq7 Mar 08 '11 at 05:36