This presentation included a chart that ranked database types by their horizontal scalability ceiling as data size grows:
key-value > column family > document database > graph database
http://youtu.be/UodTzseLh04?t=13m36s
In other words, as data gets more connected (i.e. more complex), the ceiling on how large the database can grow gets lower.
Why don't document databases scale to the same data sizes as key-value stores? Have I answered my own question by saying "the more freedom in connecting data, the harder it is to partition data"?
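To make the partitioning intuition concrete, here is a minimal sketch (not taken from the presentation). It assumes a simple hash-based sharding scheme; `shard_of` and `cross_shard_edges` are hypothetical helper names. A plain key lookup touches exactly one shard, while every relationship whose two endpoints hash to different shards needs a cross-shard hop.

```python
import hashlib


def shard_of(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash (assumed sharding scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def cross_shard_edges(edges, num_shards: int) -> int:
    """Count relationships whose two endpoints live on different shards."""
    return sum(
        1
        for a, b in edges
        if shard_of(a, num_shards) != shard_of(b, num_shards)
    )


# Hypothetical relationships between records (for example parent -> child).
edges = [("root", "a"), ("root", "b"), ("a", "c"), ("b", "c")]

# With a single shard nothing crosses a boundary; with more shards,
# some relationships usually do, and following them costs a network hop.
print(cross_shard_edges(edges, 1))
print(cross_shard_edges(edges, 4))
```

A pure key-value workload never follows such edges, so it is unaffected by how records are spread out; the more a query has to traverse relationships, the more the placement of records matters.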
(The "what I'm trying to do" part which everyone usually asks: I have a database with a schema that is MOSTLY tree-like but occasionally has nodes with 2 parents. I used Neo4j in my prototype, but for a production-scale app I'd need to think more about partitioning. I'm going to have to use MongoDB since graph databases cannot easily be partitioned, and it will be harder to write code for my "multiple parents" relationships in MongoDB. So I'm wondering if it's worth going the extra mile and using key-value stores, or at least a column family store.)
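For illustration only, one possible way to model the "multiple parents" case outside a graph database is to store a list of parent IDs on each record and walk it in application code. This is a sketch under assumptions: the `store` dict is an in-memory stand-in for a key-value or document store, and the `parent_ids` field and `ancestors` function are hypothetical names, not part of any MongoDB or Neo4j API.

```python
from collections import deque

# In-memory stand-in for a key-value or document store (assumption):
# each record keeps the IDs of its parents.
store = {
    "root": {"parent_ids": []},
    "a": {"parent_ids": ["root"]},
    "b": {"parent_ids": ["root"]},
    "c": {"parent_ids": ["a", "b"]},  # the occasional node with 2 parents
}


def ancestors(store, node_id):
    """Collect all ancestors of a node, visiting each record at most once."""
    seen = set()
    queue = deque(store[node_id]["parent_ids"])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already reached through another parent
        seen.add(current)
        queue.extend(store[current]["parent_ids"])
    return seen


print(sorted(ancestors(store, "c")))  # ['a', 'b', 'root']
```

Each step of the walk is a separate read by key, so in a partitioned store every visited record may live on a different shard; the traversal logic that a graph database would provide moves into the application.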