I don't know about testing for MongoDB, but for PostgreSQL here's what I do.
When developing against a database, I follow a pattern that separates the DB side from the app side. For testing the DB side, I have a test schema that includes a single stored procedure which resets all the data in the real schema. The reset follows the MERGE pattern: delete any records with an unrecognized key, update records whose keys match but whose data has changed, and insert any missing records. This reset runs before every unit test, which gives me simple, clear test coverage for the stored functions.
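To make that concrete, here's a rough sketch of what the per-test reset can look like from the test harness side. I'm assuming Python with pytest and psycopg2; `test_schema.reset_data()` and `app.order_total()` are made-up names for illustration, with the real reset function being whatever implements the MERGE-style logic described above.

```python
# Sketch: call a reset function before every test so each unit test starts
# from the same known data set. Assumes psycopg2 and pytest are installed,
# and that test_schema.reset_data() is the (hypothetical) stored function
# that performs the MERGE-style reset.
import psycopg2
import pytest

@pytest.fixture
def db_conn():
    conn = psycopg2.connect("dbname=myapp_test user=test")
    # Reset the real schema to its reference state before the test runs:
    # delete rows with unrecognized keys, update rows whose keys match but
    # whose data differs, and insert any missing rows.
    with conn.cursor() as cur:
        cur.execute("SELECT test_schema.reset_data()")
    conn.commit()
    yield conn
    conn.close()

def test_order_total(db_conn):
    # Exercise a stored function against the freshly reset data.
    with db_conn.cursor() as cur:
        cur.execute("SELECT app.order_total(%s)", (42,))
        assert cur.fetchone()[0] == 100.00
```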
For testing code that calls into the database, the database layer is always mocked, so there are never any calls that actually go to the database.
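As a minimal sketch of what that mocking looks like, again assuming Python (unittest.mock); `OrderService` and its injected repository are hypothetical names, not anything from your code:

```python
# Sketch: the code under test depends on an injected repository object, so
# the test substitutes a mock and no call ever reaches the database.
from unittest.mock import Mock

class OrderService:
    def __init__(self, repo):
        self.repo = repo  # the database layer, injected

    def total_with_tax(self, order_id, rate=0.2):
        subtotal = self.repo.get_subtotal(order_id)
        return round(subtotal * (1 + rate), 2)

def test_total_with_tax_never_touches_db():
    repo = Mock()
    repo.get_subtotal.return_value = 100.00  # canned answer, no SQL executed
    service = OrderService(repo)

    assert service.total_with_tax(42) == 120.00
    repo.get_subtotal.assert_called_once_with(42)
```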
What you're describing sounds to me like an attempt to mix unit testing with integration testing, and I rather strongly suggest that you don't do that. Integration testing is what happens once you've already proved base functionality and want to prove the integration between components, and probably performance as well. For integration testing, you really need a representative data set on representative hardware. Usually that means a dedicated machine, using Hudson for CI.
The direction you seem to be heading will be difficult because, as you've already noticed, it's hard to handle that volume of data and hard to generate representative data sets (most CI systems actually use production data that has been "cleaned" of sensitive information).
Which is why most of the places I've worked have not gone that way.