During the development of our map-reduce jobs, our MR code generates useful diagnostic data structures that are independent of the data being map-reduced. Is there an easy way to get this data out to the code that called mapReduce, or to persist it in Mongo? Just writing to the log file is proving inadequate because (a) the log already contains a lot of data and (b) our diagnostic info is highly structured and we'd like to run queries against it.
My investigation so far suggests that MR data structures are passed by value (via serialization), so any in-memory data structures are lost, including those attached to the "global" scope. The namespaces are isolated from the main server-side JS namespace, so db.eval can't seem to reach them (or, at least, I don't know where to look). Last but not least, although all the database objects and functions are present, 10gen is generating (confusing) error messages to prevent their use, e.g., complaining that coll.insert is not a function even though typeof coll.insert === 'function' is true.
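To illustrate what I mean by "passed by value", here is a plain-JavaScript sketch of the behavior I think I'm seeing. It does not call MongoDB at all: `runIsolated` and `diagnostics` are made-up names, and the serialize-and-rebuild step is only my assumption about what the server does with the map function and its scope.

```javascript
// Plain-JavaScript simulation only; no MongoDB API is used here.
// runIsolated is a hypothetical stand-in for the server shipping
// the map function and its scope by value (assumed behavior).
function runIsolated(fn, scope, doc) {
  // The scope is serialized, so the callee gets a copy, not a reference.
  const scopeCopy = JSON.parse(JSON.stringify(scope));
  const emitted = [];
  const emit = (key, value) => emitted.push([key, value]);
  // Rebuild the function from its source text so it cannot see
  // the caller's closures, only the copied scope and emit.
  const rebuilt = new Function(
    "scope", "emit", "return (" + fn.toString() + ");"
  )(scopeCopy, emit);
  rebuilt.call(doc); // the document is bound to `this`, as in a map function
  return emitted;
}

// Diagnostic structure we would like to get back after the job.
const diagnostics = { seen: 0 };

function map() {
  scope.seen += 1;    // mutates the serialized copy only
  emit(this.key, 1);  // emitted values are the only thing that comes back
}

const out = runIsolated(map, diagnostics, { key: "a" });
console.log(out);         // the emitted pairs survive
console.log(diagnostics); // the caller's object is unchanged
```

In this sketch the emitted pairs come back to the caller, but the increment of `seen` never reaches the caller's `diagnostics` object, which matches what I observe with my real jobs.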
To be clear, I'm interested in doing this only for development on a single node, because the logging/debugging support in MongoDB is pretty limited. Side effects of this kind are not appropriate in production environments.