I have several MongoDB instances running on the same machine, each pointing to an in-memory partition created on Linux along the lines of:
mount -t ramfs -o size=8000M ramfs /mongo/ramdata<n>/
with this configuration for each MongoDB instance <n> (e.g. n = 1):
dbpath = /mongo/ramdata1/
nojournal = true
smallFiles = true
noprealloc = true
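Each instance <n> also listens on its own port, set in the same configuration file; for the first instance that would be (using the 30001..30100 range from my example further down):

port = 30001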
These instances hold exactly the same data, and I am using only the plain MongoDB Java driver to geo-query them; the data are meant to be read-only (no MongoDB-Hadoop, Stratio, or anything of the sort).
So at some point I would like my Spark job to finish with something like:
...foreach(query_a_specific_mongo_instance_for_a_specific_port)
since the MongoDB instances all run at the same address but on different ports.
Given that I don't want to create a MongoDB replica set or a sharded cluster with config-server instances, is it possible to "partition" the process flow in Spark so that, for example, each "core/partition" points to a specific MongoDB port?
For example, with 100 cores, the first "core" would point to mongo-address:30001 and the 100th to mongo-address:30100?
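Conceptually, the sketch below is what I have in mind: foreachPartition plus the partition index to derive the port (Scala; the host name mongo-address, the geodb/places names, the query, and the 30001..30100 port range are placeholders for my real setup):

import com.mongodb.MongoClient
import com.mongodb.client.model.Filters
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object MongoPerPartition {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("mongo-per-partition"))

    // Placeholder input: (lon, lat) points to geo-query, spread over
    // exactly 100 partitions -- one partition per mongod instance.
    val points = sc.parallelize(Seq((2.35, 48.85), (12.49, 41.89)), 100)

    points.foreachPartition { iter =>
      // Partition i talks to the mongod listening on port 30001 + i.
      val port = 30001 + TaskContext.getPartitionId()
      val client = new MongoClient("mongo-address", port)
      try {
        val coll = client.getDatabase("geodb").getCollection("places")
        iter.foreach { case (lon, lat) =>
          // Hypothetical geo query against a 2dsphere-indexed "loc" field.
          val doc = coll.find(
            Filters.nearSphere("loc", lon, lat, 5000.0, 0.0)).first()
          println(s"port $port -> $doc")
        }
      } finally {
        client.close()
      }
    }
    sc.stop()
  }
}

mapPartitionsWithIndex would presumably work the same way when I need to collect results back, since it exposes the partition index directly.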