0

Not sure whether this is a valid question or not...

Requirement: I am going to write an application that captures a huge amount of data from an external REST endpoint. I want to use MLCP to store the stream of data coming from that endpoint in MarkLogic.

Is this possible using MLCP?

Please suggest a solution.

DevNinja

2 Answers

3

DMSDK (the Data Movement SDK) might help to meet your requirements:

http://docs.marklogic.com/guide/java/data-movement

ehennum
2

If by "stream" you mean unbounded in space and time, and by "huge" you mean multiple GB or more, then no: MLCP is not the right choice, or at least is not sufficient on its own. MLCP is a command-line batch program. You need to have all your data stored locally before starting it, so it is not "streaming" in this sense.

In any case, you will need to split up your data before sending it to MarkLogic, ideally into chunks (documents) under 100 MB (not a magic number, just a good upper bound). So your streaming code needs to read the data, buffer it, split it into chunks, and then send those chunks to MarkLogic. Once the data is in chunks, any MarkLogic API will work, including MLCP. There are performance and usability tradeoffs between the different APIs; I'll leave that for another discussion.
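To make the read / buffer / split / send loop concrete, here is a rough Java sketch. It assumes the REST endpoint returns newline-delimited records (for example one JSON document per line), which may not match your feed. The class name `StreamChunker`, the method `forward`, and the `sink` callback are all made up for illustration. The sink is where the actual write to MarkLogic would go, for instance by handing each record to a DMSDK `WriteBatcher`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: not part of MLCP or any MarkLogic library.
class StreamChunker {

    /**
     * Reads newline-delimited records from the stream and passes them to
     * the sink in batches of at most batchSize records, so only one batch
     * is held in memory at a time. Returns the number of records forwarded.
     * Assumption: one record (future document) per line.
     */
    static long forward(InputStream in, int batchSize,
                        Consumer<List<String>> sink) throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long total = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines between records
                }
                batch.add(line);
                total++;
                if (batch.size() == batchSize) {
                    sink.accept(batch);                 // send a full batch
                    batch = new ArrayList<>(batchSize); // start a new buffer
                }
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // send the final partial batch
        }
        return total;
    }
}
```

You would call it with the response body stream of your HTTP client, something like `StreamChunker.forward(responseBody, 500, batch -> writeToMarkLogic(batch))`, where `writeToMarkLogic` is a placeholder for whichever API you pick. For a truly unbounded stream the loop simply never returns, so error handling and reconnects still need to be added around it.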

Mads Hansen
DALDEI