Questions tagged [mlcp]

MarkLogic Content Pump is an open-source, Java-based command-line tool (mlcp). mlcp provides the fastest way to import, export, and copy data to or from MarkLogic databases. It is designed for integration and automation in existing workflows and scripts.

https://developer.marklogic.com/products/mlcp

User Guide

https://docs.marklogic.com/guide/mlcp

Features

Content Pump can:

  • Bulk load billions of local files
  • Split and load large, aggregate XML files or delimited text
  • Bulk load billions of triples or quads from RDF files
  • Archive and restore database contents across environments
  • Copy subsets of data between databases
  • Load documents from HDFS, including Hadoop SequenceFiles

Data sources and destinations

Content Pump supports moving data between a MarkLogic database and any of the following:

  • Local filesystem
  • HDFS
  • MarkLogic archive
  • Another MarkLogic database

Formats

Content Pump supports:

  • XML, JSON, text, binary files
  • RDF encoded in RDF/XML, Turtle, RDF/JSON, N3, N-Triples, N-Quads, or TriG serialization formats
  • Compressed files and archives (ZIP, GZIP)
  • MarkLogic archive, which includes both content and metadata (e.g., permissions and properties)
  • Delimited text (e.g., CSV) (import only)
  • Temporal Documents
  • Hadoop SequenceFiles

Getting Started with MLCP

You may find this free online training course helpful.

To get started moving data with mlcp, download and unpack the binaries. If you are interested in hacking on mlcp or looking at its internals, you can also download the Apache 2.0 licensed source.

To create your first import script, make sure you have an XDBC server attached to your database (running on port 8006, for example). From the command line, run the following, substituting your particulars.
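
As a rough sketch of what a basic import invocation might look like, the Python helper below assembles an mlcp command line so it can be reused from a script. The host, credentials, and input path are placeholders, and the `mlcp.sh` launcher name and the helper function itself are assumptions for illustration; only the `-host`, `-port`, `-username`, `-password`, and `-input_file_path` options come from mlcp. Port 8006 matches the XDBC server example above.

```python
import shlex


def build_mlcp_import_command(host, port, username, password, input_path,
                              mlcp_bin="mlcp.sh"):
    """Return the argument list for a basic mlcp import run.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        mlcp_bin, "import",
        "-host", host,
        "-port", str(port),          # port of the XDBC server
        "-username", username,
        "-password", password,
        "-input_file_path", input_path,
    ]


# Placeholder values; substitute your own particulars.
cmd = build_mlcp_import_command(
    "localhost", 8006, "your-user", "your-password", "/path/to/data"
)
print(shlex.join(cmd))
```

To actually run the import, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where mlcp is installed and on the PATH.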

156 questions
2 votes, 2 answers

Marklogic - Delete Versioned Collections

I have around 43 million documents. The latest version of each document is in the LIVE collection, and the same version also exists in a separate version collection named /collection/versionNumber. I want to delete the versioned collections…

2 votes, 0 answers

MLCP Copy command with redaction getting timed out

MarkLogic version used: 9.0-10.4. I am running the MLCP COPY command on a large data set (39753201 docs) and get the error below. 2020-07-29 20:38:09 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform...…
Dixit Singla (reputation 2,540; badges 3, 24, 39)

2 votes, 1 answer

Error parsing HTTP headers exception while copying data using MLCP

I am trying to copy a large set of data from one database to another using MLCP, but I am getting the following exception. 2019-08-30 11:53:54.847 SEVERE [15] (StreamingResultSequence.next): IOException instantiating ResultItem 130891: Error parsing…
DevNinja (reputation 1,459; badges 7, 10)

2 votes, 0 answers

ERROR mapreduce.ContentWriter: XDMP-NOTXN: No transaction with identifier 1796851315598328505 MLCP

I got an error while using mlcp. I run mlcp through my load balancer, so I have just one IP with 8 nodes behind it. If I connect directly to one node, the ingestion succeeds. Please help... Connect to mklogic-ed03 18/12/13 08:05:40…
Falcon Ryu (reputation 475; badges 1, 6, 17)

2 votes, 1 answer

Challenges for ingesting Raw data into Marklogic using MLCP

I want to ingest some raw data into MarkLogic using MLCP, but my data is in a form like this: Informatio#data1 #data2#data3#data4 #data5 Informatio#data10 #data6#data7#data8 #data9 The challenges for sending this data into ML 9…
Private (reputation 1,661; badges 1, 20, 51)

2 votes, 1 answer

Default URI Replace

I am picking up a file from a folder specified using the Camel File Component, and mlcp automatically injects the filename into the default URI, which I don't want. When I put a file named test_1.xml in D:/Camel, mlcp produces a URI…
Vikram (reputation 635; badges 1, 9, 29)

2 votes, 1 answer

how to set multiple permission in MLCP -option_permissions

I use the MLCP IMPORT option parameter for permissions, but I am getting an error at permissions. If I put "read" it works. However, MLCP doesn't work if I combine "(read,update)". Please give me some hints. Thanks in advance.…
thichxai (reputation 1,073; badges 1, 7, 16)

2 votes, 4 answers

MarkLogic - S3 Import

Can we import data from Amazon S3 into MarkLogic using (1) the JavaScript/XQuery API, (2) MarkLogic Content Pump, or (3) any other way? Please share a reference, if available.
blackzero (reputation 78; badges 7)

2 votes, 1 answer

Partial document transfer when separated in batches by MLCP

While using MLCP, I encountered a strange issue with the '-batch_size' option given in the options file (options.txt) when copying documents from one database to another. For example, if -batch_size = 10 and the number of documents to be transferred (on…

2 votes, 2 answers

Check for null/blank while inserting into MarkLogic DB using MLCP

I am exploring the MarkLogic database and have been trying to import data into it using MarkLogic Content Pump. Here is the gist of the CSV file. firstname, middlename, lastname, address1, address2, city, state, zip, country Rajath,,A,No 20 GN,16th…
DMA (reputation 1,033; badges 1, 11, 22)

2 votes, 1 answer

XDMP-NEWSTAMP error when loading data with MLCP

I have a database attached to 4 forests, and I want to create a change document in MarkLogic every time any value in a document changes. The change document should contain the date of change, old value, and new value. I was able to accomplish…

2 votes, 1 answer

Marklogic Encoding Insertion using MLCP

I have inserted the following XML content, which contains "’", into the MarkLogic server using XQuery. XML content: debtor’s Insert XQuery used: xdmp:document-load("C:/a.xml",
Antony (reputation 966; badges 8, 19)

2 votes, 3 answers

MarkLogic content pump (MLCP) with GUI

I've used mlcp through the terminal with ease by following https://docs.marklogic.com/guide/ingestion/content-pump but I have no clue how to implement the mlcp functionality behind a user interface on a website. I've searched the whole…
Olivia A. (reputation 41; badges 3)

2 votes, 2 answers

MLCP delimited file

I try to load data. It's not working. What I have tried: multiple delimiters, all fields with quotes, all fields without, leaving headers out of the data, no delimiter option in mlcp, other delimiter options in mlcp, other computer, other ML8…
Thijs (reputation 1,423; badges 15, 38)

2 votes, 1 answer

Unknown content type: json How to load JSON document as XML in MarkLogic 8

I am trying to load a bunch of JSON files into MarkLogic 8 using MLCP and a basic transform script on ingest. I can load the files as-is and get JSON objects in ML. What I want is to transform, on ingestion, from JSON to XML, so I wrote a basic…
Hugo Koopmans (reputation 1,349; badges 1, 15, 27)