
I want to load some raw data into MarkLogic using MLCP, but my data looks like this:

Informatio#data1      #data2#data3#data4     #data5   
Informatio#data10      #data6#data7#data8     #data9  

The challenges in loading this data into MarkLogic 9 using MLCP are:

  • First, there are no column names in the first row. Normally MLCP treats the first row as the column names for the columns below it. Is there any way to pass the column names to MarkLogic without having them in the first row?
  • Second, the first column has the same value in every row. MLCP uses the first column when generating URIs, so each ingested document overwrites the previous one. My CSV file has no column with unique values, so I don't know how to generate unique URIs for the documents.

Any help is appreciated

Thanks

Dave Cassel
Private

1 Answer

  1. MLCP requires that delimited text files start with a header line. Add one as part of your pre-processing, using your favourite scripting language.
  2. The command-line option -delimited_uri_id selects a different column to use for URI generation.
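
As a rough illustration of the pre-processing step, here is a minimal Python sketch that prepends a header line and adds a row-number column that could then be named in -delimited_uri_id. The function name, the `row_id` column and the column names you pass in are all made up for the example; it also assumes `#` is the delimiter and that the padding around the values is not significant.

```python
import csv
import io


def add_header_and_id(raw_text, column_names, delimiter="#", id_column="row_id"):
    """Return raw_text with a header line and a unique leading id column.

    column_names and id_column are hypothetical names chosen by the caller;
    nothing in the original data defines them.
    """
    reader = csv.reader(io.StringIO(raw_text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")

    # Header line: the new id column first, then the caller's column names.
    writer.writerow([id_column, *column_names])

    row_number = 0
    for row in reader:
        # Strip the padding seen in the sample data (assumed insignificant).
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) != len(column_names):
            raise ValueError(
                f"expected {len(column_names)} fields, got {len(fields)}: {row!r}"
            )
        row_number += 1
        # The running row number gives every row a distinct value.
        writer.writerow([row_number, *fields])

    return out.getvalue()
```

Calling `add_header_and_id(open("raw.txt").read(), ["source", "c1", "c2", "c3", "c4", "c5"])` and writing the result to a new file gives MLCP both a header row and a column whose values never repeat. A row number is only unique within one file, so if you load several files you would want to fold the file name into the id as well.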

Other interesting ideas that may be useful:

  • Let MarkLogic create unique IDs (another command line switch)
  • Use a transformation on input to generate a more specific URI - maybe from a compound key.
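
To show how these options might fit together, here is a small Python sketch that only builds the MLCP argument list; it does not run anything. The helper name, host, port and credentials are placeholders. I am assuming that the switch for letting MarkLogic create the IDs is -generate_uri, so please check the option table for your MLCP version before relying on it.

```python
def build_mlcp_import_args(input_path, uri_column=None, delimiter="#",
                           host="localhost", port=8000,
                           username="user", password="password"):
    """Build an argument list for an MLCP delimited-text import.

    Hypothetical helper: connection values are placeholders, and the
    launcher name (mlcp.sh) may differ on your system.
    """
    args = [
        "mlcp.sh", "import",
        "-host", host,
        "-port", str(port),
        "-username", username,
        "-password", password,
        "-input_file_path", input_path,
        "-input_file_type", "delimited_text",
        "-delimiter", delimiter,
    ]
    if uri_column is None:
        # Assumed switch: ask MLCP to generate the document URIs itself.
        args += ["-generate_uri", "true"]
    else:
        # Use the named column (it must hold unique values) for the URI.
        args += ["-delimited_uri_id", uri_column]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` once the values are right for your environment. Passing `uri_column="row_id"` would match a file that was pre-processed to carry a unique id column, while leaving it as `None` falls back to generated URIs.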

For reference: https://docs.marklogic.com/6.0/guide/ingestion/content-pump#id_70366