0

Is there a way to copy or move data from MarkLogic Server to Amazon S3? I don't want to move all the data, only certain documents that belong to a particular collection or match some other logic. I can use xdmp:save(), and that works for a few thousand documents, but I have a few million records and that approach won't scale. Is there a better, more robust way to copy the data over? Can I use MLCP for this, or spawn a module on the Task Server to do the work? I am running MarkLogic 8 hosted on AWS.

Any suggestion would help immensely.

Regards, Amit

Amit Gope
  • See similar question [Accessing S3](https://stackoverflow.com/questions/37554370/xquery-api-to-upload-data-from-marklogic-to-amazon-s3) – DALDEI Dec 23 '17 at 16:29

4 Answers

1

I would use CoRB2 to drive the xdmp:save() calls, since s3:// is a built-in file system in MarkLogic. Any solution based on MLCP would involve more data transfer, and I am not sure it adds value unless you also want an archive (which is a valid point if you want to preserve properties, permissions, collections, etc.).

Second to that: I have never done it, but I understand that you can use S3 as the location of a forest. In that case, you could balance certain documents onto a forest located on S3.
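
For illustration, here is a rough sketch of how the CoRB2 plus xdmp:save() approach might be wired up. It writes a URIs module and a process module, then launches CoRB2 only when RUN_CORB=1 is set. Everything concrete here is an assumption and not from the original answer: the collection name, the bucket and prefix, the jar file names, and the XCC connection string are placeholders, the option names are the ones I understand CoRB2 to use, the URI lexicon must be enabled for cts:uris, and MarkLogic needs AWS credentials configured for s3:// paths to work.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical placeholders.
WORK_DIR="${WORK_DIR:-./corb-s3-export}"
COLLECTION="${COLLECTION:-my-collection}"          # collection to export
S3_PREFIX="${S3_PREFIX:-s3://yourbucket/export}"   # target bucket/prefix
CORB_CP="${CORB_CP:-marklogic-corb.jar:marklogic-xcc.jar}"
XCC_URI="${XCC_URI:-xcc://user:password@localhost:8000/Documents}"

mkdir -p "$WORK_DIR"

# URIs module: CoRB2 expects the total count first, then the URIs.
cat > "$WORK_DIR/uris.xqy" <<EOF
xquery version "1.0-ml";
let \$uris := cts:uris((), (), cts:collection-query("$COLLECTION"))
return (fn:count(\$uris), \$uris)
EOF

# Process module: CoRB2 passes each URI in the external variable \$URI.
# Assumes document URIs start with "/", so the S3 key mirrors the URI.
cat > "$WORK_DIR/save.xqy" <<EOF
xquery version "1.0-ml";
declare variable \$URI as xs:string external;
xdmp:save(fn:concat("$S3_PREFIX", \$URI), fn:doc(\$URI))
EOF

# Launch CoRB2 with both modules run as ad hoc queries.
run_corb() {
  java -cp "$CORB_CP" \
    -DXCC-CONNECTION-URI="$XCC_URI" \
    -DURIS-MODULE="$WORK_DIR/uris.xqy|ADHOC" \
    -DPROCESS-MODULE="$WORK_DIR/save.xqy|ADHOC" \
    -DTHREAD-COUNT=8 \
    com.marklogic.developer.corb.Manager
}

if [ "${RUN_CORB:-0}" = "1" ]; then
  run_corb
else
  echo "Modules written to $WORK_DIR; set RUN_CORB=1 to launch CoRB2."
fi
```

The point of this shape is that each xdmp:save() runs in its own small transaction across several threads, which is what lets it cope with millions of documents where a single query would time out.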

1

You can use the backup feature and set the target directory to s3://bucket/path.

DALDEI
0

Retrieve the documents from MarkLogic using the REST API and pipe the output to the aws command to upload them to an S3 bucket:

curl --anyauth --user user:password -X GET -H "Accept: application/xml" "http://localhost:8052/LATEST/documents?uri=/docs/test.xml" | aws s3 cp - s3://yourbucket/test.xml
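
To apply the same idea to many documents, something like the following loop could work. This is only a sketch under assumptions: the URI list file, the ML_BASE, ML_AUTH, S3_BUCKET and URI_FILE variables, and the helper function names are all hypothetical, and it stages each document in a temporary file instead of piping, so that a failed fetch does not upload an empty object.

```shell
#!/bin/sh
# Sketch only: endpoint, credentials and bucket are placeholders.
ML_BASE="${ML_BASE:-http://localhost:8052}"
ML_AUTH="${ML_AUTH:-user:password}"
S3_BUCKET="${S3_BUCKET:-yourbucket}"

# Map a document URI to an S3 key by dropping one leading slash.
s3_key_for_uri() {
  printf '%s\n' "${1#/}"
}

# Fetch one document into a temp file, then upload it if the fetch succeeded.
copy_doc() {
  uri="$1"
  tmp="$(mktemp)" || return 1
  if curl --fail --silent --anyauth --user "$ML_AUTH" \
       -G --data-urlencode "uri=$uri" \
       -o "$tmp" "$ML_BASE/LATEST/documents" \
     && aws s3 cp "$tmp" "s3://$S3_BUCKET/$(s3_key_for_uri "$uri")"; then
    status=0
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}

# Read document URIs, one per line, from the file named in $1.
copy_all() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    copy_doc "$line" < /dev/null || echo "failed: $line" >&2
  done < "$1"
}

# Only run when a URI list is supplied, e.g. URI_FILE=uris.txt sh copy.sh
if [ -n "${URI_FILE:-}" ]; then
  copy_all "$URI_FILE"
fi
```

This makes one HTTP request and one upload per document, so for millions of documents it will likely be much slower than a server-side approach.
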
mg_kedzie
0

I used mlcp export for this, and it works quite well with the collection filter; it does the trick for me. I have not tried CoRB2 yet, but I will give it a try as well when time permits.

mlcp export -host {host} -port {port} -username {username} -password {password} -output_file_path {S3 path} -collection_filter {collection name to be moved}

Amit Gope