
On a remote server I have the compressed output of mongodump, say a file called mongodb.tar.gz.

Inside mongodb.tar.gz there is a directory structure like this:

dump/dbname/
dump/dbname/blogs.bson
dump/dbname/blogs.metadata.json
dump/dbname/editors_choice.bson
dump/dbname/editors_choice.metadata.json
...

Is there any way to restore this dump without downloading and uncompressing the entire file locally?

I mean something like:

curl http://remoteserver/mongodb.tar.gz | gunzip | mongorestore -d dbname
freedev

4 Answers


You can only pipe a compressed file that contains a single collection.

You could do:

curl http://remoteserver/mongodb.collection.gz | gunzip -c | mongorestore -d dbname -c collectionname -

The -c option tells gunzip to write to stdout, and the trailing - tells mongorestore to read its input from stdin.

Tested with version 3.0.7 (doesn't work with v2.6.4).
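For this to work, the remote file has to be a gzipped dump of a single collection. A minimal sketch of how such a file might be produced on the server side (the paths and collection name are illustrative, not from the question):

```shell
# On the server: gzip one collection's BSON dump so it can later be
# streamed and restored in a single pipe (paths are illustrative).
gzip -c dump/dbname/blogs.bson > mongodb.blogs.gz

# On the client: stream, decompress, and restore in one go.
# curl http://remoteserver/mongodb.blogs.gz | gunzip -c | mongorestore -d dbname -c blogs -
```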

nessa.gp

At the moment this is not possible, at least not without writing something yourself. The feature has been requested in SERVER-4345 and SERVER-5190, but there are several issues with an immediate implementation based on how the current tools work (i.e. it is not simple to do).

Adam C

Although this is only a partial answer, you could use FUSE to mount the .tar.gz file after downloading it.

Seeking a direct answer to the other part, I asked question 730494.

Jason R. Coombs

Well, I did it, and it wasn't pretty. What I did was first extract only the metadata files from the tarball, since they can't be piped directly into mongorestore, which accepts only BSON on stdin.

After extracting the metadata I ran two restores: first a normal mongorestore with the folder as a parameter, to restore the metadata.

Then, in the second restore, I read the names of the BSON files from a file I had created earlier and, for each file, extracted it from the tarball to stdout and piped the result into mongorestore. Yes, it was messy, but hey, it works!
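The steps above could be sketched roughly like this (a sketch, assuming GNU tar and the dump/dbname layout from the question; collection and file names are placeholders, not taken from the actual script):

```shell
# Rough sketch of the two-pass restore, assuming GNU tar and a dump
# laid out as dump/dbname/<collection>.bson + .metadata.json.

# 1. Extract only the metadata files, which mongorestore can't take on stdin.
tar -xzf mongodb.tar.gz --wildcards 'dump/*/*.metadata.json'

# 2. Record which BSON members the tarball contains.
tar -tzf mongodb.tar.gz | grep '\.bson$' > bson_files.txt

# 3. First pass: restore the extracted directory to create the collections.
# mongorestore dump/

# 4. Second pass: stream each BSON member straight from the tarball.
while read -r f; do
  coll=$(basename "$f" .bson)
  # tar -O extracts the member to stdout; mongorestore reads it from stdin.
  tar -xzOf mongodb.tar.gz "$f" # | mongorestore -d dbname -c "$coll" -
done < bson_files.txt
```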

To see the abomination in its full glory here's the repo: https://github.com/datascienceproject2019-codescoop/codescoop-models

And here's the script https://github.com/datascienceproject2019-codescoop/codescoop-models/blob/master/commands.sh

The restore script is in a different file since piping to docker exec is too difficult: https://github.com/datascienceproject2019-codescoop/codescoop-models/blob/master/gh_mongo_scripts/restore.sh

I used Mongo 4.0.6.

EDIT: But using streams turned out to be a lot slower than just reading from the extracted files, so I probably did all this for nothing, since temporarily extracting 26 GB of extra files isn't that big of a deal. Oh well.

TeemuK