27

I cannot replicate between two couchdb servers, so I would like to dump to file from one server and load from file into the other server.

I used this statement to dump and it worked fine:

curl -X GET http://localhost:5984/<DATABASE_NAME>/_all_docs?include_docs=true > FILE.txt

But when I used this statement to load:

curl -d @FILE.txt -H “Content-Type: application/json” -X POST http://localhost:5984/<DATABASE_NAME>/_bulk_docs

it failed like this:

curl: (6) Could not resolve host: application; Host not found {"error":"bad_content_type","reason":"Content-Type must be application/json"}

Any ideas?

eriq

6 Answers

15

As said, you should use plain ASCII double quotes (") and not typographic quotes (“ ”) around the argument of the -H option.
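To illustrate the quoting fix, here is a minimal sketch of the load step wrapped in a shell function. The function name load_bulk_docs and the mydatabase URL in the usage comment are placeholders of mine, not part of the question; the curl options are the same ones the question uses.

```shell
# Hypothetical helper: POST a _bulk_docs payload file to a CouchDB database.
#   $1 = payload file, $2 = database URL (no trailing slash)
load_bulk_docs() {
  # Plain ASCII double quotes keep "Content-Type: application/json" together
  # as ONE argument. With typographic quotes the shell splits it at the space,
  # and curl treats "application/json" as a second host to contact.
  curl -d @"$1" -H "Content-Type: application/json" -X POST "$2/_bulk_docs"
}

# Usage (assumes a CouchDB server listening on localhost:5984):
# load_bulk_docs FILE.txt http://localhost:5984/mydatabase
```

Note that this only fixes the header error. The file still has to be in the {"docs": [...]} shape that _bulk_docs expects, which a raw _all_docs dump is not.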

If you are a Linux or Mac OS X user, you can use the couchdb-dump tool, which runs in the bash shell.

It dumps the database to a local file (an ASCII text file) in the format required by http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API

Then you can restore it with the bulk document upload or with the couchdb-dump restore tool included in the package.

This is the link to the tool: https://github.com/animamea/couchdb-dump

But you can find other tools also:

https://github.com/stockr-labs/couchdbdump

https://github.com/zebooka/couchdb-dump

Daniele B
10

You can use the following command line to convert the output of the curl command to the “docs” structure that the _bulk_docs requires:

curl -X GET 'http://localhost:5984/mydatabase/_all_docs?include_docs=true' | jq '{"docs": [.rows[].doc]}' | jq 'del(.docs[]._rev)' > db.json

jq is an excellent command-line JSON processor, very useful in situations like this one.
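As a sketch of the same transformation in reusable form, the two jq passes can be folded into a single filter. The function name to_bulk_docs is my own placeholder; it assumes jq is installed and that its input is the response of _all_docs?include_docs=true.

```shell
# Hypothetical helper: read an _all_docs?include_docs=true response on stdin
# and print a {"docs": [...]} payload for _bulk_docs, with each _rev removed.
to_bulk_docs() {
  jq -c '{"docs": [.rows[].doc | del(._rev)]}'
}

# Usage (assumes a CouchDB server listening on localhost:5984):
# curl -X GET 'http://localhost:5984/mydatabase/_all_docs?include_docs=true' | to_bulk_docs > db.json
```

Dropping _rev means the documents are loaded as new documents on the target server, so their revision history is not carried over.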

Hope it helps.

Roberto
  • Your example using jq was very helpful, thank you! I also found one of my files was too large to load, and chunking using jq did the trick: `cat db.json | jq '{"docs": .docs[0:30000]}' > db.0-30000.json; cat db.json | jq '{"docs": .docs[30000:]}' > db.30000-.json` – Michael Allan Jackson Jun 15 '16 at 15:14
  • Is there a way to include the revisions? – Michael Allan Jackson Jun 15 '16 at 21:51
  • Yes, of course, just drop the `| jq 'del(.docs[]._rev)'` part that removes the revisions. Is that what you need? Note that you won't be able to restore the file if you include the revisions, AFAIK. – Roberto Jun 16 '16 at 18:40
  • I tried that and many of the records get an error `"error":"conflict","reason":"Document update conflict."` as described here: https://wiki.apache.org/couchdb/HTTP_Bulk_Document_API . I also tried adding `"new_edits": false` field described in that link, but in that case zero records were loaded without error. – Michael Allan Jackson Jun 16 '16 at 18:56
  • Sorry Michael, I didn't realize you were talking about the import, not the export. That's exactly why I removed the `_rev` fields in the first place: because of the document conflict. I don't know much about how MVCC (multiversion concurrency control) is handled in CouchDB, so I can't help much here, as I have never needed to store more than the current version of the documents. Regards – Roberto Jun 18 '16 at 11:44
7

The reason for your actual error is that you're using typographic quotes (“ ”) instead of plain ASCII " around your -H argument on the command line.

However, the real solution here is to just copy the <DATABASE_NAME>.couch file from the /path/to/var/lib/couchdb directory from one server to the other.

smathy
  • I tried to copy the *.couch files first, but when trying to access the database entries, I received an error saying that the database was built with the wrong Erlang version. – eriq Jul 25 '12 at 17:06
  • Sure! How do I accept? Nevertheless, it was a good and correct answer. I'm obviously not too familiar with this system. – eriq Aug 13 '12 at 22:11
2

As an alternative solution, you may use the couchdb-load and couchdb-dump utilities from the couchdb-python project.

Jonathan Hall
Kxepal
2

Nolan from the PouchDB team makes some great tools. These will work well to dump and load from CouchDB (including attachments):

Dump/Backup:

https://github.com/nolanlawson/pouchdb-dump-cli

Load/Restore:

https://github.com/nolanlawson/pouchdb-load

PaulMest
1

There's also github.com/danielebailo/couchdb-dump, which might help to clean out old transients. The authors state:

We've seen 15GB database files, containing only 2.1GB of raw JSON, reduced to 2.5GB on disk after import!

There are also hints on how update_seq works if you want to sync clusters.