
Either I'm really close, or I don't understand what I'm doing...

We use Firestore in a project, and I was writing a custom script to export all the data. There is also an option to export it via the gcloud CLI, which is great, but the exported data is binary and appears to be protocol buffers.

I exported the data via the gcloud CLI like this:

gcloud beta firestore export gs://[BUCKET_NAME]

This takes your default project and exports the data to the bucket you provide. The result inside the bucket is:

- timestamp folder
  - timestamp.overall_export_metadata file
  - all_namespaces folder
    - all_kinds folder
      - all_namespaces_all_kinds.export_metadata file
      - output-0 file
      - output-1 file
      - ... up to the output-250 file

I downloaded the proto files from https://github.com/googleapis/googleapis and tried to decode the outputs with:

protoc --decode google.firestore.v1beta1.Document ./google/firestore/v1beta1/document.proto < output-0

This (and the same command with any other output file) resulted in:

Failed to parse input.

What did work were the following two commands:

protoc --decode google.firestore.admin.v1beta1.ExportDocumentsMetadata ./google/firestore/admin/v1beta1/firestore_admin.proto < all_namespaces_all_kinds.export_metadata

resulted in:

start_time {
  nanos: 629329010
  1: "export_entities"
  3: 1565426104489599
}
end_time {
  1: "__all__"
  2: "output-0"
  2: "output-1"
  2: "output-2"
  ... etc, 2 stays the key, output-{number} changes until
  3 {
    1: "__all__"
  }
}

and

protoc --decode_raw < all_namespaces_all_kinds.export_metadata

resulted in

1 {
  1: "export_entities"
  2: 1565426014407794
  3: 1565426104489599
}
2 {
  1: "__all__"
  2: "output-0"
  ... same as before until the 
  3 {
    1: "__all__"
  }
}
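A quick sanity check on the two outputs above, as a minimal Python sketch. It rests on two assumptions that nothing in this post confirms: that raw fields 2 and 3 of the first message are microseconds since the Unix epoch, and that the typed decode only appeared to work. What can be checked directly is that the `nanos: 629329010` printed by the typed decode equals the low 32 bits of raw field 2, which suggests the metadata file was being reinterpreted rather than genuinely matching ExportDocumentsMetadata. The helper name `from_micros` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Values copied from the --decode_raw output above (message 1, fields 2 and 3).
RAW_FIELD_2 = 1565426014407794
RAW_FIELD_3 = 1565426104489599


def from_micros(value: int) -> datetime:
    """Interpret an integer as microseconds since the Unix epoch (assumption)."""
    return EPOCH + timedelta(microseconds=value)


# The typed decode printed nanos: 629329010. That is exactly raw field 2
# truncated to 32 bits, so the typed output looks like a schema mismatch.
truncated = RAW_FIELD_2 & 0xFFFFFFFF
print(truncated)

start = from_micros(RAW_FIELD_2)
end = from_micros(RAW_FIELD_3)
print(start.isoformat())
print(end.isoformat())
print((end - start).total_seconds(), "seconds between the two values")
```

If the microsecond reading is right, the two values are about 90 seconds apart on the day of the export, which would fit a start and end time for the `export_entities` job.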

I'm missing a piece of the puzzle: I have the feeling I need to combine all the outputs with the metadata to be able to decode them, but I'm lost on how.

Dennis
  • This is a gateway to loading the Firestore export to the local emulator, I bet. – deepelement Nov 01 '19 at 16:10
  • I found myself up against the same issue. I think there are a couple of missing pieces of the puzzle: (1) I expect there is an additional method used somewhere to convert exports downloaded from storage to their proper protobuf format (a number of files I have can't even be decoded with decode_raw). (2) Files that represent a stream of data (e.g. the output files) probably need to be read in chunks. A good effort has been made to do this here: https://github.com/labbots/firestore-export-json – chrismclarke Aug 01 '21 at 21:25
  • At least now firestore has a local emulator that can process these files for local querying instead – chrismclarke Aug 01 '21 at 21:25
  • The output-0 file is in LevelDB log format, not a normal binary file. https://firebase.google.com/docs/firestore/manage-data/export-import#export_format_and_metadata_files – Andy May 19 '23 at 11:35
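Following up on the LevelDB log format hint in the last comment, here is a minimal Python sketch of how such a file can be split into records. This would explain why feeding output-0 straight to protoc fails: the file is a sequence of 32 KiB blocks, each holding framed fragments with a 7-byte header (4-byte checksum, 2-byte little-endian length, 1-byte type), not one bare message. The function name `read_records` is made up for illustration, the sketch does not verify the CRC32C checksum, and which protobuf message type each record payload holds is not established anywhere in this thread, so decoding the payloads remains an open step.

```python
import struct
from typing import BinaryIO, Iterator

BLOCK_SIZE = 32768   # LevelDB log files are organised in 32 KiB blocks
HEADER_SIZE = 7      # 4-byte checksum + 2-byte length + 1-byte record type
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each logical record from a stream in LevelDB log format.

    Checksums are read but NOT verified in this sketch.
    """
    pending = bytearray()
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            return
        pos = 0
        # A block tail too short to hold a header is zero padding.
        while pos + HEADER_SIZE <= len(block):
            _crc, length, rtype = struct.unpack_from("<IHB", block, pos)
            pos += HEADER_SIZE
            if rtype == 0 and length == 0:
                break  # zero-filled remainder of the block
            payload = block[pos:pos + length]
            pos += length
            if rtype == FULL:
                yield bytes(payload)
            elif rtype == FIRST:
                pending = bytearray(payload)
            elif rtype == MIDDLE:
                pending += payload
            elif rtype == LAST:
                pending += payload
                yield bytes(pending)
                pending = bytearray()
            else:
                raise ValueError(f"unexpected record type {rtype}")


# Hypothetical usage against a downloaded export file:
# with open("output-0", "rb") as f:
#     for record in read_records(f):
#         print(len(record))
```

Each yielded record could then be written to its own file and inspected with `protoc --decode_raw` to see whether it parses as a single message, which is a cheaper check than guessing a message type up front.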
