Assuming I have a database like the one in the quick-start at https://graphql.dgraph.io/docs/quick-start/,
i.e.
type Product {
    productID: ID!
    name: String @search(by: [term])
    reviews: [Review] @hasInverse(field: about)
}
type Customer {
    custID: ID!
    name: String @search(by: [hash, regexp])
    reviews: [Review] @hasInverse(field: by)
}
type Review {
    id: ID!
    about: Product! @hasInverse(field: reviews)
    by: Customer! @hasInverse(field: reviews)
    comment: String @search(by: [fulltext])
    rating: Int @search
}
Now I would like to import millions of entries, and therefore would like to use the bulk loader. My dataset is a big folder full of .json files.
From what I've seen, I should be able to run a command like
dgraph bulk -f folderOfJsonFiles -s goldendata.schema --map_shards=4 --reduce_shards=2 --http localhost:8000 --zero=localhost:5080
But to run my server, I am using the dgraph/standalone:graphql image, started with
docker run -v $(pwd):/dgraph -p 9000:9000 -it dgraph/standalone:graphql
Now, how do I start the bulk import?
1: Should I run the command within the Docker container itself (and share the volume (folder) containing all my .json files), or install dgraph on my host and run the dgraph bulk command from the host?
2: What should be the format of the .json files? (See my guess in the sketch after these questions.)
3: Would the bulk loader support blank nodes (IDs which are not _:0x1234)?
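
For reference on question 2, here is a minimal sketch of what I currently expect one of my .json files to contain, assuming the bulk loader takes the same JSON shape as Dgraph's live loader and that the GraphQL layer stores fields under Type.field predicate names (the _:p1 / _:c1 / _:r1 blank nodes and the sample values are placeholders of mine, not anything from the docs):

[
  {
    "uid": "_:p1",
    "dgraph.type": "Product",
    "Product.name": "Dgraph quick-start widget"
  },
  {
    "uid": "_:c1",
    "dgraph.type": "Customer",
    "Customer.name": "Alice"
  },
  {
    "uid": "_:r1",
    "dgraph.type": "Review",
    "Review.about": { "uid": "_:p1" },
    "Review.by": { "uid": "_:c1" },
    "Review.comment": "Works great, easy to set up.",
    "Review.rating": 5
  }
]

If that shape is right, it would also touch on question 3, since the links between the three objects are expressed purely through blank nodes.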
[edit]
- The bulk loader does not seem to support a GraphQL schema; the schema has to be converted to RDF first. To achieve this, I exported the schema and data right after importing the GraphQL schema:
curl 'localhost:8080/admin/export?format=json'
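
If that is the way to go, my plan (untested, a sketch only) would be to run the bulk loader against the exported schema, e.g. from inside the container via docker exec. The container name, the export path under /dgraph, and the g01.schema file name are assumptions on my part; also, as far as I understand, the bulk loader is meant to run offline (only Zero up, no Alpha), which the standalone image may not allow since it auto-starts Alpha:

# assuming the exported schema was gunzipped first;
# <container-name> and the paths below are placeholders
docker exec -it <container-name> dgraph bulk \
  -f /dgraph/folderOfJsonFiles \
  -s /dgraph/export/g01.schema \
  --format=json \
  --map_shards=4 --reduce_shards=2 \
  --http localhost:8000 --zero=localhost:5080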