I use DSE Graph Loader to read input files from the Hadoop Distributed File System (HDFS).
I would like to insert the data into a DSE Graph cluster (running on multiple machines) in a distributed way. How can this be done?
The DSE Graph Loader is a command-line utility that supports loading data from many sources, including CSV, text, JSON, Gryo, HDFS and AWS S3. It cannot be run as a Hadoop/Spark job.
To parallelize the ingest with multiple threads, set the load_threads parameter (default 1). Documentation can be found here: Configuring DSE Graph Loader
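As a rough sketch, an invocation with more loading threads might look like the following. The mapping script name, graph name, node address and thread count are placeholders I made up, and the exact option spelling can differ between DSE versions, so check it against the documentation for your release.

```shell
# Hypothetical values: mapping.groovy, mygraph, 10.0.0.1 and 8 are placeholders.
# -load_threads is the parameter named above; verify its spelling for your version.
graphloader ./mapping.groovy -graph mygraph -address 10.0.0.1 -load_threads 8
```

The loader still runs as a single client process. Only the writes are spread over threads, and the DSE cluster distributes the data across its nodes.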