NebulaGraph Database: How to write data with spark-connector in pyspark?

Question

The connector's examples show how to write data in Scala. Is there a way to write data to NebulaGraph from Python (pyspark)? This is how I start pyspark and read data:

/spark/bin/pyspark --driver-class-path nebula-spark-connector-3.0.0.jar --jars nebula-spark-connector-3.0.0.jar

df = spark.read.format(
  "com.vesoft.nebula.connector.NebulaDataSource").option(
    "type", "vertex").option(
    "spaceName", "basketballplayer").option(
    "label", "player").option(
    "returnCols", "name,age").option(
    "metaAddress", "metad0:9559").option(
    "partitionNumber", 1).load()

score 1 · Accepted Answer · answered Oct 02 '22 at 15:19

It looks like pyspark is already supported by nebula-spark-connector. The related request was addressed and closed in GitHub issue #19.

If you search for "pyspark" in the project's README on GitHub, you'll find some examples.
Just make sure the path to the spark-connector jar file is set in SparkConf before starting your Spark application.
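
As a rough sketch of the SparkConf step, assuming the jar sits in the working directory under the same name used in the question: `spark.jars` and `spark.driver.extraClassPath` are the standard Spark settings behind `--jars` and `--driver-class-path`. The helper names `connector_conf` and `build_session`, the app name, and the jar path are made up for illustration and are not part of the connector.

```python
# Hypothetical path; point this at wherever the connector jar actually lives.
CONNECTOR_JAR = "nebula-spark-connector-3.0.0.jar"


def connector_conf(jar_path):
    """Return Spark settings that put the connector jar on the classpath."""
    return {
        "spark.jars": jar_path,                   # same role as --jars
        "spark.driver.extraClassPath": jar_path,  # same role as --driver-class-path
    }


def build_session(jar_path=CONNECTOR_JAR, app_name="nebula-write"):
    """Create a SparkSession with the connector jar configured up front."""
    # Imported lazily so connector_conf() can be used without Spark installed.
    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    conf = SparkConf().setAppName(app_name)
    conf.setAll(list(connector_conf(jar_path).items()))
    return SparkSession.builder.config(conf=conf).getOrCreate()
```

Calling `spark = build_session()` at the top of a script should then stand in for the `--jars` flags, provided no Spark session is already running in that process.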

An example taken from the README:

df.write.format("com.vesoft.nebula.connector.NebulaDataSource").option(
    "type", "vertex").option(
    "spaceName", "basketballplayer").option(
    "label", "player").option(
    "vidPolicy", "").option(
    "vertexField", "_vertexId").option(
    "batch", 1).option(
    "metaAddress", "metad0:9559").option(
    "graphAddress", "graphd1:9669").option(
    "passwd", "nebula").option(
    "user", "root").save()
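
If the same write is needed in several places, the options can be kept in a dict and applied in a loop. This is only a convenience wrapper around the README example above: the option names and values are copied from it, while `VERTEX_WRITE_OPTIONS` and `write_vertices` are hypothetical names, and the addresses and credentials are the sample ones that need to match your own cluster.

```python
NEBULA_SOURCE = "com.vesoft.nebula.connector.NebulaDataSource"

# Option names and values copied from the README example; adjust for your cluster.
VERTEX_WRITE_OPTIONS = {
    "type": "vertex",
    "spaceName": "basketballplayer",
    "label": "player",
    "vidPolicy": "",
    "vertexField": "_vertexId",  # DataFrame column holding the vertex ID
    "batch": 1,
    "metaAddress": "metad0:9559",
    "graphAddress": "graphd1:9669",
    "passwd": "nebula",
    "user": "root",
}


def write_vertices(df, options=VERTEX_WRITE_OPTIONS):
    """Write a DataFrame to NebulaGraph as vertices using the given options."""
    writer = df.write.format(NEBULA_SOURCE)
    for key, value in options.items():
        writer = writer.option(key, value)
    writer.save()
```

The DataFrame passed in needs a `_vertexId` column plus the tag's properties (presumably `name` and `age` for `player`, going by the read example in the question), for instance one built with `spark.createDataFrame(rows, ["_vertexId", "name", "age"])`.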
