I'm trying to process a considerable number of records using Cloud Dataflow. My source is Google Cloud Storage and my sink is Cloud SQL (MySQL). I have the following code to write to the sink:

    p.apply()
    ....
    .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
        "com.mysql.cj.jdbc.Driver",
        "jdbc:mysql://google/<DBNAME>?cloudSqlInstance=<INSTANCE_NAME>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<USERNAME>&password=<PASSWORD>&useSSL=false"
    ))
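
For reference, here is a minimal sketch of how a connection URL of the shape above could be assembled from its parts. The class name `CloudSqlJdbcUrl` and the method `build` are hypothetical helpers made up for illustration, not part of Beam or the Cloud SQL socket factory. It percent-encodes the user name and password, on the assumption that the driver decodes such values, so that characters like `&` or `@` in a password do not break the query string. It needs Java 10 or later for the `URLEncoder.encode(String, Charset)` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Beam or Cloud SQL API): builds the JDBC URL
// used in the question from its individual parts.
class CloudSqlJdbcUrl {

    // Socket factory class name taken verbatim from the URL in the question.
    private static final String SOCKET_FACTORY =
            "com.google.cloud.sql.mysql.SocketFactory";

    static String build(String dbName, String instanceName,
                        String user, String password) {
        // Only user and password are encoded; the database and instance
        // names are passed through unchanged, as in the original URL.
        return "jdbc:mysql://google/" + dbName
                + "?cloudSqlInstance=" + instanceName
                + "&socketFactory=" + SOCKET_FACTORY
                + "&user=" + encode(user)
                + "&password=" + encode(password)
                + "&useSSL=false";
    }

    // Percent-encodes a query-string value (assumption: the driver decodes it).
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values, purely illustrative.
        System.out.println(build("mydb", "proj:region:inst", "alice", "p@ss&word"));
    }
}
```

The resulting string can be passed as the second argument to `JdbcIO.DataSourceConfiguration.create(...)` in place of the hand-written literal. This only tidies up URL construction and is not claimed to be the fix for the exception described below.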

The above works fine when I run the pipeline with DirectRunner, but it throws a NullPointerException when run with DataflowRunner. The exception is as follows:

java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn.executeBatch(JdbcIO.java:775)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn.finishBundle(JdbcIO.java:755)

I tried Beam versions 2.15.0 and 2.16.0, and both fail. Why does this happen, and how can I make it work with DataflowRunner?

bigbounty

1 Answer

I have resolved this. The NullPointerException has a different cause, inside finishBundle.

bigbounty