
Is it possible to sink processed stream data into a database using pyflink? All the methods for writing processed data seem limited to saving it in txt, csv, or JSON format, and there appears to be no way to sink the data to a database.

  • For Java you can use the JDBC connector: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/jdbc.html#jdbc-connector . I am not sure about Python; you could use the data stream API for Python (https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html) and try to implement your own sink. – Felipe Jul 27 '20 at 09:17
  • Could you please share a solution for implementing a custom sink? – hamid darabian Jul 27 '20 at 09:21
  • I have never worked with Flink using the Python API, but I would start with `write_to_socket()`, which is a sink (https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/python.html#data-sinks). I am not sure which version you are using and whether this method has changed in your version. – Felipe Jul 27 '20 at 10:01

1 Answer


You could use SQL DDL within pyflink to define a JDBC table sink that you can then insert into. That will look something like this:

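# Note: the 'jdbc' connector is not bundled with the Flink distribution; the
# flink-connector-jdbc jar and a MySQL driver need to be on the classpath.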
my_sink_ddl = """
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
   'connector' = 'jdbc',
   'url' = 'jdbc:mysql://localhost:3306/mydatabase',
   'table-name' = 'users'
)
"""

t_env.sql_update(my_sink_ddl)
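
To actually populate that sink you also need a source table and an INSERT INTO statement, plus (with the sql_update-style API shown above) a final call that submits the job. A minimal sketch, assuming t_env is a streaming TableEnvironment; MySourceTable, the datagen source, and the job name here are just placeholders for your real pipeline:

# Hypothetical source; the 'datagen' connector only produces random rows for illustration.
t_env.sql_update("""
CREATE TABLE MySourceTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN
) WITH (
  'connector' = 'datagen'
)
""")

# Write the source into the JDBC sink defined above.
t_env.sql_update(
    "INSERT INTO MyUserTable SELECT id, name, age, status FROM MySourceTable"
)

# sql_update only declares the statements; the job is submitted when execute() is called.
t_env.execute("write_users_to_mysql")

Note that sql_update and TableEnvironment.execute were deprecated in Flink 1.11; on newer PyFlink versions the equivalent is t_env.execute_sql(...), and an INSERT INTO run through execute_sql submits the job directly.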
David Anderson
  • I don't understand the point about "where they should be placed". A sink is always a terminal node in the dataflow DAG. But no, there is no Flink connector for MongoDB. – David Anderson Jul 31 '20 at 06:11