
Is it possible to sink processed stream data into a database using pyflink? All the methods for writing processed data seem limited to saving it in txt, csv, or JSON format, and there appears to be no way to sink the data to a database.

  • For Java you can use the JDBC connector: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/jdbc.html#jdbc-connector . I am not sure about Python; you could use the data stream API for Python (https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html) and try to implement your own sink. – Felipe Jul 27 '20 at 09:17
  • Could you please share a solution for implementing a custom sink? – hamid darabian Jul 27 '20 at 09:21
  • I have never worked with Flink using the Python API, but I would start with `write_to_socket()`, which is a sink (https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/python.html#data-sinks). I am not sure which version you are using and whether this method has changed in your version. – Felipe Jul 27 '20 at 10:01

1 Answer


You could use SQL DDL within pyflink to define a JDBC table sink that you can then insert into. That will look something like this:

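# Note: the 'jdbc' connector is not bundled with the Flink distribution; the
# flink-connector-jdbc jar and a MySQL driver need to be on the classpath.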
my_sink_ddl = """
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
   'connector' = 'jdbc',
   'url' = 'jdbc:mysql://localhost:3306/mydatabase',
   'table-name' = 'users'
)
"""

t_env.sql_update(my_sink_ddl)
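
To actually populate that sink you also need a source table and an INSERT INTO statement, plus (with the sql_update-style API shown above) a final call that submits the job. A minimal sketch, assuming t_env is a streaming TableEnvironment; MySourceTable, the datagen source, and the job name here are just placeholders for your real pipeline:

# Hypothetical source; the 'datagen' connector only produces random rows for illustration.
t_env.sql_update("""
CREATE TABLE MySourceTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN
) WITH (
  'connector' = 'datagen'
)
""")

# Write the source into the JDBC sink defined above.
t_env.sql_update(
    "INSERT INTO MyUserTable SELECT id, name, age, status FROM MySourceTable"
)

# sql_update only declares the statements; the job is submitted when execute() is called.
t_env.execute("write_users_to_mysql")

Note that sql_update and TableEnvironment.execute were deprecated in Flink 1.11; on newer PyFlink versions the equivalent is t_env.execute_sql(...), and an INSERT INTO run through execute_sql submits the job directly.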
David Anderson
  • I don't understand the point about "where they should be placed". A sink is always a terminal node in the dataflow DAG. But no, there is no Flink connector for MongoDB. – David Anderson Jul 31 '20 at 06:11