3

We are trying to send data from MySQL to elastic(ETL) though Kafka.

In MySQL we have multiple tables which we need to aggregate in specific format than we can send it to elastic search.

For that we used debezium to connect with Mysql and elastic and transformed data through ksql.

we have created streams for both the tables then partition them and create table of one entity but after joining we dint get the data from both the tables.

we are trying to join two tables of Mysql through Ksql and send it to elastic search using debezium.

Table 1: items

table 2 : item_images

CREATE STREAM items_from_debezium (id integer, tenant_id integer, name string, sku string, barcode string, qty integer, type integer, archived integer) 
WITH (KAFKA_TOPIC='mySqlTesting.orderhive.items',VALUE_FORMAT='json');


CREATE STREAM images_from_debezium (id integer,item_id  integer,image string, thumbnail string) 
WITH (KAFKA_TOPIC='mySqlTesting.orderhive.item_images',VALUE_FORMAT='json');


CREATE STREAM items_flat 
WITH (KAFKA_TOPIC='ITEMS_REPART',VALUE_FORMAT='json',PARTITIONS=1) as SELECT * FROM items_from_debezium PARTITION BY id;

CREATE STREAM images_flat 
WITH (KAFKA_TOPIC='IMAGES_REPART',VALUE_FORMAT='json',PARTITIONS=1) as SELECT * FROM images_from_debezium PARTITION BY item_id;

CREATE TABLE item_images (id integer,item_id  integer,image string, thumbnail string) 
WITH (KAFKA_TOPIC='IMAGES_REPART',VALUE_FORMAT='json',KEY='item_id');


SELECT item_images.id,item_images.image,item_images.thumbnail,items_flat.id,items_flat.name,items_flat.sku,items_flat.barcode,items_flat.type,items_flat.archived,items_flat.qty   
FROM items_flat left join item_images on items_flat.id=item_images.item_id 
limit 10;

We are expecting data of both the tables but we are getting null from item_images table.

Matthias J. Sax
  • 59,682
  • 7
  • 117
  • 137
Rahul Kumawat
  • 165
  • 1
  • 11
  • Hi we have also been facing the same issue, in which the table columns in join are null. – Raj Saraogi Aug 28 '19 at 08:18
  • 2
    Is it possible this is a timing issue where one topic may have the data and the other is still in process and that causes your situation where one side of the join is empty? – Naros Sep 03 '19 at 13:48
  • If you run a `SELECT * FROM items_flat LIMIT 5;` and `SELECT * FROM item_images LIMIT 5;` what results do you get? – Robin Moffatt Sep 03 '19 at 13:50
  • also can you edit your question to clarify what the actual result of your `SELECT` with the `JOIN` in is—do you get _no rows_, or do you get _only the rows from `items_flat` and no matches (nulls) from `item_images`_ ? – Robin Moffatt Sep 03 '19 at 13:51

0 Answers0