We have two applications:

  1. App1 creates two tables, sourceTable and targetTable1, both backed by Kafka topics. It runs select * from sourceTable and writes the result to targetTable1.
  2. App2 creates sourceTable2 on the Kafka topic of targetTable1 created above. It then creates another table, targetTable2, and runs the same kind of query, i.e. select * from targetTable1 into targetTable2.

These are written as SQL queries in Zeppelin notebooks in AWS Kinesis Data Analytics. We build and deploy through Zeppelin and it works fine.

After that, we take the Python code from the build package in S3 and do the build and deployment through our CI/CD. App1 works fine, but App2 throws an error saying object targetTable1 is not found. This table is created in another app. Not sure why it works when deployed through Zeppelin but not through our CI/CD. Any ideas?

We tried creating targetTable1 again in App2 and redeploying, and it worked fine. The problem occurs only when we have to access a table created in another app.

Kapil More

1 Answer

In the case where you are deploying app1 and app2, targetTable1 was defined by app1 and stored in its temporary, in-memory catalog. app2 doesn't have access to this catalog.

You could store targetTable1 in a persistent, external catalog (I believe Zeppelin uses Hive for this), or you could have app2 also create targetTable1. This is workable because targetTable1 is just a description of how Flink SQL should interpret the data in some underlying Kafka topic. Both app1 and app2 need this metadata describing how to work with the external data coming from Kafka. They can either share this description through an external catalog, or they can each have their own copy of the same table description.
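To make the second option concrete, here is a minimal sketch of app2 carrying its own copy of the targetTable1 definition. Everything specific in it is an assumption for illustration: the topic names, broker address, consumer group, column list, and JSON format are placeholders and would need to match whatever app1 actually declares. The helper `kafka_table_ddl` and the function `run_app2` are hypothetical names, not part of Flink.

```python
# Sketch only: topic names, broker address, group id, columns, and format
# are placeholders, not taken from the original applications.

def kafka_table_ddl(table_name, topic, columns, bootstrap_servers, group_id):
    """Build a Flink SQL CREATE TABLE statement for a Kafka-backed table."""
    column_sql = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} (\n  {column_sql}\n) WITH (\n"
        f"  'connector' = 'kafka',\n"
        f"  'topic' = '{topic}',\n"
        f"  'properties.bootstrap.servers' = '{bootstrap_servers}',\n"
        f"  'properties.group.id' = '{group_id}',\n"
        f"  'scan.startup.mode' = 'earliest-offset',\n"
        f"  'format' = 'json'\n"
        f")"
    )


# Assumed schema; it must be the same schema app1 uses for targetTable1.
COLUMNS = [("id", "BIGINT"), ("payload", "STRING")]
BROKERS = "broker-1:9092"  # placeholder address

APP2_STATEMENTS = [
    # app2 re-declares targetTable1 over the same topic that app1 writes to.
    kafka_table_ddl("targetTable1", "target-topic-1", COLUMNS, BROKERS, "app2"),
    kafka_table_ddl("targetTable2", "target-topic-2", COLUMNS, BROKERS, "app2"),
    "INSERT INTO targetTable2 SELECT * FROM targetTable1",
]


def run_app2(statements):
    """Execute the statements in app2's own in-memory catalog."""
    # Imported here so the DDL above can be inspected without PyFlink installed.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    for statement in statements:
        t_env.execute_sql(statement)
```

Calling `run_app2(APP2_STATEMENTS)` from app2's entry point would register both tables in app2's own catalog before the INSERT runs, so the job no longer depends on anything app1 defined in memory.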

David Anderson