
I am working on a use case where I have to import external Kafka topic metadata into Apache Atlas. I have a few questions, listed below:

  1. Is it possible to import topic metadata from an external Kafka cluster, i.e. one that is not used for Atlas notifications? If so, how?
  2. How can Kafka metadata updates be made automatic, similar to Hive or HBase, instead of manually running the import script every time?
  3. There is no lineage data for the imported topics. In what cases is lineage data captured for a topic?
  4. Since there is only one Kafka-related entity type, "kafka_topic", will there be no relationship data at all?
  5. In what cases are audits captured for the topics?
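Regarding the first question, besides the bundled import script, topics can also be registered directly through the Atlas v2 REST API, which works against any Kafka cluster you can describe. The sketch below builds a `kafka_topic` entity payload in Python; the Atlas URL, the cluster name, and the helper function names are assumptions for illustration, and real deployments would also need authentication.

```python
# Sketch: registering an external Kafka topic in Atlas via the v2 REST API.
# Assumptions (not from the question): Atlas runs at ATLAS_URL, and the
# kafka_topic qualifiedName follows the common "<topic>@<cluster>" convention.
import json
import urllib.request

ATLAS_URL = "http://localhost:21000"  # hypothetical Atlas endpoint


def build_kafka_topic_entity(topic: str, cluster: str) -> dict:
    """Build the JSON entity payload Atlas expects for a kafka_topic."""
    return {
        "entity": {
            "typeName": "kafka_topic",
            "attributes": {
                "qualifiedName": f"{topic}@{cluster}",
                "name": topic,
                "topic": topic,
                "uri": topic,
            },
        }
    }


def register_topic(topic: str, cluster: str) -> None:
    """POST the entity to Atlas (network call; illustration only)."""
    payload = json.dumps(build_kafka_topic_entity(topic, cluster)).encode()
    req = urllib.request.Request(
        f"{ATLAS_URL}/api/atlas/v2/entity",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)  # real usage needs auth and error handling


entity = build_kafka_topic_entity("orders", "external-kafka")
print(entity["entity"]["attributes"]["qualifiedName"])  # orders@external-kafka
```

A scheduled job calling something like `register_topic` for every topic returned by the Kafka admin API would also be one way to approach question 2, since Atlas only updates Kafka metadata when something pushes it.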
Mangai

1 Answer


I'm also working on something similar with external Kafka topics and Atlas, and have almost the same questions as you.

To your 3rd question, I think part of the reason there is no Kafka topic lineage graph is that Kafka is simply a messaging bus. Kafka messages are immutable, so there is no DML as in HBase or Hive, even though in HBase rows are updated by "version" on the same row key.

Each Kafka topic has a retention period setting, by default 7 days, after which expired messages are removed from the log regardless of whether they have been consumed. Given that, there is little value in monitoring "deleted" messages.
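The retention behaviour described above can be sketched numerically: the 7-day default corresponds to Kafka's `retention.ms` of 604800000 milliseconds, and expiry depends only on record age, not on consumption. The helper below is hypothetical, purely to illustrate the rule.

```python
# Sketch of Kafka's time-based retention: a record older than retention.ms
# becomes eligible for deletion whether or not it was ever consumed.
DEFAULT_RETENTION_MS = 7 * 24 * 60 * 60 * 1000  # 604800000 ms = Kafka's default retention.ms


def is_expired(record_timestamp_ms: int, now_ms: int,
               retention_ms: int = DEFAULT_RETENTION_MS) -> bool:
    """True once the record has fallen out of the retention window."""
    return now_ms - record_timestamp_ms > retention_ms


now = 10 * 24 * 60 * 60 * 1000     # pretend "now" is day 10 since epoch
print(is_expired(0, now))          # True: written on day 0, past the 7-day window
print(is_expired(now - 1000, now)) # False: written one second ago
```

Nothing in this check involves whether a consumer ever read the record, which is exactly why lineage through "deleted" messages isn't very meaningful.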

After all, Kafka's main role is that of a messaging vehicle, delivering messages from source to destination. It can cache messages temporarily, but it is not the same as a database. I'm not very positive about using a shipping carrier to do a warehouse's job.

D Emma