
I have the following scenario: one Apache Kafka topic into which multiple types of events are pushed. Druid will pick up from this topic and aggregate based on timestamp.

Say, for example, the messages below are in the Kafka topic.

Type 1:

{"timestamp" : "07-08-2016", "service" : "signup", "no_of_events" : 8}

{"timestamp" : "08-08-2016", "service" : "signup", "no_of_events" : 10}

Type 2:

{"timestamp" : "08-08-2016", "user" : "xyz", "no_of_events" : 3}

{"timestamp" : "08-08-2016", "user" : "abc", "no_of_events" : 2}

Q1: Can I write two parsers within the same spec file pointing to the events from the same topic? If yes, what will be the structure of the spec file?

Any other suggestion on the design is welcome :)

Q2: Also, to understand better, is it possible to have multiple datasources within the same spec file?

Thanks in advance!!

Krishna

1 Answer


Q2: Yes, you can definitely have two datasources within the same spec file. Just list them as an array under the "dataSources" attribute:

"dataSources" : [
  {
  "spec" : {
    "dataSchema" : {
      "dataSource" : "Data Souce1"
      ...other stuff
      }
    }
  },
  {
  "spec" : {
    "dataSchema" : {
      "dataSource" : "Data Source 2"
      ...other stuff
    }
  },

Q1: Do you want two different datasources pointing to the same Kafka topic? I haven't tried it, but I'm pretty sure you can do that - the topic is specified within the "properties" section of each datasource entry:

"dataSources" : [
  {
  "spec" : {
    "dataSchema" : {
      "dataSource" : "Data Souce1"
      ...other stuff
      }
    }
    "properties" : {
        "topicPattern.priority" : "1",
        "topicPattern" : "kafka_topic"
    }
  },
{
  "spec" : {
    "dataSchema" : {
      "dataSource" : "Data Souce1"
      ...other stuff
      }
    }
    "properties" : {
        "topicPattern.priority" : "1",
        "topicPattern" : "kafka_topic"
    }
}
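To tie this back to Q1: since each entry in "dataSources" has its own "dataSchema", each one can also carry its own parser. Below is a rough, untested sketch of what that could look like - the datasource names ("signup_events", "user_events"), the metric name "total_events", the granularities and the timestamp format "dd-MM-yyyy" are all just guesses based on your sample messages, so adjust them to your setup:

"dataSources" : [
  {
    "spec" : {
      "dataSchema" : {
        "dataSource" : "signup_events",
        "parser" : {
          "type" : "string",
          "parseSpec" : {
            "format" : "json",
            "timestampSpec" : { "column" : "timestamp", "format" : "dd-MM-yyyy" },
            "dimensionsSpec" : { "dimensions" : ["service"] }
          }
        },
        "metricsSpec" : [
          { "type" : "longSum", "name" : "total_events", "fieldName" : "no_of_events" }
        ],
        "granularitySpec" : { "type" : "uniform", "segmentGranularity" : "DAY", "queryGranularity" : "DAY" }
      }
    },
    "properties" : {
      "topicPattern.priority" : "1",
      "topicPattern" : "kafka_topic"
    }
  },
  {
    "spec" : {
      "dataSchema" : {
        "dataSource" : "user_events",
        "parser" : {
          "type" : "string",
          "parseSpec" : {
            "format" : "json",
            "timestampSpec" : { "column" : "timestamp", "format" : "dd-MM-yyyy" },
            "dimensionsSpec" : { "dimensions" : ["user"] }
          }
        },
        "metricsSpec" : [
          { "type" : "longSum", "name" : "total_events", "fieldName" : "no_of_events" }
        ],
        "granularitySpec" : { "type" : "uniform", "segmentGranularity" : "DAY", "queryGranularity" : "DAY" }
      }
    },
    "properties" : {
      "topicPattern.priority" : "1",
      "topicPattern" : "kafka_topic"
    }
  }
]

One thing to keep in mind: both datasources will read every message from the topic, so each will also ingest rows of the other event type (with the fields it doesn't list just coming through as null). If that's a problem, you'd probably want to filter upstream, e.g. by splitting the two event types into separate topics.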
Simon D