
I'm setting up a pipeline in NiFi where I receive JSON records, which I then use to make a request to an API. The response contains both numeric and textual data, which I then have to write to Hive. I use InferAvroSchema to infer the schema. Some numeric values are signed, such as -2.46 and -0.1. When inferring the type, the processor treats them as string instead of double, float, or decimal.

I know we can hard-code the Avro schema in the processors, but I thought making the flow more dynamic by using InferAvroSchema would be even better. Is there any other way to overcome or resolve this?

Sivaprasanna Sethuraman

1 Answer


InferAvroSchema is good for guessing an initial schema, but once you need something more specific it is better to remove InferAvroSchema and provide the exact schema you need.
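As a rough illustration of what "provide the exact schema" can look like, here is a minimal sketch that builds an Avro record schema in Python and prints it as JSON. The record name `ApiResponse` and the field names `label` and `change` are hypothetical placeholders, not taken from the question; substitute the fields your API actually returns.

```python
import json

# Hypothetical record and field names, used only for illustration.
# The signed values from the question (-2.46, -0.1) would map to "double".
EXPLICIT_SCHEMA = {
    "type": "record",
    "name": "ApiResponse",
    "fields": [
        {"name": "label", "type": "string"},
        # A union with "null" lets the field be absent or empty.
        {"name": "change", "type": ["null", "double"], "default": None},
    ],
}


def schema_text():
    """Return the schema as JSON text, ready to paste into a schema setting."""
    return json.dumps(EXPLICIT_SCHEMA, indent=2)


if __name__ == "__main__":
    print(schema_text())
```

The resulting JSON text can then be supplied wherever the flow expects an Avro schema, in place of the inferred one.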

Bryan Bende
  • Okay. However, it can be done, right? They are using Kite SDK. It's just a few lines of code that scan the field values: if the values are signed and numeric, the field can be inferred as double/decimal; otherwise, if it contains letters, infer it as string. Or is it much more complex than that? – Sivaprasanna Sethuraman Mar 01 '17 at 10:37
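The scanning idea in the comment above could be sketched roughly as follows. This is not how InferAvroSchema or the Kite SDK works internally; it is a standalone illustration, and the helper names `infer_avro_type` and `infer_schema` are made up for this example. It assumes every record is a flat JSON object and that numeric values may arrive as strings.

```python
import re

# Optional sign followed by digits only, e.g. "-3" or "+12".
INT_RE = re.compile(r"[+-]?\d+")
# Optional sign, decimal digits, optional exponent, e.g. "-2.46" or "1e-3".
FLOAT_RE = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?")


def infer_avro_type(values):
    """Return 'long', 'double', or 'string' for a list of sample values."""
    if not values:
        return "string"
    seen_float = False
    for v in values:
        if isinstance(v, bool):
            return "string"  # booleans are not handled in this sketch
        if isinstance(v, int):
            continue
        if isinstance(v, float):
            seen_float = True
            continue
        text = str(v).strip()
        if INT_RE.fullmatch(text):
            continue
        if FLOAT_RE.fullmatch(text):
            seen_float = True
            continue
        return "string"  # anything non-numeric forces a string type
    return "double" if seen_float else "long"


def infer_schema(records, name="InferredRecord"):
    """Build an Avro-style record schema from the keys of the first record."""
    fields = [
        {"name": key,
         "type": infer_avro_type([r[key] for r in records if key in r])}
        for key in records[0]
    ]
    return {"type": "record", "name": name, "fields": fields}
```

Even in this small form, the sketch has to make choices about nulls, booleans, mixed columns, and sample size, which is part of why an explicit schema tends to be more predictable than inference.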