We are logging Restify requests to Elasticsearch using Logstash and Bunyan. However, when we include the JSON body in the log, the index runs into mapping conflicts because fields with the same name sometimes have different types.

One example is req.body, which is sometimes a string and sometimes an object. We've worked around that by always setting body to an object (our Restify API is not supposed to receive strings in any valid request).
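A minimal sketch of that workaround, assuming the body is normalized just before it is handed to the logger. `normalizeBody` and the `raw` key are hypothetical names for illustration, not part of Restify or Bunyan:

```javascript
// Hypothetical helper: always returns a plain object, so the logged
// "body" field keeps the object type in the index.
function normalizeBody(body) {
  // Buffers are objects too, so handle them before the object check.
  if (Buffer.isBuffer(body)) {
    return { raw: body.toString('utf8') };
  }
  const isPlainObject =
    body !== null && typeof body === 'object' && !Array.isArray(body);
  if (isPlainObject) {
    return body;
  }
  if (body === undefined || body === null || body === '') {
    return {};
  }
  // Keep the unexpected payload, but under one fixed, string-typed key.
  return { raw: typeof body === 'string' ? body : JSON.stringify(body) };
}
```

This only stabilizes the type of body itself; the fields inside it can still conflict.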

However, the problem keeps occurring for fields inside the body object. We can't really control what a client sends in, and if a request contains a string where a number is expected, the Elasticsearch index has already typed that field as a number.

Is there a general way to fix this, apart from checking and potentially replacing every field posted in the body? Converting the body from an object to a string before logging it would work, but that would seriously reduce its usefulness in Kibana when building visualizations.

Anders Bornholm

1 Answer

If a field needs to hold either a string or a number, you'll have to define it as a string. Otherwise, Elasticsearch will simply drop the event when the value doesn't match the mapped type.

Elastic is planning to add a "dead-letter" function to Logstash, so you could keep the field as a number and have any events that came through as a string be directed there instead of to Elasticsearch.

To keep Elasticsearch from typing the field as a number based on the first value it receives, you may want to disable dynamic mapping or set up an index template that controls the mapping.
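As a rough sketch of the template idea, the snippet below builds a `dynamic_templates` fragment that maps every scalar value under the logged body to a string type, whatever type the first document happened to carry. The helper name, the `req.body` path, and the choice of string type are assumptions: the type is `string` on Elasticsearch 2.x and `keyword` or `text` on 5.x and later, and where the fragment sits inside the index template also depends on the version.

```javascript
// Hypothetical helper: builds one dynamic template per JSON-detected
// scalar type, so nested objects under the body are left alone.
function bodyAsStringTemplates(bodyPath, stringType) {
  const scalarTypes = ['long', 'double', 'boolean', 'date', 'string'];
  return {
    dynamic_templates: scalarTypes.map((jsonType) => ({
      [`body_${jsonType}_as_string`]: {
        path_match: `${bodyPath}.*`,
        match_mapping_type: jsonType,
        mapping: { type: stringType },
      },
    })),
  };
}

// Assumed values; adjust to where your log events actually put the body.
const mappingFragment = bodyAsStringTemplates('req.body', 'keyword');
console.log(JSON.stringify(mappingFragment, null, 2));
```

Dynamic templates only apply to fields that are not mapped yet, so an existing index keeps its current types until a new index is created from the template.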

Alain Collins
  • The problem is I have to predefine every field that could change type then? – Anders Bornholm Feb 20 '16 at 18:18
  • If you're lucky, there's a pattern to the field name? (all "foo_" should be int?) – Alain Collins Feb 20 '16 at 21:14
  • No such pattern unfortunately. And I don't see a reasonable way around it since one reason for logging is tracking failed requests. For instance those caused by submitting values of the wrong types... Catch 22 this... – Anders Bornholm Feb 20 '16 at 22:36
  • How about mapping all input as strings and having logstash create a multi_field for the numeric representation if it considers the field to be a number? Sort of like ".raw". https://www.elastic.co/guide/en/elasticsearch/reference/current/_multi_fields.html – Alain Collins Feb 21 '16 at 15:44
  • Interesting, I'll look into that – Anders Bornholm Feb 22 '16 at 20:22
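One way the multi-field idea from the comments might look, done in the mapping rather than in Logstash (a swapped-in variant, not what the comment literally proposes): the field stays a string, and a numeric sub-field is indexed only when the value parses as a number. The helper name and the `num` sub-field are hypothetical, the string type again depends on the Elasticsearch version, and this has not been verified against a live cluster:

```javascript
// Hypothetical helper: a string mapping with a numeric sub-field.
// ignore_malformed lets non-numeric values skip the sub-field
// instead of causing the whole event to be rejected.
function stringWithNumericSubfield(stringType) {
  return {
    type: stringType,
    fields: {
      num: { type: 'double', ignore_malformed: true },
    },
  };
}

console.log(JSON.stringify(stringWithNumericSubfield('keyword'), null, 2));
```

Assuming a field named `amount` under the body, Kibana could then aggregate on `req.body.amount.num` while `req.body.amount` still holds whatever the client actually sent.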