Logstash kv filter

Question

I have a file with the following format:

10302\t<document>.....</document>   
12303\t<document>.....</document>   
10054\t<document>.....</document>   
10034\t<document>.....</document>

As you can see, each line contains two values separated by a tab character. I need to:

- index the first token (e.g. 10302, 12303...) as the ID
- extract (and then index) some information from the second token (the XML document). In other words, the second token would be passed to the xml filter to extract some information

Is it possible to do that by separating the two values with the kv filter? Ideally, for each line, I should end up with a document like this:

id:10302       
msg:<document>....</document>

I could use a grok filter, but I'd like to avoid any regex, since the field detection is very simple and can be accomplished with plain key-value logic. However, using plain kv detection I end up with the following:

"10302": <document>.....</document>   
"12303": <document>.....</document>   
"10054": <document>.....</document>   
"10034": <document>.....</document>

and this is not what I need.

I do not have it because I don't know how to say "take the key and create an attribute id with that key as its value, then take the value and create an attribute message with that value" — Andrea, Nov 02 '16 at 17:50
Ok. I don't think it's possible to use kv for the job you want to do, since there is no possible key for the id (10302, 10303, 10304...). But grok would be perfectly workable with `%{INT:ID}\t%{GREEDYDATA:msg}` — baudsp, Nov 03 '16 at 08:49
Many thanks, I think I came to the same conclusion. If you put your comment in an answer I will accept it. — Andrea, Nov 03 '16 at 10:10
You're welcome. I added an answer, with an additional anchor (`^`) in the regex for better performance (in theory) — baudsp, Nov 03 '16 at 10:23

score 0 · Accepted Answer · answered Nov 03 '16 at 10:19

As far as I know, it is not possible to use kv for the job you want to do, since there is no possible key for the id (10302, 10303, 10304...): nothing precedes the id on the line, so kv has nothing to use as its key.

This grok configuration would work, assuming each id and its document are on the same line:

grok {
  match => { "message" => "^%{INT:ID}\t%{GREEDYDATA:msg}"}
}
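
To show how this could feed the xml filter mentioned in the question, here is a minimal sketch of a full filter section. It is only an illustration under assumptions: the capture is renamed to `id` to match the desired output, and the XPath expression `/document/title/text()` and the target field `title` are hypothetical, since the real structure of the documents is not shown.

```
filter {
  # Split the line into the numeric id and the rest (the XML document).
  grok {
    match => { "message" => "^%{INT:id}\t%{GREEDYDATA:msg}" }
  }

  # Parse the XML held in "msg" and pull out selected values.
  # The XPath and the "title" field are hypothetical placeholders.
  xml {
    source => "msg"
    store_xml => false
    xpath => { "/document/title/text()" => "title" }
  }
}
```

With `store_xml => false` only the values selected through `xpath` are added to the event; the whole parsed document is not stored.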
