I'm trying to fetch the latest distinct record for each name from an Elasticsearch server using Logstash, drop some fields and any records with zero coordinates, and then insert the result into Redis. Each Elasticsearch document has a name field, some description fields, and two location values: latitude and longitude. Another team manages the Elasticsearch server, so I cannot change any of its configuration.

I found this Stack Overflow question, which I think fits my problem: "How to get latest values for each group with an Elasticsearch query?"

So I tried using the query from the accepted answer in my Logstash configuration. Below is my configuration file:

input {
    elasticsearch {
        hosts => "elasticdb"
        size => 1000
        index => "logstash-db"
        query =>'{"aggs":{"group":{"terms":{"field":"name.raw"},"aggs":{"group_docs":{"top_hits":{"size":1,"sort":[{"@timestamp":{"order":"desc"}}]}}}}}}'
    }
}

filter {
        if [latitude] == 0 and [longitude] == 0 {
                drop { }
        }
        mutate {
                remove_field => ["country","message","type","@timestamp","@version"]
        }
}

output {
    redis {
        data_type => "list"
        key => "%{name}"
    }
    stdout {
        codec => rubydebug
    }
}

The Elasticsearch query, formatted for readability:

{
    "aggs": {
        "group": {
            "terms": {
                "field": "name.raw"
            },
            "aggs": {
                "group_docs": {
                    "top_hits": {
                        "size": 1,
                        "sort": [
                            {
                                "@timestamp": {
                                    "order": "desc"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}

I ran Logstash with the configuration above, but it keeps sending some rows with exactly the same data multiple times. Any help?

Aldibe
  • You cannot send aggregations queries with the `elasticsearch` input. You can try doing it with the `http_poller` input instead in order to send your aggregation query directly through the REST interface, but you'll have a hard time parsing the response in my opinion. – Val Feb 13 '17 at 07:26
  • For running ad-hoc or raw queries, you can use Kibana's Dev Tools. – user3775217 Feb 13 '17 at 07:32
  • @Val: Ah, okay. I was hoping I could use it to build a key-value store in Redis. Do you have any recommendations for tools that I can use to do this easily? P.S.: Thank you for answering my past question as well. I really appreciate it. – Aldibe Feb 13 '17 at 07:49
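
Regarding the point above that the aggregation response is hard to parse: one option is to send the query to the `_search` REST endpoint from a small script and flatten the response there. Below is a minimal Python sketch of the parsing step only. It assumes the response has the usual shape for the query in the question (`aggregations` → `group` → `buckets`, each with a `group_docs` top_hits result). The function name `latest_per_name`, the sample document names, and the sample response are hypothetical and only for illustration.

```python
from typing import Any, Dict

# Fields the Logstash mutate filter in the question removes.
DROPPED_FIELDS = ("country", "message", "type", "@timestamp", "@version")


def latest_per_name(response: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map each terms-bucket key (the name) to its newest document.

    Documents whose latitude and longitude are both 0 are skipped,
    mirroring the drop condition in the question's filter block.
    """
    latest: Dict[str, Dict[str, Any]] = {}
    buckets = response.get("aggregations", {}).get("group", {}).get("buckets", [])
    for bucket in buckets:
        hits = bucket["group_docs"]["hits"]["hits"]
        if not hits:
            continue
        doc = dict(hits[0]["_source"])  # copy so the response is not mutated
        if doc.get("latitude") == 0 and doc.get("longitude") == 0:
            continue
        for field in DROPPED_FIELDS:
            doc.pop(field, None)
        latest[bucket["key"]] = doc
    return latest


# Hypothetical, hand-written response in the shape Elasticsearch returns
# for a terms aggregation with a top_hits sub-aggregation.
sample_response = {
    "aggregations": {
        "group": {
            "buckets": [
                {
                    "key": "sensor-a",
                    "doc_count": 3,
                    "group_docs": {"hits": {"hits": [{"_source": {
                        "name": "sensor-a", "latitude": 1.5, "longitude": 2.5,
                        "country": "ID", "@timestamp": "2017-02-13T07:00:00Z",
                    }}]}},
                },
                {
                    "key": "sensor-b",
                    "doc_count": 1,
                    "group_docs": {"hits": {"hits": [{"_source": {
                        "name": "sensor-b", "latitude": 0, "longitude": 0,
                    }}]}},
                },
            ]
        }
    }
}

result = latest_per_name(sample_response)
print(result)
```

Each entry in `result` could then be written to Redis under its name, for example as a JSON string. Two details worth checking when sending the query yourself: adding `"size": 0` at the top level of the request body suppresses the ordinary hits so only the aggregation comes back, and a terms aggregation returns only 10 buckets by default, so its own `size` parameter needs to be raised if there are more distinct names.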

0 Answers