I know there are several alternative Elasticsearch clients for Python besides this one, but I do not have access to them. How can I write a query with 'less than or equal' logic on a timestamp? My current approach is:

query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
less_than_time = ...  # datetime object (cutoff; value not given in the post)
data = self.es.search(index=self.es_index, q=query, size=searchsize)
hits = data['hits']['hits']
results = []
for hit in hits:
    # Parse each hit's timestamp and filter on the client side.
    time = datetime.strptime(hit['_source']['data_model']['utc_time'], time_format)
    # Keep hits at or before the cutoff (abs(...).seconds is never negative).
    if time <= less_than_time:
        results.append(hit)

This is a really clumsy way of doing it. Is there a way I can keep my query generation using strings and include a range?

Hal T
  • Do you want only the query? I have a little script that generates a valid query (for me) that I use with es2csv. It generates a correct json query – pandaadb Jan 31 '17 at 16:31
  • @pandaadb yes, I only need code to generate the query. Basically, the ability to generate queries based on 'lte' or 'gte' conditionals. – Hal T Jan 31 '17 at 18:35

2 Answers

I have a little script that generates a query for me. The query, however, is in JSON notation (which I believe the client can accept).

Here's my script:

#!/usr/bin/python

from datetime import datetime
import sys

# Templates for the range clause and the enclosing bool query.
RANGE = '"range":{"@timestamp":{"gte":"%s","lt":"%s"}}'
QUERY = '{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{%s}]}}}'
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("\nERROR: 2 Date arguments needed: From and To, for example:\n\n./range_query.py 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z\n\n")
        sys.exit(1)
    try:
        # Parse only to validate; the raw strings go into the query unchanged.
        datetime.strptime(sys.argv[1], DATE_FORMAT)
        datetime.strptime(sys.argv[2], DATE_FORMAT)
    except ValueError:
        print("\nERROR: Invalid dates. From: %s, To: %s" % (sys.argv[1], sys.argv[2]) + "\n\nValid date format: %Y-%m-%dT%H:%M:%S.%fZ\n")
        sys.exit(1)

    range_q = RANGE % (sys.argv[1], sys.argv[2])

    print(QUERY % range_q)

The script also uses a bool query. It should be fairly easy to remove that and use only the time constraints for the range.

I hope this is what you're looking for.

This can be called and spits out a query such as:

./range_prefix_query.py.tmp 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z
{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{"range":{"@timestamp":{"gte":"2016-08-10T00:00:00.000Z","lt":"2016-08-10T00:00:00.000Z"}}}]}}}
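
To express the rest of a query in JSON as well, one option is to build the body as a Python dict instead of a formatted string. The following is a minimal sketch, not the script above: it wraps an existing query string in a query_string clause and adds a range filter with lte. The helper name build_search_body, the sample gid and price values, and the ISO timestamp format are assumptions made for illustration; the field name data_model.utc_time is taken from the code in the question.

```python
import json
from datetime import datetime


def build_search_body(query_string, time_field, less_than_time):
    """Combine a query string with an inclusive upper bound on a time field.

    build_search_body is a hypothetical helper, not part of any client library.
    """
    return {
        "query": {
            "bool": {
                # The existing string query is reused unchanged.
                "must": [{"query_string": {"query": query_string}}],
                # "lte" means less than or equal; the ISO format is an
                # assumption and must match how the field is mapped.
                "filter": [
                    {"range": {time_field: {"lte": less_than_time.isoformat()}}}
                ],
            }
        }
    }


gid, price = "42", "10"  # hypothetical sample values
query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
body = build_search_body(query, "data_model.utc_time", datetime(2017, 1, 31, 16, 31))
print(json.dumps(body, indent=2))
```

With the elasticsearch-py releases from around the time of the question, a dict like this could be passed as es.search(index=..., body=body); for newer client versions, treat that call as an assumption and check the client documentation.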

Artur

pandaadb
  • Yep, this should work. I don't know how to express the rest of my query in JSON format though, which was my issue. Can you point me towards a resource where I can do that as well? – Hal T Feb 01 '17 at 14:11
  • Do you use Java with ES? For the queries I use 2 shortcuts: 1. Use Kibana + inspection tools to see what kind of query Kibana is sending to ES when you query for your data. 2. Use the Java client implementation and create your query there programmatically (using the objects). The client impl is designed in a way that the `toString` method will spit out the exact JSON content you need to use. – pandaadb Feb 01 '17 at 14:40
  • I actually use python but I can probably do the same. – Hal T Feb 01 '17 at 15:42

Take a look at https://elasticsearch-dsl.readthedocs.io/en/latest/

from elasticsearch_dsl import Search

# name, q and paging are assumed to be defined elsewhere.
s = Search()\
    .filter("term", **{"name": name})\
    .query(q)\
    .extra(**paging)
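
If adding a dependency is not an option and the query has to stay a plain string, the Lucene query string syntax that Elasticsearch accepts in the q parameter also supports ranges: square brackets are inclusive and * leaves one side open, so field:[* TO value] reads as 'less than or equal'. The sketch below assumes a hypothetical helper name with_upper_time_bound, sample gid and price values, and a timestamp format that must match how the field is actually indexed.

```python
from datetime import datetime


def with_upper_time_bound(query, time_field, less_than_time,
                          time_format="%Y-%m-%dT%H:%M:%S"):
    """Append an inclusive 'less than or equal' range to a query string.

    with_upper_time_bound is a hypothetical helper; time_format is an
    assumption and should match the format stored in the index.
    """
    stamp = less_than_time.strftime(time_format)
    # Parentheses keep any OR in the original query from absorbing the range.
    # The timestamp is quoted so the colons in it are not read as syntax.
    return '(%s) AND %s:[* TO "%s"]' % (query, time_field, stamp)


gid, price = "42", "10"  # hypothetical sample values
query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
bounded = with_upper_time_bound(query, "data_model.utc_time",
                                datetime(2017, 1, 31, 16, 31))
print(bounded)
```

The resulting string can be passed through q= exactly as in the question, which lets Elasticsearch apply the time filter instead of the client-side loop.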

Belegnar