
I'm using PyES to work with ElasticSearch from Python. Typically, I build my queries in the following format:

from pyes import *
from pyes.query import *

# Create connection to server.
conn = ES('127.0.0.1:9200')

# Create a filter to select documents with 'stuff' in the title.
myFilter = TermFilter("title", "stuff")

# Create query.
q = FilteredQuery(MatchAllQuery(), myFilter).search()

# Execute the query.
results = conn.search(query=q, indices=['my-index'])

print type(results)
# > <class 'pyes.es.ResultSet'>

This works perfectly. My problem begins when the query returns a large number of documents. Converting the results to a list of dictionaries is computationally expensive, so I'd like the query to return its results as dictionaries in the first place. I came across this documentation:

http://pyes.readthedocs.org/en/latest/faq.html#id3
http://pyes.readthedocs.org/en/latest/references/pyes.es.html#pyes.es.ResultSet
https://github.com/aparo/pyes/blob/master/pyes/es.py (line 1304)

But I can't figure out what exactly I'm supposed to do. Based on the previous links, I've tried this:

from pyes import *
from pyes.query import *
from pyes.es import ResultSet
from pyes.connection import connect

# Create connection to server.
c = connect(servers=['127.0.0.1:9200'])

# Create a filter to select documents with 'stuff' in the title.
myFilter = TermFilter("title", "stuff")

# Create query / Search object.
q = FilteredQuery(MatchAllQuery(), myFilter).search()

# (How to) create the model ?
mymodel = lambda x, y: y

# Execute the query.
# class pyes.es.ResultSet(connection, search, indices=None, doc_types=None,
# query_params=None, auto_fix_keys=False, auto_clean_highlight=False, model=None)

resSet = ResultSet(connection=c, search=q, indices=['my-index'], model=mymodel)
# > resSet = ResultSet(connection=c, search=q, indices=['my-index'], model=mymodel)
# > TypeError: __init__() got an unexpected keyword argument 'search'

Has anyone been able to get a dict from the ResultSet? Any good suggestion for efficiently converting the ResultSet to a dictionary (or a list of dictionaries) would be appreciated too.

Stephane Rolland
JCJS
  • You should not try to convert it into a dict or similar; that would be doing the same work twice. What I did was override the ES object so that it does not use DottedDict access. Another possibility would be to use the "raw query". – Julian Hille Jan 21 '14 at 23:52
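
One way to read the "raw query" suggestion in the comment above is to skip the client's result wrappers and call Elasticsearch's REST `_search` endpoint directly, since its JSON reply already decodes to plain dictionaries. The following is a minimal sketch in Python 3 using only the standard library, not PyES's own API. The helper names `raw_search` and `extract_sources`, the host, the index name, and the sample response are assumptions for illustration, and the query body you send must match the query DSL of your Elasticsearch version.

```python
import json
from urllib.request import Request, urlopen


def extract_sources(response):
    """Return the '_source' dict of every hit in a decoded _search response."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit.get("_source", {}) for hit in hits]


def raw_search(base_url, index, body, timeout=10):
    """POST a query body to <base_url>/<index>/_search and decode the JSON reply.

    Hypothetical helper: it bypasses PyES entirely.
    """
    request = Request(
        "{0}/{1}/_search".format(base_url.rstrip("/"), index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


# Offline demonstration with a hand-written response of the usual shape.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "1", "_source": {"title": "stuff one"}},
            {"_id": "2", "_source": {"title": "more stuff"}},
        ],
    }
}
documents = extract_sources(sample_response)
```

Against a live server this would be called as `extract_sources(raw_search("http://127.0.0.1:9200", "my-index", body))`, where `body` is the query expressed as a dictionary. The hits arrive as ordinary `dict` objects, so no second conversion step is needed.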

2 Answers

I tried many ways to cast a ResultSet directly into a dict, but none of them worked. The best approach I have found so far is to append the ResultSet items to another list or dict. The ResultSet exposes every item it contains as a dict.

Here is how I use it:

import json

# Create a response dictionary.
response = {"status_code": 200, "message": "Successful", "content": []}

# Copy the result set items into the content of the response.
response["content"] = [result for result in resultset]

# Return a JSON string (this line belongs inside a view or handler function).
return json.dumps(response)
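
If the items behave like dictionaries, as described above, a single pass is enough to build the response. The sketch below is written for Python 3 and uses a stand-in class (`FakeHit`, hypothetical) in place of PyES's result items, since the exact item type depends on the PyES version; `build_response` is an illustrative name as well.

```python
import json


class FakeHit(dict):
    """Hypothetical stand-in for a dict-like result item."""


def build_response(results):
    """Copy dict-like result items into a JSON-serializable response."""
    # One pass over the results; dict() makes a shallow plain-dict copy.
    content = [dict(item) for item in results]
    return {"status_code": 200, "message": "Successful", "content": content}


fake_results = [FakeHit(title="stuff one"), FakeHit(title="more stuff")]
response = build_response(fake_results)
payload = json.dumps(response, sort_keys=True)
```

Note that `dict(item)` copies only the top level, so nested values keep whatever type they had in the original item.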
fth

It's not that complicated: just iterate over the result set, for example with a for loop:

for item in results:
   print item
speznaz
  • This is exactly what I was trying to avoid. When handling large result sets, this approach turns out to be quite slow. – JCJS Aug 23 '13 at 11:34
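
When a full pass over a large result set is too slow or memory-hungry, one option is to consume it lazily in fixed-size batches rather than building one big list. The following Python 3 sketch works with any iterable; `iter_batches` is a hypothetical helper, and a plain generator stands in for the PyES result set, which is not needed to run it. Whether this helps in practice depends on where the time is actually spent (fetching from the server versus wrapping each hit), which the thread does not establish.

```python
from itertools import islice


def iter_batches(iterable, batch_size):
    """Yield lists of up to batch_size items without materializing the whole input."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A generator stands in for a large result set.
fake_results = ({"id": n} for n in range(5))
batches = list(iter_batches(fake_results, 2))
```

Each batch can be processed and discarded before the next one is pulled, so only `batch_size` items are held at a time.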