I have an index of ~113000 documents. I'm trying to retrieve all of them, and I don't care about the score. Basically, a select * from index;
I'm doing this in Python using elasticutils (I haven't found the time to switch to elasticsearch-dsl yet).
Running
S().indexes('da_userstats').query().count()
completes in about 0.003 seconds.
Running
S().indexes('da_userstats').query()[0:113595].execute().objects
is taking about 15 seconds.
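A minimal sketch of how timings like the two above could be taken, assuming a small hypothetical `timed` helper built on `time.perf_counter`. The two lambdas are stand-ins so the snippet runs without a cluster; they are not the real elasticutils queries, which would be swapped in as shown in the comments.

```python
import time


def timed(fn):
    """Call fn() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Stand-ins (assumptions) for the two calls being compared. Replace with:
#   lambda: S().indexes('da_userstats').query().count()
#   lambda: S().indexes('da_userstats').query()[0:113595].execute().objects
count_only = lambda: 113595
fetch_all = lambda: [{'id': i} for i in range(113595)]

n, t_count = timed(count_only)
docs, t_fetch = timed(fetch_all)

print('count: %d in %.4fs' % (n, t_count))
print('fetch: %d docs in %.4fs' % (len(docs), t_fetch))
```

Timing each call in isolation like this makes it easier to see whether the time goes into the query itself or into transferring and building the result objects.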
From what I understand of the documentation, both should force execution, so I don't see why there is such a huge difference in time.
In the mapping I've tried marking the fields as not analyzed, but it has had no effect. I really don't get why there is a difference of so many orders of magnitude.
@classmethod
def get_mapping(cls):
    return {
        'properties': {
            'id': {
                'type': 'integer',
                'index': 'not_analyzed',
                "include_in_all": False,
            },
            'email': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },
            'username': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },
            'date_joined': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },
            'last_activity': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },
            'last_activity_web': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },
            'last_activity_ios': {
                'type': 'string',
                'index': 'not_analyzed',
                "include_in_all": False
            },