I have an Elasticsearch server which I query both from a live website and from a Django management command. The management command is run by celery beat at 3am GMT to synchronise data from an outside service. Sometimes (but not every time) when this command runs, Elasticsearch appears to crash and I get the following error in my error log.

    [09/Jan/2014 08:03:46] ERROR [django.request:212] Internal Server Error: /
    Traceback (most recent call last):
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 115, in get_response
        response = callback(request, *callback_args, **callback_kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/django/views/generic/base.py", line 68, in view
        return self.dispatch(request, *args, **kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/django/views/generic/base.py", line 86, in dispatch
        return handler(request, *args, **kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/django/views/generic/base.py", line 153, in get
        context = self.get_context_data(**kwargs)
      File "/srv/www/site.co.uk/clothes_comparison/clothes_comparison/views.py", line 56, in get_context_data
        fields=['id', 'name', 'price', 'images', 'advertiser']
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/pyelasticsearch/client.py", line 96, in decorate
        return func(*args, query_params=query_params, **kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/pyelasticsearch/client.py", line 512, in multi_get
        'GET', ['_mget'], {'docs': docs}, query_params=query_params)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/pyelasticsearch/client.py", line 238, in send_request
        **({'data': request_body} if body else {}))
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/requests/sessions.py", line 347, in get
        return self.request('GET', url, **kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/requests/sessions.py", line 335, in request
        resp = self.send(prep, **send_kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/requests/sessions.py", line 438, in send
        r = adapter.send(request, **kwargs)
      File "/srv/www/site.co.uk/env/local/lib/python2.7/site-packages/requests/adapters.py", line 327, in send
        raise ConnectionError(e)
    ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=9200): Max retries exceeded with url: /_mget (Caused by <class 'socket.error'>: [Errno 111] Connection refused)

I am using pyelasticsearch to connect to Elasticsearch with the following code in my settings.py file:

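    from pyelasticsearch import ElasticSearch

    # Reuse the connection if this module has already created one.
    try:
        ES_CON
    except NameError:
        ES_CON = None

    if not ES_CON:
        # ELASTICSEARCH_URI is assumed to be defined earlier in settings.py.
        ES_CON = ElasticSearch(ELASTICSEARCH_URI)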
Any help would be greatly appreciated.

Prydie
  • The exception you're seeing is due to your Elasticsearch instance refusing the connection. It would seem that either your server is running out of available ports to connect on, or your Elasticsearch instance is too busy to accept a connection from your Django app. This seems more indicative of how you have configured or are utilizing your ES instance. – Ian Stapleton Cordasco Jan 09 '14 at 13:42
  • Is it to do with the way I'm establishing the connection to Elasticsearch? pyelasticsearch states that its connections are automatically pooled, but I don't understand how it could be exhausting the sockets if that were the case. – Prydie Jan 10 '14 at 14:00
  • This doesn't strictly relate to your problem, but I would suggest you use the official Python client (elasticsearch-py), which is thread safe and seems to have better connection handling. I made the switch from pyelasticsearch to elasticsearch-py pretty easily and have found it to be more stable and faster. – Erve1879 Jan 12 '14 at 10:18
  • Thanks @Erve1879 I will certainly switch that over and see if it resolves things. – Prydie Jan 12 '14 at 10:22
  • @Erve1879 your solution looks good. If you want to post it as an answer then I'm happy to mark it as correct presuming the site remains stable for the next 24h. – Prydie Jan 14 '14 at 20:32
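
To check whether anything is accepting connections on the Elasticsearch port at the moment the error appears, a small TCP probe can help. This is only a rough sketch: it assumes Python 3, the function name `port_is_open` is made up for illustration, and the host and port are the ones shown in the traceback.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers a refused connection (Errno 111 on Linux), timeouts
        # and unreachable hosts.
        return False
    sock.close()
    return True
```

Calling `port_is_open("127.0.0.1", 9200)` from the management command, or from a cron job around 3am, should return False whenever the node is down or not listening. It cannot tell you why the node stopped, so the Elasticsearch logs are still the place to look for the cause.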

1 Answer

I would suggest using the official Elasticsearch Python client, elasticsearch-py, which has reliable connection handling and is thread safe. It is also faster (according to the author, who is part of the Elasticsearch team).

You can then create the client once, with es = Elasticsearch(), either at the top of your tasks.py or in a shared module such as core.helpers, and import es from there.
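
One possible shape for such a shared module is sketched below. It is an illustration, not the only way to do it: `get_es`, `ES_HOSTS` and `_default_factory` are hypothetical names, the host is simply the one from the traceback in the question, and the client is created lazily on first use instead of at import time so that the web process and the celery worker each build their own instance.

```python
# core/helpers.py (module name taken from the suggestion above)
import threading

# Assumed address; this matches the host and port in the question's traceback.
ES_HOSTS = ["http://127.0.0.1:9200"]

_lock = threading.Lock()
_client = None


def _default_factory():
    # Imported lazily so this module can be loaded (and tested)
    # without elasticsearch-py installed.
    from elasticsearch import Elasticsearch
    return Elasticsearch(ES_HOSTS)


def get_es(factory=_default_factory):
    """Return one client per process, creating it on first use."""
    global _client
    if _client is None:
        with _lock:
            if _client is None:  # re-check after acquiring the lock
                _client = factory()
    return _client
```

Views and tasks would then call `get_es()` wherever they need a client, instead of constructing one in settings.py. The `factory` argument is there mainly so that a fake client can be substituted in tests.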

Erve1879