Hi, I am using gunicorn with nginx and a PostgreSQL database to run my web app. I recently changed my gunicorn command from

gunicorn run:app -w 4 -b 0.0.0.0:8080 --workers=1 --timeout=300

to

gunicorn run:app -w 4 -b 0.0.0.0:8080 --workers=2 --timeout=300

so that it uses 2 workers. Now I am getting error messages like:

  File "/usr/local/lib/python2.7/dist-packages/flask_sqlalchemy/__init__.py", line 194, in session_signal_after_commit
    models_committed.send(session.app, changes=list(d.values()))
  File "/usr/local/lib/python2.7/dist-packages/blinker/base.py", line 267, in send
    for receiver in self.receivers_for(sender)]
  File "/usr/local/lib/python2.7/dist-packages/flask_whooshalchemy.py", line 265, in _after_flush
    with index.writer() as writer:
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 464, in writer
    return SegmentWriter(self, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/writing.py", line 502, in __init__
    raise LockError
LockError

I can't really do much with these error messages, but they seem to be linked to the Whoosh search I have on the User table in my database model:

import sys
if sys.version_info >= (3, 0):
    enable_search = False
else:
    enable_search = True
    import flask.ext.whooshalchemy as whooshalchemy

class User(db.Model):
    __searchable__ = ['username','email','position','institute','id'] # these fields will be indexed by whoosh

    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(100), index=True)
    ...

    def __repr__(self):
        return '<User %r>' % (self.username)

if enable_search:
    whooshalchemy.whoosh_index(app, User)

Any ideas how to investigate this? I thought Postgres allows parallel access, so lock errors should not happen. With only 1 worker they did not occur, so it is definitely caused by having multiple workers. Any help is appreciated. Thanks, carl

carl

1 Answer

This has nothing to do with PostgreSQL. Whoosh holds file locks for writing and it's failing on the last line of this code...

class SegmentWriter(IndexWriter):
    def __init__(self, ix, poolclass=None, timeout=0.0, delay=0.1, _lk=True,
                 limitmb=128, docbase=0, codec=None, compound=True, **kwargs):
        # Lock the index
        self.writelock = None 
        if _lk: 
            self.writelock = ix.lock("WRITELOCK")
            if not try_for(self.writelock.acquire, timeout=timeout,
                           delay=delay):
                raise LockError

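Note that `timeout` defaults to 0.0 here, so the writer tries to acquire the lock once and raises LockError immediately if another process holds it; `delay` (0.1 seconds) is only the pause between retries when a non-zero timeout is given. You increased your workers, so now you have contention on the lock. From the following docs...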

https://whoosh.readthedocs.org/en/latest/threads.html

Locking

Only one thread/process can write to an index at a time. When you open a writer, it locks the index. If you try to open a writer on the same index in another thread/process, it will raise whoosh.store.LockError.

In a multi-threaded or multi-process environment your code needs to be aware that opening a writer may raise this exception if a writer is already open. Whoosh includes a couple of example implementations (whoosh.writing.AsyncWriter and whoosh.writing.BufferedWriter) of ways to work around the write lock.

While the writer is open and during the commit, the index is still available for reading. Existing readers are unaffected and new readers can open the current index normally.
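
To make the locking behaviour concrete, here is a minimal standard-library sketch of a polling acquire, assuming the helper works the way the `try_for(..., timeout=..., delay=...)` call in the quoted constructor suggests. `poll_for_lock` is a hypothetical name, and `threading.Lock` is only a stand-in for Whoosh's file lock; the real implementation may differ.

```python
import threading
import time


def poll_for_lock(acquire, timeout=0.0, delay=0.1):
    """Call acquire() until it returns True or `timeout` seconds have passed.

    Hypothetical helper: with timeout=0.0 it makes exactly one attempt.
    """
    deadline = time.time() + timeout
    ok = acquire()
    while not ok and time.time() < deadline:
        time.sleep(delay)  # pause between retries
        ok = acquire()
    return ok


# Stand-in for the index write lock (Whoosh itself uses a file lock).
write_lock = threading.Lock()
write_lock.acquire()  # "worker 1" is holding the write lock

# Release the lock after 0.3 s, as if worker 1 had finished its commit.
threading.Timer(0.3, write_lock.release).start()


def try_acquire():
    # Non-blocking attempt: returns False if someone else holds the lock.
    return write_lock.acquire(False)


immediate = poll_for_lock(try_acquire, timeout=0.0)  # one attempt only
patient = poll_for_lock(try_acquire, timeout=2.0)    # polls until released
print(immediate, patient)
```

With a zero timeout the second worker gives up on its first failed attempt, which matches the LockError in the traceback; with a non-zero timeout it keeps polling until the first worker releases the lock.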

The API docs include examples of how to use Whoosh concurrently:

Buffered

https://whoosh.readthedocs.org/en/latest/api/writing.html#whoosh.writing.BufferedWriter

Async

https://whoosh.readthedocs.org/en/latest/api/writing.html#whoosh.writing.AsyncWriter

I'd try the buffered version first since batching writes is almost always faster.
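
Another option, based on the `timeout` and `delay` parameters visible in the `SegmentWriter` signature above, is to let the writer wait for the lock instead of failing on the first attempt. The sketch below assumes `ix` is an already-open Whoosh index; `index_documents` and the 5-second timeout are illustrative choices, not part of Whoosh or flask-whooshalchemy. It does not show how to wire this into flask-whooshalchemy, which in the traceback calls `index.writer()` with no arguments.

```python
def index_documents(ix, docs, timeout=5.0, delay=0.1):
    """Add `docs` (a list of field dicts) to the index `ix`.

    Waits up to `timeout` seconds for the write lock, retrying every
    `delay` seconds. Hypothetical helper for illustration only.
    """
    # Keyword arguments given to ix.writer() are forwarded to SegmentWriter,
    # as the traceback's `SegmentWriter(self, **kwargs)` line shows.
    writer = ix.writer(timeout=timeout, delay=delay)
    try:
        for doc in docs:
            writer.add_document(**doc)
    except Exception:
        writer.cancel()  # give up the lock without committing
        raise
    writer.commit()  # committing also releases the write lock
    return len(docs)
```

A call might look like `index_documents(ix, [{"username": u"carl"}])`. A LockError can still be raised if the lock stays busy for longer than the timeout, so callers should still be prepared to handle it.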

Harry
  • I am just thinking about how to implement this... do I have to use a 'try: access database except LockError:' wherever I access the database? Or is there an easier solution? – carl Apr 22 '16 at 09:04
  • If you want to catch the exception then use `try/catch`. Don't use exceptions for control flow though. I'd look into using Whoosh in buffered mode and also use `try/catch`. You want to avoid the exception if you can. – Harry Apr 22 '16 at 18:36
  • flask-whooshalchemy does not seem to have implementations for BufferedWriter or AsyncWriter and I also can't get the try/except statement to work, since when the LockError happens I can't just repeat the session.commit()... it seems to be stuck with the LockError? I also don't understand the difference between try/except and try/catch? thanks carl – carl May 22 '16 at 12:27