
I'm trying to sort ~13,000 documents on my Mac's local CouchDB database by date, but it gets hung up on document 5407 each time. I've tried increasing the time-out tolerance on Futon but to no avail. This is the error message I'm getting:

    for row in db.view('index15/by_date_time', startkey=start, endkey=end):
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/client.py", line 984, in __iter__
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/client.py", line 1003, in rows
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/client.py", line 990, in _fetch
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/client.py", line 880, in _exec
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/http.py", line 393, in get_json
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/http.py", line 374, in get
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/http.py", line 419, in _request
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/http.py", line 239, in request
  File "/Library/Python/2.6/site-packages/CouchDB-0.8-py2.6.egg/couchdb/http.py", line 205, in _try_request_with_retries
socket.error: 54

Incidentally, this is the same error message that is produced when I have a typo in my script.

I'm using couchpy to create the view as follows:

def dateTimeToDocMapper(doc):
    from dateutil.parser import parse
    from datetime import datetime as dt
    if doc.get('Date'):
        # [year, month, day, hour, min, sec]
        _date = list(dt.timetuple(parse(doc['Date']))[:-3])
        yield (_date, doc)

While this is running, I can open a Python shell and, using server.tasks(), see that the indexing is indeed taking place.

>>> server.tasks()

[{u'status': u'Processed 75 of 13567 changes (0%)', u'pid': u'<0.451.0>', u'task': u'gmail2 _design/index11', u'type': u'View Group Indexer'}]

But each time it gets stuck at "Processed 5407 of 13567 changes" (it takes ~8 minutes to get this far). I have examined what I believe to be document 5407, and it doesn't appear to be anything out of the ordinary.

Incidentally, if I try to restart the process after it stops, I get this response from server.tasks()

>>> server.tasks()

[{u'status': u'Processed 0 of 8160 changes (0%)', u'pid': u'<0.1224.0>', u'task': u'gmail2 _design/index11', u'type': u'View Group Indexer'}]

In other words, CouchDB seems to have recognized that it has already processed the first 5407 of the 13567 changes and now has only 8160 left.

But then it almost immediately quits and gives me the same socket.error: 54

I have been searching the internet for the last few hours to no avail. I have tried initiating the indexing from other locations, such as Futon. As I mentioned, one of my errors was an OS timeout error, and increasing the time_out thresholds in Futon's configuration seemed to help with that.

Please, if anyone could shed light on this issue, I would be very grateful. I'm wondering if there's a way to restart the process once it's already indexed 5407 documents, or better yet, if there's a way to prevent it from quitting a third of the way through in the first place.

Thanks so much.

Mike G

1 Answer


From what I gather, CouchDB builds your view contents by sending all documents to your couchpy view server, which runs your Python code on that document. If that code fails for any reason, CouchDB will be notified that something went wrong, which will stop the update of the view contents.
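One way to check this outside CouchDB is to run the map function locally over the documents and see which ones make it raise. The following is only a sketch: find_failing_docs and example_mapper are hypothetical names, and example_mapper is a stand-in that uses only the standard library, not the mapper from the question.

```python
def find_failing_docs(map_fun, docs):
    """Run map_fun over each doc and collect (doc_id, exception) for failures."""
    failures = []
    for doc in docs:
        try:
            # Map functions are generators, so the body only runs when consumed.
            list(map_fun(doc))
        except Exception as exc:
            failures.append((doc.get('_id'), exc))
    return failures


# Stand-in mapper for demonstration only; it raises on a malformed 'Date'.
def example_mapper(doc):
    from datetime import datetime
    if doc.get('Date'):
        parsed = datetime.strptime(doc['Date'], '%Y-%m-%d %H:%M:%S')
        yield (list(parsed.timetuple()[:6]), None)


sample_docs = [
    {'_id': 'ok', 'Date': '2010-01-04 10:30:15'},
    {'_id': 'bad', 'Date': 'not a date'},
    {'_id': 'no-date'},
]
failures = find_failing_docs(example_mapper, sample_docs)
print([doc_id for doc_id, _ in failures])  # prints ['bad']
```

Against the real database, you would pass your own mapper and the actual documents instead of sample_docs, for example something along the lines of (db[doc_id] for doc_id in db) with couchdb-python, assuming db is your couchdb Database object.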

So, there is something wrong with document 5408 that causes your Python code to misbehave. If you need more help, I suggest you post that document here. Alternatively, look into the logs for your couchpy view server: they might contain information about how your code failed.
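If the culprit turns out to be a Date value that dateutil cannot parse (that is an assumption; I have not seen the document), a more defensive version of your map function might look like the sketch below. It skips such documents instead of letting the exception escape. The set of exceptions caught is also a guess at what parse can raise on bad input.

```python
def dateTimeToDocMapper(doc):
    from dateutil.parser import parse
    from datetime import datetime as dt

    raw = doc.get('Date')
    if not raw:
        return
    try:
        parsed = parse(raw)
    except (ValueError, OverflowError, TypeError):
        # Assumption: an unparseable Date is what stops the view build,
        # so skip this document rather than raising.
        return
    # [year, month, day, hour, min, sec]
    yield (list(dt.timetuple(parsed)[:-3]), doc)
```

Skipped documents simply will not appear in the view, so you would still want to find and inspect them separately.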

Victor Nicollet