I am trying to download some data from the datastore using the following command:
appcfg.py download_data --config_file=bulkloader.yaml --application=myappname
--kind=mykindname --filename=myappname_mykindname.csv
--url=http://myappname.appspot.com/_ah/remote_api
When I didn't have much data in this particular kind/table I could download the data in one shot - occasionally running into the following error:
.................................[ERROR ] [Thread-11]
ExportProgressThread:
Traceback (most recent call last):
  File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 1448, in run
    self.PerformWork()
  File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 2216, in PerformWork
    item.key_end)
  File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 2011, in StoreKeys
    (STATE_READ, unicode(kind), unicode(key_start), unicode(key_end)))
OperationalError: unable to open database file
This is what I see in the server log:
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/remote_api/handler.py", line 277, in post
    response_data = self.ExecuteRequest(request)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/remote_api/handler.py", line 308, in ExecuteRequest
    response_data)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 86, in MakeSyncCall
    return stubmap.MakeSyncCall(service, call, request, response)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 286, in MakeSyncCall
    rpc.CheckSuccess()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 126, in CheckSuccess
    raise self.exception
ApplicationError: ApplicationError: 4 no matching index found.
When that error appeared I would simply re-run the download and things would work out well.
Lately, I have noticed that as the size of my kind increases, the download tool fails much more often. For instance, with a kind of ~3500 entities I had to run the command 5 times, and only the last attempt succeeded. Is there a way around this error? Previously, my only worry was that I wouldn't be able to automate downloads in a script because of the occasional failures; now I am scared I won't be able to get my data out at all.
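For reference, the "re-run until it works" routine I have been doing by hand could be scripted roughly as follows. This is only a sketch of a workaround, not a fix for the underlying error. The function name `run_with_retries` and its parameters are my own invention, and it assumes that appcfg.py exits with a non-zero status when the download fails, which I have not verified.

```python
import subprocess
import time

# The command from above, split into an argument list.
# On Windows it may need to be prefixed with the Python interpreter.
DOWNLOAD_CMD = [
    "appcfg.py", "download_data",
    "--config_file=bulkloader.yaml",
    "--application=myappname",
    "--kind=mykindname",
    "--filename=myappname_mykindname.csv",
    "--url=http://myappname.appspot.com/_ah/remote_api",
]


def run_with_retries(cmd, max_attempts=5, delay_seconds=10,
                     runner=subprocess.call, sleep=time.sleep):
    """Run cmd until it returns exit status 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded,
    or None if every attempt failed.
    ASSUMPTION: a failed download is signalled by a non-zero exit status.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(cmd) == 0:
            return attempt
        # Pause before the next attempt, but not after the final one.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return None


# Example call (not executed here):
#   attempt = run_with_retries(DOWNLOAD_CMD)
#   print("succeeded on attempt", attempt) if attempt else print("all attempts failed")
```

The `runner` and `sleep` parameters exist only so the loop can be exercised without actually calling appcfg.py. Even if this works, it only hides the failures, so I would still like to understand the cause.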
This issue was discussed previously here but the post is old and I am not sure what the suggested flag does - hence posting my similar query again.
Some additional details: as mentioned here, I tried the suggestion to resume interrupted downloads (in the section Downloading Data from App Engine). When I resume after the interruption, I get no errors, but the number of rows downloaded is lower than the entity count the datastore admin shows me. This is the message I get:
[INFO ] Have 3220 entities, 3220 previously transferred
[INFO ] 3220 entities (1003 bytes) transferred in 2.9 seconds
The datastore admin tells me this particular kind has ~4300 entities. Why aren't the remaining entities getting downloaded?
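To check the shortfall on my side independently of the tool's log output, the rows in the exported CSV can be counted and compared with the expected entity count. The following is a minimal sketch. The helper names `count_csv_rows` and `missing_entities` are hypothetical, and the `has_header` flag is an assumption, since whether the export writes a header row depends on the bulkloader configuration.

```python
import csv


def count_csv_rows(path, has_header=True):
    """Count data rows in a CSV file, using the csv module so that
    quoted fields containing newlines are counted as one row."""
    with open(path, newline="") as f:
        rows = sum(1 for _ in csv.reader(f))
    # ASSUMPTION: the first row is a header unless has_header is False.
    return max(rows - 1, 0) if has_header else rows


def missing_entities(path, expected_count, has_header=True):
    """Return how many rows are missing relative to expected_count."""
    return expected_count - count_csv_rows(path, has_header)


# Example call (not executed here), using the approximate count
# reported by the datastore admin:
#   print(missing_entities("myappname_mykindname.csv", 4300))
```

The count shown in the datastore admin is only approximate in my case (~4300), so a small difference would not worry me, but a gap of roughly a thousand entities seems too large to explain that way.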
Thanks!