Sqlite disk I/O error (python - databricks)

Question

So, I'm running a cluster on Databricks. when I try to query an sqlite file on it like this:

import sqlite3

conn = sqlite3.connect("/dbfs/FileStore/testing.sqlite")
cur = conn.cursor()
cur.execute('''SELECT * FROM ZACCELEROMETER''')

I get this error:

OperationalError                          Traceback (most recent call last)
<command-163> in <module>
----> 1 cur.execute('''SELECT * FROM ZACCELEROMETER''')

OperationalError: disk I/O error

It's worth mentioning that I can query this exact file locally on Jupiter. Another thing that's unusual is on databricks, I tried to open an example SQLite file I found on the internet called "Chinook_Sqlite.SQLite", it works fine with no error, unlike the first file. I'm thinking it might have to be something with the encoding of the first SQLite file.

Any suggestion would be greatly appreciated! Please let me know if something is unclear.

Thanks!

@Shawn I connected it is the databricks DBFS (https://docs.databricks.com/data/databricks-file-system.html). Any ideas it works with some sqlite files but with others there is disk I/O error? — Faisal, Feb 25 '21 at 13:46
I have no idea what databricks is, but from that link, "distributed file system"? Yeah, that'd probably be the problem. Sqlite isn't designed to be used on networked filesystems, just local ones. You can look into extended error codes for a more specific failure reason, but locking in particular is a weak point with many of them. — Shawn, Feb 25 '21 at 14:22

Faisal · Accepted Answer · 2021-02-26T01:38:56.423

For anyone that will encounter this issue in the future, I found a solution and I think it will work on any distributed file system.

conn = sqlite3.connect('data.sqlite')
curr = conn.execute('PRAGMA locking_mode = EXCLUSIVE')

Just simply run the second line and it should work. I believe that the problem was that sqlite is meant to run locally (i.e., run by one machine), so on a NFS, setting the locking mode of the sqlite file to exclusive will allow only one user to access the sqlite file while the connection is still active.

Sqlite disk I/O error (python - databricks)

1 Answers1

Linked