3

I have a website on PythonAnywhere that uses Flask and Flask-SQLAlchemy connected to a MySQL database. Users can use the website to queue a task that is saved to the database as a record in a table, and then a separate scheduled task (Python program) checks the database and handles every unhandled record.

The problem I'm encountering is that the scheduled task's db query seems to only find new records when it runs for the first time, but if I then use the website to add a new task, the still-running scheduled task's recurring db queries (every 5 seconds) don't seem to detect the new record.

Any ideas about what could be happening here?

Here's the code being run by the bash file:

def generate_any_pending_videos():
    unfinished_videos = db.session.query(Video)\
                                  .filter(~Video.status.has(VideoStatus.status.in_(['Error', 'Finished', 'Video deleted'])))\
                                  .order_by(Video.datetime_created)\
                                  .all()
    for video in unfinished_videos:
        try:
            logging.info("Attempting to create video for video %d" % video.id)
            generate_video(video)
        except Exception as e:
            logging.error(str(e))


if __name__ == '__main__':
    while True:
        generate_any_pending_videos()
        time.sleep(5)
Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
Nathan Wailes
  • 9,872
  • 7
  • 57
  • 95
  • Are you sure that the server’s db connection has persisted the changes (i.e. transaction has committed), and are you sure that the scheduled task’s db session refreshes instead of operating on cached objects? – Jens Sep 03 '18 at 21:54
  • I used MySQL Workbench to confirm that the new records are being persisted, and I'm not sure about the second point and in fact that is what I was suspecting was happening, but I wasn't sure if that's how the session worked or not. If there's a way to refresh the session that would solve my problem. – Nathan Wailes Sep 03 '18 at 21:55
  • Try opening a new session, or take a look at this question: https://stackoverflow.com/questions/19143345/about-refreshing-objects-in-sqlalchemy-session – Jens Sep 03 '18 at 21:59
  • Can you provide your code (or better, a minimal working example)? It's hard to help without looking at what you are doing. – jorgeh Sep 03 '18 at 21:59
  • @jorgeh I've added the essence of the code that's being run. Note that the "db" object is the `SQLAlchemy(app)` Flask-SQLAlchemy object from my Flask server, I'm not sure if that makes a difference. – Nathan Wailes Sep 03 '18 at 22:10
  • @Jens Thanks for the suggestion. I did see that SO question before I posted this one, but the difference here is that this isn't a case of a single object's values not refreshing; I'm actively trying to query the database for *new* objects. So It's not clear to me how I would use the `session.refresh(object)` method for this problem. – Nathan Wailes Sep 03 '18 at 22:15
  • As @Jens points out, it seems to be a cached-session problem. You could try issuing `session.expire_all()` at the end of `generate_any_pending_videos `. Probably not the most elegant solution, but if it works, at least you know what the issue is – jorgeh Sep 03 '18 at 22:16
  • @jorgeh I did actually just try that and it didn't work. – Nathan Wailes Sep 03 '18 at 22:17
  • @NathanWailes, take a look at the `db.session.new` (and `dirty` and `deleted`) properties to check for any changes to _that_ session. The `db.session` in your code is a fresh-shiny-new one, or where is it being created? – Jens Sep 03 '18 at 22:32
  • .oO( And are you sure that you want to stuff videos into a MySQL db? Take a look at [Choose Something Else](https://grimoire.ca/mysql/choose-something-else)—I think that there are better solutions to the problem at hand. ) – Jens Sep 03 '18 at 22:41
  • @Jens The records aren't actually videos, they're the settings used to generate and manage the videos. The videos themselves are stored on the filesystem. – Nathan Wailes Sep 03 '18 at 22:43
  • @Jens I used pdb to step into the program and check the status of the variables you mentioned but they all always show up as being `IdentitySet([])`. – Nathan Wailes Sep 03 '18 at 22:48

4 Answers4

8

Found a fix: for some reason running db.session.commit() before my query made the new records show up.

if __name__ == '__main__':
    while True:
        generate_any_pending_videos()
        time.sleep(5)
        db.session.commit()  # For some reason this is needed to be able to detect newly-created videos
Nathan Wailes
  • 9,872
  • 7
  • 57
  • 95
  • Hmm… Flask should tie SQLAlchemy sessions to the request transaction, and when a response is generated then the transaction should commit and with it the db session. If that is not the case, I recommend reviewing request handling and db session handling to make sure that data is being persisted correctly. – Jens Sep 03 '18 at 23:52
  • If InnoDB commited when a nonlocking SELECT returns a result, the (default) isolation level [REPEATABLE READ](https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html#isolevel_repeatable-read) would not make any sense. @MiguelGarcia is right in his answer. As long as you don't commit (or rollback) the transaction in REPEATABLE READ, all queries in it will read from the snapshot created by the first query and later changes to the DB will remain transparent to your transaction/session. – shmee Sep 04 '18 at 06:29
  • 1
    this worked for me!!! should be the right anwser. another way was to set `[mysqld] transaction-isolation=READ-COMMITTED` in server.cnf – r3v3r53 Dec 04 '19 at 09:14
  • 1
    OMG you have no idea how much I have searched and attempted to find a solution to this issue. thanks for your post! – C. Cooney Oct 11 '20 at 18:36
6

All SqlAlchemy queries run inside a transaction by default http://docs.sqlalchemy.org/en/latest/orm/session_transaction.html . That's the reason you are only getting new data in the first call of your script. Because inside a transaction seen data won't change, that is the I in ACID, transactions are isolated. After the commit or rollback the next query starts a new transactions, so you will get fresh data from the database.

Miguel Garcia
  • 356
  • 1
  • 5
1

You can change isolation level in sqlalchemy engine:

engine = create_engine(db_path, execution_options={"isolation_level": "READ COMMITTED"})

That is problem of mysql default config.

user1941407
  • 2,722
  • 4
  • 27
  • 39
0

The way I made this work was by setting the transaction isolation level in the MySQL config.

With a GLOBAL scope, the transaction isolation level is reset after restarting MySQL. You can do this by logging into the MySql DB on the terminal

sudo mysql -u root -p

SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;

Or:

sudo mysql -u root -p

SET GLOBAL transaction_isolation = 'READ-COMMITTED';

You can set 4 transaction isolation levels with 4 scopes as shown below. You can check more details in the docs here https://dev.mysql.com/doc/refman/8.0/en/set-transaction.html

Gaurav Sarma
  • 2,248
  • 2
  • 24
  • 45