1

I'm working on a real-time video analysis system that processes a video stream frame by frame. Each frame can generate several events, all of which should be recorded and some of which must be delivered to another system over the network. The system is soft real-time, i.e. message latencies above 25 ms are highly undesirable, but not fatal.

Are relational databases (specifically MySQL and Postgres) appropriate as the datastore for such a system?

Can I expect the DB to work well when it is installed on its own server and receives ~50 streams of single-row SQL inserts at 25 fps each (roughly 1,250 inserts per second) over the network?

EDIT: I think raw throughput in general would not be a problem, but I worry about the latency variance. If it occasionally stalls for 1000 ms, that would be very bad.

Oh, and the system runs 24/7, so the DB could grow arbitrarily large. Does that degrade insert latency?

ansgri
  • As an added note, you only need MySQL and Postgres to be "normally" able to save the frames. If they "normally" take 20 ms to save a frame, it's OK. You can have an async queue of buffers that need saving. – xanatos Oct 29 '11 at 10:21
  • @xanatos: That queue might take ACID away, which might not be a problem with this app. Another possibility is working on a real-time operating system. But I'd really expect "work well" to depend a lot on the hardware that RTOS is running on. – Mike Sherrill 'Cat Recall' Oct 29 '11 at 10:28
  • @CodeInChaos Rows are about 1 kB, several dozen ints and floats. – ansgri Oct 29 '11 at 10:32
  • As for hardware, we currently have fairly recent and powerful Core i7 boxes; we're hoping to move to something much less expensive for the video processing, plus one powerful server for the DB. – ansgri Oct 29 '11 at 11:06

3 Answers

2

I wouldn't worry too much about performance when choosing a relational database over another type of datastore; choose the solution that best meets your requirements for accessing that data later. However, if you do choose not only an RDBMS but one reached over the network, then you might want to consider briefly buffering events to a local disk on their way over to the DB. Use a separate thread or process to push events into the DB so the real-time system is unaffected; a sketch follows below.
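
A minimal sketch of that buffering idea in Python, assuming Postgres reached via psycopg2 and a hypothetical events table (ts, stream_id, payload); the event dict shape, the spool file name, and the 50 ms batching interval are all illustrative assumptions, not anything from the question:

import json
import os
import threading
import time

import psycopg2

SPOOL = "events.spool"          # local disk buffer on the capture machine
spool_lock = threading.Lock()

def record_event(event: dict) -> None:
    # Called from the real-time loop: a single local append, no network I/O.
    with spool_lock:
        with open(SPOOL, "a") as f:
            f.write(json.dumps(event) + "\n")

def drain_spool_forever(dsn: str) -> None:
    # Separate thread: periodically swap the spool out and push its rows to the DB.
    conn = psycopg2.connect(dsn)
    while True:
        time.sleep(0.05)  # batch roughly every 50 ms
        with spool_lock:
            if not os.path.exists(SPOOL):
                continue
            os.rename(SPOOL, SPOOL + ".work")  # capture starts a fresh file
        with open(SPOOL + ".work") as f:
            rows = [json.loads(line) for line in f]
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO events (ts, stream_id, payload) VALUES (%s, %s, %s)",
                [(r["ts"], r["stream_id"], json.dumps(r["payload"])) for r in rows],
            )
        conn.commit()
        os.remove(SPOOL + ".work")

threading.Thread(target=drain_spool_forever,
                 args=("dbname=video user=app",), daemon=True).start()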

Dawngerpony
  • Added an edit clarifying my worries. Can there be big latency variance? – ansgri Oct 29 '11 at 10:35
  • +1 for noting that you should handle the DB writing on a separate thread, to make sure occasional latency fluctuations do not block frame recording. – Albin Sunnanbo Oct 29 '11 at 10:45
2

The biggest problems are how unpredictable the latency will be and that it only ever goes up, never down. But modern hardware comes to the rescue: specify a machine with enough CPU cores. You can count on at least two; getting four is easy. So you can spin up a thread and dedicate one core to the dbase updates, isolating them from your soft real-time code (a sketch follows below). Then you don't care about the variability in the delays, at least as long as the dbase updates don't take so long that you generate data faster than the writer can consume it.
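
A minimal sketch of that dedicated-thread pattern, again in Python with psycopg2 and the same hypothetical events table as above; the queue bound of 10,000 rows (about 8 seconds of backlog at 1,250 rows/s) is an illustrative choice, not a recommendation:

import json
import queue
import threading

import psycopg2

events: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)

def on_frame_event(event: dict) -> None:
    # Called from the soft real-time loop; never blocks on the DB.
    try:
        events.put_nowait(event)
    except queue.Full:
        pass  # writer has fallen behind: drop (or spill to disk) rather than stall a frame

def writer_loop(dsn: str) -> None:
    # All DB latency variance is absorbed on this thread, never in the frame loop.
    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    with conn.cursor() as cur:
        while True:
            e = events.get()
            cur.execute(
                "INSERT INTO events (ts, stream_id, payload) VALUES (%s, %s, %s)",
                (e["ts"], e["stream_id"], json.dumps(e["payload"])),
            )

threading.Thread(target=writer_loop, args=("dbname=video user=app",), daemon=True).start()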

Set up a dbase server and load it up with fake data, double the amount you think it will ever need to store. Test continuously while you develop, and add the instrumentation code you need to measure how it is doing at an early stage in the project; a load-test sketch follows below.
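
A sketch of the kind of instrumentation meant here, under the same assumptions (psycopg2, hypothetical events table); it times each insert from the client side and prints the percentiles that matter against the 25 ms budget:

import json
import statistics
import time

import psycopg2

def measure_insert_latency(dsn: str, n: int = 10_000) -> None:
    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    latencies_ms = []
    with conn.cursor() as cur:
        for i in range(n):
            t0 = time.perf_counter()
            cur.execute(
                "INSERT INTO events (ts, stream_id, payload) VALUES (now(), %s, %s)",
                (i % 50, json.dumps({"i": i})),  # 50 fake streams, tiny payload
            )
            latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    latencies_ms.sort()
    print(f"p50={statistics.median(latencies_ms):.2f} ms  "
          f"p99={latencies_ms[int(0.99 * len(latencies_ms))]:.2f} ms  "
          f"max={latencies_ms[-1]:.2f} ms")

Run it against the prefilled table, and rerun it as the table grows, to see whether insert latency actually degrades with size.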

Hans Passant
  • So I should have a separate "DB-writer" process on each machine that serves as a cache for DB updates and guarantees data integrity in case the DB dies? – ansgri Oct 29 '11 at 10:53
  • You don't need a process, a thread will do just fine. This answer was addressing latency concerns, not data integrity. You very rarely program to deal with that. Dbase engines are very reliable and when they fall over it is rarely useful to keep your program running. If you know how to store data securely without a working dbase then you probably don't need that dbase in the first place. – Hans Passant Oct 29 '11 at 11:10
  • I like that suggestion: I can store data in memory while the DB restarts to minimize damage, then send it once the DB is back (a sketch of such a loop follows below). The system becomes self-repairing, as it should be, since remote access to it is quite unreliable. – ansgri Oct 29 '11 at 11:23
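
A sketch of such a self-repairing writer loop, building on the bounded queue from the earlier sketch; psycopg2's OperationalError handling and the one-second backoff are illustrative assumptions:

import json
import time
from collections import deque

import psycopg2

def resilient_writer(dsn: str, events) -> None:
    # events is the bounded queue from the earlier sketch.
    pending = deque()  # rows taken from the queue but not yet confirmed by the DB
    conn = None
    while True:
        if not pending:
            pending.append(events.get())
        try:
            if conn is None or conn.closed:
                conn = psycopg2.connect(dsn)
                conn.autocommit = True
            with conn.cursor() as cur:
                while pending:
                    e = pending[0]
                    cur.execute(
                        "INSERT INTO events (ts, stream_id, payload) VALUES (%s, %s, %s)",
                        (e["ts"], e["stream_id"], json.dumps(e["payload"])),
                    )
                    pending.popleft()  # discard only after a successful insert
        except psycopg2.OperationalError:
            conn = None
            time.sleep(1.0)  # DB down or restarting: hold the row in memory and retry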
1

As I've written, if you queue the rows that need to be saved and save them asynchronously (so as not to stall the "main" thread), there shouldn't be any problem... BUT!!!

You want to save them in a DB... so someone else will be reading the rows AT THE SAME TIME they are being written. Sadly, it's normally quite difficult to tell a DB "this work is very high priority; everything else can be stalled, but not this". So if someone does:

BEGIN TRANSACTION
    SELECT COUNT(*) FROM [TABLE]  -- read the whole table, taking a lock on it
    WAITFOR DELAY '01:00:00'      -- then hold the transaction open for an hour

(I'm using T-SQL here, but I think it's quite clear: ask for the COUNT(*) of the table, so that a lock is taken on it, and then WAITFOR an hour.)

then the writes could stall and time out. In general, if you configure everyone except the app to be able to do only reads, these problems shouldn't arise; a sketch of bounding the writer's wait follows below.
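
One mitigation on the writer's side, sketched under the same assumptions (Postgres, psycopg2 2.8+): cap how long a single INSERT may wait behind someone else's lock, so a blocked write fails fast and can be re-queued instead of freezing the writer thread indefinitely:

import psycopg2

conn = psycopg2.connect("dbname=video user=app")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("SET statement_timeout = 100")  # milliseconds, for this session
    try:
        cur.execute("INSERT INTO events (ts, stream_id, payload) VALUES (now(), 1, '{}')")
    except psycopg2.errors.QueryCanceled:
        pass  # blocked for more than 100 ms: put the row back in the queue and retry later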

xanatos