
I have a class that parses two large CSV files (the first has ~90K rows and 11 columns, the second ~20K rows and 5 columns). According to the specification I'm working with, the CSV files can be changed externally (rows added or removed; the columns and the file paths stay constant). Such updates can happen at any time, though it is highly unlikely that two updates will arrive less than a couple of minutes apart. An update to either file has to terminate the current processing of all data (CSV, XML from an HTTP GET request, UDP telegrams), followed by re-parsing both files (or just the one that changed).

I keep the CSV data in memory (considerably reduced, since I apply multiple filters to remove unwanted entries) to speed up working with it and to avoid unnecessary I/O operations (opening, reading, closing the file).

Right now I'm looking into `QFileSystemWatcher`, which seems to be exactly what I need. However, I'm unable to find any information on how it actually works internally.

Since all I need is to monitor two files for changes, the number of watched files shouldn't be an issue. Do I need to run the watcher in a separate thread (it is part of the same class that does the CSV parsing), or does it work asynchronously like `QNetworkAccessManager`, so that it can run without much fuss? My dev environment for now is a 64-bit Ubuntu VM (VirtualBox) on a relatively powerful host (an HP Z240 workstation); the target system, however, is an embedded one. While parsing the CSV files currently takes 2-3 seconds at most, I don't know how large the performance impact will be once the application is deployed, so additional overhead is a concern.
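To make the "terminate the current processing, then re-parse" requirement concrete, here is a minimal sketch in plain C++ (no Qt) of a parse that runs in chunks and can be cancelled between them. Everything in it is an assumption for illustration: the names `Row`, `splitCsvLine` and `parseInChunks`, the naive comma splitting (no quoted fields), and the `betweenChunks` callback that stands in for whatever lets other work run.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row type: one CSV line split on commas (no quoting support).
using Row = std::vector<std::string>;

Row splitCsvLine(const std::string& line) {
    Row fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Parses `lines` in chunks of `chunkSize` rows. After each chunk it calls
// `betweenChunks` (a hook for letting other work run) and then checks
// `cancel`. Returns true if all lines were parsed, false if cancelled early.
bool parseInChunks(const std::vector<std::string>& lines,
                   std::size_t chunkSize,
                   const std::atomic<bool>& cancel,
                   const std::function<void()>& betweenChunks,
                   std::vector<Row>& out) {
    if (chunkSize == 0) {
        chunkSize = 1;  // avoid an endless loop on a zero chunk size
    }
    out.clear();
    for (std::size_t start = 0; start < lines.size(); start += chunkSize) {
        const std::size_t end = std::min(start + chunkSize, lines.size());
        for (std::size_t i = start; i < end; ++i) {
            out.push_back(splitCsvLine(lines[i]));
        }
        betweenChunks();
        if (cancel.load()) {
            return false;  // caller discards `out` and starts a fresh parse
        }
    }
    return true;
}
```

In a Qt application the callback could, for example, call `QCoreApplication::processEvents()`, and the slot connected to the watcher's change signal could set the cancel flag. Whether that is cheap enough on the embedded target would have to be measured there.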

rbaleksandar
  • Just use it like most Qt things: create, connect signals to slots, never block event loop. – hyde Nov 16 '17 at 11:27
  • Parsing the big CSV file and how to do it without impact on other events, now that is a separate question, and quite broad. You can do it in another thread, or (probably simpler) you can do it in chunks and let the event loop run in between the chunks (there are a few ways to achieve this, too). – hyde Nov 16 '17 at 11:29
  • @hyde I have done it (create, connect) and it works fine. The only concern is how it will be affected when the system resources get reduced by a large margin. – rbaleksandar Nov 16 '17 at 11:29
  • So basically if the CSV parsing blocks the event loop for a long period of time, the watcher is very likely to miss changes in the files? – rbaleksandar Nov 16 '17 at 11:30
  • Qt mostly runs in a single thread itself, so no slot is going to get called while your own method is, for example, running some loop. No method is going to be interrupted when something happens; the whole thing relies on methods returning, so that the event loop can process the next event (or sleep/wait for a socket etc.). – hyde Nov 16 '17 at 11:31
  • No changes are going to be lost, the watcher uses OS services. I don't think (not 100% sure, but it'd be pretty bad if it could miss changes...) it will miss anything, at least not on Linux. You just won't get the Qt signal until control returns to the Qt event loop. – hyde Nov 16 '17 at 11:31
  • Yeah, the watcher uses `inotify` (if available; if not, some restrictions are imposed). Come to think of it, I am actually perfectly fine with waiting for the parsing to complete (even if a change in the file has occurred, it is still loaded in memory when the `QFile` opens it...as far as I know :D) and then (at a later point in time) getting notified about the file change. – rbaleksandar Nov 16 '17 at 11:36
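
The claim in the comments that the OS keeps a change notification until someone reads it can be checked against raw Linux `inotify`. The sketch below is not how `QFileSystemWatcher` is implemented; it only uses the `inotify` API directly, and the function name `countQueuedModifyEvents` and the temp-file path are made up for illustration. It modifies a watched file while nothing is reading the inotify descriptor, then reads the queue afterwards.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cstdlib>
#include <fstream>

// Creates a temp file, watches it, modifies it while nobody is reading the
// inotify descriptor, then drains the queue. Returns the number of IN_MODIFY
// events found, or -1 if the setup failed. (Hypothetical helper, Linux only.)
int countQueuedModifyEvents() {
    char path[] = "/tmp/watch_demo_XXXXXX";
    int tmpFd = mkstemp(path);  // fills in the XXXXXX part
    if (tmpFd < 0) {
        return -1;
    }
    close(tmpFd);

    int fd = inotify_init1(IN_NONBLOCK);  // non-blocking so read() can return
    if (fd < 0) {
        unlink(path);
        return -1;
    }
    if (inotify_add_watch(fd, path, IN_MODIFY) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    // The file changes here. Nothing is reading `fd` at this point, which
    // stands in for an application that is busy parsing.
    {
        std::ofstream out(path, std::ios::app);
        out << "new,row\n";
    }  // stream is flushed and closed at the end of this scope

    // Later: drain whatever the kernel queued in the meantime.
    alignas(inotify_event) char buf[4096];
    int modifyEvents = 0;
    ssize_t len;
    while ((len = read(fd, buf, sizeof buf)) > 0) {
        for (char* p = buf; p < buf + len;) {
            const auto* ev = reinterpret_cast<const inotify_event*>(p);
            if (ev->mask & IN_MODIFY) {
                ++modifyEvents;
            }
            p += sizeof(inotify_event) + ev->len;
        }
    }

    close(fd);
    unlink(path);
    return modifyEvents;
}
```

The event is still there when the descriptor is finally read, which matches the behaviour described above: the notification is delayed, not dropped. One caveat is that the kernel queue is bounded; if it overflows, inotify reports an `IN_Q_OVERFLOW` event instead of the individual changes, though that is unlikely with two files that change minutes apart.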

0 Answers