So I'm new to python and just finished my first application. (Giving random chords to be played on a midi piano and increasing the score if the right notes are hit in a graphical interface, nothing too fancy but also non-trivial.) And now I'm looking for a new challenge, this time I'm going to try and create a program that monitors a poker table and collects data on all the players. Though this is completely allowed on almost all poker rooms (example of the largest one) there is obviously no set and go API available. This probably makes the extraction of relevant data the most challenging part of the entire program. In my search for more information, I came across an undergraduate thesis that goes in to writing such a program using Java (Internet Poker: Data Collection and Analysis - Haruyoshi Sakai).
In this thesis, the author speaks of 3 data collection methods:
- Sniffing packets
- Hand history
- Scraping the screen
Like the author, I've come to the conclusion that the third option is probably the best route, but unlike him I have no knowledge of how to start this.
What I do know is the following: Any table will look like the image below. Note how text, including numbers is written in the same font on the table. Additionally, all relevant information is also supplied in the chat box situated in the lower left corner of the window.
In some regards using the chat box sounds like the best way to go, seeing as all text is predictable and in the same font. The problem I see is computational speed: It will often occur that many actions get executed in rapid succession. Any program will have to be able to keep up with this.
On the other hand, using the table as reference means that you have to deal with unpredictable bet locations.
The plan: Taking this in to a count, I'd start by getting an index of all player's names and stacks from the table view and "initialising" the table that way, and continue to use their stacks to extrapolate the betting they do.
The Method: Of course, the method is the entire reason why I made this post. It seems to me like one would need some sort of OCR to achieve all this, but seeing as everything is in a known font, there may be some significant optimisations that can be made. I would love some input on resources to learn about solutions to similar problems. Or if you've got a better idea on how to tackle this problem, I'd love to hear that too!
Please do be sure to ask any questions you may have, I will be happy to answer them in as much detail as possible.