I would approach this with the traditional way an opening book is handled in chess engines:
- Generate all possible moves
- For each move:
    - Make that move
    - Look the resulting position up in your database
    - Undo the move
- Make the move that had the highest score in your database
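The steps above can be sketched roughly as follows. Everything game-specific here is an assumption made only to keep the example self-contained: the toy `position` type, `generate_moves`, `make_move`, `undo_move`, and the hard-coded `book_lookup` are hypothetical stand-ins for your own game logic and database.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy game: a position is a single counter and a move
   adds 1, 2 or 3 to it. Replace with your real game state. */
typedef struct { int counter; } position;

static void make_move(position *p, int move) { p->counter += move; }
static void undo_move(position *p, int move) { p->counter -= move; }

static size_t generate_moves(const position *p, int *moves) {
    (void)p; /* every move is always legal in this toy game */
    moves[0] = 1; moves[1] = 2; moves[2] = 3;
    return 3;
}

/* Stand-in for the database lookup: returns 1 and fills *score
   if the position is known, 0 otherwise. */
static int book_lookup(const position *p, uint32_t *score) {
    static const uint32_t known[] = {0, 10, 40, 25}; /* index = counter */
    if (p->counter < 1 || p->counter > 3) return 0;
    *score = known[p->counter];
    return 1;
}

/* Returns the move whose resulting position has the highest book score,
   or -1 if none of the resulting positions is in the book. */
int pick_book_move(position *p) {
    int moves[8];
    size_t n = generate_moves(p, moves);
    int best_move = -1;
    uint32_t best_score = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t score;
        make_move(p, moves[i]);
        int found = book_lookup(p, &score);
        undo_move(p, moves[i]); /* always restore the position */
        if (found && (best_move < 0 || score > best_score)) {
            best_move = moves[i];
            best_score = score;
        }
    }
    return best_move;
}
```

When `pick_book_move` returns -1, the engine has left the book and would fall back to its normal search.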
Looking up a move
Chess engines usually compute a hash of the current game state via Zobrist hashing, which is a simple way to construct a good hash function for game states.
The big advantage of this approach is that it takes care of transpositions: if the same state can be reached via different move orders, you don't need to worry about those alternate paths, only about the game states themselves.
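A minimal sketch of Zobrist hashing, assuming a toy board with 9 cells and 2 piece types (the sizes, the function names, and the use of splitmix64 to fill the table are my choices for illustration, not something your game dictates). Each (cell, piece) pair gets a fixed random 64-bit key, and the hash of a position is the XOR of the keys of all pieces on the board.

```c
#include <stdint.h>

#define CELLS  9 /* assumed board size */
#define PIECES 2 /* assumed number of piece types */

static uint64_t zobrist[CELLS][PIECES];

/* splitmix64: a small, well-known PRNG step, used here only to fill
   the key table with reproducible pseudo-random 64-bit values. */
static uint64_t splitmix64(uint64_t *state) {
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill the table once at startup. The seed must stay the same for as
   long as stored hashes are supposed to remain valid. */
void zobrist_init(uint64_t seed) {
    for (int c = 0; c < CELLS; c++)
        for (int p = 0; p < PIECES; p++)
            zobrist[c][p] = splitmix64(&seed);
}

/* Placing and removing a piece are the same operation: XOR its key. */
uint64_t zobrist_toggle(uint64_t hash, int cell, int piece) {
    return hash ^ zobrist[cell][piece];
}
```

Because XOR is commutative, placing piece 0 on cell 0 and then piece 1 on cell 4 yields the same hash as doing it in the opposite order, which is exactly why transpositions collapse into one entry. Chess engines also XOR in keys for the side to move and similar state; whether you need that depends on your game.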
How chess engines do this
Most chess engines use static opening books that are compiled from recorded games and hence use a simple binary file that maps these hashes to a score; e.g.
    #include <stdint.h>

    struct book_entry {
        uint64_t hash;   /* Zobrist hash of the position */
        uint32_t score;  /* score stored for that position */
    };
The entries are then sorted by hash, and thanks to operating system caching, a simple binary search through the file will find the needed entries very quickly.
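As a sketch of that lookup, assuming the book has already been loaded (or memory-mapped) into an array sorted by hash, the standard library's `bsearch` is enough. The names `compare_by_hash` and `book_find` are made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

struct book_entry {
    uint64_t hash;
    uint32_t score;
};

/* Orders entries by hash. Compares explicitly instead of subtracting,
   since a 64-bit difference does not fit in an int. */
static int compare_by_hash(const void *a, const void *b) {
    const struct book_entry *x = a;
    const struct book_entry *y = b;
    return (x->hash > y->hash) - (x->hash < y->hash);
}

/* Returns the entry with the given hash, or NULL if the position is
   not in the book. `book` must be sorted by hash in ascending order. */
const struct book_entry *book_find(const struct book_entry *book,
                                   size_t count, uint64_t hash) {
    struct book_entry key = { hash, 0 };
    return bsearch(&key, book, count, sizeof *book, compare_by_hash);
}
```

The same comparator can be passed to `qsort` when the book file is built.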
Updating the scores
However, if you want the engine to learn continuously, you will need a more complicated data structure. At that point it is usually not worth writing one yourself, and you should use an existing library. I would probably use LevelDB, but anything that lets you store key-value pairs is fine (Redis, SQLite, GDBM, etc.).
Learning the scores
How exactly you update the scores depends on your game. In games with a lot of data available, a simple approach works, such as storing the percentage of games won after the move that led to the position. If you have less data, you can store the result of a game tree search from the position in question as the score. Machine learning techniques such as Q-learning are also a possibility, although I do not know of a program that actually does this in practice.
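For the win-percentage variant, one possible sketch is to keep two counters per position and derive the score from them. The `win_stats` layout and function names below are hypothetical; in practice this record would be the value stored under the position's hash in your key-value store.

```c
#include <stdint.h>

/* Hypothetical per-position record. */
struct win_stats {
    uint32_t games; /* games that passed through this position */
    uint32_t wins;  /* how many of those were won */
};

/* Call once per finished game for every book position it visited. */
void record_result(struct win_stats *s, int won) {
    s->games++;
    if (won)
        s->wins++;
}

/* Win percentage in the range 0..100; 0 if the position has never
   been seen. The 64-bit intermediate avoids overflow in wins * 100. */
uint32_t win_percentage(const struct win_stats *s) {
    if (s->games == 0)
        return 0;
    return (uint32_t)((uint64_t)s->wins * 100 / s->games);
}
```

Storing the raw counts rather than the percentage itself keeps updates exact and lets you treat positions with very few games as unreliable.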