Reading about how Google solves the translation problem got me thinking. Would it be possible to build a strong chess engine by analysing several million games and determining the best possible move based largely (completely?) on statistics? There are several such chess databases (this is one that has 4.5 million games), and one could potentially weight moves in identical (or mirrored or reflected) positions using factors such as the ratings of the players involved, how old the game is (to factor in improvements in chess theory) etc. Any reasons why this wouldn't be a feasible approach to building a chess engine?
-
@Chinmay Kanchi: you've answered your own question in one of your own comments... It doesn't work because of the combinatorial explosion. Heck, to build a database of chess games, you could make the best chess programs fight each other and create billions of games. That would still be a drop in the bucket. Then how would you analyze them? In real time? Or how would you store the best move in advance for every possible board? Where? Be it CPU or memory, there aren't enough atoms in the universe to make your approach feasible. Combinatorial explosion. – SyntaxT3rr0r Apr 27 '10 at 14:36
-
Google Translate is useful for a quick and dirty translation of a language you do not know, but its translations are often ungrammatical, perhaps depending on the specific languages involved. – hkBst Feb 17 '16 at 12:41
-
Now with Google's AlphaZero I would say they have already gone one step further, creating the statistical material by playing against itself. And guess what: thanks to AlphaZero, the preferred opening is now the English Opening, 1. c4. – Per Digre Apr 19 '18 at 17:19
11 Answers
Something like this is already done: it's the underlying concept of opening books.
Due to the nature of the game, computer AIs are notoriously weak in the opening, when there are so many possibilities and the end goal is still far ahead. They start to improve toward the middlegame, when tactical possibilities begin to form, and can play the endgame perfectly, far exceeding the capability of most humans.
To help the AI make good moves in the beginning, many engines rely on opening books instead: basically, a statistically derived flowchart of moves. Many games between highly rated players were analyzed, and recommendations are hard-coded into "the book"; while the position is still in "the book", the AI doesn't even "think", and simply follows what "the book" says.
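The statistical side of book-building can be sketched in a few lines. This is a minimal illustration, not any engine's actual format: games are assumed to be available as lists of (position, move) pairs, where a position is any hashable key such as a FEN string.

```python
from collections import Counter, defaultdict

def build_book(games):
    """Build a statistical opening book: for each position seen in the
    game collection, count which move was played next."""
    book = defaultdict(Counter)
    for game in games:                 # each game: list of (position, move)
        for position, move in game:
            book[position][move] += 1
    return book

def book_move(book, position, min_games=10):
    """Return the most frequently played move, or None if the position
    is 'out of book' (too few games to be statistically meaningful)."""
    played = book.get(position)
    if not played or sum(played.values()) < min_games:
        return None
    return played.most_common(1)[0][0]

# Toy data: the starting position as seen in 12 games.
games = [[("start", "e4")]] * 7 + [[("start", "d4")]] * 5
book = build_book(games)
print(book_move(book, "start"))    # most common book move: "e4"
```

Once a position falls below the `min_games` threshold the lookup returns `None`, which is exactly the point where a real engine would leave the book and start searching.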
Some people can also memorize opening books (this is mostly why Fischer invented his random chess variant, so that memorization of openings becomes far less effective). Partly due to this, sometimes an unconventional move is made in the beginning, not because it's statistically the best move according to history, but precisely the opposite: it's not a "known" position, and can take your opponent (human or computer) "out of the book".
On the opposite end of the spectrum, there is something called endgame tablebase, which is basically a database of previously analyzed endgame positions. Since the positions were previously searched exhaustively, one can use this to enable perfect play: given any position, one can immediately decide if it's winning, losing, or draw, and what's the best way to achieve/avoid the outcome.
In chess, something like this is only feasible for the opening and endgame, though. The complexity of the middle game is what makes the game interesting. If one can play chess simply by looking up a table, then the game wouldn't be as exciting, interesting, and deep as it is.

-
Rybka's opening book (it's for sale!): http://rybkachess.com/index.php?auswahl=Rybka+3+book – polygenelubricants Apr 27 '10 at 00:29
-
Both the opening book and the endgame database are not really statistical. I think a neural net (any kind of 'learning' software) comes closer. – H H Apr 27 '10 at 07:50
-
@Henk: Endgame tablebases clearly aren't, but the creation of the opening book itself is often based on statistics, no? Of course once the book is made, the computer just blindly follows it. – polygenelubricants Apr 27 '10 at 10:11
-
Opening books are the result of analysis by experts. You could call human learning a statistical process, but that is stretching it a little. – H H Apr 27 '10 at 10:47
-
@polygenelubricants: to me the *really* fascinating thing about chess is that it's a game of perfect information with strictly defined rules (well, mostly ;) and in the end, there are only three possible outcomes assuming optimal play on both sides: *a)* white always wins, *b)* black always wins and *c)* neither white nor black wins (experts lean towards *a)* btw). And what is really fascinating: we'll *never* know; there aren't enough atoms in the universe for us to "solve" chess :) The game is provably solvable (and it has been proven) but we simply cannot solve it :) – SyntaxT3rr0r Apr 27 '10 at 14:40
-
@WizardOfOdds: if you just want to go by raw numbers and other objective measurements, Go is vastly more fascinating than chess. I never learned to play Go, but for now I much prefer chess simply because I like having all the different types of pieces to play with. It gives it more personality and playing style in a sense ("Sac the queen!", "Twin bishops!", etc). Go's stones are homogeneous, which depending on how you see it can either be elegant or just plain boring. – polygenelubricants Apr 27 '10 at 15:09
-
@SyntaxT3rr0r: do you have a reference for `The game is provably solvable (and it has been proven) but we simply cannot solve it`? – sehe Jul 12 '11 at 16:42
-
@sehe: the keywords you're looking for are: *"two-person turn-based zero-sum game with perfect information"*. Chess is really just like tic-tac-toe, just with a much bigger number of moves. And just as there's an optimal strategy for tic-tac-toe, there's one for chess. But the combinatorial explosion is so big we can't compute it. The **only** thing that matters is this: *at the beginning of the game, is white in a winning position?* If not, *is black in a winning position?* If neither, then neither white nor black wins. – SyntaxT3rr0r Jul 16 '11 at 15:35
-
@sehe: btw, solving two-person turn-based zero-sum games with perfect information for fun is a very enjoyable thing to do. They typically have very elegant recursive algorithms. *TopCoder* used to have algorithm competitions where you had 75 minutes to solve three problems, one of them sometimes being solving such a game. Fun times. But, yup, chess is solvable (no matter the rules, though there may be different solutions depending on the rules), just not solved. – SyntaxT3rr0r Jul 16 '11 at 15:38
-
@sehe: as for a reference, the Wikipedia entry is actually quite good: http://en.wikipedia.org/wiki/Solving_chess Once again, that chess is solvable is not up for debate. However, there's some dispute as to whether one day we'll be able to build a computer fast enough to solve it: some people claim the combinatorial explosion is too big to be solved in our universe; others think it will one day be doable, around the year 2250 or so. – SyntaxT3rr0r Jul 18 '11 at 01:21
Well, 4.5 million games still covers only a very tiny (infinitesimally small) fraction of all possible games.
And while you would have a large number of winning and losing positions, that would leave the problem of reducing them to a usable set of parameters. A very old problem, with neural networks as a standard approach. But neural nets aren't winning chess tournaments.

-
I do of course realise that the number of possible chess games is a humongous number. However, there is something to be said for a human's ability to quickly evaluate a small number of "good" moves without even cursorily considering stuff that's obviously "bad". It seems like a statistical engine could make use of this. – Chinmay Kanchi Apr 26 '10 at 23:07
-
@Chinmay, few humans will have studied even close to a million games. But designing that "statistical engine" is the big challenge here. What would it look for? – H H Apr 26 '10 at 23:13
-
I was advocating a "dumb" approach. If a statistically significant number of games (after weighting) have made that move, you make the move too. – Chinmay Kanchi Apr 26 '10 at 23:29
-
@Chinmay: maybe start by analysing those 4.5 million games, and see how many distinct positions there are among them after n moves, for n=1 (i.e. how many white opening moves - perhaps all 20 appear) and up. You might find that very soon (after about 20-30 moves), standard openings are all played out and there is no longer any such thing as a statistically significant number of games. – Steve Jessop Apr 26 '10 at 23:54
-
Also, your statistical chess player will probably be rubbish at endgames. It won't have any material to go on beyond the point where someone resigns, so opponents might be able to call its bluff, and beat it from an "unwinnable" position simply because your algorithm can't close out ;-) – Steve Jessop Apr 26 '10 at 23:56
-
This general strategy has been tried for a variety of games. Very often people generate a suitably large database of games by having the computer play itself. A quick internet search turns up http://www.cs.princeton.edu/courses/archive/fall06/cos402/papers/chess-RL.pdf, which builds on previous work in backgammon. In chess, though, brute-force look-ahead is very effective for computers, and in general statistics is much more effective when you can mix in all of the previously known information about the problem, rather than trying to re-learn it from the data. I note that in this link the computer learnt what amounts to the evaluation function at the bottom of the look-ahead, not the whole process.
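That division of labour, brute-force look-ahead with a learned function only at the leaves, can be sketched as follows. The `ToyGame` class and the pass-through evaluation are stand-ins of my own invention, not the paper's actual model; the point is only where the learned part plugs in.

```python
class ToyGame:
    """A stand-in game: states are integers, a move raises the number
    by 1 or 2, and states >= 4 are terminal."""
    def moves(self, s): return [s + 1, s + 2]
    def apply(self, s, m): return m
    def is_terminal(self, s): return s >= 4

def negamax(state, depth, game, evaluate):
    """Plain brute-force look-ahead. The only 'learned' ingredient is
    the leaf evaluation passed in as `evaluate`; the search itself
    stays exact, as in the linked chess-RL work."""
    if depth == 0 or game.is_terminal(state):
        return evaluate(state)
    return max(-negamax(game.apply(state, m), depth - 1, game, evaluate)
               for m in game.moves(state))

# A trivial 'learned' evaluation: in the paper's setting this would be
# a function fitted to game outcomes; here it simply returns the state.
learned_eval = lambda s: s
print(negamax(0, 2, ToyGame(), learned_eval))
```

Swapping `learned_eval` for a statistically trained function changes the engine's judgment without touching the search, which is why this hybrid works so much better than trying to learn the whole move-selection process from data.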

-
@mcdowella: funny you mention that, I commented to the OP that he could take the best chess programs and make them play themselves to generate games, didn't know it was actually used for other games :) – SyntaxT3rr0r Apr 27 '10 at 14:44
There is something similar that works very well in computer Go: the UCT method. It does not use a known set of games but instead plays a huge number of random games, keeping statistics on which moves lead to higher win ratios. It does so starting from the current position.
The statistics are kept in a tree of moves (similar to one used in minimax) and influence the choice of the next random game to play - the moves with higher win ratios are chosen more often. The growth of the tree is also guided by the games - usually each game adds a leaf to the tree. This leads to the promising paths being explored more deeply.
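The core loop can be sketched on a toy game. This keeps statistics only at the root (a one-ply tree rather than the full grown tree the answer describes), which is enough to show how win ratios plus an exploration bonus (UCB1) guide the choice of the next random game; the Nim rules here are purely illustrative.

```python
import math
import random

def moves(pile):
    """Toy game (Nim): take 1 or 2 stones; whoever takes the last stone wins."""
    return [m for m in (1, 2) if m <= pile]

def uct_choose(pile, playouts=2000, c=1.4):
    """Choose a move by UCT-style sampling: keep win/visit statistics
    per root move, pick each playout's first move by win ratio plus an
    exploration bonus (UCB1), and finish the game with random moves."""
    wins = {m: 0 for m in moves(pile)}
    visits = {m: 0 for m in moves(pile)}
    for total in range(1, playouts + 1):
        # Selection: try unvisited moves first, then apply UCB1.
        m = max(moves(pile), key=lambda mv: float("inf") if visits[mv] == 0
                else wins[mv] / visits[mv]
                     + c * math.sqrt(math.log(total) / visits[mv]))
        # Simulation: play the rest of the game at random.
        left, my_turn = pile - m, False
        i_won = True                 # taking the last stone with m wins outright
        while left > 0:
            left -= random.choice(moves(left))
            i_won = my_turn          # the player who just moved wins if left == 0
            my_turn = not my_turn
        visits[m] += 1
        wins[m] += i_won
    # Final choice: the most-visited move, i.e. the statistical favourite.
    return max(moves(pile), key=lambda mv: visits[mv])
```

From a pile of 4 the winning move is to take 1 (leaving a multiple of 3), and the visit counts converge on it even though every playout below the root is completely random. A full UCT implementation would grow the tree one leaf per game, as the answer notes, instead of stopping at depth one.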

I like the idea, but the analogy [with text translation] seems to fall short when considering that the context of a sentence in natural language requires much fewer elements than the context of a chess board position (even though the elements of such sentences, namely words, may come from a bigger set than the elements of a chess game, namely the pieces: knight, pawn, etc.)
Furthermore, the availability of multilingual corpora (documents of various natures, in various languages) far exceeds the number of chess games one may find in digital form, particularly when considering that chess analysis needs the whole game, whereas for translation purposes one can use each sentence independently of the rest of the text.
As a result, and with the possible exception of the opening portion of the games (when the board positions haven't had much opportunity to diverge relatively to other games), the number of chess games required to introduce some statistical significance would have to be astronomical...
Gotta run, but I'll be back with specific estimates on the number of possible chess games (in the absolute, and in the subset of plausible games), and should effectively prove that 4.5 million games is a relatively small sample.

-
I've seen estimates that the complexity of the game tree is ~10^50 given the fifty move rule. So yes, 4.5 million games is a drop in the ocean. However, a huge number of these moves will almost never be played because they're truly horrifically awful, the kind of moves human players don't even consider. Of course, even if the number of "not awful" moves is only 10% of the total number of moves, you still end up with ~10^49 moves which doesn't help :D – Chinmay Kanchi Apr 26 '10 at 23:19
There are approximately 10^123 game trees in chess, out of which you have about 4.5 × 10^6 games in that database. We can ignore the game trees and only consider state-space complexity, of which there are anywhere between 10^43 and 10^50 legal states. Let's assume that all games in that database have unique moves, and that there are 1000 moves on average per game, which gives us 4.5 × 10^9 states. Taking the estimated lower bound of 10^43 possible states, that covers only 4.5 × 10^-34 of all states. I don't know the total number of unique board positions excluding rotations or reflections, but it will only reduce the count by a factor of two or so, which is not very helpful.
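The coverage fraction is quick to sanity-check with the answer's own (deliberately generous) figures:

```python
# Back-of-the-envelope check: states in the database vs. legal states.
games = 4.5e6                    # games in the database
states_per_game = 1000           # the answer's generous per-game assumption
db_states = games * states_per_game          # 4.5e9 assumed-unique states
legal_states_lower_bound = 1e43              # lower bound on legal positions
coverage = db_states / legal_states_lower_bound
print(coverage)                  # on the order of 10^-34
```

Even with every number rounded in the database's favour, the sample covers a vanishingly small sliver of the state space.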
You would need to add more domain knowledge into the statistical engine and find the level of similarity between two given board positions, since there is roughly a 1 in 10^35 chance that you will find a matching board position (even including reflections and rotations). I think the biggest key here would be finding out how similar two given board positions are. That will require a lot more domain knowledge than just simple transformations.
It is a great idea nonetheless that is worth exploring further, although I suspect that it has been tried before given the complexity of chess and the interest around it.

-
"there is a 1 in 10^35 chance that you will not find a matching board position (including reflections and rotations)". Not really. Chess positions aren't reached at random, either in the games in the database or in a particular game being played by the computer. – Steve Jessop Apr 27 '10 at 00:06
-
@Steve, are you saying the chances are higher or lower? The game tree will not always unfold as per the stored games in the database. It will exponentially branch out with each move the opponent makes that doesn't coincide with something already in the database. 1/10^35 is an approximation and the actual value isn't very far off. – Anurag Apr 27 '10 at 00:37
-
It's not really a question of probability. It depends how the opponent plays, and we can't treat that as a random variable with known distribution. When good chess players play machines, they deliberately play toward positions on which they think the machine will be weak. So in that case the probability would quite soon be much lower. Conversely, for as long as the opponent follows a standard opening the probability of a hit in the database is 1 for quite a few moves. – Steve Jessop Apr 27 '10 at 08:10
-
@Steve, probability has everything to do with it when you are writing a statistical engine. I am more interested in finding an upper bound for when the game does branch out beyond a small number of moves, and it can branch out really fast. Yes, there are 20 ways the first move can be played, but 69352859712417 ways the 10th move can be played. But you're right, when you have 20/400/8902 possible moves for the first three plies respectively, then probability of hitting one from the database is quite high, but reduces exponentially with each move. – Anurag Apr 27 '10 at 14:10
-
I think we understand different things by the word "probability". The probability of a game being in a position which is in the database is not equal to the number of positions in the database, divided by the number of positions a game can take. It is a function of the strategy of the opposing player. Since we have not said anything about the player, we therefore have literally no idea what the probabilities might be. If the player is another copy of our machine, for example, then the probability of the position being in the DB is 1, since that's the only way it can play. – Steve Jessop Apr 27 '10 at 14:23
-
... if the opposing player always plays standard openings for the first few moves, then likewise the probability of leaving the DB does not increase "exponentially", it remains at 1 until someone diverts from the standard opening. For top players, the probability that they will make a novel move early is quite low, at least in important games. Unless, of course, that's their tactic to beat the machine, in which case it's very high. For no reason other than sheer fluke will it be equal to the proportion of possible positions in the DB. – Steve Jessop Apr 27 '10 at 14:29
-
So, in order to calculate the probabilities you want, I think the closest thing to "correct" is to look at the games in the DB, and break them down to locate in each one the novel positions (if any) with respect to the DB at the time that game was played. This will give you a frequency-based probability estimate, assuming that the games your machine plays will be in some sense "typical". Which they won't be, of course, if the opponent knows or guesses they are playing a machine. – Steve Jessop Apr 27 '10 at 14:36
-
@Steve, Let's assume a hypothetical board game with toned down numbers. The game has 10 possible states. There is a database that contains only 2 of these 10 states. Given a random board position, what is the probability that the database contains that board position? If it's not 2/10 or 0.2, then yes our definitions of "probability" do differ. – Anurag Apr 27 '10 at 15:14
-
"Given a random board position" - random with what distribution? Your definition of "probability" appears to be "a board position selected randomly with uniform distribution". In stud poker the distribution of cards is uniform. In chess the distribution of pieces on the board is not, and neither is the distribution of board positions in games. So where we part company is that I don't think "a board position selected from all possible board positions with uniform distribution" is a reasonable model of what positions you'll see in a chess game. – Steve Jessop Apr 27 '10 at 15:49
-
@Steve, you're absolutely right in that a board position selected at random from a game does not have uniform distribution. But there are infinite factors you can throw at when coming up with the distribution like how much caffeine did the opponent have pre-game. Will the opponent play differently if it was a world championship versus a casual game with friends. Or maybe the opponent is pissed off at the rainy weather affecting his judgement. There are an infinite number of factors you can take into account in coming up with that model. – Anurag Apr 27 '10 at 18:14
-
And one thing you can be assured of is that model is not going to be the one right model. Ask ten people to model this as accurately as possible, and you will get ten different results. That's the reason why I picked a simple uniform distribution to find out the probability in the worst case - when textbook plays have played out, and the board is wide open and there is no help from that stored knowledge in the database. But if you fancy modeling out that distribution in detail and want to throw in as many factors as you want, then be my guest. – Anurag Apr 27 '10 at 18:17
I would say yes, it could work. Nobody's really tried it successfully yet, but why not look for "patterns" using a statistical approach? I am not considering storing whole board positions, since there are astronomically many of them to store, but merely looking for specific patterns.
Looking for Patterns
A typical chess program evaluates and gives bonuses for recognized patterns such as good defending pawn or an open rook line and on the other hand penalties for such as doubled pawns.
Such patterns could be represented efficiently as 64-bit masks: one bit-mask for the squares that matter and one for the expected pieces on those squares. Each pattern takes time to match, so it would be important to find the ones that make a difference. That's where Google's statistical approach would come in: it could run through "historic" games looking for patterns, and after finding one it would have to assign it a weight and see whether the improved evaluation outweighs the matching overhead.
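The mask-matching step might look like this, using Python integers as 64-bit bitboards. The doubled-pawn pattern and the square numbering (a1 = bit 0, one bit per square) are illustrative choices, not a particular engine's layout.

```python
def square(file, rank):
    """Bit for a single square; files and ranks numbered 0..7 (a1 = bit 0)."""
    return 1 << (rank * 8 + file)

def has_pattern(pieces, care_mask, expected):
    """A pattern matches when, on the squares we care about, the piece
    bitboard shows exactly the expected occupancy."""
    return (pieces & care_mask) == expected

# Illustrative pattern: white pawns doubled on e2 and e3.
care = square(4, 1) | square(4, 2)      # the squares that matter
expected = care                         # both must hold a pawn

white_pawns = square(4, 1) | square(4, 2) | square(0, 1)   # e2, e3, a2
print(has_pattern(white_pawns, care, expected))            # True
```

Each test is two bitwise operations and a comparison, which is what makes it plausible to check many statistically discovered patterns per position.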
I think this would be a rather huge project to try out, even too much for a PhD thesis.

Machine learning has made big progress lately, especially after the Google team beat the Go champion using ML. It has now also been demonstrated with chess; take a look at the article in MIT Technology Review: https://www.technologyreview.com/s/541276/deep-learning-machine-teaches-itself-chess-in-72-hours-plays-at-international-master/
ML's deep learning is an enhancement of the old self-tuning neural-network AI algorithms. Lai's demonstration didn't teach the machine the basic rules of chess or care about the outcome of the games. He just fed the machine a large database of games, and the machine figured out the rest and played at a reasonable "human" level.
I assume two big enhancements would be to make it more efficient by teaching it the rules, and then to guide it by feeding it the actual results of the games. Then, after that, go train against the current chess champions among engines, like Stockfish! :-)

A deep learning algorithm similar to the Go program that defeated the top human player might be a killer approach, though it would come at a high cost. One might, however, take the deep-learning patterns learned from Go and apply them to chess.

One thing I have not seen mentioned is consideration of the ratings of the players in the games in your database. Some openings with good database percentages are a result of the better player tending to win, and say little about the value of the opening.
In fact I have decided that databases are good for only one thing and that is indicating what moves are popular. Beyond that you are really stretching your interpretation of the data beyond what it merits.
Similarly computer analysis only shows the best result for computer vs computer games. Games between humans are different and you should not rely too heavily on computer analysis.
Both the database and computer analysis are interesting, but they can easily be misinterpreted. Beware.
Chinmay,
I know this is an old thread, but it's a topic I've been exploring lately. Most of the people who answered above didn't really get your question. I think that, yes, it is worth analyzing numerous games in the past to develop suggested moves. Will it cover all possible moves? No, obviously not. But it covers all realistic moves from real games. A human (or another computer algorithm) would have to start playing very strange moves to throw things off. So, you can't build a 'perfect' algorithm that wins all the time, but if it wins a lot, say a >2200 FIDE rating, it's not bad right? And if you incorporate Openings and Endgames, not just rely on past move analysis, it makes it an even better engine.
There are an astronomically high number of possible board positions, but the number is finite, and if you remove the stupid positions it shrinks quite a bit. Is it possible to have 4, 5 or 6 of one player's pawns lined up in the same file? Yes. Would it happen in a real game? I doubt it. Include a basic chess brain in your logic for situations where the opponent goes 'off book'. Micro-Max is only a couple hundred lines of code, for example. If the opponent has played stupidly to thwart your moves, they are probably beatable by a simple engine.