
Corrections about Chess Explanations


As soon as I wrote my previous essay, I realized there are several points that require correction or further explication. 
First, I checked whether Stockfish would use against me the same strategy that I described for KRK and KQK, and saw that it does not, which proves I was wrong when I supposed that there are no correct strategies substantially different from those I described. It turns out there are, and they deliver checkmate faster, especially when the starting position has BK in the center and WK in the corner. The basic difference is that I assumed BK must be squeezed from the center towards an edge by pushing it away from WK, immediately taking control, with the major piece, of the line that separates the kings, while the engine is not afraid to push BK towards WK, saving in that way a few of the moves WK would otherwise need to approach BK.
Second, the precise and concise language in which rigorous proofs are written is mathematics. Describing the KRK endgame concept mathematically means proving the theorem that the KRK endgame is a win for White regardless of the starting position, with the trivial drawing exceptions when Black moves first and can immediately capture the rook, or is stalemated in the corner.
That requires describing at least one winning strategy for White and proving its correctness, either by a computer-aided method or by pen and paper.
That leads to a decomposition of the main theorem into a set of lemmas, one for each stage of the strategic plan, each of which has to be proved in order to prove the main theorem.
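To make the shape of that decomposition concrete, here is a minimal LaTeX sketch of how the statement could be organized; the wording is mine and anticipates the lemma numbering used later in this essay:

```latex
\documentclass{article}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\begin{document}

\begin{theorem}[KRK]
Every legal KRK position is a win for White, with the trivial drawing
exceptions, Black to move, in which Black can immediately capture the rook
or is stalemated in the corner.
\end{theorem}

\begin{lemma}
In the KRK endgame, checkmate is possible if and only if BK is at the edge
of the board.
\end{lemma}

\begin{lemma}
In the KRK endgame, BK can always be forced to an edge.
\end{lemma}

\begin{lemma}
With BK confined to an edge by WR on the penultimate line, White can always
force the mating position of Lemma~1.
\end{lemma}

\end{document}
```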
However, this form of describing a concept as an alternating sequence of theorems and proofs is applicable only to endgames (some of them, at least), while a chess concept is a broader term that includes not only endgames but also categories that cannot be described that way, otherwise chess would be mathematically solved.
Hence the question of the utility of a fictional CCDL still stands.
Also, there is a real language, the Chess Query Language or CQL, http://www.gadycosteff.com/cql/index.html, coauthored by Gady Costeff and Lewis Stiller, which covers a great deal of the needs of chess explanation. It searches for user-defined patterns in chess games by means of filters, a feature not uncommon in query languages: applied to an input .pgn file, a filter outputs a new .pgn file that contains only those games from the input in which the pattern is recognized.
A filter is defined in a file of its own, a .cql file, and it contains a boolean function that returns true if a certain condition the user finds interesting is met within a game, i.e. if the pattern is matched. That function can take a single position as an argument, in the case of position patterns, or multiple positions, in the case of movement patterns. That is the most basic categorization of CQL patterns one can think of, since each game from the input .pgn is replayed, i.e. transcribed into a list of FENs or whatever other position representation is used, and the positions are examined independently or together, according to the pattern's requirements. So any kind of pin is an example of a position pattern, while two or more consecutive checks by the same side during which the score deteriorates for that side is an example of a movement pattern, one that could be called "wasted opportunity by futile checking". Actually, that would be a movement pattern even with just one check, because it would involve examining the scores of two positions, before and after the check.

Anyway, the point is that a pattern can be whatever a human user finds meaningful and can code for in the filter function, which leads to the third point I wanted to explicate. Namely, if a certain pattern can be envisaged in advance as important and its filter function can be implemented by manual coding easily and precisely, it obviously makes no sense whatsoever to deploy machine learning and NN training to obtain the function that recognizes that pattern, contrary to what I supposed was done in the case of DecodeChess. However, one can draw parallels between this and MuZero's legal_move_generator and recognize_terminal_game_state functions: how they are obtained and why. If one wants a general algorithm that gets as little specific programmer's input as possible, because it can autonomously decide on what is important, then it would make sense in DecodeChess's case to deploy machine learning heavily, to identify all patterns by itself, unsupervised by anything but oracles such as engines and EGTBs, i.e. to determine what has actually been played when a certain side gets the upper hand. But it might be that DecodeChess has manually hardcoded recognition of all its patterns, adding just a bit of natural language generation on top of that, and there you have it, a chess explainer.
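To make the filter idea concrete, here is a rough Python analogue of a position-pattern filter, not actual CQL syntax, written with the python-chess library; the file names input.pgn and matches.pgn and the choice of pattern (a pin occurring anywhere in the game) are only illustrative:

```python
# A CQL-like position-pattern filter sketched in Python with python-chess:
# keep only those games from input.pgn in which, at some point, a white piece
# is absolutely pinned to the white king.
import chess
import chess.pgn

def position_pattern(board: chess.Board) -> bool:
    """True if any white piece (other than the king) is pinned to the white king."""
    king_sq = board.king(chess.WHITE)
    for square in chess.SquareSet(board.occupied_co[chess.WHITE]):
        if square != king_sq and board.is_pinned(chess.WHITE, square):
            return True
    return False

def game_matches(game: chess.pgn.Game) -> bool:
    """Replay the game and test the pattern on each position independently."""
    board = game.board()
    for move in game.mainline_moves():
        board.push(move)
        if position_pattern(board):
            return True
    return False

with open("input.pgn") as src, open("matches.pgn", "w") as dst:
    while (game := chess.pgn.read_game(src)) is not None:
        if game_matches(game):
            print(game, file=dst, end="\n\n")
```

A movement pattern, such as the "futile checking" example above, would differ only in that game_matches would compare scores across consecutive positions instead of testing each position in isolation.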
Let us get back to the KRK endgame. I got the idea to investigate how exactly Stockfish is going to play it only after I learned about alternatives, such as those described in a series of articles authored by Maliković, Janičić, Marić and Čubrilo, in which they demonstrate computer-aided proofs of the KRK theorem:

That is a fascinating subject that made me aware, among other things, of the existence of Bratko's strategy for KRK.

However, I wasn't satisfied with my presentation of the KRK subject in the previous essay, so I decided to rewrite it in the form of a pen and paper proof:

Auxiliary Definition 1: two kings stand in opposition if they stand on the same line (rank or file) two squares apart, i.e. with exactly one square between them.
Corollary 1: two kings in opposition prevent each other from moving to the three squares adjacent to both of them, i.e. to the line that separates them, which is perpendicular to the line they both stand on. Proof: follows trivially from Auxiliary Definition 1 and the game rules.
Corollary 2: when a king is not placed at the edge of the board and is not constrained by other pieces, it has access to three lines parallel to any given edge, i.e. to three ranks and three files. Proof: follows trivially from the game rules.
Obviously, these two are corollaries not of the main theorem but of the game rules; they are also lemmas of the main theorem, so I could have named them that way as well.
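As a quick sanity check, Corollary 2 can be verified exhaustively in a few lines of plain Python (the helper name reachable_lines is mine):

```python
# Exhaustive check of Corollary 2 in plain Python: an unconstrained king away
# from the edge reaches squares on exactly three files and three ranks; at the
# edge or in the corner it reaches fewer lines.
def reachable_lines(file: int, rank: int) -> tuple[int, int]:
    squares = [(file + df, rank + dr)
               for df in (-1, 0, 1) for dr in (-1, 0, 1)
               if 0 <= file + df <= 7 and 0 <= rank + dr <= 7]
    return len({f for f, _ in squares}), len({r for _, r in squares})

for f in range(8):
    for r in range(8):
        if 0 < f < 7 and 0 < r < 7:
            assert reachable_lines(f, r) == (3, 3)   # interior square
        else:
            assert min(reachable_lines(f, r)) == 2   # edge or corner square
```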
Lemma 1: in the KRK endgame checkmate is possible if and only if BK is at the edge of the board. Proof:
BK is checkmated when WR checks it along the edge on which they both stand, with d(BK,WR)>1, while WK stands in opposition on the line perpendicular to that edge; if BK is in the corner, WK can also stand one square away from the opposition square just described, two lines away from the edge on which both BK and WR stand, at Chebyshev distance d(WK,BK)=2.
In both cases WK denies BK access to the penultimate line parallel to the edge BK stands on: in the opposition case according to Corollary 1, and in the corner case because the third square not guarded by WK simply does not exist, while WR covers the edge itself. No other checkmate position is possible, because there are no other pieces that could deny BK access to the third line it would have access to, according to Corollary 2, were it not placed at the edge, and this proves Lemma 1.
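Both mating patterns of Lemma 1 can be checked directly, for instance with the python-chess library; the concrete squares below are example placements of mine that satisfy the lemma:

```python
# The two mating patterns described in the proof of Lemma 1, checked with
# python-chess; the concrete squares are example placements of mine.
import chess

# Opposition mate: BK e8, WK e6 in opposition, WR a8 with d(BK, WR) > 1.
assert chess.Board("R3k3/8/4K3/8/8/8/8/8 b - - 0 1").is_checkmate()

# Corner mate: BK a8, WK b6 at Chebyshev distance 2 (not in opposition), WR h8.
assert chess.Board("k6R/8/1K6/8/8/8/8/8 b - - 0 1").is_checkmate()

# Away from the edge the opposition pattern gives only a check, not mate:
# BK e5, WK e3, WR a5 -- BK escapes to the sixth rank, the third line it has
# access to by Corollary 2.
assert not chess.Board("8/8/8/R3k3/8/4K3/8/8 b - - 0 1").is_checkmate()
```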
Now, in order to prove the main theorem, one has to prove two more lemmas:
Lemma 2: in the KRK endgame it is always possible to force BK to an edge of the board.
Lemma 3: in the KRK endgame, if BK is confined to an edge by WR standing on the penultimate line, which is parallel to that edge and separates the two kings, it is always possible to force a position in which it is White to move and deliver checkmate, by reaching the position described in Lemma 1. Proof:
In order to prepare that mate-in-1 position, White makes three kinds of moves (a code sketch of this classification follows the proof):
1.) decreasing the Manhattan distance d(WK,BK), approaching with the king without ever stepping onto the penultimate line or into opposition;
2.) shifting the rook from one end of the penultimate line to the other, staying safe on that same line, when BK threatens to capture it;
3.) a waiting move with the rook when 1.) is not possible and 2.) is not necessary; this is any move along the penultimate line that does not change the ordering of the pieces and keeps WR safe from capture.
It is obvious that by applying 2.) White can always gain the tempo needed to approach with the king, from any more distant line, to the line adjacent to the one the rook holds, two lines away from the edge where BK stands, and to transpose into a position in which WK stands between BK and WR in terms of lines perpendicular to the edge where BK stands, although WR stands all the time between WK and BK in terms of lines parallel to that edge. From that position Black has two choices for moving the king along the edge: to increase or to decrease the Manhattan distance d(WK,BK). If Black tries to increase it, White follows with 1.), but Black is limited in this strategy by the corner; if Black tries to decrease it, White follows with 1.) if possible and with 3.) otherwise, and the limit of this strategy is Black stepping into opposition, upon which White delivers checkmate, which proves Lemma 3.
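Here is a minimal sketch of the move classification 1.)-3.), assuming the python-chess library and the concrete orientation in which BK is confined to the eighth rank and WR holds the seventh; side conditions such as preserving the file ordering of the pieces for a waiting move are omitted:

```python
# The three move kinds of Lemma 3, labelled for candidate white moves in a KRK
# position with BK confined to the eighth rank and WR on the seventh.
import chess

PENULTIMATE_RANK = 6  # the seventh rank, 0-based

def manhattan(a: int, b: int) -> int:
    return (abs(chess.square_file(a) - chess.square_file(b))
            + abs(chess.square_rank(a) - chess.square_rank(b)))

def in_opposition(a: int, b: int) -> bool:
    same_line = (chess.square_file(a) == chess.square_file(b)
                 or chess.square_rank(a) == chess.square_rank(b))
    return same_line and chess.square_distance(a, b) == 2

def classify(board: chess.Board, move: chess.Move) -> str | None:
    wk, bk = board.king(chess.WHITE), board.king(chess.BLACK)
    wr = next(iter(board.pieces(chess.ROOK, chess.WHITE)))  # assumes KRK
    to = move.to_square
    if move.from_square == wk:
        # 1.) king approach: Manhattan distance to BK decreases, and the king
        # steps neither onto the penultimate rank nor into opposition.
        if (manhattan(to, bk) < manhattan(wk, bk)
                and chess.square_rank(to) != PENULTIMATE_RANK
                and not in_opposition(to, bk)):
            return "1.) king approach"
    if move.from_square == wr and chess.square_rank(to) == PENULTIMATE_RANK:
        if chess.square_distance(to, bk) > 1:       # stays safe from capture
            if chess.square_distance(wr, bk) == 1:  # BK was threatening WR
                return "2.) rook shift to the far side, gaining a tempo"
            return "3.) waiting rook move"
    return None
```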
In the same vein, one can prove Lemma 2, and complete the theorem:
It is always possible to place WR, safe from capture, on the line that separates the two kings; that conclusion is rather trivial. From there White applies exactly the same strategy as in Lemma 3; the only difference is that instead of checkmate it results in squeezing BK one line closer to the edge, and the limit of this strategy is BK ending up at the edge.

I double-checked the quality of this proof by posting it here:


The fourth point I wanted to correct concerns the comparison between the legal_move_generator and recognize_terminal_game_state functions on one side and evaluate_position on the other.
After studying:

and

I realized that the law of diminishing returns regarding the quality of approximation applies with respect to growing the NN training dataset (which must be done to avoid overfitting if one wants to increase the number of NN training iterations) and with respect to growing the NN topology, regardless of the Kolmogorov complexity of the function one is trying to approximate. The only difference is that with evaluate_position we actually keep moving our goal towards greater complexity to achieve higher accuracy, and that is why we need that growth; this is not the case with the first two functions, where perfect accuracy can be achieved with a simple function. The more complex evaluation in turn allows less search of the game tree for the same playing strength of the engine which deploys it. If we were satisfied with the simplest Shannon-like evaluation of position, such a function could be approximated sufficiently well by training a small NN on a small training dataset (just like the first two functions), but using it would require much more extensive search to reach a level of play probably still inferior to that of engines that deploy a more complex evaluation, one that is the result of self-play in the AlphaZero training manner. Not to mention that such a function can easily be handcrafted (manually implemented) instead of being the result of NN training, while the reverse is not possible: we cannot manually implement an evaluation of position at the level achieved by self-play in the AlphaZero training manner.
So these are two different things: how closely an NN can approximate a manually implemented function, and how accurately that function fits the actual needs. The distinction becomes blurred if we do not manually implement the function in practice, but it still exists in theory.
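For illustration, a Shannon-like evaluate_position of the kind meant above really is a few lines of handcrafted code; this sketch uses the python-chess library, and the piece values and the mobility weight are arbitrary:

```python
# A handcrafted, Shannon-like evaluate_position: material plus a small,
# arbitrary mobility bonus (python-chess).  Trivial to write by hand, but an
# engine relying on it needs far deeper search than one with a learned,
# self-play-trained evaluation.
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9}

def material(board: chess.Board, color: chess.Color) -> int:
    return sum(value * len(board.pieces(piece, color))
               for piece, value in PIECE_VALUES.items())

def mobility(board: chess.Board, color: chess.Color) -> int:
    # Crude mobility: total number of squares attacked by that side's pieces.
    return sum(len(board.attacks(sq))
               for sq in chess.SquareSet(board.occupied_co[color]))

def evaluate_position(board: chess.Board) -> float:
    """Positive values favour White, negative favour Black."""
    return (material(board, chess.WHITE) - material(board, chess.BLACK)
            + 0.1 * (mobility(board, chess.WHITE) - mobility(board, chess.BLACK)))

print(evaluate_position(chess.Board()))  # symmetric starting position -> 0.0
```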




 
