
I have a chess GUI which I've recently been updating to do preemptive searches (that is, depth searches for every possible opponent move), and I'd like to speed those up: the bot searches go to at least depth 16 for every move, and each move runs a new Stockfish 11 command line (with minimum settings) in a separate thread.

Now, I have some solutions for the threading issue, but one would involve knowing more chess math than I know now.

I noticed that, after a certain search depth, the PV's first move (and then its first few moves) rarely changes, and I'd like to know if there's a formula for this, similar to the one I used to calculate winning odds.

I considered the following:

  • Do a depth search of n on the position after hypothetical opponent move m.
  • Retrieve the pv for depth n.
  • Append it to a corpus.
  • Do a depth search of n+1 on the same position.
  • Retrieve the pv for depth n+1.
  • If a text prediction algorithm trained on the corpus would produce n+1[pv][0:x], where x is large enough (needs more math) to be reliable, cease the depth searches and recommend n+1[pv][0].
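The bookkeeping in the steps above can be sketched as pure Python (engine calls omitted; this assumes you already collect one PV, as a move list, per depth from the UCI `info` lines; `min_prefix` is a placeholder for the still-to-be-derived x):

```python
def stable_prefix(pv_prev, pv_next):
    """Length of the shared prefix of two principal variations."""
    n = 0
    for a, b in zip(pv_prev, pv_next):
        if a != b:
            break
        n += 1
    return n

def search_until_stable(pvs_by_depth, min_prefix=3):
    """Walk successive depths' PVs; stop once the first min_prefix moves
    agree between depth n and depth n+1, and recommend that PV's first move.
    pvs_by_depth maps depth -> list of UCI move strings."""
    depths = sorted(pvs_by_depth)
    for prev_d, next_d in zip(depths, depths[1:]):
        if stable_prefix(pvs_by_depth[prev_d], pvs_by_depth[next_d]) >= min_prefix:
            return next_d, pvs_by_depth[next_d][0]
    # Never stabilized: fall back to the deepest completed search.
    deepest = depths[-1]
    return deepest, pvs_by_depth[deepest][0]
```

In a live loop you would feed this incrementally rather than collecting all depths first, but the stopping rule is the same.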

I'm thinking that there might be a simpler way of reaching a similar outcome, since pv[0] would have to arise every time in the text prediction.

What other Stockfish info might I want to consider in generating a statistical confidence that the recommendation need not search any deeper?

Rewan Demontay
  • This sounds difficult to tackle as it depends a lot on the position. Generally speaking I would guess that a stable evaluation may imply more stable future PVs. However that's not a guarantee, even a stable eval might just be overlooking some deep tactics. However I think it's more likely to keep the PV than an already varying eval. – koedem Nov 25 '20 at 17:55
  • Another thing to look at may be the node counts for different moves. I don't quite understand how you intend to use this / how your setup works, however, if you have some search and the top move being looked at takes most of the nodes, i.e. all non-optimal moves are dealt with quickly, then I'd think it's more likely that the top move won't change. (as all other moves are "refuted" very quickly) Whereas if the non-optimal moves take a lot of nodes they may be on the verge of becoming the new best move if that makes sense. – koedem Nov 25 '20 at 18:00
  • Sorry for the many comments but I have so many questions about this setup. So you run a fresh independent search for each possible move? You are aware that this is massively inefficient? (especially if you "only" care about the root position evaluation and move, but even otherwise) Not only does that deprive oneself of transposition table speedups between moves but also any search speedups through search windows. (for instance if you want to know if move x is best you don't need to know the evaluation of all other moves, you just need to know that they're worse than x; much faster to verify) – koedem Nov 25 '20 at 18:27
  • @koedem, so the naive preemptive mover that I built has the bot evaluate at depth for all opponent moves. For instance, if we're at startpos and the bot is black, the function I run threads 20 Stockfish 11 UCI's. Each UCI searches at depth (let's say 20), and then returns the info at depth 20. The program then kills the UCI's and the threads. All of the info is then stored so that, when the human does make a move, the response is faster. The inefficiency you mentioned obviates any gains from that approach. Hence my quest for shortcuts to either close the UCI's earlier or use just one. – Joshua Harwood Nov 25 '20 at 19:04
  • @koedem, yes, I tend to build brute-force, naive programs, and then work on optimizing them. – Joshua Harwood Nov 25 '20 at 19:08
  • So you basically want to have a quick response to any move that could be put in from the human? (basically you want the engine to think when it's not its turn and prepare a move for all possible human moves?) – koedem Nov 25 '20 at 19:12
  • That's the short of it, yes. And what I have now works, but it's terribly inefficient. Also, as you can imagine, it collects a bunch of data that just gets thrown away by the end of the turn. – Joshua Harwood Nov 25 '20 at 19:14
  • Have you considered simply running Stockfish in multi-PV mode? While I don't know how exactly that is implemented, it presumably is at least as efficient as independent searches and likely much more efficient due to the shared search information, like transposition tables, killer moves and what have you. – koedem Nov 25 '20 at 19:26
  • Yeah, the multi-PV mode is about hunting for alternative recommendations. So, for instance, if you run Stockfish 11 as white on startpos, its best move is d2d4, IIRC. However, if you want some variance in your games, you might consider multi-PV to get a different (though judged suboptimal) PV. The board state is set up before the depth search, so that can't be altered by setting a multi-PV option higher. – Joshua Harwood Nov 25 '20 at 19:38
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/116669/discussion-between-koedem-and-joshua-harwood). – koedem Nov 25 '20 at 19:42
  • It makes no sense to think of "corpus" type of approach. Look up quiescence search instead. – user21820 Nov 27 '20 at 16:02
  • I wasn't even able to test a corpus-centered approach because Stockfish doesn't give a full PV at every iteration. Lc0 does, but it's too slow. I read about quiescence searches before I began coding. What I'm doing is the opposite. I want to find out when there's not really sufficient reason to continue searching past a certain point. I landed on something, which I'll share in the answer for feedback. – Joshua Harwood Nov 29 '20 at 01:25
  • Googling "Selective Deepening" might be beneficial. – Hauke Reddmann Apr 30 '22 at 07:19

1 Answer


I did a study using Stockfish 14.1 on all positions (6 games) played by Hikaru in the 2022 FIDE Grand Prix Leg 1.

Procedure

  • Get all positions where Hikaru is to move.
  • Analyze those positions with Stockfish 14.1 at each base depth and at base depth + 1, and save the best moves.
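A minimal sketch of that procedure, driving Stockfish over plain UCI the same way the question does (the engine path is a placeholder for your binary; `match_percentage` is the statistic tabulated below):

```python
import subprocess

def stockfish_bestmove(engine_path, fen, depth):
    """One engine query: best move for `fen` at a fixed depth.
    `engine_path` is a placeholder for your Stockfish binary."""
    best = None
    with subprocess.Popen([engine_path], stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE, text=True) as proc:
        proc.stdin.write(f"position fen {fen}\ngo depth {depth}\n")
        proc.stdin.flush()
        for line in proc.stdout:
            if line.startswith("bestmove"):
                best = line.split()[1]
                break
        proc.stdin.write("quit\n")
        proc.stdin.flush()
    return best

def match_percentage(pairs):
    """Share of (base_depth, adv_depth) best-move pairs that agree."""
    same = sum(1 for base, adv in pairs if base == adv)
    return 100.0 * same / len(pairs)
```

For each position you would call `stockfish_bestmove` at `base_depth` and `base_depth + 1` and feed the pairs into `match_percentage`.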

Results

    base_depth  adv_depth  match(%)
             8          9     77.06
             9         10     80.09
            10         11     85.28
            11         12     83.55
            12         13     85.71
            13         14     84.42
            14         15     92.64
            15         16     85.28
            16         17     88.31
            17         18     89.61
            18         19     87.01

The match column is the percentage of positions where the bestmove at base_depth is the same as the bestmove at adv_depth (base_depth + 1).

The highest match is in base_depth 14 at 92.64%.

Plot
[plot: match(%) against base_depth, with fitted regression line]

Running a linear regression on that data using statsmodels yields the following formula, with a margin of error of +/-6% at the 95% confidence level and an R-squared of 0.641.

margin of error = rmse * 1.96

Formula

match_rate_pct = 1.1218 x base_depth + 70.1277  [+/- margin of error]

Example

If base_depth is 17, the approximate match_rate_pct will be:

match_rate_pct = 1.1218 x base_depth + 70.1277
match_rate_pct = 1.1218 x 17 + 70.1277

match_rate_pct = 89.2% +/- 6%

min match_rate_pct = 83.2%
max match_rate_pct = 95.2%
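The fit and the worked example can be checked against the summary table above with a plain least-squares fit (same slope and intercept as statsmodels OLS with a constant term). Note that fitting only the 11 table rows gives somewhat different coefficients than the posted ones, which presumably came from the underlying per-position data:

```python
# (base_depth, match%) rows from the table above.
depths = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
match = [77.06, 80.09, 85.28, 83.55, 85.71, 84.42,
         92.64, 85.28, 88.31, 89.61, 87.01]

n = len(depths)
mx = sum(depths) / n
my = sum(match) / n
sxx = sum((x - mx) ** 2 for x in depths)
sxy = sum((x - mx) * (y - my) for x, y in zip(depths, match))
slope = sxy / sxx
intercept = my - slope * mx

# Residual spread -> margin of error at ~95% confidence, as in the answer.
rmse = (sum((y - (slope * x + intercept)) ** 2
            for x, y in zip(depths, match)) / n) ** 0.5
margin = rmse * 1.96

predict_17 = slope * 17 + intercept  # lands close to the posted ~89.2%
```

On this table alone the fit comes out near `0.98 x base_depth + 72.7`, yet the prediction at depth 17 still agrees with the posted 89.2% to within a fraction of a percent.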
ferdy
  • How interesting. It would be interesting to continue to higher depths and see where the move match plateaus, if that stage can be reached, since it seems to be dropping at the highest tested depths but surely we expect it to ultimately rise again to nearly 100%? – Mobeus Zoom Apr 30 '22 at 15:53
  • I think it would make more sense to do a linear regression on log(1-p). – Acccumulation May 02 '22 at 04:06