17

This is the study

[fen "3B4/1r2p3/r2p1p2/bkp1P1p1/1p1P1PPp/p1P4P/PPBK4/8 w - - 0 1"]

As you can see, Stockfish gives an absolutely decisive win for Black, though the position is clearly a draw!

Check it out yourself

c4+ is a losing move, while Ba4+ is the correct one.

What's going on? Doesn't Stockfish take into consideration closed positions at all?

Rewan Demontay
William Kinaan
  • 7
    There are lots of positions that Stockfish doesn't evaluate correctly (including some common endgames). – Qudit Jul 08 '19 at 01:11
  • 11
    Happens all the time! It's a piece of software, not a God – David Jul 08 '19 at 08:43
  • 4
    It would make me quite happy to see a scenario in a similar vein that's assessed completely wrong by AlphaZero or Leela Zero! – leftaroundabout Jul 08 '19 at 10:17
  • Stockfish certainly sees the position after 1.Ba4+ Kxa4 2.b3+ Kb5 3.c4+ Kc6 4.d5+ Kd7 5.e6+ any 6.f5. It's only eleven ply. So the real question is: why doesn't Stockfish see that the position after 6.f5 is a draw? – TonyK Jul 08 '19 at 18:39
  • @TonyK That's because Stockfish doesn't understand that black can never break through. It just sees that it is down a ton of material. Stockfish evaluations are based on calculation. It doesn't understand that the pawn structure is locked and material doesn't matter. – Qudit Jul 08 '19 at 18:44
  • 1
    @Qudit: Yes. I was just pointing out that the question should focus on precisely this aspect, rather than starting from the position as given. Sorry if I didn't make that clear. – TonyK Jul 08 '19 at 19:06
  • Hmm my local installation of stockfish 10 is giving me e5f6 on same depth. wonder what is the source of the difference – OganM Jul 08 '19 at 20:34
  • 3
    @leftaroundabout Leela misevaluates fortresses all the time as well, see e.g. the end of this game from the TCEC Sufi: https://cd.tcecbeta.club/archive.html?season=15&div=sf&game=31. In fact arguably Leela misevaluates even more than Stockfish, since quite often it will have some nonzero eval while Stockfish stoically displays 0.00, and it takes ages before Leela realizes the opponent is not letting it win. – Allure Jul 09 '19 at 01:05
  • Rule of thumb: If Stockfish doesn't change the evaluation score *at all* for 10-20+ moves (in this case staying at exactly -18.1), it has most likely found a fortress. – Annatar Jul 09 '19 at 14:03
  • I find this laughable. You are getting all upset about a program rated 3450, 600+ points higher than the best human player the world has ever seen, because it cannot see a draw in a position that would never occur in real life. – Randy Minder Jul 11 '19 at 21:04
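
The rule of thumb from Annatar's comment above can be turned into a few lines of Python. This is only a sketch: `longest_plateau`, `looks_like_fortress`, the `min_run` threshold and the sample evaluation history are all made up for illustration and are not part of Stockfish or any engine API.

```python
def longest_plateau(evals):
    """Return the length of the longest run of identical consecutive evaluations."""
    best = 0
    run = 0
    previous = None
    for score in evals:
        # Extend the current run if the score did not change, else start a new run.
        run = run + 1 if score == previous else 1
        best = max(best, run)
        previous = score
    return best


def looks_like_fortress(evals, min_run=10):
    # min_run=10 is an assumed threshold, taken from the "10-20+ moves" in the comment.
    return longest_plateau(evals) >= min_run


# Hypothetical evaluation history: the score drifts, then sticks at exactly -18.1.
history = [-17.2, -17.9] + [-18.1] * 12
print(looks_like_fortress(history))
```

A plateau is only a hint. An engine can also sit on an unchanged score for other reasons, so a check like this would at best flag positions for a closer look.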

3 Answers

27

Stockfish isn't a perfect chess-playing entity, and you've found a position that it can't tell is a draw (at least until the 50-move rule kicks in and helps it prune). Such positions are called "fortresses". You can tell this is happening because even if you input the solution, Stockfish still evaluates the final position as -10 or worse. Fortress positions where Stockfish is dead wrong are few and far between, but they exist, and this is one of them.

There have been various attempts to write fortress-detection code for Stockfish and other conventional engines, meant to recognize fortresses and stop the engine from heading into them when its position is superior. If you have a smart idea, you can probably publish it in an academic journal (see the publications in the chessprogramming wiki).
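
To give a feel for what a very naive fortress check might look at, here is a sketch that tests whether every pawn on the board is blocked. It is not how Stockfish or any real engine does it. The function names are invented, en passant and pins are ignored, and the "final" FEN assumes Black answers 5.e6+ in TonyK's line with 5...Kxd8.

```python
def parse_board(fen):
    """Map (file, rank) -> piece letter, with file and rank both in 0..7."""
    board = {}
    rows = fen.split()[0].split("/")
    for row_index, row in enumerate(rows):
        rank = 7 - row_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # a digit is a run of empty squares
            else:
                board[(file, rank)] = ch
                file += 1
    return board


def pawns_locked(fen):
    """True if no pawn of either side can advance or capture (en passant ignored)."""
    board = parse_board(fen)
    for (file, rank), piece in board.items():
        if piece not in "Pp":
            continue
        step = 1 if piece == "P" else -1
        if (file, rank + step) not in board:
            return False  # the square in front is empty, so the pawn can advance
        for df in (-1, 1):
            target = board.get((file + df, rank + step))
            if target is not None and target.isupper() != piece.isupper():
                return False  # an enemy piece sits on a capture square
    return True


START = "3B4/1r2p3/r2p1p2/bkp1P1p1/1p1P1PPp/p1P4P/PPBK4/8 w - - 0 1"
# Assumed position after 1.Ba4+ Kxa4 2.b3+ Kb5 3.c4+ Kc6 4.d5+ Kd7 5.e6+ Kxd8 6.f5
FINAL = "3k4/1r2p3/r2pPp2/b1pP1Pp1/1pP3Pp/pP5P/P2K4/8 b - - 0 6"

print(pawns_locked(START), pawns_locked(FINAL))
```

Locked pawns alone do not prove a fortress, since the stronger side may still be able to sacrifice a piece to open lines. A real detector would also have to show that no such break exists, which is the hard part.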

Allure
  • https://www.chessprogramming.org/Crystal is a variant of Stockfish that tries to solve tricky positions like fortresses – qwr Sep 20 '22 at 07:36
12

It helps to understand that engines don't really go off of "strategy", so much as they look several moves into the future, evaluate the score of the position, and find the optimal move set.

The great weakness of that approach is that if nothing can happen quickly, the engine is going to have problems. This used to be a huge problem with endgames. If you've got K+B+P vs K+P, you're not resolving that position in just a few moves. So the fix was to add endgame tablebases to the engines: brute-force every endgame position ahead of time and ship the results as a library for the engine to use. (This is why, in most cheap chess apps, you can earn a win simply by surviving to the endgame: the app doesn't have an endgame tablebase.)
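
Tablebases only exist for positions with very few pieces, so a quick piece count tells you whether one can help. The sketch below assumes a limit of 7 pieces, kings included, which is what the largest commonly used tablebases cover. The names and the K+B+P vs K+P example FEN are made up for illustration.

```python
TABLEBASE_MAX_PIECES = 7  # assumed limit, kings included


def piece_count(fen):
    """Count all pieces, kings and pawns included, in the board field of a FEN."""
    return sum(ch.isalpha() for ch in fen.split()[0])


def in_tablebase_range(fen):
    return piece_count(fen) <= TABLEBASE_MAX_PIECES


# Hypothetical K+B+P vs K+P endgame, and the position from the question.
SMALL_ENDGAME = "8/8/4k3/4p3/8/3BP3/4K3/8 w - - 0 1"
QUESTION = "3B4/1r2p3/r2p1p2/bkp1P1p1/1p1P1PPp/p1P4P/PPBK4/8 w - - 0 1"

print(piece_count(SMALL_ENDGAME), in_tablebase_range(SMALL_ENDGAME))  # 5 True
print(piece_count(QUESTION), in_tablebase_range(QUESTION))            # 23 False
```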

But your position is (understandably) not going to be in any endgame tablebase. So Stockfish has to play out millions of positions, stretching 10 or so moves into the future, only to find that in all of them Black is substantially up in material. It definitely can't play out ~56 moves of the full decision tree, which is what it would take to establish a definitive draw.
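
The "substantially up in material" part is easy to reproduce with a bare material count. This sketch uses the classical 1/3/3/5/9 piece values, which are an assumption here; Stockfish's real evaluation is far more elaborate. The second FEN assumes the drawing line from the comments, ending with 5...Kxd8 6.f5.

```python
# Classical piece values in pawn units (an assumption, not Stockfish's actual weights).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_balance(fen):
    """White material minus Black material, in pawn units."""
    balance = 0
    for ch in fen.split()[0]:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            # Uppercase letters are White pieces, lowercase are Black.
            balance += value if ch.isupper() else -value
    return balance


QUESTION = "3B4/1r2p3/r2p1p2/bkp1P1p1/1p1P1PPp/p1P4P/PPBK4/8 w - - 0 1"
# Assumed position after 1.Ba4+ Kxa4 2.b3+ Kb5 3.c4+ Kc6 4.d5+ Kd7 5.e6+ Kxd8 6.f5
AFTER_LINE = "3k4/1r2p3/r2pPp2/b1pP1Pp1/1pP3Pp/pP5P/P2K4/8 b - - 0 6"

print(material_balance(QUESTION))    # -7
print(material_balance(AFTER_LINE))  # -13
```

White gives up both bishops to lock the position, so a material-driven score gets worse along exactly the line that draws.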

Kevin
0

The other answers cover this question well, but it is interesting that not every program fails to see that this is a draw. I plugged this into my computer, and as a ChessBase Premium member, I also have access to cloud engines.

One of those cloud engines evaluated this as a draw almost instantaneously. I only wish there were more information than simply "New Engine".


PhishMaster
  • 2
    The reason that one engine evaluated it as a draw is probably that it has that exact position in its cloud database, solved as a draw. This position is actually a chess composition that I remember seeing in some book. This is why it might appear in a database, although it's not a position that naturally occurred in a game, and the probability of that happening is pretty low. – neondrop Apr 12 '20 at 22:15