Skip to main content

Ai

Chessbench

·2 mins
Overview # I wrote a simple harness to run OpenRouter LLMs in known-winning endgame positions. There’s plenty of prior-art1,2 on LLMs playing chess, dating even back to a cool fine-tuned GPT-2 project3, but overall: they still can’t really do this. Surprising finding! Results # Aggregated Result Table I burned on the order of $200 to obtain these runs.