DeepSeek R1 Robotic Reasoning with Checkers

Community Article · Published March 5, 2025
Robotics Checkers Setup with DeepSeek R1

In this post, we explore the ability of DeepSeek R1, as well as other LLMs, to control a robotic arm to play checkers. We find that DeepSeek R1 performs better than comparable open-source LLMs but falls behind human and algorithmic players, underscoring the need for further advancements in integrating LLMs with robotics.

Integration with LLMs

To allow LLMs to make the next move, we need a way to encode the checkers game as text and then extract a valid move from the model's response. We therefore design a prompt that includes the rules, the state of the board, and a list of valid moves:

prompt = f"""You are playing as black (● and ◎) in a game of checkers.
You need to choose the best move from the list of valid moves provided.

Rules to consider:
1. Regular pieces (●) can only move diagonally forward (upward)
2. King pieces (◎) can move diagonally in any direction
3. Getting pieces to the opposite end to make kings is advantageous
4. Look ahead to ensure your piece will not get captured in the next turn

Current board state:

  0 1 2 3 4 5 6 7
0 - ○ - ○ - ○ - ○
1 ○ - ○ - ○ - ○ -
2 - ○ - ○ - ○ - ○
3 - - - - - - - -
4 - - - - - - - -
5 ● - ● - ● - ● -
6 - ● - ● - ● - ●
7 ● - ● - ● - ● -

Valid moves:
1. MOVE: (2, 1) → (3, 0)
2. MOVE: (2, 1) → (3, 2)
3. MOVE: (2, 3) → (3, 2)
4. MOVE: (2, 3) → (3, 4)
5. MOVE: (2, 5) → (3, 4)
6. MOVE: (2, 5) → (3, 6)
7. MOVE: (2, 7) → (3, 6)

Briefly analyze the board position and select the best move from the list
above. End your response with your chosen move on a new line starting
with "MOVE:".

Example response:
MOVE: 3
"""
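
The post does not show the parsing step in full, so here is a minimal sketch of how the "MOVE: n" line can be extracted from the model's response. The parse_move helper and the tuple encoding of moves are illustrative assumptions, not the exact implementation from the post.

import re

def parse_move(response, valid_moves):
    """Extract the 'MOVE: n' line from the LLM response and map the
    1-based index back onto the list of valid moves."""
    matches = re.findall(r"MOVE:\s*(\d+)", response)
    if not matches:
        return None  # no parsable move; caller can re-prompt or fall back
    index = int(matches[-1]) - 1  # moves are numbered from 1 in the prompt
    if 0 <= index < len(valid_moves):
        return valid_moves[index]
    return None  # out-of-range index; treat as a failed generation

# Example with the first three moves from the prompt above:
valid_moves = [((2, 1), (3, 0)), ((2, 1), (3, 2)), ((2, 3), (3, 2))]
print(parse_move("Move 3 avoids a capture.\nMOVE: 3", valid_moves))
# -> ((2, 3), (3, 2))

Taking the last match guards against reasoning models like DeepSeek R1 mentioning "MOVE:" in their analysis before committing to a final answer.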

Integration with Robot Arm

We use DeepSeek R1 to control a ViperX 300 S robotic arm. We extract the selected move from DeepSeek R1's response and use it to execute a pick-and-place.
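
The control code itself is not included in the post; the sketch below shows one way to map board squares to arm coordinates for the pick-and-place. BOARD_ORIGIN, SQUARE_SIZE, and the arm.pick/arm.place primitives are hypothetical placeholders for the real calibration values and motion stack.

# Hypothetical calibration values: the pose of square (0, 0) and the
# square spacing would be measured on the physical board.
BOARD_ORIGIN = (0.30, -0.14)  # (x, y) of square (0, 0) in meters
SQUARE_SIZE = 0.04            # center-to-center square spacing in meters

def square_to_xy(row, col):
    """Convert a (row, col) board square to (x, y) in the arm's frame."""
    return (BOARD_ORIGIN[0] + row * SQUARE_SIZE,
            BOARD_ORIGIN[1] + col * SQUARE_SIZE)

def execute_move(arm, move):
    """Pick up the piece on the source square and set it down on the
    target square; arm.pick/arm.place stand in for the real primitives."""
    (from_row, from_col), (to_row, to_col) = move
    arm.pick(*square_to_xy(from_row, from_col))
    arm.place(*square_to_xy(to_row, to_col))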

Regular Moves with the ViperX Robotic Arm

We also support jump moves, which remove any captured pieces from the board; the captured square can be derived from the move itself, as sketched below.
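
For a single jump, the captured piece sits midway between the source and target squares, so it can be located without any extra state. A minimal sketch (multi-jumps, if supported, would repeat this per hop):

def captured_square(move):
    """Return the square of the piece captured by a jump move, or None
    for a regular one-row move."""
    (from_row, from_col), (to_row, to_col) = move
    if abs(to_row - from_row) == 2:  # jumps always span two rows
        return ((from_row + to_row) // 2, (from_col + to_col) // 2)
    return None

print(captured_square(((2, 1), (4, 3))))  # (3, 2): piece to remove
print(captured_square(((2, 1), (3, 2))))  # None: regular move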

Jump Moves with the ViperX Robotic Arm

Results

We compare DeepSeek R1 against other LLMs, the well-established Minmax algorithm, and human players.
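
As a reference for the baseline, here is a depth-limited sketch of the Minmax idea. The game-state interface (valid_moves, apply, evaluate, is_terminal) is hypothetical, and the post does not specify the search depth or evaluation function actually used.

def minmax(state, depth, maximizing):
    """Depth-limited Minmax: recursively pick the move that maximizes
    (or minimizes) the evaluation, alternating sides each ply."""
    if depth == 0 or state.is_terminal():
        return state.evaluate(), None
    best_score = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in state.valid_moves():
        score, _ = minmax(state.apply(move), depth - 1, not maximizing)
        if (maximizing and score > best_score) or (not maximizing and score < best_score):
            best_score, best_move = score, move
    return best_score, best_move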

DeepSeek R1 vs. Other LLMs/Algorithms

To evaluate the performance of the different players, we run a round-robin tournament with 4 players: DeepSeek R1 (deepseek-r1-distill-qwen-32b), Llama 3 (llama-3.3-70b-instruct), and Qwen 2.5 (qwen2.5-32b-instruct) as the 3 LLM players, plus the Minmax algorithm. We match every player against every other player with repeated matches, for a total of 120 games (6 pairings × 20 games each). We report the win rate of each player below:

Player        Player Type   Win Rate
Qwen 2.5      LLM            26.6%
Llama 3       LLM            30.0%
DeepSeek R1   LLM            43.3%
Minmax        Algorithm     100.0%
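
Win rates of this form can be tallied with a simple loop over all pairings; a sketch follows. The play_game(black, white) function, assumed to return the winner's name (or None on a draw), is a hypothetical stand-in for the full game loop.

from collections import Counter
from itertools import combinations

PLAYERS = ["Qwen 2.5", "Llama 3", "DeepSeek R1", "Minmax"]
GAMES_PER_PAIRING = 20  # 6 pairings x 20 games = 120 games in total

def run_tournament(play_game):
    """Round-robin with repeated matches, swapping colors every other
    game; returns each player's win rate over the games it played."""
    wins, played = Counter(), Counter()
    for p1, p2 in combinations(PLAYERS, 2):
        for i in range(GAMES_PER_PAIRING):
            black, white = (p1, p2) if i % 2 == 0 else (p2, p1)
            winner = play_game(black, white)
            if winner is not None:  # None means a draw
                wins[winner] += 1
            played[p1] += 1
            played[p2] += 1
    return {p: wins[p] / played[p] for p in PLAYERS}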

DeepSeek R1 vs. Humans

We also play 3 games between DeepSeek R1 and a human; the table below shows the result of each game:

Player        Game 1  Game 2  Game 3
Human         Win     Win     Win
DeepSeek R1   Loss    Loss    Loss

We observe that the human and algorithmic players consistently beat DeepSeek R1 and the other LLMs, which we attribute to the fact that LLMs are not trained to play checkers. LLMs are trained for next-token prediction or, in the case of DeepSeek R1, for solving mathematical and software engineering problems. While checkers-related text is likely included in training datasets, we believe that full checkers games are scarce, leading to poor performance in actual gameplay. We hypothesize that training LLMs on checkers using supervised fine-tuning or reinforcement learning could significantly enhance their performance.

We play a full game versus DeepSeek R1

Conclusion

This post examined how DeepSeek R1 and other LLMs can be integrated with a robotic arm to play checkers. While DeepSeek R1 outperforms comparable open-source models, it still lags behind human and algorithmic players. We believe that training LLMs specifically on the game of checkers could greatly enhance their performance and suggest this as a promising direction for future research.
