The problem
Hokm is a four-player, trick-taking card game with a trump suit and, crucially, hidden hands: you never see what your opponents and partner hold. That makes it a small but honest test bed for the hard part of game AI: deciding well under partial observability, where the best move depends on your beliefs about unseen state, not just on the cards you can see.
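To make the trump mechanic concrete, here is a minimal sketch of the two rules any Hokm environment has to enforce: players must follow the led suit when they can, and the highest trump wins the trick, otherwise the highest card of the led suit does. The card representation (`(suit, rank)` tuples, ace as 14) and the function names `legal_moves` and `trick_winner` are assumptions for illustration, not the project's actual code.

```python
# Illustrative sketch only: cards are (suit, rank) tuples, rank 2..14 (ace high).

def legal_moves(hand, led_suit):
    """Return the cards a player may play: follow suit if possible."""
    if led_suit is None:  # this player leads the trick
        return list(hand)
    same_suit = [card for card in hand if card[0] == led_suit]
    return same_suit if same_suit else list(hand)


def trick_winner(plays, trump):
    """plays: list of (player, (suit, rank)) in play order. Returns the winning player."""
    led_suit = plays[0][1][0]

    def strength(play):
        suit, rank = play[1]
        if suit == trump:
            return (2, rank)  # any trump beats any non-trump
        if suit == led_suit:
            return (1, rank)  # otherwise the led suit competes by rank
        return (0, rank)      # off-suit discards can never win

    return max(plays, key=strength)[0]


plays = [
    (0, ("hearts", 14)),
    (1, ("hearts", 3)),
    (2, ("spades", 2)),
    (3, ("clubs", 14)),
]
winner_with_spades_trump = trick_winner(plays, "spades")
winner_with_diamonds_trump = trick_winner(plays, "diamonds")
```

Note that even this small rule set leaks information: a player who fails to follow suit has revealed that they hold none of it, which is exactly the kind of inference a belief about hidden hands has to track.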
What I built
A reinforcement-learning agent, in PyTorch, that learns to play Hokm:
- An encoding of the observable game state: your hand, the trump suit, the cards played so far, and whose turn it is.
- A learned policy trained through self-play, so the agent improves against versions of itself rather than a fixed heuristic opponent.
- Dynamic game-state logging, so a full match can be replayed and inspected move by move.
- An interactive interface to play against the trained agent directly.
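As a rough illustration of the first bullet, the observable state could be flattened into a fixed-length vector: one 52-slot multi-hot block for the hand, another for the cards already played, and one-hot blocks for the trump suit and the player to move. The layout, the 112-element size, and the names `card_index` and `encode_observation` are assumptions made for this sketch; the project's real encoding may differ.

```python
# Illustrative sketch only: a flat observation vector in plain Python.
# Cards are (suit, rank) tuples with rank 2..14 (ace high).

SUITS = ["clubs", "diamonds", "hearts", "spades"]
NUM_CARDS = 52
NUM_PLAYERS = 4


def card_index(suit, rank):
    """Map a card to a slot in 0..51, grouped by suit."""
    return SUITS.index(suit) * 13 + (rank - 2)


def encode_observation(hand, trump, played, turn):
    """Concatenate hand, played cards, trump suit, and turn into one vector."""
    hand_vec = [0.0] * NUM_CARDS
    for suit, rank in hand:
        hand_vec[card_index(suit, rank)] = 1.0

    played_vec = [0.0] * NUM_CARDS
    for suit, rank in played:
        played_vec[card_index(suit, rank)] = 1.0

    trump_vec = [0.0] * len(SUITS)
    trump_vec[SUITS.index(trump)] = 1.0

    turn_vec = [0.0] * NUM_PLAYERS
    turn_vec[turn] = 1.0

    # Layout: [0:52] hand, [52:104] played, [104:108] trump, [108:112] turn
    return hand_vec + played_vec + trump_vec + turn_vec


obs = encode_observation(
    hand=[("hearts", 14)],
    trump="spades",
    played=[("clubs", 2)],
    turn=1,
)
```

A vector like this can be handed to a PyTorch policy network with `torch.tensor(obs)`. Only information the player can legitimately see goes in; the other three hands never appear, which is what makes the problem partially observable.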
Outcome
A playable agent and an end-to-end loop (environment, training, logging, and a human-facing UI) small enough to reason about completely. That was the point: a clean place to build intuition for RL under imperfect information.
Tech
Python, PyTorch.