The current generation of frontier LLMs can't make puzzles that get much more interesting than hot-vs-big-vs-fast. New inferences keep circling the same small pool of concepts unless the prompting gives the LLM a way into new territories of language. Puzzlemaking needs graph traversals.
I made these 20 levels algorithmically, drawing on a huge semantic graph with over 100M edges, built from manual lexicography and millions of LLM inferences (various models). I keep exploring what can emerge from this graph. The puzzles are randomly selected; reload the page to see others.
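A minimal sketch of the traversal idea: hop along edges of the semantic graph so a puzzle drifts away from its starting concept instead of circling it. The toy graph and the `random_walk` helper below are hypothetical illustrations, not the actual generator or graph.

```python
import random

# Hypothetical toy stand-in for the semantic graph described above
# (the real one has 100M+ edges). Keys are concepts, values are neighbors.
GRAPH = {
    "hot": ["warm", "spicy", "popular"],
    "spicy": ["hot", "pepper"],
    "popular": ["famous", "hot"],
    "warm": ["hot", "cozy"],
    "pepper": ["spicy"],
    "famous": ["popular"],
    "cozy": ["warm"],
}

def random_walk(graph, start, steps, seed=None):
    """Drift away from `start` by hopping to random neighbors."""
    rng = random.Random(seed)
    node, path = start, [start]
    for _ in range(steps):
        neighbors = graph.get(node)
        if not neighbors:
            break
        node = rng.choice(neighbors)
        path.append(node)
    return path

print(random_walk(GRAPH, "hot", 4, seed=1))
```

A real generator would also filter the walk (by edge type, concept frequency, distance from the start) so the visited concepts make a solvable puzzle rather than a random word list.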
The front-end was built with Claude Code.
Maybe someday I'll make this into a mobile game and increase the complexity and peril. If you are a gamedev, feel free to dissect it and borrow any parts.