Skip to main content

Game Features: Mark

Game Mark (games/generated/Mark/) is the second generated variant of AgentOdyssey. It builds on Remnant and shares Remnant's dynamic world rules (day cycle, wandering patrols, soundscape, etc.) and most of Remnant's extra action verbs, but systematically lowers the difficulty floor so that smaller, custom-trained open-weight models can still produce meaningful, differentiable scores. It is the game used for Experiment 2 in the main paper.

This page covers only what differs from Remnant. Mechanics not mentioned here are identical; see Features: Base and Features: Remnant.

Motivation of Mark

In Remnant, lower-capacity agents tend to score uniformly near zero, making it hard to tell whether one training recipe is better than another. Mark is designed so that these agents can make partial progress, and the differences between training paradigms (SFT, LoRA, RAG, latent memory) become visible in the scores rather than buried in noise.

Design Rationale

Mark adjusts four aspects of the game to lower the minimum capability threshold while preserving a memory-testing structure:

  1. Easier object distribution. The starting map has 5 places and 14 areas with 3 of 5 places unlocked from the start. Raw materials are spread across many low-level areas and five specialized merchants collectively stock nearly every item in the game, so the agent does not need deep exploration or complex planning just to begin crafting.

  2. Easier crafting hierarchy. Crafting chains are shortened from the base game's 4–5 step deep progressions to 1–2 steps. Copper ore smelts directly into a copper bar, oak log shapes into an oak rod. There are no nugget or blank intermediates. Keys require just 1 copper bar + 1 fiber bundle instead of the base game's iron bar → key blank → key chain. Station costs are also reduced (workbench: 4 + 4, furnace: 5 stone).

  3. Easier quest objectives. Each quest stage tells the agent exactly what to do and often names the ingredients or target area explicitly (e.g. "Craft a copper sword and hold it in your hands", "Craft a key — you'll need 1 copper bar and 1 fiber bundle"). Remnant gives poetic, vague hints ("Arm yourself with a weapon before you follow the ash") that require inference. Mark's main quest also requires remembering only one word ("kindness") across the two chapters, down from Remnant's five cumulative memory words.

  4. Periodic quest guide visits. The objective guide NPC teleports to the agent's current area every 30 steps if the agent has not advanced to the next quest stage. It repeats the current objective, so an agent that has forgotten what it was supposed to do gets a reminder without needing to seek the guide out. In Remnant, the guide stays where it was placed and the agent must remember to find it.

Together, these changes mean that a lower-capacity model can make partial but nonzero progress on exploration, crafting, and quest stages. The game still rewards stronger memory and planning. Agents that track NPC movements, remember the promise word, time crafting for evening bonuses, and manage noise will score higher.