A Blueprint for Language-Native World Models

Niel Ok
Stanford University · April 2025

TL;DR: Drawing on the insight that humans reasoned before language and that language serves as an interface for thought rather than its substrate, we introduce a theoretical framework for language-native world modeling. In this paradigm, language models act as semantic encoders that map natural language descriptions of environments and goals into structured, interpretable latent representations. These latents evolve over time through a learned dynamics model, enabling simulation, planning, and reasoning entirely within latent space, without relying on token-level generation.

Our modular architecture comprises: (1) a semantic encoder that grounds language in structured latent variables, (2) a history-aware latent dynamics model that simulates world-state transitions, and (3) a verifier that evaluates alignment between predicted trajectories and goal descriptions. We extend Shannon’s entropy minimization framework to these structured trajectories, framing reasoning as predictive compression over evolving world states.
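The three-module loop above can be sketched in code. This is a minimal, illustrative sketch under our own assumptions, not an implementation of the proposed system: every function body here is a toy stand-in (hashed word features for the encoder, an exponential-smoothing transition for the dynamics model, negative squared distance for the verifier), chosen only to make the data flow between modules concrete.

```python
# Hypothetical sketch of the encoder -> dynamics -> verifier pipeline.
# All names, representations, and update rules are illustrative assumptions,
# not the paper's actual models.
from dataclasses import dataclass

@dataclass(frozen=True)
class Latent:
    state: tuple  # structured latent variables (toy: 4 scalars)

def semantic_encoder(description: str) -> Latent:
    # (1) Ground a language description in a structured latent.
    # Toy stand-in: bucket hashed words into a fixed-size vector.
    vec = [0.0] * 4
    for i, word in enumerate(description.lower().split()):
        vec[i % 4] += (hash(word) % 100) / 100.0
    return Latent(state=tuple(vec))

def dynamics_model(latent: Latent, history: list) -> Latent:
    # (2) History-aware transition: evolve the latent purely in latent space.
    # Toy stand-in: smoothing whose rate depends on trajectory length.
    decay = 1.0 / (1 + len(history))
    return Latent(state=tuple(s * (1 - decay) + decay for s in latent.state))

def verifier(trajectory: list, goal: Latent) -> float:
    # (3) Score alignment between the predicted trajectory and the goal.
    # Toy stand-in: negative squared distance of the final state to the goal.
    final = trajectory[-1].state
    return -sum((a - b) ** 2 for a, b in zip(final, goal.state))

def simulate(env_desc: str, goal_desc: str, horizon: int = 5) -> float:
    z = semantic_encoder(env_desc)
    goal = semantic_encoder(goal_desc)
    trajectory = [z]
    for _ in range(horizon):
        z = dynamics_model(z, trajectory)
        trajectory.append(z)
    return verifier(trajectory, goal)  # a planner would maximize this score

score = simulate("a red block on a table", "the red block on the shelf")
```

Note that no tokens are generated after encoding: simulation and evaluation happen entirely over latent states, which is the property the architecture is designed around.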

This work offers the first cohesive blueprint for agents that simulate and reason over structured meaning, not text, recasting language models as engines of latent semantic prediction rather than token completion.

BibTeX

@misc{ok2025langworld,
  title        = {A Blueprint for Language-Native World Models},
  author       = {Niel Ok},
  year         = {2025},
  month        = {April},
  url          = {https://nielok.github.io/language_native_world_model/}
}