April 9, 2025
intelligence, scaling thinking, reasoning
Reasoning vs Thinking
Looking at all of the recent work attempting to scale reasoning for language models with variations of Chain-of-Thought (DeepSeek-R1, OpenAI o1 and o3), along with some reflection on how human learning and thinking work, I have some insights. It seems to me that reasoning and thinking for language models should be treated as two different things. Reasoning, to me, is a cognitive action that uses facts and step-by-step logic to get from one point to another. Basically, CoT. When we talk about reasoning for language models, we think about how to bake reasoning into the models with human text that demonstrates it. Humans can learn how to reason, and so can models.

Thinking, meanwhile, is more about taking an existing model, however good it is at reasoning, and letting it search a larger portion of the space of possible answers before narrowing down to the best one. When we say someone is thinking hard, we mean that person is spending a lot of internal compute on a problem, not that they are reasoning particularly well. DeepMind's inference-time scaling framework for deterministic diffusion models is an example of a framework for thinking. Thinking can be more quantitative, without the abstraction of human language layered on top of the numbers the model is actually running on.

I think reasoning sacrifices mathematical interpretability in favor of qualitative interpretability, while thinking sacrifices qualitative interpretability in favor of mathematical interpretability. So reasoning looks more ethical, while thinking looks more promising for building superintelligent, flexible models. I believe we can eventually build tools for interpreting thinking, so I am more interested in thinking than in reasoning. I work on structured metacognition frameworks and denoising trajectory search for text, image, and robotics policy diffusion models, because that's thinking. These projects are attempts to make thinking controllable.

Thinking speaks more in the language of the model than of the human, and it doesn't restrict the model to our mode of communication. I think reasoning may force the model to commit to certain trajectories too early, while with thinking you can basically control how long the model explores and when it should commit. The framework of thinking just gives you more control over the model, in my opinion. I'm working on building more of a theoretical framework around this in my free time, which I'm excited about.
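To make the explore/commit knob concrete, here is a minimal sketch of "thinking" as inference-time search over denoising trajectories. It is not any real system or API; `denoise_step`, `verifier_score`, and `trajectory_search` are hypothetical placeholders standing in for a trained denoiser and an external scorer. The point is only that `branch` sets how widely the model explores at each step and `commit_at` sets when it collapses to a single trajectory.

```python
"""Sketch: thinking as search over denoising trajectories.
All names here are hypothetical placeholders, not a real library's API."""

import numpy as np

rng = np.random.default_rng(0)


def denoise_step(x: np.ndarray, t: int) -> np.ndarray:
    """Stand-in for one reverse-diffusion step of a trained model."""
    return x * 0.9 + rng.normal(scale=0.1, size=x.shape)


def verifier_score(x: np.ndarray) -> float:
    """Stand-in for an external scorer (reward model, verifier, etc.)."""
    return -float(np.linalg.norm(x))  # toy objective: prefer samples near the origin


def trajectory_search(x0: np.ndarray, steps: int, branch: int, commit_at: int) -> np.ndarray:
    """Beam-style search over denoising trajectories.

    `branch` controls how widely the model explores at each step;
    `commit_at` controls when it stops exploring and commits to the
    single best partial trajectory.
    """
    candidates = [x0]
    for t in range(steps):
        # Explore: branch each surviving candidate into several continuations.
        expanded = [denoise_step(c, t) for c in candidates for _ in range(branch)]
        # Score continuations and keep the best ones.
        expanded.sort(key=verifier_score, reverse=True)
        keep = 1 if t >= commit_at else branch  # commit after `commit_at` steps
        candidates = expanded[:keep]
    return candidates[0]


if __name__ == "__main__":
    x_init = rng.normal(size=8)  # noisy starting point
    lazy = trajectory_search(x_init, steps=10, branch=1, commit_at=0)   # no search
    hard = trajectory_search(x_init, steps=10, branch=4, commit_at=8)   # "thinking hard"
    print(verifier_score(lazy), verifier_score(hard))
```

The same scaffold, with the placeholders swapped for a real denoiser and scorer, is one way to read inference-time scaling for diffusion models: more branches and a later commit point mean more internal compute spent on the problem, with no change to the underlying model.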