Continuous Thought Machines Could be the Next Step in AI

Summary

  • The Continuous Thought Machine (CTM) model incorporates time into neural networks for more human-like problem-solving.
  • CTMs bridge the gap between AI learning models like LLMs and human-like adaptability, but require more resources.
  • The CTM model faces challenges such as longer training times, slower inference, and lower accuracy compared to current models.

Do AI models “think”? This is an important question, because to someone using something like ChatGPT or Claude, it sure looks like the bot is thinking. We even see little “thinking” indicators pop up, and these days you can read a bot’s “chain of thought” to see how it reasoned its way to a conclusion.

The truth, however, is that while LLMs and other AI models mimic certain aspects of thinking, it’s still not quite the same as a natural brain doing the work. However, new research into Continuous Thought Machines (CTMs) might change that.

What Is a CTM?

A Continuous Thought Machine (CTM) is a new kind of neural network that literally incorporates time into its thinking. Instead of the usual one-shot calculation, each CTM neuron keeps track of its past activity and uses that history to decide what to do next. In May 2025, Sakana AI detailed the CTM model in a research paper and a blog post.


Sakana claims this is a new type of artificial neural network that more closely mimics how natural brains work. Neurons in a CTM don’t just fire once and are done; they have a short “memory” and can sync their firing patterns with other neurons. The network’s internal state is defined by these patterns of synchrony over time.
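To make that idea concrete, here is a toy sketch of a “neuron with memory.” This is my own simplified illustration, not Sakana AI’s actual architecture: each neuron applies its own small set of weights to a rolling window of its recent inputs instead of firing once per pass, and “synchrony” between two neurons is measured as the correlation of their output traces over time.

```python
import numpy as np

rng = np.random.default_rng(0)

HISTORY = 5  # how many past pre-activations each neuron remembers


class ToyCTMNeuron:
    """Toy neuron that decides its output from a short history of inputs.

    A simplified illustration of the CTM idea, not Sakana AI's real model:
    the output depends on the neuron's recent history, not just the latest
    input.
    """

    def __init__(self):
        self.history = np.zeros(HISTORY)       # rolling pre-activation memory
        self.w = rng.normal(size=HISTORY)      # per-neuron "history" weights

    def step(self, pre_activation: float) -> float:
        # Shift the new input into the rolling history window.
        self.history = np.roll(self.history, -1)
        self.history[-1] = pre_activation
        # Output is a function of the whole recent history.
        return float(np.tanh(self.w @ self.history))


# Synchrony between two neurons: correlation of their outputs over time.
a, b = ToyCTMNeuron(), ToyCTMNeuron()
trace_a, trace_b = [], []
for t in range(50):
    x = np.sin(t / 5)                          # shared input signal
    trace_a.append(a.step(x))
    trace_b.append(b.step(x))

synchrony = np.corrcoef(trace_a, trace_b)[0, 1]
print(f"synchrony: {synchrony:.2f}")           # a value between -1 and 1
```

In a real CTM, patterns of synchrony like this across many neurons make up the network’s internal state; the toy version just shows why history gives you something to synchronize in the first place.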

That sounds a lot like the synchronicity in biological brains that leads to brain waves. This makes CTMs very different from standard deep nets or transformers. For example, a typical transformer-based model processes a piece of text (or an image) in a fixed number of layers, all at once. Basically, it thinks in a short defined burst, and then goes brain-dead as it waits for your next prompt.


The implications here are profound, assuming I even understand what any of this means (it’s very possible I don’t). This quote from the post, about solving mazes and gazing at photos, really struck me:

Remarkably, despite not being explicitly designed to do so, the solution it learns on mazes is very interpretable and human-like where we can see it tracing out the path through the maze as it ‘thinks’ about the solution. For real images, there is no explicit incentive to look around, but it does so in an intuitive way.

Further, this sums it up pretty well:

We call the resulting AI model the Continuous Thought Machine (CTM), a model capable of using this new time dimension, rich neuron dynamics and synchronization information to ‘think’ about a task and plan before giving its answers. We use the term ‘Continuous’ in the name because the CTM operates entirely in an internal ‘thinking dimension’ when reasoning. It is asynchronous regarding the data it consumes: it can reason about static data (e.g., images) or sequential data in an identical fashion. We tested this new model on a wide range of tasks and found that it is able to solve diverse problems and often in a very interpretable manner.

I don’t want to overhype this idea until independent parties benchmark it, but some of what I’m reading here edges ever so slightly into the realm of rudimentary consciousness.


Why Is This Better Than Current Neural Networks?

The whole concept of the CTM basically gets rid of “one-shotting” problems, which is often treated as the gold standard for AI models: the model needs to get the right answer most of the time within the fixed window it has to push the problem through its transformer network (the type of neural network that powers ChatGPT, for example).


That’s one of the reasons current LLMs don’t really have a good way to correct the relatively small percentage of times they get things wrong. There have been improvements with chain-of-thought, self-prompting, and bouncing something between two models till it’s better, but it seems like the CTM approach could bridge an important gap in accuracy and reliability.

Based on the promise Sakana outlines in its papers, this could mean combining the strengths of models like LLMs with the adaptability and growth of biological brains. I also think this has implications for robotics, and helping embodied machines learn, grow, and exist in the physical world more like we do.


The Downsides of Overthinking

Continuous thinking is powerful, but it comes with trade-offs. First, CTMs are more complex and resource-hungry than plain feedforward networks. Allowing each neuron to carry its history expands the network’s internal state massively. In practice, training a CTM can demand much more compute power and memory. They can take longer to train and may need more data or iterations to converge. Inference can also be slower if the model chooses many thinking steps for a difficult input.

Also, all the current tools and libraries are based around static models, not CTMs, for obvious reasons. So if there’s really something to this, it will take some time for the tools to catch up.

The other big problem is pretty obvious: when should it stop thinking? There’s a risk of “runaway” thinking where the CTM just goes around in circles, so there have to be some pretty sophisticated rules to help it know when it’s done. Without them, you can get error amplification and the same sort of hallucinations we already have, as the model strays away from the ground truth it started out with.
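One common way to frame a stopping rule (a hypothetical sketch, not the CTM’s actual mechanism) is to keep taking internal “thinking” ticks until the model’s confidence in an answer crosses a threshold, with a hard cap on steps so it can’t loop forever:

```python
import numpy as np

rng = np.random.default_rng(1)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def think_until_confident(evidence_step, max_steps=20, threshold=0.9):
    """Run internal 'thinking' ticks until the prediction is confident.

    A hypothetical halting rule for illustration: stop when the most
    likely answer's probability crosses `threshold`, with a hard cap on
    steps to prevent runaway thinking.
    """
    logits = np.zeros(4)                   # running evidence for 4 answers
    for step in range(1, max_steps + 1):
        logits += evidence_step(step)      # one tick of internal reasoning
        probs = softmax(logits)
        if probs.max() >= threshold:       # confident enough: stop early
            return int(probs.argmax()), step
    return int(softmax(logits).argmax()), max_steps  # hard cap reached


# Toy "reasoner": each tick adds noisy evidence favoring answer 2.
def noisy_evidence(step):
    e = rng.normal(scale=0.5, size=4)
    e[2] += 0.8
    return e


answer, steps_used = think_until_confident(noisy_evidence)
print(f"answer {answer} after {steps_used} thinking steps")
```

Easy inputs cross the threshold in a few ticks; harder or noisier ones burn more steps, which is exactly the slower-inference trade-off described above.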

The last main issue, and it’s a biggie, is that this early CTM model is still pretty far off from matching the best current transformer models on accuracy. According to a report by VentureBeat, the CTM falls well short on accuracy benchmarks right now.


Our Minds Remain a Mystery

While what I’ve seen of CTMs so far, based on Sakana AI’s papers, looks a lot like what we see in human brains, the truth is that we still don’t actually know all that much about how our own minds work. It might be that AI researchers have stumbled on a solution similar to what natural selection produced for us and other animal species, or it might be that we’ve created something on a parallel track that will eventually be similarly capable.


I’ve felt for a while now that current models are only one part of the puzzle of more generalized AI; an LLM is more like the language center of a brain than the whole thing. At first glance, CTMs look like another piece of that puzzle, so I’ll be following Sakana’s progress with great interest.

