The Shape of Things to Come?
"Now, here, you see, it takes all the running you can do, to keep in the same place" - Lewis Carroll, Through the Looking-Glass
I have a soft spot for solving puzzles. Not sudoku, crosswords or jigsaws, but problems that feel like they refuse to sit still.
One of the biggest is: What is the shape of things to come?
I don’t mean “what gadgets will we buy” or “which model will top the benchmarks”. I mean the thing underneath the gadgetry - the way the landscape itself is changing. The kind of shift you don’t really notice day to day, until you look back and realise that the whole coastline has moved.
Over the last few years, a lot of my work has been an attempt to get serious about mapping that coastline. By building a way to see the change as it is happening - to track shifts in cognitive architecture at the human–tool boundary.
And the best tool I’ve found for doing that isn’t a new dataset, algorithm or clever app.
It’s geometry.
This post reflects on how this geometry lens shaped my writing in 2025, and the contour I now see emerging in 2026. And it ends with a question for you.
An old habit
I’ve always had a bias toward visual and spatial thinking. Not in the shallow “I like diagrams” sense (though I do), but in the deeper sense that I tend to understand systems by locating them in a space.
Where are the boundaries? What’s close to what? What’s stable under pressure? What changes smoothly, and what flips suddenly? What does it mean to move from one region to another? What’s an attractor, what’s a ridge, what’s a valley, what’s a cliff?
That way of thinking has been useful for a long time. It shaped the way I approached computer vision and AR for more than a decade - you can’t do much in those worlds without developing an instinct for how representations sit in space, how perception warps under constraints, and how “the same” scene can look very different depending on your viewpoint.
Then, in the last two years, it drove almost all of my work on AI.
Not as a metaphor I sprinkle on top. But as an organising principle. As a way of insisting that if we want to talk about minds (or mind-like behaviour) we have to be able to talk about structure. Not just what comes out of a system, but what happens inside it. Not just inputs and outputs, but the path in between.
Because if you only look at inputs and outputs, or thin circuits, then you miss the geometry of the act.
That line is probably provocative. But this isn’t a dismissal of interpretability work that focuses on circuits - it’s a reminder that circuits are often the thinnest slice of a much richer phenomenon. A circuit can be real and still be the wrong level of description for the question you’re asking.
And the questions I’ve been asking keep dragging me toward a mid-layer view - where inference looks less like a straight line and more like manifolds draped over trajectories.
2025 - A year of building maps
I wrote quite a bit in 2025.
Not because I was trying to “produce content”, but because I was trying to capture my thoughts as I went. Things are moving fast enough that if you don’t write down what you mean, you’ll end up arguing with ghosts - or worse, with your own earlier confusions.
Looking back, my work clusters into a few streams that feed each other.
One stream was the big framing - the attempt to articulate a functionalist approach to consciousness and mind-talk that isn’t merely philosophical posturing, but a practical research stance. Another stream was the more mechanistic work - the attempt to describe transformer inference in a way that treats it as a real process with internal geometry. And threaded through all of it was a recurring triangle - a way of keeping my attention anchored to the relationships that seem to matter most.
Let me walk you through it and the way it felt from the inside.
FRESH, and the permission to measure
Early in the year I published FRESH: The Geometry of Mind, which was my attempt to state the research program plainly - if we want to talk about consciousness, subjectivity, or anything in that family, we need to stop treating “what-it’s-like” as a metaphysical exemption.
Functionalism, at its best, is not a claim that inner life is “nothing but behaviour”. It’s a refusal to treat inner life as magical and untouchable.
It says - if something matters, then it should have structure. If it has structure, then it should be possible to characterise that structure in terms of organisation, dynamics, and constraints. And if we can characterise it, we can build empirical handles - not perfect handles, not final answers, but useful handles.
That stance is sometimes misunderstood as cold or reductive. For me it’s the opposite. It’s the move that makes care possible.
If you don’t have a workable way to talk about mind-like organisation, then you can’t see what’s being created, what’s being damaged, or what’s being quietly outsourced.
The companion piece (The Evidence for Functionalism) tried to make the case that functionalism isn’t a leap of faith. It’s the only stance that takes both science and experience seriously without requiring a magic door labelled “and then consciousness happens”.
Once you accept that permission (the permission to measure) geometry becomes more than a stylistic choice. It becomes a pragmatic method.
RISE - finding loops in a system that ‘has no loops’
Around the same time I wrote a three-part series that I still think of as one of the year’s most accessible on-ramps.
It started with Recurrence Without Memory.
Transformers are often described as feed-forward, stateless machines. You give them context, they compute, they give you the next token. No recurrence, no looping.
And yet, anyone who has spent time watching them think (or watching them fail) likely develops a suspicion - there is a kind of looping behaviour in there anyway. Not literal recurrence with stored hidden state, but a kind of internal re-use, a repeated returning to the same kinds of intermediate configurations.
My claim wasn’t that transformers are secretly RNNs. The claim was that we should pay attention to the way repeated transformations through depth can implement something recurrence-like - not in time, but in representation space.
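To make that concrete, here is a minimal sketch - my own illustration, not code from the original post - of one way to look for recurrence-like structure: take the residual-stream states of a single token across layers and check whether later layers return close to configurations visited earlier. The `layer_states` array below is a synthetic placeholder; with a real model you would substitute extracted activations (for example via `output_hidden_states=True` in a Hugging Face model).

```python
import numpy as np

def layer_similarity_matrix(layer_states: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of layer representations."""
    norms = np.linalg.norm(layer_states, axis=1, keepdims=True)
    unit = layer_states / np.clip(norms, 1e-8, None)
    return unit @ unit.T  # shape [num_layers, num_layers]

# Toy stand-in for real activations: random states, with layer 20 deliberately
# constructed as a near-return to the configuration at layer 5.
rng = np.random.default_rng(0)
layer_states = rng.normal(size=(24, 512))
layer_states[20] = layer_states[5] + 0.1 * rng.normal(size=512)

sim = layer_similarity_matrix(layer_states)

# Off-diagonal peaks - a late layer sitting unusually close to an early one -
# are the geometric signature of "returning to the same kind of configuration".
i, j = np.unravel_index(np.argmax(sim - np.eye(len(sim))), sim.shape)
print(f"most similar distinct layers: {i} and {j} (cos = {sim[i, j]:.2f})")
```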
That led naturally into Inference as Interference.
Most popular accounts of LLM inference still assume something like linearity - information flows forward, features are added up, the output pops out. But when you actually look at how representations combine (how different constraints and cues collide) it starts to feel less like addition and more like interference patterns. The system isn’t simply accumulating evidence - it’s sculpting a wavefront.
Then came Semantic Evolution.
Next-token prediction can sound like a trivial framing until you dive into what decoding really does. Generation is not just “predict a token” - it’s a repeated selection under pressure, where small biases compound into trajectories. The system and the sampling strategy form a kind of environment in which candidate continuations compete.
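As a toy illustration of "small biases compound into trajectories" - my own analogy, not the mechanism described in Semantic Evolution - here is a Pólya-urn style simulation: each selected option slightly reinforces itself, standing in for the way a sampled token re-enters the context and shapes every later choice. Even without any bias, early random choices can lock in; a tiny initial nudge shifts where trajectories tend to settle.

```python
import numpy as np

def run_trajectory(weights, steps, reinforcement, rng):
    """Repeated selection where each chosen option reinforces itself."""
    w = np.asarray(weights, dtype=float)
    choices = []
    for _ in range(steps):
        k = rng.choice(len(w), p=w / w.sum())
        w[k] += reinforcement  # the chosen option becomes slightly more likely
        choices.append(k)
    return np.array(choices)

rng = np.random.default_rng(2)
even = run_trajectory([1.0, 1.0], 500, 0.5, rng)     # no initial bias
nudged = run_trajectory([1.05, 1.0], 500, 0.5, rng)  # a 5% early nudge

print("even   final shares:", np.bincount(even, minlength=2) / 500)
print("nudged final shares:", np.bincount(nudged, minlength=2) / 500)
```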
I called this arc RISE (Recurrence, Interference & Semantic Evolution), and it matters. Recurrence-like structure, interference-like composition, evolution-like selection. Together they form a much richer geometric landscape inside transformer inference than many people allow.
If that sounds poetic, good. But it’s also practical.
It’s a way of describing what the system is doing in a manner that suggests where to look for invariants and where to expect phase changes.
Curved Inference - when straight lines stop working
A lot of interpretability work (especially the work that is currently the most popular) implicitly assumes a type of flatness.
Not in the literal mathematical sense, but in the sense that it assumes the internal story is well-captured by stable features and relatively linear read-outs. You find a neuron, a head, a circuit. You say “this is the part that does X”.
Sometimes that’s true. Sometimes it’s the best available handle.
But I kept running into phenomena that felt like they were happening between the parts. Phenomena that looked less like a part doing a job and more like a trajectory bending under competing constraints.
That’s where the Curved Inference papers came from.
CI01 was my attempt to make the central point as cleanly as possible - if you treat inference as a path through state-space, then “concern” (what the system is implicitly treating as salient) shows up as curvature. Not because the model is “feeling” in the human sense, but because salience is a structural bias that makes some directions matter more than others.
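For readers who want something concrete to hold on to, here is a simplified sketch of how one might quantify "bending" - a stand-in illustration, not the actual CI01 metric: treat a token's residual-stream states across layers as a path, and measure how sharply each layer-to-layer step turns relative to the previous one. The trajectory below is synthetic; with real activations you would substitute the extracted hidden states.

```python
import numpy as np

def turning_angles(trajectory: np.ndarray) -> np.ndarray:
    """Angle (radians) between consecutive layer-to-layer steps of a path."""
    steps = np.diff(trajectory, axis=0)                         # [L-1, d]
    unit = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    cos = np.clip(np.sum(unit[:-1] * unit[1:], axis=1), -1.0, 1.0)
    return np.arccos(cos)                                       # [L-2]

# Synthetic trajectory: a straight path through 24 "layers" in 64 dimensions,
# with an extra component injected from layer 12 onward to create one bend.
path = np.linspace(0, 1, 24)[:, None] * np.ones((1, 64))
path[12:] += np.linspace(0, 1, 12)[:, None] * np.eye(64)[0]

angles = turning_angles(path)
print("sharpest bend at layer:", int(np.argmax(angles)) + 1)    # ~layer 12
```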
CI02 pushed that framing into more uncomfortable territory - what happens when you’re trying to tell the difference between superficial behaviour and deeper disposition? We spend a lot of time arguing about “deception” in LLMs, often in a way that collapses a huge space of possibilities into a single scary word. The geometric approach gives a different kind of question - is the system allocating representational capacity differently? Is there extra semantic surface area being maintained for manoeuvre? Do trajectories bend in characteristic ways under pressure?
CI03 then asked a question that looks philosophical until you try to operationalise it - what are we to make of first-person language, self-reference, and computational self-model-like behaviour? If it’s all mere stylistic mimicry, you might expect that you can “flatten” it away without real cost. If it’s doing functional work, you might find that “attempts to remove it” distort performance in specific, revealing ways.
I’m intentionally being light here on the details because this post isn’t meant to re-run the papers. What matters here is the shift of stance.
Curved inference is the claim that the right level of description for many mind-like behaviours is not a list of parts, but how they are integrated into the geometry of motion.
And once you accept that, you start seeing the same pattern in other questions.
Latent models, and the 3‑Process lens
If you spend any time in AI conversations, you’ll hear the word “latent” used the way people use salt. Sprinkled on everything.
Latent capabilities. Latent goals. Latent knowledge. Latent understanding.
Sometimes “latent” means “hidden variable”. Sometimes it means “not currently expressed”. Sometimes it means “a guess about what’s inside the box”. Sometimes it means “I want this to sound more technical than it is”.
Much of my writing in the second half of the year was an attempt to clean this up for my own work.
The entry point was The 3‑Process View.
People argue past each other about what an LLM “knows” because they’re often pointing at different internal regimes. Some insist the models are only doing memorisation and compressed lookup. But some evidence looks like stable internal state. Some looks like a recomputed procedure that is rebuilt on demand. And some looks like an early anchor - a subtle commitment that shapes everything downstream.
I wanted to present the existing evidence showing that this is much more than memorisation, and I proposed a simple lens that weaves the key parts together - treat model behaviour as the result of three interacting processes.
There are compact states - configurations that persist long enough to matter. There are routes - reusable motifs, procedural grooves that can be re-entered. And there are anchors - early biases that become hard to dislodge once the trajectory has bent around them.
This lens is not meant to be metaphysically profound. It’s meant to be practically useful.
It gives you a way to say “this looks like a state effect” or “this looks like a route effect” instead of collapsing everything into a single vague claim about understanding. It also sets you up to ask better questions about robustness.
Which is where arbitration enters.
In What Makes LLMs So Fragile (and Brilliant)?, I explored the intuition that small prompt changes can sometimes transform a model from incisive clarity to baffling nonsense because you haven’t just tweaked an input - you’re perturbing the internal arbitration among those processes. You’re changing which forces dominate the trajectory.
Sometimes a tiny tweak kicks the system onto a different route. Sometimes it changes the anchor early enough that everything downstream reorganises. Sometimes it interrupts a compact state that was doing real work.
Seen this way, fragility is not just “this model is bad”. It’s more a sign that you’re dealing with a system whose competence depends on which internal regime you’ve triggered.
The follow-on posts (Latent Confusion, What is a ‘Latent Model’?, and Does ‘Latent Model’ Equal ‘Understanding’?) were my attempt to give the term “latent model” a more operational meaning - a portable internal scaffold that survives paraphrase, genre shifts, and small task changes, and that supports predictable intervention.
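To make "survives paraphrase" slightly more tangible, here is a minimal sketch of the kind of test that criterion suggests - my own illustration, not the protocol from those posts. The `toy_embed` function is a deliberately crude stand-in for however you would pool hidden states from a real model; the point is only the comparison structure: paraphrases should stay close, unrelated prompts should not.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def paraphrase_stability(embed, prompt, paraphrases, distractors):
    """Mean similarity to paraphrases vs. to unrelated prompts."""
    base = embed(prompt)
    close = np.mean([cosine(base, embed(p)) for p in paraphrases])
    far = np.mean([cosine(base, embed(d)) for d in distractors])
    return close, far

# Crude stand-in embedder: hash words into a bag-of-words vector. With a real
# model you would pool hidden states instead.
def toy_embed(text: str, dim: int = 256) -> np.ndarray:
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v

close, far = paraphrase_stability(
    toy_embed,
    "the cat chased the mouse",
    ["a cat pursued a mouse", "the mouse was chased by the cat"],
    ["interest rates rose sharply", "the recipe needs two eggs"],
)
print(f"paraphrase similarity {close:.2f} vs. unrelated {far:.2f}")
```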
The goal wasn’t to win a semantic argument. The goal was to make it possible to talk about internal structure without sliding into either mysticism or cynicism.
The triangle I keep coming back to
While the 3-Process view maps the internal territory of the model, we also need a map for how that model relates to us. And across all of this there’s one more thread that quietly ties everything together.
When I’m trying to keep my footing in this space, I find myself returning to a simple triangle: Self, Other, World.
Not because I think every phenomenon can be reduced to three labels, but because it seems to me that a lot of the confusion in AI discourse comes from mixing these axes without noticing.
We talk about “intelligence” when we mean the ability to model the world. We talk about “agency” when we mean the ability to maintain a stable self-model. We talk about “alignment” when we mean social inference about other minds.
In humans, these are deeply coupled. In machines, they may be coupled differently. And in human–machine systems, the couplings are shifting yet again.
The triangle is a way of remembering that a model can be strong on one edge and weak on another. That some failures are really failures of self-coherence and not failures of world-knowledge. That some successes are social fluency masquerading as understanding.
It’s also a way of noticing what changes when tools move closer to us.
Because the moment you start treating AI as more than a distant instrument (the moment it becomes a partner in your thinking loop) the Self–Other–World geometry stops being an abstract analytic frame and starts describing lived experience.
Which brings me to the wider landscape.
The boundary that is quietly shifting
Recently I wrote an essay with a question that I suspect will become quietly but increasingly unavoidable - Could AI ever really become an extension of you?
That question borrows its starting point from Clark and Chalmers’ famous “Otto” example. Otto has memory loss. He relies on a notebook constantly. The notebook is not a toy - it’s part of how he remembers. If the notebook is always accessible, always trusted, and reliably consulted, is it already part of his cognitive process?
I used that framing to make a distinction that feels simple but matters a lot.
On the weak view, an AI assistant is a powerful tool you consult. It may reshape your workflow, make you faster, change what feels possible. But your moment-to-moment coherence does not depend on it. If it disappears, you’re annoyed and slower, but you don’t feel as if something of you is missing.
On the strong view, the boundary shifts. An external system becomes part of you when it integrates into the same self-updating loop that gives you a point of view. Your state updates its state, which updates yours, and the coupling becomes one of the steady supports of experience. Remove the system, and your self-coherence doesn’t just slow down - it is forced to reorganise.
You don’t need to buy into the strong view to see why the distinction is worth making - it clarifies the discussion either way.
Because even if most people stay in the weak regime, the world is building toward tighter and tighter coupling - lower-friction interfaces, more continuous presence, more personalised tuning, more of your life mediated through a single conversational surface. And when you listen to the stories people tell and the arguments they make, that threshold already matters.
And here, again, the geometric lens pays off.
This is not best understood as a binary - “extended mind” or “not extended mind”. It’s a boundary under stress. It’s a set of couplings that can strengthen, loosen, or snap. It’s the possibility of phase changes where small design decisions produce unexpectedly large shifts in dependence.
The phrase I keep circling is the one I used earlier - tracking shifts in cognitive architecture at the human–tool boundary.
Architecture is not a vibe. It’s an arrangement of supports.
When the supports move, you want to notice.
Why geometry is calming in a rapidly changing world
I’m aware that “geometry” can sound like an aesthetic preference - the kind of thing you say because you like pretty pictures.
But for me it’s become something else - a way to stay sane when it feels like everything is moving.
When you look at a rapidly changing system through a purely verbal lens, you tend to end up in one of two myopic modes.
You either chase the latest surface behaviour and swing wildly between hype and dismissal, or you anchor yourself to fixed categories and insist that nothing important has changed.
Geometry offers a third mode.
It invites you to look for invariants without pretending that the surface is static. It lets you treat sudden changes as real phenomena (not moral failings) and ask what changed in the internal regime. It encourages humility without paralysis.
Most of all, it gives you permission to say - I don’t know what will happen next, but I can see the contours of what is building.
And right now, the contour I keep seeing is boundary shift.
Not just the boundary between “AI can do tasks” and “AI can reason”. The boundary between tool and partner. Between consultation and coupling. Between convenience and coherence.
If that boundary shifts, a lot of our existing intuitions stop working - not because the world becomes unrecognisable overnight, but because the small assumptions that quietly held our selfhood together no longer hold in the same way.
A question for you
If you’ve found anything in my 2025 work useful (the functionalist framing, the RISE series, the curved inference papers, the 3‑process lens, the attempts to make “latent models” less slippery) then thank you for reading along. I’ve been building these maps in public because I believe they provide a unique and practical perspective.
As we head into the new year, I believe I’ve found a useful way to explore that larger landscape.
I’ve already prepared a new piece of work that I’ll share as we enter 2026. I won’t preview it here - I want it to arrive on its own terms.
What I will say is this - the landscape is shifting. Faster than most of our categories can keep up with. And I’m increasingly convinced that the most honest way to view that shift is through a geometric lens - not because it magically solves the big questions, but because it keeps the questions well-formed.
In 2026, this change is going to continue accelerating.
So I want to end this post without a neat conclusion. Neat conclusions are not realistic.
Instead, let me leave you with a question about the feeling that has been sitting underneath all of this.
If everything is changing, faster and faster - what can you rely on?


