04/10/2026
[blog post] What is Intelligence? Or "Distinguishability is All You Need"
Here are several related questions to which we do not have a good answer:
How will we know when we've achieved "Artificial General Intelligence" (AGI)?
Between two AI models, how do we know which one is more intelligent?
Is there a closed-loop, self-supervised way that AI models can improve themselves to become more intelligent?
In a recent conversation with Santosh Vempala regarding these questions, I had the idea that the well-known Turing Test rests on an underlying concept that becomes much more powerful when generalized. The original idea is that an Artificial Intelligence agent (we'll refer to these as "agents" from now on) is as intelligent as humans if, in an experiment where humans interact with either a human or an agent (selected at random), humans cannot do better than random chance in distinguishing who their interlocutor is...
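The "no better than random chance" criterion in the excerpt can be illustrated with a toy experiment. The sketch below is only an illustration under stated assumptions, not the formulation from the post: the function name `distinguishability`, the score `2 * |accuracy - 0.5|`, and the Gaussian stand-ins for the two interlocutors are all hypothetical choices made for this example.

```python
import random

def distinguishability(judge, source_a, source_b, trials=2000, seed=0):
    """Run a toy imitation game and return (judge accuracy, score in [0, 1]).

    A score near 0 means the judge does no better than chance.
    The score definition is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        is_a = rng.random() < 0.5                 # pick the hidden interlocutor at random
        sample = source_a(rng) if is_a else source_b(rng)
        correct += (judge(sample) == is_a)        # judge returns True for "this was A"
    accuracy = correct / trials
    return accuracy, 2 * abs(accuracy - 0.5)

# Hypothetical judge: guesses "A" whenever the observed number is positive.
judge = lambda x: x > 0

# Identical sources: the judge should hover around chance.
acc_same, score_same = distinguishability(
    judge, lambda r: r.gauss(0, 1), lambda r: r.gauss(0, 1))

# Clearly different sources: the judge should be nearly always right.
acc_diff, score_diff = distinguishability(
    judge, lambda r: r.gauss(2, 1), lambda r: r.gauss(-2, 1))

print(f"identical sources: accuracy={acc_same:.3f}, score={score_same:.3f}")
print(f"different sources: accuracy={acc_diff:.3f}, score={score_diff:.3f}")
```

In this toy setting, two sources count as indistinguishable to a given judge when the score stays close to zero as the number of trials grows.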
https://poggio-lab.mit.edu/blogsupdates/what-is-intelligence-distinguishability
04/01/2026
[video] "Intelligence as Prediction: Cybernetics, LLMs, and Sociality"
Speaker: Blaise Agüera y Arcas - Google, Paradigms of Intelligence
https://youtu.be/6NC0tSjZXBo
03/29/2026
[blog post] "PoggioAI/MSc Went Online"
This first public release is an open-source, customizable, modular multi-agent system for academic research workflows, with a current emphasis on machine learning theory and nearby quantitative fields.
Our goal is not autonomous scientific ideation, and it is not fully automated research. The target is narrower and more practical: reduce by orders of magnitude the human steering needed to take a specified hypothesis to a literature-grounded, mathematically established, experimentally supported, submission-oriented manuscript draft...
https://poggio-lab.mit.edu/blogsupdates/poggioai-msc-went-online
03/17/2026
[blog post] "Beneficial Misalignment: Why We Shouldn't Always Align AI to Humans"
In the rapidly evolving field of NeuroAI, a significant amount of energy is dedicated to "alignment", the idea that representations from artificial intelligence should converge towards biological intelligence (Yamins et al., 2014). The prevailing measuring stick for progress is often how closely an artificial neural network mimics the biological brain. The logic is compelling: the brain is our only proof of concept for general intelligence, so the closer our machines get to biological representations, the closer they must be to true intelligence.
But this convergence relies on an assumption we rarely question: that the human brain is the ceiling of intelligence.
I would argue the opposite.
As artificial intelligence begins to surpass human performance, we should not expect, nor necessarily desire, this alignment to continue. In fact, we are entering an era where "Beneficial Misalignment" may be the key to super-human capability...
https://poggio-lab.mit.edu/blogsupdates/beneficial-misalignment
03/11/2026
[blog post] A Conversation with Blaise Agüera y Arcas: On Intelligence, Life, and the Future of AI
What does it mean to call something intelligent, and when did this question get so hard to answer? For Blaise Agüera y Arcas, VP at Google and founder of Paradigms of Intelligence, the answer begins not with LLMs but with the origins of life itself. His book What is Intelligence? (MIT Press, 2025) argues that intelligence is substrate-independent prediction, a property running unbroken from the first self-replicating molecules to modern AI. LLMs aren't imitating intelligence, he contends: they're the real thing...
https://poggio-lab.mit.edu/blogsupdates/interview-blaise-aguera-y-arcas
03/04/2026
[blog post] Can a Neural Network Think Before It Speaks?
Somewhere around 2022, an observation started making the rounds among researchers working with large language models: if you just asked a model to think out loud before answering, it got dramatically better at hard problems. This technique — Chain-of-Thought (CoT) prompting — felt almost too simple to be real. Suddenly, getting a model to write "let me think step by step..." before answering a math problem improved its accuracy from around 20% to over 80%.
But the more you think about why it works, the stranger it gets. Why does writing things down in natural language help a model reason? ...
https://poggio-lab.mit.edu/blogsupdates/can-a-neural-network-think-before-it-speaks
02/26/2026
[blog post] Edge of (Stochastic) Stability made simple — Part II: the mini-batch case
In Part I we had one landscape and a deterministic update.
Now we have a distribution of mini-batch landscapes and a stochastic update...
https://poggio-lab.mit.edu/blogsupdates/edge-of-stochastic-stability-part-ii
02/20/2026
[blog post] Edge of (Stochastic) Stability made simple — Part I: A crash course on (full-batch) Edge of Stability
Conceptual Map (where this is going): Inspired by the structure of those posts, I'll split this into three parts:
Part I: a quick refresher on (full-batch) Edge of Stability (EoS) to set the scene for what comes next.
Part II (the mini-batch case): what changes for SGD, why “λ_max hits 2/η” is the wrong diagnostic, what diagnostics we came up with, and the Edge of Stochastic Stability (EoSS).
Part III: practical implications (hyperparameters, modeling SGD, and what this perspective changes).
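For readers wondering where the 2/η threshold mentioned above comes from, the following minimal sketch shows the classical full-batch picture only. On a one-dimensional quadratic with curvature λ, each gradient descent step multiplies the iterate by (1 − ηλ), so the iterates shrink when λ < 2/η and grow when λ > 2/η. The function name `gd_on_quadratic` and the specific numbers are assumptions chosen for illustration; the sketch says nothing about the mini-batch diagnostics discussed in Part II.

```python
def gd_on_quadratic(sharpness, lr, steps=50, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2 and return |x| at the end."""
    x = x0
    for _ in range(steps):
        grad = sharpness * x      # f'(x) = sharpness * x
        x -= lr * grad            # equivalent to x <- (1 - lr * sharpness) * x
    return abs(x)

lr = 0.1
threshold = 2 / lr                    # stability threshold for this step size

below = gd_on_quadratic(19.0, lr)     # curvature just under 2/lr: iterates contract
above = gd_on_quadratic(21.0, lr)     # curvature just over 2/lr: iterates blow up

print(f"threshold 2/lr = {threshold:.1f}")
print(f"sharpness 19: |x| after 50 steps = {below:.4f}")
print(f"sharpness 21: |x| after 50 steps = {above:.1f}")
```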
https://poggio-lab.mit.edu/blogsupdates/edge-of-stochastic-stability-part-i
02/13/2026
[blog post] Are Transformers Just "Stochastic Parrots"?
Why Modern AI is Actually a Programmable Computer
A common criticism of Large Language Models (LLMs) is that they are merely "stochastic parrots"—statistical mimics that stitch together likely patterns without genuine reasoning. This view suggests they are powerful impostors, lacking the logical core of a "real" computer.
But there is a different perspective. If we look mathematically at the architecture powering these models—the Transformer—we find something surprising. It isn't just a pattern matcher; it is a fully capable, programmable computer.
In this summary, we strip away the complexity to introduce the Associative Turing Machine (ATM). This model proves that the mechanisms inside current AI (Attention and Feed-Forward Networks) are theoretically equivalent to the logic gates and RAM of a classical computer...
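As a rough intuition for the attention-as-RAM comparison, the sketch below shows softmax attention acting as an addressable memory read: with one-hot keys as addresses and a sharp softmax, a query retrieves almost exactly the value stored at the matching address. This is a generic illustration, not the Associative Turing Machine construction from the post; the function name `attention_read` and the `sharpness` scaling parameter are assumptions made for this example.

```python
import math

def attention_read(query, keys, values, sharpness=50.0):
    """Softmax attention over (key, value) pairs, used as a soft memory lookup."""
    # Scaled dot-product score between the query and each stored key.
    scores = [sharpness * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)                                  # subtract max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted average of the stored values.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Three memory cells with one-hot addresses and scalar contents.
keys = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
values = [[10.0], [20.0], [30.0]]

out = attention_read([0, 1, 0], keys, values)   # "read address 1"
print(out)
```

With a large `sharpness`, the softmax weights concentrate on the matching key, so the read approaches a hard lookup; with a small one, the output blends neighboring cells.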
https://poggio-lab.mit.edu/blogsupdates/are-transformers-just-stochastic-parrots
02/03/2026
[blog post] Intelligence Begins with Memory: From Reflexes to Attention
Why associative memory is the oldest mechanism of intelligence—and still its computational core.
https://sites.mit.edu/poggio-lab/intelligence-begins-with-memory-from-reflexes-to-attention/
01/26/2026
[blog post] Most Real Numbers Do Not Exist (And Why That Matters for Intelligence)
The most useful mathematical objects are the ones that aren’t real at all...
https://poggio-lab.mit.edu/most-real-numbers-do-not-exist-and-why-that-matters-for-intelligence/
01/20/2026
[blog post] Genericity
Where compositionality is about structure, genericity is about geometry: the shape of the optimization landscape, the presence of gradients, and the existence of stable signals that guide learning.
Genericity answers one of the deepest puzzles in modern AI:
Why does training enormous neural networks with simple gradient descent actually work?
https://sites.mit.edu/poggio-lab/the-second-pillar-genericity/