your podcast episode.

*the big unlock.

The most interesting AI research right now isn't in language models.

Evan Fineberg and Sergey Obukhov's conversation on Latent Space this week is one of those rare episodes that reframes the map. Sergey led Llama 2 and Llama 3 pretraining at Meta — about as deep in frontier LLM work as you can get — and he has now pivoted to Genesis Molecular AI as CTO. That career move is itself a signal worth sitting with. The argument isn't that LLMs are unimportant. It's that diffusion models, the same primitive that reshaped image generation, are now doing something structurally more interesting in 3D molecular space than anything currently happening in language. As Fineberg puts it: "Some of the most innovative diffusion research is happening in 3D structure prediction right now. No one would have predicted that then, but now that's a pillar of diffusion."

The concrete problem being solved is protein-ligand docking: predicting the precise 3D coordinates of how a drug molecule (the "key") binds to a target protein (the "lock"). This sounds like a narrow technical sub-problem, but it has been a genuine wall for a decade. The hypothesis — that accurately modeling these 3D poses would enable more accurate potency prediction, and eventually unlock "undruggable" targets — couldn't even be tested properly before, because the models were too bad and the experimental alternatives (crystallography, cryo-EM) cost tens of thousands of dollars and months of PhD labor per structure. Genesis's Pearl model appears to be the first serious evidence that the hypothesis is actually true: better 3D pose prediction does improve downstream potency modeling. Fineberg is careful not to overclaim — he frames this as one necessary-but-not-sufficient step in a 30-property ADMET optimization problem — but the directional finding is significant.

What makes this technically distinctive isn't just the application domain but the architecture demands. Modeling protein-ligand interactions requires handling conformational flexibility (proteins aren't rigid; they breathe and shift), sparse 3D point clouds, and physical priors that flat sequence models simply can't encode. The diffusion framework turns out to be a much better fit for this geometry than GANs were — exactly the way diffusion outcompeted GANs in images, but for reasons rooted in physics rather than aesthetics. Fineberg explicitly connects the two: GANs looked like the obvious path in 2017-2018 and failed; diffusion arrived as the right primitive and unlocked the space. The implication is that we should expect similar "right primitive" unlock moments in other physical domains — materials science, climate modeling, structural biology more broadly — and that the people best positioned to see them are those, like Sergey, who have one foot in deep ML and one foot in the underlying physics.

The upshot is that the AI-for-science stack is maturing into its own discipline, with its own architecture choices, evaluation frameworks, and career tracks. The Latent Space episode is the best primary-source articulation of that claim I've seen in months. It is a useful corrective to LLM-centric coverage, not because frontier language models don't matter, but because the most consequential applied AI work may increasingly be happening in domains where the problem structure demands something different entirely.

WATCH THIS ONE: *The Most Innovative Diffusion Research Is Happening in Drug Discovery, Not Image Generation* - https://www.youtube.com/watch?v=YQWXxnkK4dw

The combination of a Llama pretraining lead explaining why he left frontier LLMs for biotech AI and a working technical account of why diffusion is the right primitive for 3D molecular structure prediction makes this the highest-density single episode for understanding where AI-for-science actually stands right now.

*also worth a look.

Math as the canary — and a quiet warning about understanding vs. output.
Grant Sanderson (3Blue1Brown) joined Dwarkesh Patel this week in what is genuinely one of the better conversations about AI's relationship to deep intellectual work. The core thesis: AI progress in mathematics is the right place to watch because it has the spikiest, most legible frontier. IMO geometry problems — solved in 19 seconds via brute force since 2024. Combinatorics — still a meaningful holdout, likely because it requires something closer to playful structural intuition. Sanderson's framework for the next benchmark after IMO gold is worth internalizing: good mathematicians prove theorems, great ones generate conjectures, the greatest come up with new definitions — and that hierarchy roughly tracks the difficulty of benchmarking AI progress. The harder-to-discuss but more important point: even as AI accelerates mathematical output, it may be quietly eroding human mathematical understanding, because the work of building intuition happens through struggle, not through watching correct proofs appear. That's not a Luddite argument — it's a structural one about what gets produced versus what gets internalized.

Grant Sanderson – AI and the future of math — https://www.youtube.com/watch?v=TfyPshgMbug
Worth watching in full for Sanderson's explanation of the Hugh Montgomery / Freeman Dyson story alone — it's the clearest illustration of why cross-domain insight might actually be an LLM-native capability, and what solving a Millennium Prize problem might plausibly look like.

The "AI chemist" framing — and what agentic systems actually mean in drug discovery.
One thread in the Latent Space episode that deserves separate attention: the Genesis team is building toward what they call an "AI chemist" — not just a model that predicts binding, but an agentic system that can iterate across the full hit-to-lead optimization loop: propose a molecule, predict its 3D binding pose, evaluate potency and ADMET properties, propose a modification, repeat. This is distinct from current-generation AI copilots for chemistry, which are mostly retrieval or single-step prediction tools. The agentic framing maps the drug discovery pipeline onto the same architectural pattern showing up in software engineering agents: tool use, multi-step planning, and feedback loops that can run faster than human-in-the-loop iteration. The difference is that the stakes — and the required accuracy bar — are radically higher than in code. A hallucinated variable name doesn't kill anyone. This framing will be worth tracking as Genesis publishes more; it's the most concrete "AI agent for physical science" roadmap being articulated by anyone with both serious ML credentials and domain depth.
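For readers who think in code, the propose / predict / evaluate / modify loop described above can be sketched as a toy greedy search. To be clear, this is an illustration under heavy assumptions and not Genesis's system: the `Candidate` type, the scoring functions, and the target values are all hypothetical placeholders standing in for real pose-prediction and ADMET models.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Toy stand-in for a molecule: a tuple of numeric 'design knobs' in [0, 1]."""
    knobs: tuple


def predict_pose_quality(c: Candidate) -> float:
    # Hypothetical placeholder for a 3D pose / potency predictor; higher is better.
    return -sum((k - 0.7) ** 2 for k in c.knobs)


def predict_admet_penalty(c: Candidate) -> float:
    # Hypothetical placeholder for ADMET property models; lower is better.
    return sum(max(0.0, k - 0.9) for k in c.knobs)


def score(c: Candidate) -> float:
    # Combine the toy potency proxy with the toy ADMET penalty.
    return predict_pose_quality(c) - predict_admet_penalty(c)


def propose_modification(c: Candidate, rng: random.Random) -> Candidate:
    # Nudge one randomly chosen knob, clamped to the [0, 1] range.
    i = rng.randrange(len(c.knobs))
    knobs = list(c.knobs)
    knobs[i] = min(1.0, max(0.0, knobs[i] + rng.uniform(-0.1, 0.1)))
    return Candidate(tuple(knobs))


def optimize(start: Candidate, steps: int = 200, seed: int = 0) -> Candidate:
    # Greedy loop: propose, evaluate, keep the change only if the score improves.
    rng = random.Random(seed)
    best = start
    for _ in range(steps):
        trial = propose_modification(best, rng)
        if score(trial) > score(best):
            best = trial
    return best


start = Candidate((0.1, 0.5, 0.95))
result = optimize(start)
print(round(score(start), 3), round(score(result), 3))
```

The point of the sketch is the shape of the loop, not the search strategy. A real agent would presumably swap the greedy step for multi-step planning and the toy scorers for learned models, which is exactly where the accuracy bar mentioned above starts to bite.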

*contrarian corner.

Where is AI actually making progress? Depends entirely on who you're listening to.

There's a genuine map disagreement worth flagging between this week's top episodes. The How I AI benchmark episode treats frontier LLM comparison — Sonnet 5 vs. Opus 4 vs. GPT-5.5 — as the natural center of gravity for evaluating AI progress. That's a reasonable frame for practitioners choosing tools. But the Latent Space episode implicitly argues the opposite: that the most architecturally interesting and consequential AI work is happening outside language models entirely, in 3D diffusion for molecular structure prediction. These aren't just different topics — they reflect different theories of where the action is. One frame says: track the frontier LLM leaderboard, because that's where capability improvements compound. The other says: the frontier LLM leaderboard is now a lagging indicator, and the leading indicators are in domains where the problem structure demands something language models fundamentally can't do.

There's also a quieter tension underneath this one, which the Sanderson conversation above already hints at. The Peter Diamandis MOONSHOTS podcast framing treats AI acceleration in technical fields as broadly and straightforwardly positive. Sanderson's conversation with Dwarkesh introduces a more uncomfortable idea: that acceleration in output is not the same as acceleration in understanding, and that these two things can move in opposite directions. You can have more theorems proven and fewer mathematicians who deeply understand why they're true. That's not a reason to slow down, but it is a reason to think carefully about what we mean when we say AI is "advancing" a field, versus advancing its output metrics. For those of us who care about AI adoption across industries, this distinction is going to matter more and more.

take it under advisement: the human mind is the original generative machine.

AI and the sciences: machine output vs. human understanding
