Super Data Science: ML & AI Podcast with Jon Krohn

Hosted by Jon Krohn

Technology, Science, Interviews guests

Episodes

1001

Latest episode

Jun 2026

About the show

The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.

Listen to episodes

60 recent

June 16, 2026 · 1 hr 55 min

1001: How AI Erased My Career Moat, an Episode #1001 Special: Jon Krohn interviewed by Kirill Eremenko

For this episode #1001 special, the tables are turned: SuperDataScience founder Kirill Eremenko takes the host’s chair and Jon Krohn is the guest. They trace Jon’s path from an Oxford neuroscience PhD to a New York hedge fund to founding the AI consulting firm Y Carrot, and discuss why he regrets leaving academia, how tools like Claude Code erased his hard-won technical moat, and why that makes skilled engineers more valuable than ever. Along the way: whether AI is a bubble, Jevons paradox and the data-center boom, the RICE framework for choosing AI projects, the single biggest reason AI projects fail, and how a well-built AI agent could give anyone “Christopher Nolan–like” focus.

Additional materials: https://www.superdatascience.com/1001

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(03:42) From an Oxford neuroscience PhD to AI consulting
(17:25) Defining AGI and why consciousness isn’t required
(30:39) Are we in an AI bubble? Why we benefit either way
(46:32) Jevons paradox: why cheaper AI means more data centers
(01:08:31) The RICE framework for prioritizing AI projects
(01:15:08) The number-one reason AI projects fail in production
(01:31:50) AI, attention, and protecting your wellbeing

June 12, 2026 · 1 hr 0 min

1000: Ten Years of the Super Data Science Podcast, with Jon, Kirill and Special Guests

For this landmark 1,000th episode and the show’s 10-year anniversary, host Jon Krohn is joined by SuperDataScience founder Kirill Eremenko, who hosted the podcast for its first 400-plus episodes before handing over the reins. In a first for the show, the episode was recorded live with the audience invited to join on air, alongside surprise appearances from the team, longtime guests, and even Jon’s family. Together, Jon and Kirill look back on a decade of the podcast and field listener questions on AI’s biggest opportunities, the build-versus-buy dilemma, how to break into the field today, and how to stay grounded amid the relentless pace of AI.

Additional materials: www.superdatascience.com/1000

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

June 9, 2026 · 1 hr 15 min

999: What's Left to Build When Software Is Free, with Chip Huyen

Chip Huyen joins host Jon Krohn for this milestone episode 999 to talk about her record-breaking book "AI Engineering", the most-read title on the O'Reilly platform last year, and how the AI landscape has shifted since her last appearance. Chip breaks down what separates AI engineering from machine learning engineering, makes the case for a "start simple" workflow, gets candid about the real costs of running LLMs in production, and shares why she's now fascinated by physical AI, robotics, and world models, and why the durable problems worth solving are increasingly human ones. Jon guides the conversation from the practical content of the book through to where the field is heading next.

Additional materials: https://www.superdatascience.com/999

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(06:48) What separates AI engineering from machine learning engineering
(14:44) The “start simple” approach: prompting, then RAG, then fine-tuning
(18:19) Why web search is so painfully expensive in production
(35:11) Is the “ChatGPT moment” for physical AI really here?
(52:21) Why the durable problems left to solve are people problems

June 5, 2026 · 27 min

998: In Case You Missed It in May 2026

In this month’s episode of ICYMI, Jon Krohn explores how AI agents are simultaneously creating new risks and unlocking powerful new ways of working with data. Hear from Anneka Gupta, Cal Al-Dhubaib, Trevor Manz, Jazmia Henry, Jeremy Mumford, and Jacob Miller, discussing why the old cybersecurity playbook breaks down in the age of Claude Mythos, how the notebook became an AI agent’s working memory, what it really takes to build a foundation model from scratch, and why failing slowly is the most expensive mistake an AI team can make.

Additional materials: www.superdatascience.com/998

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(00:40) Why Claude Mythos Changes Everything About Cybersecurity
(08:11) Why Your Notebook Should Be Your Agent’s Working Memory
(13:19) What It Actually Takes to Build a Foundation Model From Scratch
(20:46) Failing Slowly Is the Most Expensive AI Mistake

June 2, 2026 · 1 hr 9 min

997: How This Text-to-Video-Game AI Startup Hit 20M Users

Dr. Andrey Kurenkov returns to the show to talk about Astrocade's astronomical growth from pre-alpha to over 20 million engaged users, what it actually takes to build a vibe-coding platform that scales, and how the broader AI landscape has shifted since his last appearance. Andrey shares behind-the-scenes lessons from building B2C user-generated content products, why the real moat is community rather than tech, and his current thinking on humanoid robotics, AGI, and the AI risks people actually overlook.

Additional materials: https://www.superdatascience.com/997

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(02:11) The Astrocade elevator pitch and how it grew to 20M users
(16:19) Why there's no secret sauce behind the platform
(24:56) UGC as the real moat, not the AI
(46:57) Why household humanoid robots are now 2–3 years away
(58:33) What AGI actually means, and why Andrey is an ASI skeptic

May 29, 2026 · 29 min

996: TrueFoundry’s Nikunj Bajaj on How to Get $100M Returns on AI Agent Deployments

TrueFoundry co-founder and CEO Nikunj Bajaj speaks to Jon Krohn about how enterprises like Nvidia and Siemens are realizing returns of over $100 million from single agent deployments, the AI gateway architecture that makes it possible to connect, observe, and govern agents at scale, and why the familiar advice to “start small” is the wrong way to roll out AI agents inside a large organization.

Additional materials: www.superdatascience.com/996

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(01:21) What TrueFoundry does and why agents in production need a control plane
(06:32) Breaking down the AI gateway: the model, MCP, and agent gateways
(16:47) Taming tool sprawl with scoped, read-only MCP access
(19:10) Why the agent gateway is the hard part and the kill switch most teams lack
(22:24) The five-workflow framework behind $100M agent deployments

May 26, 2026 · 1 hr 9 min

995: End-to-End Foundation Models for the Energy Industry, with Jazmia Henry

Jazmia Henry joins Jon Krohn to break down what it actually takes to build end-to-end foundation models for the energy industry. From wrangling decades of handwritten oil-and-gas documents into usable training data, to bespoke tokenizers, reinforcement learning, and inference at scale, Jazmia walks through every stage of the stack. Along the way she explains why reinforcement learning models are "bursty," what reward hacking is and how her Grounded Continuous Evaluation framework fixes it, and revisits the 2023 NeurIPS paper that argued, to widespread skepticism at the time, that scaling bad data degrades model performance.

Additional materials: https://www.superdatascience.com/995

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(10:06) The User Agnosticism Tenet
(20:02) The Zillow Offers parable
(23:25) Why workflows should come before agents
(29:57) Why data engineering is the bedrock of AI
(52:41) Why velocity is the only durable moat

May 22, 2026 · 11 min

994: AI’s Putting Recent Grads Out of Work; Here’s How to Get Hired Anyway!

Unemployment for recent computer-science graduates now rivals rates for fine-arts and anthropology majors, and undergraduate CS enrollment fell 11% in 2025. In this Five-Minute Friday, Jon Krohn walks through the data on both sides of the debate, from Stanford research showing a 13% employment drop for young workers in AI-exposed jobs, to Federal Reserve studies finding no statistically detectable link between AI adoption and reduced hiring. Jon shares his own view on where the truth lies and offers five concrete pieces of advice for graduates and senior professionals alike on how to get hired in 2026.

Additional materials: www.superdatascience.com/993

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

May 19, 2026 · 1 hr 10 min

993: How to Build AI-First Organizations, with Jacob Miller and Jeremy Mumford

For years, AI content has come in the form of “use this library, use this tool” tutorials that age out within months. Jacob Miller and Jeremy Mumford, co-authors of the brand-new Wiley book Architected Intelligence, wanted to write something different: a guide to the higher-level principles of building AI products and AI-first organizations that will still be relevant in five or ten years. In this episode, the two Pattern engineers walk Jon Krohn through the core ideas of their book: why you should design products and processes so they can be executed by a human, an AI agent, or any hybrid combination; why most companies are still treating hallucinations as a model problem when they’re actually a data curation problem; why the natural progression of AI development goes from skills to workflows to agents, not straight to agents; and why velocity, not models or data, is the only durable competitive advantage left.

Additional materials: https://www.superdatascience.com/993

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
(10:06) The User Agnosticism Tenet
(20:02) The Zillow Offers parable
(23:25) Why workflows should come before agents
(29:57) Why data engineering is the bedrock of AI
(52:41) Why velocity is the only durable moat

May 15, 2026 · 14 min

992: Tokenmaxxing vs AI Hardware Bottlenecks

While “tokenmaxxing”, the social media trend of maximizing AI token consumption as a vanity metric, takes off online, the physical infrastructure behind AI is slamming into serious bottlenecks. In this Five-Minute Friday, Jon Krohn maps out the four overlapping supply-chain constraints choking AI compute: GPUs (with NVIDIA Blackwell sold out through mid-2026), high-bandwidth memory (quintupled demand since 2023, only three manufacturers worldwide), CPUs (agentic AI requires 12x more CPUs per GPU than chatbots), and electricity (Gartner projects power shortages will restrict 40% of AI data centres by 2027). Find out why the five biggest hyperscalers are on track to spend $725 billion on AI infrastructure in 2026, where the reasons for optimism lie, and why Jon says you should definitely not tokenmaxx.

Additional materials: www.superdatascience.com/992

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.