
David Aronchick on Distributed Data Orchestration with Expanso
In this episode of Alexa's Input (AI), I sit down with David Aronchick, co-founder and CEO of Expanso and former product lead for Kubernetes at Google.

Data is growing everywhere outside your data center. Solar panels in remote locations across a country. Security cameras at retail stores. IoT sensors across factory floors. And moving that data to the cloud for processing? It's expensive, slow, and often restricted by compliance.

David is an expert when it comes to solving distribution problems. He led Kubernetes product at Google, co-founded Kubeflow to bring ML to production, and now he's building Expanso to tackle a difficult constraint: when your data can't move, how do you process it where it lives?

We discuss:
- The need for distributed data orchestration
- Upstream data control: filtering and transforming at the source
- Three forces making edge computing inevitable (physics, regulations, economics)
- How to build successful open source infrastructure projects
- Customer discovery and finding real pain points
- His transition from Protocol Labs to founding Expanso
- ETL pipelines: moving the first four steps closer to the data
- Context loss and lineage in distributed systems
- Processing 400,000 signals per second with 150MB agents
- AI observability: attaching source metadata to training data
- Running ML pipelines at the edge
- Real-world deployment challenges (bandwidth, regulations, cost)

Expanso is rethinking how we process data in an AI-native world by moving compute to data instead of data to compute.
If you want to understand where distributed systems and edge computing are heading, this is a deep dive into the infrastructure layer beneath modern AI applications.

General Podcast Links
Watch: https://www.youtube.com/@alexa_griffith
Read: https://alexasinput.substack.com/
Listen: https://creators.spotify.com/pod/profile/alexagriffith/
More: https://linktr.ee/alexagriffith

Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: https://www.linkedin.com/in/alexa-griffith/

Find out more about the guest at
LinkedIn: https://www.linkedin.com/in/aronchick/
Twitter/X: https://x.com/aronchick
GitHub: https://github.com/aronchick
Expanso Website: https://expanso.io/

Resources
Expanso Website: https://expanso.io/
Kubernetes: https://kubernetes.io/
Kubeflow: https://www.kubeflow.org/
CNCF (Cloud Native Computing Foundation): https://www.cncf.io/
Protocol Labs: https://protocol.ai/

Keywords
David Aronchick, Expanso, Kubernetes, Kubeflow, distributed systems, edge computing, data pipelines, ETL, upstream data control, Google Kubernetes Engine, open source, CNCF, observability, log processing, data lineage, provenance, schema enforcement, IoT, edge AI, distributed data, machine learning infrastructure, Protocol Labs, IPFS, Filecoin, data governance, compliance, GDPR, bandwidth optimization, data aggregation, AI infrastructure, multi-cloud, hybrid cloud, real-time processing






