The MLOps Podcast
In this episode, Dean and Natanel Davidovits explore the intricacies of AI and machine learning, focusing on model efficiency, the use of APIs versus self-hosting, and the importance of defining success metrics in real-world applications. They discuss the challenges of data quality and labeling, the evolving role of data scientists in the age of LLMs, and the significance of effective communication between data science and product teams. The conversation also touches on the future of robotics in AI and the need for specialization in a rapidly changing landscape. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction to Natanel Davidovits 02:10 Optimizing AI Models for Real-World Tasks 03:47 Success Metrics in Industry vs. Academia 07:52 The Importance of Communication Between Teams 11:33 Handling Data Quality and Labeling Challenges 12:11 The Impact of LLMs on Data Science Careers 16:29 Navigating Specialized Domain Data 22:15 Trends in Machine Learning and AI 27:27 The Future of AI and Robotics 28:28 The Role of AI in Physics 33:36 Controversial Views on AI and Machine Learning 34:05 Final Thoughts and Recommendations ➡️ Natanel Davidovits on LinkedIn – https://www.linkedin.com/in/natanel-davidovits-28695312/ 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://x.com/TheRealDAGsHub ➡️ Dean Pleban: https://x.com/DeanPlbn
In this episode, Dean speaks with Jeremie Dreyfuss, Head of AI Research and Development at Intel, about the evolving role of AI in the enterprise. Jeremie shares insights into scaling machine learning solutions, the challenges of building AI infrastructure, and the future of AI-driven innovation in large organizations. Learn how enterprises are leveraging AI for efficiency, the latest advancements in AI research, and the strategies for staying competitive in a rapidly changing landscape. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction and Overview 00:55 Challenges of Data Collection and Infrastructure 05:00 Optimizing Test Recommendations 14:42 Tips for Deploying Entire ML Pipelines 21:19 The Impact of Large Language Models (LLMs) 25:30 How to Decide About LLM Investment in the Enterprise 29:29 Evaluating Models and Using Synthetic Data 35:34 Choosing the Right Tools for ML and LLM Projects 45:21 The Beauty of Small Data in Machine Learning 48:22 Recommendations for the Audience ➡️ Jeremie Dreyfuss on LinkedIn – https://www.linkedin.com/in/jeremie-dreyfuss/ 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://x.com/TheRealDAGsHub ➡️ Dean Pleban: https://x.com/DeanPlbn
In this episode, Dean speaks with Dror Haor, CTO at SeeTree, about the challenges of deploying AI in agriculture at scale. They explore how SeeTree integrates AI and sensor fusion to manage vast amounts of remote sensing data, helping farmers improve crop yields with high accuracy at low costs. Dror shares insights on handling data drift, customizing models for different regions, and balancing the trade-offs between cost and performance. This conversation dives deep into practical machine learning applications in agriculture, offering valuable lessons for anyone working with large-scale data and AI. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction 00:32 Production in machine learning at SeeTree 07:34 Sensor fusion in machine learning 16:26 Balancing accuracy and cost in agriculture 20:09 Customizing models for different customers and crops 24:19 Dealing with data in different domains 30:10 Tools and processes for ML at SeeTree 35:58 Building for scale 40:17 Collecting user feedback and self-improving products 42:45 Exciting developments in ML & AI 45:12 Hot takes in ML - Overfitting is good 46:34 Recommendations for the Audience ➡️ Dror Haor on LinkedIn – https://www.linkedin.com/in/dror-haor-phd-77152322/ ➡️ Dror Haor on Twitter – https://x.com/DrorHaor 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://x.com/TheRealDAGsHub ➡️ Dean Pleban: https://x.com/DeanPlbn
In this episode, Dean speaks with Federico Bacci, a data scientist and ML engineer at Bol, the largest e-commerce company in the Netherlands and Belgium. Federico shares valuable insights into the intricacies of deploying machine learning models in production, particularly for forecasting problems. He discusses the challenges of model explainability, the importance of feature engineering over model complexity, and the critical role of stakeholder feedback in improving ML systems. Federico also offers a compelling perspective on why LLMs aren't always the answer in AI applications, emphasizing the need for tailored solutions. This conversation provides a wealth of practical knowledge for data scientists and ML engineers looking to enhance their understanding of real-world ML operations and challenges in e-commerce. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction and Background 01:59 Owning the ML Pipeline 02:56 Deployment Process 05:58 Testing and Feedback 07:40 Different Deployment Strategies 11:19 Explainability and Feature Importance 13:46 Challenges in Forecasting 22:33 ML Stack and Tools 26:47 Orchestrating Data Pipelines with Airflow 31:27 Exciting Developments in ML 35:58 Recommendations and Closing Links Dwarkesh podcast with Anthropic and Gemini team members – https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken ➡️ Federico Bacci on LinkedIn – https://www.linkedin.com/in/federico-bacci/ ➡️ Federico Bacci on Twitter – https://x.com/fedebyes 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://x.com/TheRealDAGsHub ➡️ Dean Pleban: https://x.com/DeanPlbn
In this episode, Dean speaks with Michał Oleszak, an ML engineering manager at Solera. Michał shares insights into how his team is using machine learning to transform the automotive claims process, from recognizing vehicle damages in images to estimating repair costs. The conversation covers the challenges of deploying ML pipelines in production, managing data quality for computer vision tasks, and balancing technical implementation with business needs. Michał also discusses his approach to model evaluation, the benefits of monorepo architecture, and his views on exciting developments in self-supervised learning for computer vision. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction 00:42 Production for Machine Learning at Solera 03:49 Transitioning from Images to Structured Data 04:58 Combining Deep Learning and Non-Deep Learning Models 05:15 Deployment Process for Machine Learning Models 08:01 Challenges and Solutions in Monorepo Adoption 12:57 Evaluating Model and Pipeline Versions 21:57 Tools for ML Projects: Monorepo, Pants, GitHub Actions 24:04 Data Management and Data Quality 30:14 Challenges in ML Efforts: Data Quality 30:37 Excitement about Self-Supervised Learning and JEPA Architectures 34:45 Controversial Opinion: Importance of Statistics for ML 36:40 Recommendations Links 🌎Prisoners of Geography by Tim Marshall: https://www.amazon.com/Prisoners-Geography-Explain-Everything-Politics/dp/1501121472 ➡️ Michał Oleszak on LinkedIn – https://www.linkedin.com/in/michal-oleszak/ ➡️ Michał Oleszak on Twitter – https://x.com/MichalOleszak 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn
In this episode, I chat with Ljubomir Buturovic, VP of ML and Informatics at Inflammatix. We discuss using ML to diagnose infections and blood tests in the emergency room. We dive into the challenges of building diagnostic (classification) and prognostic (predictive) models, with takeaways related to building datasets for production use cases. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 What is Inflammatix and how do they use ML 7:32 Edge Device Deployment: The Future of Model Deployment 21:16 Navigating Regulatory Submission for Medical Products 26:01 Evolution of Regulatory Processes in ML for Medical Applications 30:18 Challenges and Solutions in ML for Medical Applications 34:00 The Future of AI in Clinical Care 40:25 The Overrated Concept of Interpretability in AI and ML 45:32 Recommendations Links 🌎📈 Our World in Data: https://ourworldindata.org/ 🚀 Profiles of the Future: https://www.amazon.com/Profiles-Future-Arthur-C-Clarke-ebook/dp/B00BY7GITK ➡️ Ljubomir Buturovic on LinkedIn – https://www.linkedin.com/in/ljubomir-buturovic-798156/ ➡️ Ljubomir Buturovic on Twitter – https://x.com/ljbuturovic 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn
In this episode, Idan Gazit, Senior Director of Research at GitHub Next, discusses his role in exploring strategic technologies and incubating long bet projects. He explains how the GitHub Next team chooses research projects and the process of exploration and theme selection. Idan also shares insights into the ML focus at GitHub Next and the challenges of evaluating the impact of AI products. He reflects on his journey into the AI space and provides advice for testing AI products in smaller organizations. Finally, he shares his thoughts on the future of AI interfaces. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction and Background 00:56 Choosing Research Projects at GitHub Next 06:09 ML Focus in GitHub Next 10:52 ML Work and the Leaky Abstraction 13:16 Idan's Journey into the AI Space 17:54 Evaluating the Impact of AI Products 24:36 Testing AI Products in Smaller Organizations 32:52 The Future of AI Interfaces 40:01 Transitioning from Prototype to Product 46:45 Challenges in the ML/AI Space 56:03 Recommendations ➡️ Idan Gazit on LinkedIn – https://www.linkedin.com/in/idangazit/ ➡️ Idan Gazit on Twitter – https://twitter.com/idangazit 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn
In this episode, I chatted with Uri Goren, founder and CEO of Argmax, about Machine Learning and the future of digital advertising in a world moving away from cookies due to privacy laws like GDPR and CCPA. We chat about challenges in maintaining personalized ads while respecting user privacy, and new methods like probabilistic models and contextual features to cover some of the gap left by removing cookies. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction 00:35 The Rise of Privacy Regulations 1:40 The Impact of Losing Cookies 2:48 Understanding Cookies 4:33 Reasons for the Decline of Cookies 8:47 ML Leveraging Cookies in Advertising 10:32 The Shift to Contextual Features 12:53 The Future of ML without Cookies 15:23 New and Old Ways of Generating Contextual Features 20:33 Regulatory Conspiracies 22:33 Unsolved Problems in ML and AI 24:39 Predictions for the Next Year in AI and ML 26:17 Controversial Take: Overuse of LLMs 28:03 Recommendations ➡️ Uri Goren on LinkedIn – https://www.linkedin.com/in/ugoren/ 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn
In this episode, I speak with Han-Chung Lee, a machine learning engineer with a lot of interesting takes on ML and AI. We dive into the buzz around natural language processing and the big waves in generative AI. We chat about how newcomers are racing through NLP’s history, mixing old school and new tech, and the shift towards smarter databases. Han-Chung breaks it down with his straightforward takes, making complex AI trends feel like coffee chat topics. It’s a perfect listen for anyone keen on where AI’s headed, minus the jargon. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Intro 0:41 State of NLP and LLMs 1:33 Repeating the past in NLP 3:29 Vector databases vs. classical databases 8:49 Choosing the right LLM for an application 12:13 Advantages and disadvantages of LLMs 16:10 Where LLMs are most useful 21:13 The dark side of LLMs and can we detect it? 25:19 Thoughts on LLM leaderboard metrics 31:19 Using LLMs in regulated industries 36:40 Creating a moat in the LLM world 40:20 Evaluating LLMs 44:20 Impact of LLM on non-English languages 48:35 Thoughts on MLOps and getting ML into production 56:48 The Hardest Unsolved Problem in ML and AI 59:09 Predictions for the Future of ML and AI 1:03:25 Recommendations and Conclusion ➡️ Han Lee on Twitter – https://twitter.com/HanchungLee ➡️ Han Lee on LinkedIn – https://www.linkedin.com/in/hanchunglee/ 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn
In this episode, I had the pleasure of speaking with Mila Orlovsky, a pioneer in medical AI. We delve into practical applications, overcoming data challenges, and the intricacies of developing AI tools that meet regulatory standards. Mila discusses her experiences with predictive analytics in patient care, offering tips on navigating the complexities of AI implementation in medical environments. This episode is packed with actionable advice and forward-thinking strategies, making it essential listening for professionals looking to impact healthcare through AI. Join our Discord community: https://discord.gg/tEYvqxwhah --- Timestamps: 00:00 Introduction and Background 4:03 Early Days of Machine Learning in Medicine 5:19 Challenges in Building Medical AI Systems 6:54 Differences Between Medical ML and Other ML Domains 15:36 Unique Challenges of Medical Data in ML 24:01 Counterintuitive Learnings on the Business Side 28:07 Impact and Value of ML Models in Medicine 29:41 The Role of Doctors in the Age of AI 38:44 Explainability in Medical ML 44:31 The FDA and Compliance in Medical ML 48:56 Feedback and Iteration in Medical ML 52:25 Predictions for the Future of ML and AI 53:59 Controversial Predictions in the Field of ML 56:02 Recommendations 57:58 Conclusion ➡️ Mila Orlovsky on LinkedIn – https://www.linkedin.com/in/milaorlovsky/ 🩺MeDS – Medical Data Science Israel Community – https://www.facebook.com/groups/452832939966464/ 🌐 Check Out Our Website! https://dagshub.com Social Links: ➡️ LinkedIn: https://www.linkedin.com/company/dagshub ➡️ Twitter: https://twitter.com/TheRealDAGsHub ➡️ Dean Pleban: https://twitter.com/DeanPlbn