The Mostly Unstructured Podcast

Hosted by The Mostly Unstructured Podcast

Technology · Interviews guests

Episodes

Latest episode: Jun 2026

About the show

Mostly Unstructured is a technology podcast that brings commentary, ideas, and insights on unstructured documents and data captured with intelligent document processing, presented in a mostly unstructured format.

Listen to episodes

46 recent

June 11, 2026 · Episode 9 · 28 min

S2 Ep9: Discoveries at the intelligence layer with Mike Askren

Content's value is in the intelligence it brings, regardless of what system it's found in. But there is a lot of enterprise content across many, many systems.

On the Mostly Unstructured Podcast, KeyMark CMO Clay Tuten sits down with Mike Askren, VP of Product at Hyland, to discuss how document management and ECM are becoming an intelligence layer for agentic AI, and which problems are the right size and scale to tackle with agents.

Topics explored:
• Why has enterprise value moved from storing and securing content to extracting intelligence from it?
• How content federation connects AI services to content across on-prem, cloud, and hyperscaler systems.
• What an enterprise context engine does, and why the relationships between documents matter more than the engine itself.
• Why agentic governance matters so much.
• Monitoring, coaching, and shutting down agents that hallucinate or run on stale instructions.
• Why the highest-ROI AI work comes from the processes that are least exciting, but have the highest volume of occurrence.

Questions this episode answers:
• What is the intelligence layer in enterprise content management?
• How much enterprise data is unstructured, and why does it matter for AI?
• What is content federation and why is it needed for enterprise AI?
• What is agent governance and how is it different from data governance?
• How do you get ROI from AI without replacing your existing systems?
• Where should a CIO start when moving ECM into an AI intelligence layer?
• What is intelligent document processing (IDP) and how does it relate to agentic automation?

Subscribe for more AI talk on content intelligence, IDP, and agentic AI from the team at KeyMark, or reach out if anything caught your ear.

Timestamps:
00:00 – From storage to intelligence: the ECM shift
01:58 – What "unstructured content" really means
03:01 – Mike's role at Hyland and content federation
04:11 – The content-fueled agentic enterprise
06:45 – Why 70–90% of enterprise data goes untapped
08:03 – Agentic governance and context you can trust
09:25 – Human-in-the-loop feedback and coaching agents
10:22 – The control tower: monitoring and stopping agents
12:03 – Agents as digital employees
13:45 – Advice for CIOs under pressure
15:23 – Start small: the attainable win, not the moonshot
18:39 – Where the ROI actually hides
19:47 – Practical outcomes: claims, HR, government
21:03 – First steps into the intelligence layer
24:45 – From IDP to agentic automation to new workflows
27:19 – Slow down, ask questions

June 4, 2026 · Episode 8 · 18 min

S2 Ep8: Building Trust and Domain Agents with Matisha Ladiwala

Ed sits down for a conversation with Matisha Ladiwala, Vice President and General Manager for Hyland's Content Intelligence Cloud (CIC), to discuss building, governing, and deploying domain-specific agents. They cover how governance helps AI pilots succeed through trust, and dig into Hyland's federation strategy, the agent mesh architecture, MCP as a semantic layer, and domain-specific knowledge graphs.

May 13, 2026 · Episode 7 · 37 min

S2 Ep7: Are Data Lakes Drying Up?

Data lakes remain sources of truth, but AI accesses data beyond what lives in a lake to acquire valuable metadata and semantic understanding. In this episode of the Mostly Unstructured Podcast, KeyMark CMO Clay Tuten sits down with Josh Heller of Crushable.ai to dismantle the myth that a lake has to be the final resting place of data.

Read the companion article for info on data at rest vs data in motion, and integration tactics: https://www.keymarkinc.com/have-the-best-practices-for-data-integration-changed/

TOPICS COVERED IN THIS EPISODE:
• What separates a data warehouse from a data lake
• Why the semantic layer is the most underrated shift in enterprise AI right now
• How AI vectorizes unstructured data, and why that changes the data-readiness conversation
• The difference between access and storage, and which matters more
• How "conversational data queries" replace legacy BI dashboards across every level of an org

QUESTIONS THIS EPISODE ANSWERS:
• Is a data lake still necessary for enterprise AI?
• How do you operationalize data not in a lake?
• What is a semantic layer?
• How do you know when your data foundation is good enough to move forward with AI?
• What should enterprise leaders audit before starting an AI initiative?
• Why does agentic AI make data governance more important?

WHO THIS IS FOR:
CDOs, CIOs, and operations leaders looking to understand whether they should double down on data lakes or embrace MCP connectors, and anyone evaluating enterprise AI, intelligent automation, or agentic AI during a fundamental shift in the understanding of data access.

Subscribe to the Mostly Unstructured Podcast for more conversations on enterprise AI, data readiness, and intelligent automation from the team at KeyMark.

April 17, 2026 · Episode 6 · 43 min

S2 Ep6: Agentic AI Needs Governance: Discussing Data Readiness, Audit Trails, and Trust

Data governance is the foundation of enterprise AI. If your data is not AI-ready, your copilots, agents, and automations can return bad answers, expose risk, and make the wrong decisions faster.

In this episode of the Mostly Unstructured Podcast, Clay and Ed break down why enterprise AI success isn't just about model performance, but starts with data readiness, traceability, audit trails, validation, policy, and clear ownership across the business.

Read our KeyMark companion article: https://www.keymarkinc.com/managing-a...

Topics explored in this episode:
• What data governance for AI actually means
• Why many AI failures start with governance failures
• How bad data, shadow AI, and weak controls create enterprise risk
• Why traceability, monitoring, auditing, and validation matter before agents make decisions
• How bias, compliance, privacy, and trust affect enterprise AI rollouts
• What CIOs, CDOs, IT leaders, operations leaders, and compliance teams should ask before scaling AI

In this episode, Clay and Ed address key AI questions:
• What is data governance for AI?
• Why is data governance important for enterprise AI?
• What makes enterprise data AI-ready?
• Who owns AI governance in an organization?
• How do you reduce AI risk without slowing innovation?
• How do you govern agentic AI responsibly?

If you are evaluating enterprise AI, agentic AI, intelligent document processing, or AI automation, this episode offers guidance on starting well: data governance for accurate data, and AI governance for output guardrails.

March 6, 2026 · Episode 5 · 26 min

S2 Ep5: Does a business have any business training an LLM?

Enterprise LLMs: RAG vs Fine‑Tuning, IDP & Governance

In this episode of the Mostly Unstructured podcast, Ed and Clay discuss whether it’s better to train a domain‑specific LLM or leverage foundational models like ChatGPT, Gemini and Claude. They explain the trade‑offs between fine‑tuning and retrieval‑augmented generation (RAG), and why Intelligent Document Processing (IDP) is vital for turning unstructured data into usable context.

In this discussion, we cover:
• Why training your own LLM is risky and often unnecessary compared to adopting and building from a foundational model.
• How retrieval‑augmented generation (RAG) delivers more accurate results than simple fine‑tuning.
• The importance of Intelligent Document Processing (IDP) for ingesting unstructured data and building domain context.
• Real‑world lessons on AI governance, including the Air Canada bereavement‑policy chatbot case.
• Managing bias, hallucinations and toxicity in enterprise models.
• Measuring your return on AI investment.

For those thrown by the excessive acronyms, let's define:
LLM = Large Language Model
RAG = Retrieval‑Augmented Generation
IDP = Intelligent Document Processing

For more insights on enterprise AI for data intelligence, visit our website and read our blog on training an LLM referenced in the episode.
Website: https://www.keymarkinc.com/
Blog: https://www.keymarkinc.com/how-to-tra...

February 17, 2026 · Episode 4 · 30 min

S2 Ep4: A Selectively Deep Analysis of AI Analyst Predictions

Deep Analysis is a well-informed and highly respected analyst team with a lot to say about AI in 2026. Ed and Clay are informed, and we respect them a lot internally, but mostly – they’re a little unstructured. In Ep. 4, Ed and Clay dive deep into some of the trends outlined in Deep Analysis’s 2026 trends report.

January 23, 2026 · Episode 3 · 21 min

S2 Ep3: You know me I'm down with MCP

The Model Context Protocol is your USB-C for AI API tool calling. Ed and Clay talk about MCP, the protocol introduced by Anthropic: its rapid adoption across AI vendors, the opportunities it creates, and the headaches it induces for IT.

January 9, 2026 · Episode 2 · 31 min

S2 Ep2: IDP Due Diligence in a Crowded Marketplace

Ed and Clay discuss the explosion of IDP entrants in the AI document processing and data analysis marketplace, cover questions and considerations for assessing the viability of an IDP solution and vetting demos, and review a helpful tool for performing IDP due diligence.

December 3, 2025 · Episode 1 · 17 min

S2 Ep1: Intelligent Document Processing - The Keystone Tech

The first episode of KeyMark's new podcast series, Mostly Unstructured, explores a range of topics related to intelligent document processing for the strategic transformation of unstructured data. In E01, "Intelligent Document Processing - The Keystone Tech," Ed is joined by Colin Toomey to lay some foundational groundwork for IDP as a fundamental necessity for data pipelines, workflows, and artificial intelligence systems.

September 26, 2024 · Episode 37 · 17 min

S1 Ep37: Intelligent Data Capture Series: The Then and Now of OCR and AI, Part 1

In the second episode of our Intelligent Data Capture Series, KeyMark CMO Clay Tuten is joined by CEO Jim Wanner and COO Cameron Boland to discuss what exactly Data Capture is. Dive into the basics of OCR, extraction, validation, separation, and deep learning.

Make sure to check out part 2 for the evolution of Capture and why it is critical to businesses.