<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:podcast="https://podcastindex.org/namespace/1.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>CTAIO Labs Podcast</title>
    <link>https://ctaio.dev/en/podcast/</link>
    <description>Hands-on AI experiments, tested and explained. Each season explores one piece of the AI stack — voice cloning, agent orchestration, agentic search — with real results, not theory. Hosted by Thomas Prommer.</description>
    <language>en</language>
    <lastBuildDate>Thu, 18 Jun 2026 06:00:00 GMT</lastBuildDate>
    <atom:link href="https://ctaio.dev/en/podcast/feed.xml" rel="self" type="application/rss+xml"/>
    <itunes:author>Thomas Prommer</itunes:author>
    <itunes:owner>
      <itunes:name>Thomas Prommer</itunes:name>
      <itunes:email>thomas@prommer.net</itunes:email>
    </itunes:owner>
    <itunes:image href="https://ctaio.dev/podcast/artwork/ctaio-labs-3000x3000.png"/>
    <itunes:category text="Technology"/>
    <itunes:explicit>no</itunes:explicit>
    <itunes:type>episodic</itunes:type>
    <podcast:locked>no</podcast:locked>
    <podcast:guid>a1b2c3d4-e5f6-7890-abcd-ef1234567890</podcast:guid>
    
    <item>
      <title>S02E01: I Built the Same Agent in 6 Orchestrators</title>
      <description>LangGraph vs CrewAI vs AutoGen vs OpenAI Swarm vs Pydantic AI vs LlamaIndex. Same 3-step research agent built in each — DX, cost, reliability scorecard.</description>
      <enclosure url="https://ctaio.dev/audio/podcast/CTAIO-Labs-S02E01-Agent-Orchestrators.mp3" length="40192833" type="audio/mpeg"/>
      <guid isPermaLink="false">ctaio-labs-s02e01-agent-orchestrators</guid>
      <pubDate>Thu, 18 Jun 2026 06:00:00 GMT</pubDate>
      <link>https://ctaio.dev/en/podcast/agentic-orchestration/agent-orchestrators/</link>
      <itunes:title>S02E01: I Built the Same Agent in 6 Orchestrators</itunes:title>
      <itunes:duration>41:51</itunes:duration>
      <itunes:episode>1</itunes:episode>
      <itunes:season>2</itunes:season>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>no</itunes:explicit>
      <itunes:image href="https://ctaio.dev/podcast/artwork/ctaio-labs-3000x3000.png"/>
      <content:encoded><![CDATA[<h2>The experiment</h2><p>The question every engineering leader is asking in 2026: which agent framework do we standardize on? So I built the <strong>same three-step research agent in six of them</strong> — LangGraph, CrewAI, AutoGen, OpenAI Swarm, Pydantic AI, and LlamaIndex — and compared developer experience, cost, reliability, and debuggability, apples to apples. The Season 2 opener of CTAIO Labs.</p><h2>The two camps</h2><p>The six split faster than expected into two camps, and that single design choice drives everything downstream — debuggability, reliability, how each fails under load:</p><ul><li><strong>Opinionated state</strong> — the framework owns state for you (LangGraph, Pydantic AI).</li><li><strong>Defer state to you</strong> — you manage it (OpenAI Swarm, CrewAI).</li></ul><h2>The six, in one line each</h2><ul><li><strong>LangGraph</strong> — agent as a directed graph, explicit typed state; most mature, best docs; shipped three breaking changes in six months.</li><li><strong>CrewAI</strong> — Crew/Agents/Tasks abstraction; the easiest on-ramp, ~30 lines of Python; defers state to you.</li><li><strong>AutoGen (Microsoft)</strong> — everything framed as conversations between agents; strong enterprise backing.</li><li><strong>OpenAI Swarm</strong> — intentionally minimal (Agents + handoffs); an educational reference, least production-ready.</li><li><strong>Pydantic AI</strong> — type safety at the agent layer; lowest impedance to standard Python engineering.</li><li><strong>LlamaIndex Agents</strong> — agent primitives on a RAG heritage; strongest when the agent is mostly retrieval.</li></ul><h2>The verdict</h2><p>Two frameworks separated from the pack, and it's close: <strong>LangGraph and CrewAI</strong> — two fundamentally opposed architectures that both win. CrewAI gets you to a working agent fastest; LangGraph gives you explicit control and typed state for when things go wrong. 
The tiebreaker isn't the tool, it's your team's priority. The other four are situational: Pydantic AI for typed-Python shops, LlamaIndex when it's mostly retrieval, AutoGen for Microsoft-enterprise stacks, and Swarm for learning, not production.</p><h2>What I got wrong</h2><p>I expected LangGraph's boilerplate to be a deal-breaker for small teams. It wasn't — the typed state graph earns its keep the moment an agent run goes sideways.</p><h2>Timestamps</h2><ul><li>00:00 — Intro</li><li>01:32 — The experiment: the same agent in six frameworks</li><li>05:34 — Two camps: opinionated state vs. defer-to-you</li><li>07:00 — The six, framework by framework</li><li>25:30 — What I got wrong predicting this</li><li>31:49 — The verdict: LangGraph &amp; CrewAI, neck-and-neck</li><li>40:56 — Outro &amp; what's next in Season 2</li></ul><h2>Links</h2><ul><li><a href='https://ctaio.dev/en/labs/agentic-orchestration/framework-comparison/'>Full lab report: I built the same agent in 6 frameworks</a></li><li><a href='https://ctaio.dev/en/labs/agentic-orchestration/topology-patterns/'>Next up — Monolith, Handoff, or Swarm? Three topologies in production</a></li><li><a href='https://ctaio.dev/en/labs/agentic-orchestration/observability-tools/'>Agent observability: Langfuse vs LangSmith vs Phoenix vs Helicone</a></li></ul>]]></content:encoded>
      <podcast:transcript url="https://ctaio.dev/podcast/transcripts/s02e01-agent-orchestrators.vtt" type="text/vtt"/>
    </item>

    <item>
      <title>S01E03: How to Clone Your Brain — 3 Second-Brain Paradigms Tested Head-to-Head</title>
      <description>Same corpus, same 7 questions, three architectures. Production RAG hallucinated. Gemini 1M-context aced the hardest question and ran out of budget on others. The /opt + Claude Code setup I already had won on faithfulness. Closes Season 1: Building My AI Twin.</description>
      <enclosure url="https://ctaio.dev/audio/podcast/CTAIO-Labs-EP03-Second-Brain.mp3" length="38282341" type="audio/mpeg"/>
      <guid isPermaLink="false">ctaio-labs-s01e03-second-brain</guid>
      <pubDate>Tue, 16 Jun 2026 06:00:00 GMT</pubDate>
      <link>https://ctaio.dev/en/podcast/my-ai-clone/second-brain/</link>
      <itunes:title>S01E03: How to Clone Your Brain — 3 Second-Brain Paradigms Tested Head-to-Head</itunes:title>
      <itunes:duration>39:52</itunes:duration>
      <itunes:episode>3</itunes:episode>
      <itunes:season>1</itunes:season>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>no</itunes:explicit>
      <itunes:image href="https://ctaio.dev/podcast/artwork/ctaio-labs-3000x3000.png"/>
      <content:encoded><![CDATA[<h2>The experiment</h2><p>Same corpus, same seven hard synthesis questions, three competing architectures — a head-to-head test of how to actually clone a knowledge base. The punchline: a plain folder of markdown files navigated by Claude Code beat a production-grade RAG pipeline, 5 wins to 2. Total cost of the brain experiment: $4.30.</p><h2>The three contenders</h2><ul><li><strong>Production RAG (Ask CTAIO)</strong> — OpenAI text-embedding-3-small → sqlite-vec → gpt-4.1-mini. The enterprise playbook. Won 2 of 7.</li><li><strong>Gemini 2.5 Pro long-context dump</strong> — 705,000 tokens pasted raw, no retrieval. Won 1 of 7 — the single hardest question every other system failed.</li><li><strong>File-based + Claude Code</strong> — markdown files in /opt plus an agent with read and grep (Karpathy's “LLM wiki”). Won 5 of 7.</li></ul><h2>Three failure modes</h2><ul><li><strong>RAG confabulates</strong> — it invented an ElevenLabs shutdown that exists nowhere, because semantically adjacent but factually disconnected chunks let the model bridge gaps from its pretraining.</li><li><strong>Long-context exhausts its budget</strong> — Gemini burned its entire output-token allocation computing attention over 705k tokens and timed out before answering the hardest questions.</li><li><strong>File-based is brittle but honest</strong> — a case-sensitive grep missed a heading on capitalization, and the agent then admitted it could not find the answer instead of hallucinating one.</li></ul><h2>Faithfulness vs fluency</h2><p>The crux: basic read/grep tools mechanically enforce faithfulness — stick to the corpus, flag your limits — while a RAG pipeline's generative step optimizes for fluency at the cost of truth. For a knowledge system, a tool that can say “I don't know” beats one that sounds confident and is wrong.</p><h2>The working-memory trap</h2><p>A five-turn probe: turn 1 said “never include dollar figures,” turn 5 returned them anyway. 
The cause is a six-message rolling history cap — turn 1 was popped off the stack. Sfeir's “working-memory gap,” demonstrated reproducibly.</p><h2>The economics</h2><p>The full second-brain experiment — the seven-question battery plus the working-memory probe across all three systems — cost exactly $4.30 in API calls. That is the brain experiment only; the voice (EP01) and video-avatar (EP02) layers carried their own separate, larger costs.</p><h2>Timestamps</h2><ul><li>00:00 — Intro</li><li>01:25 — The “digital Ferrari” trap</li><li>04:30 — The test: one corpus, seven hard questions</li><li>06:00 — Contender 1: production RAG (Ask CTAIO)</li><li>08:48 — Contender 2: Gemini 2.5 Pro long-context dump</li><li>10:59 — Contender 3: file-based + Claude Code (Karpathy)</li><li>14:24 — The scoreboard: markdown files win 5 of 7</li><li>18:53 — Failure 1: RAG confabulates an ElevenLabs shutdown</li><li>21:14 — Failure 2: Gemini exhausts its compute budget</li><li>23:10 — Failure 3: a case-sensitive grep misses a heading</li><li>28:03 — The working-memory trap: the 6-message window &amp; Sfeir's gap</li><li>33:10 — Faithfulness vs fluency: why “I don't know” wins</li><li>35:48 — The real cost: $4.30 for the brain experiment only</li><li>38:57 — Outro &amp; Season 2 preview</li></ul><h2>Links</h2><ul><li><a href='https://ctaio.dev/en/labs/my-ai-clone/second-brain/'>Full lab report: 3 second-brain paradigms tested head-to-head</a></li><li><a href='https://ctaio.dev/en/second-brain/'>AI Second Brain: the complete guide</a></li><li><a href='https://ctaio.dev/en/ask-ctaio/'>Ask CTAIO — the live RAG demo tested in this episode</a></li><li><a href='https://prommer.net/en/tech/build-ask-tom-rag-sqlite-vec/'>How the production RAG (Ask Tom) was built</a></li><li><a href='https://ctaio.dev/en/podcast/my-ai-clone/voice-cloning/'>S01E01: Voice cloning</a> &middot; <a href='https://ctaio.dev/en/podcast/my-ai-clone/video-avatars/'>S01E02: Video avatars</a></li></ul>]]></content:encoded>
      <podcast:transcript url="https://ctaio.dev/podcast/transcripts/ep03-second-brain.vtt" type="text/vtt"/>
    </item>

    <item>
      <title>S01E02: Build Your AI Twin — Clone Your Face and Body</title>
      <description>How to build your AI twin with video. I tested HeyGen Avatar V against Synthesia, Akool, Tavus and AI Studios. HeyGen Avatar V scored 7/10, the only render I would put on a paying client&apos;s homepage. Plus the multilingual finding that should stop any CTO from shipping non-English avatar video without manual transcript review.</description>
      <enclosure url="https://ctaio.dev/audio/podcast/CTAIO-Labs-EP02-Video-Avatars.mp3" length="44056887" type="audio/mpeg"/>
      <guid isPermaLink="false">ctaio-labs-ep02-video-avatars</guid>
      <pubDate>Wed, 29 Apr 2026 06:00:00 GMT</pubDate>
      <link>https://ctaio.dev/en/podcast/my-ai-clone/video-avatars/</link>
      <itunes:title>S01E02: Build Your AI Twin — Clone Your Face and Body</itunes:title>
      <itunes:duration>45:53</itunes:duration>
      <itunes:episode>2</itunes:episode>
      <itunes:season>1</itunes:season>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>no</itunes:explicit>
      <itunes:image href="https://ctaio.dev/podcast/artwork/ctaio-labs-3000x3000.png"/>
      <content:encoded><![CDATA[<h2>What I Tested</h2><p>Five AI video avatar engines tested hands-on against the same script and the same speaker:</p><ul><li><strong>HeyGen Avatar V</strong> — 15-sec clip → Diffusion Transformer clone (launched April 8, 2026). <strong>Rated 7/10</strong> on the four-dimension lens.</li><li><strong>Synthesia Selfie Avatar</strong> — Starter $18/mo, photo-trained. <strong>Rated ≤2/10</strong> — voice clone mispronounced my own surname.</li><li><strong>Akool Free Instant Avatar</strong> — Basic Free tier. <strong>Rated 3/10</strong>, share-only output.</li><li><strong>Tavus Personal Avatar / Replica</strong> — Starter $59/mo. <strong>Skipped</strong> — no free-trial path, persona library signaled wrong fit.</li><li><strong>AI Studios / DeepBrain</strong> — Free tier UI. <strong>Skipped</strong> — API gated to Enterprise sales.</li></ul><h2>The Hallucination Gap</h2><p>The single most important finding. HeyGen rewrites non-English scripts: my German render replaced "Roth" (my hometown) with "Fürz in Rot" — crude German slang. "prommer.net" rendered as "Proma.net." Spanish dropped to 32% character similarity, French to 23%. Synthesia preserves the script but mis-clones the voice ("Prommer" → "Prahm"). Both failures are silent. Neither platform's UI warns you.</p><h2>Key Findings</h2><ul><li>HeyGen Avatar V is the only render that cleared 7/10 for editorial — and only in English. Pair with manual transcript review for any localized output.</li><li>Synthesia's wardrobe-by-text-prompt feature is genuinely unique. Worth the upgrade if your audience does not know the speaker's voice.</li><li>Akool Free is the cheapest path to a custom AI avatar at $0 — but the output is share-only and voice-identity is the disqualifier.</li><li>Tavus is architecturally a real-time conversational video API. Wrong fit for editorial; right fit for customer-support video agents.</li><li>EU AI Act Article 50 deadline is August 2, 2026. 
No platform tested ships fully machine-detectable watermarking.</li></ul><h2>The Practitioner Gap</h2><p>I cross-referenced this test against what growth teams actually use in production (Advise Slack corpus, 30 channels, ~100k messages, Q1 2026). The result: enterprise vendors are invisible in the practitioner layer. HeyGen owns talking-head VSLs. Sora through Arcads owns ecom UGC ads. Synthesia, Colossyan, Tavus, AI Studios — zero mentions across the corpus.</p><h2>Total Experiment Spend</h2><p>$47 across the test window. $29 HeyGen Creator + $18 Synthesia Starter. Akool and AI Studios at $0. Tavus evaluated but skipped before payment.</p><h2>Timestamps</h2><ul><li>00:00 — Intro</li><li>TBD — The bottom-line scoreboard</li><li>TBD — Hallucination Gap: HeyGen rewrites scripts, Synthesia mis-clones voices</li><li>TBD — Platform deep dives</li><li>TBD — Sales-friction report (Tavus + AI Studios skip rationale)</li><li>TBD — CTO Playbook + EU AI Act compliance</li><li>TBD — Outro</li></ul><h2>Links</h2><ul><li><a href='https://ctaio.dev/en/labs/my-ai-clone/video-avatars/'>Full article: Build Your AI Twin — Clone Your Face and Body</a></li><li><a href='https://ctaio.dev/en/labs/my-ai-clone/compare/heygen-vs-synthesia/'>HeyGen vs Synthesia: I Tested Both. HeyGen Wins for Editorial</a></li><li><a href='https://ctaio.dev/en/labs/my-ai-clone/guides/heygen-alternatives/'>7 Best HeyGen Alternatives in 2026</a></li><li><a href='https://ctaio.dev/en/podcast/my-ai-clone/voice-cloning/'>S01E01: Voice cloning (Part 1 of Building My AI Twin)</a></li><li><a href='https://ctaio.dev'>CTAIO — AI strategy for tech leaders</a></li><li><a href='https://prommer.net'>Thomas Prommer</a></li></ul>]]></content:encoded>
      <podcast:transcript url="https://ctaio.dev/podcast/transcripts/ep02-video-avatars.vtt" type="text/vtt"/>
    </item>

    <item>
      <title>S01E01: I Cloned My Voice With 8 AI Engines — Here&apos;s What Won</title>
      <description>ElevenLabs, Cartesia, Coqui and 5 more voice cloning engines tested head to head. Audio demos, cost breakdown, and blind A/B test results.</description>
      <enclosure url="https://ctaio.dev/audio/podcast/CTAIO-Labs-EP01-Voice-Cloning.mp3" length="44038226" type="audio/mpeg"/>
      <guid isPermaLink="false">ctaio-labs-ep01-voice-cloning</guid>
      <pubDate>Mon, 23 Mar 2026 06:00:00 GMT</pubDate>
      <link>https://ctaio.dev/en/podcast/my-ai-clone/voice-cloning/</link>
      <itunes:title>S01E01: I Cloned My Voice With 8 AI Engines — Here&apos;s What Won</itunes:title>
      <itunes:duration>46:26</itunes:duration>
      <itunes:episode>1</itunes:episode>
      <itunes:season>1</itunes:season>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>no</itunes:explicit>
      <itunes:image href="https://ctaio.dev/podcast/artwork/ctaio-labs-3000x3000.png"/>
      <content:encoded><![CDATA[<h2>What We Tested</h2><p>Eight voice cloning engines evaluated across quality, cost, training data requirements, and multilingual support:</p><ul><li>ElevenLabs</li><li>Cartesia</li><li>Coqui / XTTS</li><li>LMNT</li><li>Fish Audio</li><li>StyleTTS2</li><li>OpenAI</li><li>Deepgram</li></ul><h2>Key Findings</h2><ul><li>The open-source option (Coqui XTTS) needed just 5 seconds of audio</li><li>The winner (Cartesia) needed 54 minutes but produced a clone that fooled colleagues in blind tests</li><li>Cost ranged from free (open source) to $99/month (enterprise)</li><li>Multilingual support varied wildly — only 2 engines handled German well</li></ul><h2>Timestamps</h2><ul><li>00:00 — Introduction &amp; credentials</li><li>02:07 — AI voice clone bridge (Cartesia demo)</li><li>02:59 — NotebookLM deep dive begins</li><li>42:00 — Key takeaways</li><li>45:26 — Outro &amp; next episode preview</li></ul><h2>Links</h2><ul><li><a href='https://ctaio.dev/en/labs/my-ai-clone/voice-cloning/'>Full article with comparison table and audio demos</a></li><li><a href='https://ctaio.dev'>CTAIO — AI strategy for tech leaders</a></li><li><a href='https://prommer.net'>Thomas Prommer</a></li></ul>]]></content:encoded>
      <podcast:transcript url="https://ctaio.dev/podcast/transcripts/ep01-voice-cloning.vtt" type="text/vtt"/>
    </item>
  </channel>
</rss>