Deck Architect v4

Tier 3 · What we built · 8 min read

Before this, read:

The deep-research agent — the research workflow that feeds most deck briefs

The deck architect started as a way to turn research output into something presentable without spending 2 hours in PowerPoint. By v4 it generates McKinsey-style decks with a three-pass visual QA loop that catches typographic and layout problems before the slide leaves the system. It has a known limitation — it’s typographic, not photographic, so any slide that needs photorealistic imagery requires external art — but within that constraint it produces presentation-quality output.

The pipeline

agents/deck_architect_v4/orchestrator.py runs four steps:

1. Content planning: the LLM receives the brief and produces a structured content_plan: one entry per slide, with title, narrative type, content, and chart specification. The planner prompt (prompts/planner-v4.md) emphasizes action-titles, one-idea-per-slide structure, and source citations where available. Palette selection (which visual theme to use) happens here too: the LLM picks from the available palettes, or a palette_override argument forces a specific one.
2. Template filling: template_editor.py takes the content plan and fills the appropriate slide template for each narrative type (title, statement, chart, three-column, quote, etc.). Content goes in; layout is handled by the template. The LLM does not write layout code.
3. Render: python-pptx generates the .pptx file. Saved to ~/agent-system/output/decks/.
4. Visual QA loop: render the .pptx as images, read the images via LLM vision, critique for layout problems (title collisions, number overlaps, chart-as-dots, text overflow), generate fixes, apply fixes to the source templates, re-render. Repeat up to 3 rounds.
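
The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of orchestrator.py or template_editor.py: the SlidePlan fields mirror the content plan described above, but the class name, the TEMPLATES mapping, the template names, and fill_templates are hypothetical stand-ins, and the sample slide titles are made up. Rendering and QA are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical shape of one content_plan entry. The field names are
# assumptions based on the description above, not the real schema.
@dataclass
class SlidePlan:
    title: str
    narrative_type: str
    content: list[str] = field(default_factory=list)
    chart_spec: Optional[dict] = None


# Hypothetical mapping from narrative type to a template name.
TEMPLATES = {
    "title": "title_slide",
    "statement": "statement_slide",
    "chart": "chart_slide",
    "three-column": "three_column_slide",
    "quote": "quote_slide",
}


def fill_templates(plan: list[SlidePlan]) -> list[dict]:
    """Pair each planned slide with its template; layout stays in the template."""
    filled = []
    for index, slide in enumerate(plan, start=1):
        if slide.narrative_type not in TEMPLATES:
            raise ValueError(
                f"slide {index}: unknown narrative type {slide.narrative_type!r}"
            )
        filled.append({
            "template": TEMPLATES[slide.narrative_type],
            "title": slide.title,
            "content": slide.content,
            "chart_spec": slide.chart_spec,
        })
    return filled


# Made-up example plan with two slides.
plan = [
    SlidePlan("A harness turns a chatbot into a system", "statement"),
    SlidePlan("Agent packages by area", "chart", chart_spec={"kind": "bar"}),
]
filled = fill_templates(plan)
```

The point of the split is that the planner only ever emits data like SlidePlan; anything about position, font size, or spacing lives in the template that the narrative type selects.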

The output is a .pptx file and, optionally, a Google Slides upload via the gog slides create-from-markdown path.

The visual QA loop bug

The QA loop was broken for a period after v4 shipped — not in concept, but in implementation. The llm_vision_call function in agents/shared/llm_client.py routed through OpenRouter using the OpenAI-format image block (image_url). Anthropic’s API uses a different structure for image inputs (source.type: "base64", source.media_type). The translation layer — _to_anthropic_content() — was never implemented in _call_anthropic_direct.

The result: when OpenRouter was unavailable and the fallback to Anthropic direct fired, the vision call failed silently. The QA loop ran, received no useful critique, and reported clean — even when slides had real visual problems.
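
The failure mode is easier to see in code. The sketch below is an illustration under assumptions, not the repository's loop: run_qa, critique_fn, apply_fixes, and QAError are hypothetical names. It shows a bounded loop (up to 3 rounds, as described above) that treats a missing critique as an error rather than as a clean result, which is one way to keep a failed vision call from passing as a clean deck.

```python
from typing import Callable, Optional

MAX_ROUNDS = 3  # the QA loop described above repeats up to 3 rounds


class QAError(RuntimeError):
    """Raised when the vision critique is unusable (hypothetical)."""


def run_qa(
    critique_fn: Callable[[int], Optional[list[str]]],
    apply_fixes: Callable[[list[str]], None],
) -> int:
    """Run up to MAX_ROUNDS of critique-and-fix; return the rounds used.

    critique_fn returns a list of problems ([] means the model looked and
    found none) or None when the vision call produced nothing usable.
    """
    for round_no in range(1, MAX_ROUNDS + 1):
        problems = critique_fn(round_no)
        if problems is None:
            # A failed call must not be mistaken for a clean deck.
            raise QAError(f"round {round_no}: no critique returned")
        if not problems:
            return round_no  # genuinely clean
        apply_fixes(problems)
    return MAX_ROUNDS
```

The design choice is the distinction between an empty list and None: "looked and found nothing" and "never looked" are different outcomes and should not share a return value.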

The fix (commit a8d5a93, merged 2026-06-08) added _to_anthropic_content() to the direct-call path. It translates OpenAI image_url blocks into proper Anthropic image content blocks before the request is sent. It is one function of about 15 lines. The CHANGELOG entry:

“Root-caused+fixed the deck-engine QA bug (llm_client._call_anthropic_direct never translated OpenAI image_url→Anthropic image blocks; added _to_anthropic_content — merged a8d5a93, fixes ALL future deck QA).”

After the fix, the QA loop caught real problems on the flagship deck run: chart-as-dots rendering (a chart that rendered as individual scattered points instead of a bar chart), title collisions on two slides, and a number overlap on slide 17. All three were fixed by the build agent before the deck shipped.

The flagship deck

The first fully QA’d deck was “Chatbot to Nerve Center” — a 19-slide harness-engineering presentation built on 2026-06-08:

Brief came from the deep-research run on AI coding agents (370 sources)
Three movements: what a harness is, what the org-scale system looks like, the engineering decisions that made it work
Visual QA ran 3 rounds; real breakage was caught and fixed in round 2
Numbers were filesystem-audited before inclusion: 30 agent packages, 907 py files, 155 crons (the research figures, which were higher, were discarded)
Uploaded to Google Slides (1nS46zNpX28MtUMeaLXgoa2Em_HL5-2YT1rNggOCbb9g) and sent as .pptx
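
A filesystem audit like the one mentioned in the list above can be as simple as counting what is on disk. A minimal sketch, assuming a package is a directory directly under the root that contains an __init__.py; the function name and that definition are hypothetical, and this is not the audit that was actually run.

```python
from pathlib import Path


def audit_tree(root: Path) -> dict[str, int]:
    """Count Python files and top-level packages under root (sketch)."""
    py_files = sum(1 for p in root.rglob("*.py") if p.is_file())
    packages = sum(
        1
        for child in root.iterdir()
        # Assumption: a "package" is a top-level directory with __init__.py.
        if child.is_dir() and (child / "__init__.py").is_file()
    )
    return {"packages": packages, "py_files": py_files}
```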

The CEO verified the cover, slide 12, and slide 17 renders by eye before sending. Those are the three slides the QA loop flagged in earlier rounds; verifying them manually confirmed the fixes held.

What “typographic not photographic” means

The deck architect generates slides where the visual content is typography and charts. It does not generate photorealistic images, photographs, or logo art. This is a deliberate constraint, not a missing feature.

The reason: the engine has no image-generation step. Ideogram or DALL-E integration would be an add-on, and adding it would create the same IP-risk surface that the merch operator’s IP gate guards against. A slide deck that needs a photo of a specific person or a brand logo requires that asset to be sourced separately and injected as a path reference.
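
As a rough illustration of what "injected as a path reference" could involve, the sketch below validates an externally sourced asset before a slide uses it. The resolve_asset helper and the allowed-suffix list are hypothetical and not taken from the engine.

```python
from pathlib import Path

# Assumption for this sketch: only common raster formats are accepted.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def resolve_asset(path_ref: str) -> Path:
    """Resolve an externally sourced image path and check it is usable."""
    path = Path(path_ref).expanduser().resolve()
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported asset type: {path.suffix or '(none)'}")
    if not path.is_file():
        raise FileNotFoundError(f"asset not found: {path}")
    return path
```

Checking the path up front means a missing or mistyped asset fails before rendering rather than producing a slide with a hole in it.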

Within the typographic constraint, the engine handles charts (bar, line, scatter), bold statement slides, multi-column layouts, and quote slides well. McKinsey-style presentations are mostly text and charts anyway — the constraint covers the majority of use cases.

Running it

python3.12 -m agents.deck_architect_v4 "Brief here: three-movement story of how a Telegram bot became an org..."
# or with a palette override:
python3.12 -m agents.deck_architect_v4 "Brief..." --palette consulting_dark --output-name my-deck

The deck lands in ~/agent-system/output/decks/my-deck.pptx. If gog is configured, append --upload to get a Google Slides link.
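
The flags above suggest a command-line interface along the following lines. This is a guess at the shape using argparse, not the module's actual entry point: only the positional brief and the --palette, --output-name, and --upload flags come from the text, and the defaults and help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the flags shown above (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="agents.deck_architect_v4")
    parser.add_argument("brief", help="free-text brief for the deck")
    parser.add_argument(
        "--palette",
        default=None,  # assumption: None lets the planner pick a palette
        help="force a specific palette instead of the planner's choice",
    )
    parser.add_argument(
        "--output-name",
        default="deck",  # assumption: the real default is not documented here
        help="base name of the generated .pptx file",
    )
    parser.add_argument(
        "--upload",
        action="store_true",
        help="also upload the deck to Google Slides",
    )
    return parser


args = build_parser().parse_args(
    ["Brief...", "--palette", "consulting_dark", "--output-name", "my-deck"]
)
```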

Next: The agent that manages JD’s X presence — with killswitches, daily caps, a self-learning content strategy, and a follow engine — is Socrates, the X/Twitter autopilot.