AgentTree Merch: an AI Operator That Runs a Store

Tier 3 · What we built 10 min read


Building an autonomous e-commerce business sounds like hype. Here’s what it looks like in practice: eleven Python modules, a fail-closed IP gate, a Stripe-to-Printful fulfillment pipeline, and hard walls that keep money and identity decisions with a human. The hype part is omitted. The receipts are in the changelog.

AgentTree Merch (shop.agenttree.army) is a print-on-demand t-shirt store that an AI operator manages day-to-day. The operator handles the full production cycle: ideate concepts → screen for IP risk → generate art → list products with Printful → handle Stripe checkout → drain paid orders to Printful for fulfillment → report results.

The source lives at agents/merch/ — 11 modules, not one script:

design_agent.py

Daily concept loop: trend-mine 5–10 themes, run every concept through the IP gate before any image spend, call Ideogram to generate the art, stage drafts with a Telegram ping to JD.

ip_safety.py

The hard wall that keeps the store legal. Two layers: a deterministic blocklist and a fail-closed LLM classifier. Covered in detail in the next article.

catalog_agent.py

Lists approved designs as real Printful products, resolves size variants (S/M/L/XL/2XL), persists sync IDs and variant maps to Supabase.

order_webhook.py

Receives signed Stripe webhook events, resolves the ordered variant, writes to merch_orders, queues for the drain.

fulfill_drain.py

Every 2 minutes: picks up paid orders, submits them to Printful, persists retry counts and last-error. Self-reports to JD on any ship or failure.

operator.py

The CEO decide-brain: sense → decide → act → review → learn. Runs the $10k/month mission. The three hard walls live here.

The remaining five modules cover analytics, marketing, approve gating, notifications, and standup reporting. The storefront is a Next.js app (agents/merch/storefront/) backed by the Supabase catalog.
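To make the "fail-closed" idea in ip_safety.py concrete, here is a minimal sketch of a two-layer gate. It is not the store's actual code: the blocklist entries, the `classify` callable, and the `"safe"` verdict string are all assumptions made for illustration. The point is the shape: a concept passes only when the blocklist misses and the classifier returns an unambiguous pass, and anything else (an uncertain verdict, an empty answer, an exception) is a rejection.

```python
from typing import Callable, Optional

# Hypothetical entries; the real blocklist in ip_safety.py is not shown in this article.
BLOCKLIST = {"harry potter", "star wars"}


def blocklist_hit(concept: str) -> bool:
    """Layer 1: deterministic substring match against known IP terms."""
    text = concept.lower()
    return any(term in text for term in BLOCKLIST)


def ip_gate(concept: str, classify: Callable[[str], Optional[str]]) -> bool:
    """Return True only when both layers clearly pass.

    `classify` stands in for the LLM classifier (an assumed interface).
    It is expected to return "safe" for a clean concept; every other
    outcome is treated as a rejection.
    """
    if blocklist_hit(concept):
        return False
    try:
        verdict = classify(concept)
    except Exception:
        # Fail closed: a classifier outage rejects the concept.
        return False
    # Fail closed: "unsure", None, or any unexpected value also rejects.
    return verdict == "safe"
```

Because the gate runs before any image spend, a rejected concept costs nothing beyond the classifier call.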

This went from concept to first live sale in roughly one day of agent build work.

  • 2026-06-08, 16:56 — deep-research + BUILD-PLAN.md written: stack chosen (Printful Manual/API + Next.js + Stripe + Vercel), agent architecture specced, IP-safety gate established as a non-negotiable requirement.
  • 2026-06-08, 17:19 — 11-module agents/merch/ package built and merged (commit e4edd0c). 50 tests green. Storefront deployed to Vercel. Checkout gated (charges off) pending JD’s one-time Printful and Stripe setup (~35 minutes of manual steps).
  • 2026-06-09, 06:39 — merch-expert consultant wired in.
  • 2026-06-09, 07:27 — autonomous CEO-operator shipped (covered in T3-E4).
  • 2026-06-09, 09:16 — store went live for test: MERCH_CHECKOUT_LIVE=1, MERCH_LIVE_ORDERS=1, drain cron installed (every 2 min, crontab grew from 477 to 480 entries).
  • 2026-06-09, 13:34 — first real sale proved end-to-end. Printful order 161851271 placed, $17.31 billed to Printful, status: pending fulfillment.

The daily cycle looks like this:

Trend-mine concepts (LLM, evergreen niches: coffee, dev humor, fitness, cats…)
IP safety gate (blocklist + LLM classifier — both must pass)
Ideogram generates typographic tee art (~$0.05/image)
Printful mockup API
Stage to Supabase ledger (status=pending_review) + Telegram ping to JD
JD: `approve N` or `reject N` (the only required human touch on the design side)
catalog_agent lists approved design as real Printful product, persists variant map
Storefront card appears at shop.agenttree.army
Customer selects size → Stripe Checkout ($30/shirt, server-side variant resolution)
Stripe webhook (signed) → order_webhook.py → merch_orders table
fulfill_drain cron (every 2 min): paid orders → Printful → shipped to customer
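The last step of the cycle, the drain, can be sketched as a single pass over paid orders. This is an illustration under assumptions, not fulfill_drain.py itself: the `Order` fields, the `"submitted"` status, and the `submit` and `notify` callables (stand-ins for the Printful call and the ping to JD) are hypothetical names. What it shows is the behaviour described above: failures are counted and recorded rather than dropped, and the order stays eligible for the next run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Order:
    id: str
    status: str = "paid"              # hypothetical states: paid -> submitted
    retry_count: int = 0
    last_error: Optional[str] = None


def drain_once(
    orders: Iterable[Order],
    submit: Callable[[Order], None],   # stand-in for the Printful submission
    notify: Callable[[str], None],     # stand-in for the report to JD
) -> None:
    """Process every paid order once; one failure never blocks the others."""
    for order in orders:
        if order.status != "paid":
            continue
        try:
            submit(order)
        except Exception as exc:
            # Persist the failure so the next cron run can retry with context.
            order.retry_count += 1
            order.last_error = str(exc)
            notify(f"order {order.id} failed (attempt {order.retry_count}): {exc}")
        else:
            order.status = "submitted"
            order.last_error = None
            notify(f"order {order.id} submitted")
```

Run from cron every 2 minutes, a pass like this is naturally idempotent for orders that already left the `paid` state, since they are skipped.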

What “end-to-end autonomous” actually means

Here is the honest version, because the asterisks matter.

The operator genuinely handles: concept generation, IP screening, art generation, product listing, size variants, storefront display, checkout session creation, order ingestion, and Printful fulfillment. Nobody needs to babysit the pipeline once a design is approved.

The hard walls — enforced in code, tested in CI, not just documented — are:

  1. Money: no real charge, ad spend, payout, or reorder without MERCH_LIVE_ORDERS=1 and confirm=True. Flipping those flags requires JD’s explicit action.
  2. Identity: the operator never touches bank accounts, Stripe KYC, domain registrar, or any identity-bearing account.
  3. IP sign-off: every design concept passes through the IP gate, and if the gate is uncertain it rejects. Design drops still need JD’s `approve N`.
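The money wall can be pictured as a guard that every money-moving function calls first. The sketch below is an assumption about shape, not the code in operator.py: `MoneyWallError`, `require_money_approval`, and `place_order` are hypothetical names, and only the `MERCH_LIVE_ORDERS=1` flag and the `confirm=True` argument come from the description above. Both conditions must hold; either one alone is a refusal.

```python
import os
from typing import Mapping, Optional


class MoneyWallError(RuntimeError):
    """Raised when a money-moving action lacks one of the two approvals."""


def require_money_approval(
    confirm: bool = False, env: Optional[Mapping[str, str]] = None
) -> None:
    """Refuse unless MERCH_LIVE_ORDERS=1 is set AND the caller passed confirm=True."""
    env = os.environ if env is None else env
    if env.get("MERCH_LIVE_ORDERS") != "1" or confirm is not True:
        raise MoneyWallError("blocked: needs MERCH_LIVE_ORDERS=1 and confirm=True")


def place_order(
    order_id: str, *, confirm: bool = False, env: Optional[Mapping[str, str]] = None
) -> str:
    """Hypothetical money-moving action; the guard runs before anything else."""
    require_money_approval(confirm, env)
    return f"placed {order_id}"
```

Defaulting `confirm` to `False` means a caller that forgets the argument is blocked, which is what makes a wall like this testable in CI: the tests assert that each missing approval raises.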

The first sale was JD’s own test buy: an Out Of Usage tee, White, XL, shipped to Draper UT. That’s not a caveat that undercuts the claim; it’s the proof that the pipeline works end-to-end. Printful order 161851271 is real money, real fulfillment, a real shirt in a box.

Before going live, a two-layer agentic test harness verified the pipeline without moving any money:

  • Layer 1: Playwright frontend QA — 23 product cards rendered, checkout gated, size selection working.
  • Layer 2: synthetic signed Stripe webhook buy-to-ship — proved the fulfillment path via Printful’s estimate_order_costs endpoint (no actual order placed). Margin verified: $14.31/shirt, 55% at test pricing.

The harness runs twice daily via cron (9am/6pm) and only notifies on failure. By the time real money flowed, the pipeline had already proved itself on a dozen synthetic orders.
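A synthetic signed webhook, as used in Layer 2, only needs the same signing scheme Stripe documents for its `Stripe-Signature` header: an HMAC-SHA256 over `"{timestamp}.{payload}"` keyed with the endpoint secret, sent as `t=<timestamp>,v1=<hex digest>`. The sketch below implements that scheme with the standard library so the harness side (signing) and the receiver side (verifying) can be seen together. The function names and the sample secret are hypothetical, and this is not the store's harness code.

```python
import hashlib
import hmac
import time
from typing import Optional


def _digest(payload: bytes, secret: str, timestamp: int) -> str:
    """HMAC-SHA256 over "<timestamp>.<payload>", hex encoded."""
    signed = f"{timestamp}.".encode() + payload
    return hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()


def sign_payload(payload: bytes, secret: str, timestamp: int) -> str:
    """Build a Stripe-Signature style header for a synthetic event."""
    return f"t={timestamp},v1={_digest(payload, secret, timestamp)}"


def verify_signature(
    payload: bytes,
    header: str,
    secret: str,
    tolerance: int = 300,          # seconds of clock skew accepted
    now: Optional[int] = None,     # injectable clock for tests
) -> bool:
    """Return True only for a fresh, correctly signed payload."""
    timestamp = None
    candidates = []
    for item in header.split(","):
        key, _, value = item.partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = int(time.time()) if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # stale events are rejected to limit replay
    expected = _digest(payload, secret, int(timestamp))
    # Constant-time comparison avoids leaking how many characters matched.
    return any(hmac.compare_digest(expected, c) for c in candidates)
```

In a production receiver, the official `stripe` Python library's `stripe.Webhook.construct_event(payload, sig_header, secret)` performs this verification; the hand-rolled version here is only meant to show what a synthetic event has to satisfy.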


Next: The IP gate that keeps the store legal — and caught “Harry Potter” — is worth understanding on its own.