The IP-Safety Gate: a Hard Wall in Code
Before this, read:
- AgentTree Merch: the AI operator — what the gate protects
- Root-cause-first as a build process — the design philosophy behind fail-closed gates
Printful, Printify, Etsy, and Amazon all permanently ban repeat IP offenders. Not suspended. Banned. The store and every product on it disappear. A single Harry Potter tee slipping through costs more than a lawsuit settlement; it costs the store itself.
That’s why the IP-safety gate is the first thing designed in agents/merch/ and the only module that gets its own section in BUILD-PLAN.md: “the design pipeline must NEVER generate or list a concept that has not passed screen_concept().”
How it works
The gate lives at agents/merch/ip_safety.py. It exposes one public function:
    screen_concept(concept, slogan="", *, require_llm=True, model="sonnet") -> IPVerdict(safe: bool, score: int, reasons: [str], layer: str)
Two layers run in sequence: cheap first, expensive second.
- Blocklist pre-filter (deterministic, free, runs first)
A hardcoded denylist of brand names, celebrity names, sports teams, character/franchise names, bands, and IP-risk phrases (“logo”, “trademark”, “in the style of [named artist]”, “copyright”). Each entry is matched case-insensitively and only on whole-word boundaries, so a term never fires from inside a longer word.
The denylist is extended by a YAML overlay at ~/clawd/domains/merch/state/ip-blocklist.yaml, so adding new entries needs no code change.
The blocklist alone can produce a reject (it is a high-precision filter). It cannot produce a pass. Passing requires the LLM tier.
- LLM IP-risk classifier (Sonnet tier, ~$0.001/call)
The prompt: “Does this concept reference any real brand, person, character, logo, slogan, band, or copyrighted work? Score 0–10. Anything ≥ 3 → reject. Refuse on doubt.”
The score threshold is intentionally low. A false reject costs one wasted concept. A false pass costs the store.
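The two layers above can be sketched roughly as follows. This is a minimal illustration, not the contents of ip_safety.py: the sample BLOCKLIST entries, the helper names blocklist_hits and llm_score_rejects, and the REJECT_THRESHOLD constant are assumptions made for the example. Only the whole-word, case-insensitive matching and the "3 or higher rejects" rule come from the description.

```python
import re

# Hypothetical sample entries; the real denylist is assumed to be much larger.
BLOCKLIST = ["harry potter", "nike", "logo", "trademark"]

# A score at or above this value rejects ("Anything >= 3 -> reject").
REJECT_THRESHOLD = 3


def blocklist_hits(text, terms=BLOCKLIST):
    """Return every term that appears in text as a whole word or phrase."""
    hits = []
    for term in terms:
        # \b anchors stop "nike" from matching inside "Nikephoros";
        # re.escape stops punctuation in a term from acting as regex syntax.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


def llm_score_rejects(score):
    """Apply the intentionally low threshold to a 0-10 classifier score."""
    return score >= REJECT_THRESHOLD
```

A non-empty result from blocklist_hits is enough to reject on its own. An empty result proves nothing and only means the concept moves on to the classifier.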
The fail-closed contract
The most important property of this gate is what it does when it can’t decide.
    # From ip_safety.py:
    # FAIL-CLOSED CONTRACT (non-negotiable): if the LLM is unavailable / errors /
    # the output can't be parsed, the verdict is REJECT, never "safe".
If the LLM is unavailable, the answer is REJECT. If the response can’t be parsed, REJECT. If require_llm=True (the production default) and the LLM tier didn’t run, the blocklist pass alone is not enough: REJECT.
This is the opposite of how most gating logic works. Most gates assume the absence of a signal means safety. This gate assumes the absence of a signal means uncertainty, and uncertainty means no.
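One way to honor that contract is to make "safe" reachable through exactly one code path and route everything else to a reject. The sketch below is an assumption-laden illustration, not the real module: Verdict is a stand-in that mirrors the IPVerdict fields, call_llm is a hypothetical hook for whatever client the gate uses, the reply is assumed to be a bare integer, and reporting a score of 10 on failure is an arbitrary choice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Verdict:
    """Illustrative stand-in with the same fields as IPVerdict."""
    safe: bool
    score: int
    reasons: List[str] = field(default_factory=list)
    layer: str = "llm"


def classify_fail_closed(
    concept: str,
    call_llm: Optional[Callable[[str], str]],  # hypothetical client hook
    threshold: int = 3,
) -> Verdict:
    """Return safe=True only for a valid, in-range score below the threshold."""
    if call_llm is None:
        # The LLM tier did not run, so there is no signal: reject.
        return Verdict(False, 10, ["fail-closed: LLM tier did not run"])
    try:
        raw = call_llm(concept)
        score = int(raw.strip())  # assumes the reply is a bare integer
    except Exception as exc:
        # Unavailable, errored, or unparseable all land here: reject.
        return Verdict(False, 10, [f"fail-closed: {exc!r}"])
    if not 0 <= score <= 10:
        return Verdict(False, 10, [f"fail-closed: score {score} out of range"])
    if score >= threshold:
        return Verdict(False, score, [f"score {score} >= threshold {threshold}"])
    return Verdict(True, score)
```

The broad except is deliberate in this sketch. Any failure mode nobody anticipated still ends in a reject, because the only statement that builds a safe verdict is the last line.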
What it caught
During initial testing of the design drop pipeline, the gate caught a concept that referenced “Harry Potter”: not by name in the test prompt, but through a design theme that implied the IP. The gate rejected it before any image generation cost was incurred.
The CHANGELOG entry for 2026-06-08 17:19 notes: “IP-safety gate (blocklist + fail-closed LLM classifier; caught ‘Harry Potter’ unnamed in testing).”
“Unnamed in testing” is the key detail. The concept didn’t use the trademarked name directly. The LLM classifier recognized the implied reference and scored it above the threshold. That’s the case a deterministic blocklist alone can’t catch, and it’s precisely why two layers are better than one.
Integration in the design pipeline
Every concept generated by design_agent.py hits the gate before any image spend:
    # From design_agent.py: no Ideogram call happens before this
    for concept in mined_concepts:
        verdict = ip_safety.screen_concept(concept["theme"], concept.get("slogan", ""))
        if not verdict.safe:
            log.warning("IP gate rejected %r — reasons: %s", concept["theme"], verdict.reasons)
            continue
        # Only safe concepts reach Ideogram
        image_url = generate_image(concept)
The cost ordering matters. The gate is cheap (a few cents per batch, mostly free on the blocklist layer). Image generation on Ideogram costs real money per call. Running the cheap gate first means IP-unsafe concepts never trigger paid image generation.
The operator’s decide() function enforces the same gate at the decision level — no automated action can skip or bypass the IP check. This is one of the three hard walls tested in CI.
Extending the blocklist
Adding new blocked terms requires no code change. Edit the YAML overlay:
    extra_terms:
      - "nintendo"
      - "pokemon"
      - "starbucks"
    extra_phrases:
      - "just do it"
      - "think different"
The gate reads the overlay on every call. New entries take effect immediately without a restart or deployment.
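A rough sketch of how overlay entries might be folded into the built-in list is shown here. It assumes the YAML file has already been parsed into a plain mapping (for instance with PyYAML's yaml.safe_load; which parser the gate actually uses is not stated). BASE_TERMS and merged_blocklist are hypothetical names for illustration.

```python
from collections.abc import Mapping

# Hypothetical stand-in for the hardcoded denylist.
BASE_TERMS = {"harry potter", "nike"}


def merged_blocklist(overlay, base=BASE_TERMS):
    """Union the built-in terms with extra_terms and extra_phrases."""
    terms = {t.lower() for t in base}
    if not isinstance(overlay, Mapping):
        # A missing or empty overlay file parses to None: keep the built-ins.
        return terms
    for key in ("extra_terms", "extra_phrases"):
        for entry in overlay.get(key) or []:
            # Skip blank or non-string entries rather than matching on them.
            if isinstance(entry, str) and entry.strip():
                terms.add(entry.strip().lower())
    return terms
```

Rebuilding the merged set inside each screening call, rather than caching it at import time, is what would make a new entry apply to the very next concept. Note that a malformed overlay can only shrink the additions here, never the built-in list.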
What this pattern teaches
The IP gate is a small module with an outsized lesson: the cost of a wrong answer is not symmetric. When the downside of a false positive (wasted concept) and the downside of a false negative (banned store, lawsuit) are this different, the right threshold is not “50/50.” It is “reject unless you’re confident.”
Every autonomous system that touches legal, financial, or safety-adjacent decisions should start with that asymmetry analysis. The gate isn’t conservative because of timidity. It’s conservative because the math is unambiguous.
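The asymmetry can be made concrete with a small expected-cost calculation. This is an illustrative sketch rather than anything from the codebase: break_even_risk is a hypothetical helper, and the costs passed to it are whatever relative units the reader chooses.

```python
def break_even_risk(cost_false_reject, cost_false_pass):
    """Infringement probability above which rejecting is the cheaper choice.

    Passing a concept with infringement probability p costs
    p * cost_false_pass in expectation; rejecting it costs
    (1 - p) * cost_false_reject. Setting the two equal and solving
    for p gives the break-even point returned here.
    """
    if cost_false_reject < 0 or cost_false_pass <= 0:
        raise ValueError("costs must be non-negative and a false pass must cost something")
    return cost_false_reject / (cost_false_reject + cost_false_pass)
```

With equal costs the break-even is 0.5, the “50/50” case. If a false pass were 999 times as costly as a false reject (a made-up ratio, purely for illustration), the break-even would fall to 0.001, which is the arithmetic behind “reject unless you’re confident.”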
Next: The gate held on the first real order. What else the first sale found — three fulfillment bugs caught live.