The agent browser (agents/shared/browser.fetch_page(), registry slug agent-browser) is how agents in this system access and interact with web content. It is not a single implementation — it is a routing layer with a primary path (local Playwright) and a fallback (Steel.dev container) for sites that detect and block headless browsers.
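The primary-plus-fallback routing described above can be sketched roughly as follows. This is a minimal illustration, not the contents of agents/shared/browser.py: `route_fetch`, `BlockedError`, and the two stand-in fetchers are hypothetical names, and the real router may detect blocking in a different way.

```python
from typing import Callable, Tuple


class BlockedError(Exception):
    """Hypothetical signal that a site blocked the headless browser."""


def route_fetch(
    url: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the primary path first; use the fallback only when the site blocks it.

    Returns (html, path_used) so callers can log which path served the page.
    """
    try:
        return primary(url), "primary"
    except BlockedError:
        # Assumption: a block surfaces as an exception. Other errors propagate.
        return fallback(url), "fallback"


# Stand-ins for local Playwright and the Steel.dev container.
def local_playwright(url: str) -> str:
    if "blocked" in url:
        raise BlockedError(url)
    return f"<html>local:{url}</html>"


def steel_container(url: str) -> str:
    return f"<html>steel:{url}</html>"
```

Returning the path alongside the HTML is a design choice for the sketch: it makes it easy to see how often the fallback is actually needed.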
    from agents.shared.browser import fetch_page

    result = fetch_page(
        url="https://example.com",
        authenticated_profile="x.com",  # optional: use a warm session
    )

Internally, results are cached in Supabase for one hour (the scrape_cache table). Repeated fetches within the cache window return the stored result rather than hitting the site again. Per-domain rate limiting is applied at the router level.

Generic fetch_page works for read-only content. For sites that require interaction (posting, navigating authenticated flows, filling forms), per-site adapters live at agents/scrapers/<site>.py.
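The caching and rate-limiting behavior of fetch_page can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions: `CachedFetcher` is a hypothetical class, a dict takes the place of the scrape_cache table, and the `min_interval` value is invented because the real per-domain limit is not stated.

```python
import time
from urllib.parse import urlparse

CACHE_TTL_SECONDS = 3600  # the 1-hour cache window


class CachedFetcher:
    """In-memory stand-in for the scrape_cache table plus a per-domain rate limit."""

    def __init__(self, fetch, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval  # assumed value, seconds between live fetches
        self._clock = clock
        self._sleep = sleep
        self._cache = {}     # url -> (fetched_at, body)
        self._last_hit = {}  # domain -> time of the last live fetch

    def get(self, url):
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
            return hit[1]  # served from cache; the site is not contacted

        # Cache miss: wait if this domain was fetched too recently.
        domain = urlparse(url).netloc
        last = self._last_hit.get(domain)
        if last is not None:
            wait = self._min_interval - (now - last)
            if wait > 0:
                self._sleep(wait)

        body = self._fetch(url)
        fetched_at = self._clock()
        self._cache[url] = (fetched_at, body)
        self._last_hit[domain] = fetched_at
        return body
```

The clock and sleep functions are injectable so the cache window can be exercised in a test without waiting an hour.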
The BizBuySell adapter was shipped as the first concrete example: python3.12 -m agents.scrapers.bizbuysell --state CO --price-max 500000 --json. It navigates the business-listing search, paginates results, and returns structured JSON.
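For orientation, the command-line surface of such an adapter might look like the sketch below. Only the three flags come from the command above; `build_parser`, `search_listings`, `main`, and the two sample rows are hypothetical placeholders, and the real adapter drives the browser and paginates rather than filtering a hard-coded list.

```python
import argparse
import json
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="agents.scrapers.bizbuysell")
    parser.add_argument("--state", required=True, help="two-letter state code, e.g. CO")
    parser.add_argument("--price-max", type=int, default=None, help="maximum asking price")
    parser.add_argument("--json", action="store_true", help="emit JSON instead of text")
    return parser


def search_listings(state: str, price_max: Optional[int]) -> List[dict]:
    """Placeholder for the browser-driven search; returns made-up sample rows."""
    sample = [
        {"title": "Example Cafe", "state": "CO", "asking_price": 250000},
        {"title": "Example Gym", "state": "CO", "asking_price": 750000},
    ]
    return [
        row
        for row in sample
        if row["state"] == state
        and (price_max is None or row["asking_price"] <= price_max)
    ]


def main(argv: Optional[List[str]] = None) -> str:
    """Parse the flags, run the search, and return the rendered output."""
    args = build_parser().parse_args(argv)
    rows = search_listings(args.state, args.price_max)
    if args.json:
        return json.dumps(rows, indent=2)
    return "\n".join(f'{r["title"]}: {r["asking_price"]}' for r in rows)
```

Keeping `main` free of side effects (it returns a string instead of printing) makes the adapter callable from other agents as well as from the shell.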
The X.com (Twitter) adapter is a more complex case because X requires authentication and changes its DOM structure frequently. The adapter uses a persistent profile (agent_browser/profiles/x.com) — a Chrome profile that stays logged in — so the agent does not need to authenticate on every run.
The authenticated-profile pattern is the most practically useful piece of the agent browser architecture.
The problem it solves: many sites (LinkedIn, Outlook, X, university portals) require OAuth login. Running a Playwright session that logs in fresh on every call is slow, fragile (OAuth flows change), and often blocked (repeated logins from the same IP trigger fraud detection).
The solution: create a Chrome profile directory with the site logged in, then reuse that profile on every subsequent call. The session stays warm as long as the cookies are valid.
    # Using a warm X.com profile
    result = fetch_page(
        url="https://x.com/compose/tweet",
        authenticated_profile="x.com",  # reads from agent_browser/profiles/x.com
    )

For X specifically (Socrates), the profile is populated via chrome_cookie_extractor.py: it reads JD’s logged-in Chrome profile, AES-decrypts the cookies via macOS Keychain, and writes auth_token and ct0 to the profile. After one manual run and one keychain permission grant, posting to X is fully autonomous; the profile stays warm until X invalidates the session.
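Before relying on a warm profile, it can help to check locally that the two cookies are still present and unexpired. The sketch below assumes cookies are available as a list of dicts with "name", "value", and an optional "expires" in Unix seconds (with -1 or a missing key meaning a session cookie); `session_looks_warm` is a hypothetical helper, not part of chrome_cookie_extractor.py.

```python
import time

REQUIRED_COOKIES = ("auth_token", "ct0")  # the two cookies written to the profile


def session_looks_warm(cookies, now=None):
    """Return True if every required cookie is present, non-empty, and unexpired."""
    now = time.time() if now is None else now
    by_name = {c["name"]: c for c in cookies}
    for name in REQUIRED_COOKIES:
        cookie = by_name.get(name)
        if cookie is None or not cookie.get("value"):
            return False
        expires = cookie.get("expires", -1)
        if expires != -1 and expires <= now:
            return False
    return True
```

This only inspects local state. X can invalidate a session server-side while the cookies still look valid, so a check like this complements a live logged-in test rather than replacing it.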
The CHANGELOG records a representative failure (CHANGELOG 2026-05-19): the X reply composer was broken because the post_reply_browser function was launching a signed-out legacy .chrome-x-profile instead of the warm, logged-in @Th3RealSocrates persistent profile. The symptom: a headless dry run opened the composer but hit “Reply to join the conversation”, the logged-out wall. The fix: repoint PROFILE_DIR and the user agent to the correct persistent profile.
This is the most common class of authenticated-browser bug: the session reference is stale or points to the wrong profile. The fix is always the same: verify the profile path, verify the session is actually warm, and add a smoke test that checks for the logged-in state marker before doing the real action.
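A smoke test of that kind can be very small. In the sketch below, `assert_logged_in` and `LoggedOutError` are hypothetical names, the logged-out marker is the wall text quoted from the CHANGELOG, and the logged-in marker is whatever string reliably appears only when signed in (the account handle is used here as an assumption).

```python
LOGGED_OUT_MARKERS = ("Reply to join the conversation",)  # wall text from the CHANGELOG


class LoggedOutError(RuntimeError):
    """Raised when the page does not look like an authenticated session."""


def assert_logged_in(page_text: str, logged_in_marker: str) -> None:
    """Fail fast if the page shows a logged-out wall or lacks the logged-in marker."""
    for marker in LOGGED_OUT_MARKERS:
        if marker in page_text:
            raise LoggedOutError(f"logged-out wall detected: {marker!r}")
    if logged_in_marker not in page_text:
        raise LoggedOutError(f"logged-in marker not found: {logged_in_marker!r}")
```

Calling this on the rendered page text immediately before the real action turns a silent wrong-profile bug into an explicit failure.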
agent_browser package
One item in the CHANGELOG that is worth flagging honestly: “Found that the fancy agent_browser package had its source excised (bytecode only) — flagged for JD; built on the surviving x_browser_reply pattern instead” (CHANGELOG 2026-06-08 11:17).
The agents/agent_browser package exists on disk. Its Python source was excised at some point, leaving only bytecode. The actual browser work in the system runs on x_browser_reply and agents/shared/browser.fetch_page(). The registry slug agent-browser points to a package that is partly hollow.
This is an honest disclosure, not a cover-up. If you are looking at the agent-browser package expecting full source code, read x_browser_reply and agents/shared/browser.py instead — those are the live implementations.
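To check whether a package on disk is in this bytecode-only state, one option is to look for compiled files with no source next to them. A minimal sketch follows; `find_sourceless_bytecode` is a hypothetical helper, not something that exists in the repository, and it assumes the standard CPython layouts for .pyc files.

```python
from pathlib import Path


def find_sourceless_bytecode(package_dir):
    """Return .pyc files under package_dir that have no matching .py source."""
    orphans = []
    for pyc in sorted(Path(package_dir).rglob("*.pyc")):
        if pyc.parent.name == "__pycache__":
            # __pycache__/mod.cpython-312.pyc corresponds to ../mod.py
            module_name = pyc.name.split(".")[0]
            source = pyc.parent.parent / f"{module_name}.py"
        else:
            # Legacy layout: mod.pyc sits where mod.py would be.
            source = pyc.with_suffix(".py")
        if not source.exists():
            orphans.append(pyc)
    return orphans
```

Running it against a package directory such as agents/agent_browser would list the modules that can be imported but not read.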
Use the scrape cache
For any read-only use case, the 1-hour Supabase cache prevents redundant fetches. Wire your scraper to the cache layer from the start.
Warm profiles, not fresh logins
Every OAuth site should have a named profile directory. Create it once, keep it warm, re-authenticate only when the session expires.
Test the logged-out state explicitly
Write a smoke test that checks for the presence of a logged-in UI element before running any authenticated action. Catch the “Reply to join the conversation” wall before it goes to production.
Per-site adapters for structured data
If you need structured output (not just rendered HTML), build a site-specific adapter. Generic fetch + LLM extraction is slower and more expensive than a targeted adapter.
Next: Everything stores its state somewhere. Supabase as the Shared Backend covers 200+ tables, UPSERT stabilization, the compute-IO incident, and token-spend tracking.