PRD — SEO Sentinel v1: Local SEO Automation agent
First Managed Agent deployment for SEO Navigator. Scope is deliberately narrow: the 5 Local SEO Automation modules from Workflow 1 (GBP Analyzer, On-Page Intelligence, Geographic Intelligence, Citation Intelligence, AI Visibility Tracker). This PRD is the execution spec for Trung (IT), with clearly labelled inputs needed from Jake and the SEO team.
Confident (grounded in platform.claude.com/docs/en/managed-agents): Agent / Environment / Session lifecycle, built-in toolset, MCP config, event streaming, pricing model ($0.08/session-hour + standard tokens), beta header requirements.
Needs verification with Anthropic before Trung finalizes config:
- Secrets management: I don't see an explicit `secrets` or `env_vars` field in the Environment schema. Trung should confirm how API keys (Apify, Firecrawl, OpenAI, Gemini, Perplexity) are passed into the container securely. Fallback plan included below.
- Container CPU/memory/disk specs: docs reference "cloud containers" but don't publish hardware. The Geographic Intelligence module with DBSCAN clustering may need verification that it fits.
- Exact session-hour billing for idle time: docs say idle waiting for input doesn't bill, but behavior while the container waits on external API responses (Apify runs can take 60-90s) should be confirmed.
These are flagged inline throughout the PRD and collected in Section 14: Open Questions. It's okay to ship without perfect answers — just don't guess.
Contents
- Executive summary
- Scope & out of scope
- Architecture at a glance
- What Jake owns (strategic inputs)
- What SEO team owns (domain inputs)
- What Trung owns (technical build)
- Agent definition
- Environment definition
- Secrets & API credentials
- MCP servers
- Trigger mechanism & orchestration
- Test plan (T1 & T2)
- Deployment sequence (day-by-day)
- Open questions & verification items
- Risks & mitigations
- Cost model
- Definition of Done
- RACI matrix
- Appendix: full curl reference
1 Executive summary The one-paragraph version
Deploy one Claude Managed Agent — SEO Sentinel v1 — that runs the 5 Local SEO Automation modules (GBP Analyzer, On-Page Intelligence, Geographic Intelligence, Citation Intelligence, AI Visibility Tracker) against a single client on demand. The agent is triggered by a ClickUp task (status change to "Ready") or a Slack slash command. It runs in an isolated Anthropic-managed cloud container, reads client context from a handoff payload JSON, calls external APIs (Apify, Firecrawl, Google Maps, OpenAI, Gemini, Perplexity) via bash, queries SEO Utils via MCP, and writes a structured audit report to /mnt/session/outputs/. Results post to Slack and update the ClickUp task. Target: ~$1.50–2.00 per run, ~20–30 min wall-clock, replacing ~3 hours of manual analyst work for the initial audit.
2 Scope & out of scope Ruthless containment
In scope for v1
| Module | What the agent does | Data source |
|---|---|---|
| 1. GBP Analyzer | Runs 44-point Google Business Profile audit for client location, benchmarks against 3 competitors, scores completeness + optimization. | Apify GBP scraper actor + Claude reasoning |
| 2. On-Page Intelligence | Crawls client homepage + top 5 service pages, categorizes URLs, analyzes on-page SEO (titles, H1s, schema, internal links, content depth). | Firecrawl API |
| 3. Geographic Intelligence | Generates ranking grid (7×7 or 13×13) around client address, scrapes rankings for 5 target keywords, runs DBSCAN clustering, outputs heatmap data (JSON for Leaflet). | Google Maps API + Apify SERP actor + scikit-learn |
| 4. Citation Intelligence | Scrapes 40+ major directory citations, validates NAP (Name/Address/Phone) consistency, flags mismatches. | Apify citation scraper actor |
| 5. AI Visibility Tracker | Queries ChatGPT, Gemini, Perplexity (+ optionally Claude) for 5 client-relevant queries; scores whether client name appears in the response and in what position. | OpenAI API, Gemini API, Perplexity API |
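The NAP consistency check in module 4 comes down to normalizing each field before comparing it. Below is a minimal sketch of that idea, not the team's actual validator. The function names, the abbreviation map, and the US-only phone handling are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation map; a real validator would need a fuller list
# and care with ambiguous tokens (for example "st" can also mean "saint").
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "ste": "suite"}


def normalize_phone(raw: str) -> str:
    """Keep digits only and drop a leading US country code (assumes US numbers)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def normalize_text(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    words = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def nap_mismatches(canonical: dict, citation: dict) -> list[str]:
    """Return the NAP fields where a citation disagrees with the canonical record."""
    mismatches = []
    for field in ("name", "address"):
        if normalize_text(canonical[field]) != normalize_text(citation[field]):
            mismatches.append(field)
    if normalize_phone(canonical["phone"]) != normalize_phone(citation["phone"]):
        mismatches.append("phone")
    return mismatches
```

Comparing normalized values keeps cosmetic differences such as "St." versus "Street" from being reported as mismatches, so the flagged list stays short enough for an analyst to act on.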
Explicitly out of scope for v1
- Monthly recurring execution. v1 runs on-demand per client onboarding. Monthly recurring (Workflow 4) is a later version.
- GBP Posts / GBP replies / any write-back to Google. v1 is strictly read-only.
- QA Grader integration. Grader agent is a separate Sprint 1 W3 task; v1 ships with manual SEO Lead review.
- PM Pulse coordinator. v1 is invoked directly; multi-agent orchestration is Sprint 2.
- HITL Slack gate. v1's outputs are read-only reports — no gated actions. HITL infra comes with PM Pulse.
- Memory store. Research preview only; v1 doesn't need persistence across sessions yet.
- Multiple concurrent clients. v1 runs serially. Concurrency is a config change later, not an architecture change.
Workflow 1 is the highest-leverage first agent for three reasons: (1) every new client needs it (high frequency), (2) it's read-only (low blast radius if the agent makes mistakes), (3) it exercises all the infrastructure Trung needs to build anyway (bash, MCP, external APIs, file outputs). If we can ship this, we can ship anything else in the roadmap.
3 Architecture at a glance Four boxes, three arrows
The agent's brain is Claude Sonnet 4.6 doing reasoning — deciding which module to run, interpreting Apify output, writing the audit narrative. The agent's hands are bash scripts calling external APIs inside the container. Anthropic runs the loop; Trung wires the hands.
4 What Jake owns Strategic decisions only Jake can make
- Approve Anthropic API spend budget. Estimated $50–100 for Sprint 1 testing (agent definition iteration + T1 + T2 dry runs on synthetic client). Going-concern estimate: ~$1.75 per production run × ~8 active clients onboarded per quarter = ~$14/quarter. Effectively negligible; approval is formality. Effort: 15min review.
- Approve external API spend. Apify (~$49/mo plan), Firecrawl (~$19/mo plan), OpenAI (~$5/mo for visibility queries), Gemini (free tier works), Perplexity (~$5/mo). Total: ~$80/mo in external API subs if we don't have them already. Confirm which already exist on the company card. Effort: 30min to audit existing subs.
- Sign off on agent behavior boundaries. Explicit: is the agent allowed to (a) write to ClickUp task comments, (b) post to Slack channels, (c) query client GBP via our GBP access — yes/no each. Effort: 20min decision call with Trung.
- Select 2 test clients for T1 and T2. Needs: one synthetic (fake data, safe to break) for T1, one real low-risk client (Shamrock or Procam) for T2 with prior notification that we're running automation. Effort: 30min + client heads-up message.
- Approve the Definition of Done (Section 17) in writing. What "shipped" means. Prevents scope drift mid-sprint. Effort: 15min review after PRD read.
- Submit the Research Preview application. Not blocking for v1 (v1 uses only stable features), but needed for Sprint 2 coordinator work. claude.com/form/claude-managed-agents. Effort: 45min form fill.
Total Jake effort: ~2.5h across a week.
5 What SEO team owns Domain knowledge only the SEO team can provide
- Write the agent's system prompt (draft). This is the "who the agent is and how it thinks", not the step-by-step. Consolidate from existing skills: `seo-navigator-agency-os`, `koray-city-page-auditor`, `ai-visibility-audit`, plus local SEO methodology. Target: 1,500–2,500 words, matching the target stated in Section 7. SEO Lead: 4h. Jake reviews.
- Define the 44-point GBP rubric. Which 44 things are we scoring, and how is each scored (binary / 0–5 / 0–10)? Current Local SEO Automation doc mentions "44-point audit" but the scoring matrix needs to be explicit for the agent to apply it consistently. SEO Lead: 3h.
- Define keyword + competitor selection logic. For a new client, how does the agent decide which 5 keywords to grid-rank and which 3 competitors to benchmark against? Rules, not examples. Example rule: "Pick top 3 Google-ranked competitors within 5 miles of the client for the client's primary service keyword." SEO Lead: 2h.
- Define output report structure. What sections does the final audit report have? What's the executive summary format? What's the format for recommendations (Priority 1/2/3)? Reference: existing audit deliverable templates. SEO Lead + Senior TL: 3h.
- Provide 1 reference audit for T1. An existing high-quality audit (manually produced) the agent's output will be compared against. Senior TL: 1h to package.
- Validate T1 output (Day 18). Side-by-side comparison of Sentinel's output vs. reference audit. Score on coverage, accuracy, structure. Pass/fail call. SEO Lead: 2h.
- Validate T2 output (Day 22). Same, but on a real client's data. Pass/fail call with specific issue list. SEO Lead: 2h.
Total SEO team effort: ~17h across 2 weeks.
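The example competitor rule in the list above ("top 3 Google-ranked competitors within 5 miles") is concrete enough to sketch in code. The version below is illustrative only: the candidate fields `lat`, `lng`, and `rank` are assumed names, and the real rule is still for the SEO Lead to define.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two lat/lng points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lng2 - lng1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def select_competitors(client: dict, candidates: list, radius_miles: float = 5.0, limit: int = 3) -> list:
    """Keep candidates inside the radius, then take the best-ranked ones.

    Each candidate is assumed to carry lat, lng, and rank (1 = best) for the
    client's primary service keyword.
    """
    nearby = [
        c for c in candidates
        if haversine_miles(client["lat"], client["lng"], c["lat"], c["lng"]) <= radius_miles
    ]
    return sorted(nearby, key=lambda c: c["rank"])[:limit]
```

Writing the rule as a function forces the open choices (radius, ranking source, tie-breaking) to be stated explicitly, which is the point of asking for rules rather than examples.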
6 What Trung owns Technical build — the core PRD
The rest of the PRD (Sections 7–19) is effectively Trung's spec. Headlines:
- Prerequisites sanity check (Day 8). Confirm API key has Managed Agents access. Verify SEO Utils MCP cloudflared tunnel uptime (>99% over last 7 days). Confirm all external API subscriptions are active. Confirm the `anthropic-beta: managed-agents-2026-04-01` header works on a test call. 1h.
- Create the shared Environment. Section 8. 1h.
- Upload skill files + rubrics + helper scripts to environment mount. Section 8. 2h.
- Create the SEO Sentinel Agent. Section 7. Iterate on system prompt with SEO Lead. 3h.
- Configure MCP servers. Section 10. ClickUp, Slack, SEO Utils via cloudflared tunnel. 1h.
- Build the orchestration script (v1). Section 11. Node.js on small VPS, handles webhook → create session → stream SSE → fetch outputs → post to Slack. 4h.
- T1 + T2 execution. Section 12. Run smoke tests, iterate on failures, hand off to SEO Lead for validation. 2h (execution only; failures can add 2–6h).
Total Trung effort: ~14h baseline, budget 20h for iteration.
7 Agent definition The configuration itself
An Agent in Managed Agents is a reusable, versioned bundle of: model + system prompt + tools + MCP + skills. Referenced by ID from every session.
System prompt — v1 DRAFT (for SEO Lead + Senior TL to iterate)
Below is a working first draft Claude wrote to give SEO Lead and Senior TL something concrete to edit, not a blank page. Target final length ~1,500–2,500 words after your pass (mine is ~1,100 — intentionally spare, your methodology detail will fill it out).
What I got right: structure, output contract, guardrails, tone guidance, module playbook skeleton.
What you need to fill in: the actual Koray methodology depth, the specific 44-point GBP rubric logic, your detailing-vertical specifics, examples of "good" recommendation phrasing, and the self-check questions that catch common failure modes.
Don't overdo it. This goes in the agent's system field — it's re-processed every session turn. Keep it tight. Skills (mounted as files) carry the deep methodology reference; the system prompt should be the executable playbook, not the textbook.
When you edit this draft, focus your pass on:
- Methodology depth (section "Methodology you operate from"). I kept it spare; your version should pull in the specific Koray concepts that matter most to detailing + service-area businesses.
- Module scoring logic. Each module section has a "Good output characteristics" subsection — expand with 2–3 concrete examples from past audits of what "direct, specific, prioritized" looks like.
- Self-check questions. I wrote 4. You probably have 8 more from the times audits have failed in review. Add them — they're the cheapest way to catch failure modes before the report reaches a human.
- The competitor selection rule. I left it as "3 competitors from the payload." But who should those 3 be — top GBP-ranked? Closest geographic? Most similar service menu? State the rule explicitly.
Target word count after your pass: 1,500–2,500. If you go over 3,000, we have a skill-mounting problem — consider moving deep methodology into a skill file referenced from the prompt.
Once SEO Lead finishes the system prompt and we mount the 5 skills, measure total system context tokens. If >80K tokens, this will affect cost per run significantly (the system prompt is re-read on every turn, though prompt caching should cover it after the first turn as long as turns arrive within the cache lifetime). Action: Trung runs a dry session after the agent is created, captures `usage.input_tokens` from the first turn, and reports back. If over budget, we move some skills to Catalyst or trim the system prompt.
8 Environment definition The container template
An Environment is the container template — packages, networking, mounts. Reusable across sessions. Multiple sessions can share one environment but each session gets its own isolated container instance (filesystem state is NOT shared across sessions).
Files to mount (via Files API upload, then reference in agent's skills[] or manually via bash)
| File | Purpose | Owner |
|---|---|---|
| koray-city-page-auditor.md | Koray methodology reference | Senior TL pulls from existing skill |
| ai-visibility-audit.md | AI visibility scoring methodology | Senior TL |
| seo-navigator-agency-os.md | Agency methodology | Senior TL |
| gbp-44-point-rubric.json | Scoring matrix for GBP module | SEO Lead (Section 5) |
| output-report-template.md | Final audit report structure | SEO Lead (Section 5) |
| handoff-schema.json | Client context payload shape | PM (already defined in 90-day plan W1) |
| helpers/apify_run.sh | Generic Apify actor runner | Trung writes |
| helpers/firecrawl_crawl.sh | Firecrawl wrapper | Trung writes |
| helpers/geo_grid.py | Grid generator + DBSCAN clustering | Trung writes |
| helpers/ai_visibility.py | Multi-LLM visibility query runner | Trung writes |
Apify, Firecrawl, OpenAI, Gemini, Perplexity all have REST APIs. We could wrap them as custom MCP servers, but that's 5 more services Trung has to maintain. For v1, curl them from bash scripts inside the container. Faster to ship, easier to debug. Revisit if the pattern repeats across agents and the maintenance burden of ad-hoc scripts becomes larger than hosting an MCP.
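For the `helpers/ai_visibility.py` entry in the file table, the scoring half (did the client appear, and at what position) can be kept separate from the API calls. Here is a rough sketch of that scoring step. The function name, the list-item regex, and the definition of "position" as the 1-based index among list items are assumptions; the real methodology lives in the `ai-visibility-audit` skill.

```python
import re

# Matches numbered ("1." or "1)") and bulleted ("-", "*", "•") list items.
LIST_ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+(.*)$")


def score_visibility(response_text: str, client_name: str) -> dict:
    """Report whether the client is mentioned and, if listed, at which position.

    position is the 1-based index of the first list item naming the client,
    or None when the client only appears in running prose or not at all.
    """
    needle = client_name.casefold()
    mentioned = needle in response_text.casefold()

    items = []
    for line in response_text.splitlines():
        match = LIST_ITEM.match(line)
        if match:
            items.append(match.group(1))

    position = None
    for index, item in enumerate(items, start=1):
        if needle in item.casefold():
            position = index
            break
    return {"mentioned": mentioned, "position": position}
```

Because the function takes plain text, the same scorer can be applied to ChatGPT, Gemini, and Perplexity responses without knowing which API produced them.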
9 Secrets & API credentials The verification-needed item
The Environment schema I've seen in the public-beta docs (packages, networking, type) does NOT show an explicit field for secrets or environment variables. I don't want to invent a mechanism that doesn't exist.
Trung action: Before Day 8, verify the secrets mechanism with Anthropic via one of:
- Anthropic sales/support contact — confirm how API keys are passed to the container.
- Test API call: attempt to include `env_vars` in the environment config payload and inspect the error response. Schema validators typically give useful hints.
- Search the full `platform.claude.com/docs/en/managed-agents/environments` page (I may not have seen it all).
Plan A — if Managed Agents supports environment-level secrets
Store these on the environment (or wherever the API dictates):
- `APIFY_API_TOKEN`
- `FIRECRAWL_API_KEY`
- `OPENAI_API_KEY` (for AI visibility module only, not the agent itself)
- `GEMINI_API_KEY`
- `PERPLEXITY_API_KEY`
- `GOOGLE_MAPS_API_KEY`
Bash scripts read from $APIFY_API_TOKEN etc. Standard 12-factor app pattern.
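The Python helpers can follow the same pattern. The sketch below assumes the six variable names listed under Plan A and fails fast when any are missing. `load_secrets` is a hypothetical helper name, and whether these variables actually reach the container is the open question this section exists to resolve.

```python
import os

REQUIRED_KEYS = [
    "APIFY_API_TOKEN",
    "FIRECRAWL_API_KEY",
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "PERPLEXITY_API_KEY",
    "GOOGLE_MAPS_API_KEY",
]


def load_secrets(environ=os.environ) -> dict:
    """Return the required credentials, or raise naming every missing variable.

    The error message lists variable names only, never values, so it is safe
    to surface in session logs.
    """
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}
```

Checking all keys up front means a misconfigured environment fails in the first seconds of a run instead of partway through module 5.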
Plan B — fallback if no native secrets mechanism
Pass the keys as part of the initial user.message content from the orchestration script, bundled inside the handoff payload. The agent's first bash action is to export them into environment variables for the duration of the session. Downsides:
- Keys appear in the session event log — visible to anyone with Console access to the session.
- Keys sit in the session's memory for the whole run. This one is partly mitigated: sessions are isolated, so the blast radius is limited to one session's container.
- Rotation requires sending new keys per session instead of rotating once at the environment level.
Plan B is acceptable for v1 testing. Not acceptable for v2 production at scale. If Anthropic confirms no native secrets, we file a feature request with their sales team and commit to migrating once available.
Plan C — most conservative
Build a thin secrets-proxy MCP server that the agent calls to fetch credentials per request, hosted on the same VPS as the orchestration script. MCP auth tokens themselves become the "secret" passed into the agent. Over-engineered for v1; mentioning only for completeness.
Start with Plan A attempt → Plan B fallback if Plan A doesn't exist. Document the decision in ClickUp and revisit at end of Sprint 1. Never commit API keys to git.
10 MCP servers What the agent can talk to natively
| MCP | URL | Purpose for Sentinel v1 | Status |
|---|---|---|---|
| ClickUp | https://mcp.clickup.com/mcp | Read client task details + update task status. Post results to task comments. | Live |
| Slack | https://mcp.slack.com/mcp | Post audit results to #seo-automation channel. | Live |
| SEO Utils | https://<tunnel>.cfargotunnel.com | Read existing rank tracking data as baseline reference. Per seo-utils-mcp-guide skill, use query_database on organic_rank_tracker_* tables. | Depends on Mac Mini |
SEO Utils MCP — the fragile one
SEO Utils runs locally on one Mac Mini, tunneled via cloudflared tunnel --url http://localhost:19515. If the Mac Mini reboots, loses network, or the tunnel process dies, Sentinel can still run modules 1–5 but loses the rank baseline reference. Action items for Trung:
- Wrap the tunnel in a systemd/launchd service with auto-restart.
- Add a health-check: simple curl to the tunnel URL every 5min from a monitoring script, Slack alert on 3 consecutive fails.
- In the agent system prompt, instruct: "If SEO Utils MCP is unavailable, continue with modules 1–5, flag in output that baseline rank data is missing, do not fail the overall run."
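The "alert on 3 consecutive fails" rule from the health-check item is easy to get subtly wrong, for example by alerting on every failure after the third. A small sketch of the counting logic follows; the class name is hypothetical, and the actual probe (a curl to the tunnel URL) and the Slack call are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class TunnelHealth:
    """Tracks consecutive probe failures and decides when to alert."""

    threshold: int = 3
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True exactly when an alert should fire."""
        if ok:
            # Any success resets the streak and re-arms the alert.
            self.consecutive_failures = 0
            self.alerted = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False
```

With a 5-minute probe interval this fires once, roughly 15 minutes into an outage, and stays quiet until the tunnel recovers and fails again.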
Explicitly NOT used by Sentinel v1
- Google Drive MCP: v1 writes outputs to `/mnt/session/outputs/` and the orchestration script uploads to Drive. The agent doesn't touch Drive directly.
- QueryMind MCP: that's for Content Catalyst, not Sentinel.
- GoHighLevel MCP: Sentinel v1 doesn't touch CRM. That's Revenue Relay scope.
- Figma MCP: not relevant.
11 Trigger mechanism & orchestration How the agent actually gets invoked
Because the orchestration layer is reusable across every future agent (Content Catalyst, Revenue Relay, Ad Arbitrage, Build Bot, PM Pulse), it has its own dedicated PRD: prd_orchestration_harness.html.
That PRD covers: the config-file pattern that lets new agents plug in without code changes, the session lifecycle state machine, SSE event handling with reconnect logic, HITL gate handling (stubbed for v1, real in Sprint 2 for PM Pulse), idempotency, logging, VPS deployment, and a 7-test integration suite.
This section (11) remains here as a Sentinel-specific summary so SEO team and Jake have enough context to understand what Sentinel depends on, without needing to read the full orchestration spec. The deep spec is Trung's domain.
v1 triggers (Trung to implement)
- Manual CLI trigger (Day 14–18, for testing): Trung runs `./sentinel-run.sh <client_id>` on the VPS. Script reads client payload from local JSON, creates session, streams SSE, prints output.
- ClickUp webhook trigger (Day 22 onward): ClickUp task in the Local SEO Onboarding list changes status to "Ready (Automate)". Webhook hits the orchestration script's `/webhook/sentinel` endpoint. Script resolves client from task's custom fields, creates session.
- Slack slash command (optional stretch): `/sentinel audit <client-name>` in #seo-team. Useful for ad-hoc reruns.
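The webhook trigger needs two checks before it creates a session: is this the right status, and has this event already been handled? The sketch below shows that decision in isolation. The event field names (`task_id`, `new_status`, `status_changed_at`) are hypothetical, not ClickUp's actual payload shape. The production script is planned in Node.js, so treat this as logic to port rather than code to deploy.

```python
TRIGGER_STATUS = "Ready (Automate)"


def idempotency_key(task_id: str, status_changed_at: str) -> str:
    """Build a key from the task and the event's own timestamp.

    Using the event timestamp (not the receive time) means a redelivered
    webhook produces the same key and is treated as a duplicate.
    """
    return f"sentinel:{task_id}:{status_changed_at}"


def should_start_session(event: dict, seen_keys: set) -> bool:
    """Return True only for a first-time event with the trigger status."""
    if event.get("new_status") != TRIGGER_STATUS:
        return False
    key = idempotency_key(event["task_id"], event["status_changed_at"])
    if key in seen_keys:
        return False
    seen_keys.add(key)
    return True
```

In this sketch `seen_keys` is an in-memory set; a real deployment would need a store that survives a restart of the orchestration script.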
Orchestration script responsibilities
Deliverable routing
- JSON audit → Google Drive folder `/Clients/{client_name}/Audits/sentinel-audit-{timestamp}.json`
- Markdown audit → Same folder, `.md` extension
- Summary (3 bullets) → Slack `#seo-automation` channel + ClickUp task comment
- Grid heatmap data → Separate Drive file for client-facing deliverable rendering
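The Drive path pattern in the routing list can be generated in one place so the JSON and Markdown files always share a timestamp. This is a minimal sketch; the compact UTC timestamp format is an assumption, since the PRD does not specify one.

```python
from datetime import datetime, timezone


def audit_paths(client_name: str, when: datetime) -> dict:
    """Build the Drive paths for one run's JSON and Markdown audit files.

    `when` must be timezone-aware; it is converted to UTC so runs from
    different machines sort consistently.
    """
    timestamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    base = f"/Clients/{client_name}/Audits/sentinel-audit-{timestamp}"
    return {"json": base + ".json", "markdown": base + ".md"}


if __name__ == "__main__":
    print(audit_paths("SN Test Detailing", datetime.now(timezone.utc)))
```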
12 Test plan — T1 & T2 Two gates before declaring done
| Test | When | Input | Pass criteria | Owner |
|---|---|---|---|---|
| T1 — synthetic client | Day 18 | Fake client: "SN Test Detailing", Ho Chi Minh City, generic detailing service menu, 3 synthetic competitors. Safe to break. | All 5 modules execute without fatal error. Output file exists at correct path, valid JSON, covers all sections per output template. Session cost <$3. Run time <45 min. | Trung (execute) → SEO Lead (validate) |
| T2 — real low-risk client | Day 22 | Shamrock Detailing Columbus OH or Procam Detailing Bullhead City AZ (per Jake's pick in Section 4). Real GBP, real competitors, real rank data. | All T1 criteria +: SEO Lead side-by-side review vs. prior manual audit scores ≥ 80% coverage + 0 factual errors + useful recommendations. Output passes "would I send this to the client after light editing?" test. | Trung (execute) → SEO Lead (validate) |
Per the 90-day plan exit criteria: T1 must pass 3 consecutive clean runs before we move on. A single clean run could be luck. Three rules out most randomness. If run 3 fails but runs 1–2 passed, we don't promote — we investigate why it failed.
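The promotion rule can be written down so nobody has to argue about what counts as "3 consecutive clean runs". The sketch below encodes the T1 pass criteria from the table (5 modules, valid output, cost under $3, runtime under 45 min). `RunResult` and its field names are hypothetical, and `output_valid` stands in for the file-exists, valid-JSON, and section-coverage checks.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    modules_ok: int       # modules that finished without a fatal error
    output_valid: bool    # file at the expected path, valid JSON, all sections present
    cost_usd: float
    runtime_min: float


def run_passes(run: RunResult) -> bool:
    """Apply the T1 pass criteria to a single run."""
    return (
        run.modules_ok == 5
        and run.output_valid
        and run.cost_usd < 3.0
        and run.runtime_min < 45.0
    )


def ready_to_promote(history: list, required: int = 3) -> bool:
    """True only if the most recent `required` runs all passed."""
    recent = history[-required:]
    return len(recent) == required and all(run_passes(run) for run in recent)
```

Only the tail of the history matters, so a failure on run 3 after two clean runs resets the count, as the paragraph above describes.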
13 Deployment sequence — day by day What happens when
| Day | Owner | Milestone | Exit check |
|---|---|---|---|
| 1–3 | Jake | Section 4 items 1–5 complete | Budget approved, test clients selected, DoD signed |
| 1–7 | SEO Lead + TL | Section 5 items 1–5 complete | System prompt draft + 44-point rubric + keyword logic + output template all exist |
| 8 | Trung | Prereqs verified | Test API call to /v1/agents returns 200. Tunnel uptime confirmed. |
| 8 | Trung | Secrets plan decided (Section 9) | Plan A confirmed working OR Plan B locked in. Documented in ClickUp. |
| 9 | Trung | Environment env_seonav_prod created | env_… ID captured. Dry session spawns successfully + packages installed. |
| 10–11 | Trung | Files uploaded to environment mount | All skill files, rubrics, helper scripts exist in container at known paths. |
| 12–13 | Trung + SEO Lead | Agent SEO Sentinel created + v1 system prompt in place | agent_… ID captured. Token count verified <80K on first turn. |
| 14 | Trung | MCP servers configured + tested | Agent can read a known ClickUp task, can post to Slack, can query SEO Utils. |
| 15–17 | Trung | Orchestration script v1 working (manual CLI mode) | Manual ./sentinel-run.sh completes end-to-end on synthetic data. |
| 18 | Trung → SEO Lead | T1 — 3 clean runs on synthetic client | 3 consecutive passes. Output validated by SEO Lead. |
| 19–21 | Trung | ClickUp webhook integration | Task status change triggers session. Slack + ClickUp postback works. |
| 22 | Trung → SEO Lead → Jake | T2 — real client run | SEO Lead sign-off. Jake notified. Sprint 1 exit complete for this agent. |
14 Open questions & verification items The things I don't know — don't guess
These are explicit "I don't know, let's verify at the source" items. None should block Day 1 work, but all should be answered before T2.
- Secrets mechanism for Managed Agents environments. (See Section 9.) Path: ask Anthropic support or inspect the Environment API response schema for `env_vars` / `secrets` fields. Blocker for: clean secret handling. Workaround: Plan B (pass in user.message).
- Is idle time while waiting for Apify runs billed as session-hour? Apify GBP scraper can take 60–90s per run. If we're running 3 competitors × 5 keywords of grid data, that's several minutes of the agent waiting on Apify. Does billing accrue during this wait? The skill says "idle waiting for human input doesn't bill" but waiting on external APIs may differ. Blocker for: accurate cost estimate. Workaround: measure actual cost on T1 runs, update estimate.
- Maximum session duration cap. Is there a hard kill at 2h, 6h, 24h? Full Sentinel run should be ~20-30min, so well under any reasonable cap, but worth knowing for future. Non-blocker.
- Container specs (CPU/memory/disk). DBSCAN clustering on a 13×13 grid × 5 keywords = 845 points. Trivial on any modern machine, but worth confirming container has at least 512MB memory for numpy/pandas operations. Non-blocker, likely fine.
- Data residency. Docs mention an `inference_geo` parameter. Does it apply to Managed Agents? Some detailing clients might require US-only inference. Non-blocker for first 3 test clients.
- MCP call billing against session tokens. When Sentinel calls ClickUp MCP or SEO Utils MCP, are the MCP response tokens counted against the agent's input tokens? Almost certainly yes, but worth measuring how much MCP responses add on top of the system prompt. Non-blocker.
- The `skills` field in agent config — how does progressive disclosure actually work? The docs describe it, but I haven't seen the exact Files API upload flow in the excerpts I have. Trung should test uploading one skill file first, confirm it appears mounted in the container, before uploading all 5. Blocker for: skill mounting. Low risk, just test early.
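The point count in the container-specs question (13×13 grid × 5 keywords = 845) can be reproduced with a small grid generator of the kind `helpers/geo_grid.py` would need. This is a sketch under assumptions: the 1 km default spacing and the flat-earth degree conversion are illustrative choices, and the DBSCAN step (scikit-learn) is left out.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate, good enough for a city-scale grid


def ranking_grid(center_lat: float, center_lng: float, size: int = 7, spacing_km: float = 1.0) -> list:
    """Return size x size (lat, lng) points centered on the client address.

    size must be odd so the client address sits exactly on the middle point.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd (for example 7 or 13)")
    half = size // 2
    # Longitude degrees shrink with latitude, so scale by cos(latitude).
    km_per_degree_lng = KM_PER_DEGREE_LAT * math.cos(math.radians(center_lat))
    points = []
    for row in range(-half, half + 1):
        for col in range(-half, half + 1):
            lat = center_lat + row * spacing_km / KM_PER_DEGREE_LAT
            lng = center_lng + col * spacing_km / km_per_degree_lng
            points.append((lat, lng))
    return points


if __name__ == "__main__":
    grid = ranking_grid(39.9612, -82.9988, size=13)
    print(len(grid), "grid points;", len(grid) * 5, "rank lookups for 5 keywords")
```

At this scale the data is a few hundred coordinate pairs per keyword, which supports the "likely fine" read on container memory, though the measured footprint with numpy and pandas loaded is what should settle it.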
15 Risks & mitigations What could break the ship date
| Risk | Severity | Mitigation |
|---|---|---|
| Secrets mechanism isn't what I assumed; Plan B leaks keys into session logs | High | Verify Day 8. If Plan B is only option, restrict Claude Console access to Jake + Trung only for now. Rotate keys monthly. |
| System prompt + skill mount is too large (>80K tokens) | High | Measure token count on first dry run. If over, trim by: splitting skills between Sentinel and future Catalyst, removing agency-os skill from Sentinel (not needed for read-only audits). |
| SEO Utils cloudflared tunnel down during a run | Medium | Agent gracefully degrades — flags missing baseline data in output, completes modules 1–5 anyway. Health check + Slack alert on tunnel drop. |
| Apify or Firecrawl rate-limits us during a run | Medium | Upgrade to paid tier that matches our run frequency. Retry logic in bash helpers (3 retries with exponential backoff). Flag in output if a module partially failed. |
| SEO Lead's system prompt doesn't ship in time for Day 12 | Medium | Trung uses placeholder system prompt to unblock infrastructure testing. Real system prompt is a PATCH (version bump) on the agent, doesn't require recreation. |
| Output quality is inconsistent across runs | Medium | Baseline against reference audit in T1. If inconsistent, tighten system prompt with more explicit output contract (exact section headers, exact field names). |
| Orchestration script is buggy, sessions leak or double-trigger | Low-Med | Idempotency key in webhook handler (use ClickUp task ID + the status-change event's timestamp, not the receive time, so a redelivered webhook maps to the same key). Max 1 active Sentinel session per client at a time, enforced client-side. |
| Cost overruns — retry loops inflate token spend | Low-Med | Session cost alert: Slack ping if any single session >$5. Daily total alert if >$30. |
| Container lacks a Python package we need | Low | Agent can pip install inside the container at runtime. Slows first run but fixes itself. Add to environment packages list on next update. |
| Google Maps API quota exceeded during grid generation | Low | Cache geocoded addresses in a local SQLite at /mnt/session/. Budget: 2500 free requests/day; grid of 169 points × few clients/day stays well under. |
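The rate-limit mitigation in the table ("3 retries with exponential backoff") is planned for the bash helpers, but the Python helpers need the same behavior. Below is a minimal sketch. The function name and the 1-second base delay are assumptions, and it retries on any exception, which a real helper would narrow to rate-limit and transient network errors.

```python
import time


def with_retries(call, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Run call(); on failure retry up to `retries` times with doubling delays.

    Delays are base_delay, 2x, 4x, and so on. The last exception is re-raised
    once retries are exhausted so the caller can flag a partial module failure.
    `sleep` is injectable so tests do not have to wait.
    """
    attempt = 0
    while True:
        try:
            return call()
        except Exception:
            if attempt >= retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the defaults, a call that keeps failing is attempted four times in total (one initial try plus three retries) and waits 1s, 2s, then 4s between attempts.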
16 Cost model What this actually costs per run
Real numbers come after 3–5 T1 runs. These are prior-belief starting points, not commitments. Update the model with measured numbers after Day 18.
| Cost component | Est. per run | Assumptions |
|---|---|---|
| Claude tokens — input | ~$0.30 | ~80K input tokens (system + skills + handoff payload), mostly cached after first minute. Cache reads at 10% of base rate. |
| Claude tokens — output | ~$0.45 | ~30K output tokens (audit report content + module narrations) |
| Session-hour | ~$0.035 | 25 min active runtime × $0.08/hr. Actual may be lower if idle-on-external-API doesn't bill. |
| Web search (built-in tool) | ~$0.10 | ~10 searches × $0.01 each ($10/1000) |
| Apify runs | ~$0.40 | GBP scraper + citation scraper + SERP scraper. Free tier likely covers testing. |
| Firecrawl | ~$0.05 | Single site crawl, ~20 pages |
| Google Maps API | ~$0.00 | Under free tier (2500 req/day) |
| OpenAI (ChatGPT for AI visibility) | ~$0.05 | 5 queries × ~500 output tokens |
| Gemini API | ~$0.00 | Free tier |
| Perplexity API | ~$0.05 | 5 queries |
| Total per run: | ~$1.45–1.90 | |
At 8 new clients onboarded per quarter × ~$1.75/run = ~$14/quarter production cost. Plus ongoing external API subscriptions (~$80/mo) which are fixed costs that would exist anyway.
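Keeping the estimate as data makes it simple to swap in measured numbers after Day 18. The sketch below just re-adds the table's per-run figures and the quarterly projection. The dictionary keys are made-up labels; the dollar values are the estimates from the table above.

```python
# Per-run estimates in USD, copied from the cost table.
COST_PER_RUN_USD = {
    "claude_input_tokens": 0.30,
    "claude_output_tokens": 0.45,
    "session_hour": 0.035,
    "web_search": 0.10,
    "apify": 0.40,
    "firecrawl": 0.05,
    "google_maps": 0.00,
    "openai": 0.05,
    "gemini": 0.00,
    "perplexity": 0.05,
}


def session_hour_cost(active_minutes: float, rate_per_hour: float = 0.08) -> float:
    """Session-hour charge for a given active runtime."""
    return active_minutes / 60 * rate_per_hour


total_per_run = sum(COST_PER_RUN_USD.values())   # low end of the quoted range
quarterly_cost = 8 * 1.75                         # 8 onboardings at ~$1.75 per run

if __name__ == "__main__":
    print(f"Sum of line items: ${total_per_run:.3f}")
    print(f"Session-hour for 25 min: ${session_hour_cost(25):.4f}")
    print(f"Quarterly production cost: ${quarterly_cost:.2f}")
```

The line items sum to the bottom of the quoted range, so the upper figure is headroom for retries and longer runs, not a separate line item.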
17 Definition of Done Jake signs this off before work starts
- Agent exists. `agent_…` ID for "SEO Sentinel" is recorded in ClickUp and in version control.
- Environment exists. `env_seonav_prod` created, packages installed, files mounted, reusable.
- Secrets path decided. Either Plan A confirmed working or Plan B explicitly accepted by Jake, documented in ClickUp.
- Orchestration script runs on VPS. Can be triggered via manual CLI or ClickUp webhook. Posts deliverables to Slack + ClickUp.
- T1 passes 3 consecutive clean runs on synthetic client. SEO Lead validated output.
- T2 passes once on real low-risk client. SEO Lead would send output to client with light editing only.
- Cost per run measured and within $3 per run (2x buffer over estimate).
- Runbook exists in ClickUp. Trung documents: how to retrigger a run, how to check logs, how to rotate API keys, who to escalate to.
- Known issues logged. Any workarounds, rubric tuning needs, system prompt iterations captured for v2 planning.
- Jake reviews T2 output and signs off. Final gate. Green-light to use on next real client onboarding.
18 RACI matrix Who does what, who signs off
R = Responsible (does the work) · A = Accountable (signs off) · C = Consulted · I = Informed
| Task | Jake | Trung | SEO Lead | Senior TL | PM |
|---|---|---|---|---|---|
| Overall v1 ship decision | A | C | C | I | I |
| Budget approval (Claude + external APIs) | R | C | |||
| Research Preview application | R | I | |||
| Test client selection | A | R | |||
| System prompt authoring | A | C | R | C | |
| 44-point GBP rubric | R | C | |||
| Output report template | A | R | R | ||
| Handoff schema | C | C | C | R | |
| Prerequisites verification (Day 8) | R | ||||
| Secrets mechanism verification | A | R | |||
| Environment creation | R | ||||
| File mount (skills + rubrics + helpers) | R | C | R | ||
| Agent creation + version management | R | ||||
| MCP wiring (ClickUp, Slack, SEO Utils) | R | ||||
| Helper scripts (apify, firecrawl, geo_grid, ai_visibility) | R | C | |||
| Orchestration script (Node.js on VPS) | R | ||||
| ClickUp webhook integration | R | C | |||
| T1 execution | R | C | |||
| T1 validation (output quality) | R | C | |||
| T2 execution on real client | I | R | C | ||
| T2 validation + final sign-off | A | I | R | ||
| Runbook documentation | I | R | C | ||
| Cost monitoring + alerting | I | R |
19 Appendix — Full curl reference Copy-pasteable for Trung
A1. Environment creation
A2. Agent creation (minimal — iterate later)
A3. Manual test session
A4. Fetch session outputs
A5. Update agent (versioned)
- Managed Agents overview
- Quickstart
- Agent setup
- Environments
- Events & streaming
- Uploaded SEO Navigator docs: Ops Blueprint, Managed Agent Swarm Roadmap, Architecture diagram
- Existing skills: `managed-agents-doc`, `seo-navigator-agency-os`, `seo-utils-mcp-guide`