Product Requirements Document · v1.0 · April 19, 2026

PRD — SEO Sentinel v1: Local SEO Automation agent

First Managed Agent deployment for SEO Navigator. Scope is deliberately narrow: the 5 Local SEO Automation modules from Workflow 1 (GBP Analyzer, On-Page Intelligence, Geographic Intelligence, Citation Intelligence, AI Visibility Tracker). This PRD is the execution spec for Trung (IT), with clearly labelled inputs needed from Jake and the SEO team.

Owner: Jake (accountable) · Trung (responsible)
Target ship date: End of Sprint 1 W3 (Day 22)
Est. IT effort: ~14 hours over 3 weeks
Depends on: Anthropic Managed Agents Public Beta

🔗 Paired document: The orchestration layer (triggers, session lifecycle, delivery routing, VPS deployment) is a separate reusable PRD — prd_orchestration_harness.html. This Sentinel PRD focuses on what the agent does. The orchestration PRD focuses on how it gets invoked. Trung reads both; SEO team mostly needs this one.

⚠️ What I'm confident about vs. what needs verification

Confident (grounded in platform.claude.com/docs/en/managed-agents): Agent / Environment / Session lifecycle, built-in toolset, MCP config, event streaming, pricing model ($0.08/session-hour + standard tokens), beta header requirements.

Needs verification with Anthropic before Trung finalizes config:

Secrets management — I don't see an explicit secrets or env_vars field in the Environment schema. Trung should confirm how API keys (Apify, Firecrawl, OpenAI, Gemini, Perplexity) are passed into the container securely. Fallback plan included below.
Container CPU/memory/disk specs — docs reference "cloud containers" but don't publish hardware. Geographic Intelligence module with DBSCAN clustering may need verification it fits.
Exact session-hour billing for idle time — docs say idle waiting for input doesn't bill, but behavior while the container waits on external API responses (Apify runs can take 60-90s) should be confirmed.

These are flagged inline throughout the PRD and collected in Section 14: Open Questions. It's okay to ship without perfect answers — just don't guess.

Executive summary
Scope & out of scope
Architecture at a glance
What Jake owns (strategic inputs)
What SEO team owns (domain inputs)
What Trung owns (technical build)
Agent definition
Environment definition
Secrets & API credentials
MCP servers
Trigger mechanism & orchestration
Test plan (T1 & T2)
Deployment sequence (day-by-day)
Open questions & verification items
Risks & mitigations
Cost model
Definition of Done
RACI matrix
Appendix: full curl reference

1 Executive summary The one-paragraph version

Deploy one Claude Managed Agent — SEO Sentinel v1 — that runs the 5 Local SEO Automation modules (GBP Analyzer, On-Page Intelligence, Geographic Intelligence, Citation Intelligence, AI Visibility Tracker) against a single client on demand. The agent is triggered by a ClickUp task (status change to "Ready") or a Slack slash command. It runs in an isolated Anthropic-managed cloud container, reads client context from a handoff payload JSON, calls external APIs (Apify, Firecrawl, Google Maps, OpenAI, Gemini, Perplexity) via bash, queries SEO Utils via MCP, and writes a structured audit report to /mnt/session/outputs/. Results post to Slack and update the ClickUp task. Target: ~$1.50–2.00 per run, ~20–30 min wall-clock, replacing ~3 hours of manual analyst work for the initial audit.

5 · Modules in scope
~$1.75 · Est. cost per run
~25m · Est. wall-clock per run
~3h · Manual analyst time replaced

2 Scope & out of scope Ruthless containment

In scope for v1

Module	What the agent does	Data source
1. GBP Analyzer	Runs 44-point Google Business Profile audit for client location, benchmarks against 3 competitors, scores completeness + optimization.	Apify GBP scraper actor + Claude reasoning
2. On-Page Intelligence	Crawls client homepage + top 5 service pages, categorizes URLs, analyzes on-page SEO (titles, H1s, schema, internal links, content depth).	Firecrawl API
3. Geographic Intelligence	Generates ranking grid (7×7 or 13×13) around client address, scrapes rankings for 5 target keywords, runs DBSCAN clustering, outputs heatmap data (JSON for Leaflet).	Google Maps API + Apify SERP actor + scikit-learn
4. Citation Intelligence	Scrapes 40+ major directory citations, validates NAP (Name/Address/Phone) consistency, flags mismatches.	Apify citation scraper actor
5. AI Visibility Tracker	Queries ChatGPT, Gemini, Perplexity (+ optionally Claude) for 5 client-relevant queries; scores whether client name appears in the response and in what position.	OpenAI API, Gemini API, Perplexity API
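
The NAP check in Module 4 comes down to normalizing each field and comparing it to ground truth. A minimal sketch of that comparison follows. The field names (`name`, `address`, `phone`), the abbreviation map, and the sample business are all hypothetical, and the normalization rules are illustrative rather than the agency's agreed rules.

```python
import re

# Hypothetical abbreviation map; the real list belongs to the SEO team.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste", "road": "rd"}

def normalize_phone(phone: str) -> str:
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", phone)
    return digits[1:] if len(digits) == 11 and digits.startswith("1") else digits

def normalize_text(value: str) -> str:
    """Lowercase, strip punctuation, and collapse common address abbreviations."""
    words = re.sub(r"[^\w\s]", " ", value.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def nap_mismatches(truth: dict, listing: dict) -> list[str]:
    """Return the NAP fields where a directory listing disagrees with ground truth."""
    mismatches = []
    for field in ("name", "address"):
        if normalize_text(truth[field]) != normalize_text(listing.get(field, "")):
            mismatches.append(field)
    if normalize_phone(truth["phone"]) != normalize_phone(listing.get("phone", "")):
        mismatches.append("phone")
    return mismatches

# Made-up sample data: the address differs only in abbreviation, the phone really differs.
truth = {"name": "Shine Detailing", "address": "12 Main Street, Suite 4", "phone": "(614) 555-0100"}
listing = {"name": "Shine Detailing", "address": "12 Main St Ste 4", "phone": "+1 614-555-0199"}
print(nap_mismatches(truth, listing))  # ['phone']
```

Normalizing before comparing keeps cosmetic differences ("Street" vs "St") out of the mismatch list, so the priority fix list only contains listings that genuinely disagree.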

Explicitly out of scope for v1

Monthly recurring execution. v1 runs on-demand per client onboarding. Monthly recurring (Workflow 4) is a later version.
GBP Posts / GBP replies / any write-back to Google. Read-only only.
QA Grader integration. Grader agent is a separate Sprint 1 W3 task; v1 ships with manual SEO Lead review.
PM Pulse coordinator. v1 is invoked directly; multi-agent orchestration is Sprint 2.
HITL Slack gate. v1's outputs are read-only reports — no gated actions. HITL infra comes with PM Pulse.
Memory store. Research preview only; v1 doesn't need persistence across sessions yet.
Multiple concurrent clients. v1 runs serially. Concurrency is a config change later, not an architecture change.

💡 Why this scope

Workflow 1 is the highest-leverage first agent for three reasons: (1) every new client needs it (high frequency), (2) it's read-only (low blast radius if the agent makes mistakes), (3) it exercises all the infrastructure Trung needs to build anyway (bash, MCP, external APIs, file outputs). If we can ship this, we can ship anything else in the roadmap.

3 Architecture at a glance Four boxes, three arrows

# High-level flow — SEO Sentinel v1 run

[1] ClickUp status → "Ready"
      │
      └──webhook──▶ [2] Orchestration script (VPS)
                          │
                          ▼  POST /v1/sessions
                          │
                          ▼
[3] SEO Sentinel session (Anthropic container)
      │
      ├─ reads handoff payload (client context JSON)
      ├─ runs 5 modules in sequence via bash:
      │    ├─ apify_run.sh gbp-scraper        (module 1)
      │    ├─ firecrawl_crawl.sh              (module 2)
      │    ├─ geo_grid.py                     (module 3)
      │    ├─ apify_run.sh citation-scraper   (module 4)
      │    └─ ai_visibility.py                (module 5)
      ├─ MCP calls to SEO Utils (for rank baseline reference)
      ├─ Claude synthesizes final report
      └─ writes /mnt/session/outputs/sentinel-audit-{client_id}.json + .md
            │
            ▼
[4] Files API fetch → Slack post + ClickUp task update + Drive upload

🧠 Brain vs Hands reminder

The agent's brain is Claude Sonnet 4.6 doing reasoning — deciding which module to run, interpreting Apify output, writing the audit narrative. The agent's hands are bash scripts calling external APIs inside the container. Anthropic runs the loop; Trung wires the hands.

4 What Jake owns Strategic decisions only Jake can make

Jake Must complete before Day 8

Approve Anthropic API spend budget. Estimated $50–100 for Sprint 1 testing (agent definition iteration + T1 + T2 dry runs on synthetic client). Going-concern estimate: ~$1.75 per production run × ~8 active clients onboarded per quarter = ~$14/quarter. Effectively negligible; approval is formality. Effort: 15min review.
Approve external API spend. Apify (~$49/mo plan), Firecrawl (~$19/mo plan), OpenAI (~$5/mo for visibility queries), Gemini (free tier works), Perplexity (~$5/mo). Total: ~$80/mo in external API subs if we don't have them already. Confirm which already exist on the company card. Effort: 30min to audit existing subs.
Sign off on agent behavior boundaries. Explicit: is the agent allowed to (a) write to ClickUp task comments, (b) post to Slack channels, (c) query client GBP via our GBP access — yes/no each. Effort: 20min decision call with Trung.
Select 2 test clients for T1 and T2. Needs: one synthetic (fake data, safe to break) for T1, one real low-risk client (Shamrock or Procam) for T2 with prior notification that we're running automation. Effort: 30min + client heads-up message.
Approve the Definition of Done (Section 17) in writing. What "shipped" means. Prevents scope drift mid-sprint. Effort: 15min review after PRD read.
Submit the Research Preview application. Not blocking for v1 (v1 uses only stable features), but needed for Sprint 2 coordinator work. claude.com/form/claude-managed-agents. Effort: 45min form fill.

Total Jake effort: ~2.5h across a week.
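
The two spend approvals above can be checked with a few lines of arithmetic. This sketch only restates the figures from the bullets; none of them are measured values, and the combined quarterly figure is derived here rather than stated in the PRD.

```python
# Figures copied from the approval bullets above; estimates, not measurements.
COST_PER_RUN_USD = 1.75
CLIENTS_PER_QUARTER = 8
MONTHLY_SUBSCRIPTIONS_USD = {
    "Apify": 49, "Firecrawl": 19, "OpenAI": 5, "Gemini": 0, "Perplexity": 5,
}

quarterly_agent_cost = COST_PER_RUN_USD * CLIENTS_PER_QUARTER
monthly_external_cost = sum(MONTHLY_SUBSCRIPTIONS_USD.values())
quarterly_total = quarterly_agent_cost + 3 * monthly_external_cost

print(f"Agent runs per quarter:    ${quarterly_agent_cost:.2f}")   # $14.00
print(f"External APIs per month:   ${monthly_external_cost}")      # $78
print(f"Combined cost per quarter: ${quarterly_total:.2f}")        # $248.00
```

On these numbers the external subscriptions, not the agent runs, account for nearly all of the recurring spend, which is why the audit of existing subscriptions is the approval that matters.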

5 What SEO team owns Domain knowledge only the SEO team can provide

SEO Lead + Senior TL · Drafting inputs due before Day 14; T1/T2 validation follows on Days 18 and 22

Write the agent's system prompt (draft). This is the "who the agent is and how it thinks" — not the step-by-step. Consolidate from existing skills: seo-navigator-agency-os, koray-city-page-auditor, ai-visibility-audit, plus local SEO methodology. Target: 1,500–2,500 words, matching the iteration guide in Section 7. SEO Lead: 4h. Jake reviews.
Define the 44-point GBP rubric. Which 44 things are we scoring, and how is each scored (binary / 0–5 / 0–10)? Current Local SEO Automation doc mentions "44-point audit" but the scoring matrix needs to be explicit for the agent to apply it consistently. SEO Lead: 3h.
Define keyword + competitor selection logic. For a new client, how does the agent decide which 5 keywords to grid-rank and which 3 competitors to benchmark against? Rules, not examples. Example rule: "Pick top 3 Google-ranked competitors within 5 miles of the client for the client's primary service keyword." SEO Lead: 2h.
Define output report structure. What sections does the final audit report have? What's the executive summary format? What's the format for recommendations (Priority 1/2/3)? Reference: existing audit deliverable templates. SEO Lead + Senior TL: 3h.
Provide 1 reference audit for T1. An existing high-quality audit (manually produced) the agent's output will be compared against. Senior TL: 1h to package.
Validate T1 output (Day 18). Side-by-side comparison of Sentinel's output vs. reference audit. Score on coverage, accuracy, structure. Pass/fail call. SEO Lead: 2h.
Validate T2 output (Day 22). Same, but on a real client's data. Pass/fail call with specific issue list. SEO Lead: 2h.

Total SEO team effort: ~17h across 2 weeks.
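
The example competitor rule above ("top 3 Google-ranked competitors within 5 miles") is mechanical enough to sketch. The candidate fields (`name`, `lat`, `lng`, `rank`) and the sample coordinates are hypothetical, and the rule itself remains the SEO Lead's to define. This only shows what "rules, not examples" could look like once written down.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in miles."""
    dlat, dlng = radians(lat2 - lat1), radians(lng2 - lng1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def select_competitors(client, candidates, radius_miles=5.0, count=3):
    """Return the best-ranked candidates inside the radius, or None if too few qualify."""
    nearby = [
        c for c in candidates
        if haversine_miles(client["lat"], client["lng"], c["lat"], c["lng"]) <= radius_miles
    ]
    nearby.sort(key=lambda c: c["rank"])
    return nearby[:count] if len(nearby) >= count else None

# Made-up data: B ranks second but sits roughly 35 miles away, so it is excluded.
client = {"lat": 39.9612, "lng": -82.9988}
candidates = [
    {"name": "A", "lat": 39.9712, "lng": -82.9988, "rank": 1},
    {"name": "B", "lat": 40.4612, "lng": -82.9988, "rank": 2},
    {"name": "C", "lat": 39.9812, "lng": -82.9988, "rank": 3},
    {"name": "D", "lat": 39.9912, "lng": -82.9988, "rank": 4},
]
print([c["name"] for c in select_competitors(client, candidates)])  # ['A', 'C', 'D']
```

Returning `None` when fewer than three candidates qualify gives the agent an unambiguous signal for the "competitor selection ambiguous" halt condition instead of silently benchmarking against a short list.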

6 What Trung owns Technical build — the core PRD

Trung (IT Lead) Day 8 through Day 22

The rest of the PRD (Sections 7–19) is effectively Trung's spec. Headlines:

Prerequisites sanity check (Day 8). Confirm the API key has Managed Agents access. Verify SEO Utils MCP cloudflared tunnel uptime (>99% over last 7 days). Confirm all external API subscriptions are active. Confirm the anthropic-beta: managed-agents-2026-04-01 header works on a test call. 1h.
Create the shared Environment. Section 8. 1h.
Upload skill files + rubrics + helper scripts to environment mount. Section 8. 2h.
Create the SEO Sentinel Agent. Section 7. Iterate on system prompt with SEO Lead. 3h.
Configure MCP servers. Section 10. ClickUp, Slack, SEO Utils via cloudflared tunnel. 1h.
Build the orchestration script (v1). Section 11. Node.js on small VPS, handles webhook → create session → stream SSE → fetch outputs → post to Slack. 4h.
T1 + T2 execution. Section 12. Run smoke tests, iterate on failures, handoff to SEO Lead for validation. 2h (execution only — failures can add 2–6h).

Total Trung effort: ~14h baseline, budget 20h for iteration.

7 Agent definition The configuration itself

An Agent in Managed Agents is a reusable, versioned bundle of: model + system prompt + tools + MCP + skills. Referenced by ID from every session.

# Agent config for SEO Sentinel v1
{
  "name": "SEO Sentinel",
  "model": "claude-sonnet-4-6",
  "description": "Local SEO Automation agent. v1 scope: 5 modules (GBP, On-Page, Geographic, Citations, AI Visibility).",
  "system": "<SEE SECTION 5 — SEO Lead writes this. Rough skeleton below.>",
  "tools": [
    { "type": "agent_toolset_20260401" }
    // All built-in tools enabled: bash, read, write, edit, glob, grep, web_fetch, web_search
    // If we need to lock down later, use configs[] to disable individuals
  ],
  "mcp_servers": [
    { "type": "url", "url": "https://mcp.clickup.com/mcp", "name": "clickup" },
    { "type": "url", "url": "https://mcp.slack.com/mcp", "name": "slack" },
    { "type": "url", "url": "https://<TUNNEL-URL>.cfargotunnel.com", "name": "seo-utils" }
  ],
  "skills": [
    // Progressive-disclosure skills — uploaded via Files API, referenced by file_id
    // Plan to mount:
    //   - koray-city-page-auditor (file_id_1)
    //   - ai-visibility-audit (file_id_2)
    //   - consensus-content-audit (file_id_3)
    //   - seo-utils-mcp-guide (file_id_4)
    //   - seo-navigator-agency-os (file_id_5)
  ],
  "metadata": {
    "owner": "seo-nav",
    "workflow": "workflow-1-local-seo",
    "version_notes": "v1 initial"
  }
}
// Response includes: id (agent_...), version (starts at 1, increments on update)

System prompt — v1 DRAFT (for SEO Lead + Senior TL to iterate)

📝 How to use this draft

Below is a working first draft Claude wrote to give SEO Lead and Senior TL something concrete to edit, not a blank page. Target final length ~1,500–2,500 words after your pass (mine is ~1,100 — intentionally spare, your methodology detail will fill it out).

What I got right: structure, output contract, guardrails, tone guidance, module playbook skeleton.
What you need to fill in: the actual Koray methodology depth, the specific 44-point GBP rubric logic, your detailing-vertical specifics, examples of "good" recommendation phrasing, and the self-check questions that catch common failure modes.

Don't overdo it. This goes in the agent's system field — it's re-processed every session turn. Keep it tight. Skills (mounted as files) carry the deep methodology reference; the system prompt should be the executable playbook, not the textbook.

# ═══════════════════════════════════════════════════════════════
# SEO SENTINEL · SYSTEM PROMPT v1 DRAFT
# Author: Claude draft, 2026-04-19
# Review: SEO Lead + Senior TL before Day 12
# ═══════════════════════════════════════════════════════════════

# IDENTITY

You are SEO Sentinel, SEO Navigator's Local SEO Automation agent. You produce comprehensive, accurate, actionable local SEO audits for local service businesses — primarily automotive detailing shops, also roofing, HVAC, dentistry, and similar owner-operated local businesses.

You work for SEO Navigator, a boutique local SEO agency. Your outputs feed into human deliverables the agency sends to paying clients. Quality matters more than speed. A wrong recommendation loses the client's trust in the agency; a slow audit loses nothing.

# YOUR JOB ON EVERY RUN

On each run, you will receive a client handoff payload as JSON. You will:
1. Validate the payload has every required field. Halt if missing.
2. Execute five audit modules in sequence (detailed below).
3. Synthesize findings into a structured report.
4. Write two output files to /mnt/session/outputs/:
   - sentinel-audit-{client_id}.json (machine-readable, for downstream tools)
   - sentinel-audit-{client_id}.md (human-readable, for the SEO team)
5. Self-check the report before ending your turn. Revise if deficient.

You end your turn only after both files exist and pass self-check.

# METHODOLOGY YOU OPERATE FROM

Local SEO has three ranking signals in Google's local pack: proximity, relevance, and prominence. Your audits surface what the business can actually influence — relevance (on-page, categories, services) and prominence (citations, reviews, backlinks, authority). Proximity is fixed by the client's physical address; you note it only as context.

For on-page work, you apply Koray Tuğberk's semantic content network framework — pages are evaluated by source context fit, topical coverage, entity relationships, and query fan-out. See the koray-city-page-auditor skill mounted at /mnt/skills/ for detail. Don't re-derive it; reference it.

For GBP work, you apply SEO Navigator's 44-point rubric at /mnt/skills/gbp-44-point-rubric.json. Every point is binary or 0-10 scored per the rubric spec. Don't invent new criteria.

For AI visibility, the question is not "does the client rank in Google" but "does the client appear in generative answers to buyer-intent prompts." See the ai-visibility-audit skill.

# THE FIVE MODULES — YOUR PLAYBOOK

## Module 1: GBP Analyzer

What you produce: 44-point GBP score + competitor benchmark + top-5 gaps.

How you get the data:
- Run bash helpers/apify_run.sh gbp-scraper <client_gbp_url>
- For each of the 3 competitors in the payload, run the same script.
- Parse the JSON output each returns.

How you score:
- Load /mnt/skills/gbp-44-point-rubric.json.
- For each of the 44 points, compute the score from the Apify output.
- Sum weighted points → composite score 0-100.
- For each competitor, compute the same composite.
- Identify the top 5 gaps (rubric points where client underperforms the competitor average by the largest margin).

Good output characteristics:
- Specific: "Client lists 4 services; competitors average 11" — not "Services could be expanded."
- Actionable: each gap names the exact GBP field to update.
- Honest: if a rubric point couldn't be scored (data missing), mark null, don't guess.

## Module 2: On-Page Intelligence

What you produce: Audit of the client's homepage + top 5 service pages.

How you get the data:
- Run bash helpers/firecrawl_crawl.sh <client_website_url>
- Identify the 6 priority pages (homepage + top 5 service pages by internal link count).

What you check per page:
- Title tag (present, under 60 chars, contains primary keyword, unique)
- Meta description (present, under 160 chars, compelling CTA)
- H1 (single, matches page intent)
- Heading hierarchy (H2s nested under H1, no skipped levels)
- Schema.org markup (LocalBusiness, Service, FAQPage where relevant)
- Internal links (count, anchor text variety)
- Word count (thin < 300, thorough 800+, varies by page type)
- Image alt text coverage %
- Primary keyword in first 100 words

Good output characteristics:
- One table per page with all checks + pass/fail/flag for each.
- Site-wide patterns called out ("4 of 6 pages missing LocalBusiness schema" is more useful than noting it 4 times).

## Module 3: Geographic Intelligence

What you produce: Ranking heatmap data + visibility percentage per keyword + cluster analysis.

How you get the data:
- Run python helpers/geo_grid.py --address "{business_address}" --radius 5 --keywords <5 keywords from payload>
- Script generates a 13×13 grid (169 points) around the address at 5-mile radius, scrapes rank for each keyword at each point via Apify SERP actor, runs DBSCAN to cluster hot/cold zones.
- Output: GeoJSON file + summary JSON per keyword.

What you report:
- Per keyword: average rank, % of grid in top 3, % in top 10, % unranked.
- Hot zones (clusters where rank ≤ 3) and cold zones (clusters where rank > 10 or unranked) — plain-English description of each.
- Comparison to any existing baseline from SEO Utils MCP (see below).

Good output characteristics:
- Lead with the number, not the methodology. "12% top-3 coverage for 'ceramic coating near me'" is the insight.
- Always cross-reference baseline. If no baseline exists, note that this run establishes it.

## Module 4: Citation Intelligence

What you produce: NAP consistency scan across 40+ directories.

How you get the data:
- Ground truth NAP = client_name + business_address + contact_phone_primary from payload.
- Run bash helpers/apify_run.sh citation-scraper "{client_name}"
- Compare each directory listing's NAP to ground truth.

What you report:
- Directory matrix: directory name | listed | NAP match | mismatches.
- Priority fix list: start with top-tier directories (Apple Maps, Yelp, Facebook, Bing Places, Yellow Pages) and any where phone/address mismatch (hurts rankings more than a missing listing).

## Module 5: AI Visibility Tracker

What you produce: Client visibility score across ChatGPT, Gemini, Perplexity.

How you get the data:
- Derive 5 buyer-intent queries from client's priority services + city (example: "best ceramic coating in Columbus OH", "paint protection film installer near me"). Document the queries in your output.
- Run python helpers/ai_visibility.py --queries <queries.json>
- For each LLM × query combination, the script returns the response text.
- Parse each response. Record: does the client appear? Competitors mentioned? In what position?

What you report:
- Per-LLM matrix: query | client mentioned (Y/N) | position | competitors mentioned.
- Composite visibility score: (% of queries where client appears) × (inverse of average position when mentioned).
- Gap analysis: if competitors appear but client doesn't, note why (thin content, missing entity markup, missing review volume).

# INTERACTING WITH TOOLS

Bash tool: Your primary execution surface. Run helper scripts under /mnt/skills/helpers/. Capture stdout + stderr. Parse JSON output with jq. Write intermediate data to /mnt/session/outputs/raw/.

Read / write / edit tools: File operations. Use these for constructing the final JSON + Markdown outputs.

Web search + web fetch: Use sparingly and only for (a) verifying a competitor's current website content, (b) checking a local news source cited in an audit, (c) disambiguating a business name. Never use for core audit data — the helper scripts are the source of truth.

MCP — seo-utils: Query existing rank tracking baselines. Per the seo-utils-mcp-guide skill: use query_database on organic_rank_tracker_* tables — NOT the DataForSEO action tools unless the payload explicitly asks for fresh keyword research.

MCP — clickup: Post run progress + results to the triggering task comment. Update task status when called out by the orchestration layer.

MCP — slack: Post module completion updates to #seo-automation.

# OUTPUT CONTRACT

You write two files to /mnt/session/outputs/ before ending your turn.

File 1: sentinel-audit-{client_id}.json
{
  "client_id": "...",
  "client_name": "...",
  "audit_date": "ISO-8601 timestamp",
  "agent_version": "1.0.0",
  "overall_score": 0-100,
  "modules": {
    "gbp": { score, rubric_scores, competitor_benchmarks, top_5_gaps },
    "onpage": { pages_audited, per_page_findings, sitewide_patterns },
    "geographic": { keywords, per_keyword_metrics, heatmap_geojson_path },
    "citations": { directory_matrix, priority_fixes },
    "ai_visibility": { per_llm_matrix, visibility_score, gap_analysis }
  },
  "priority_recommendations": [
    { "priority": 1, "area": "...", "action": "...", "rationale": "...", "effort": "..." },
    ...
  ],
  "errors_encountered": [ ... ]  // empty if all modules succeeded
}

File 2: sentinel-audit-{client_id}.md
Structure:
1. Executive Summary — 3-5 bullets that fit on one screen. Lead with the number. "Overall score: 71/100. Biggest issue: 8 missing GBP services." Not: "This audit examines multiple dimensions of the client's local SEO..."
2. Score Dashboard — overall + per-module scores in a table.
3. Module 1: GBP — findings + top 5 gaps + competitor benchmark table.
4. Module 2: On-Page — per-page findings + sitewide patterns.
5. Module 3: Geographic Intelligence — keyword visibility + hot/cold zones.
6. Module 4: Citations — directory matrix + priority fixes.
7. Module 5: AI Visibility — per-LLM matrix + gap analysis.
8. Priority Recommendations — P1 (do this week), P2 (this month), P3 (someday). Each with: specific action, why it matters, rough effort estimate.
9. Appendix — queries used, raw data file paths, methodology notes.

# GUARDRAILS — WHAT YOU MUST NOT DO

- Never post content to the client's GBP, website, social channels, or any external system on their behalf. You are read-only.
- Never send emails on the client's or agency's behalf.
- Never modify client assets (GBP listings, website files, directory listings, etc.).
- Never use web_search or LLM API calls as a substitute for the helper scripts. The scripts are deterministic, reproducible, and accountable.
- Never fabricate data. If a scraper returns empty, flag that module with "status": "partial" and continue.
- Never pad the report. If a section has nothing worth saying, say so and move on.

# GUARDRAILS — WHAT YOU MUST DO

- Read the handoff payload first. Confirm every required field is present. If any are missing, halt and post a Slack message listing exactly what's missing, then stop.
- Write raw module outputs to /mnt/session/outputs/raw/ as you go. This lets a human re-inspect individual module data without re-running.
- After writing both output files, re-read them and self-check:
  · Are all 5 modules populated?
  · Are all priority_recommendations specific (contain concrete next steps, not "improve SEO")?
  · Is the overall_score internally consistent with module scores?
  · Does the Markdown executive summary match the JSON data?
  If self-check fails, revise once. If it still fails, mark the output with "self_check_failed": true and end your turn — the human will review.

# WHEN TO PROCEED AUTONOMOUSLY VS. WHEN TO HALT

Proceed autonomously:
- Payload complete, helper scripts returning expected data, MCP calls succeeding.
- Individual module fails — flag that module partial, continue with remaining four.

Halt and ask the human (post to Slack, stop):
- Payload missing required fields.
- 3+ modules failing (systemic issue — network, credentials, infrastructure).
- Competitor selection ambiguous: no 3 clear competitors identifiable within the client's service radius.
- Self-check fails after one revision attempt.

# YOUR TONE IN OUTPUTS

- Direct. Skip hedging language. State what's true.
- Specific. Numbers and named gaps, not vague descriptions.
- Prioritized. Every recommendation gets P1/P2/P3, not a flat list.
- Honest. If something couldn't be audited, say so explicitly. A short report with real findings beats a padded report with guesses.

Think of your reader as a senior SEO strategist reviewing your work in 20 minutes. They need to know what's wrong, what to fix first, and why, with enough evidence that they can defend the recommendations to the client. They do not need to know how you got there — that's the raw data in the appendix.

# END OF SYSTEM PROMPT v1 DRAFT
# Next: SEO Lead expands methodology sections, Senior TL validates
# output contract against existing deliverable templates.
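
The composite visibility score in Module 5 is stated in words only, so a worked version helps pin down one reading of it. This sketch assumes the mention rate is a fraction between 0 and 1 and that "position" is 1-based, which bounds the score between 0 and 1. The `position` field name is hypothetical, and the SEO Lead may prefer a different scaling.

```python
def visibility_score(results):
    """results: one dict per LLM x query pair; 'position' is 1-based, or None if absent."""
    if not results:
        return 0.0
    positions = [r["position"] for r in results if r["position"] is not None]
    if not positions:
        return 0.0  # the client never appeared
    mention_rate = len(positions) / len(results)
    average_position = sum(positions) / len(positions)
    return mention_rate * (1 / average_position)

# Made-up sample: mentioned in 2 of 4 responses, at positions 1 and 3.
sample = [{"position": 1}, {"position": 3}, {"position": None}, {"position": None}]
print(visibility_score(sample))  # 0.25
```

Under this reading a client mentioned first in every response scores 1.0, and the score falls both when mentions are missing and when they appear lower in the answer.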

✏️ SEO Lead iteration guide

When you edit this draft, focus your pass on:

Methodology depth (section "Methodology you operate from"). I kept it spare; your version should pull in the specific Koray concepts that matter most to detailing + service-area businesses.
Module scoring logic. Each module section has a "Good output characteristics" subsection — expand with 2–3 concrete examples from past audits of what "direct, specific, prioritized" looks like.
Self-check questions. I wrote 4. You probably have 8 more from the times audits have failed in review. Add them — they're the cheapest way to catch failure modes before the report reaches a human.
The competitor selection rule. I left it as "3 competitors from the payload." But who should those 3 be — top GBP-ranked? Closest geographic? Most similar service menu? State the rule explicitly.

Target word count after your pass: 1,500–2,500. If you go over 3,000, we have a skill-mounting problem — consider moving deep methodology into a skill file referenced from the prompt.
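
Some of the self-check questions are mechanical and could also run as a script over the JSON output, leaving the judgement-based checks to the agent. A minimal sketch follows, using the key names from the output contract in the draft prompt. The list of vague phrases is illustrative only, and the function name is hypothetical.

```python
REQUIRED_MODULES = ("gbp", "onpage", "geographic", "citations", "ai_visibility")
VAGUE_ACTIONS = {"improve seo", "optimize website", "fix issues"}  # illustrative only

def self_check(report: dict) -> list[str]:
    """Return a list of failure messages; an empty list means the report passes."""
    failures = []
    modules = report.get("modules", {})
    for name in REQUIRED_MODULES:
        if not modules.get(name):
            failures.append(f"module '{name}' is missing or empty")
    recommendations = report.get("priority_recommendations", [])
    if not recommendations:
        failures.append("no priority recommendations")
    for i, rec in enumerate(recommendations):
        action = rec.get("action", "").strip()
        if not action or action.lower() in VAGUE_ACTIONS:
            failures.append(f"recommendation {i} has no concrete action")
        if rec.get("priority") not in (1, 2, 3):
            failures.append(f"recommendation {i} lacks a P1/P2/P3 priority")
    score = report.get("overall_score")
    if not isinstance(score, (int, float)) or not 0 <= score <= 100:
        failures.append("overall_score must be a number between 0 and 100")
    return failures
```

Returning messages rather than a boolean means the same output can drive the agent's single revision pass and, if that fails, populate the note that accompanies `"self_check_failed": true`.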

⚠️ Verify token count before finalizing

Once SEO Lead finishes the system prompt and we mount the 5 skills, measure total system context tokens. If >80K tokens, this will affect cost per run significantly (the system prompt is re-read on every turn, though prompt caching reduces the cost of repeat reads while the cache entry is live; the default cache lifetime is 5 minutes). Action: Trung runs a dry session after the agent is created, captures usage.input_tokens from the first turn, reports back. If over budget, we split skills to Catalyst or trim the system prompt.

8 Environment definition The container template

An Environment is the container template — packages, networking, mounts. Reusable across sessions. Multiple sessions can share one environment but each session gets its own isolated container instance (filesystem state is NOT shared across sessions).

# Environment config for SEO Sentinel v1
{
  "name": "seonav-prod",
  "config": {
    "type": "cloud",
    "packages": {
      "pip": [
        "requests",        // HTTP for external APIs
        "pandas",          // tabular data handling
        "numpy",           // grid math
        "scikit-learn",    // DBSCAN for Geographic Intelligence
        "beautifulsoup4",  // on-page HTML parsing fallback
        "lxml",            // fast XML parsing for sitemaps
        "geopy"            // distance + geo utils for grid generation
      ],
      "apt": [
        "jq",              // JSON manipulation in bash
        "curl"             // should be default, but explicit is safer
      ]
    },
    "networking": {
      "type": "unrestricted"
      // v1 uses unrestricted for simplicity. Lock to "limited" in v2:
      // "type": "limited",
      // "allowed_hosts": [
      //   "api.apify.com", "api.firecrawl.dev",
      //   "api.openai.com", "generativelanguage.googleapis.com",
      //   "api.perplexity.ai", "maps.googleapis.com"
      // ],
      // "allow_mcp_servers": true,
      // "allow_package_managers": true
    }
  }
}
// Response includes: id (env_...)
// Packages are pre-installed before the agent starts. Cached across sessions.

Files to mount (via Files API upload, then reference in agent's `skills[]` or manually via bash)

File	Purpose	Owner
`koray-city-page-auditor.md`	Koray methodology reference	Senior TL pulls from existing skill
`ai-visibility-audit.md`	AI visibility scoring methodology	Senior TL
`seo-navigator-agency-os.md`	Agency methodology	Senior TL
`gbp-44-point-rubric.json`	Scoring matrix for GBP module	SEO Lead (Section 5)
`output-report-template.md`	Final audit report structure	SEO Lead (Section 5)
`handoff-schema.json`	Client context payload shape	PM (already defined in 90-day plan W1)
`helpers/apify_run.sh`	Generic Apify actor runner	Trung writes
`helpers/firecrawl_crawl.sh`	Firecrawl wrapper	Trung writes
`helpers/geo_grid.py`	Grid generator + DBSCAN clustering	Trung writes
`helpers/ai_visibility.py`	Multi-LLM visibility query runner	Trung writes
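
To make the scope of `helpers/geo_grid.py` concrete, here is a minimal sketch of the grid-generation step only. It treats the 5-mile "radius" as the half-width of a square grid and uses a flat 69-miles-per-degree approximation; both are assumptions. The function name and sample coordinates are hypothetical, and the rank scraping and DBSCAN clustering steps are not shown.

```python
from math import cos, radians

MILES_PER_DEGREE_LAT = 69.0  # approximation; adequate for a 5-mile grid

def generate_grid(center_lat, center_lng, radius_miles=5.0, size=13):
    """Return size x size (lat, lng) points covering a square of half-width radius_miles."""
    if size < 2:
        raise ValueError("size must be at least 2")
    miles_per_degree_lng = MILES_PER_DEGREE_LAT * cos(radians(center_lat))
    step = 2 * radius_miles / (size - 1)  # miles between neighbouring points
    points = []
    for row in range(size):
        north_offset = radius_miles - row * step  # rows run north to south
        for col in range(size):
            east_offset = -radius_miles + col * step  # columns run west to east
            points.append((
                round(center_lat + north_offset / MILES_PER_DEGREE_LAT, 6),
                round(center_lng + east_offset / miles_per_degree_lng, 6),
            ))
    return points

grid = generate_grid(39.9612, -82.9988)  # made-up centre point
print(len(grid))  # 169
```

Passing `size=7` yields the 49-point grid from the scope table. An odd size keeps the business address on the centre point, which is why both documented grid sizes are odd.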

💡 Why helpers as scripts, not MCP servers

Apify, Firecrawl, OpenAI, Gemini, Perplexity all have REST APIs. We could wrap them as custom MCP servers, but that's 5 more services Trung has to maintain. For v1, curl them from bash scripts inside the container. Faster to ship, easier to debug. Revisit if the pattern repeats across agents and the maintenance burden of ad-hoc scripts becomes larger than hosting an MCP.

9 Secrets & API credentials The verification-needed item

🛑 Verification required before Day 8

The Environment schema I've seen in the public-beta docs (packages, networking, type) does NOT show an explicit field for secrets or environment variables. I don't want to invent a mechanism that doesn't exist.

Trung action: Before Day 8, verify the secrets mechanism with Anthropic via one of:

Anthropic sales/support contact — confirm how API keys are passed to the container.
Test API call: attempt to include env_vars in the environment config payload and inspect error response. Schema validators typically give useful hints.
Search platform.claude.com/docs/en/managed-agents/environments full page (I may not have seen it all).

Plan A — if Managed Agents supports environment-level secrets

Store these on the environment (or wherever the API dictates):

APIFY_API_TOKEN
FIRECRAWL_API_KEY
OPENAI_API_KEY (for AI visibility module only — not the agent itself)
GEMINI_API_KEY
PERPLEXITY_API_KEY
GOOGLE_MAPS_API_KEY

Bash scripts read from $APIFY_API_TOKEN etc. Standard 12-factor app pattern.
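
For the Python helpers, the same pattern could look like the sketch below, assuming Plan A holds and the six variables are present in the container's environment. The `load_secrets` name is hypothetical. It fails fast and reports only the names of missing keys, never their values.

```python
import os

REQUIRED_KEYS = (
    "APIFY_API_TOKEN", "FIRECRAWL_API_KEY", "OPENAI_API_KEY",
    "GEMINI_API_KEY", "PERPLEXITY_API_KEY", "GOOGLE_MAPS_API_KEY",
)

def load_secrets(environ=os.environ):
    """Return the required keys from the environment, failing fast if any are absent."""
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        # Report names only; never echo values into logs.
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}
```

Calling this once at the top of each helper turns a missing key into one clear error at startup instead of an authentication failure halfway through a module.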

Plan B — fallback if no native secrets mechanism

Pass the keys as part of the initial user.message content from the orchestration script, bundled inside the handoff payload. The agent's first bash action is to export them into environment variables for the duration of the session. Trade-offs:

Keys appear in the session event log — visible to anyone with Console access to the session.
Keys live in memory of the session, which is isolated per session, so blast radius is limited to one session's container.
Rotation requires sending new keys per session instead of rotating once at the environment level.

Plan B is acceptable for v1 testing. Not acceptable for v2 production at scale. If Anthropic confirms no native secrets, we file a feature request with their sales team and commit to migrating once available.

Plan C — most conservative

Build a thin secrets-proxy MCP server that the agent calls to fetch credentials per request, hosted on the same VPS as the orchestration script. MCP auth tokens themselves become the "secret" passed into the agent. Over-engineered for v1; mentioning only for completeness.

🔑 Recommended for v1

Start with Plan A attempt → Plan B fallback if Plan A doesn't exist. Document the decision in ClickUp and revisit at end of Sprint 1. Never commit API keys to git.

10 MCP servers What the agent can talk to natively

MCP	URL	Purpose for Sentinel v1	Status
ClickUp	`https://mcp.clickup.com/mcp`	Read client task details + update task status. Post results to task comments.	Live
Slack	`https://mcp.slack.com/mcp`	Post audit results to #seo-automation channel.	Live
SEO Utils	`https://<tunnel>.cfargotunnel.com`	Read existing rank tracking data as baseline reference. Per `seo-utils-mcp-guide` skill, use `query_database` on `organic_rank_tracker_*` tables.	Depends on Mac Mini

SEO Utils MCP — the fragile one

⚠️ Single point of failure

SEO Utils runs locally on one Mac Mini, tunneled via cloudflared tunnel --url http://localhost:19515. If the Mac Mini reboots, loses network, or the tunnel process dies, Sentinel can still run modules 1–5 but loses the rank baseline reference. Action items for Trung:

Wrap the tunnel in a launchd service (the macOS counterpart to systemd) with auto-restart.
Add a health-check: simple curl to the tunnel URL every 5min from a monitoring script, Slack alert on 3 consecutive fails.
In the agent system prompt, instruct: "If SEO Utils MCP is unavailable, continue with modules 1–5, flag in output that baseline rank data is missing, do not fail the overall run."
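The health check in the action items could be sketched roughly as below. This is an illustration under assumptions: `makeFailureTracker`, `probe`, and `startMonitor` are hypothetical names, the alert hook is whatever posts to Slack, and treating any status below 500 as "reachable" rests on the assumption that Cloudflare answers with a 5xx when the tunnel or its origin is down (worth verifying). It needs Node 18+ for the global `fetch`.

```javascript
// Hypothetical monitor sketch; URL, interval, and alert hook are assumptions.
function makeFailureTracker(threshold = 3) {
  let consecutiveFails = 0;
  // Returns true exactly once per outage: when the threshold is first reached.
  return function record(ok) {
    if (ok) {
      consecutiveFails = 0;
      return false;
    }
    consecutiveFails += 1;
    return consecutiveFails === threshold;
  };
}

// Probe the tunnel; a network error, timeout, or 5xx status counts as a failure.
async function probe(url, timeoutMs = 10_000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.status < 500;
  } catch {
    return false;
  }
}

// Check every 5 minutes; call alert(message) on the 3rd consecutive failure.
function startMonitor(url, alert, intervalMs = 5 * 60 * 1000) {
  const record = makeFailureTracker(3);
  return setInterval(async () => {
    if (record(await probe(url))) await alert(`SEO Utils tunnel down: ${url}`);
  }, intervalMs);
}
```

Firing only when the counter first hits the threshold keeps a long outage from spamming the channel every 5 minutes; a successful probe resets the counter.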

Explicitly NOT used by Sentinel v1

Google Drive MCP — v1 writes outputs to /mnt/session/outputs/ and the orchestration script uploads to Drive. Agent doesn't touch Drive directly.
QueryMind MCP — that's for Content Catalyst, not Sentinel.
GoHighLevel MCP — Sentinel v1 doesn't touch CRM. That's Revenue Relay scope.
Figma MCP — not relevant.

11 Trigger mechanism & orchestration How the agent actually gets invoked

🔗 Orchestration is its own PRD — see the paired document

Because the orchestration layer is reusable across every future agent (Content Catalyst, Revenue Relay, Ad Arbitrage, Build Bot, PM Pulse), it has its own dedicated PRD: prd_orchestration_harness.html.

That PRD covers: the config-file pattern that lets new agents plug in without code changes, the session lifecycle state machine, SSE event handling with reconnect logic, HITL gate handling (stubbed for v1, real in Sprint 2 for PM Pulse), idempotency, logging, VPS deployment, and a 7-test integration suite.

This section (11) remains here as a Sentinel-specific summary so SEO team and Jake have enough context to understand what Sentinel depends on, without needing to read the full orchestration spec. The deep spec is Trung's domain.

v1 triggers (Trung to implement)

Manual CLI trigger (Day 14–18, for testing): Trung runs ./sentinel-run.sh <client_id> on the VPS. Script reads client payload from local JSON, creates session, streams SSE, prints output.
ClickUp webhook trigger (Day 22 onward): ClickUp task in the Local SEO Onboarding list changes status to "Ready (Automate)". Webhook hits orchestration script's /webhook/sentinel endpoint. Script resolves client from task's custom fields, creates session.
Slack slash command (optional stretch): /sentinel audit <client-name> in #seo-team. Useful for ad-hoc reruns.
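Since the webhook trigger can be redelivered or fired twice, here is a rough sketch of the guard the handler might apply before creating a session. `TriggerGuard` is a hypothetical name, and the state is held in memory purely for illustration; a real deployment would presumably persist it. It assumes the timestamp comes from the ClickUp event itself rather than the time of receipt, so a redelivered webhook yields the same key.

```javascript
// Hypothetical in-memory guard; a real deployment would persist this state.
class TriggerGuard {
  constructor() {
    this.seenKeys = new Set();      // idempotency keys already processed
    this.activeClients = new Set(); // clients with a running session
  }

  // Key is derived from the event itself so a redelivered webhook matches.
  static keyFor(taskId, eventTimestamp) {
    return `${taskId}:${eventTimestamp}`;
  }

  // Returns { ok: true } if a session may start, else { ok: false, reason }.
  tryAcquire(clientId, key) {
    if (this.seenKeys.has(key)) return { ok: false, reason: 'duplicate_event' };
    if (this.activeClients.has(clientId)) return { ok: false, reason: 'session_active' };
    this.seenKeys.add(key);
    this.activeClients.add(clientId);
    return { ok: true };
  }

  // Call when the session ends (success or failure) to free the client slot.
  release(clientId) {
    this.activeClients.delete(clientId);
  }
}
```

The handler would call `tryAcquire` before creating a session and `release` in a `finally` block once the run ends, so a crashed run does not block the client indefinitely.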

Orchestration script responsibilities

// orchestrator/sentinel.js: Node.js on VPS, v1 target
const Anthropic = require('@anthropic-ai/sdk');

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env

// Minimal timestamped logger
const log = (...args) => console.log(new Date().toISOString(), ...args);

// Assumes agent.message events use the same content-block shape as user.message
// (an array of { type: 'text', text }). Verify against the events docs.
const extractText = (event) =>
  (event.content ?? []).filter((b) => b.type === 'text').map((b) => b.text).join('\n');

async function runSentinel(clientPayload) {
  // 1. Create session
  const session = await client.beta.sessions.create({
    agent: process.env.SENTINEL_AGENT_ID,
    environment_id: process.env.SEONAV_ENV_ID,
    title: `Sentinel audit · ${clientPayload.client_name}`
  }, {
    headers: { 'anthropic-beta': 'managed-agents-2026-04-01' }
  });

  // 2. Send initial event BEFORE opening stream (critical!)
  await client.beta.sessions.events.create(session.id, {
    events: [{
      type: 'user.message',
      content: [{
        type: 'text',
        text: `Run a full Local SEO audit for this client.
Execute all 5 modules. Write structured output to
/mnt/session/outputs/sentinel-audit-${clientPayload.client_id}.json and .md.

CLIENT PAYLOAD:
${JSON.stringify(clientPayload, null, 2)}`
      }]
    }]
  });

  // 3. Open SSE stream, process events
  const stream = await client.beta.sessions.stream(session.id);
  for await (const event of stream) {
    switch (event.type) {
      case 'agent.message':
        log('Agent:', extractText(event));
        break;
      case 'agent.tool_use':
        log('Tool:', event.name);
        break;
      case 'session.status_idle':
        if (event.stop_reason?.type === 'end_turn') {
          await deliverOutputs(session.id, clientPayload);
          return;
        }
        if (event.stop_reason?.type === 'requires_action') {
          // v1 has no custom tools or confirmation gates, so this shouldn't happen.
          // If it does, log and alert.
          log('Unexpected requires_action', event);
        }
        break;
    }
  }
  // Stream closed without an end_turn: surface it instead of returning silently.
  log('Stream ended before end_turn', session.id);
}

async function deliverOutputs(sessionId, payload) {
  // Fetch files written by the agent
  const files = await client.beta.files.list({
    scope_id: sessionId
  }, {
    headers: { 'anthropic-beta': 'files-api-2025-04-14,managed-agents-2026-04-01' }
  });
  // Download, upload to Drive, post to Slack, update ClickUp
  // ... (implementation per normal Drive/Slack/ClickUp APIs)
}

Deliverable routing

JSON audit → Google Drive folder /Clients/{client_name}/Audits/sentinel-audit-{timestamp}.json
Markdown audit → Same folder, .md extension
Summary (3 bullets) → Slack #seo-automation channel + ClickUp task comment
Grid heatmap data → Separate Drive file for client-facing deliverable rendering
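The routing rules above reduce to a small path builder. A sketch, with hypothetical names (`deliverablePaths`, `fileTimestamp`) and an assumed timestamp format, since the PRD only says `{timestamp}`:

```javascript
// Hypothetical path builder for the Drive routing rules.
function deliverablePaths(clientName, timestamp) {
  const folder = `/Clients/${clientName}/Audits`;
  const base = `${folder}/sentinel-audit-${timestamp}`;
  return { folder, json: `${base}.json`, markdown: `${base}.md` };
}

// Filename-safe UTC timestamp such as 2026-04-19T08-30-00Z (format is an assumption).
function fileTimestamp(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, 'Z').replace(/:/g, '-');
}
```

Client names containing `/` would need sanitizing before use; that is left out here.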

12 Test plan — T1 & T2 Two gates before declaring done

Test	When	Input	Pass criteria	Owner
T1 — synthetic client	Day 18	Fake client: "SN Test Detailing", Ho Chi Minh City, generic detailing service menu, 3 synthetic competitors. Safe to break.	All 5 modules execute without fatal error. Output file exists at correct path, valid JSON, covers all sections per output template. Session cost <$3. Run time <45 min.	Trung (execute) → SEO Lead (validate)
T2 — real low-risk client	Day 22	Shamrock Detailing Columbus OH or Procam Detailing Bullhead City AZ (per Jake's pick in Section 4). Real GBP, real competitors, real rank data.	All T1 criteria +: SEO Lead side-by-side review vs. prior manual audit scores ≥ 80% coverage + 0 factual errors + useful recommendations. Output passes "would I send this to the client after light editing?" test.	Trung (execute) → SEO Lead (validate)

🧪 Running T1 three times, not once

Per the 90-day plan exit criteria: T1 must pass 3 consecutive clean runs before we move on. A single clean run could be luck. Three rules out most randomness. If run 3 fails but runs 1–2 passed, we don't promote — we investigate why it failed.
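The mechanical half of the T1 gate (valid JSON, all sections present, cost and runtime limits, three clean runs in a row) can be checked by script before the SEO Lead looks at quality. A sketch under assumptions: the section keys below are placeholders until the output template exists, and `checkT1Run` / `passesT1Gate` are hypothetical names.

```javascript
// Hypothetical section keys; the real ones come from the SEO team's output template.
const REQUIRED_SECTIONS = [
  'gbp_analyzer',
  'on_page_intelligence',
  'geographic_intelligence',
  'citation_intelligence',
  'ai_visibility_tracker',
];

// Check one run against the mechanical T1 criteria. Returns a list of failures.
function checkT1Run({ outputText, costUsd, runtimeMin }) {
  const failures = [];
  let audit;
  try {
    audit = JSON.parse(outputText);
  } catch {
    failures.push('output is not valid JSON');
  }
  if (audit !== undefined) {
    if (audit === null || typeof audit !== 'object') {
      failures.push('output is not a JSON object');
    } else {
      for (const key of REQUIRED_SECTIONS) {
        if (!(key in audit)) failures.push(`missing section: ${key}`);
      }
    }
  }
  if (!(costUsd < 3)) failures.push(`cost ${costUsd} USD is not under 3`);
  if (!(runtimeMin < 45)) failures.push(`runtime ${runtimeMin} min is not under 45`);
  return failures;
}

// The gate opens only when the three most recent runs all passed.
function passesT1Gate(runs) {
  const lastThree = runs.slice(-3);
  return lastThree.length === 3 && lastThree.every((r) => checkT1Run(r).length === 0);
}
```

Because the gate looks at the latest three runs, a failure on run 3 after two passes keeps it closed until three new clean runs accumulate.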

13 Deployment sequence — day by day What happens when

Day	Owner	Milestone	Exit check
1–3	Jake	Section 4 items 1–5 complete	Budget approved, test clients selected, DoD signed
1–7	SEO Lead + TL	Section 5 items 1–5 complete	System prompt draft + 44-point rubric + keyword logic + output template all exist
8	Trung	Prereqs verified	Test API call to `/v1/agents` returns 200. Tunnel uptime confirmed.
8	Trung	Secrets plan decided (Section 9)	Plan A confirmed working OR Plan B locked in. Documented in ClickUp.
9	Trung	Environment `env_seonav_prod` created	`env_…` ID captured. Dry session spawns successfully + packages installed.
10–11	Trung	Files uploaded to environment mount	All skill files, rubrics, helper scripts exist in container at known paths.
12–13	Trung + SEO Lead	Agent `SEO Sentinel` created + v1 system prompt in place	`agent_…` ID captured. Token count verified <80K on first turn.
14	Trung	MCP servers configured + tested	Agent can read a known ClickUp task, can post to Slack, can query SEO Utils.
15–17	Trung	Orchestration script v1 working (manual CLI mode)	Manual `./sentinel-run.sh` completes end-to-end on synthetic data.
18	Trung → SEO Lead	T1 — 3 clean runs on synthetic client	3 consecutive passes. Output validated by SEO Lead.
19–21	Trung	ClickUp webhook integration	Task status change triggers session. Slack + ClickUp postback works.
22	Trung → SEO Lead → Jake	T2 — real client run	SEO Lead sign-off. Jake notified. Sprint 1 exit complete for this agent.

14 Open questions & verification items The things I don't know — don't guess

These are explicit "I don't know, let's verify at the source" items. None should block Day 1 work, but all should be answered before T2.

Secrets mechanism for Managed Agents environments. (See Section 9.) Path: ask Anthropic support or inspect the Environment API response schema for env_vars / secrets fields. Blocker for: clean secret handling. Workaround: Plan B (pass in user.message).
Is idle time while waiting for Apify runs billed as session-hour? Apify GBP scraper can take 60–90s per run. If we're running 3 competitors × 5 keywords of grid data, that's several minutes of the agent waiting on Apify. Does billing accrue during this wait? The skill says "idle waiting for human input doesn't bill" but waiting on external APIs may differ. Blocker for: accurate cost estimate. Workaround: measure actual cost on T1 runs, update estimate.
Maximum session duration cap. Is there a hard kill at 2h, 6h, 24h? Full Sentinel run should be ~20-30min, so well under any reasonable cap, but worth knowing for future. Non-blocker.
Container specs (CPU/memory/disk). DBSCAN clustering on a 13×13 grid × 5 keywords = 845 points. Trivial on any modern machine, but worth confirming container has at least 512MB memory for numpy/pandas operations. Non-blocker, likely fine.
Data residency. Docs mention inference_geo parameter — does it apply to Managed Agents? Some detailing clients might require US-only inference. Non-blocker for first 3 test clients.
MCP call billing against session tokens. When Sentinel calls ClickUp MCP or SEO Utils MCP, are the MCP response tokens counted against the agent's input tokens? Almost certainly yes, but worth measuring how many tokens a typical MCP response adds on top of the system prompt. Non-blocker.
The `skills` field in agent config — how does progressive disclosure actually work? The docs describe it, but I haven't seen the exact Files API upload flow in the excerpts I have. Trung should test uploading one skill file first, confirm it appears mounted in the container, before uploading all 5. Blocker for: skill mounting. Low risk, just test early.
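To back up the "trivial on any modern machine" claim in the container-specs question, here is a sketch of the grid arithmetic. The center coordinates, the 1 km spacing, and the `buildGrid` name are all illustrative assumptions; the real module would presumably do this in Python with numpy inside the container.

```javascript
// Hypothetical grid builder; spacing and center are illustrative inputs.
const KM_PER_DEG_LAT = 111.32; // approximate length of one degree of latitude

function buildGrid(center, size = 13, spacingKm = 1) {
  const half = (size - 1) / 2;
  // A degree of longitude shrinks with the cosine of the latitude.
  const kmPerDegLng = KM_PER_DEG_LAT * Math.cos((center.lat * Math.PI) / 180);
  const points = [];
  for (let row = 0; row < size; row++) {
    for (let col = 0; col < size; col++) {
      points.push({
        lat: center.lat + ((half - row) * spacingKm) / KM_PER_DEG_LAT,
        lng: center.lng + ((col - half) * spacingKm) / kmPerDegLng,
      });
    }
  }
  return points;
}

const grid = buildGrid({ lat: 39.9612, lng: -82.9988 }); // Columbus, OH (approximate)
const totalPoints = grid.length * 5;      // 5 keywords
const approxBytes = totalPoints * 2 * 8;  // two float64 coordinates per point
console.log(grid.length, totalPoints, approxBytes); // 169 845 13520
```

The raw coordinates come to roughly 13.5 KB, so any memory pressure would most likely come from loading numpy, pandas, and scikit-learn rather than from the data.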

15 Risks & mitigations What could break the ship date

Risk	Severity	Mitigation
Secrets mechanism isn't what I assumed; Plan B leaks keys into session logs	High	Verify Day 8. If Plan B is only option, restrict Claude Console access to Jake + Trung only for now. Rotate keys monthly.
System prompt + skill mount is too large (>80K tokens)	High	Measure token count on first dry run. If over, trim by: splitting skills between Sentinel and future Catalyst, removing agency-os skill from Sentinel (not needed for read-only audits).
SEO Utils cloudflared tunnel down during a run	Medium	Agent gracefully degrades — flags missing baseline data in output, completes modules 1–5 anyway. Health check + Slack alert on tunnel drop.
Apify or Firecrawl rate-limits us during a run	Medium	Upgrade to paid tier that matches our run frequency. Retry logic in bash helpers (3 retries with exponential backoff). Flag in output if a module partially failed.
SEO Lead's system prompt doesn't ship in time for Day 12	Medium	Trung uses placeholder system prompt to unblock infrastructure testing. Real system prompt is a PATCH (version bump) on the agent, doesn't require recreation.
Output quality is inconsistent across runs	Medium	Baseline against reference audit in T1. If inconsistent, tighten system prompt with more explicit output contract (exact section headers, exact field names).
Orchestration script is buggy, sessions leak or double-trigger	Low-Med	Idempotency key in webhook handler (ClickUp task ID + the timestamp of the status-change event, not the time of receipt, so a redelivered webhook produces the same key). Max 1 active Sentinel session per client at a time, enforced client-side.
Cost overruns — retry loops inflate token spend	Low-Med	Session cost alert: Slack ping if any single session >$5. Daily total alert if >$30.
Container lacks a Python package we need	Low	Agent can `pip install` inside the container at runtime. Slows first run but fixes itself. Add to environment packages list on next update.
Google Maps API quota exceeded during grid generation	Low	Cache geocoded addresses in a local SQLite at `/mnt/session/`. Budget: 2500 free requests/day; grid of 169 points × few clients/day stays well under.
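The "3 retries with exponential backoff" mitigation for Apify and Firecrawl rate limits is planned for the bash helpers; the same policy is sketched here in JavaScript for illustration only. `backoffDelays` and `withRetries` are hypothetical names, and the 1-second base delay is an assumption.

```javascript
// Delay before each retry: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelays(retries, baseMs) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run fn once, then retry up to `retries` times, sleeping between attempts.
// Rethrows the last error if every attempt fails.
async function withRetries(fn, { retries = 3, baseMs = 1000, sleep = defaultSleep } = {}) {
  const delays = backoffDelays(retries, baseMs);
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

With the defaults this waits 1 s, 2 s, then 4 s, so a module gives up after about 7 seconds of backoff plus request time. The `sleep` parameter is injectable so the policy can be tested without real delays.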

16 Cost model What this actually costs per run

🧮 These are estimates

Real numbers come after 3–5 T1 runs. These are prior-belief starting points, not commitments. Update the model with measured numbers after Day 18.

Cost component	Est. per run	Assumptions
Claude tokens — input	~$0.30	~80K input tokens (system + skills + handoff payload), mostly cached after first minute. Cache reads at 10% of base rate.
Claude tokens — output	~$0.45	~30K output tokens (audit report content + module narrations)
Session-hour	~$0.035	25 min active runtime × $0.08/hr. Actual may be lower if idle-on-external-API doesn't bill.
Web search (built-in tool)	~$0.10	~10 searches × $0.01 each ($10/1000)
Apify runs	~$0.40	GBP scraper + citation scraper + SERP scraper. Free tier likely covers testing.
Firecrawl	~$0.05	Single site crawl, ~20 pages
Google Maps API	~$0.00	Under free tier (2500 req/day)
OpenAI (ChatGPT for AI visibility)	~$0.05	5 queries × ~500 output tokens
Gemini API	~$0.00	Free tier
Perplexity API	~$0.05	5 queries
Total per run:		~$1.45–1.90

At 8 new clients onboarded per quarter × ~$1.75/run = ~$14/quarter production cost. Plus ongoing external API subscriptions (~$80/mo) which are fixed costs that would exist anyway.
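For anyone updating the model after Day 18, the arithmetic behind the table can be reproduced as below. The figures are copied from the table; the property names are made up for this sketch, and amounts are kept in tenths of a cent so the sum is exact.

```javascript
// Per-run estimates from the table, in tenths of a cent to avoid float drift.
const COST_MILLS = {
  claudeInput: 300,
  claudeOutput: 450,
  sessionHour: 35,
  webSearch: 100,
  apify: 400,
  firecrawl: 50,
  googleMaps: 0,
  openai: 50,
  gemini: 0,
  perplexity: 50,
};

const totalMills = Object.values(COST_MILLS).reduce((sum, v) => sum + v, 0);
const lowEstimateUsd = totalMills / 1000;

// Session-hour line item: 25 minutes at $0.08 per hour.
const sessionHourUsd = (25 / 60) * 0.08;

// Quarterly production cost: 8 onboardings at the ~$1.75 per-run figure.
const quarterlyUsd = 8 * 1.75;

console.log(lowEstimateUsd, sessionHourUsd.toFixed(3), quarterlyUsd); // 1.435 0.033 14
```

The line items sum to about $1.44, which matches the low end of the quoted range; the upper end is presumably headroom for variance rather than a sum of listed items. The session-hour item works out to about $0.033, slightly under the table's rounded ~$0.035.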

17 Definition of Done Jake signs this off before work starts

✅ v1 ships when ALL of the following are true

Agent exists. agent_… ID for "SEO Sentinel" is recorded in ClickUp and in version control.
Environment exists. env_seonav_prod created, packages installed, files mounted, reusable.
Secrets path decided. Either Plan A confirmed working or Plan B explicitly accepted by Jake, documented in ClickUp.
Orchestration script runs on VPS. Can be triggered via manual CLI or ClickUp webhook. Posts deliverables to Slack + ClickUp.
T1 passes 3 consecutive clean runs on synthetic client. SEO Lead validated output.
T2 passes once on real low-risk client. SEO Lead would send output to client with light editing only.
Cost per run measured and within $3 per run (2x buffer over estimate).
Runbook exists in ClickUp. Trung documents: how to retrigger a run, how to check logs, how to rotate API keys, who to escalate to.
Known issues logged. Any workarounds, rubric tuning needs, system prompt iterations captured for v2 planning.
Jake reviews T2 output and signs off. Final gate. Green-light to use on next real client onboarding.

18 RACI matrix Who does what, who signs off

R = Responsible (does the work) · A = Accountable (signs off) · C = Consulted · I = Informed

Task	Jake	Trung	SEO Lead	Senior TL	PM
Overall v1 ship decision	A	C	C	I	I
Budget approval (Claude + external APIs)	R	C
Research Preview application	R	I
Test client selection	A		R
System prompt authoring	A	C	R	C
44-point GBP rubric			R	C
Output report template	A		R	R
Handoff schema	C	C	C		R
Prerequisites verification (Day 8)		R
Secrets mechanism verification	A	R
Environment creation		R
File mount (skills + rubrics + helpers)		R	C	R
Agent creation + version management		R
MCP wiring (ClickUp, Slack, SEO Utils)		R
Helper scripts (apify, firecrawl, geo_grid, ai_visibility)		R	C
Orchestration script (Node.js on VPS)		R
ClickUp webhook integration		R			C
T1 execution		R	C
T1 validation (output quality)			R	C
T2 execution on real client	I	R	C
T2 validation + final sign-off	A	I	R
Runbook documentation	I	R			C
Cost monitoring + alerting	I	R

19 Appendix — Full curl reference Copy-pasteable for Trung

A1. Environment creation

# Day 9: create shared environment
ENV_ID=$(curl -fsSL https://api.anthropic.com/v1/environments \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{
    "name": "seonav-prod",
    "config": {
      "type": "cloud",
      "packages": {
        "pip": ["requests","pandas","numpy","scikit-learn","beautifulsoup4","lxml","geopy"],
        "apt": ["jq"]
      },
      "networking": {"type": "unrestricted"}
    }
  }' | jq -r '.id')
echo "ENV_ID=$ENV_ID"  # save to .env file

A2. Agent creation (minimal — iterate later)

# Day 12: create agent (version 1, minimal scope)
SENTINEL_ID=$(curl -fsSL https://api.anthropic.com/v1/agents \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d @sentinel-agent-config.json | jq -r '.id')
echo "SENTINEL_ID=$SENTINEL_ID"

A3. Manual test session

# Day 15: first dry run
SESSION=$(curl -fsSL https://api.anthropic.com/v1/sessions \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d "{\"agent\":\"$SENTINEL_ID\",\"environment_id\":\"$ENV_ID\",\"title\":\"Sentinel T1 dry run\"}")
SESSION_ID=$(jq -r '.id' <<<"$SESSION")

# Send the initial event BEFORE opening the stream
curl -sS https://api.anthropic.com/v1/sessions/$SESSION_ID/events \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d @sentinel-kickoff-payload.json

# Open SSE stream
curl -sS -N https://api.anthropic.com/v1/sessions/$SESSION_ID/stream \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "Accept: text/event-stream"

A4. Fetch session outputs

# After session.status_idle with stop_reason end_turn
curl -fsSL "https://api.anthropic.com/v1/files?scope_id=$SESSION_ID" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14,managed-agents-2026-04-01"

# Download a specific file
curl -fsSL "https://api.anthropic.com/v1/files/$FILE_ID/content" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14" \
  -o sentinel-audit.json

A5. Update agent (versioned)

# To update system prompt, pass current version number
curl -fsSL "https://api.anthropic.com/v1/agents/$SENTINEL_ID" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -X PATCH \
  -d '{"version": 1, "system": "<updated system prompt>"}'

# Response version will increment to 2. Existing sessions keep running with v1.
# New sessions use v2.

📚 Primary references

Managed Agents overview
Quickstart
Agent setup
Environments
Events & streaming
Uploaded SEO Navigator docs: Ops Blueprint, Managed Agent Swarm Roadmap, Architecture diagram
Existing skills: managed-agents-doc, seo-navigator-agency-os, seo-utils-mcp-guide