Related: For the detailed Managed Agents deployment plan (89 tasks, PRD specs, skills migration), see Managed Agent Deployment · Action Plan · This doc = full 7-gap transformation · That doc = G1 Agent Fleet deep-dive
Execution Plan — April 2026

90-Day Transformation Execution Plan

7 gaps × 13 actions + 7 workstreams → sequenced into weekly sprints with time allocations. Target: agent fleet operational by July 2026.

1. 90-Day Execution Plan
2. Workstreams × Gap Correlation
3. Managed Agents Deployment
4. UAT Automation

Strategic Advice

You own 24 of 62 tasks but most of your work is decisions and reviews, not execution. Total Jake commitment: ~57 hours across 12 weeks (~4.5 hrs/week). Total delegated: ~117 hours across your team.

  • Gaps 2, 5, and 6 run in parallel — different owners (you, SEO Lead, PM). Assign Day 1 and they execute independently.
  • Your highest-leverage hours are system prompts and role cards. Nobody else can design agent architectures or redefine roles.
  • Use Claude to draft, you to decide. SOPs, role cards, one-pagers — Claude drafts in 20 min, you review in 15 min.
  • Don't touch G4 Hybrid OS until agents are running. Sprint 2 at earliest. Beta enrollment is a phone call, not a project.
  • Ship the positioning video last. You need case studies and agent results before you have a story worth telling.
S1
Sprint 1 — Days 1–30
"Structure the Knowledge, Redefine the Roles"
Jake: ~24 hrs · Team: ~55 hrs · Gaps: G2 Knowledge Base (HIGH) · G3 Role Redefinition (HIGH) · G5 AEO Layer (MED) · G6 Velocity (MED)
Week 1: Audit, Schema, Role Cards
G2 Knowledge Base

SOP Audit + Top 10 Selection + Conversion Schema

Jake: 3h · with Senior Team Lead
Pull full SOP inventory from Google Docs, ClickUp, Skills folder. Rank by execution frequency. Select top 10 for conversion. Design structured markdown schema (input/output fields, decision trees, QA gates). Brief Senior Team Lead on schema so they start converting independently.
Output: SOP inventory · Top 10 ranked · Markdown schema template · Senior TL briefed
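One way to picture the structured markdown schema is as a small data model that renders to markdown. The sketch below is illustrative only: the class name `SopSchema`, its field names, and the sample SOP are hypothetical and are not taken from the actual template.

```python
from dataclasses import dataclass, field


@dataclass
class SopSchema:
    """Hypothetical container for one converted SOP (field names are assumptions)."""
    title: str
    inputs: list[str]
    outputs: list[str]
    decision_tree: list[tuple[str, str]] = field(default_factory=list)  # (condition, action)
    qa_gates: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the SOP as structured markdown with one section per schema part."""
        lines = [f"# {self.title}", "", "## Inputs"]
        lines += [f"- {item}" for item in self.inputs]
        lines += ["", "## Outputs"]
        lines += [f"- {item}" for item in self.outputs]
        lines += ["", "## Decision Tree"]
        lines += [f"- IF {cond} THEN {action}" for cond, action in self.decision_tree]
        lines += ["", "## QA Gates"]
        lines += [f"- [ ] {gate}" for gate in self.qa_gates]
        return "\n".join(lines)


# Made-up example SOP, only to show the rendered shape.
example_sop = SopSchema(
    title="Monthly SEO Report (example)",
    inputs=["Client name", "Reporting month"],
    outputs=["Report draft"],
    decision_tree=[("traffic dropped month over month", "flag for SEO Lead review")],
    qa_gates=["Human sign-off before client delivery"],
)
sop_markdown = example_sop.to_markdown()
print(sop_markdown)
```

Keeping inputs, outputs, decision branches, and QA gates as separate fields means every converted SOP has the same sections in the same order, which makes the later agent validation easier to compare across SOPs.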
G3 Role Redef.

Role Card Template + All 9 Role Cards

Jake: 5h · Solo (Claude-assisted)
Design one-page role card template: agents managed, human judgment calls, QA responsibilities, escalation triggers. Draft all 9 role cards — Claude drafts each in ~15 min, you review ~15 min. Roles: SEO Lead, Content Lead, PM/Account Strategist, IT Lead, CRM/Ads Specialist, Web Designer, Developer, Senior Team Lead, Content Writer.
Output: Role card template · All 9 role cards drafted
G3 Role Redef.

Workshop Deck Preparation

Jake: 1.5h
Build Agent Manager shift workshop deck. Structure: why shifting → what changes per role → demo one agent workflow (SEO reporting) → Q&A. Schedule for Week 2.
Output: Workshop deck · Session scheduled
Delegated — Parallel Week 1
SEO Lead · 3h · Map Semantic SEO Production workflow. Identify AEO gate insertion point. (G5 AEO Layer)
PM / Acct Strategist · 4h · Define velocity metrics. Create ClickUp custom fields for tracking. (G6 Velocity)
Week 2: Workshop + Review Parallel Tracks
G3 Role Redef.

Team Workshop: Agent Manager Shift

Jake: 2h · All Team Members
Deliver workshop. Present shift, walk through role cards, demo one agent workflow. Record session for future onboarding.
Output: Workshop delivered + recorded
G3 Role Redef.

1:1 Role Validation Check-ins (5 of 9)

Jake: 2h · 5 team members × ~20 min
Validate each person articulates: which agents they manage, which decisions remain human, QA responsibilities. Adjust cards from feedback.
Output: 5 role validations done · Refinements noted
G5 AEO + G6 Velocity

AEO Gate + Velocity Schema Approval

Jake: 2h · SEO Lead + PM
Review SEO Lead's AEO insertion point — approve or redirect. Co-author updated Pillar 3 SOP. Review PM's velocity definitions and ClickUp fields — approve schema. Green-light baseline measurement.
Output: AEO gate approved · Velocity schema approved · Both teams proceed independently
Delegated — Parallel Week 2
Senior Team Lead · 12h · Convert SOPs 1–5 to structured markdown using Week 1 schema. (G2 Knowledge Base)
SEO Lead · 4h · Update Pillar 3 SOP with AEO step. Add AI visibility score to monthly report template. (G5 AEO Layer)
PM / Acct Strategist · 6h · Measure velocity baseline for clients 1–8. Backfill historical data. (G6 Velocity)
Week 3: SOP Validation + Remaining Check-ins
G2 Knowledge Base

Validate Converted SOPs 1–5

Jake: 2.5h · with Pillar Leads
Run each of the first 5 converted SOPs through Claude agent on test client scenario. Score accuracy (≥90% target). Flag revisions back to Senior Team Lead. ~30 min per SOP.
Output: 5 SOPs validated or sent back
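Scoring against the ≥90% target can be as simple as a pass ratio over a checklist of expected steps. This is a minimal sketch: the plan does not say how accuracy is measured, so the checklist approach, the `score_sop` helper, and the sample run are all assumptions.

```python
ACCURACY_TARGET = 0.90  # from the plan: >=90% target


def score_sop(check_results: list[bool]) -> tuple[float, bool]:
    """Return (accuracy, passed) for one SOP test run.

    check_results holds one True/False per expected step the agent got right.
    """
    if not check_results:
        raise ValueError("no checks recorded")
    accuracy = sum(check_results) / len(check_results)
    return accuracy, accuracy >= ACCURACY_TARGET


# Hypothetical run: 19 of 20 expected steps were correct.
accuracy, passed = score_sop([True] * 19 + [False])
print(f"{accuracy:.0%} -> {'validated' if passed else 'send back to Senior Team Lead'}")
```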
G3 Role Redef.

Remaining 4 Role Validation Check-ins

Jake: 1.5h · 4 team members × 20 min
Complete remaining validations. Finalize all 9 role cards. Hand off to PM to update ClickUp role assignments.
Output: All 9 validations done · PM updates ClickUp · G3 Role Redefinition closed
Delegated — Parallel Week 3
Senior Team Lead · 12h · Convert SOPs 6–10. Apply revision notes from 1–5 validation. (G2 Knowledge Base)
PM / Acct Strategist · 6h · Velocity baseline clients 9–15. Build ClickUp velocity dashboard. (G6 Velocity)
SEO Lead · 4h · Train team on consensus audit Skill. Run AEO audit on all new content for validation week. (G5 AEO Layer)
PM / Acct Strategist · 2h · Update ClickUp role assignments for Agent Manager definitions. (G3 Role Redefinition)
Week 4: Sprint 1 Close + Architecture Sketch
G2 Knowledge Base

Validate SOPs 6–10 + Index All in ClickUp

Jake: 2.5h · with Pillar Leads
Same validation for SOPs 6–10. Once all pass, PM indexes in ClickUp SOP Library.
Output: All 10 SOPs validated + indexed · G2 Knowledge Base closed
G1 Agent Fleet

Agent Fleet Architecture Sketch (Sprint 2 Prep)

Jake: 2h
Map 6 pillars to agents: name, scope, Skills to wire, I/O schema outline, trigger conditions. Architecture sketch — detailed system prompts in Sprint 2.
Output: 6-agent architecture map · Sprint 1 scorecard reviewed
Delegated — Week 4
PM / Acct Strategist · 2h · Index all 10 knowledge bases in ClickUp SOP Library. (G2 Knowledge Base)
PM / Acct Strategist · 1h · Set velocity reduction targets: 50% within 60 days. (G6 Velocity)
SEO Lead · 2h · Confirm 100% of new content passed AEO audit for full week. (G5 AEO Layer closed)
S2
Sprint 2 — Days 31–60
"Deploy the Agent Fleet, Test Productization"
Jake: ~22 hrs · Team: ~40 hrs · Gaps: G1 Agent Fleet (HIGH) · G4 Hybrid OS (MED-HIGH) · G7 Outbound + Brand (MED)
Week 5: Agent System Prompts (All 6)
G1 Agent Fleet

System Prompts + I/O Schemas: All 6 Pillar Agents

Jake: 6h · ~1h per agent with Pillar Lead
Per agent (~60 min): draft system prompt with Claude (~20 min), refine with Pillar Lead (~15 min), wire to Skills (~15 min), define triggers + QA handoff.

1 Market Intel → Keyword Architect, AI Visibility · 2 GBP → Agency OS, AI Visibility · 3 Organic → Consensus Audit, Content Audit · 4 Ads → Copy Lab, Campaign Builder, Radius Optimizer · 5 Web → AI Carousel, AI Visibility · 6 CRM → GHL API, HL Assistant, Lead Engine
Output: All 6 system prompts · Skills wired · Triggers + QA handoff defined
Delegated — Week 5
IT Lead · 4h · Technical wiring: connect prompts to Skills infrastructure, test API connections. (G1 Agent Fleet)
PM / Acct Strategist · 2h · Set up ClickUp Agent Registry structure. (G1 Agent Fleet)
Week 6: Agent Testing + Outbound Flywheel
G1 Agent Fleet

Test 6 Agents on Client Scenarios (Round 1)

Jake: 3h · with Pillar Leads
1 test per agent (6 total, ~30 min each). Real client per pillar. Score output quality, accuracy, time savings. Flag prompt refinements. Pillar Leads run second round independently.
Output: 6 test runs scored · Refinement notes
G7 Outbound

Outbound → Content Flywheel + Case Study Agent

Jake: 3h · Content Lead + IT Lead
Design n8n workflow: CRM closed deal → case study agent → content queue → social scheduler. Build case study agent — system prompt + input schema + output format. Test on 1 recent win. IT Lead wires n8n after.
Output: Workflow designed · Case study agent built + tested
Delegated — Week 6
Each Pillar Lead · 6h total · Run second test scenario per agent independently. Document results. (G1 Agent Fleet)
IT Lead · 6h · Wire n8n workflow end-to-end. (G7 Outbound + Brand)
Week 7: Hybrid OS + Agent Registry
G4 Hybrid OS

Package Hybrid OS + Draft One-Pager

Jake: 3h · IT Lead + Content Lead
Define package: GHL Snapshot components, agents included, dashboard scope, strategy call format, community structure. Draft one-pager with Claude — positioning, inclusions, $1,500–$2,000/mo pricing.
Output: Hybrid OS spec · One-pager draft
G1 Agent Fleet

Review All 12 Tests + Refine + Registry

Jake: 3h · Pillar Leads + PM
Review all 12 test results: the 6 from Round 1 plus the Pillar Leads' 6 independent runs. Fix the 2–3 prompts that need refinement. Document all 6 agents in ClickUp Agent Registry.
Output: 12 tests reviewed · Prompts refined · Agent Registry v1 live
Delegated — Week 7
IT Lead · 5h · Build GHL Snapshot for Hybrid OS. (G4 Hybrid OS)
Content Lead · 2h · Polish one-pager. (G4 Hybrid OS)
PM / Acct Strategist · 3h · Document Agent Registry. Build beta onboarding checklist. (G1 Agent Fleet + G4 Hybrid OS)
Week 8: Beta Enrollment + Sprint Close
G4 Hybrid OS

Enroll 2 Beta Clients

Jake: 1.5h · PM supports onboarding
Call 2 long-retention clients. Pitch Hybrid OS beta with one-pager. Get commitment. PM activates onboarding checklist.
Output: 2 beta clients enrolled
G7 Outbound

Review Case Studies + Sprint 2 Scorecard

Jake: 1.5h
Review Content Lead's 3 auto-generated case study drafts. Approve or revise. Confirm social pipeline connected. Sprint 2 scorecard.
Output: 3 case studies approved · Pipeline live · Sprint 2 closed
Delegated — Week 8
Content Lead · 6h · Generate 3 case study drafts. Connect social scheduler to LinkedIn + YouTube. (G7 Outbound + Brand)
PM / Acct Strategist · 2h · Activate beta onboarding for both clients. (G4 Hybrid OS)
S3
Sprint 3 — Days 61–90
"Measure, Iterate, Scale"
Jake: ~11 hrs · Team: ~22 hrs · Gaps: G1 Agent Fleet (HIGH) · G4 Hybrid OS (MED-HIGH) · G7 Outbound + Brand (MED)
Weeks 9–10: Agent Review + Positioning
G1 Agent Fleet

Agent Fleet Health Report v1

Jake: 3h · Pillar Leads provide data
Compile: accuracy rate, time savings, error types, override frequency for all 6 agents. Calculate total delivery hours reduction vs. baseline. Write Health Report v1. Retrain prompts. Update Registry.
Output: Health Report v1 · Prompts retrained · Target: ≥85% accuracy, ≥30% hour reduction
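The Health Report metrics reduce to a few ratios per agent plus one fleet-wide comparison against baseline hours. The sketch below shows one possible way to compute them. The `AgentStats` fields and the sample numbers are invented for illustration; only the two targets come from the plan.

```python
from dataclasses import dataclass

ACCURACY_TARGET = 0.85        # plan target: >=85% accuracy
HOUR_REDUCTION_TARGET = 0.30  # plan target: >=30% hour reduction


@dataclass
class AgentStats:
    name: str
    runs: int
    accurate_runs: int
    overrides: int          # runs where a human overrode the agent output
    baseline_hours: float   # delivery hours before the agent
    current_hours: float    # delivery hours with the agent

    @property
    def accuracy(self) -> float:
        return self.accurate_runs / self.runs

    @property
    def override_rate(self) -> float:
        return self.overrides / self.runs


def fleet_hour_reduction(stats: list[AgentStats]) -> float:
    """Fractional drop in total delivery hours versus the baseline."""
    baseline = sum(s.baseline_hours for s in stats)
    current = sum(s.current_hours for s in stats)
    return (baseline - current) / baseline


# Made-up numbers for two agents, purely to exercise the functions.
sample = [
    AgentStats("Agent A", runs=40, accurate_runs=36, overrides=4,
               baseline_hours=50, current_hours=30),
    AgentStats("Agent B", runs=20, accurate_runs=16, overrides=5,
               baseline_hours=30, current_hours=26),
]
reduction = fleet_hour_reduction(sample)
below_target = [s.name for s in sample if s.accuracy < ACCURACY_TARGET]
print(f"hour reduction: {reduction:.0%}, meets target: {reduction >= HOUR_REDUCTION_TARGET}")
print("agents needing prompt work:", below_target)
```

Agents that land in `below_target` are the natural candidates for the prompt retraining step.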
G7 Outbound

AI-Native Positioning — Website + Sales

Jake: 3h · Content Lead supports
Rewrite seonavigator.online About + services pages. Update sales deck and proposals with AI-native messaging. Brief Content Lead on 3 social posts.
Output: Website copy updated · Sales materials refreshed · Social briefs assigned
Delegated — Weeks 9–10
Each Pillar Lead · 6h total · Collect agent performance data for Health Report. (G1 Agent Fleet)
IT Lead · 3h · Apply prompt refinements and Skills wiring updates. (G1 Agent Fleet)
PM / Acct Strategist · 2h · Update Agent Registry. Update sales proposals. (G1 Agent Fleet + G7 Outbound)
Weeks 11–12: Beta Evaluation + Video + Close
G4 Hybrid OS

Hybrid OS Beta Evaluation + Go/No-Go

Jake: 2.5h · PM provides data
Review beta feedback. Calculate delivery cost reduction vs. DFY. Decision gate: both retained + outcomes equivalent + cost ≥40% down = Go. Write memo. If Go: outline Tier 3 launch plan.
Output: Go/No-Go memo with data · If Go: Tier 3 launch plan
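The decision gate is a three-part AND, which can be written down explicitly so the memo applies it the same way every time. A minimal sketch follows. The function name, parameters, and dollar figures are hypothetical; only the 40% threshold and the three conditions come from the plan.

```python
COST_REDUCTION_THRESHOLD = 0.40  # from the plan: cost >=40% down


def go_no_go(clients_retained: int, clients_enrolled: int,
             outcomes_equivalent: bool,
             dfy_cost: float, hybrid_cost: float) -> str:
    """Apply the gate: all beta clients retained + outcomes equivalent + cost >=40% down."""
    all_retained = clients_retained == clients_enrolled
    cost_reduction = (dfy_cost - hybrid_cost) / dfy_cost
    if all_retained and outcomes_equivalent and cost_reduction >= COST_REDUCTION_THRESHOLD:
        return "Go"
    return "No-Go"


# Illustrative inputs only: delivery cost per client under DFY vs. Hybrid OS.
decision_a = go_no_go(2, 2, True, dfy_cost=1000.0, hybrid_cost=550.0)  # 45% down
decision_b = go_no_go(2, 2, True, dfy_cost=1000.0, hybrid_cost=700.0)  # 30% down
decision_c = go_no_go(1, 2, True, dfy_cost=1000.0, hybrid_cost=500.0)  # one client churned
print(decision_a, decision_b, decision_c)
```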
G7 Outbound

Positioning Video + Final Scorecard

Jake: 2.5h
Record 3-min positioning video: problem → solution → proof. One take, ship it. Final 90-day scorecard — score every metric, lessons learned, next 90-day plan.
Output: Video recorded · Scorecard complete · Next 90-day plan · All gaps closed or at Go/No-Go
Delegated — Weeks 11–12
PM / Acct Strategist · 4h · Collect beta feedback. Prepare Go/No-Go data. (G4 Hybrid OS)
Content Lead · 4h · Publish 3 social posts. Polish + publish positioning video. (G7 Outbound + Brand)
PM / Acct Strategist · 3h · If Go: build Tier 3 onboarding sequence from Jake's outline. (G4 Hybrid OS)

Time Budget Summary

~57h
Total Jake time
~117h
Total delegated team
~4.5h
Avg Jake hrs/week
12 wk
Full transformation
Sprint | Week | Gaps | Focus | Jake | Team
S1 | W1 | G2 Knowledge Base · G3 Role Redefinition | SOP Audit · Schema · 9 Role Cards · Workshop Prep | 9.5h | 7h
S1 | W2 | G3 Role Redefinition · G5 AEO Layer · G6 Velocity | Workshop · Check-ins · AEO + Velocity Approval | 6h | 22h
S1 | W3 | G2 Knowledge Base · G3 Role Redefinition | SOP Validation 1–5 · Remaining Check-ins | 4h | 24h
S1 | W4 | G2 Knowledge Base · G1 Agent Fleet | SOP Validation 6–10 · Agent Architecture Sketch | 4.5h | 5h
S2 | W5 | G1 Agent Fleet Architecture | System Prompts: All 6 Agents | 6h | 6h
S2 | W6 | G1 Agent Fleet · G7 Outbound + Brand | Agent Testing · Outbound Flywheel | 6h | 12h
S2 | W7 | G4 Hybrid OS · G1 Agent Fleet | Hybrid OS Package · Test Review · Registry | 6h | 10h
S2 | W8 | G4 Hybrid OS · G7 Outbound + Brand | Beta Enrollment · Case Studies · Sprint Close | 3h | 8h
S3 | W9–10 | G1 Agent Fleet · G7 Outbound + Brand | Agent Health Report · Positioning | 6h | 11h
S3 | W11–12 | G4 Hybrid OS · G7 Outbound + Brand | Beta Go/No-Go · Video · Final Scorecard | 5h | 11h
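The weekly figures in the summary table can be tallied as a quick sanity check on the rounded headline numbers. The sketch below copies the Jake and Team columns row by row; the variable names are just for illustration. The rows sum to 56 and 116 hours, which is within an hour of the rounded ~57h and ~117h totals.

```python
# (week label, Jake hours, team hours) copied from the summary table rows
WEEKLY_HOURS = [
    ("W1", 9.5, 7), ("W2", 6, 22), ("W3", 4, 24), ("W4", 4.5, 5),
    ("W5", 6, 6), ("W6", 6, 12), ("W7", 6, 10), ("W8", 3, 8),
    ("W9-10", 6, 11), ("W11-12", 5, 11),
]
TOTAL_WEEKS = 12

jake_total = sum(jake for _, jake, _ in WEEKLY_HOURS)
team_total = sum(team for _, _, team in WEEKLY_HOURS)
jake_weekly_avg = jake_total / TOTAL_WEEKS

print(f"Jake: {jake_total}h total, {jake_weekly_avg:.2f}h/week average")
print(f"Team: {team_total}h total")
```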

Team Load by Role

Role | S1 | S2 | S3 | Total | Key Work
Senior Team Lead | 24h | – | – | 24h | SOP conversion (all 10 to structured markdown)
PM / Account Strategist | 15h | 7h | 9h | 31h | Velocity tracking, ClickUp config, Agent Registry, beta onboarding, feedback
IT Lead | – | 15h | 3h | 18h | Agent wiring, n8n automation, GHL Snapshot, prompt refinements
Content Lead | – | 8h | 4h | 12h | One-pager, case studies, social publishing, video production
SEO Lead | 9h | – | – | 9h | AEO gate, Pillar 3 SOP, team training, validation
Each Pillar Lead | – | 6h | 6h | 12h | Agent testing, performance data collection
All Team Members | 2h | – | – | 2h | Workshop participation

After Week 12, the transformation infrastructure is built. ~57 hours of Jake-time and ~117 hours of team-time build the full operating system. Everything after is iteration: refining Skills, expanding orchestration, scaling Hybrid OS if greenlit.

How Workstreams Map to the 90-Day Gaps

The 7 internal workstreams are the operational engine that runs alongside the 7 strategic gaps. Each workstream feeds into one or more gaps — its AI tasks become the execution layer for the transformation. Below, each workstream shows which gaps it directly supports, what week it activates, and which tasks are AI-automatable vs. human-owned.

01
Workflow Optimization + SOP System
W1–2 Priority
Maps to 90-Day Gaps
G2 Knowledge Base: SOPs are the raw material for agent-ready knowledge bases. Sprint 1, Weeks 1–4.
G1 Agent Fleet: Structured SOPs become the knowledge layer agents query. Sprint 2, Weeks 5–7.
G3 Role Redefinition: RACI from SOPs defines human vs. agent ownership. Sprint 1, Weeks 1–3.
AI Tasks (Automatable): 7
SOP document drafts from process descriptions · 8h/wk · 576
Master workflow map generation (v1) · 5h/wk · 392
RACI matrix generation from role descriptions · 3h/wk · 448
Top 10 SOP bottleneck identification · 2h/wk · 648
SOP gap analysis across existing docs · 3h/wk · 448
How-To video script generation · 4h/wk · 392
ClickUp template descriptions from SOPs · 4h/wk · 448
Human-Owned (Strategic): 5
Workflow audit interviews with dept leads
Final SOP approval and sign-off
Video recording walkthroughs (screen + voice)
RACI ownership confirmation with each person
Template QA before publishing to Vault
Key Deliverables
Master workflow map (end-to-end)
SOP Library — docs + videos
ClickUp templates with SOPs attached
RACI baked into every template
02
Onboarding Simplification
W3–6 Standardize
Maps to 90-Day Gaps
G3 Role Redefinition: Onboarding checklists encode the new Agent Manager role definitions. Sprint 1.
G2 Knowledge Base: Welcome packs reference the structured SOP library. Depends on WS01 completion.
AI Tasks (Automatable): 5
Onboarding checklist generation per role · 6h/wk · 729
ClickUp onboarding task auto-generation · 4h/wk · 512
Role-specific welcome doc + resource packs · 3h/wk · 448
Day 1 Slack welcome + channel routing auto · 2h/wk · 648
30/60/90 day check-in reminders · 2h/wk · 512
Human-Owned (Strategic): 5
Day 1 in-person culture welcome (CEO or PM)
Buddy assignment — judgment call, personality fit
30/60/90 check-in conversations
Hiring manager W4 progress review
Culture immersion — values storytelling
Key Deliverables
Day 1 → Week 4 onboarding flow
Checklists: HR → PM → Department
ClickUp auto-task generation on hire
Role-specific welcome resource packs
03
Communication + Automation
W2–4 + W7
Maps to 90-Day Gaps
G6 Velocity: Deadline automations + status updates accelerate time-to-deliverable. Sprint 1.
G7 Outbound + Brand: Slack→ClickUp automation prevents work dying in chat. Sprint 2, Week 7.
AI Tasks (Automatable): 6
Slack message → ClickUp task auto-creation · 10h/wk · 648
Deadline reminder automation sequences · 8h/wk · 729
Overdue notification auto-routing + escalation · 5h/wk · 576
Automated weekly status update generation · 6h/wk · 512
Slack channel structure audit + recommendations · 2h/wk · 448
Stripe + ClickUp + Slack integration (Zack) · 4h · 384
Human-Owned (Strategic): 4
Slack channel rules agreement with team
Communication migration decision — CEO+PM
Escalation path definition for critical blockers
Communication culture coaching (1:1)
Key Deliverables
Slack channel structure + rules
Migration plan: ClickUp chat → Slack
Automations: tasks, reminders, status updates
Stripe + ClickUp + Slack integration (Zack)
04
Internal Brain System
W3–4 + W9
Maps to 90-Day Gaps
G1 Agent Fleet: Internal Brain is the coordinator agent for all workstreams. Sprint 2–3.
G6 Velocity: Proactive alerts and dashboards drive deployment speed. Sprint 1, Week 4.
AI Tasks (Automatable): 6
Proactive deadline alert automation · 8h/wk · 729
Weekly project health digest generation · 6h/wk · 648
Automated meeting action item extraction · 4h/wk · 648
Blocker detection from ClickUp status changes · 3h/wk · 448
Real-time bottleneck dashboard auto-update · 3h/wk · 448
Meta Agent async status collection · 4h/wk · 294
Human-Owned (Strategic): 4
Dashboard KPI selection and design
Escalation decisions when brain surfaces blockers
Strategic interpretation of weekly health reports
Meta Agent persona, scope, and name definition
Key Deliverables
Automated reminders: due dates, deliverables, blockers
Weekly project health digest (auto-compiled)
Real-time visibility dashboard
Meta Agent MVP (Phase 2 — W9+)
05
Recurring Tasks + Templates
W1–2 Highest
Maps to 90-Day Gaps
G2 Knowledge Base: Templates are how SOPs become executable in ClickUp. Sprint 1, Weeks 1–4.
G1 Agent Fleet: Template-based workflows become agent trigger conditions. Sprint 2.
G5 AEO Layer: GBP + OnPage module templates include AEO audit step. Sprint 1.
AI Tasks (Automatable): 6
ClickUp recurring task template generation · 10h/wk · 729
Template gap audit across departments · 4h/wk · 576
SOP-to-template conversion automation · 5h/wk · 512
GBP Intelligence module templates · 3h/wk · 392
OnPage Intelligence module templates · 3h/wk · 392
CSV import file generation for ClickUp bulk load · 2h/wk · 448
Human-Owned (Strategic): 4
Dept lead review of each template before publishing
Former PM's incomplete templates — dedicated review session
Training team leads on Template Vault usage
Judgment on which templates need video walkthroughs
Key Deliverables
ClickUp templates for every recurring workflow
SOPs attached to each template
Template Vault organization inside ClickUp
GBP / OnPage / Citation module imports
06
Performance Framework
W3–7 Sequential
Maps to 90-Day Gaps
G3 Role Redefinition: SOWs formalize the Agent Manager role boundaries. Sprint 1.
G6 Velocity: KPI scorecards track deployment speed and agent performance. Sprint 1–2.
G4 Hybrid OS: Performance data proves whether Hybrid OS delivers equivalent outcomes. Sprint 3.
AI Tasks (Automatable): 6
Department KPI scorecard generation · 5h/wk · 576
Weekly KPI report auto-compilation · 6h/wk · 729
OKR draft generation from strategic goals · 4h/wk · 448
Department SOW drafts from role descriptions · 4h/wk · 448
Performance review template generation · 3h/wk · 512
OKR + KPI tracking cadence automation · 3h/wk · 448
Human-Owned (Strategic): 5
OKR calibration sessions with department heads
Final performance review conversations (1:1)
Compensation decisions tied to KPI outcomes
Strategic OKR direction-setting (CEO)
SOW sign-off and boundary-setting with each dept lead
Key Deliverables
Department SOWs — responsibilities + boundaries
Department KPIs — monthly execution metrics
Quarterly OKRs — direction + growth
Performance Review System: skills + execution + mindset
07
Culture + Mindset Upgrade
W4–12 Sustained
Maps to 90-Day Gaps
G3 Role Redefinition: "Ownership Operator" is the cultural shift that makes Agent Manager roles stick. Sprint 1.
G7 Outbound + Brand: Culture storytelling feeds the positioning narrative. Sprint 3.
AI Tasks (Automatable): 5
Ownership Operator training material drafts · 4h/wk · 392
Hire-to-culture criteria documentation · 2h/wk · 392
Culture guideline docs (expected behaviors) · 3h/wk · 392
Workshop agenda + slide deck generation · 2h/wk · 343
Behavioral expectation rubric for reviews · 2h/wk · 392
Human-Owned (Strategic): 5
Live Ownership Operator workshops (CEO or PM-led)
CEO cultural storytelling — modeling publicly
Peer recognition moments (spontaneous, human)
Culture hiring interview panels
Behavior coaching 1:1 conversations
Key Deliverables
Ownership Operator training + workshops
Hire-to-culture criteria (structured)
Culture guidelines — expected behaviors
Behavior tied to performance reviews

Workstream × Gap Correlation Matrix

WorkstreamG1 Agent FleetG2 Knowledge BaseG3 Role Redef.G4 Hybrid OSG5 AEOG6 VelocityG7 Outbound
WS01 Workflow + SOP●●●●●
WS02 Onboarding●●
WS03 Communication●●
WS04 Internal Brain●●●●●
WS05 Templates●●●
WS06 Performance●●
WS07 Culture●●

●●● = primary driver  ·  ●● = strong support  ·  ● = contributes

The 7 workstreams are the internal operational backbone. The 7 gaps are the strategic transformation targets. They're not separate — every workstream directly feeds at least 2 gaps. Execute the workstreams and the gaps close automatically.

What Changes with Managed Agents

Claude Managed Agents replaces your current DIY agent infrastructure. Instead of building your own agent loops, sandbox, tool wiring, and error recovery, Anthropic runs all of that on their infrastructure. You define the agent (system prompt, tools, MCP servers, Skills), define the environment (packages, network), and start sessions. The agent runs autonomously for hours, persists through disconnections, and self-recovers from errors.

  • Your existing Claude Skills become the brain. Skills like Keyword Architect, Radius Optimizer, Consensus Audit — these wire directly into Managed Agent definitions. No rebuild needed.
  • MCP servers you already use connect natively. ClickUp, Slack, GHL, Google Drive, Figma — Managed Agents supports MCP out of the box. Your current 6-MCP architecture plugs right in.
  • Multi-agent coordination (research preview) is exactly your 6-pillar fleet model. A coordinator agent can delegate to specialist agents — this is the Agency OS pattern at infrastructure level.
  • Pricing: standard API tokens + $0.08/hr active runtime. No idle-time charges. A 4-hour SEO audit that runs overnight costs ~$0.32 in runtime on top of token costs.
  • This replaces Gap 1 (Agent Fleet Architecture) entirely. The infrastructure Anthropic now hosts is what you were going to build yourself.
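The runtime arithmetic in the pricing bullet is easy to reproduce. The sketch below assumes the flat $0.08 per active hour quoted above and leaves token cost as a caller-supplied number, since token spend depends on the workload. The helper name `runtime_cost` is illustrative.

```python
RUNTIME_RATE_PER_HOUR = 0.08  # USD per active session hour, as quoted above


def runtime_cost(active_hours: float, token_cost: float = 0.0) -> float:
    """Runtime charge plus whatever the tokens cost (token_cost is supplied by the caller)."""
    return round(active_hours * RUNTIME_RATE_PER_HOUR + token_cost, 2)


overnight_audit = runtime_cost(4)  # 4-hour overnight audit, runtime only
print(overnight_audit)  # 0.32
```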

Architecture: Before vs. After Managed Agents

Before — Current DIY Stack
Individual Claude Skills triggered manually
Custom n8n workflows for each automation
No persistent sessions — each call is stateless
No error recovery — failures require human restart
No multi-agent coordination
Context window management is manual
~2–3 months to wire each new agent
After — Managed Agents
Agents defined by YAML or natural language
Anthropic handles sandbox, auth, tool execution
Persistent sessions across hours/days
Auto error recovery + checkpoint/resume
Multi-agent coordination (coordinator → specialists)
Built-in prompt caching + compaction
~1–2 days per agent deployment
Agent
Model + system prompt + tools + MCP servers + Skills. Created once, referenced by ID.
Environment
Cloud container with packages (Python, Node.js), network rules, mounted files. Your sandbox.
Session
Running agent instance. Persistent file system, conversation history, hours-long execution.
Events
Messages between your app and agent. User turns, tool results, status updates. All persisted.
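The four concepts above relate to each other roughly as follows. This is a plain-Python mental model, not the Managed Agents API: every class, field name, and ID below is hypothetical, and the example values are taken from the SEO Sentinel configuration described later in this plan.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    """Created once, referenced by ID."""
    agent_id: str
    model: str
    system_prompt: str
    skills: tuple[str, ...] = ()
    mcp_servers: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    """The sandbox: packages and network rules."""
    packages: tuple[str, ...] = ()
    allowed_hosts: tuple[str, ...] = ()


@dataclass
class Event:
    kind: str      # e.g. "user_turn", "tool_result", "status"
    payload: str


@dataclass
class Session:
    """A running instance of one agent in one environment, with a persisted event log."""
    agent: Agent
    environment: Environment
    events: list[Event] = field(default_factory=list)

    def record(self, kind: str, payload: str) -> None:
        self.events.append(Event(kind, payload))


seo_sentinel = Agent(
    agent_id="agent_seo_sentinel",  # made-up ID
    model="Claude Opus 4.6",
    system_prompt="(system prompt goes here)",
    skills=("Agency OS", "Content Audit", "AI Visibility"),
    mcp_servers=("SEO Utils", "Google Drive", "ClickUp", "Slack"),
)
sandbox = Environment(packages=("Python 3.12", "Node.js 22"))
session = Session(agent=seo_sentinel, environment=sandbox)
session.record("user_turn", "Run rank monitoring")
print(len(session.events), session.events[0].kind)
```

The point of the split is reuse: one agent definition and one environment can back many sessions, and each session keeps its own event history.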

6 Department Agents — Managed Agent Definitions

P1
SEO Sentinel
37h/wk saved
Managed Agent Configuration
Model: Claude Opus 4.6
Skills: Agency OS · Content Audit · AI Visibility
MCP Servers: SEO Utils · Google Drive · ClickUp · Slack
Environment: Python 3.12 + Node.js 22 · Network: SEO Utils API, GSC API, QueryMind
Agent Manager: SEO Lead (human gate: strategy sign-off, client delivery)
Session Model: Scheduled daily: rank monitoring + alerts. On-demand: audits, briefs, reports
Automated by Managed Agent: 8
Monthly SEO performance reports · 10h/wk · 729
Technical audit data collection & crawl · 7h/wk · 648
Position tracking & rank monitoring alerts · 5h/wk · 648
Topical map & keyword clustering · 4h/wk · 448
AI Overview consensus content audit · 3h/wk · 392
Competitor AI visibility analysis · 3h/wk · 448
Schema markup audit & generation · 2h/wk · 448
GSC anomaly detection & weekly alerts · 2h/wk · 648
Remains Human-Owned: 6
SEO strategy & direction decisions
Client presentations & data interpretation
Algorithm penalty recovery decisions
Competitive positioning calls
Link building strategy & partner negotiations
Final content calendar approval
Deployment approach: Define agent via API with system prompt from existing SEO Sentinel skill. Mount SEO Utils MCP + Google Drive MCP. Schedule daily session for rank monitoring (runs ~20 min overnight). On-demand sessions triggered by ClickUp task status change via n8n webhook. Deploy time: 1 day.
P2
Content Catalyst
30h/wk saved
Managed Agent Configuration
Model: Claude Sonnet 4.6 (speed-optimized for volume)
Skills: Agency OS · Content Audit · Lead Engine · Consensus Audit
MCP Servers: Google Drive · ClickUp · Slack
Environment: Python 3.12 · Network: QueryMind API, web search
Agent Manager: Content Lead (human gate: quality review, brand voice, publish)
Session Model: On-demand: triggered per content brief. Batch: weekly meta optimization runs
Automated by Managed Agent: 7
Content brief generation (Koray EAV method) · 15h/wk · 729
Meta title & description batch optimization · 4h/wk · 576
Research & citation compilation · 4h/wk · 448
AI Overview consensus scoring · 3h/wk · 392
Internal linking map generation · 3h/wk · 392
Social post repurposing from articles · 3h/wk · 448
EAV/semantic entity gap identification · 2h/wk · 336
Remains Human-Owned: 5
Content strategy & editorial direction
Brand voice calibration per client
Final quality review before publish
Subject matter expertise injection
Experience-based content creation
Deployment approach: Define agent with Content Audit + Consensus Audit skills. Mount Google Drive MCP for brief templates. Session triggered per ClickUp content task. Agent generates brief → runs AEO audit → outputs to Google Drive → pings Content Lead in Slack for review. Deploy time: 1 day.
P3
Revenue Relay
28h/wk saved
Managed Agent Configuration
Model: Claude Sonnet 4.6
Skills: Lead Engine · HL Assistant · GHL API · Agency OS
MCP Servers: GHL (GoHighLevel) · ClickUp · Slack · Gmail
Environment: Node.js 22 · Network: GHL API, Twilio SMS API
Agent Manager: CRM Lead (human gate: upsell conversations, deal negotiations, relationship)
Session Model: Continuous: monitors pipeline. Triggered: new lead → follow-up sequence
Automated by Managed Agent: 7
Lead follow-up automation sequences (GHL) · 10h/wk · 648
Meeting notes & action item extraction · 4h/wk · 648
GHL workflow automation build & test · 5h/wk · 448
Review request automation sequences · 3h/wk · 512
Pipeline stage smart tag automation · 3h/wk · 448
SMS/email drip sequence copy drafts · 3h/wk · 392
Lead scoring & smart segmentation · 2h/wk · 336
Remains Human-Owned: 5
CRM strategy & pipeline architecture
Upsell conversations & deal negotiations
Client relationship & trust building
Strategic campaign decisions
Onboarding experience design
Deployment approach: Define agent with GHL API + HL Assistant + Lead Engine skills. Mount GHL MCP for pipeline read/write. Long-running session monitors pipeline for new leads → triggers follow-up sequence → drafts SMS/email → flags high-value leads to human. Deploy time: 2 days (GHL auth + sequence testing).
P4
Ad Arbitrage
25h/wk saved
Managed Agent Configuration
Model: Claude Opus 4.6 (complex multi-step campaign work)
Skills: Agency OS · AI Visibility · AI Carousel · Copy Lab · Campaign Builder · Radius Optimizer · Conversion Tracker · Remarketing Engine · Performance Auditor · Paid-Organic Crosswalk
MCP Servers: Google Ads API · ClickUp · Slack · Google Drive
Environment: Python 3.12 + Node.js 22 · Network: Google Ads API, Meta Ads API
Agent Manager: Ads Lead (human gate: budget changes, campaign launches, creative direction)
Session Model: Scheduled weekly: performance reports. On-demand: campaign builds, audits
Automated by Managed Agent: 7
Performance report generation & analysis · 10h/wk · 729
Ad copy variation drafts (Google + Meta) · 5h/wk · 512
Competitor ad monitoring & SERP analysis · 4h/wk · 448
Audience research & targeting maps · 3h/wk · 343
A/B test result analysis & summaries · 3h/wk · 448
AI shopping carousel feed optimization · 3h/wk · 392
Budget pacing alerts & threshold reports · 2h/wk · 648
Remains Human-Owned: 5
Campaign strategy & budget allocation
Creative direction & visual asset briefs
Platform policy compliance review
High-spend optimization judgment
New channel strategy decisions
Deployment approach: This agent has the most Skills wired (10). Use Managed Agents' multi-agent coordination: define a coordinator agent that dispatches to specialist sub-agents (Copy Lab, Campaign Builder, Radius Optimizer, etc.) based on task type. Scheduled weekly session pulls Google Ads data → generates performance report → flags anomalies → drafts optimizations for human review. Deploy time: 2 days.
P5
Build Bot
18h/wk saved
Managed Agent Configuration
Model: Claude Sonnet 4.6
Skills: Agency OS · AI Visibility · AI Carousel
MCP Servers: Figma · ClickUp · Slack
Environment: Node.js 22 + Python 3.12 · Lighthouse CLI · Network: web fetch for audits
Agent Manager: IT Lead (human gate: architecture decisions, security, UX strategy)
Session Model: On-demand: triggered per migration/audit task. Batch: schema audits
Deployment approach: Define agent with AI Visibility + AI Carousel skills. Environment includes Lighthouse CLI for Core Web Vitals audits. Sessions triggered per migration task or schema audit request. Outputs redirect maps, schema markup, QA checklists to ClickUp. Deploy time: 1 day.
P6
PM Pulse
27h/wk saved
Managed Agent Configuration
Model: Claude Sonnet 4.6
Skills: Agency OS · HL Assistant
MCP Servers: ClickUp · Slack · Google Drive · Gmail · Google Calendar
Environment: Python 3.12 · Network: ClickUp API, Google Workspace APIs
Agent Manager: PM / Account Strategist (human gate: scope negotiation, resource allocation, risk)
Session Model: Scheduled: daily status digests, weekly reports. On-demand: onboarding, SOPs
Deployment approach: This is the coordinator agent — the "Internal Brain" from WS04. Scheduled daily session pulls ClickUp task statuses across all departments → generates health digest → flags blockers → posts to #ops Slack channel. Monthly session compiles client reports. Also runs as the onboarding agent: new hire tagged → checklist auto-generated → Slack welcome fired. Deploy time: 1 day.

Deployment Timeline — Revised with Managed Agents

Impact on 90-Day Plan

Managed Agents compresses Sprint 2 (Agent Fleet deployment) from 4 weeks to ~1 week. The infrastructure you were going to build (sandbox, error recovery, state management, tool orchestration) is now handled by Anthropic. Your Sprint 2 work becomes: define agents → test → deploy. That frees up ~15 hours of Jake-time and ~20 hours of IT Lead time.

Day | Agent | Deploy Time | Owner | What Changes vs. Original Plan
31 | PM Pulse (P6) | 1 day | Jake + PM | Deploy first: it's the coordinator. Daily health digests start immediately.
32 | SEO Sentinel (P1) | 1 day | Jake + SEO Lead | Highest hours recovered. Overnight rank monitoring sessions start Day 33.
33 | Content Catalyst (P2) | 1 day | Jake + Content Lead | Content briefs + AEO audit run as single session. No separate AEO step needed.
34–35 | Revenue Relay (P3) | 2 days | Jake + CRM Lead | GHL MCP auth setup takes an extra day. Pipeline monitoring starts Day 36.
36–37 | Ad Arbitrage (P4) | 2 days | Jake + Ads Lead | Most complex (10 Skills). Multi-agent coordination: coordinator + sub-agents.
38 | Build Bot (P5) | 1 day | Jake + IT Lead | Simplest agent. Lighthouse CLI in environment. Schema audits on-demand.
8 days: All 6 agents deployed (was 4 weeks)
165h/wk: Total hours recoverable across fleet
$0.08/hr: Runtime cost (+ standard token pricing)

Multi-Agent Coordination — The Agency OS Pattern

Managed Agents' multi-agent coordination (research preview) maps directly to your Agency OS coordinator/specialist architecture. The PM Pulse agent becomes the meta-coordinator that can delegate to specialist agents:

COORDINATOR: PM Pulse receives task → identifies which specialist agent handles it → delegates → collects output → synthesizes
SPECIALIST: SEO Sentinel ← "Run technical audit for Modern Collision" → returns audit report
SPECIALIST: Content Catalyst ← "Generate 5 content briefs for Elite Finish" → returns briefs with AEO scores
SPECIALIST: Ad Arbitrage ← "Weekly performance report for Procam Detailing" → returns report with optimizations
SPECIALIST: Revenue Relay ← "Activate follow-up sequence for 12 new leads" → runs GHL automation
Access note: Multi-agent coordination is currently in research preview. Request access to join. Your existing Agency OS skill already implements the coordinator/specialist pattern — Managed Agents gives it infrastructure-level support.
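To make the coordinator/specialist pattern concrete, here is a minimal routing sketch. It is an assumption-heavy illustration: the keyword table, `route`, and `coordinate` are hypothetical, and a real coordinator would let the model choose the specialist rather than match substrings. The Managed Agents delegation API itself is not shown.

```python
# Keyword-to-specialist table. The keywords are hypothetical; they only
# mirror the example requests listed in the pattern above.
SPECIALISTS = {
    "technical audit": "SEO Sentinel",
    "content brief": "Content Catalyst",
    "performance report": "Ad Arbitrage",
    "follow-up sequence": "Revenue Relay",
}


def route(task):
    """Return the specialist agent for a task, or None if nothing matches."""
    lowered = task.lower()
    for keyword, agent in SPECIALISTS.items():
        if keyword in lowered:
            return agent
    return None


def coordinate(tasks):
    """Group incoming tasks by the specialist that should receive them."""
    assignments = {}
    for task in tasks:
        agent = route(task) or "PM Pulse (unrouted, needs human triage)"
        assignments.setdefault(agent, []).append(task)
    return assignments


requests = [
    "Run technical audit for Modern Collision",
    "Generate 5 content briefs for Elite Finish",
    "Weekly performance report for Procam Detailing",
    "Activate follow-up sequence for 12 new leads",
]
for agent, tasks in coordinate(requests).items():
    print(agent, "<-", tasks)
```

Unrouted tasks stay with the coordinator so a human can triage them, which matches the human-gate role assigned to the Agent Manager.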

Claude Managed Agents is the infrastructure answer to Gap 1 (Agent Fleet Architecture). Instead of spending Sprint 2 building sandboxes and error recovery, you spend 8 days defining and deploying 6 agents on Anthropic's hosted infrastructure. The remaining Sprint 2 time shifts to testing, refinement, and Hybrid OS work — accelerating the entire 90-day timeline by 2–3 weeks.

UAT Automation Across All 7 Departments

We mapped every UAT checklist item across all 7 teams (334 total checks) against Claude Managed Agent capabilities. Result: 239 fully automated, 77 semi-automated, 18 manual-only. Each department gets a dedicated UAT agent that is triggered by a ClickUp status change, gates publishing, and reports results to Slack. Total cost per full-site UAT: ~$9.65.

Department Teams
Department | Auto | Semi | Manual
SEO | 66 | 20 | 1
CRM | 34 | 7 | 1
Content | 38 | 9 | 3
Google Ads | 25 | 6 | 1
Meta Ads | 18 | 7 | 1
Web Design | 30 | 22 | 10
IT | 28 | 6 | 1

SEO covers 9 SOP categories + QueryMind pipelines · Content includes QueryMind Planner + Jobs
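As a quick arithmetic check, the per-department counts above can be summed to confirm the headline totals (239 auto, 77 semi, 18 manual, 334 overall). The dictionary below simply restates the department figures; the variable names are illustrative.

```python
# (auto, semi, manual) per department, copied from the department summary.
departments = {
    "SEO": (66, 20, 1),
    "CRM": (34, 7, 1),
    "Content": (38, 9, 3),
    "Google Ads": (25, 6, 1),
    "Meta Ads": (18, 7, 1),
    "Web Design": (30, 22, 10),
    "IT": (28, 6, 1),
}

auto_total = sum(a for a, _, _ in departments.values())
semi_total = sum(s for _, s, _ in departments.values())
manual_total = sum(m for _, _, m in departments.values())
grand_total = auto_total + semi_total + manual_total

print(auto_total, semi_total, manual_total, grand_total)  # 239 77 18 334
```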

SEO
SEO Team UAT — 87 checks
66 auto · 20 semi · 1 manual
Skills: Agency OS · Content Audit · AI Visibility · Consensus Audit · Koray City Page Auditor · AI Carousel Optimizer
Platforms: QueryMind (Topical Maps, Content Auditor) · SEO Utils MCP · GSC API · Local SEO Automation Toolkit
Structure: 9 SOP categories (monthly + special) + 2 QueryMind pipeline validations. Four agents triggered independently at different stages.
1. Content Creation CP — 17 checks → seo-uat-agent · Monthly · Per page
Pre-Publish — On-Page + Schema (12) | Status | Agent Method
Topic validated against QueryMind coverage gaps + GSC opportunity | AUTO | Pulls GSC data + QueryMind gap_label, ranks by search volume × gap priority
URL structure (no duplicate/incorrect slugs) | AUTO | Crawls sitemap, checks duplicates, trailing slashes, casing
Title & Meta Description complete + unique | AUTO | Parses <title> + meta, checks length, flags duplicates
Heading hierarchy (H1→H2→H3) | AUTO | Validates single H1, sequential nesting, no skips
Sitemap.xml accessible + valid | AUTO | Fetches /sitemap.xml, validates XML, checks URLs return 200
Robots.txt not blocking important pages | AUTO | Parses Disallow rules vs. sitemap URLs
Canonical tags correct (self-referencing) | AUTO | Checks self-referencing canonical per page
Internal links working (no 404s) | AUTO | Crawls all links, reports 404s/500s with source
Images have alt text | AUTO | Parses <img> tags, flags missing/empty alt
Page speed acceptable (Lighthouse) | SEMI | Runs Lighthouse — human confirms threshold per project
Schema markup valid (JSON-LD) | SEMI | Validates JSON-LD syntax — human checks business logic
Index/noindex correct | AUTO | Checks meta robots + X-Robots-Tag headers
Post-Publish (5) | Status | Agent Method
Request indexing via GSC API + update ClickUp status | AUTO | GSC URL Inspection API → request indexing, update ClickUp task to "Index Requested"
Share published page on GBP | SEMI | Agent drafts GBP post copy + image — human publishes (no GBP write API)
Verify redirects & canonical in production | AUTO | Follows redirect map, checks 301 + canonical on live URL
CQS audit triggered on published content | AUTO | Chains to auditor-uat-agent → ContentAudit record created
Monitor ranking changes for published pages (7-day window) | AUTO | SEO Utils MCP, flags drops >3 positions vs. baseline
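A minimal sketch of the heading hierarchy check from the Pre-Publish table (single H1, no skipped levels), using only Python's standard-library `html.parser`. `HeadingCollector` and `heading_issues` are illustrative names, not part of seo-uat-agent, and the sketch assumes the HTML is already fetched and server-rendered.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return a list of hierarchy problems; an empty list means the page passes."""
    parser = HeadingCollector()
    parser.feed(html)
    levels = parser.levels
    issues = []
    if levels.count(1) != 1:
        issues.append(f"expected exactly one H1, found {levels.count(1)}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:  # e.g. H1 followed directly by H3
            issues.append(f"skipped level: H{prev} -> H{cur}")
    return issues


print(heading_issues("<h1>Title</h1><h2>Section</h2><h3>Detail</h3>"))
print(heading_issues("<h1>Title</h1><h3>Detail</h3>"))
```

Moving back up the hierarchy (H3 followed by H2) is allowed; only downward jumps of more than one level are flagged.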
2. GBP Management CP — 4 checks → seo-ops-agent · Monthly
GBP Monthly Operations (4) | Status | Agent Method
Check GBP performance (Maps, Local Pack, keyword movement) | AUTO | SEO Utils MCP grid reports, WoW rank delta, flags drops >3 positions → Slack summary
Optimize services, products & categories | SEMI | Agent audits GBP vs. competitors (web scrape), proposes changes — human applies in GBP dashboard
Manage GBP content (posts, photos, videos) | SEMI | Agent generates post copy from templates + Content AI — human publishes via GBP or GHL Social Planner
Monitor reviews status | AUTO | Checks GHL Reputation dashboard or scrapes Google reviews, flags new negatives → Slack alert. Chains to SNMS P6 Reviews AI.
3. GSC Management CP — 4 checks → seo-ops-agent · Monthly
GSC Monthly Operations (4) | Status | Agent Method
Monitor performance & trends (clicks, impressions, CTR, position) | AUTO | GSC API → compares vs. prior period, flags anomalies >15% change → Slack + ClickUp
Check indexing & sitemap status | AUTO | GSC Sitemaps API + URL Inspection API for key pages, flags "Not Indexed" with reasons
Audit technical issues (404s, redirects, crawl budget) | SEMI | Agent crawls site (bash + curl), checks 404s, redirect chains, duplicate titles — human implements fixes
Monitor crawl errors (ongoing alerts) | AUTO | GSC crawl stats API, alerts on new 404s/5xx errors → Slack
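The ">15% change" anomaly rule in the GSC performance check can be sketched as a period-over-period comparison. This is an illustration only: `flag_anomalies` and the metric dictionaries are hypothetical, and the real data would come from the GSC API rather than hard-coded values.

```python
def flag_anomalies(current, previous, threshold=0.15):
    """Return metrics whose relative change vs. the prior period exceeds threshold."""
    flagged = {}
    for metric, now in current.items():
        before = previous.get(metric)
        if not before:  # skip missing or zero baselines to avoid division by zero
            continue
        change = (now - before) / before
        if abs(change) > threshold:
            flagged[metric] = round(change, 3)
    return flagged


# Hypothetical period totals for illustration.
this_period = {"clicks": 800, "impressions": 10500}
prior_period = {"clicks": 1000, "impressions": 10000}
print(flag_anomalies(this_period, prior_period))
```

Both drops and spikes are flagged because the rule uses the absolute change; a team that only cares about declines could compare `change < -threshold` instead.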
4. GMC Management CP — 5 checks → seo-ops-agent · Monthly · E-commerce clients only
GMC Monthly Operations (5) | Status | Agent Method
Sync Shopify inventory & Google Sheets | AUTO | Shopify API → export inventory → write to Google Sheets. Fully scriptable in managed container.
Check item availability (stock ≤0 detection) | AUTO | Queries Shopify inventory API, flags items with stock ≤0, cross-references GMC feed
Out of stock update (SKU reconciliation GMC ↔ Shopify) | AUTO | GMC Content API export → Shopify check → match SKUs → update GMC status → log to report sheet
Fix GMC errors (missing fields, policy violations) | SEMI | Pulls GMC diagnostics API, proposes fixes — human applies policy-related fixes
Product optimization (AI Carousel Optimizer) | AUTO | AI Carousel Optimizer skill: EAV descriptions, custom label mapping, title optimization. Batch 40 products/session.
5. Monthly Report — 3 checks → seo-ops-agent · Monthly · Per project
Monthly Report Generation (3) | Status | Agent Method
Review task overview + embed ClickUp deliverables on CRM | AUTO | ClickUp MCP → query completed tasks, generate summary, embed in GHL custom field or Google Doc
Export rank tracking report (SEO Utils) & embed on CRM | AUTO | SEO Utils MCP → rank tracking data export → generate report (xlsx/HTML) → upload to Drive → embed link in CRM
Rerun grid reports, export & embed on CRM | AUTO | SEO Utils MCP → trigger grid re-run → wait for completion → export → embed. Chains with previous step for single-pass monthly report.
6. Citation Management CP — 2 checks → seo-ops-agent · Special · Not monthly
Citation Operations (2) | Status | Agent Method
Audit current citations (NAP consistency across directories) | AUTO | Local SEO Automation toolkit: scrapes major directories, compares NAP, generates audit report
Submit new citations to directories | MANUAL | Requires CAPTCHA, email verification, human login on 20+ directories. Agent prepares data package only.
7. SEO NEO CP — 4 checks → seo-ops-agent · Special · Per campaign
SEO NEO Operations (4) | Status | Agent Method
Project pre-assembly & setup (ClickUp + tool config) | SEMI | Agent creates ClickUp structure from template, populates config — human provides strategy inputs
Set up & run heatmap | SEMI | Agent triggers heatmap tool via API if available — human reviews visual output
Run SEO NEO campaigns (GBP Blast, Snipper, DAS, RD100) | SEMI | Agent orchestrates via tool APIs where available — human monitors execution
Monitor grid ranking changes post-campaign | AUTO | SEO Utils MCP → automated grid pulls + delta tracking vs. pre-campaign baseline
8. Website Audit — 6 checks → seo-ops-agent · Special · New projects
New Project Audit (6) | Status | Agent Method
KW analysis (seed → clustering via QueryMind topical map) | AUTO | QueryMind pipeline: seed → expansion → clustering → CORE/OUTER. Chains to topical-map-uat-agent.
City analysis (local SEO grid per market) | AUTO | SEO Utils MCP + Local SEO Automation: city-level grid data, generates comparison report
Grid report + Whitespark/SEO Utils rank report | AUTO | SEO Utils MCP → automated grid pull + rank tracking export. Being consolidated into single report.
LLM visibility check (ChatGPT, Gemini, Perplexity) | AUTO | AI Visibility Audit skill: queries AI search engines for business presence, scores visibility
AI ranking check across AI search engines | AUTO | AI Visibility Audit skill: checks ranking position in AI-generated results
Analyze data & build SEO game plan | SEMI | Agent synthesizes all audit data into structured report with recommendations — human makes strategic decisions on priorities/budget
9. Google Optimization CP — 4 checks → seo-ops-agent · Special · New projects only
GBP New Project Optimization (4) | Status | Agent Method
Foundation content setup + competitor analysis | AUTO | Local SEO Automation: scrapes competitor GBP data (categories, hours, services, reviews), generates gap report
GBP content preparation (posts, service descriptions, Q&A) | SEMI | Agent drafts all content from templates + Content AI — human reviews before publishing
GBP optimization (fields, categories, attributes) | SEMI | Agent generates optimized field values — human applies in GBP dashboard (no write API)
Final review & sign-off | SEMI | Agent generates checklist completion report — human strategic sign-off
QueryMind: Topical Map & Clustering — 18 checks → topical-map-uat-agent · Pipeline · On completion
CSI & Clustering Quality (10) | Status | Agent Method
CSI defined: CE, SC, CSI fields populated in workspace | AUTO | Queries workspace settings → checks CE/SC/CSI non-null
Seed keyword produces ≥50 expanded keywords | AUTO | Counts ClusterKeyword records → flags <50
No single-keyword clusters (min 3 per cluster) | AUTO | Queries cluster sizes → flags any <3
No mega-cluster (>30% of total in one cluster) | AUTO | Calculates max cluster % → flags >30%
All clusters have CORE/OUTER classification | AUTO | Checks cluster.type field → flags NULL
Cluster names are descriptive (not generic) | SEMI | Checks for generic patterns — human validates semantic quality
No orphan keywords (all assigned to a cluster) | AUTO | Counts keywords with NULL cluster_id = 0
Priority labels (P1–P4) assigned from gap analysis | AUTO | Checks priority field distribution, flags >50% NULL
CORE clusters cover central entity's primary topics | SEMI | Compares CORE names vs. CSI — human confirms relevance
CSV import handles edge cases (duplicates, encoding, empty rows) | AUTO | Uploads test CSV with known issues, checks import count
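Three of the clustering thresholds above (≥50 expanded keywords, minimum 3 keywords per cluster, no cluster above 30% of the total) reduce to simple arithmetic on cluster sizes. The sketch below assumes cluster sizes have already been queried into a plain dict; `cluster_quality_issues` and the sample sizes are hypothetical.

```python
def cluster_quality_issues(cluster_sizes, min_keywords=50, min_cluster=3, max_share=0.30):
    """Check cluster sizes against the thresholds listed in the table.

    `cluster_sizes` maps cluster name -> keyword count (illustrative shape).
    """
    total = sum(cluster_sizes.values())
    issues = []
    if total < min_keywords:
        issues.append(f"only {total} expanded keywords (need >= {min_keywords})")
    for name, size in cluster_sizes.items():
        if size < min_cluster:
            issues.append(f"cluster '{name}' has {size} keywords (min {min_cluster})")
        if total and size / total > max_share:
            issues.append(
                f"cluster '{name}' holds {size / total:.0%} of keywords (max {max_share:.0%})"
            )
    return issues


# Hypothetical cluster sizes: one mega-cluster and one undersized cluster.
sizes = {"a": 25, "b": 15, "c": 15, "d": 3, "e": 2}
for issue in cluster_quality_issues(sizes):
    print(issue)
```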
Coverage & Gap Tracking (8) | Status | Agent Method
Coverage baseline calculated (CORE published / total CORE) | AUTO | Validates coverage % against manual count
Content gaps identified (covered/gap/unique labels) | AUTO | Checks gap_label populated ≥80% of keywords
Published URLs linked to ClusterKeyword records | AUTO | Cross-references published_url with site crawl
Coverage % updates on new publish | AUTO | Simulates publish → checks recalculation fires
Visualization renders all clusters | AUTO | Checks Vue Flow node count matches DB cluster count
CORE clusters show correct keyword counts | AUTO | Compares UI badges against database counts
Gap keywords visually distinct from covered | SEMI | Checks CSS status classes — human validates clarity
Search volume + KD populated where available | AUTO | Checks search_volume NULL rate, flags >30% missing
QueryMind: Content Auditor & CQS Scoring — 20 checks → auditor-uat-agent · Pipeline · On completion
Pre-Audit Validation (4) | Status | Agent Method
Source URL accessible (200, content extractable ≥500 words) | AUTO | Fetches URL, checks status + text length
Content extraction successful (Readability output valid) | AUTO | Checks extraction output length and quality
SERP benchmark fetched (≥5 competitors if keyword provided) | AUTO | Counts SERP results + competitor scrape success
Degradation level recorded if services unavailable | AUTO | Checks ContentAuditStep.degradation_level
CQS Scoring (8) | Status | Agent Method
All 6 dimensions scored (CSI, CoR, Density, SRL, TF-IDF, EEAT) | AUTO | Counts ContentAuditScore records = 6
Each dimension score 0–10 range | AUTO | Validates score bounds on all 6 records
CQS formula correct: (CSI×0.25 + CoR×0.20 + Density×0.15 + SRL×0.10 + TF-IDF×0.10 + EEAT×0.20) × 10 | AUTO | Recalculates from dimensions, compares against stored score
Weights sum to 1.0 | AUTO | Sums weight fields = 1.00
CQS 0–100 + AI Citability 0–10 in valid range | AUTO | Validates bounds on both scores
Dimension weights match spec | AUTO | Checks each: CSI=0.25, CoR=0.20, etc.
Score stored on ClusterKeyword.cqs_score | AUTO | Checks ClusterKeyword updated
Re-audit shows delta from previous score | SEMI | Checks for prior audit, calculates delta — human validates
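The CQS recalculation check follows directly from the formula and weights stated in the table. The sketch below recomputes the score from six dimension values; `WEIGHTS` restates the spec, while the `cqs` function name and the sample scores are illustrative and not taken from QueryMind.

```python
# Dimension weights as stated in the CQS formula above.
WEIGHTS = {
    "CSI": 0.25,
    "CoR": 0.20,
    "Density": 0.15,
    "SRL": 0.10,
    "TF-IDF": 0.10,
    "EEAT": 0.20,
}


def cqs(scores):
    """Recompute CQS (0-100) from six dimension scores, each in the 0-10 range."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("expected exactly the six CQS dimensions")
    for dim, value in scores.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{dim} score {value} is outside 0-10")
    return sum(scores[dim] * weight for dim, weight in WEIGHTS.items()) * 10


# Hypothetical audit scores for illustration.
sample_scores = {"CSI": 8, "CoR": 7, "Density": 6, "SRL": 5, "TF-IDF": 9, "EEAT": 7}
print(round(cqs(sample_scores), 2))
```

Because the weights sum to 1.0 and each dimension is capped at 10, the result is bounded at 100, which is what the "CQS 0–100" range check relies on. A UAT comparison against the stored score should allow a small floating-point tolerance.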
Report Quality (8) | Status | Agent Method
BEFORE/AFTER examples per dimension | AUTO | Parses report for before/after fields — flags missing
SRL transformations (CE as Agent rewrites) | AUTO | Checks SRL section, validates CE as grammatical agent
Headings marked [OK] / [CHANGE] / [NEW] | AUTO | Parses heading audit — all headings have marker
BLUF suggestions per H2 section | AUTO | Checks BLUF section ≥1 suggestion per H2
E-E-A-T blocks (4 dimensions with ready-to-paste content) | AUTO | Validates 4 EEAT sub-sections exist
Priority system (CRITICAL/HIGH/MEDIUM/BONUS/SKIP) | AUTO | Checks recommendations have priority labels
Report in target language | SEMI | Language detection — human confirms quality
TF-IDF term map with section assignments | AUTO | Checks term-to-section mapping exists with ≥10 terms
4 Agents — Independent Triggers
seo-uat-agent · CP1 technical checks
Trigger: ClickUp → "Ready for UAT"
~5–10 min · $0.50/run
seo-ops-agent · CP2–9 monthly + special
Trigger: Scheduled monthly or ClickUp → "Run Monthly SEO"
~15–25 min · $0.60/run
topical-map-uat-agent · QueryMind TM pipeline
Trigger: Pipeline status = "completed"
~5–8 min · $0.45/run
auditor-uat-agent · QueryMind CQS pipeline
Trigger: ContentAudit status = "completed"
~6–10 min · $0.65/run
CTN
Content Team UAT — 50 checks
38 auto · 9 semi · 3 manual
Skills: Agency OS · Content Audit · Consensus Audit · Lead Engine
Scope: Standard content publishing checks + QueryMind Content Planner brief validation + QueryMind Content Job pipeline (Quick Generate + From Topical Map dual-path). QueryMind orchestrates the 4-step brief pipeline and the 5-step content job pipeline with built-in degradation handling. Three agents triggered at different stages.
Section A — Content Publishing (12 checks → content-uat-agent)
Pre-Publish (9) | Status | Agent Method
No typos or grammar issues | AUTO | Claude reads full text, flags errors with corrections
Clear CTAs (Book / Contact / Sign up) | SEMI | Identifies CTA elements — human judges effectiveness
Content aligns with brand voice | SEMI | Compares against brand guide — human confirms
No placeholders (lorem ipsum, TBD) | AUTO | Regex + Claude scan for lorem ipsum, TBD, TODO
Internal & external links working | AUTO | Crawls all links, checks status codes
Layout correct (desktop & mobile) | SEMI | Screenshots at viewports — human reviews
Images / videos load properly | AUTO | Checks media src URLs for 200, dimensions >0
Legal pages available (Privacy, Terms) | AUTO | Checks footer links to /privacy-policy, /terms → 200
No duplicate content | AUTO | Similarity scoring against other site pages
Post-Publish (3) | Status | Agent Method
Validate content on production | AUTO | Re-runs pre-publish checks on production URL
Monitor user behavior (scroll, bounce) | MANUAL | Needs GA4/Hotjar data — human interprets
Optimize content for conversion | MANUAL | Strategic decision: performance data + creative judgment
Section B — QueryMind: Content Planner & Brief Pipeline (22 checks → planner-uat-agent)
Pipeline Step Validation (12) | Status | Agent Method
Topic research: CSI foundations populated (CE, SC, CSI, semantic frame, query fanout) | AUTO | Checks ContentBrief.csi_foundations JSON — all required fields non-null
Competitor analysis: ≥5 competitors scraped with EAV triples | AUTO | Counts competitor URLs in eav_matrix, EAV count ≥30
URR classification applied (UNIQUE/ROOT/RARE distribution) | AUTO | Checks URR labels, validates ROOT = highest count
H1 follows formula: CE + UNIQUE attribute + SC context | SEMI | Parses H1 against CSI — human validates creative quality
H2 headings map to ROOT attributes | AUTO | Cross-references H2s against ROOT attributes
H3 headings map to RARE attributes or FAQ | AUTO | Cross-references H3s against RARE + query fanout
BLUF per H2 section (≤50 words) | AUTO | Extracts BLUF, counts words, flags >50
RAG chunks: 200–500 words, no cross-references | AUTO | Counts words, scans for "as mentioned above"
All 9 brief sections populated | AUTO | Checks all 9 JSON columns — flags NULL
Content gaps prioritized P1–P4 | AUTO | Validates content_gaps has priority labels
Copywriter checklist has 15 items | AUTO | Counts items in copywriter_checklist array
Degradation level recorded | AUTO | Checks ContentBriefStep.degradation_level
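The BLUF and RAG-chunk rows are word-count and phrase checks, sketched below. The helper names are hypothetical, whitespace splitting is an assumed word-count method, and the cross-reference scan covers only the single phrase named in the table ("as mentioned above"); a fuller check would need a longer phrase list.

```python
import re

# Only the phrase named in the table; extend the pattern as needed.
CROSS_REFERENCE = re.compile(r"as mentioned above", re.IGNORECASE)


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def bluf_ok(bluf, max_words=50):
    """A BLUF passes when it is at most max_words long."""
    return word_count(bluf) <= max_words


def chunk_issues(chunk, min_words=200, max_words=500):
    """Return problems with a RAG chunk; an empty list means it passes."""
    issues = []
    n = word_count(chunk)
    if not min_words <= n <= max_words:
        issues.append(f"{n} words (expected {min_words}-{max_words})")
    if CROSS_REFERENCE.search(chunk):
        issues.append("contains a cross-reference phrase")
    return issues


print(bluf_ok("Ceramic coating protects paint for two to five years."))
print(chunk_issues("As mentioned above, " + "word " * 250))
```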
Brief Quality (6) | Status | Agent Method
Brief in correct target language | AUTO | Language detection vs. topical map language_name
UNIQUE differentiators ≥2 angles | AUTO | Counts items in unique_differentiators
TF-IDF terms + LSI keywords ≥10 | AUTO | Checks keywords_and_terms non-empty
Internal links target other topical map keywords | AUTO | Cross-references linking suggestions vs. ClusterKeyword table
Quality metrics has target CQS per dimension | AUTO | Checks quality_metrics for 6 CQS dimensions
Brief is actionable for content writer | SEMI | Claude reviews for clarity — human confirms
Post-Approval Handoff (4) | Status | Agent Method
Approved brief creates job with source_type='topical_map' | AUTO | Checks ContentJob: source_type, content_brief_id fields
Job runs only steps 9–12 (not full 12-step) | SEMI | Checks step log — human confirms no unnecessary steps
Brief data injected into draft prompt | AUTO | Checks draft metadata for brief reference
Batch planning respects P1→P4 ordering | SEMI | Checks queue order vs. priorities — human reviews
Section C — QueryMind: Content Job Pipeline (16 checks → content-job-uat-agent)
Draft Quality (8) | Status | Agent Method
Topical Map path: only steps 9–12 executed | AUTO | Checks step log — steps 1–8 skipped for topical_map source
Draft incorporates all 9 brief sections | AUTO | Checks draft metadata for section references
Draft follows H1/H2/H3 from approved brief | AUTO | Parses headings, compares against article_structure
BLUF in each H2 (first 50 words = direct answer) | AUTO | Extracts first 50 words per H2, checks answer pattern
EAV triples from brief referenced in draft | SEMI | Scans for key entities — human confirms coverage
Humanization removes AI patterns | AUTO | Flags "in conclusion", "it's important to note", excess passive
Formatting follows brand guide | SEMI | Checks structure — human validates voice
QA catches placeholder content | AUTO | Regex for TBD, lorem, [insert]
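The placeholder scan named in the last row can be sketched with one regular expression. The pattern below is an assumption that covers the tokens the checklists mention (TBD, TODO, lorem ipsum, bracketed "[insert ...]" markers); the agent's actual pattern is not given in this document.

```python
import re

# Case-insensitive scan for common placeholder tokens (assumed pattern).
PLACEHOLDER = re.compile(
    r"\b(?:TBD|TODO|lorem(?: ipsum)?)\b|\[insert[^\]]*\]",
    re.IGNORECASE,
)


def find_placeholders(text):
    """Return every placeholder token found, in document order."""
    return [match.group(0) for match in PLACEHOLDER.finditer(text)]


draft = "Call us at TBD. Lorem ipsum dolor sit amet. Visit [insert address]."
print(find_placeholders(draft))
```

The word boundaries keep the scan from firing inside longer words; whether case-insensitive matching produces false positives (for example, "todo" in Spanish-language copy) depends on the client's content and should be reviewed per project.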
Post-Publish Coverage (8) | Status | Agent Method
ClusterKeyword.published_url set + returns 200 | AUTO | Checks field + fetches URL
ClusterKeyword.content_job_id linked | AUTO | Validates FK relationship
Coverage % recalculated | AUTO | Manual calc vs. TopicalMap.coverage_percentage
Published keyword shows green on map | AUTO | Checks status = 'published'
Internal links from brief in published content | AUTO | Fetches URL, checks for internal links to map pages
Published content passes SEO UAT | AUTO | Triggers seo-uat-agent on published URL
CQS audit triggered on published content | AUTO | Checks ContentAudit record created
Content quality acceptable (human final review) | MANUAL | Does content meet client quality standards
3 Agents — Independent Triggers
content-uat-agent · Trigger: ClickUp → "Content Ready for Review"
~3–5 min · $0.30/run
planner-uat-agent · Trigger: ContentBrief.status = "completed"
~8–12 min · $0.70/run
content-job-uat-agent · Trigger: ContentJob.status = "completed"
~5–8 min · $0.50/run
CRM
CRM Team + SNMS 6-Pillar UAT — 42 checks
34 auto · 7 semi · 1 manual
Skills: GHL API · HL Assistant · Lead Engine · Pipeline Reactivation Engine · SNMS Lead Engine
Scope: Standard CRM UAT + full SNMS 6-Pillar GHL deployment validation (Lead Capture, NEPQ, Nurture, Upsell, VIP, Reviews)
Standard CRM — Lead Capture & Flow (11) | Status | Agent Method
All forms submittable (contact, booking, register) | AUTO | Submits test data to each form, checks 200 + redirect
Data mapping correct (name, phone, email) | AUTO | Submits payload → checks GHL contact via API → validates fields
No duplicate leads on re-submit | AUTO | Submits same lead twice → checks GHL for duplicates
Leads go to correct pipeline/stage | AUTO | Queries GHL pipeline API → confirms stage match
Workflows triggered on lead creation | AUTO | Checks GHL workflow execution log for test contact
Leads assigned to correct reps | AUTO | Queries contact assignment vs. routing rules
Email/SMS delivery successful | AUTO | Checks GHL conversation history for test contact
No delivery failures in logs | AUTO | Queries activity log for bounce/failure events
Retry mechanism works | SEMI | Simulates failure — retry window human-defined
Alerts for unassigned leads | AUTO | Creates unassigned contact → checks alert fires
DND / opt-out flags validated | SEMI | Checks DND flags — compliance needs legal review
P1 — Speed-to-Lead (<30 sec) (5) | Status | Agent Method
Primary Bot ("SNMS Lead Intake") in Auto-Pilot mode, 1s wait | AUTO | GHL API: checks bot config — mode=auto-pilot, wait_time=1
Chat Widget (All-in-One) installed on shop website | AUTO | Fetches shop URL, checks for GHL chat widget script in source
Bot responds to test message in <30 seconds | AUTO | Sends test SMS → measures response timestamp delta
WF-01 creates opportunity + auto-tags source | AUTO | Creates test contact → checks opp created + src_* tag applied
SLA timer fires internal notification at 5 min | AUTO | Creates contact with no bot engagement → checks notification log
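The speed-to-lead row measures a timestamp delta between the test message and the bot's reply. A minimal sketch, assuming both timestamps are available as ISO 8601 strings without a timezone suffix; the function names are hypothetical and the way GHL exposes message timestamps is not covered here.

```python
from datetime import datetime


def response_seconds(sent_at, replied_at):
    """Seconds between the test message and the bot's reply (ISO 8601 strings)."""
    sent = datetime.fromisoformat(sent_at)
    replied = datetime.fromisoformat(replied_at)
    return (replied - sent).total_seconds()


def meets_speed_to_lead(sent_at, replied_at, limit_seconds=30):
    """True when the reply arrived after the message and within the limit."""
    delta = response_seconds(sent_at, replied_at)
    return 0 <= delta < limit_seconds


print(meets_speed_to_lead("2026-04-01T09:00:00", "2026-04-01T09:00:12"))
print(meets_speed_to_lead("2026-04-01T09:00:00", "2026-04-01T09:00:45"))
```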
P2 — NEPQ Qualification via Flow Builder (5) | Status | Agent Method
Flow Builder: 4 NEPQ nodes configured (Situation→Problem→Implication→Need Payoff) | AUTO | GHL API: checks bot flow structure — 4 Capture Information nodes exist
Custom fields populated (vehicle, pain_point, services) | AUTO | Runs test conversation → checks contact custom fields populated
AI Splitter branches correctly (Qualified → Book / Not → Nurture) | AUTO | Tests both branches: qualified contact gets booking link, unqualified enters WF-03
Appointment Booking action sends link to correct calendar | AUTO | Checks booking link URL matches service-specific calendar
Knowledge Base trained (website crawled + 10 FAQ + objection doc) | AUTO | GHL API: checks KB sources count ≥3, FAQ count ≥10
P3 — 90-Day Nurture (4) | Status | Agent Method
5 service-specific campaigns loaded (Stop on Reply = ON) | AUTO | GHL API: checks 5 campaigns exist with stop_on_reply=true
Re-Engagement Bot activates on nurture reply | AUTO | Simulates reply in nurture → checks Re-Engagement bot responds
Tag-based segmentation: hot/warm/cold applied by behavior | AUTO | Simulates reply + no-reply scenarios → checks correct tags applied
450+ O3 templates loaded in SMS + Email library | AUTO | GHL API: counts SMS + email templates, flags if <450
P4–P6 — Upsell, VIP, Reviews (10) | Status | Agent Method
Upsell WF-11 triggers on Won + service tag + correct wait days | AUTO | Simulates Won opp with svc_ppf → checks WF-11 fires at Day 30
Membership products configured in GHL Payments/Stripe | AUTO | GHL API: checks recurring products exist (Monthly Detail Club, Annual Plan)
B2B pipeline separate with 6 stages | AUTO | Checks pipeline config: 6 stages from Prospecting → Closed
VIP tier tags auto-applied (Bronze on 1st win) | AUTO | Simulates first Won → checks vip_bronze tag applied
3-year post-purchase WF-08 fires per service type | SEMI | Checks WF-08 structure has service-tag branching — human validates sequence timing
Win-back WF-09 triggers after 90-day inactivity | AUTO | Checks workflow date filter: last activity >90 days + has Won opp
Reviews AI Auto-Pilot active (4–5 stars, 2h delay) | AUTO | GHL API: checks Reviews AI config — mode=auto-pilot, star_filter=4-5, wait=2h
Negative reviews (1–3 stars) NOT auto-replied — task created for Sales Manager | AUTO | Simulates 2-star review event → checks task created, no auto-reply
Review request WF-07 fires after "Completed" stage | AUTO | Moves opp to Completed → checks SMS with review link sent after 2h
Referral trigger links working + attribution tags applied | AUTO | Clicks test trigger link → checks src_referral + referred_by_* tags
Post-Publish — Production Validation (7) | Status | Agent Method
Re-test all forms in production | AUTO | Same form tests against production URL
Full happy path: lead → AI → qualify → book → service → review → referral | SEMI | Agent runs automated flow — human monitors for edge cases
Full sad path: lead → no reply → nurture → cold → win-back | SEMI | Agent triggers flow — human validates timing/messaging quality
Voice AI Agent answers test calls (3+ scenarios) | SEMI | Agent dials test number — human evaluates conversation quality
Monitor first response time + contact rate (48h window) | SEMI | Agent pulls GHL data after 48h — human validates sample size
Stuck leads + automation failures checked daily | AUTO | Daily session: contacts stuck >24h in same stage
All 14 workflows firing correctly in production | MANUAL | Requires 5-day soft launch monitoring with real leads
Agent: crm-uat-agent · Opus 4.6 · ~15–25 min · ~$1.40/run
Skills: GHL API · HL Assistant · Lead Engine · SNMS Lead Engine · Pipeline Reactivation
MCP: GHL · ClickUp · Slack
Trigger: ClickUp → "CRM Ready for UAT" → n8n → session (runs SNMS 6-pillar checks in addition to standard)
GAds
Google Ads + 8 Skills UAT — 32 checks
25 auto · 6 semi · 1 manual
Platform: Google Ads Orchestrator
Skills: Keyword Architect · Copy Lab · Campaign Builder · Radius Optimizer · Conversion Tracker · Remarketing Engine · Performance Auditor · Paid-Organic Crosswalk
Tracking + Tagging (8) | Status | Agent Method
Google Tag / GTM properly installed | AUTO | Checks page source for gtag.js or GTM container
Conversion: form submissions tracked | AUTO | Submits test form, checks dataLayer push / GA4 event
Conversion: button clicks tracked | AUTO | Simulates clicks, checks dataLayer events
Conversion: purchase/booking tracked | SEMI | Validates tracking code — actual purchase needs human
Remarketing tags active (Google Ads pixel) | AUTO | Checks for remarketing pixel in page source
Google Ads linked with Google Analytics | AUTO | Queries GA4 Admin API for linked accounts
UTM tracking correct on all ad URLs | AUTO | Parses ad URLs for utm params, validates consistency
GHL pipeline sync configured (GCLID tracking) | AUTO | Conversion Tracker skill: checks GHL webhook → Ads offline import setup
Keyword Architect + Copy Lab (8) | Status | Agent Method
Keyword clusters generated with match types assigned | AUTO | Keyword Architect output: validates clusters have broad/phrase/exact assignments
Negative keyword list created (shared + campaign-level) | AUTO | Checks negative list exists with ≥50 terms, no keyword cannibalization
Ad group structure follows keyword clustering (no cross-pollination) | AUTO | Validates each ad group's keywords belong to a single semantic cluster
RSA headlines: 15 per ad group, unique, within 30 chars | AUTO | Copy Lab output: counts headlines, checks char limits, flags duplicates
RSA descriptions: 4 per ad group, within 90 chars | AUTO | Counts descriptions, validates char limits
Ad strength rating ≥ "Good" on all RSAs | SEMI | Agent checks Ads API ad_strength field — human reviews if "Average"
Sitelinks, callouts, call extensions configured | AUTO | Copy Lab output: checks extension count ≥4 sitelinks + ≥4 callouts
A/B test framework defined (pin positions documented) | SEMI | Agent checks for pin strategy doc — human validates creative rationale
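The RSA headline row (15 per ad group, unique, within 30 characters) can be sketched as a plain list validation. `rsa_headline_issues` is a hypothetical helper operating on Copy Lab output that has already been loaded as a list of strings; treating headlines that differ only by case or surrounding whitespace as duplicates is an assumption.

```python
def rsa_headline_issues(headlines, required=15, max_chars=30):
    """Validate one ad group's RSA headlines: count, uniqueness, length."""
    issues = []
    if len(headlines) != required:
        issues.append(f"expected {required} headlines, found {len(headlines)}")
    seen = set()
    for headline in headlines:
        key = headline.strip().lower()  # assumed duplicate rule: ignore case and padding
        if key in seen:
            issues.append(f"duplicate headline: {headline!r}")
        seen.add(key)
        if len(headline) > max_chars:
            issues.append(f"too long ({len(headline)} chars): {headline!r}")
    return issues


# Hypothetical headlines for illustration.
good = [f"Ceramic Coating Deal {i}" for i in range(15)]
bad = good[:13] + ["ceramic coating deal 0", "This headline is far longer than thirty characters"]
print(rsa_headline_issues(good))
print(rsa_headline_issues(bad))
```

The same function shape works for descriptions by calling it with `required=4, max_chars=90`.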
Campaign Builder + Radius Optimizer (8) | Status | Agent Method
Campaign structure matches Campaign Builder spec | AUTO | Compares live Ads structure against Campaign Builder output
Bid strategy correctly set (Maximize Conversions / Target CPA) | AUTO | Ads API: checks bid_strategy field matches spec
Budget allocation matches specified tier | AUTO | Sums campaign budgets, compares against total spend plan
Proximity/radius rings configured with bid modifiers | AUTO | Radius Optimizer output: validates tiered rings in Ads location targeting
Google Ads Scripts installed for radius automation | SEMI | Checks script exists in Ads account — human validates logic
Landing page loads fast (Lighthouse ≥70) | AUTO | Runs Lighthouse on all landing page URLs from campaigns
Ad scheduling matches business hours | AUTO | Ads API: checks ad_schedule settings against client hours
Remarketing audiences created (RLSA + Customer Match) | AUTO | Remarketing Engine: checks audience lists exist in Ads account
Post-Launch — Performance + Crosswalk (8) | Status | Agent Method
Conversions verified in real-time via GA4 | SEMI | Queries GA4 Real-Time API — human confirms volume
Quality Score audit: all keywords ≥5 | AUTO | Performance Auditor: pulls QS per keyword, flags <5
Wasted spend identified (search terms report audit) | AUTO | Performance Auditor: analyzes search terms, flags irrelevant with spend
Tracking not dropping (conversion count vs 7-day avg) | AUTO | Daily check: flags >30% conversion drop
Paid-Organic Crosswalk analysis generated | AUTO | Crosswalk skill: merges Ads data + organic rankings, identifies overlap
Cost savings opportunities identified from organic overlap | SEMI | Agent flags keywords ranking #1-3 organically with paid spend — human decides
Impression share >70% on brand terms | AUTO | Ads API: checks brand campaign impression share
Value-based conversion assignment per service type | MANUAL | Conversion Tracker: validates values match client revenue — requires business input
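The "tracking not dropping" row compares today's conversion count with the trailing 7-day average and flags a drop of more than 30%. A minimal sketch with hypothetical names and sample counts; pulling the counts from the Ads API is out of scope here.

```python
def conversion_drop(today, last_7_days, threshold=0.30):
    """True when today's conversions fall more than threshold below the 7-day average."""
    if not last_7_days:
        raise ValueError("need at least one prior day to compute an average")
    average = sum(last_7_days) / len(last_7_days)
    if average == 0:
        return False  # no baseline conversions, so a drop cannot be measured
    return (average - today) / average > threshold


# Hypothetical daily conversion counts for illustration.
history = [10, 12, 9, 11, 10, 8, 10]
print(conversion_drop(6, history))
print(conversion_drop(9, history))
```

Low-volume accounts will trip this rule on ordinary day-to-day noise, so a minimum-volume guard may be worth adding before alerting.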
Agent: gads-uat-agent · Opus 4.6 · ~12–18 min · ~$1.10/run
Skills: Keyword Architect · Copy Lab · Campaign Builder · Radius Optimizer · Conversion Tracker · Remarketing Engine · Performance Auditor · Paid-Organic Crosswalk
MCP: Google Ads API · ClickUp · Slack
Trigger: ClickUp → "Ads Ready for UAT" → n8n → session (runs full 8-skill validation)
Meta
Meta Ads + 5 Skills UAT — 26 checks
18 auto · 7 semi · 1 manual
Skills: Meta Campaign Accelerator (8-agent pipeline) · Meta Ads Optimizer · Meta Ads Diagnostics · Meta Ads Performance Hub · Meta Lead Gen Engine
Pixel + Events (6)StatusAgent Method
Meta Pixel base code + PageView event activeAUTOFetches page, checks for fbq('track','PageView')
Lead event firing on form submitAUTOSubmits test form → checks Lead event registered
Purchase/CompleteRegistration eventSEMIValidates code exists — actual purchase needs human test
Domain verified in Business ManagerAUTOChecks DNS TXT record or meta tag for FB verification
Aggregated Event Measurement configuredAUTOQueries Business Manager API for AEM config
Custom conversions set with correct URL rulesAUTOPulls custom conversion list, validates URL rules
Campaign Accelerator: Pre-Launch (10)
Check | Status | Agent Method
Buyer persona generated (VoC mining + competitor intel) | AUTO | Accelerator pipeline: checks persona doc exists with demographics + psychographics
4-Campaign architecture built (C-1 ASC+, C-2 Manual ABO, C-3 Creative Lab, C-4 Retargeting) | AUTO | Checks Meta Ads account for 4 campaigns matching architecture spec
6 Hook Archetypes applied across ad creatives | SEMI | Agent checks creative copy against hook patterns; human validates creative quality
Anchor Offer defined (irresistible lead magnet for detailing) | SEMI | Agent checks offer exists in ad copy; human validates offer strength
Messenger flow configured (if using Messenger ads) | AUTO | Lead Gen Engine: checks Messenger automation exists in business page
Instant Forms built with correct field mapping to GHL | AUTO | Lead Gen Engine: checks form fields + webhook/CRM integration active
Post ID strategy documented (graduation criteria) | SEMI | Checks for Post ID graduation doc; human validates criteria
Audience targeting matches buyer persona (interests, lookalikes, custom) | AUTO | Compares campaign audience settings against persona demographics
Pre-launch checklist passed (Diagnostics skill) | AUTO | Diagnostics: runs 5-layer check (Objective→Targeting→Creative→Bidding→LP)
Lead delivery to GHL pipeline working | AUTO | Submits test lead via Instant Form → checks GHL contact + opp created
Post-Launch: Performance + Optimization (10)
Check | Status | Agent Method
Real-time event tracking validated | SEMI | Checks Events Manager test events; human confirms behavior
Lead events match CRM data (Meta → GHL count alignment) | AUTO | Cross-references Meta lead count with GHL contact creation timestamps
CPL within target range | SEMI | Performance Hub: pulls spend + leads; human judges vs. target CPL
Campaign scoring Green/Yellow/Red applied | AUTO | Performance Hub: scores each campaign against benchmarks
Budget allocation quadrant analysis (Stars/Hidden Gems/Cash Cows/Money Pits) | AUTO | Optimizer: maps campaigns to quadrant, flags Money Pits for review
Audience saturation check (frequency <3) | AUTO | Optimizer: checks frequency metrics, flags saturation
Creative fatigue detection (CTR declining >20%) | AUTO | Diagnostics: compares WoW CTR, flags declining creatives
Funnel analysis: lead → appointment → show → close | SEMI | Optimizer: pulls funnel data; human validates stage conversion accuracy
Phone call tracking matched (if call ads) | SEMI | Lead Gen Engine: checks call tracking integration; human confirms accuracy
90-day roadmap generated from performance data | MANUAL | Strategic planning requiring human judgment on scaling direction
Agent: meta-uat-agent · Opus 4.6 · ~10–15 min · ~$0.95/run
Skills: Meta Campaign Accelerator · Meta Ads Optimizer · Meta Ads Diagnostics · Meta Ads Performance Hub · Meta Lead Gen Engine
MCP: ClickUp · Slack · GHL
Trigger: ClickUp → "Meta Ads Ready for UAT" → n8n → session (runs full 5-skill validation)
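The two numeric post-launch rules (frequency below 3, and CTR declining by more than 20% week over week) reduce to simple threshold arithmetic. The sketch below is illustrative only: `AdStats` and `fatigue_flags` are hypothetical names, and it assumes the CTR decline is measured relative to the previous week's CTR, which the checklist does not spell out.

```python
from dataclasses import dataclass
from typing import List

FREQUENCY_LIMIT = 3.0  # checklist: frequency < 3
CTR_DROP_LIMIT = 0.20  # checklist: CTR declining > 20%


@dataclass
class AdStats:
    name: str
    frequency: float      # average impressions per person
    ctr_last_week: float  # e.g. 0.020 for 2.0%
    ctr_this_week: float


def fatigue_flags(ad: AdStats) -> List[str]:
    """Return the checklist flags an ad trips, if any."""
    flags = []
    if ad.frequency >= FREQUENCY_LIMIT:
        flags.append("saturation")
    if ad.ctr_last_week > 0:
        # Relative week-over-week decline (assumption: relative, not absolute).
        drop = (ad.ctr_last_week - ad.ctr_this_week) / ad.ctr_last_week
        if drop > CTR_DROP_LIMIT:
            flags.append("creative_fatigue")
    return flags


# Made-up numbers for illustration: CTR fell from 2.0% to 1.5% (a 25% decline).
tired = AdStats("hook-a", frequency=3.4, ctr_last_week=0.020, ctr_this_week=0.015)
healthy = AdStats("hook-b", frequency=1.8, ctr_last_week=0.020, ctr_this_week=0.018)
print(fatigue_flags(tired), fatigue_flags(healthy))
```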
WEB
Web Design + Landing Page Pipeline + Playwright UAT — 62 checks
30 auto · 22 semi · 10 manual
Pipeline: Design-First (Figma MCP Go → Astro + Tailwind) or Code-First (Claude Code → Astro) → Playwright QA → Vercel/Netlify Deploy
Tools: Figma MCP Go (80+ tools) · Astro MCP · Playwright (visual/responsive/interaction/a11y) · GitHub Actions CI/CD
Visual & Brand Consistency (10)
Check | Status | Agent Method
UI matches Figma design (if Design-First) | SEMI | Figma MCP: export baseline → Playwright screenshot → side-by-side comparison
Colors consistent with brand palette | AUTO | Extracts CSS color values, compares against brand palette doc
Typography correct (font, size, line height) | AUTO | Audits computed styles vs. design system specs
Spacing & alignment consistent | SEMI | Checks CSS consistency; visual alignment needs human
Buttons/forms/cards consistent across pages | AUTO | Audits all component CSS classes for consistency
No visual mismatch between pages | SEMI | Multi-page screenshot comparison
Icon style consistent (no mixed libraries) | SEMI | Detects icon libraries in use, flags mixed sets
No placeholder assets (dummy images, stock IDs) | AUTO | Checks filenames/alt for "placeholder", "dummy", "sample"
Images high quality (not blurry/upscaled) | SEMI | Checks resolution vs. display size; human reviews quality
Design token consistency (CSS variables match Figma variables) | AUTO | Figma MCP: get_variable_defs → compare against CSS custom properties
Playwright: Visual Regression (6)
Check | Status | Agent Method
Full-page screenshot regression test passes (≤1% diff) | AUTO | npx playwright test visual (fullPage comparison against baseline)
Per-section regression: Hero, Services, Testimonials, CTA, Contact, Footer | AUTO | Per-section element screenshots vs. baselines
Desktop (1440px) visual test passes | AUTO | Playwright project: Desktop Chrome viewport
Mobile (375px) visual test passes | AUTO | Playwright project: iPhone 14 viewport
Tablet (768px) visual test passes | AUTO | Playwright project: iPad viewport
Figma baseline comparison (Design-First only) | SEMI | Claude multimodal: reads Figma export + implementation screenshot, reports diffs
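The "≤1% diff" rule is a ratio of changed pixels to total pixels. Playwright's `toHaveScreenshot` assertion exposes a `maxDiffPixelRatio` option for this; the sketch below only shows the underlying arithmetic on plain RGB tuples, with hypothetical helper names, and is not how Playwright implements its comparison.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]
MAX_DIFF_RATIO = 0.01  # checklist: at most 1% of pixels may differ


def diff_ratio(baseline: Sequence[Pixel], current: Sequence[Pixel],
               tolerance: int = 0) -> float:
    """Fraction of pixels whose RGB channels differ by more than `tolerance`."""
    if len(baseline) != len(current):
        raise ValueError("screenshots must have the same dimensions")
    if not baseline:
        return 0.0
    changed = sum(
        1
        for a, b in zip(baseline, current)
        if any(abs(x - y) > tolerance for x, y in zip(a, b))
    )
    return changed / len(baseline)


def passes_regression(baseline: Sequence[Pixel], current: Sequence[Pixel]) -> bool:
    return diff_ratio(baseline, current) <= MAX_DIFF_RATIO


# Illustrative 1000-pixel "screenshots": one changed pixel is a 0.1% diff.
white = [(255, 255, 255)] * 1000
one_changed = [(0, 0, 0)] + white[1:]
twenty_changed = [(0, 0, 0)] * 20 + white[20:]
print(passes_regression(white, one_changed), passes_regression(white, twenty_changed))
```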
Playwright: Responsive Layout (8)
Check | Status | Agent Method
Mobile: nav shows hamburger, cards stack vertically | AUTO | Playwright: checks #menu-toggle visible, nav hidden, cards Y-stacked
Desktop: nav shows links, cards in grid | AUTO | Playwright: checks nav visible, cards same-Y position
No broken layout on smaller screens (no overflow) | AUTO | Detects horizontal scroll, text overflow, clipping
Text readable on all devices (≥14px mobile) | AUTO | Checks computed font-size, WCAG AA contrast
CTA buttons mobile-friendly (tap target ≥48×48px) | AUTO | Validates tap target sizes per Google guidelines
No dead-end pages (all pages have nav + CTA) | AUTO | Crawls pages, flags any without navigation links
Real device testing (beyond emulation) | SEMI | Emulated tests auto; real iOS/Android needs human with device
Cross-browser: Chrome, Safari, Edge | SEMI | Chromium auto; Safari/Edge need BrowserStack or manual
Playwright: Interactions + Accessibility (10)
Check | Status | Agent Method
Mobile menu opens/closes correctly | AUTO | Playwright: click toggle → menu visible → click link → menu closes
Hamburger has aria-expanded attribute toggling | AUTO | Checks aria-expanded false→true→false on toggle
CTA buttons have hover styles (visual change) | AUTO | Screenshots default + hover state, validates difference
Contact form accepts input and validates | AUTO | Playwright: fills all fields, verifies values, screenshots filled form
Heading hierarchy correct (single H1, sequential H2→H3) | AUTO | Counts H1 (must be 1), validates heading sequence
All images have alt text or role="presentation" | AUTO | Iterates all <img>, checks alt or role attribute
Form inputs have associated labels | AUTO | Checks label[for=id] exists for each input
Keyboard navigation reaches all interactive elements | AUTO | Tab through all elements, counts focused interactive elements
Color contrast WCAG AA compliant | AUTO | WCAG contrast check on all text/background combinations
Font size accessible (≥12px minimum) | AUTO | Min font-size check across breakpoints
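The WCAG AA contrast row rests on a published formula: relative luminance per color, then the ratio (L1 + 0.05) / (L2 + 0.05), with AA requiring at least 4.5:1 for normal text and 3:1 for large text. A minimal sketch of that calculation follows; the function names are hypothetical, and a real check would still need to extract each text/background pair from computed styles.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(rgb: RGB) -> float:
    r, g, b = rgb
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: RGB, bg: RGB, large_text: bool = False) -> bool:
    """AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


WHITE = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), WHITE), 2))  # black on white
print(passes_aa((0x76, 0x76, 0x76), WHITE), passes_aa((0x77, 0x77, 0x77), WHITE))
```

The two greys in the last line sit on either side of the 4.5:1 boundary against white, which makes them a handy sanity test for any contrast checker.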
Build + Deploy Pipeline (8)
Check | Status | Agent Method
Astro build compiles without errors (npm run build) | AUTO | Runs build command, checks exit code 0
GitHub Actions CI pipeline passes | AUTO | Checks workflow run status via GitHub API
Vercel/Netlify deployment successful | AUTO | Checks deployment status API, validates live URL returns 200
SEO: meta tags, sitemap.xml, robots.txt present | AUTO | Chains to seo-uat-agent for standard SEO checks
User journey smooth (landing → form → submit → thank you) | MANUAL | Requires user testing or session recording
No confusing steps in conversion flow | MANUAL | Agent maps flow but can't judge confusion
Collect feedback from client/stakeholders | MANUAL | Human communication; can't automate
Performance: Lighthouse ≥70 mobile, ≥80 desktop | AUTO | Runs Lighthouse via Playwright, validates scores
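The Lighthouse row is a per-form-factor threshold gate. Lighthouse's JSON report stores category scores on a 0 to 1 scale, so the sketch below converts to the 0 to 100 scale used in the checklist before comparing. The function names and the shape of the input dict are assumptions for illustration.

```python
MIN_SCORE = {"mobile": 70, "desktop": 80}  # thresholds from the checklist row


def to_percent(raw_score: float) -> int:
    """Convert a 0-1 Lighthouse category score to the familiar 0-100 scale."""
    return round(raw_score * 100)


def lighthouse_gate(raw_scores: dict) -> dict:
    """Map each form factor to True (pass) or False (fail or missing score)."""
    return {
        form_factor: form_factor in raw_scores
        and to_percent(raw_scores[form_factor]) >= minimum
        for form_factor, minimum in MIN_SCORE.items()
    }


# Made-up scores: mobile clears its 70 bar, desktop misses its 80 bar.
print(lighthouse_gate({"mobile": 0.72, "desktop": 0.78}))
```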
Post-Deploy: Production Monitoring (20)
Check | Status | Agent Method
Live UI matches approved design | SEMI | Side-by-side screenshots; human sign-off
UI inconsistencies in production detected | AUTO | Re-runs all Playwright tests on production URL
Fix spacing/alignment issues | MANUAL | Requires dev work; agent flags only
Users understand flow (session recording) | MANUAL | Needs Hotjar/FullStory analysis
No unexpected drop-off | SEMI | Pulls GA4 funnel data; human interprets
CTA visibility strong | SEMI | Above-fold CTA check; human reviews impact
Touch interactions work on real devices | MANUAL | Touch/swipe needs real device testing
Identify UX improvement opportunities | SEMI | Analyzes data patterns; human decides priorities
Propose UI/UX enhancements | SEMI | Drafts recommendations; human reviews
No users completing key actions with friction | MANUAL | Requires real user testing
Agent: web-uat-agent · Opus 4.6 · ~15–25 min · ~$1.30/run
Skills: Agency OS · AI Visibility
MCP: Figma MCP Go · ClickUp · Slack
Env: Node + Puppeteer + Lighthouse + Playwright CLI · GitHub API access
Trigger: ClickUp → "Design Ready for UAT" → session (auto-detects Design-First vs Code-First path)
IT
IT Team UAT — 35 checks
28 auto · 6 semi · 1 manual
Pre-Publish: Infrastructure (4)
Check | Status | Agent Method
Production env configured (domain, SSL) | AUTO | DNS resolution, SSL cert validity + expiry, headers
HTTPS working (no mixed content) | AUTO | Crawls pages, flags http:// resource loads
Env variables correct (API keys, DB) | SEMI | Tests API connectivity; human verifies secrets
No staging/dev configs remain | AUTO | Checks for staging URLs, debug flags, console.log
Pre-Publish: APIs & Auth (5)
Check | Status | Agent Method
All APIs functioning | AUTO | Health-check each endpoint, check 200 + valid response
Timeout & error handling | AUTO | Sends slow/invalid requests, checks graceful handling
Retry logic works | AUTO | Simulates failure, monitors retry in logs
Auth (OAuth/token) works | AUTO | Tests valid + invalid creds, checks token refresh
Webhooks sending & receiving | AUTO | Triggers event, checks receipt on endpoint
Pre-Publish: User Flows + Security (8)
Check | Status | Agent Method
Full user flow works (landing→form→submit→thanks) | AUTO | Puppeteer navigates full flow, validates each step
Form validation correct | AUTO | Empty/invalid/valid submissions, checks messages
Error messages clear | SEMI | Captures messages; human judges end-user clarity
reCAPTCHA working | SEMI | Checks script loads + renders; can't solve (by design)
No exposed API keys/secrets | AUTO | Scans source + JS bundles for key patterns
Input validation (XSS/SQLi) | AUTO | Submits common payloads, checks sanitization
Security headers (CORS, CSP) | AUTO | Checks CSP, X-Frame, X-Content-Type, HSTS
SSL/TLS secure | AUTO | TLS version, cipher suite, cert chain
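The security-headers row names four headers by shorthand (CSP, X-Frame, X-Content-Type, HSTS). A minimal sketch of the presence check follows, assuming the response headers are already available as a dict; `missing_security_headers` is a hypothetical helper, and it checks presence only, not whether each policy value is strong enough.

```python
from typing import Dict, List

# Full header names for the shorthand used in the checklist row.
REQUIRED_HEADERS = (
    "content-security-policy",    # CSP
    "x-frame-options",            # X-Frame
    "x-content-type-options",     # X-Content-Type
    "strict-transport-security",  # HSTS
)


def missing_security_headers(headers: Dict[str, str]) -> List[str]:
    """Return required headers absent from a response (names are case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name not in present]


# Illustrative response headers with two of the four present.
response_headers = {
    "Content-Type": "text/html",
    "Strict-Transport-Security": "max-age=31536000",
    "X-Content-Type-Options": "nosniff",
}
print(missing_security_headers(response_headers))
```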
Pre-Publish: Performance + Testing (7)
Check | Status | Agent Method
Page load <3 seconds | AUTO | Lighthouse TTFB + FCP + LCP measurement
No JS/CSS errors | AUTO | Captures console errors via Puppeteer
API response time acceptable | AUTO | Times each call, flags >500ms
Images optimized | AUTO | Checks sizes >200KB, WebP/AVIF usage
Mobile / tablet / desktop testing | AUTO | Emulates 6+ viewports, captures errors
Cross-browser (Chrome, Safari, Edge) | SEMI | Tests Chromium; Safari/Edge need manual or BrowserStack
Error logging + alerts configured | AUTO | Triggers errors, checks logs appear in monitoring
Post-Publish (11)
Check | Status | Agent Method
Server uptime | AUTO | Pings every 5 min, alerts Slack on downtime
Error logs with real traffic | AUTO | Groups by type/frequency, flags new errors
API failure rates | AUTO | Queries monitoring for failure %, alerts >1%
User journeys not broken | SEMI | Re-runs flows; edge cases need real monitoring
No unexpected crashes | AUTO | Monitors 5xx + process restarts
CRM receives data | AUTO | Test form → check GHL contact creation
Payment works in production | MANUAL | Real payment test needs human with test card
Webhooks not failing | AUTO | Monitors delivery logs for failures
Traffic spike handling | AUTO | Basic load test, checks response degradation
Memory / CPU stable | AUTO | Queries hosting metrics API
Error rate within threshold | AUTO | 24h error rate vs. threshold (<0.5%)
Agent: it-uat-agent · Opus 4.6 · ~15–30 min · ~$1.20/run
Skills: Agency OS
MCP: ClickUp · Slack
Env: Python + Node + Puppeteer + Lighthouse + curl + jq
Trigger: ClickUp → "Tech Ready for UAT" → session
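Two of the post-publish rows carry explicit numeric thresholds: alert when the API failure rate exceeds 1%, and keep the 24-hour error rate below 0.5%. The sketch below shows that comparison in isolation; the function names and the raw counts are hypothetical, and where the counts come from (monitoring API, logs) is outside its scope.

```python
from typing import List

API_FAILURE_ALERT = 0.01  # checklist: alert when failure rate > 1%
ERROR_RATE_LIMIT = 0.005  # checklist: 24h error rate must stay < 0.5%


def rate(failures: int, total: int) -> float:
    """Failure share of total; 0.0 when there was no traffic."""
    return failures / total if total else 0.0


def post_publish_alerts(api_failures: int, api_calls: int,
                        errors_24h: int, requests_24h: int) -> List[str]:
    """Return the names of thresholds breached by the given counts."""
    alerts = []
    if rate(api_failures, api_calls) > API_FAILURE_ALERT:
        alerts.append("api_failure_rate")
    if rate(errors_24h, requests_24h) >= ERROR_RATE_LIMIT:
        alerts.append("error_rate")
    return alerts


# Made-up counts: 1.5% API failures breaches the limit, 0.2% errors does not.
print(post_publish_alerts(api_failures=15, api_calls=1000,
                          errors_24h=2, requests_24h=1000))
```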

Full-Site UAT: One Trigger, All 12 Agents in Parallel

When a client site reaches final staging, one ClickUp status change to "Full UAT" triggers all 12 agents via n8n. Each runs its checklist independently. The site only goes live when all departments pass.
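The fan-out and go-live gate described above can be sketched with `asyncio`: start every agent at once, wait for all of them, and allow launch only if none failed. This is an illustration under stated assumptions, not the n8n workflow itself; `run_agent` is a hypothetical stub standing in for a real agent session, and the pass/fail inputs are supplied by hand.

```python
import asyncio
from typing import Dict, List, Tuple


async def run_agent(name: str, will_pass: bool) -> Tuple[str, bool]:
    """Stub for one agent session; a real one would run its full checklist."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return name, will_pass


async def full_site_uat(agents: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Run all agents concurrently; go live only if every department passes."""
    outcomes = await asyncio.gather(
        *(run_agent(name, will_pass) for name, will_pass in agents.items())
    )
    failed = [name for name, passed in outcomes if not passed]
    return (not failed, failed)


if __name__ == "__main__":
    # Two agents shown for brevity; the plan runs twelve.
    go_live, failed = asyncio.run(
        full_site_uat({"seo-uat-agent": True, "web-uat-agent": False})
    )
    print(go_live, failed)
```

Because the agents run concurrently, wall-clock time is governed by the slowest agent rather than the sum of all runs, which is why the parallel total stays in the tens of minutes.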

Agent | Checks | Auto | Semi | Manual | Time | Cost
seo-uat-agent | 17 | 14 | 3 | 0 | ~5–10 min | $0.50
seo-ops-agent | 32 | 20 | 11 | 1 | ~15–25 min | $0.60
crm-uat-agent | 42 | 34 | 7 | 1 | ~15–25 min | $1.40
content-uat-agent | 12 | 7 | 3 | 2 | ~3–5 min | $0.30
gads-uat-agent | 32 | 25 | 6 | 1 | ~12–18 min | $1.10
meta-uat-agent | 26 | 18 | 7 | 1 | ~10–15 min | $0.95
web-uat-agent | 62 | 30 | 22 | 10 | ~15–25 min | $1.30
it-uat-agent | 35 | 28 | 6 | 1 | ~15–30 min | $1.20
topical-map-uat-agent (SEO) | 18 | 15 | 3 | 0 | ~5–8 min | $0.45
planner-uat-agent (CTN) | 22 | 18 | 4 | 0 | ~8–12 min | $0.70
auditor-uat-agent (SEO) | 20 | 17 | 3 | 0 | ~6–10 min | $0.65
content-job-uat-agent (CTN) | 16 | 13 | 2 | 1 | ~5–8 min | $0.50
TOTAL (12 Agents) | 334 | 239 | 77 | 18 | ~30–40 min | ~$9.65
Before: Manual UAT All Teams
Each dept runs their checklist manually
~2–4 hours per team per site
Total: ~15–25 hours per full site UAT (incl. content pipeline)
Inconsistent, no cross-team report
Post-publish monitoring ad-hoc
~15–25 hrs/week agency-wide
After: 12 Agents in Parallel
All 12 agents run simultaneously
~30–40 min total (parallel)
239 of 334 checks fully automated
Unified report in ClickUp + Slack
Post-publish daily × 14 days
~$9.65/run · human only for 18 manual items

Full-site UAT becomes a ~$9.65, 30–40 minute automated run instead of a 15–25 hour manual effort. The 239 fully automated checks never get skipped, 77 semi-automated checks are flagged for human judgment, and 18 manual items (UX testing, payment, user feedback) stay human because they should be.
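The headline totals can be re-derived from the per-agent rows in the table above, which is a useful guard whenever a checklist changes. The sketch below copies those rows by hand (costs in cents to avoid floating-point drift); the function names are hypothetical.

```python
# (checks, auto, semi, manual, cost in cents) per agent, copied from the table.
AGENTS = {
    "seo-uat-agent": (17, 14, 3, 0, 50),
    "seo-ops-agent": (32, 20, 11, 1, 60),
    "crm-uat-agent": (42, 34, 7, 1, 140),
    "content-uat-agent": (12, 7, 3, 2, 30),
    "gads-uat-agent": (32, 25, 6, 1, 110),
    "meta-uat-agent": (26, 18, 7, 1, 95),
    "web-uat-agent": (62, 30, 22, 10, 130),
    "it-uat-agent": (35, 28, 6, 1, 120),
    "topical-map-uat-agent": (18, 15, 3, 0, 45),
    "planner-uat-agent": (22, 18, 4, 0, 70),
    "auditor-uat-agent": (20, 17, 3, 0, 65),
    "content-job-uat-agent": (16, 13, 2, 1, 50),
}


def totals(rows: dict) -> dict:
    """Column-wise sums across all agents."""
    checks, auto, semi, manual, cents = (sum(col) for col in zip(*rows.values()))
    return {"checks": checks, "auto": auto, "semi": semi,
            "manual": manual, "cost_cents": cents}


def inconsistent_rows(rows: dict) -> list:
    """Agents whose auto + semi + manual does not equal their check count."""
    return [name for name, (c, a, s, m, _) in rows.items() if a + s + m != c]


summary = totals(AGENTS)
print(summary, inconsistent_rows(AGENTS))
print(f"fully automated share: {summary['auto'] / summary['checks']:.0%}")
```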