Performance Enhancer Agent — Design
Date: 2026-04-22
Status: Approved for planning
Artifact location: docs/performance-enhancer/performance-enhancer.md (committed); docs/performance-enhancer/runs/ (generated, gitignored)
Purpose
A generic, reusable Claude Code agent prompt that:
- Restarts the Rails dev server with clean logs.
- Logs in as a seeded test user.
- Drives a user-specified, replayable set of UI scenarios via MCP Playwright.
- Captures per-scenario server-side and client-side performance metrics.
- Analyzes the logs, identifies hotspots, researches root causes in the codebase.
- Emits a fix-plan prompt (another .md) that, when handed to a fresh Claude session, implements the fixes AND re-executes the same scenarios to produce a full professional before/after comparison report.
The agent itself does not implement fixes. It produces a handoff prompt. The fix prompt enforces comparison as a completion gate.
Non-goals
- Not a production load/benchmark tool. Dev env only.
- Not a replacement for APM in production.
- Not meant to be run in CI. Human-initiated.
- Does not propose architectural changes beyond perf fixes.
Inputs
The only user-supplied input is the ## Scenarios section of performance-enhancer.md. Users edit this section before each run.
Grammar:
- One step per line, imperative.
- Reference routes by path, not human description.
- Form submissions include the data as an inline hash.
- Click targets use the visible button/link text.
Example:
- Visit /dashboard
- Visit /goals
- Visit /goals/new; fill {title: "Perf test goal", target_date: "2026-12-31"}; submit
- On the resulting /goals/:id page, click "Edit"; change title to "Perf test goal edited"; submit
- Visit /debts
The scenario list is the contract between the before run and the after run — both must execute the same list verbatim for the comparison to be meaningful.
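A minimal sketch of how the step list could be pulled out of the file verbatim so it can be frozen for later replay. The function name `extract_scenarios` is hypothetical, and the sketch assumes every bullet under the `## Scenarios` heading is a step (grammar notes written as bullets in the same section would need to be filtered separately).

```python
import re


def extract_scenarios(markdown: str) -> list[str]:
    """Return the bullet items under the '## Scenarios' heading, verbatim.

    Assumption: the section ends at the next '## ' heading and contains
    only step bullets of the form '- <step text>'.
    """
    steps = []
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Entering or leaving the Scenarios section.
            in_section = line.strip() == "## Scenarios"
            continue
        if in_section:
            match = re.match(r"^\s*-\s+(.*\S)\s*$", line)
            if match:
                steps.append(match.group(1))
    return steps
```

Keeping the step text byte-for-byte identical matters more than parsing it: the agent interprets each step itself, and the list only has to be the same in both runs.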
Architecture
Tools
- MCP Playwright (already available in Claude Code sessions with the plugin). Uses `browser_navigate`, `browser_click`, `browser_fill_form`, `browser_type`, `browser_snapshot`, etc.
- Bash for process control, log truncation, log parsing.
- Rails dev server started via `bin/dev` (foreman + Procfile.dev; web on port 3004).
- Existing `bullet` gem → `log/bullet.log`.
- Existing `log/development.log` → server-side request metrics.
No new dependencies. No new gems. No JS test toolchain.
Credentials
Hardcoded in the .md: [email protected] / 123456. This user is created by db/seeds.rb. Users must have seeded the DB before running the agent.
File layout
docs/performance-enhancer/
├── performance-enhancer.md (committed; the generic agent)
└── runs/ (gitignored; all generated)
    ├── <ts>-before.json (raw per-scenario metrics)
    ├── <ts>-findings.md (hotspot analysis + root causes)
    ├── <ts>-plan.md (the fix prompt)
    ├── <ts>-after.json (written by the plan)
    └── <ts>-comparison.md (written by the plan)
<ts> is a single timestamp per run (format YYYY-MM-DD-HHMMSS) shared across all five files. This links a before run to its after run deterministically.
docs/performance-enhancer/runs/ will be added to .gitignore.
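As a sketch of the naming scheme, the five artifact paths can all be derived from one timestamp. The helper names `run_timestamp` and `run_paths` are hypothetical; the directory, suffixes, and timestamp format come from the layout above.

```python
from datetime import datetime
from pathlib import Path

RUNS_DIR = Path("docs/performance-enhancer/runs")
SUFFIXES = ("before.json", "findings.md", "plan.md", "after.json", "comparison.md")


def run_timestamp(now: datetime) -> str:
    """Format a datetime as YYYY-MM-DD-HHMMSS."""
    return now.strftime("%Y-%m-%d-%H%M%S")


def run_paths(ts: str) -> dict[str, Path]:
    """Map each artifact suffix to its path for the run identified by ts."""
    return {suffix: RUNS_DIR / f"{ts}-{suffix}" for suffix in SUFFIXES}
```

Generating the timestamp once and passing it around avoids a before run and its after run ending up under different prefixes.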
Execution flow
Phase 1 — Setup (clean slate)
- Kill any running `bin/dev`, foreman, or `rails server` processes.
  - Graceful first (SIGTERM), then force (SIGKILL) if needed.
  - Verify port 3004 is free before proceeding.
- Truncate `log/development.log` and `log/bullet.log` (`: > log/development.log`).
- Start `bin/dev` in the background.
- Poll `http://localhost:3004` until it returns a non-5xx response (timeout after 60s).
- Open a Playwright browser session via MCP, navigate to the sign-in page, authenticate as [email protected] / 123456.
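The readiness poll is the step most likely to hang a run, so here is one way it could look. This is a sketch using only the Python standard library; `probe_status` and `wait_for_server` are hypothetical names, and the agent may equally do this with a shell loop.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def probe_status(url: str) -> Optional[int]:
    """Return the HTTP status for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx: a server answered, just not with success
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, etc.


def wait_for_server(
    url: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    probe: Callable[[str], Optional[int]] = probe_status,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until url returns a non-5xx status; False once the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = probe(url)
        if status is not None and status < 500:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

`wait_for_server("http://localhost:3004")` returning `False` corresponds to the abort case in the error-handling rules. The `probe`, `sleep`, and `clock` parameters exist only so the loop can be exercised without a live server.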
Phase 2 — "Before" run
For each step in ## Scenarios, in order:
- Record a step-start timestamp (ms precision).
- Execute the step via Playwright (`browser_navigate`, `browser_click`, `browser_fill_form`, etc., as the step dictates).
- Capture Playwright navigation time (time-to-load-event for navigations; click-to-next-paint for in-page clicks).
- Record step-end timestamp.
- Tail `log/development.log` for all `Started`/`Completed` request pairs whose Rails timestamp falls in `[step-start, step-end]`. For each request extract:
  - HTTP method, path, status
  - Total time (ms)
  - View render time (ms)
  - ActiveRecord time (ms) and query count (parsed from `ActiveRecord: Xms (N queries)`)
  - Allocations (ms)
- Collect any lines from `log/bullet.log` whose timestamp falls in the same window.
- Append a structured record to `runs/<ts>-before.json`.
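The log-window extraction above could be sketched as follows. The regexes assume the default, untagged Rails development log format and sequential (non-interleaved) requests. The sample line shapes in the comments are an assumption and vary by Rails version, so every field after the total time is treated as optional. Two details are worth noting: `Started` lines carry second-precision timestamps, so the window start is floored to the second, and Rails reports `Allocations` as an object count, which the sketch stores as-is under the schema's `allocations_ms` key.

```python
import re
from datetime import datetime

# Assumed line shapes (default Rails dev log, no tags):
#   Started GET "/goals" for 127.0.0.1 at 2026-04-22 14:30:01 +0000
#   Completed 200 OK in 387ms (Views: 120.0ms | ActiveRecord: 245.0ms (42 queries, 3 cached) | Allocations: 22000)
STARTED = re.compile(
    r'^Started (?P<method>[A-Z]+) "(?P<path>[^"]+)" for \S+ at '
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)
COMPLETED = re.compile(r"^Completed (?P<status>\d{3}) .*? in (?P<total>\d+)ms")
VIEWS = re.compile(r"Views: (?P<ms>[\d.]+)ms")
ACTIVE_RECORD = re.compile(r"ActiveRecord: (?P<ms>[\d.]+)ms(?: \((?P<count>\d+) queries)?")
ALLOCATIONS = re.compile(r"Allocations: (?P<n>\d+)")


def parse_requests(log_text: str, window_start: datetime, window_end: datetime) -> list[dict]:
    """Return one record per Started/Completed pair that started in the window.

    window_start and window_end must be timezone-aware datetimes.
    """
    floor_start = window_start.replace(microsecond=0)  # log timestamps have no ms
    requests, current = [], None
    for line in log_text.splitlines():
        started = STARTED.match(line)
        if started:
            at = datetime.strptime(started["ts"], "%Y-%m-%d %H:%M:%S %z")
            in_window = floor_start <= at <= window_end
            current = {"method": started["method"], "path": started["path"]} if in_window else None
            continue
        completed = COMPLETED.match(line)
        if completed and current is not None:
            views = VIEWS.search(line)
            ar = ACTIVE_RECORD.search(line)
            alloc = ALLOCATIONS.search(line)
            current.update(
                status=int(completed["status"]),
                total_ms=int(completed["total"]),
                view_ms=float(views["ms"]) if views else None,
                db_ms=float(ar["ms"]) if ar else None,
                query_count=int(ar["count"]) if ar and ar["count"] else None,
                allocations_ms=int(alloc["n"]) if alloc else None,
            )
            requests.append(current)
            current = None
    return requests
```

Missing fields come back as `None` rather than a guessed number, in line with the rule that the agent must not fabricate metrics.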
before.json / after.json schema
{
  "timestamp": "2026-04-22-143000",
  "scenario_source": [
    "Visit /dashboard",
    "Visit /goals",
    "Visit /goals/new; fill {title: \"Perf test goal\", target_date: \"2026-12-31\"}; submit"
  ],
  "scenarios": [
    {
      "index": 1,
      "step": "Visit /goals",
      "playwright_ms": 412,
      "requests": [
        {
          "method": "GET",
          "path": "/goals",
          "status": 200,
          "total_ms": 387,
          "view_ms": 120,
          "db_ms": 245,
          "query_count": 42,
          "allocations_ms": 22
        }
      ],
      "bullet_warnings": [
        "N+1 detected: Goal => [:category] in GoalsController#index"
      ]
    }
  ]
}
Identical schema in both runs is mandatory — the comparison depends on it.
scenario_source is a verbatim copy of the user's ## Scenarios list at the time of the before run. This is the immutable contract the after run must replay. If performance-enhancer.md is edited between runs, the plan prompt uses before.json.scenario_source as the source of truth, not the live .md.
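One way the after run could check that it replayed the frozen contract is sketched below. `load_contract` and `replay_mismatches` are hypothetical helper names; only the `scenario_source` key comes from the schema.

```python
import json


def load_contract(before_json_text: str) -> list[str]:
    """Read the frozen step list from the text of a before.json file."""
    return list(json.loads(before_json_text)["scenario_source"])


def replay_mismatches(contract: list[str], executed: list[str]) -> list[str]:
    """Describe every difference between the contract and the executed steps.

    An empty result means the replay was verbatim and in order.
    """
    problems = []
    if len(contract) != len(executed):
        problems.append(f"expected {len(contract)} steps, got {len(executed)}")
    for i, (want, got) in enumerate(zip(contract, executed), start=1):
        if want != got:
            problems.append(f"step {i}: expected {want!r}, got {got!r}")
    return problems
```

Comparing exact strings is deliberate: even a reworded step would make the before and after numbers incomparable.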
Phase 3 — Analysis & artifact generation
- Build a per-scenario summary table from `before.json`.
- Flag hotspots using editable thresholds defined at the top of `performance-enhancer.md`:
  - `total_ms > 500`
  - `query_count > 30`
  - any Bullet warning
  - `db_ms > 0.6 * total_ms`
  - repeated identical queries within a single request (loose dup detection)
- For the top hotspots, read the Rails code responsible (controller action, model, view partials) to identify root causes: missing `includes`, `.count` on relations in views, unscoped loads, fragment caching opportunities, etc.
- Optionally consult context7 for framework-specific guidance on any non-obvious fix.
- Write `runs/<ts>-findings.md`:
  - Summary table (scenario × metric)
  - Ordered hotspot list with root-cause notes and suggested fixes
  - Per-hotspot: file references with line numbers
- Write `runs/<ts>-plan.md` (see next section).
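The threshold rules above can be sketched as a pure function over one scenario record. The names `THRESHOLDS`, `flag_request`, and `flag_scenario` are hypothetical; the numeric defaults are the ones listed. Duplicate-query detection is left out because the schema does not store SQL text, so it would have to work from the raw log instead.

```python
THRESHOLDS = {"total_ms": 500, "query_count": 30, "db_fraction": 0.6}


def flag_request(req: dict, thresholds: dict = THRESHOLDS) -> list[str]:
    """Return the names of the threshold rules a single request violates."""
    reasons = []
    total = req.get("total_ms")
    db = req.get("db_ms")
    count = req.get("query_count")
    if total is not None and total > thresholds["total_ms"]:
        reasons.append("total_ms")
    if count is not None and count > thresholds["query_count"]:
        reasons.append("query_count")
    # Skip the ratio check when total is missing or zero.
    if total and db is not None and db > thresholds["db_fraction"] * total:
        reasons.append("db_fraction")
    return reasons


def flag_scenario(scenario: dict, thresholds: dict = THRESHOLDS) -> list[str]:
    """Union of request-level reasons, plus any Bullet warning."""
    reasons = []
    for req in scenario.get("requests", []):
        for reason in flag_request(req, thresholds):
            if reason not in reasons:
                reasons.append(reason)
    if scenario.get("bullet_warnings"):
        reasons.append("bullet_warning")
    return reasons
```

Applied to the `/goals` record from the schema example, this yields `query_count`, `db_fraction`, and `bullet_warning`: 42 queries exceeds 30, and 245 ms of database time exceeds 0.6 × 387 = 232.2 ms, while the 387 ms total stays under 500.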
Phase 4 (implementing the fixes and re-running the scenarios for the after run) is not executed by this agent. It is defined as instructions inside the plan.md the agent generates.
The generated plan.md (fix prompt)
The generic performance-enhancer.md holds this template. The agent fills in placeholders at generation time and writes the concrete plan as runs/<ts>-plan.md.
# Performance fix plan — <timestamp>
## Context
<PLACEHOLDER: one-paragraph summary of the app area(s) touched, populated by the agent>
## Before-run artifacts (do not modify)
- Scenarios (contract): runs/<ts>-before.json → `scenario_source` (frozen copy; authoritative)
- Raw metrics: docs/performance-enhancer/runs/<ts>-before.json
- Findings: docs/performance-enhancer/runs/<ts>-findings.md
## Hotspots to fix (ordered by impact)
<PLACEHOLDER: ordered list. Each entry:
N. <route or action> — <metric that flagged it> — <root cause, with file:line refs> — <proposed fix>
>
## Implementation rules
- Follow AGENTS.md patterns (no app/services, user-owned models, Pundit with user).
- Prefer eager-loading, counter caches, scopes, and fragment caching over new abstractions.
- No behavior changes — perf only. Existing tests must still pass.
- Run `bin/rails test` (or the project's test command) before the verification gate.
## Verification gate (REQUIRED — do not claim completion without this)
After implementing the fixes you MUST:
1. Stop any running bin/dev processes.
2. Truncate log/development.log and log/bullet.log.
3. Start bin/dev; wait until :3004 responds.
4. Log in as [email protected] / 123456 via MCP Playwright.
5. Execute the EXACT scenarios from runs/<ts>-before.json `scenario_source`,
   in order, verbatim. No substitutions, no skipping. Do NOT read the live
   performance-enhancer.md — it may have been edited since the before run.
6. Capture metrics using the schema defined in performance-enhancer.md
   (same shape as <ts>-before.json).
7. Write runs/<ts>-after.json.
8. Write runs/<ts>-comparison.md containing:
   - Header: date, scenario count, before-run <ts>, after-run <ts>.
   - Per-scenario side-by-side table:
     | metric | before | after | delta | delta % | regression? |
     Rows: total_ms, db_ms, query_count, view_ms, allocations_ms, playwright_ms.
   - Aggregate totals row across all scenarios.
   - Regression flag on any metric that got worse by >5%.
   - Per-hotspot status: fixed / improved / unchanged / regressed, with the numbers.
   - "Remaining issues" section listing hotspots not addressed, with justification.
   - Bullet warnings: list of warnings present before but absent after; list of any new ones.
9. Do not report completion until runs/<ts>-comparison.md exists and the numbers are in it.
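Outside the template itself, the delta and regression columns of the comparison table can be illustrated with a short sketch. `compare_metric` and `table_row` are hypothetical names, and the sketch assumes lower is better for every metric in the table, which holds for the six rows listed.

```python
from typing import Optional


def compare_metric(before: float, after: float, threshold_pct: float = 5.0) -> dict:
    """Compute delta, delta %, and the >5% regression flag for one metric."""
    delta = after - before
    delta_pct: Optional[float]
    if before == 0:
        delta_pct = None  # percentage is undefined against a zero baseline
        regression = after > 0
    else:
        delta_pct = delta * 100.0 / before
        regression = delta_pct > threshold_pct
    return {
        "before": before,
        "after": after,
        "delta": delta,
        "delta_pct": delta_pct,
        "regression": regression,
    }


def table_row(metric: str, before: float, after: float) -> str:
    """Render one markdown row in the comparison table's column order."""
    c = compare_metric(before, after)
    pct = "n/a" if c["delta_pct"] is None else f"{c['delta_pct']:+.1f}%"
    flag = "YES" if c["regression"] else "no"
    return f"| {metric} | {before:g} | {after:g} | {c['delta']:+g} | {pct} | {flag} |"
```

A metric that moves from 100 to 105 sits exactly on the threshold and is not flagged; 100 to 106 is. Whether the boundary should be inclusive is a choice the design leaves open.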
Structure of the committed performance-enhancer.md
Top-to-bottom:
- Title + one-paragraph purpose.
- How to invoke. "Open in Claude Code, edit the Scenarios section below, say 'run the performance enhancer'."
- Prerequisites (terse): MCP Playwright plugin active; `db:seed` run so [email protected] exists; port 3004 free.
- Credentials: [email protected] / 123456 (one line).
- Hotspot thresholds: editable constants (request ms, query count, db-fraction, etc.).
- `## Scenarios`: the user-edited section, with a worked example and grammar notes.
- `## Agent instructions`: do-not-edit. Phases 1–3 in imperative form with exact commands:
  - Phase 1: kill/poll/truncate/start commands.
  - Phase 2: log-parsing hints (regex for `Started`, `Completed`, `ActiveRecord:` lines), timestamp-window strategy, JSON write contract.
  - Phase 3: hotspot rules, findings.md structure, when to read code for root causes.
- JSON schema block for `before.json` / `after.json` (shared by both runs).
- Plan-prompt template: the full block from the previous section with `<PLACEHOLDER>` markers.
- Appendix: troubleshooting:
  - "port 3004 is held" → `lsof -ti :3004 | xargs kill -9`
  - "bullet.log is empty" → verify `BULLET_ENABLED` in config/environments/development.rb
  - "a scenario step fails mid-run" → record the failure with full error, continue with remaining steps, flag in findings.md; do not abort the run
Error handling
- Server fails to boot within 60s → abort; surface the last ~50 lines of development.log to the user.
- Login fails → abort; instruct user to check `db:seed` ran and report.
- A single scenario step fails → record the error in the step's record, continue with the next step. The findings report includes a "scenario failures" section.
- Log parsing misses a request → store `requests: []` for that step with a `parse_note` field; don't fabricate numbers.
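The per-step rules above amount to a loop that never aborts on a single failure. A minimal sketch, assuming a hypothetical `execute` callable that performs one step and returns whatever fields it managed to capture: the `error` key and the 1-based `index` are assumptions, while `parse_note` and the other keys come from the schema and rules above.

```python
from typing import Callable


def run_steps(steps: list[str], execute: Callable[[str], dict]) -> list[dict]:
    """Run every step in order, recording failures instead of aborting."""
    records = []
    for index, step in enumerate(steps, start=1):  # 1-based index is an assumption
        record = {
            "index": index,
            "step": step,
            "playwright_ms": None,
            "requests": [],
            "bullet_warnings": [],
        }
        try:
            record.update(execute(step))
        except Exception as exc:
            # Keep the failure with the step so findings.md can list it.
            record["error"] = f"{type(exc).__name__}: {exc}"
        if not record["requests"] and "error" not in record:
            # The step ran but no request was matched: say so, never invent numbers.
            record["parse_note"] = "no Started/Completed pair found in the step window"
        records.append(record)
    return records
```

Server-boot and login failures are deliberately not handled here, since those abort the whole run before any step executes.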
Testing strategy
Not unit-tested — this is a prompt artifact. Validation is end-to-end: run the agent against a known page with a known N+1, confirm:
- `before.json` captures the expected query count and Bullet warning.
- `findings.md` flags the N+1 with correct file:line.
- `plan.md` contains the verification gate block verbatim.
- Running the plan prompt (by hand) on a fixed branch produces a `comparison.md` that shows the N+1 resolved.
This smoke test is documented in the spec but not automated.
Open questions
None at design time. Any ambiguities surface during the writing-plans step.
Out of scope (explicitly)
- LCP / web vitals (server-side perf is the target; can be added later per page if needed).
- Running the agent in CI.
- Cross-run history / trendlines beyond a single before/after pair.
- Production APM integration.
- Automated fix implementation — that is the plan prompt's responsibility, by design.