Performance Enhancer Agent

A reusable prompt for diagnosing and fixing real-world performance issues in this Rails app. You supply an ordered list of UI scenarios; the agent restarts the dev server with clean logs, drives the scenarios via MCP Playwright, captures per-request server metrics (request time, query count, DB time, Bullet N+1 warnings) plus Playwright navigation time, analyzes hotspots in the code, and emits a self-contained fix-plan prompt you hand to a fresh Claude session. That plan prompt is required — by its own verification gate — to re-run the identical scenarios after implementation and produce a full before/after comparison report.

How to invoke

Edit the ## Scenarios section below with the pages and actions you want measured.
Open this file in Claude Code.
Say: "Run the performance enhancer against this scenario list."
Claude follows the instructions in ## Agent instructions verbatim.

You do not edit anything outside ## Scenarios (and ## Hotspot thresholds if you want to tune sensitivity). Everything else is the agent's operational contract.

Prerequisites

The MCP Playwright plugin is active in your Claude Code session (provides browser_navigate, browser_fill_form, browser_click, etc.).
You have run bin/rails db:seed so [email protected] exists.
Port 3004 is free (the agent will kill stray processes; see troubleshooting if stuck).
You are on a branch where uncommitted work is OK — the agent restarts the server but does not mutate code.

Credentials (hardcoded — dev only)

Email: [email protected]
Password: 123456
Login URL: http://localhost:3004/users/sign_in

These come from db/seeds.rb and exist only in development seed data. Never reuse them for a production user.

Hotspot thresholds (editable)

The agent flags a request as a hotspot if any of these are true:

total_ms > 500
query_count > 30
db_ms > 0.6 * total_ms (DB-bound)
any Bullet warning emitted during the request
two or more identical queries within one request (loose dup detection by SQL fingerprint)

Tune these by editing the numbers. The agent reads them from this section at run time.
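The rules above can be sketched as a small predicate. This is a minimal illustration, not the agent's actual implementation: `THRESHOLDS`, `sql_fingerprint`, and `hotspot_rules` are hypothetical names, and the fingerprinting (replace quoted strings and bare integers with `?`) is one assumed way to do "loose" duplicate detection.

```ruby
# Hypothetical names; the numbers mirror the defaults listed in this section.
THRESHOLDS = { total_ms: 500, query_count: 30, db_ratio: 0.6 }.freeze

# Collapse literals so queries that differ only in bound values share a fingerprint.
def sql_fingerprint(sql)
  sql.gsub(/'[^']*'/, "?").gsub(/\b\d+\b/, "?").gsub(/\s+/, " ").strip.downcase
end

# Returns the names of every rule that fired for one request.
# request: hash with :total_ms, :db_ms, :query_count
# bullet_warnings: strings harvested for the request
# sqls: raw SQL strings logged for the request
def hotspot_rules(request, bullet_warnings: [], sqls: [], thresholds: THRESHOLDS)
  fired = []
  fired << "total_ms" if request[:total_ms] > thresholds[:total_ms]
  fired << "query_count" if request[:query_count] > thresholds[:query_count]
  fired << "db_bound" if request[:db_ms] > thresholds[:db_ratio] * request[:total_ms]
  fired << "bullet" unless bullet_warnings.empty?
  fingerprints = sqls.map { |sql| sql_fingerprint(sql) }
  fired << "duplicate_queries" if fingerprints.uniq.length < fingerprints.length
  fired
end
```

A request is a hotspot when the returned array is non-empty. Keeping the rule names in the result makes it easy to report which rule(s) fired.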

Scenarios

This is the section you edit before each run. Each step below becomes one Playwright action executed by the agent, in order. The after-run (triggered by the generated plan) replays this list verbatim from a frozen copy — do not change ordering, do not re-phrase steps between the before and after runs, or the comparison will be invalid.

Grammar

One step per line, imperative.
Reference routes by path (e.g. /goals, /goals/new), not by human description.
Form submissions include the data as an inline hash: fill {field: "value", other_field: "value"}; submit.
Click targets use the visible button/link text in double quotes: click "Edit".
Chain actions on the same page with ; within a single line.
Use the resulting URL phrasing when an action produces a redirect: On the resulting /goals/:id page, click "Edit"; ...
Skip auth steps — the agent signs in first automatically.
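To make the grammar concrete, here is a rough sketch of how one scenario line could be broken into actions. The function names and the action hash shape are hypothetical; the agent interprets steps itself and is not required to use a parser like this. The sketch only handles simple `field: "value"` pairs.

```ruby
# Split on ";" but leave semicolons inside double-quoted values alone.
def split_actions(body)
  body.scan(/(?:"[^"]*"|[^;])+/).map(&:strip).reject(&:empty?)
end

# Turn `title: "x", other: "y"` into { "title" => "x", "other" => "y" }.
def parse_fields(inner)
  inner.scan(/(\w+):\s*"([^"]*)"/).to_h
end

# Parse one scenario line into an ordered list of action hashes (hypothetical shape).
def parse_step(line)
  body = line.sub(/\A\s*-\s*/, "").sub(/\AOn the resulting \S+ page,\s*/i, "")
  split_actions(body).map do |action|
    case action
    when /\AVisit (\S+)\z/i               then { action: :visit, path: $1 }
    when /\Afill \{(.*)\}\z/i             then { action: :fill, fields: parse_fields($1) }
    when /\Aclick "([^"]+)"\z/i           then { action: :click, text: $1 }
    when /\Achange (\w+) to "([^"]*)"\z/i then { action: :fill, fields: { $1 => $2 } }
    when /\Asubmit\z/i                    then { action: :submit }
    else { action: :unknown, raw: action }
    end
  end
end
```

For example, `parse_step('Visit /goals/new; fill {title: "Perf test goal"}; submit')` yields a visit, a fill, and a submit action in that order. Anything the grammar does not cover comes back as `:unknown` rather than being guessed at.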

Example (replace with your own)

- Visit /dashboard
- Visit /goals
- Visit /goals/new; fill {title: "Perf test goal", target_date: "2026-12-31"}; submit
- On the resulting /goals/:id page, click "Edit"; change title to "Perf test goal edited"; submit
- Visit /debts
- Visit /market_lists

Your scenarios

Visit /dashboard

Agent instructions

Do not edit anything below this line. The agent follows these steps in order. Each phase must complete successfully before the next begins (except where explicitly noted — e.g., a single failed scenario step does not abort Phase 2).

Before starting, the agent captures a timestamp TS in the format YYYY-MM-DD-HHMMSS. This single timestamp is reused for every artifact produced by this run: <TS>-before.json, <TS>-findings.md, <TS>-plan.md. Use date "+%Y-%m-%d-%H%M%S" to generate it and echo it for the record.
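For reference, the same timestamp format and artifact naming can be expressed in Ruby. This is only an illustration of the naming convention; `run_timestamp`, `artifact_paths`, and `RUNS_DIR` are hypothetical names, and the agent itself uses the `date` command above.

```ruby
# Relative to the app root; assumed to match the runs directory used by this agent.
RUNS_DIR = "docs/performance-enhancer/runs"

# YYYY-MM-DD-HHMMSS, equivalent to: date "+%Y-%m-%d-%H%M%S"
def run_timestamp(now = Time.now)
  now.strftime("%Y-%m-%d-%H%M%S")
end

# Map each artifact suffix to its full path for one run.
def artifact_paths(ts, dir = RUNS_DIR)
  %w[before.json findings.md plan.md].to_h do |suffix|
    [suffix, File.join(dir, "#{ts}-#{suffix}")]
  end
end
```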

Phase 1 — Setup (clean slate)

Kill any existing dev servers.

pkill -TERM -f "foreman start -f Procfile.dev" 2>/dev/null || true
pkill -TERM -f "rails server" 2>/dev/null || true
sleep 2
pkill -KILL -f "foreman start -f Procfile.dev" 2>/dev/null || true
pkill -KILL -f "rails server" 2>/dev/null || true
lsof -ti :3004 | xargs -r kill -9 2>/dev/null || true

Verify port is free:

lsof -ti :3004 && echo "STILL_HELD" || echo "FREE"

Must print FREE. If it prints STILL_HELD, see the troubleshooting appendix before continuing.

Truncate logs.

: > /code/life-management/log/development.log
: > /code/life-management/log/bullet.log

Ensure the runs directory exists.

mkdir -p /code/life-management/docs/performance-enhancer/runs

Start bin/dev in the background. Use the Bash tool's run_in_background option; do not block. Note: Procfile.dev already runs bin/rails logs:clear on boot, which is redundant with the log truncation step above but harmless.
```
bin/dev
```

Poll for readiness, then confirm stability. Retry up to 60 times with a 1s sleep for the initial success, then sleep 3s and re-poll to catch the case where foreman's cascade-SIGTERM tore Puma down right after it bound the port (e.g. because a sibling process crashed on boot).

for i in $(seq 1 60); do
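  # curl -w already prints 000 when the connection fails; "|| true" only masks the exit status.
  # (Appending a second "000" via echo would yield "000000" and falsely pass the check below.)
  code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3004/users/sign_in || true)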
  if [ "$code" != "000" ] && [ "$code" != "500" ] && [ "$code" != "502" ] && [ "$code" != "503" ]; then
    echo "READY after ${i}s (HTTP $code)"
    break
  fi
  sleep 1
done
# Stability recheck — if foreman killed everyone after a sibling crash,
# the port will be free again by now.
sleep 3
recheck=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3004/users/sign_in 2>/dev/null || true)
if [ "$recheck" != "200" ] && [ "$recheck" != "302" ]; then
  echo "UNSTABLE after 3s (HTTP $recheck)"
else
  echo "STABLE after 3s (HTTP $recheck)"
fi

If after 60s no success line appears, or the stability recheck reports UNSTABLE, abort and surface the last 50 lines of log/development.log to the user. Common cause of UNSTABLE: a sibling Procfile.dev process (worker.1, css.1) exited, and foreman's default behavior is to kill all processes when any one exits.

Log in via MCP Playwright.
- browser_navigate to http://localhost:3004/users/sign_in.
- browser_fill_form with user[email] = [email protected] and user[password] = 123456.
- Submit the form (click the submit button, or browser_press_key Enter on the password field).
- Wait for navigation to complete. Confirm success by taking a browser_snapshot and checking that the resulting URL is no longer /users/sign_in (it typically redirects to /dashboard or root).
- If login fails: abort, report the error, and recommend the user run bin/rails db:seed.

Phase 2 — "Before" run

For each step in the user's ## Scenarios section, in order:

Record step-start timestamp (ms precision). Use date +%s%3N on Linux.
Execute the step via the appropriate MCP Playwright tool(s):
- Visit /path → browser_navigate to http://localhost:3004/path.
- fill {field: "value", ...}; submit → browser_fill_form with the hash, then submit (click submit button or press Enter).
- click "Button Text" → browser_click on the element with matching visible text.
- change title to "..." → browser_fill_form on the title field with the new value.
Capture Playwright navigation time (time from action issue to load event). If the MCP tool does not report it directly, use the elapsed wall-clock from step-start to when the snapshot stabilizes.
Record step-end timestamp.
Harvest server-side metrics from log/development.log for all requests whose Rails timestamp falls in [step-start, step-end]. Parse each Completed block. Example line format:
```
Started GET "/goals" for 127.0.0.1 at 2026-04-22 14:30:12 -0300
Processing by GoalsController#index as HTML
  Goal Load (4.2ms)  SELECT ...
Completed 200 OK in 387ms (Views: 120.3ms | ActiveRecord: 245.1ms (42 queries, 12 cached) | Allocations: 45213)
```
Some Rails versions (8.1+) substitute GC: <ms> for Allocations: <n> in the trailing segment. Both shapes appear in the wild — accept either.

Extraction regex hints (Ruby-style, adapt as needed):
- Started (\w+) "([^"]+)" → method, path
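- Completed (\d+) .+? in (\d+)ms \((?:Views: ([\d.]+)ms \| )?ActiveRecord: ([\d.]+)ms \((\d+) queries → status, total_ms, view_ms, db_ms, query_count. The lazy .+? accepts multi-word status texts such as Not Found; the optional Views group covers responses that render no view (e.g. redirects), in which case view_ms is null.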
- Allocations: (\d+) → allocations (integer count). If absent, try GC: ([\d.]+)ms → gc_ms. Store whichever matched; set the other field to null. If neither matches, both are null.
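The hints above can be combined into one parser for the Completed line. Treat this as a sketch under assumptions: `COMPLETED_RE` and `parse_completed` are hypothetical names, the pattern tolerates a missing Views segment and a missing query count (both are assumptions about log shapes beyond the example shown), and timings are rounded to integers to match the example values in the JSON schema.

```ruby
# Tolerates multi-word status text, an absent "Views:" segment, and an absent query count.
COMPLETED_RE = /Completed (\d+) .+? in (\d+)ms \((?:Views: ([\d.]+)ms \| )?ActiveRecord: ([\d.]+)ms(?: \((\d+) quer(?:y|ies))?/

# Returns a metrics hash for one "Completed ..." line, or nil if it does not match.
def parse_completed(line)
  m = COMPLETED_RE.match(line)
  return nil unless m

  allocations = line[/Allocations: (\d+)/, 1]
  gc_ms = allocations ? nil : line[/GC: ([\d.]+)ms/, 1]
  {
    status: m[1].to_i,
    total_ms: m[2].to_i,
    view_ms: m[3]&.to_f&.round,   # nil when no view was rendered
    db_ms: m[4].to_f.round,
    query_count: m[5]&.to_i,      # nil when the log line carries no count
    allocations: allocations&.to_i,
    gc_ms: gc_ms&.to_f
  }
end
```

Returning nil for a non-matching line lets the caller record a parse_note instead of inventing numbers.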
Harvest Bullet warnings from log/bullet.log for lines whose timestamp falls in the same window (Bullet log timestamps are RFC-3339).
Add the record to the in-memory object destined for runs/<TS>-before.json. The file is a single JSON object; build it in memory as you go and write it once, at the end of Phase 2.
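Matching log entries to a step window needs some care, because the Started line in the example carries a one-second timestamp while the step timestamps are in milliseconds. A possible approach, sketched below with hypothetical names (`STARTED_RE`, `requests_in_window`), is to round the window start down to the whole second before comparing.

```ruby
require "time"

STARTED_RE = /Started (\w+) "([^"]+)" for \S+ at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})/

# Select "Started" entries whose timestamp falls inside the step window.
# start_ms / end_ms are epoch milliseconds (e.g. from `date +%s%3N`).
def requests_in_window(log_lines, start_ms, end_ms)
  floor_ms = (start_ms / 1000) * 1000 # log timestamps only have 1s resolution
  log_lines.filter_map do |line|
    m = STARTED_RE.match(line)
    next unless m

    at_ms = Time.strptime(m[3], "%Y-%m-%d %H:%M:%S %z").to_i * 1000
    { method: m[1], path: m[2], at_ms: at_ms } if at_ms.between?(floor_ms, end_ms)
  end
end
```

Rounding down can attribute a request to the wrong step when two steps begin within the same second, so a short pause between steps may be worth considering.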

Shared JSON schema for `before.json` and `after.json`

Both runs must produce this exact shape. The comparison depends on it.

{
  "timestamp": "2026-04-22-143000",
  "run_kind": "before",
  "scenario_source": [
    "Visit /dashboard",
    "Visit /goals",
    "Visit /goals/new; fill {title: \"Perf test goal\", target_date: \"2026-12-31\"}; submit"
  ],
  "scenarios": [
    {
      "index": 2,
      "step": "Visit /goals",
      "playwright_ms": 412,
      "requests": [
        {
          "method": "GET",
          "path": "/goals",
          "status": 200,
          "total_ms": 387,
          "view_ms": 120,
          "db_ms": 245,
          "query_count": 42,
          "allocations": 45213,
          "gc_ms": null
        }
      ],
      "bullet_warnings": [
        "USE eager loading detected: Goal => [:category]"
      ],
      "parse_note": null
    }
  ]
}

scenario_source is a verbatim copy of the user's ## Scenarios list (each step a string, in order). This is the authoritative contract for the after run. If performance-enhancer.md is edited between runs, the plan prompt reads scenario_source from this JSON, not the live .md.
run_kind is "before" in this file, "after" in the post-fix file.
If a request cannot be parsed for a step, use requests: [] and set parse_note to a human string describing what went wrong. Do not invent numbers.
If a scenario step itself fails (e.g., Playwright can't find the button), include the step with requests: [], bullet_warnings: [], and parse_note: "scenario failure: <error>". Continue to the next step — do not abort the run.

Write the final JSON to /code/life-management/docs/performance-enhancer/runs/<TS>-before.json with 2-space indentation.
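A minimal sketch of assembling and writing the run object follows. The helper names are hypothetical. Setting playwright_ms to null for a failed step is an assumption, since the schema notes do not say what to record there. `JSON.pretty_generate` indents with two spaces by default.

```ruby
require "json"

# Record for a step that could not be executed; no numbers are invented.
def failed_step(index, step, error)
  {
    "index" => index,
    "step" => step,
    "playwright_ms" => nil, # assumption: unknown timing is stored as null
    "requests" => [],
    "bullet_warnings" => [],
    "parse_note" => "scenario failure: #{error}"
  }
end

# Top-level object shared by the before and after runs.
def build_run(timestamp, run_kind, scenario_source, scenarios)
  {
    "timestamp" => timestamp,
    "run_kind" => run_kind,
    "scenario_source" => scenario_source,
    "scenarios" => scenarios
  }
end

# Write the whole object once, with 2-space indentation.
def write_run(path, run)
  File.write(path, JSON.pretty_generate(run) + "\n")
end
```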

Phase 3 — Analysis & artifact generation

Build the per-scenario summary table. For each scenario in before.json, produce a row with: index, step, total playwright ms, sum of request total_ms, sum of db_ms, sum of query_count, count of Bullet warnings.
Apply the hotspot rules from ## Hotspot thresholds to every request in before.json. A request is a hotspot if it matches any rule. Collect all hotspots with: scenario index, request method + path, which rule(s) fired, the offending numbers.
Research root causes. For each hotspot, read the responsible Rails code:
- From the path, derive the controller + action via bin/rails routes | grep <path>.
- Open the controller file, the action, and any instance-variable models it loads.
- Open the matching view/partials under app/views/....
- Look for: missing includes / preload, .count on relations in views (use .size with preloaded associations or counter caches), unscoped loads (Model.all where a current_user scope is expected), N+1 on associated scopes, uncached expensive computations, repeated identical queries.
- Note exact file:line references for each finding.
Optionally consult context7 (mcp__plugin_context7_context7__query-docs) for framework-specific guidance on any non-obvious fix — e.g., Rails counter caches, load_async, Bullet configuration, fragment caching API. Only use it when the canonical fix is unclear from the code.
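The per-scenario summary row described at the start of this phase can be derived from before.json roughly as follows. `summary_rows` and `markdown_row` are hypothetical helpers, and treating null metrics as absent (rather than zero) is a design choice of this sketch.

```ruby
# Sum one metric across a scenario's requests, ignoring nulls.
def sum_metric(requests, key)
  requests.map { |r| r[key] }.compact.sum
end

# One summary hash per scenario, in the column order of the findings table.
def summary_rows(run)
  run.fetch("scenarios").map do |s|
    reqs = s.fetch("requests")
    {
      index: s["index"],
      step: s["step"],
      playwright_ms: s["playwright_ms"],
      request_ms: sum_metric(reqs, "total_ms"),
      db_ms: sum_metric(reqs, "db_ms"),
      queries: sum_metric(reqs, "query_count"),
      bullet_warnings: s.fetch("bullet_warnings").length
    }
  end
end

# Render one row of the Markdown summary table.
def markdown_row(row)
  "| #{row.values.join(' | ')} |"
end
```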

Write runs/<TS>-findings.md with this structure:

# Performance findings — <TS>

## Summary table

| # | Step | Playwright ms | Request ms (sum) | DB ms (sum) | Queries (sum) | Bullet warnings |
|---|------|--------------:|-----------------:|------------:|--------------:|----------------:|
| 1 | ... | ... | ... | ... | ... | ... |

## Hotspots (ordered by impact)

### Hotspot 1: GET /goals — 42 queries, N+1 on Goal => [:category]
**Rule(s) fired:** query_count > 30; db_ms > 0.6 * total_ms; Bullet warning
**Numbers:** total_ms=387, db_ms=245, query_count=42
**Root cause:** `app/controllers/goals_controller.rb:12` loads `Goal.where(user: current_user)`; the index view at `app/views/goals/index.html.erb:23` iterates and calls `goal.category.name` per row.
**Proposed fix:** Add `.includes(:category)` in the controller query.

### Hotspot 2: ...

## Scenario failures (if any)
<list any steps with parse_note starting "scenario failure"; include the error string>

Write runs/<TS>-plan.md using the template in the next section. Fill in every <PLACEHOLDER:...> with concrete content from the findings; leave the rest of the template byte-for-byte identical.
Report back to the user in chat: the count of scenarios executed, the count of hotspots flagged, and the three output paths (before.json, findings.md, plan.md). Do not summarize the findings inline — the files are the deliverable.

Plan-prompt template

When emitting runs/<TS>-plan.md, the agent writes exactly the content below. Placeholders are marked <PLACEHOLDER:...> — the agent substitutes each one with concrete content derived from findings.md. Every other byte (including the verification gate) is copied verbatim.

# Performance fix plan — <PLACEHOLDER:TS>

> This is a self-contained Claude Code prompt. Open it in a fresh session and say: **"Execute this performance fix plan."** You do not need any other context.

## Context

<PLACEHOLDER: one-paragraph summary of the app area(s) touched by the hotspots — controllers, models, feature name. Written by the agent based on findings.>

## Before-run artifacts (do not modify)

- **Scenarios (contract):** `docs/performance-enhancer/runs/<PLACEHOLDER:TS>-before.json` → `scenario_source` field (frozen copy; authoritative — do NOT read the live `performance-enhancer.md`, it may have been edited).
- **Raw metrics:** `docs/performance-enhancer/runs/<PLACEHOLDER:TS>-before.json`
- **Findings:** `docs/performance-enhancer/runs/<PLACEHOLDER:TS>-findings.md`

## Hotspots to fix (ordered by impact)

<PLACEHOLDER: ordered list. Each entry has this exact shape:

N. **<METHOD> <path>** — <metric that flagged it: e.g., 42 queries, N+1>
   - **Root cause:** <plain English, with file:line refs>
   - **Proposed fix:** <concrete code-level change>

Repeat for each hotspot in descending impact order.>

## Implementation rules

- Follow `AGENTS.md` patterns: no `app/services/`, user-owned models (`belongs_to :user`), Pundit policies receive `(user, record)`.
- Prefer eager-loading (`includes` / `preload`), counter caches, scopes, and fragment caching over new abstractions.
- **No behavior changes** — performance only. Existing tests must still pass.
- Run the project's test command (`bin/rails test`) before proceeding to the verification gate.
- Commit each hotspot fix as its own commit with a message naming the route and the fix.

## Verification gate (REQUIRED — do not claim completion without this)

After implementing every fix, you MUST complete this gate. Skipping any step means the plan is not done, regardless of how confident you feel about the fixes.

1. **Stop any running `bin/dev` process.**
   ```bash
   pkill -TERM -f "foreman start -f Procfile.dev" 2>/dev/null || true
   pkill -TERM -f "rails server" 2>/dev/null || true
   sleep 2
   lsof -ti :3004 | xargs -r kill -9 2>/dev/null || true
   ```

2. **Truncate logs.**
   ```bash
   : > /code/life-management/log/development.log
   : > /code/life-management/log/bullet.log
   ```

3. **Start `bin/dev`; poll `http://localhost:3004/users/sign_in` until it returns a non-5xx response** (60s timeout).

4. **Log in as `[email protected] / 123456`** via MCP Playwright (`browser_navigate` → `/users/sign_in`, `browser_fill_form` with `user[email]` and `user[password]`, submit).

5. **Execute the EXACT scenarios from `runs/<PLACEHOLDER:TS>-before.json` → `scenario_source`**, in order, verbatim. No substitutions, no skipping. Do NOT read the live `performance-enhancer.md` — it may have been edited since the before run.

6. **Capture metrics using the schema in `performance-enhancer.md`** (`## Agent instructions` → Phase 2 → shared JSON schema). Build a JSON object with `run_kind: "after"` and the same `scenario_source` array copied from the before file.

7. **Write `runs/<PLACEHOLDER:TS>-after.json`** with 2-space indentation.

8. **Write `runs/<PLACEHOLDER:TS>-comparison.md`** with this exact structure:

   ```markdown
   # Performance comparison — <TS>

   - **Before run:** <TS> (kind: before)
   - **After run:** <TS> (kind: after)
   - **Scenarios:** <count>

   ## Per-scenario comparison

   ### Scenario 1 — <step text>

   | metric | before | after | delta | delta % | regression? |
   |---|---:|---:|---:|---:|:---:|
   | total_ms | ... | ... | ... | ... | ... |
   | db_ms | ... | ... | ... | ... | ... |
   | query_count | ... | ... | ... | ... | ... |
   | view_ms | ... | ... | ... | ... | ... |
   | allocations | ... | ... | ... | ... | ... |
   | gc_ms | ... | ... | ... | ... | ... |
   | playwright_ms | ... | ... | ... | ... | ... |

   Skip rows where both `before` and `after` are `null` (metric not reported by this Rails version).

   (repeat per scenario; for multi-request scenarios, sum across requests)

   ## Aggregate totals

   | metric | before | after | delta | delta % |
   |---|---:|---:|---:|---:|

   ## Regressions

   Flag any metric that got worse by **>5%**. List: scenario index, metric, before → after, explanation if known.

   ## Per-hotspot status

   For each hotspot from the original `<TS>-findings.md`:
   - **Hotspot N: <METHOD> <path>** — **fixed / improved / unchanged / regressed** — <before numbers> → <after numbers>

   ## Bullet warnings diff

   - **Resolved:** warnings present in before, absent in after (one per bullet)
   - **New:** warnings absent in before, present in after (one per bullet — these are regressions)
   - **Persisting:** warnings present in both

   ## Remaining issues

   Any hotspot not fixed (with one-sentence justification), plus any new issues surfaced by the after run.
   ```

9. **Do NOT report completion until `runs/<PLACEHOLDER:TS>-comparison.md` exists on disk and every table in it is populated with real numbers.** An empty table or a "TODO" inside the comparison is a failure — the plan is not done.

Troubleshooting

Port 3004 is held

If lsof -ti :3004 keeps returning a PID after the Phase 1 kill commands, a non-foreman process is holding it (Docker, another Rails app, a zombie Puma). Force-kill:

lsof -ti :3004 | xargs -r kill -9

If that still doesn't free it, run sudo lsof -i :3004 to identify the holder and stop it manually. Do not retry the agent until lsof -ti :3004 returns empty.

`log/bullet.log` is empty after the run

Bullet only writes to its log when it detects issues and is enabled for the environment. Confirm:

grep -n "Bullet" /code/life-management/config/environments/development.rb

You should see Bullet.enable = true and Bullet.bullet_logger = true. If those are absent or commented out, Bullet is off — re-enable, restart bin/dev, re-run the agent.

An empty bullet.log with Bullet enabled just means no N+1 / unused-eager-load issues were detected in the scenarios you ran — this is not an error.

A scenario step fails mid-run

The agent does NOT abort on a single step failure. It records the step with requests: [], bullet_warnings: [], and parse_note: "scenario failure: <error>", then moves on. findings.md surfaces all failures under a Scenario failures section.

If you see repeated failures, the most common causes are:

Route doesn't exist (typo) — fix the step in ## Scenarios.
Button text changed — fix the step.
Form field name changed — inspect the page with browser_snapshot manually and fix the fill hash.
Step depends on prior-step state that didn't happen (e.g., an Edit click after a failed create) — reorder or add the missing prerequisite.

The after run produces very different Playwright numbers from the before run

Playwright nav time has some run-to-run variance (usually <10%). The server-side numbers (total_ms, db_ms, query_count) are the more trustworthy signal because they come from Rails's own log lines: query_count is deterministic for a given code path + data shape, while total_ms and db_ms are wall-clock timings that can still shift with machine load. Focus the comparison on those. The >5% regression flag in comparison.md applies per-metric; a noisy Playwright value alone should not block acceptance if server metrics improved.
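As an illustration of the per-metric rule, the delta columns and the regression flag can be computed like this. `compare_metric` and `REGRESSION_PCT` are hypothetical names, and the handling of a zero or missing baseline is an assumption, since the comparison template does not specify it.

```ruby
REGRESSION_PCT = 5.0

# Compare one metric between the before and after runs.
# Returns nil when the metric is absent from both (the row is skipped).
def compare_metric(before, after)
  return nil if before.nil? && after.nil?
  if before.nil? || after.nil?
    return { before: before, after: after, delta: nil, delta_pct: nil, regression: false }
  end

  delta = after - before
  # Percentage change is undefined for a zero baseline (assumption: report it as nil).
  delta_pct = before.zero? ? nil : (delta * 100.0 / before).round(1)
  regression = delta_pct ? delta_pct > REGRESSION_PCT : delta.positive?
  { before: before, after: after, delta: delta, delta_pct: delta_pct, regression: regression }
end
```

With this rule, going from 100 to 106 is flagged while going from 100 to 105 is not, because the threshold is strictly greater than 5%.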

"I want to re-run just the after phase" after already running a plan

Don't. The timestamp ties before to after. If you want a fresh measurement, run the agent again (new TS) — it's fast. If you just want to sanity-check a single page, run it manually with curl -w or the browser dev tools; don't reuse a stale plan.