~projectsagent-flight-recorder
observe

Agent Flight Recorder

Think of it as: Chrome DevTools for AI agents

Debug, replay, and audit every AI agent run.

Wraps your agent command and records the full timeline: prompt, model, tool calls, shell commands, file edits, errors, retries, final diff, and cost.

stableMITlocal-onlyno telemetry
quickstart · agent-flight-recorder
1# install (binary is `agent-flight`)
2npm install -g @agentopssec/agent-flight-recorder
3
4# wrap any agent command
5agent-flight run -- codex "fix the failing tests"
6agent-flight run -- gemini "refactor this component"
7
8# inspect later
9agent-flight inspect latest
10agent-flight export run_001 --html
01

Why this exists

Agents make changes quickly, but the path they took is hard to inspect after the fact.

The Flight Recorder wraps your agent command and records the full timeline — prompt, model, every tool call, every shell command, errors, retries, the final diff, and an estimated cost.

Inspect a run, replay context, export it as HTML, or diff two runs against each other.

  • What prompt started this run?
  • Which agent and model were used?
  • What files did the agent read or edit?
  • What commands did it run? Did tests fail before they passed?
  • Did the agent retry or loop?
  • How long did the run take and what did it cost?
02

Quickstart

One command. No account. No telemetry.

quickstart · agent-flight-recorder
1# install (binary is `agent-flight`)
2npm install -g @agentopssec/agent-flight-recorder
3
4# wrap any agent command
5agent-flight run -- codex "fix the failing tests"
6agent-flight run -- gemini "refactor this component"
7
8# inspect later
9agent-flight inspect latest
10agent-flight export run_001 --html
03

Run timeline

A real run from the test suite — fixing failing tests in a Next.js app.

01Prompt received
0ms
02Model call · 12.8s · 18,400 tokens
12.8s
03filesystem.read package.json
13.0s
04shell.exec npm test
14.2s
05TypeScript compile failure
27.1s
06src/lib/parser.ts · modified
38.4s
07package.json · modified
41.2s
08shell.exec npm install
52.0s
09shell.exec npm test
1m 14s
10Tests passed · 6 files changed
1m 32s
11Estimated cost · $0.42
4m 12s
04

Run log shape

Stored locally as JSON. Used by Agent Review and Agent Cost Lens.

runs/run_001.json
1{ "runId": "run_001",
2 "agent": "codex", "prompt": "fix the failing tests",
3 "events": [
4 { "type": "shell.exec", "command": "npm test", "exitCode": 1, "durationMs": 12944 },
5 { "type": "file.change", "path": "src/lib/parser.ts", "changeType": "modified" }
6 ],
7 "gitDiffSummary": { "filesChanged": 3, "insertions": 44, "deletions": 18 },
8 "estimatedCost": 0.18 }
05

CLI reference

run -- <agent> <prompt>Wrap an agent command and record the full timeline.
inspect [latest|run_id]Show the timeline for a run.
replay <run_id>Walk through a run step by step.
listList previous runs.
export <run_id> --htmlExport a portable HTML report.
diff <run_id>Show the final code diff for a run.