Agent Flight Recorder

Think of it as: Chrome DevTools for AI agents

Debug, replay, and audit every AI agent run.

Wraps your agent command and records the full timeline: prompt, model, tool calls, shell commands, file edits, errors, retries, final diff, and cost.

stable · MIT · local-only · no telemetry

Source: github.com/AgentOpsSec/agent-flight-recorder

Why this exists

Agents make changes quickly, but the path they took is hard to inspect after the fact.

The Flight Recorder wraps your agent command and records the full timeline — prompt, model, every tool call, every shell command, errors, retries, the final diff, and an estimated cost.

Inspect a run, replay context, export it as HTML, or diff two runs against each other.

What prompt started this run?
Which agent and model were used?
What files did the agent read or edit?
What commands did it run? Did tests fail before they passed?
Did the agent retry or loop?
How long did the run take and what did it cost?
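
Several of these questions can be answered mechanically from a recorded run. The following is a minimal Python sketch, assuming events are dictionaries carrying the `type`, `command`, and `exitCode` fields shown in the sample run log on this page. The helper names `failed_before_passing` and `repeated_commands` are hypothetical and are not part of the tool.

```python
from collections import Counter


def failed_before_passing(events, command):
    """Return True if `command` exited non-zero and later exited zero."""
    seen_failure = False
    for event in events:
        if event.get("type") != "shell.exec" or event.get("command") != command:
            continue
        code = event.get("exitCode")
        if code is None:
            continue  # exit code not recorded; neither a pass nor a failure
        if code != 0:
            seen_failure = True
        elif seen_failure:
            return True
    return False


def repeated_commands(events, threshold=2):
    """Return commands run at least `threshold` times (a rough retry/loop signal)."""
    counts = Counter(
        e["command"]
        for e in events
        if e.get("type") == "shell.exec" and "command" in e
    )
    return {cmd: n for cmd, n in counts.items() if n >= threshold}


# Hypothetical events modeled on the sample run log; not real recorder output.
events = [
    {"type": "shell.exec", "command": "npm test", "exitCode": 1},
    {"type": "file.change", "path": "src/lib/parser.ts", "changeType": "modified"},
    {"type": "shell.exec", "command": "npm install", "exitCode": 0},
    {"type": "shell.exec", "command": "npm test", "exitCode": 0},
]

print(failed_before_passing(events, "npm test"))
print(repeated_commands(events))
```

A repeat count is only a heuristic: running `npm test` twice is normal for a fix-and-verify cycle, so a higher `threshold` may be a better loop signal in practice.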

Quickstart

One command. No account. No telemetry.

    # install (binary is `agent-flight`)
    npm install -g @agentopssec/agent-flight-recorder

    # wrap any agent command
    agent-flight run -- codex "fix the failing tests"
    agent-flight run -- gemini "refactor this component"

    # inspect later
    agent-flight inspect latest
    agent-flight export run_001 --html

Run timeline

A real run from the test suite — fixing failing tests in a Next.js app.

01  0ms     Prompt received
02  12.8s   Model call · 12.8s · 18,400 tokens
03  13.0s   filesystem.read package.json
04  14.2s   shell.exec npm test
05  27.1s   TypeScript compile failure
06  38.4s   src/lib/parser.ts · modified
07  41.2s   package.json · modified
08  52.0s   shell.exec npm install
09  1m 14s  shell.exec npm test
10  1m 32s  Tests passed · 6 files changed
11  4m 12s  Estimated cost · $0.42
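
To see where the time in a run went, the timestamps can be converted to milliseconds and differenced. This is a rough Python sketch, assuming each timestamp is the elapsed time since the start of the run and uses only the `ms`, `s`, and `m` units that appear in the timeline. `parse_elapsed` and `step_gaps` are hypothetical helpers, not functions provided by the recorder.

```python
import re

# Milliseconds per unit, limited to the units seen in the timeline above.
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000}


def parse_elapsed(text):
    """Convert a label such as '0ms', '12.8s', or '1m 14s' to milliseconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)\s*(ms|s|m)", text)
    if not parts:
        raise ValueError(f"unrecognized elapsed time: {text!r}")
    return round(sum(float(value) * _UNIT_MS[unit] for value, unit in parts))


def step_gaps(steps):
    """Pair each step label with the milliseconds since the previous step."""
    gaps = []
    previous = 0
    for label, elapsed in steps:
        now = parse_elapsed(elapsed)
        gaps.append((label, now - previous))
        previous = now
    return gaps


# First five steps of the timeline above.
timeline = [
    ("Prompt received", "0ms"),
    ("Model call", "12.8s"),
    ("filesystem.read package.json", "13.0s"),
    ("shell.exec npm test", "14.2s"),
    ("TypeScript compile failure", "27.1s"),
]

for label, gap_ms in step_gaps(timeline):
    print(f"{gap_ms:>7} ms  {label}")
```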

Run log shape

Stored locally as JSON. Used by Agent Review and Agent Cost Lens.

runs/run_001.json

    { "runId": "run_001",
      "agent": "codex", "prompt": "fix the failing tests",
      "events": [
        { "type": "shell.exec", "command": "npm test", "exitCode": 1, "durationMs": 12944 },
        { "type": "file.change", "path": "src/lib/parser.ts", "changeType": "modified" }
      ],
      "gitDiffSummary": { "filesChanged": 3, "insertions": 44, "deletions": 18 },
      "estimatedCost": 0.18 }
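
Because the log is plain JSON, it can be read with any standard parser. The sketch below, in Python, summarizes a run using only the field names visible in the sample above. Real logs may carry more fields or event types, and `RunSummary` and `summarize_run` are hypothetical names rather than part of the tool.

```python
import json
from dataclasses import dataclass


@dataclass
class RunSummary:
    run_id: str
    agent: str
    failed_commands: list
    changed_paths: list
    files_changed: int
    estimated_cost: float


def summarize_run(raw):
    """Build a RunSummary from the JSON text of one run log."""
    log = json.loads(raw)
    events = log.get("events", [])
    # A shell command counts as failed only if a non-zero exit code was recorded.
    failed = [
        e.get("command", "")
        for e in events
        if e.get("type") == "shell.exec" and e.get("exitCode") not in (0, None)
    ]
    changed = [e.get("path", "") for e in events if e.get("type") == "file.change"]
    return RunSummary(
        run_id=log.get("runId", ""),
        agent=log.get("agent", ""),
        failed_commands=failed,
        changed_paths=changed,
        files_changed=log.get("gitDiffSummary", {}).get("filesChanged", 0),
        estimated_cost=log.get("estimatedCost", 0.0),
    )


# The sample log from this page, inlined so the sketch runs without a file.
SAMPLE = """
{ "runId": "run_001",
  "agent": "codex", "prompt": "fix the failing tests",
  "events": [
    { "type": "shell.exec", "command": "npm test", "exitCode": 1, "durationMs": 12944 },
    { "type": "file.change", "path": "src/lib/parser.ts", "changeType": "modified" }
  ],
  "gitDiffSummary": { "filesChanged": 3, "insertions": 44, "deletions": 18 },
  "estimatedCost": 0.18 }
"""

summary = summarize_run(SAMPLE)
print(summary)
```

To read a stored run instead of the inline sample, the text could be loaded with `pathlib.Path("runs/run_001.json").read_text()`, assuming logs live at the path shown in the caption above.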

CLI reference

run -- <agent> <prompt>	Wrap an agent command and record the full timeline.
inspect [latest|run_id]	Show the timeline for a run.
replay <run_id>	Walk through a run step by step.
list	List previous runs.
export <run_id> --html	Export a portable HTML report.
diff <run_id>	Show the final code diff for a run.
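
For scripted use, the commands above can be assembled as argument lists and passed to a subprocess. This is a small Python sketch that uses only the `run` and `export` forms listed in the reference. The helper names are hypothetical, and the example defaults to a dry run, so it prints the command without requiring `agent-flight` to be installed.

```python
import shlex
import subprocess

BINARY = "agent-flight"  # binary name as stated in the install comment


def run_command(agent, prompt):
    """Argument list for: agent-flight run -- <agent> <prompt>"""
    return [BINARY, "run", "--", agent, prompt]


def export_html_command(run_id):
    """Argument list for: agent-flight export <run_id> --html"""
    return [BINARY, "export", run_id, "--html"]


def execute(argv, dry_run=True):
    """Print the command; invoke it only when dry_run is False."""
    print(shlex.join(argv))
    if dry_run:
        return None
    return subprocess.run(argv, check=False).returncode


execute(run_command("codex", "fix the failing tests"))
execute(export_html_command("run_001"))
```

Passing an argument list rather than a single shell string keeps a prompt that contains spaces or quotes as one argument, with no manual escaping.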