When an AI agent goes off-script, can you reconstruct what happened?
VIBES captures every action your agents take. VERIFY seals it cryptographically. PRISM scores risk before merge. But once an incident is suspected — a prompt injection, tool abuse, leaked credential, or compromised model — you need to detect, preserve, reconstruct, and report. TRACE is the open IR & forensics extension that turns VIBES audit data into court‑ready evidence and STIX‑compatible incident reports.
Built on VIBES audit data · Sealed via VERIFY · Triggered by PRISM · Standalone & provider‑agnostic
For a plain-language overview, see TRACE for Users →
Agentic systems — Claude Code, Codex, Gemini CLI, Cursor's agent mode, autonomous MCP-driven loops — now write production code, call tools, spawn sub-agents, and act on behalf of users at scale. The blast radius of a single compromised session has grown accordingly: a prompt-injected agent can read credentials, exfiltrate source, push backdoored commits, and delegate the same compromise to a dozen sub-agents in seconds.
Traditional security tooling wasn't built for this. SIEMs see network flows but not delegation graphs. EDR sees process trees but not chain-of-thought drift. Code review sees diffs but not the prompt that produced them. When an incident happens in agent space, responders end up reconstructing events from screen recordings and shell history.
The data already exists. Every agent governed by VIBES is generating annotations, decision records, delegation traces, and tool-invocation logs. What's missing is a standard for using that data when something goes wrong — turning it into IoCs, evidence bundles, replayable timelines, and portable incident reports.
TRACE is the missing layer between continuous audit and post‑incident response.
TRACE (Threat Response & Agentic Compromise Examination) is the Incident Response and forensics extension to the VIBES standard. It defines:
IoC vocabulary: standardized indicators of agent compromise, queryable against VIBES audit data
Evidence bundle format: tamper-evident packages with cryptographic chain of custody
Reconstruction primitives: timeline, delegation graph, and blast-radius queries over multi-agent sessions
TRACE-IR report schema: STIX-compatible JSON for downstream SIEM, ticketing, and disclosure
Containment playbooks: standardized remediation actions keyed to IoC class
Maturity tiers: reactive, proactive, and autonomous response postures
Think of it as a forensics kit for autonomous agents. VIBES is the surveillance camera always recording. VERIFY is the tamper-evident seal on the tape. PRISM is the alarm threshold. TRACE is what the investigator does when the alarm fires: freeze the scene, dust for prints, reconstruct the timeline, and write the report that holds up in court.
TRACE is standalone and provider-agnostic. It does not require cooperation from any AI provider, although tool-provider cosigning strengthens the chain of custody. It targets agentic systems specifically — Claude Code, Codex, Gemini CLI, Cursor agents, MCP servers, and autonomous loops — but the data model is generic enough for any tool emitting VIBES annotations.
TRACE defines a seven-step lifecycle from continuous capture through portable report. Each step is automatable: tier-1 (reactive) deployments run the steps on demand, while tier-3 (autonomous) deployments run them continuously.
1. VIBES records annotations, decisions, delegation traces, and tool calls into .ai-audit/ as agents work.
2. vibetrace scan evaluates the IoC vocabulary against the audit log, flagging matches with severity and false-positive guidance.
3. An IoC match (or a human responder) opens an incident with a unique incident_id and freezes the affected sessions.
4. The audit snapshot is bundled, hashed, and sealed in a VERIFY DSSE envelope, establishing chain of custody from this moment forward.
5. Replay the session timeline, walk the delegation graph, and compute blast radius (files touched, creds reachable, external calls).
6. Quarantine annotations, generate revert ranges, rotate exposed credentials. PRISM auto-gates quarantined data from future merges.
7. Emit a TRACE-IR JSON document (STIX 2.1 compatible) for SIEM ingestion, regulator disclosure, or feedback into EVOLVE.
Steps 1–2 run continuously. Steps 3–7 fire when an incident is declared — either by an automated IoC match crossing a threshold, by a PRISM Critical-band event, or by a human responder using vibetrace incident open.
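The three declaration paths described above can be sketched as a small decision function. This is only an illustration: the `Signal` type, the function names, and the `IOC_CONFIDENCE_THRESHOLD` value are hypothetical, and the 0.8 cut-off simply mirrors the PRISM ≥ 0.8 figure quoted elsewhere on this page.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

# Assumed policy values. The source only says "crossing a threshold" and
# "PRISM Critical-band"; neither number is defined by this sketch's authority.
IOC_CONFIDENCE_THRESHOLD = 0.9   # hypothetical
PRISM_CRITICAL_BAND = 0.8        # mirrors the PRISM >= 0.8 figure on this page


@dataclass
class Signal:
    ioc_confidence: float = 0.0          # highest confidence among IoC matches
    prism_score: Optional[float] = None  # None when PRISM is not deployed
    human_declared: bool = False         # responder opened the incident by hand


def declaration_reason(sig: Signal) -> Optional[str]:
    """Return why an incident should be declared, or None to keep scanning."""
    if sig.human_declared:
        return "human_responder"
    if sig.prism_score is not None and sig.prism_score >= PRISM_CRITICAL_BAND:
        return "prism_critical"
    if sig.ioc_confidence >= IOC_CONFIDENCE_THRESHOLD:
        return "ioc_threshold"
    return None


def open_incident(sig: Signal) -> Optional[dict]:
    """Open an incident record (steps 3 onward) when any trigger fires."""
    reason = declaration_reason(sig)
    if reason is None:
        return None
    return {
        "incident_id": f"inc-{uuid.uuid4()}",  # hypothetical ID format
        "declared_by": reason,
        "sessions_frozen": True,
    }
```

A human declaration is checked first so that a responder's judgment is recorded as the cause even when automated signals are also present.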
TRACE sits between continuous agent audit and the existing security ecosystem. It consumes VIBES data, produces sealed evidence and structured reports, and feeds tools your team already operates.
Agents: Claude Code, Codex, Gemini, Cursor, MCP servers
VIBES audit data: .ai-audit/ annotations, decisions, edges, delegation
TRACE: IoC scan, incident declare, evidence freeze, replay
TRACE output: STIX 2.1 compatible report + sealed evidence bundle
Downstream consumers: SIEM, ticketing, regulators, insurers, EVOLVE feedback
The format is intentionally simple: structured JSON aligned with STIX 2.1 object types so existing security tooling can ingest TRACE incidents without bespoke parsers. No private wire protocol, no provider-specific endpoints, no required cloud service.
The TRACE specification publishes a versioned catalog of indicators. Each IoC has a stable ID (TRACE-IOC-{class}-{nnn}), a detection rule expressed as a query over VIBES JSONL, severity guidance, false-positive notes, and an optional MITRE ATLAS mapping. The catalog is open; community contributions extend it.
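To make the shape of a catalog entry concrete, here is a minimal sketch in Python. Only the `TRACE-IOC-{class}-{nnn}` ID template and the list of entry fields come from the text above; the `IocEntry` type, the assumption that `{class}` is an uppercase token and `{nnn}` is three digits, and the example ID are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: {class} is a short uppercase token and {nnn} is three digits.
IOC_ID_PATTERN = re.compile(r"^TRACE-IOC-(?P<cls>[A-Z]+)-(?P<num>\d{3})$")


@dataclass(frozen=True)
class IocEntry:
    ioc_id: str                           # stable ID
    severity: str                         # severity guidance
    rule: Callable[[dict], bool]          # detection rule over one JSONL record
    false_positive_notes: str = ""
    atlas_mapping: Optional[str] = None   # optional MITRE ATLAS mapping

    def __post_init__(self):
        # Reject IDs that do not follow the catalog's naming template.
        if not IOC_ID_PATTERN.match(self.ioc_id):
            raise ValueError(f"malformed IoC id: {self.ioc_id}")


def run_rule(entry: IocEntry, records: list) -> list:
    """Return the records that the entry's detection rule matches."""
    return [r for r in records if entry.rule(r)]
```

For example, a hypothetical entry `IocEntry("TRACE-IOC-CRED-001", "high", lambda r: r.get("path", "").endswith(".env"))` would match any record whose `path` field ends in `.env`. A real catalog would likely express rules declaratively rather than as Python callables.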
System-prompt-shaped strings appearing in tool_result or fetched URL bodies, then referenced by subsequent reasoning. Detects indirect injection from MCP servers, web pages, and file contents.
Bash commands, file writes outside the working directory, or network egress that doesn't match the declared task. Cross-references prompt_text intent against actual command records.
Sub-agent sessions spawned without a matching parent prompt, delegation depth exceeding the policy ceiling, or sub-agents reading resources their parent never had context for.
High-assurance reasoning traces referencing entities never in context, sudden topic shifts mid-session, or scratchpad content inconsistent with the declared task. Requires Medium+ assurance.
Reads against .aws/credentials, .ssh/, .env, ~/.config/gh/, browser cookie stores, followed by network egress or external write within the same session.
MCP responses violating the server's advertised schema, signatures that don't validate against published keys, or replay-attack timing on cached responses. Targets MCP-server supply-chain attacks.
Source-file reads followed by encoded payloads in outbound HTTP bodies, paste-service POSTs, or DNS exfil patterns. Looks for the read-stage-egress sequence rather than any single call.
Action volume, model swap, or geographic-IP shift mid-session that doesn't match the user's historical baseline. Requires baselining over prior sessions.
An IoC match by itself is not a confirmed incident — it is a triage signal. The vibetrace scan CLI annotates each match with confidence, evidence references, and recommended next action.
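As an illustration of a sequence-based rule that yields a triage signal rather than a verdict, the sketch below flags sessions where a sensitive-path read is later followed by network egress. The record fields (`session_id`, `kind`, `path`) and the match fields are assumptions for this example, not the VIBES or vibetrace schema, and records are assumed to be in chronological order.

```python
import json

# Path fragments taken from the credential-access description above.
SENSITIVE_MARKERS = (".aws/credentials", ".ssh/", ".env", ".config/gh/")


def scan_credential_egress(jsonl_text: str) -> list[dict]:
    """Flag sessions where a sensitive read precedes network egress."""
    first_read = {}   # session_id -> line number of first sensitive read
    flagged = set()   # sessions already reported, to avoid duplicate matches
    matches = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        rec = json.loads(line)
        sid = rec.get("session_id")
        kind = rec.get("kind")  # hypothetical record field
        if kind == "file_read" and any(m in rec.get("path", "") for m in SENSITIVE_MARKERS):
            first_read.setdefault(sid, line_no)
        elif kind == "network_egress" and sid in first_read and sid not in flagged:
            flagged.add(sid)
            matches.append({
                "session_id": sid,
                "confidence": "medium",  # a triage signal, not a confirmed incident
                "evidence_refs": [first_read[sid], line_no],
                "next_action": "review session before declaring an incident",
            })
    return matches
```

Substring matching on paths is deliberately crude and will produce false positives (for instance, any path containing `.env`), which is why each match carries evidence references for a human to check.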
When an incident is declared, TRACE produces a content-addressed evidence package at .ai-audit/incidents/<incident-id>/. The package is sealed in a VERIFY DSSE envelope with a TRACE-specific predicateType, providing forensic-grade chain of custody from the moment of declaration forward.
The bundle includes:
audit-snapshot.tar.gz: frozen copy of .ai-audit/ at incident time
tool-traces.jsonl: bash history, file system diff, network flows correlated to session IDs
mcp-logs.jsonl: MCP request/response pairs with timing
fs-diff.patch: unified diff of files modified during the affected session window
provider-correlation.json (optional): case-id tokens for requesting tool-provider logs out-of-band
Tool-provider cosigning on the evidence bundle is the strongest form of TRACE chain of custody: the provider attests "this audit data was generated by my system at the claimed time," eliminating the fabrication and post-hoc-editing threats from the IR analysis. Cosigning is optional; bundles signed by the responder alone remain valid.
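The sealing step can be sketched as follows: hash each bundle file, record the digests in a statement, and sign the DSSE pre-authentication encoding of that statement. Several things here are assumptions rather than the VERIFY format: wrapping the digests in an in-toto style statement, the `incident_id` predicate field, and above all the use of HMAC-SHA256 as a stdlib stand-in for the asymmetric signature a real deployment would use.

```python
import base64
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (len(t), t, len(payload), payload)


def seal_bundle(files: dict[str, bytes], incident_id: str, key: bytes, keyid: str) -> dict:
    """Build a DSSE-style envelope over the digests of the bundle files."""
    statement = {
        "_type": "https://in-toto.io/Statement/v1",  # assumed wrapper format
        "subject": [
            {"name": name, "digest": {"sha256": sha256_hex(body)}}
            for name, body in sorted(files.items())
        ],
        "predicateType": "trace/evidence/v1",
        "predicate": {"incident_id": incident_id},   # hypothetical predicate body
    }
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    payload_type = "application/vnd.in-toto+json"
    # HMAC is a stand-in for a real signature scheme, used to keep this self-contained.
    sig = hmac.new(key, pae(payload_type, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_bundle(envelope: dict, files: dict[str, bytes], key: bytes) -> bool:
    """Check the signature, then check every file against its recorded digest."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    sig = base64.b64decode(envelope["signatures"][0]["sig"])
    if not hmac.compare_digest(sig, expected):
        return False
    recorded = {s["name"]: s["digest"]["sha256"] for s in json.loads(payload)["subject"]}
    return recorded == {name: sha256_hex(body) for name, body in files.items()}
```

Because the signature covers the digests rather than the files themselves, any later edit to a bundle file changes its hash and fails verification. Cosigning would correspond to a second entry in the `signatures` array.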
Once an evidence bundle is sealed, TRACE provides query and visualization primitives for understanding what actually happened. These run locally against the sealed bundle — no network calls required.
Chronological reconstruction of every prompt, tool call, file edit, sub-agent spawn, and decision record across one or more affected sessions. Output as text, JSON, or interactive HTML. Anchored on VIBES session and edge records.
Renders the parent → child agent DAG with credential and context flow annotations on the edges. Surfaces which sub-agent inherited which secrets from which parent prompt, and where context boundaries were crossed.
Lists every file touched, every credential read, every external endpoint contacted, and every sub-agent spawned during the affected window. Produces the input set for containment playbooks (revert range, rotation list, notification chain).
Given the same prompts and context, would a clean instance of the same model produce the same output? Distinguishes user-driven misuse from model-specific compromise. Optional; requires local model access.
Across a fleet, find every session that touched a specific compromised file, used a specific MCP server version, ran a specific model build, or matched a given IoC. Critical for scoping fleet-wide impact when an upstream provider discloses a model issue.
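Two of the primitives above, the delegation graph walk and the blast-radius listing, can be sketched with a breadth-first traversal. The edge tuples and the record fields (`session_id`, `kind`, `path`, `host`) are hypothetical stand-ins for VIBES session and edge records.

```python
from collections import defaultdict, deque


def descendants(edges: list, root: str) -> set:
    """Return root plus every session reachable via parent -> child edges."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    seen, queue = {root}, deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def blast_radius(records: list, edges: list, root: str) -> dict:
    """Collect files, credentials, and endpoints touched by the affected sessions."""
    affected = descendants(edges, root)
    files, credentials, endpoints = set(), set(), set()
    for rec in records:
        if rec["session_id"] not in affected:
            continue  # unrelated session, outside the incident scope
        kind = rec["kind"]  # hypothetical record field
        if kind in ("file_read", "file_write"):
            files.add(rec["path"])
        elif kind == "credential_read":
            credentials.add(rec["path"])
        elif kind == "network_egress":
            endpoints.add(rec["host"])
    return {
        "sessions": sorted(affected),
        "files": sorted(files),
        "credentials": sorted(credentials),
        "endpoints": sorted(endpoints),
    }
```

The resulting lists map onto the containment inputs named above: `files` suggests the revert range and `credentials` the rotation list.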
The portable output of an investigation is a TRACE-IR JSON document — a structured incident report aligned with STIX 2.1 object types so it ingests cleanly into existing security tooling. No bespoke parser required; if your stack speaks STIX, it speaks TRACE-IR.
TRACE-IR documents emit a parallel STIX 2.1 bundle (stix_bundle_ref) populated with standard STIX objects: incident, indicator, observed-data, tool, identity, note. Agentic-system specifics that don't fit standard STIX vocabulary are emitted as STIX x-trace-* custom objects with documented schemas.
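A parallel STIX 2.1 bundle of this kind might look roughly like the output of the sketch below. The common properties (`type`, `spec_version`, `id`, `created`, `modified`) and the indicator's `pattern`, `pattern_type`, and `valid_from` follow STIX 2.1; the `x-trace-delegation` object type, its `session_id` property, and the example pattern are hypothetical illustrations of the `x-trace-*` convention, not documented TRACE schemas.

```python
import uuid
from datetime import datetime, timezone


def stix_id(obj_type: str) -> str:
    """STIX identifiers have the form <type>--<UUID>."""
    return f"{obj_type}--{uuid.uuid4()}"


def now_ts() -> str:
    """UTC timestamp with millisecond precision, e.g. 2026-05-07T12:00:00.000Z."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def common(obj_type: str) -> dict:
    """Properties shared by the STIX objects in this sketch."""
    ts = now_ts()
    return {"type": obj_type, "spec_version": "2.1", "id": stix_id(obj_type),
            "created": ts, "modified": ts}


def build_bundle(incident_name: str, ioc_id: str, session_id: str) -> dict:
    incident = {**common("incident"), "name": incident_name}
    indicator = {
        **common("indicator"),
        "name": ioc_id,
        "pattern_type": "stix",
        "pattern": "[file:name = '.env']",   # illustrative pattern only
        "valid_from": incident["created"],
    }
    # Hypothetical custom object for agent-specific data with no STIX equivalent.
    delegation = {**common("x-trace-delegation"), "session_id": session_id}
    return {"type": "bundle", "id": stix_id("bundle"),
            "objects": [incident, indicator, delegation]}
```

Serializing the result with `json.dumps` gives a document that STIX-aware tooling can read for the standard objects, while consumers unaware of the custom type can skip it.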
Note on redaction: TRACE-IR documents may contain prompt text, source-code excerpts, and credential identifiers. The TRACE specification deliberately defines no built-in redaction model. Redaction and PII identification are left to dedicated tooling that operates over the open TRACE-IR format, allowing organizations to apply their own classification, retention, and disclosure policies.
TRACE deployments mature along three tiers, paralleling the assurance ladder in VIBES. Start at Reactive; advance to Proactive when you want continuous surveillance; reach Autonomous when policy is clear enough for automated containment.
Tier 1, Reactive: on-demand investigation
Tier 2, Proactive: continuous detection
Tier 3, Autonomous: automated containment
TRACE is designed to compose with the other extensions, but does not require any of them beyond VIBES itself.
| Standard | How TRACE uses it |
|---|---|
| VIBES | Required substrate. Annotations, decision records, edge records, and delegation traces are the input to IoC detection and timeline reconstruction. |
| VERIFY | Optional but recommended. Evidence bundles use DSSE envelopes with the trace/evidence/v1 predicate type. Tool-provider cosigning eliminates the fabrication and post-hoc-editing threats from IR analysis. |
| PRISM | Optional severity provider. PRISM ≥ 0.8 events can auto-declare a TRACE incident. Quarantined annotations from closed incidents are excluded from PRISM aggregates so the score reflects only trusted activity. Organizations may substitute any external scoring system — severity is pluggable. |
| EVOLVE | Optional feedback channel. Closed-incident TRACE-IR reports feed back as training signal — new IoCs become detection rules, remediation patterns update agent decision policies, and recurring incident classes become governance constraints. |
Standalone by design. TRACE does not require provider cooperation, a centralized registry, or any of the other extensions. A team running plain VIBES Low-assurance can still produce TRACE-IR reports — they just have less data to investigate with. As tool providers adopt VIBES cosigning and teams layer on PRISM, the forensic fidelity of TRACE rises with them.
vibetrace CLI
The reference implementation ships as a sibling to vibecheck. Like vibecheck, vibetrace runs locally against the project's .ai-audit/ directory; no network calls are required for detection, evidence sealing, or report generation.
Every command is scriptable for CI/CD and can be embedded in existing IR runbooks. The TRACE specification defines the file formats, the IoC catalog, the playbook contract, and the report schema; alternative implementations are welcome as long as they conform.
Four explicit choices shape the TRACE specification:
TRACE-IR documents are plain JSON aligned with STIX 2.1 object types so existing SIEMs, ticketing systems, and disclosure pipelines ingest them with no custom parsers. No proprietary wire format.
TRACE does not require cooperation from any AI provider. Adoption is welcomed but optional — providers that want to participate can add cosigning and standardized log-disclosure tokens, but TRACE is fully usable without them.
TRACE deliberately defines no PII or secret-redaction layer. Redaction is a domain that warrants dedicated tooling operating over the open TRACE-IR format, so organizations can apply their own classification, retention, and disclosure policies.
Severity is sourced from PRISM by default, but the severity.source field is pluggable. Organizations with existing CVSS-style scoring, custom risk engines, or third-party severity services can substitute them without changing the report schema.
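One way to picture the pluggable severity source is a small provider interface whose name is written into `severity.source`. Only that field name comes from the text above; the `SeverityProvider` protocol, the two example providers, the input keys, and the `score` field are assumptions made for this sketch.

```python
from typing import Protocol


class SeverityProvider(Protocol):
    """Anything with a name and a score() method can supply severity."""
    name: str

    def score(self, incident: dict) -> float: ...


class PrismProvider:
    name = "prism"

    def score(self, incident: dict) -> float:
        # Hypothetical input key holding a PRISM score in the 0..1 range.
        return float(incident.get("prism_score", 0.0))


class CvssStyleProvider:
    name = "cvss-style"

    def score(self, incident: dict) -> float:
        # Rescale a 0..10 CVSS-style value onto the same 0..1 range.
        return float(incident.get("cvss", 0.0)) / 10.0


def severity_block(incident: dict, provider: SeverityProvider) -> dict:
    """Build the severity section of a report; the shape stays the same per provider."""
    return {"source": provider.name, "score": round(provider.score(incident), 2)}
```

Swapping `PrismProvider()` for `CvssStyleProvider()` changes only the `source` value and the number, which is the property the design choice above is after: the report schema does not change with the scoring system.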
Like the other VIBES-family standards, TRACE follows a draft → review → ratification process. Draft versions are working documents subject to change.
| Version | Date | Status | Notes |
|---|---|---|---|
| 0.1-draft | 2026-05-07 | Draft | Initial TRACE extension draft. IoC vocabulary, evidence bundle format, TRACE-IR JSON schema (STIX-compatible), maturity tiers, vibetrace CLI surface. |
| 1.0 | TBD | Pending | Target: stable release after public review and reference vibetrace implementation. |
TRACE is an extension to VIBES — the Incident Response and forensics layer of the VIBES family. It builds on VIBES audit data and composes with the other extensions (VERIFY, PRISM, EVOLVE) but is independently adoptable.
VIBES — the audit substrate. TRACE consumes annotations, decisions, and delegation traces.
VERIFY — cryptographic sealing. TRACE evidence bundles are DSSE envelopes; cosigning hardens chain of custody.
PRISM — risk scoring. PRISM Critical-band events can auto-declare TRACE incidents; pluggable with external scoring.
EVOLVE — agent learning. Closed TRACE incidents feed back as IoC tuning and governance signal.
TRACE is a draft — the strongest version emerges from a community of responders, tool builders, and security teams who run agents in production.
Contribute to the IoC catalog, propose new agent-specific indicators, refine the TRACE-IR schema, or share IR runbooks from your own incident postmortems.
Get involved →
Build a vibetrace alternative, integrate TRACE-IR ingestion into your SIEM, write IoC detection rules for your stack, or add a redaction tool that operates over the open format.
Implementation guide →
Make the case for structured IR data on agent activity in your organization. Run a tabletop exercise on a simulated agent compromise. Show that "we had no visibility" is no longer the right answer.
Resources →