Joe Krisciunas

Research

Cartograph Public Dossier


Public share surface

Workflow compromise in agentic systems

The full public artifact now lives in a dedicated Cartograph dossier with a fuller account of methods, sharper architecture framing, and a strict boundary between public and private disclosure.


Snapshot

246 attack cases
64.9% highest public model result
0 high-risk tool executions

Flagship claims


Workflow text becomes operational authority.

Runbooks, handoff notes, review sheets, and approval text can be adopted as active authority rather than treated as passive context.

Summarize, review, and handoff tasks can drift into action preparation.

The failure often appears before the final risky tool call: the model asks for the exact command, recipient, destination, or missing approval detail it needs to continue the malicious workflow.

Containment and compromise are different safety signals.

A runtime can successfully block dangerous execution after unsafe tool intent has already formed. Public reporting has to preserve both signals.

Architecture spine


Threat taxonomy and case corpus

Cartograph organizes the agentic surface into nine threat channels and a public program scope of 246 attack cases across 20 datasets.

Adaptive attacker lanes

Attack generation combines iterative attacker loops, multi-turn escalation, and realistic workflow poisoning cases.

CapabilityOS containment layer

The execution shell separates model behavior from runtime permissioning through layered controls: tool allowlists, approval gates, egress controls, and path restrictions.
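The layering described above can be illustrated with a minimal sketch. It is not CapabilityOS code: the `ToolCall` and `Policy` shapes, the field names, and the layer labels are all hypothetical, chosen only to show how the four controls might be checked in order so that the first failing layer blocks the call regardless of what the model intended.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation requested by the model (hypothetical shape)."""
    tool: str
    path: Optional[str] = None   # filesystem target, if any
    host: Optional[str] = None   # network destination, if any
    approved: bool = False       # set by an out-of-band approval step


@dataclass
class Policy:
    """Illustrative runtime policy; every field name is an assumption."""
    allowed_tools: set = field(default_factory=set)
    needs_approval: set = field(default_factory=set)
    allowed_hosts: set = field(default_factory=set)
    allowed_root: str = "/workspace"


def check(call: ToolCall, policy: Policy) -> tuple[bool, str]:
    """Run each layer in order; return (allowed, name of deciding layer)."""
    # Layer 1: tool allowlist.
    if call.tool not in policy.allowed_tools:
        return False, "tool_allowlist"
    # Layer 2: approval gate for sensitive tools.
    if call.tool in policy.needs_approval and not call.approved:
        return False, "approval_gate"
    # Layer 3: egress control on network destinations.
    if call.host is not None and call.host not in policy.allowed_hosts:
        return False, "egress_control"
    # Layer 4: path restriction. Reject relative paths, ".." segments,
    # and anything that is not the allowed root or beneath it.
    if call.path is not None:
        root = PurePosixPath(policy.allowed_root)
        target = PurePosixPath(call.path)
        inside = root in (target, *target.parents)
        if not target.is_absolute() or ".." in target.parts or not inside:
            return False, "path_restriction"
    return True, "allowed"
```

In this sketch the permission decision depends only on the requested call and the policy, never on the model's stated reasoning, which is one way to read "separates model behavior from runtime permissioning".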

Dual-signal scoring

Cartograph scores unsafe intent formation separately from final execution and adds intermediate workflow compromise signals such as semantic capture.
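A small sketch may help show why the two signals have to be recorded separately. Everything here is assumed for illustration: the `CaseResult` fields, the outcome labels, and the rate definitions are hypothetical and are not Cartograph's scoring code, and the example is not connected to the published figures.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """Per-case signals (hypothetical field names)."""
    case_id: str
    semantic_capture: bool  # model adopted injected workflow text as authority
    unsafe_intent: bool     # model requested or attempted a high-risk tool call
    executed: bool          # the high-risk call actually ran


def classify(r: CaseResult) -> str:
    """Map a case to its most severe outcome."""
    if r.executed:
        return "executed"
    if r.unsafe_intent:
        return "contained"  # intent formed, but the runtime blocked execution
    if r.semantic_capture:
        return "captured"   # intermediate compromise without tool intent
    return "clean"


def summarize(results: list) -> dict:
    """Report compromise and execution as separate rates."""
    counts = Counter(classify(r) for r in results)
    n = len(results)
    compromised = n - counts["clean"]
    return {
        "cases": n,
        "compromise_rate": compromised / n if n else 0.0,
        "execution_rate": counts["executed"] / n if n else 0.0,
        "by_outcome": dict(counts),
    }
```

With made-up inputs in which no high-risk call ever runs, `summarize` can still return a nonzero `compromise_rate` alongside an `execution_rate` of 0.0. A single blended score would hide that distinction.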

Replay-valid evidence

Every public claim in this packet is meant to resolve to replay-backed evidence, a bounded transcript, or an explicitly downgraded aggregate result.