I'm looking into building a hard "Action Authorization Boundary" (AAB) that sits outside the agent's context window entirely. The idea is to intercept each tool call, normalize it into a canonical intent, and check that intent against a deterministic YAML policy before execution.
A few questions for those building in this space:
Canonicalization: How do you handle the messiness of LLM tool outputs? If the representation isn't perfectly canonical, bypassing the policy seems trivial.
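To make the canonicalization concern concrete, here is a minimal sketch of the kind of normalization I have in mind. Everything in it is an assumption for illustration: the `canonicalize` function, the `tool`/`args` shape, and the derived `host` field are hypothetical and not part of any existing spec. It only covers Unicode folding, whitespace, key ordering, and hostname case, so it is nowhere near a complete defense.

```python
import json
import unicodedata
from urllib.parse import urlsplit


def _norm(value):
    """Recursively fold strings to NFKC and trim whitespace (illustrative only)."""
    if isinstance(value, str):
        return unicodedata.normalize("NFKC", value).strip()
    if isinstance(value, dict):
        return {_norm(k): _norm(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_norm(v) for v in value]
    return value


def canonicalize(tool: str, args: dict) -> str:
    """Return a deterministic JSON string for a tool call (hypothetical form)."""
    clean = _norm(args)
    url = clean.get("url")
    if isinstance(url, str):
        # Derive a lowercase host so the policy matches on one spelling.
        clean["host"] = urlsplit(url).hostname or ""
    action = {"tool": _norm(tool).lower(), "args": clean}
    # Sorted keys and fixed separators make the output byte-for-byte stable.
    return json.dumps(action, sort_keys=True, separators=(",", ":"))


a = canonicalize("HTTP_Post ", {"url": "https://API.Example.com/x", "body": "hi"})
b = canonicalize("http_post", {"body": "hi", "url": "https://API.Example.com/x"})
print(a == b)
```

The policy would then match only on the canonical string (or the parsed form of it), never on the raw model output. Things like path traversal, percent-encoding, and redirects are deliberately left out here and would each need their own rule.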
Stateful Intent: How do you handle sequences that are individually safe but collectively risky? For example, an agent reading a sensitive DB (safe) and then making a POST request to an external API (dangerous exfiltration).
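One approach I'm considering for the read-then-POST case is simple session-level taint tracking: once a sensitive read happens, outbound calls to non-allowlisted hosts are denied for the rest of the session. This is a rough sketch under assumptions, not a finished design; `SessionGate`, the `db_read` and `http_post` tool names, the `sensitive` flag, and the `internal.example` allowlist entry are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionGate:
    """Tracks whether the session has touched sensitive data (illustrative only)."""

    allowed_hosts: frozenset = frozenset({"internal.example"})
    tainted: bool = False

    def check(self, action: dict) -> str:
        """Return "allow" or "deny" for one already-canonicalized action."""
        if action["tool"] == "db_read" and action.get("sensitive"):
            # The read itself is fine, but it changes what is allowed afterwards.
            self.tainted = True
            return "allow"
        if action["tool"] == "http_post":
            if self.tainted and action.get("host") not in self.allowed_hosts:
                return "deny"
        return "allow"


gate = SessionGate()
decisions = [
    gate.check({"tool": "http_post", "host": "api.external.example"}),
    gate.check({"tool": "db_read", "sensitive": True}),
    gate.check({"tool": "http_post", "host": "api.external.example"}),
    gate.check({"tool": "http_post", "host": "internal.example"}),
]
print(decisions)
```

The obvious weakness is that a single boolean is coarse: it blocks legitimate external calls after any sensitive read, and it does not track data that leaves through other tools. It does show why the gate needs per-session state rather than evaluating each call in isolation.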
Latency: Does moving the "gate" outside the model-loop add too much overhead for real-time agentic workflows?
I’ve been working on a CAR (Canonical Action Representation) spec to solve this, but I’m curious whether I'm overthinking it or whether there’s an existing "firewall for agents" standard I'm missing.