Agentic Coding with Composable Security Guarantees
Coding agents are a remarkable and transformative technology. They also pose real security and governance risks. But that’s where a lot of discussions stop. This one goes further. We’re going to dig into why and how to reestablish a degree of control.
If you’ve used a tool like Claude Code or Cursor, you’ve surely seen this. Agents display several tendencies that are a direct threat to security, like:
Bias toward path of least resistance, often at the expense of safeguards
Bias toward increasing scope/blast radius of changes, even contrary to instructions
Bias toward action over clarification in the face of ambiguity
At the same time, agentic coding erodes some of the implicit safety nets that security relies on. Agents operate at a velocity and volume that outpaces reasonable human review. They also break the personal accountability chain that grounds human developers.
For many teams, the benefits already outweigh those very real risks. We’ve crossed the Rubicon. Now it’s incumbent on us to manage those risks.
So, how do we maintain security in the face of super-human output with reduced human accountability? We make accountability systemic instead of personal. And there’s already a playbook for that: DevSecOps.
But while thinking about DevSecOps in an agentic world, a pattern emerged that I’m calling Composable Security Guarantees. And that pattern makes this problem far tidier than I initially thought possible.
Just Enough DevSecOps To Set The Stage
There are four prerequisites that, to me, are fundamental to maintaining confidence in your application security in an agentic world.
1. Isolated, secure toolsets
The first step is isolating critical security logic into dedicated, hardened toolsets that live outside the agent’s normal blast radius.
In practice, this often means separate packages or repositories with clearer ownership boundaries and a much higher bar for change. The point isn’t perfect isolation; it’s ensuring that security protections aren’t casually edited, weakened, or removed as part of routine feature work.
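One concrete mechanism for that higher bar: a GitHub CODEOWNERS entry (path and team name illustrative) that routes every change to the hardened package through the security team, typically paired with branch protection requiring owner review:

```
# CODEOWNERS: all changes to the hardened package require security review
/packages/secure-http-client/  @acme/security-team
```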
2. Explicit alignment with security requirements
Once security logic is isolated, its guarantees need to be explicit.
Human engineers rely on implicit knowledge of standards (OWASP guidance, ASVS controls, best practices, etc.). But in agentic loops with reduced human involvement, those relationships and resources need to be spelled out as clearly as possible: they carry the intent and expectations that implicit knowledge once did.
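As a sketch of what making that explicit can look like in code (the function and identifiers are illustrative; ASVS 4.x’s V5.2.6, for instance, covers SSRF defenses):

```js
/**
 * Rejects outbound destinations that aren't explicitly sanctioned.
 *
 * Mapped requirements (identifiers illustrative):
 *   - ASVS V5.2.6 (SSRF: allow lists of protocols and domains)
 *   - ACME-SEC-012 (hypothetical in-house egress policy)
 */
export function assertAllowedDestination(url, allowedHosts) {
  const { protocol, hostname } = new URL(url);
  if (protocol !== 'https:' || !allowedHosts.has(hostname)) {
    throw new Error(`Destination not allowed: ${url}`);
  }
}
```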
3. Machine-readable artifacts and evidence
It’s not enough to just capture the intent and expectations. To turn those connections into operational intelligence under high-velocity, high-volume conditions, you need machine-readable artifacts and evidence.
Structured test results, manifests, SBOM entries, etc. all play a part here. Having visibility into security requirement coverage and clear traceability from violations to mapped requirements and physical locations in the codebase is incredibly powerful.
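For illustration (the shape below is hypothetical, not a standard format), a single machine-readable evidence record might tie a requirement, a test result, and a code location together:

```json
{
  "requirement": "ASVS-V5.2.6",
  "status": "pass",
  "evidence": {
    "tool": "vitest",
    "suite": "test/ssrf.test.js",
    "case": "rejects requests to loopback addresses"
  },
  "source": { "file": "src/secure-fetch.js", "symbol": "secureFetch" }
}
```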
4. Apply a Zero-Trust mindset to agent behavior
It’s tempting to believe agents will “do the right thing”. Or that you can steer them into doing the right thing with instructions. But they frequently don’t. And you generally can’t, at least not reliably enough to trust for security purposes.
So, rather than trusting agents will do the right thing, we’ll trust that they won’t. In other words, we need to be able to automatically verify the agent didn’t undermine application security.
Given that we’ve isolated our secure toolset from the agent and aligned its security protections with standards in a machine-readable way, what remains is verifying secure usage.
Composable Security Guarantees Emerge
Now comes the critical insight: what if the secure toolset itself were responsible for verifying that it’s being used safely?
Imagine a secure package that provides:
An internal automated test suite, with cases explicitly mapped to security requirements, serving as evidence of the guarantees it claims to provide; AND
Application-facing static checks that verify the tool is being used correctly and not circumvented.
If a consuming application passes those checks, it can be said to transitively inherit the security guarantees of the package. And when those checks are enforced in the normal development loop and CI/CD pipelines, trust in agent behavior is replaced with continuous verification.
In other words, you create composable security guarantees.
Tool X satisfies Security Requirements A, B, and C when used correctly. Application Y is demonstrated to use Tool X correctly. Therefore, Application Y meets Security Requirements A, B, and C.
Example: Secure Server-Side HTTP Client
Server-side HTTP requests provide a particularly clean application of this concept, especially in Node.js applications. Not all applications will be so straightforward, but we’ll come back to that.
Naturally, you don’t want to expose your application to the risk of Server-Side Request Forgery (SSRF) or to unsafe operational defaults such as missing timeouts or unbounded response sizes.
But if you don’t provide a shared, hardened client, you’re likely to see agents reaching for fetch and constructing requests on an ad hoc basis. This sort of inline construction is very hard to get right consistently, practically guaranteeing exposure to one or more threats if the agent is unchecked by systemic controls.
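For instance, the kind of inline construction an unchecked agent tends to produce looks something like this hypothetical Express handler (assuming Node 18+ for the global fetch):

```js
import express from 'express';
const app = express();

// Hypothetical agent-written code: the caller controls the destination,
// and there's no allow list, timeout, or response size bound.
app.get('/preview', async (req, res) => {
  const upstream = await fetch(req.query.url); // classic SSRF shape
  res.send(await upstream.text());             // unbounded body read
});

app.listen(3000);
```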
So, you set forth to write your @acme/secure-http-client package.
You’re meticulous about the attack vectors, protecting against:
Unsanctioned destinations (via allow lists or deny lists, as appropriate)
DNS masking and rebinding attacks
Unsafe redirects
Unbounded responses and compression bombs
...and a long tail of other edge cases
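To make that concrete, here’s a minimal sketch of what the client’s core might look like, assuming Node 18+ (global fetch). The allow list, option names, and error messages are illustrative, and a production client would go further (for example, pinning the resolved IP for the actual connection to fully close the DNS-rebinding window):

```js
import dns from 'node:dns/promises';
import net from 'node:net';

const ALLOWED_HOSTS = new Set(['api.partner.example']); // sanctioned destinations

export async function secureFetch(url, { timeoutMs = 5000, maxBytes = 1_000_000 } = {}) {
  const { hostname, protocol } = new URL(url);
  if (protocol !== 'https:') throw new Error(`Protocol not allowed: ${protocol}`);
  if (!ALLOWED_HOSTS.has(hostname)) throw new Error(`Destination not allowed: ${hostname}`);

  // Reject hostnames that resolve to loopback/private ranges (coarse check).
  for (const { address } of await dns.lookup(hostname, { all: true })) {
    if (isPrivateAddress(address)) throw new Error(`Destination not allowed: ${hostname}`);
  }

  const res = await fetch(url, {
    redirect: 'error',                      // unsafe redirects: fail closed
    signal: AbortSignal.timeout(timeoutMs), // never hang without a timeout
  });
  if (!res.body) return { status: res.status, body: Buffer.alloc(0) };

  // Stream the body so an unbounded response can be cut off early.
  const reader = res.body.getReader();
  const chunks = [];
  let received = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    received += value.byteLength;
    if (received > maxBytes) {
      await reader.cancel();
      throw new Error('Response exceeds size limit');
    }
    chunks.push(value);
  }
  return { status: res.status, body: Buffer.concat(chunks) };
}

function isPrivateAddress(address) {
  if (net.isIPv6(address)) {
    return address === '::1' || address.startsWith('fd') || address.startsWith('fe80');
  }
  const [a, b] = address.split('.').map(Number);
  return a === 0 || a === 10 || a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254);
}
```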
More importantly, you’ve got a robust test suite verifying those protections and explicitly mapping them to requirements (ASVS controls, CVE classes, or in-house standards). If a regression is introduced, the package’s own CI/CD pipeline fails with a clear signal about which guarantee was violated.
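A sketch of what one of those mapped tests might look like, assuming Vitest (the requirement identifiers, module path, and reporter convention are illustrative; this builds on the secureFetch sketch above):

```js
import { describe, expect, it } from 'vitest';
import { secureFetch } from '../src/secure-fetch.js'; // hypothetical module path

describe('SSRF protections', () => {
  it('rejects destinations outside the allow list', async ({ task }) => {
    task.meta.requirement = 'ASVS-V5.2.6'; // surfaced by a custom reporter on failure
    await expect(secureFetch('https://127.0.0.1/admin')).rejects.toThrow(/not allowed/i);
  });

  it('aborts requests that exceed the timeout', async ({ task }) => {
    task.meta.requirement = 'ACME-SEC-014'; // hypothetical in-house standard
    await expect(secureFetch('https://api.partner.example/slow', { timeoutMs: 1 }))
      .rejects.toThrow();
  });
});
```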
But the client doesn’t stop there.
You also bundle application-facing static checks, perhaps ESLint or Semgrep rules, that verify the client is being used correctly. For example, the rules might enforce that the application:
Never imports fetch, axios, or undici (pure server-side); OR
Only imports fetch or axios in modules that also import client-only (hybrid)
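A sketch of the bundled rules for the pure server-side mode, using stock ESLint rules in flat-config form (the messages and package name are illustrative):

```js
// eslint.config.js fragment shipped by the package (pure server-side mode)
export default [
  {
    rules: {
      // Ban the global fetch in server code
      'no-restricted-globals': ['error', {
        name: 'fetch',
        message: 'Use @acme/secure-http-client instead of raw fetch.',
      }],
      // Ban direct imports of raw HTTP clients
      'no-restricted-imports': ['error', {
        paths: [
          { name: 'axios', message: 'Use @acme/secure-http-client.' },
          { name: 'undici', message: 'Use @acme/secure-http-client.' },
        ],
      }],
    },
  },
];
```

The hybrid variant conditions on what else a module imports, which stock rules can’t express; that one would likely ship as a small custom rule.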
Because these checks are versioned and distributed with the package, they can be enforced in the normal development loop and CI/CD pipelines of consuming applications.
For any given application release, this creates an auditable evidence chain: The package version contains evidence of its security guarantees, while the application’s pipeline generates evidence of passing the bundled static checks. Now you can attest that the application satisfies those security requirements.
That’s Composable Security Guarantees in practice.
Turning Composable Security into a System
As of today, I’ve implemented this process once as a proof-of-concept (PoC) internal tool. While not a reference architecture by any means, it did successfully illustrate that this concept can work end-to-end.
The PoC’s foundation was a simple Node.js Commander CLI that had responsibilities on both sides of the boundary (package and application).
On the package side, the tool:
Generates a small, versioned manifest describing the package’s external checks and mapped requirements, based on folder and metadata conventions (a hypothetical example follows this list)
Provides an npm build hook that ensures any necessary build steps for static checks are performed (e.g., creating a CodeQL bundle) and that the manifest and check artifacts are included in dist/
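As an illustration only (this is a hypothetical shape, not the PoC’s actual format), such a manifest might look like:

```json
{
  "name": "@acme/secure-http-client",
  "version": "2.3.0",
  "checks": [
    {
      "tool": "eslint",
      "artifact": "dist/checks/eslint/recommended.js",
      "requirements": ["ASVS-V5.2.6", "ACME-SEC-012"]
    },
    {
      "tool": "semgrep",
      "artifact": "dist/checks/semgrep/no-raw-http.yaml",
      "requirements": ["ASVS-V5.2.6"]
    }
  ]
}
```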
On the application side, the tool provides commands that:
Scan top-level dependencies for manifests and install the manifests and checks into a local dotfolder at the application root
Ensure the relevant SAST tools (Semgrep, CodeQL, ESLint, etc.) are configured to source the installed checks
Verify that the locally installed checks remain in sync with the versions declared by the installed dependencies
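The sync check in particular is simple to sketch, assuming each manifest entry also declares a sha256 digest of its artifact (the dotfolder name and field names are hypothetical):

```js
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';
import path from 'node:path';

// True when the locally installed check still matches what the dependency declares.
function checkInSync(appRoot, declared) {
  const installed = path.join(appRoot, '.csg', declared.artifact); // hypothetical dotfolder
  const digest = createHash('sha256').update(readFileSync(installed)).digest('hex');
  return digest === declared.sha256;
}
```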
That basic CLI was enough to take the proof-of-concept surprisingly far.
By leaning on existing metadata conventions in tools like Vitest, Semgrep, CodeQL, and ESLint, I was able to map both internal tests and application-facing checks to specific security requirements. Existing runners and reporters could surface those requirements in violation output, though usually not the default ones.
For a production-grade system, however, this pattern and its tooling would ideally integrate much more deeply with the existing security ecosystem.
Check results could be emitted into SPDX or CycloneDX SBOMs
Findings could be standardized via SARIF for broader tool compatibility
Distribution could leverage the centralized rule repositories for Semgrep and CodeQL
While the current SBOM specifications don’t provide a native way to model contingent attestations (i.e., these requirements are satisfied if these other checks pass), they are designed for extensibility. Tooling built around Composable Security Guarantees could be extension-aware and handle the contingencies properly.
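For example, CycloneDX components already accept arbitrary name/value properties, which extension-aware tooling could use to carry the contingency (the property names here are invented for illustration):

```json
{
  "components": [
    {
      "type": "library",
      "name": "@acme/secure-http-client",
      "version": "2.3.0",
      "properties": [
        { "name": "acme:csg:requirements", "value": "ASVS-V5.2.6,ACME-SEC-012" },
        { "name": "acme:csg:contingent-on", "value": "eslint:dist/checks/eslint/recommended.js" }
      ]
    }
  ]
}
```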
Critically, none of this requires inventing an entirely new security stack. It’s a new way of connecting existing tools and artifacts to provide stronger safeguards against agentic risks. A rapidly changing landscape demands this kind of quickly deployable response.
What’s Next?
There’s no ducking it: more exploration is needed to understand which classes of requirements are amenable to secure, isolated tooling and machine-verifiability. SSRF was a very clean example; not every requirement class is so tractable. The size of that footprint places a hard limit on how far Composable Security Guarantees can go.
Other open questions include:
Are there other ways to provide composable security guarantees for use cases that don’t map well to static analysis?
How can off-the-shelf rules from tools like Semgrep, CodeQL, ESLint, etc. be incorporated into this system?
How should verification edge cases be handled (e.g., alternative checks, configurable checks, disabled checks, partial adoption)?
Should this system support auto-fixes, agent guidance, or other mechanisms for remediation?
If you’re working on agentic development, security, static analysis, or adjacent areas, I’d love to hear your thoughts. This idea would benefit greatly from open collaboration and real-world testing.
Lastly, if you’ve read this far and happen to be hiring, I’m currently looking for my next opportunity.

