AI-Generated Code and the Security Review Gap

A colleague recently showed me a pull request where every file had been written by an AI coding assistant. The code was clean. The tests passed. It went through review in twenty minutes. Nobody noticed that one imported package hadn't been audited, that the JWT verification was missing an algorithm check, or that the pinned dependency version had a CVE disclosed six months earlier.

The PR merged. Nothing broke — not yet. But the conditions for a future incident were quietly introduced that Tuesday afternoon.

Why the training data problem is subtle

Models are trained on code that was written to solve problems, not to teach security. Some of it was written when a particular library version was current or a cryptographic primitive was still considered acceptable. The model learned from all of it.

A model asked to write JWT verification might suggest:

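import jwt  # PyJWT

def get_current_user(token: str):
    # settings is the application's config object; PUBLIC_KEY is the RSA public key.
    # Vulnerable as written: the allow-list mixes an asymmetric and a symmetric algorithm.
    payload = jwt.decode(
        token,
        settings.PUBLIC_KEY,
        algorithms=["RS256", "HS256"]  # accepts both
    )
    return payload.get("sub")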

Accepting both RS256 and HS256 is the textbook algorithm confusion vulnerability. The token header, which the attacker controls, selects the algorithm, and the verifier hands the same key to whichever one is chosen. An attacker who knows your public key can therefore sign a token with HS256 using that key as the HMAC secret, and the signature verifies. The model suggested both because it has seen examples where developers were being flexible or debugging. A reviewer focused on logic correctness sees a JWT decode and moves on.

The fix is one line — algorithms=["RS256"] — but it requires someone to ask the question.
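To make the confusion concrete, here is a minimal standard-library sketch of both sides. It is not how any particular JWT library is implemented: forge_hs256, naive_hs256_verify, and alg_allowed are hypothetical helpers, and PUBLIC_KEY_PEM is placeholder bytes standing in for a real public key.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def forge_hs256(claims: dict, public_key_pem: bytes) -> str:
    """Attacker side: sign with HS256, using the public key bytes as the secret."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{body}".encode("ascii")
    signature = hmac.new(public_key_pem, signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(signature)}"


def naive_hs256_verify(token: str, key: bytes) -> bool:
    """What a verifier effectively does when HS256 is allowed and it holds the public key."""
    header, body, signature = token.split(".")
    expected = hmac.new(key, f"{header}.{body}".encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(signature))


def header_alg(token: str) -> str:
    # The header is attacker-controlled input; read it, but never trust it.
    return json.loads(b64url_decode(token.split(".")[0]))["alg"]


def alg_allowed(token: str, allowed=("RS256",)) -> bool:
    """Defender side: reject any token whose algorithm is not on the allow-list."""
    return header_alg(token) in allowed


# Placeholder bytes, not a real key.
PUBLIC_KEY_PEM = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"

forged = forge_hs256({"sub": "admin"}, PUBLIC_KEY_PEM)
print(naive_hs256_verify(forged, PUBLIC_KEY_PEM))  # the forged signature checks out
print(alg_allowed(forged))                         # the RS256-only allow-list rejects it
```

Some JWT libraries add a guardrail by refusing a PEM-formatted key as an HMAC secret. That is worth having, but it is not a reason to leave the allow-list wider than the one algorithm you actually issue.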

The dependency surface expands without deliberate choice

When a developer reaches for a library, there is usually some deliberation. When an AI assistant writes an import statement, the model suggests what it has seen most frequently. The developer reads the code for correctness and accepts it. The dependency enters the project without anyone asking whether it should.

This creates three specific risks. More dependencies mean more supply chain attack surface. AI-suggested dependencies do not always exist — hallucinated package names give attackers a ready target via namespace squatting. And AI tools suggest version pins based on training data: if a CVE was disclosed after the training cutoff, the model will happily suggest the vulnerable version. The code works. The audit trail shows the version was pinned intentionally.
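One way to put the missing deliberation back is to make "has anyone looked at this package?" a mechanical question. The sketch below assumes a requirements.txt-style file and a team-maintained allow-list; REVIEWED, declared_packages, and unreviewed are hypothetical names, and fast_jwt_utils is a made-up package used only for illustration.

```python
import re

# Hypothetical allow-list: packages someone on the team has actually reviewed.
REVIEWED = {"requests", "pyjwt", "cryptography"}


def normalize(name: str) -> str:
    # Package-name normalization in the style of PEP 503:
    # runs of "-", "_" and "." collapse to "-", and case is ignored.
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_packages(requirements_text: str) -> list[str]:
    """Extract package names from requirements.txt-style text."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def unreviewed(requirements_text: str, reviewed=REVIEWED) -> list[str]:
    """Return declared packages that nobody has signed off on."""
    return [name for name in declared_packages(requirements_text) if name not in reviewed]


sample = """\
requests==2.31.0
PyJWT>=2.8
fast_jwt_utils==0.1.0  # made-up name, standing in for an assistant's suggestion
"""
print(unreviewed(sample))  # ['fast-jwt-utils']
```

A check like this does not say whether a package is safe or even whether it exists. It only forces the question to be asked before the import lands, which is the step the assistant skipped.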

Speed changed the security calculus

Development velocity with AI tools is genuinely higher. The security question is not whether AI makes writing code faster. It is whether the review process that used to catch security issues keeps up.

The honest answer is that it does not, at least not automatically. Review processes evolved when the bottleneck was writing code. When code appears faster than humans can review it with the same depth, something gives. Usually it is the slow, skeptical read where you ask why something was done a particular way and whether a safer alternative exists.

What changes for automated tooling

If human review of AI-generated code is faster and shallower, and the code has a higher base rate of certain issues — outdated dependencies, subtly wrong cryptographic patterns, hallucinated packages — then automated scanning needs to cover the gap.

Dependency scanning on every commit. If AI tools generate version pins that reflect a past state of the ecosystem, SCA tooling needs to check the current state on every change. With a weekly scan, a pin that looked fine in the model's training data but has had a CVE disclosed since can merge and sit unflagged for days.
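The core of that per-commit check is small. Here is a sketch under stated assumptions: pins use the exact `==` form, versions are purely numeric, and the advisory records have an invented shape ('package', 'fixed_in', 'id') rather than the format of any real feed. examplelib and EXAMPLE-0001 are placeholders.

```python
def parse_version(text: str) -> tuple[int, ...]:
    # Numeric dotted versions only; real tools also handle pre-releases and epochs.
    return tuple(int(part) for part in text.split("."))


def pinned(requirements_text: str) -> dict[str, str]:
    """Map package name to its exact pinned version."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def vulnerable_pins(requirements_text: str, advisories: list[dict]) -> list[tuple]:
    """Return (package, pinned_version, advisory_id) for pins older than the fix."""
    pins = pinned(requirements_text)
    hits = []
    for advisory in advisories:
        version = pins.get(advisory["package"])
        if version is not None and parse_version(version) < parse_version(advisory["fixed_in"]):
            hits.append((advisory["package"], version, advisory["id"]))
    return hits


# Placeholder data: in CI this would come from the advisory source your SCA tool uses.
advisories = [{"package": "examplelib", "fixed_in": "2.4.1", "id": "EXAMPLE-0001"}]
print(vulnerable_pins("examplelib==2.3.0\nother==1.0\n", advisories))
```

In a pipeline, a non-empty result would fail the build. The point of running it on every change is that the advisory list is fetched fresh each time, so the check reflects today's ecosystem rather than the one the model remembers.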

Secret detection in the PR. Models trained on real code have seen secrets in configuration files, test fixtures, and environment variable defaults. They reproduce those patterns. Scanning needs to run before approval, not after merge.
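As a rough illustration of what "in the PR" means, the sketch below scans only the lines a unified diff adds. The three patterns are a small, heuristic sample and not a complete ruleset; PATTERNS and scan_added_lines are hypothetical names, and the values in the sample diff are placeholders.

```python
import re

# A deliberately small, heuristic ruleset. Real scanners ship far more patterns.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded_credential": re.compile(
        r"(?i)(?:password|passwd|secret|api_key|token)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_added_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff_line_number, rule_name) for every added line that matches a rule."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only lines the PR introduces: "+" prefix, but not the "+++" file header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule))
    return findings


sample_diff = "\n".join([
    "+++ b/config.py",
    "+DEBUG = False",
    '-API_TOKEN = "old-value-removed"',
    '+DB_PASSWORD = "placeholder-not-real"',
])
print(scan_added_lines(sample_diff))  # [(4, 'hardcoded_credential')]
```

Restricting the scan to added lines keeps it fast enough to block approval, and it means the finding points at the change under review instead of at history nobody is looking at.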

Static analysis for common AI mistakes. Algorithm confusion, disabled verification options, overly permissive input validation — these appear in predictable contexts. Rules written for these patterns catch issues that generic linting misses.
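A rule for the JWT case can be a few dozen lines on top of Python's ast module. The sketch below is an assumption about what such a rule might look like, not a rule from any existing analyzer. It inspects any call to a method named decode, and only flags it when the keyword arguments match a known-bad shape; find_jwt_issues is a hypothetical name.

```python
import ast

SYMMETRIC = ("HS",)
ASYMMETRIC = ("RS", "ES", "PS")


def _string_items(node: ast.AST) -> list[str]:
    # String literals inside a list or tuple literal; anything dynamic is ignored.
    if isinstance(node, (ast.List, ast.Tuple)):
        return [e.value for e in node.elts
                if isinstance(e, ast.Constant) and isinstance(e.value, str)]
    return []


def find_jwt_issues(source: str) -> list[tuple[int, str]]:
    """Return (line, message) for risky keyword arguments on .decode(...) calls."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "decode":
            continue
        for kw in node.keywords:
            if kw.arg == "algorithms":
                algs = _string_items(kw.value)
                if (any(a.startswith(SYMMETRIC) for a in algs)
                        and any(a.startswith(ASYMMETRIC) for a in algs)):
                    issues.append((node.lineno, "mixed symmetric and asymmetric algorithms"))
                if any(a.lower() == "none" for a in algs):
                    issues.append((node.lineno, "'none' algorithm allowed"))
            if kw.arg == "options" and isinstance(kw.value, ast.Dict):
                for key, value in zip(kw.value.keys, kw.value.values):
                    if (isinstance(key, ast.Constant) and key.value == "verify_signature"
                            and isinstance(value, ast.Constant) and value.value is False):
                        issues.append((node.lineno, "signature verification disabled"))
    return issues


SAMPLE = '''
payload = jwt.decode(token, key, algorithms=["RS256", "HS256"])
claims = jwt.decode(token, key, options={"verify_signature": False})
ok = jwt.decode(token, key, algorithms=["RS256"])
'''
for line, message in find_jwt_issues(SAMPLE):
    print(line, message)
```

The rule only sees literal arguments. An algorithms list built at runtime passes silently, so this narrows the gap without closing it.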

SBOM generation at build time. If the dependency graph is partly AI-generated and reflects a past state of the ecosystem, knowing exactly what is in your tree becomes more valuable. When a CVE drops, you need to be able to answer whether you are affected within minutes, from data you already have, not after a manual search through every repository.
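With an SBOM on hand, "are we affected?" becomes a lookup. The sketch below assumes a CycloneDX-style JSON document, where components carry a name and version and may nest further components. find_component is a hypothetical helper, and the SBOM contents are invented placeholders.

```python
def find_component(sbom: dict, package: str) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs for every matching component, nested ones included."""
    hits = []
    stack = list(sbom.get("components", []))
    while stack:
        component = stack.pop()
        if component.get("name", "").lower() == package.lower():
            hits.append((component["name"], component.get("version", "")))
        # Components can contain their own components; walk those too.
        stack.extend(component.get("components", []))
    return sorted(hits)


# Placeholder SBOM; in practice this is json.load() on the file your build emitted.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "examplelib", "version": "2.3.0"},
        {
            "type": "application",
            "name": "service",
            "version": "1.0.0",
            "components": [
                {"type": "library", "name": "examplelib", "version": "2.1.0"},
            ],
        },
    ],
}
print(find_component(sbom, "examplelib"))  # [('examplelib', '2.1.0'), ('examplelib', '2.3.0')]
```

The second hit is the one that matters: a copy of the library that no requirements file mentions, and that nobody chose.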

The teams that get this right will not slow down. They route the speed gain through a security check that was already necessary and now has a clearer job to do.

Automated SBOM generation, SCA, and secret scanning are features of Lumstep — the platform this blog is part of.
