Edition 30: The SDLC is changing and so will AppSec (again)
Every time software development changes, so does AppSec. The LLM-powered coding era will be no different.

It’s an exciting time to be a software developer. There seems to be a new tool/model/approach every few months that promises to dramatically improve productivity. Sometimes it works, and other times it’s all hype.
While there are many amazing use cases for LLMs, code generation has probably been the area with the clearest impact. From the heady days of GPT-powered GitHub Copilot to the vibe-coding era of Cursor to the near-complete autonomy that Claude Code promises, it feels like we have gone through three generations of improvements in under two years.
While there is active debate about the efficacy of this phenomenon, there’s one particular hot take that troubles me: the claim that “software engineering has been completely transformed”. I think this is premature at best and crystal-ball gazing at worst (i.e., no real proof except vibes).
The above slide is from a talk I gave at TiEcon Silicon Valley in May 2025. I argued that for software teams of reasonable size, while the SDLC is undergoing a transformation again, the transformation is not yet complete. Code generation has definitely been transformed, but the code still lands in the same repo (GitHub/GitLab), goes through a similar review process (with some AI sprinkled on, though AI code review hasn’t seen the same success that AI code generation has), and passes through the same CI & CD steps before going live in production.
This may not be entirely true for small development teams or indie hackers. Tools like Bolt, Lovable, and Replit are collapsing the path from code generation to code deployment. However, in my conversations with developers and AppSec teams, this transformation hasn’t reached serious software teams at scale.
Once the transformation is complete, what will the SDLC look like? My super-duper-hot-take is “I don’t know.” Over the ~2 years that I’ve built an LLM-powered product company, I’ve learned that it’s crucial to respond to changes in AI, but there isn’t a great deal of value in predicting where things will ultimately end up. But here’s what I do know. Any new SDLC will need to answer (at a minimum) the following questions:
How should we manage prompts?
There are two kinds of prompts that are relevant: Prompts used by developers to generate code and prompts stored in GitHub that will be used by your production systems (or “Agents”, if you are all fancy). This point is more about the latter.
Today, prompts are stored the way code or config is. This is a problem, given that versioning works differently for prompts than for code: a one-word change to a prompt can shift behavior in ways no compiler or type checker will catch. Prompts also need a different cadence of “testing”. While we have reasonable frameworks for unit testing, what’s the framework for writing evals when prompts change? As of today, there are only a handful of battle-tested eval protocols, and most of them work well for one use case: chatbots.
When the SDLC is transformed, we will have battle-tested answers to how prompts should be stored (versioning) and how they should be tested (evals).
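To make the second half of that concrete, here is a minimal sketch of what a prompt eval might look like once it’s wired into CI the way a unit test is. Everything named here is hypothetical: get_completion() stands in for whatever client wraps your model, and evals/support_triage.json for a golden dataset versioned alongside the prompt itself.

```python
# A minimal sketch of a prompt eval harness, pytest-style. All names here
# are illustrative, not a real framework.
import json

PROMPT_VERSION = "support-triage/v3"  # prompts get explicit versions, like code


def get_completion(prompt_id: str, user_input: str) -> str:
    """Placeholder for a call to whatever wraps your model provider."""
    raise NotImplementedError


def test_triage_prompt_against_golden_set():
    # Golden examples live next to the prompt and are versioned with it.
    with open("evals/support_triage.json") as f:
        cases = json.load(f)
    failures = [
        case["id"]
        for case in cases
        if case["expected_label"] not in get_completion(PROMPT_VERSION, case["input"])
    ]
    # Unlike a unit test, an eval usually tolerates some noise: fail the
    # build only when more than 5% of golden cases regress.
    assert len(failures) <= 0.05 * len(cases), failures
```

The key difference from a unit test is the tolerance threshold: prompt behavior is probabilistic, so an eval gates on a regression rate rather than demanding every case pass.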
What should automated code-review and prompt-review look like?
Today, LLM-powered code review mimics human reviewers: it looks for errors, lack of adherence to standards (internal and external), possible security issues, and so on. While this is a good start, LLM-generated code brings additional risks to the forefront, such as:
Are there risky changes in this code that are unnecessary?
Is there a possibility that this change may lead to the consumption of a large number of tokens (and hence burn through cash)?
To be clear, these risks also exist with human-written code; it’s just that humans learn differently and make different kinds of mistakes. For instance, it’s apparent to a human developer that a small change request cannot possibly require a 12,000-line change that rethinks the software’s design. This does not need to be explicitly stated to a human developer. But when prompted insufficiently, even the best models can make such mistakes today.
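One way to catch that class of mistake mechanically is a scope guardrail in CI: compare the size of the diff against the stated scope of the request. The sketch below is illustrative, not a recommendation; the label name and the threshold are assumptions.

```python
# A rough sketch of a review-stage guardrail: flag changes whose diff size
# is wildly out of proportion to the request that produced them.
import subprocess

MAX_LINES_FOR_SMALL_CHANGE = 300  # arbitrary illustrative threshold


def changed_line_count(base: str = "origin/main") -> int:
    """Count added + removed lines in the current branch vs. base."""
    out = subprocess.run(
        ["git", "diff", "--numstat", base],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        added, removed, _path = line.split("\t")
        if added != "-":  # binary files report "-" for both counts
            total += int(added) + int(removed)
    return total


def check_scope(labels: set[str]) -> None:
    """Fail CI when a PR labelled as a small change rewrites half the repo."""
    if "small-change" in labels and changed_line_count() > MAX_LINES_FOR_SMALL_CHANGE:
        raise SystemExit("Diff is far larger than the stated scope; needs human review.")
```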
What does this mean for AppSec?
I have a theory that every time the SDLC changes, AppSec changes in subtle ways. The surest indicator is that we’ve had new leaders in AppSec with every iteration of the SDLC.
Some things remain the same with each iteration:
You have to secure the changes made
You have to watch out for supply chain issues when you introduce third-party code/libraries/APIs, and watch out for new defects introduced in those third-party components in production
A few things change with each generation of the SDLC:
The cadence of code being written and deployed (we are getting faster with each iteration)
There is new attack surface added with each generation (e.g., the CI/CD pipeline itself can be a target for attacks)
While the details may change, I expect these trends to hold for LLM-powered SDLCs too. However, I do think there are a few other trends, specific to this generation of changes, that need to be addressed:
The goal of new SDLCs is to reduce or remove bottlenecks in the process of writing, reviewing, and deploying code. In the pre-LLM world, the largest bottleneck in most modern teams was writing code: engineers may take hours or days to write it, reviews take minutes or hours, and deployment (in many cases) happens in minutes. With LLM-powered coding, the bottleneck moves from code generation to code review. Anecdotally, I’ve heard of Staff and Principal engineers drowning in code reviews, much of it AI slop. In this case, the bottleneck on code generation is not removed; it’s just moved to code review. In the medium term, I expect SOPs and new tooling to help clear this bottleneck.
The “thinking” is moving from the coding stage to the prompting/design stage. The bulk of security controls is implemented in code. While some of these are architectural issues, many of them are implementation details that good developers know about. When we abstract away code generation, we will have to provide specific instructions about these implementation details to Cursor/Claude Code. That happens in the prompting stage for feature-specific details and in config (e.g., a security.md) for company-wide or repo-wide details (a sketch of such a file follows after this list). AppSec programs and tooling need to address this shift.
There will be a lot of code generated by non-developers, and we must assume it will be less secure by comparison. In the last 10-15 years, the AppSec community has put a lot of effort into engaging developers. We’ve had varying amounts of success, but we’ve definitely moved the ball forward. There’s been minimal effort, however, in engaging Product Managers and Designers, who are now expected to push code.
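As promised above, here is a sketch of what a repo-wide security.md might contain. The specific rules are illustrative assumptions, not a recommended baseline, and require_auth() is a hypothetical helper name.

```markdown
<!-- security.md: repo-wide instructions picked up by the coding agent.
     The rules below are illustrative, not a vetted baseline. -->
# Security rules for this repository
- All database access goes through the ORM; never build SQL via string concatenation.
- Every new HTTP endpoint must call require_auth() before touching request data.
- Secrets come from the secrets manager; never hardcode keys, even in tests.
- Validate and length-limit all user input at the API boundary.
```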
Won’t Cursor just replace all of AppSec?
OK, so I know I said earlier in this post that I don’t want to make predictions about AI, but if you put a gun to my head and forced me to take a bet, I’d say Cursor (or other LLM-powered tools) will not just write magically secure code and replace all of AppSec. Here’s why:
There is no proof so far that code generated by LLMs is more or less secure than human-written code. Depending on your biases, you can find studies that point in either direction, but there is no definitive answer.
But can you not “prompt” Cursor to write secure code? Well, yes and no. We have found in our testing that providing security requirements to Cursor pushes it to write more secure code, but we now have to rely on the quality of that prompt (see the earlier point about thinking moving from code to prompts). Unless those requirements are well-written, things don’t improve (see the example after this list).
Tools like Cursor (or Claude Code) focus on code generation, but AppSec issues can emerge from other places too (supply chain, cloud configuration, etc.).
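To illustrate the point about requirement quality, here is a hypothetical contrast between a vague security requirement and one specific enough to change what the model writes (UPLOAD_ROOT is an invented name):

```text
Vague (unlikely to change the generated code):
  "Make sure the code is secure."

Specific (testable, tied to the feature being built):
  "This endpoint accepts a user-supplied file path. Resolve it against
  UPLOAD_ROOT, reject any path that escapes that directory after
  normalization, and return 403 (not the raw filesystem error) on rejection."
```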
On a slightly tangential note: a core principle of all risk management (including AppSec) is the maker-checker system. The party making the change should not be the one checking it. Security issues arise because of biases in systems, assumptions made by humans and tools, and so on. You can’t expect the tools that have these biases to also somehow check for those biases and remove them. Nothing I have seen from LLMs tells me that they are beyond this.
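In code, the maker-checker principle reduces to a very simple invariant; the toy sketch below (with invented identity strings) is only meant to show how mechanically enforceable it is:

```python
# Toy maker-checker gate: whatever identity produced a change cannot be the
# identity that approves it, whether human or model. Identity strings are
# invented for illustration.
def can_approve(author: str, reviewer: str) -> bool:
    """Reject self-review; the maker must not be the checker."""
    return reviewer != author


assert can_approve("agent:claude-code", "human:alice")
assert not can_approve("agent:claude-code", "agent:claude-code")
```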
That’s it for today! What changes are you seeing in the SDLC because of LLMs? How can AppSec keep pace? Is vibe-coding making you more productive or creating brain rot? Let me know! You can drop me a message on Twitter (or whatever it is called these days), LinkedIn, or email. If you find this newsletter useful, share it with a friend or colleague or on social media.