Edition 26: Scaling Security Design Reviews and why the time is now
"Developer enablement" is all the rage in AppSec and rightly so. The best time to do it is just before they start building.
The best AppSec teams spend a lot of effort enabling builders (Developers, DevOps, Architects, Product teams, etc.) to build securely. This usually means two things:
Build artifacts that help all developers: secure-by-default libraries, security standards, a security champions program, and developer security training.
Detect security defects introduced by developers as early as possible: SAST, DAST, IAST, manual pen testing, etc.
While the first category truly enables developers, it is not just-in-time. The latter does integrate with the SDLC but waits for developers to make mistakes before pointing them out.
There’s a third, more critical category: Provide developers contextual feedback on the precise feature they are building before they start writing code.
Hypothesis
Providing your developers with contextual security requirements is a great way to avoid design flaws and reduce obvious security bugs later in the lifecycle. Security Design Reviews (SDR) are a great way to do this. Thanks to Gen AI, SDR is having its Snyk moment (i.e., seamlessly integrating a security assessment into the developer workflow, with minimal manual involvement from Security). AppSec teams must perform automated SDR on every new feature their developers decide to build.
A 3-phase approach to building an SDR program
Before we dive deep into the “How,” here are a few things to keep in mind:
Define your goals. Do you want to “enable” developers or “enforce” these requirements? If it’s the former, your goal is to inform the developer and be done with it. If it’s the latter, you may need to find a way to “block” pipelines. This is much harder in the pre-coding phase.
FWIW - I am not a fan of enforcing requirements this early in the lifecycle. AppSec teams lack the context developers have, and we are better off passing on the information and letting them take the call on what needs to be implemented. But depending on the company’s culture, YMMV.
Who is doing the heavy lifting to generate the requirements? Developers, AppSec teams, or Security Champions? Is the goal to build a self-serve platform for developers or a platform to help AppSec teams scale the program? The program's UX will alter depending on the answer to this question.
It’s essential to understand how developers and product teams document their plans. Are all new features documented well in Jira? Is there a PRD and tech spec associated with every new feature? Is everything a Slack thread (or, god forbid, an MS Teams thread? :P)? Or does all planning happen on a whiteboard inside a conference room, with no trail of anything that happened?
As with any new initiative, it’s best to break the problem into smaller phases and learn from each. Here’s how I’d recommend building the program:
Phase 1: Understand the landscape and build stepping stones
Understand where developers and product teams track new features. If you are lucky, there may be a central location where things are tracked (SharePoint, Google Drive, and Jira are popular choices).
Decide the right stage to perform the design review. Unlike code, planning documents (such as design documents or PRDs) are not always in a “done” state. They usually go through drafts and reviews before they are considered “final.” You will have to choose when to assess/scan the document. There is no correct answer here. It may be best to pick a stage and run with it. It’s pretty easy to change this, so the cost of inaction may be higher than picking the wrong time.
Risk-rank each new feature. Define what makes a feature “high-risk” and leverage Gen AI to determine whether your scanning input fits the bill (a minimal sketch follows this list). Run continuously, this activity can help you understand how many new features need deeper security analysis.
Set a flag to indicate that the security team will perform manual assessments on High-risk features. For now, you can ignore medium and low-risk features. In other words, the tool's purpose (in this phase) is to highlight high-risk features. While this may seem like a small win, seasoned AppSec folks will know how hard it is to scale this with precision (and without manual inputs from developers).
Leverage open-source tools like the OpenAI Security team’s SDLCbot to do the heavy lifting on the AI portion. Be sure to modify the prompts to suit your company’s needs.
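To make the risk-ranking step concrete, here is a minimal sketch in Python. It assumes the OpenAI Python SDK with an API key in the environment; the risk criteria, model choice, and JSON shape are illustrative stand-ins for whatever your company defines, not a prescribed implementation.

```python
# Minimal sketch: classify a feature document as HIGH / MEDIUM / LOW risk.
# Assumes the OpenAI Python SDK (`pip install openai`) and OPENAI_API_KEY set;
# the criteria and model below are illustrative, not prescriptive.
import json

from openai import OpenAI

client = OpenAI()

RISK_CRITERIA = """Classify the feature as HIGH risk if it involves any of:
- authentication, authorization, or session handling
- PII, payment, or health data
- a new internet-facing interface
- changes to crypto, secrets, or trust boundaries
Otherwise, classify it as MEDIUM or LOW."""

def risk_rank(feature_doc: str) -> dict:
    """Ask the model for a risk tier plus a one-line justification."""
    response = client.chat.completions.create(
        model="gpt-4o",  # use whatever model your org has approved
        messages=[
            {"role": "system",
             "content": "You are an AppSec assistant. Respond only with JSON: "
                        '{"risk": "HIGH|MEDIUM|LOW", "reason": "..."}'},
            {"role": "user", "content": f"{RISK_CRITERIA}\n\nFeature:\n{feature_doc}"},
        ],
        response_format={"type": "json_object"},  # forces parseable JSON back
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    print(risk_rank("Add SSO login via Okta for the admin console."))
    # e.g. {"risk": "HIGH", "reason": "Touches authentication flows"}
```

A HIGH verdict is what would set the “manual assessment” flag from the list above, for example, as a label on the feature’s ticket.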
Phase 2: Auto-generate threats and requirements
Now that you can assess each new feature, figure out what the outcome of the assessment should be. Is it a list of threats using STRIDE? A list of security requirements for developers? Just a security-relevant summary of the document?
Once you have the above, generate assessment results that match your requirements. Projects like STRIDEGPT can help you get up and running quickly, but you must significantly modify the prompts and how input is handled to make it work for your company.
In general, single-prompt Gen AI apps are excellent for PoCs but hard to productionize. Use these tools to prove a concept. Only build the full-blown tool if the upside is significant enough to justify the effort needed. More on this later in the post.
The input (document, Jira ticket, etc.) will likely not contain sufficient information to produce the perfect output. Build the tool so that it can publish open questions or request more information from the developers.
Make it easy to send the feedback to developers/product teams.
Pet peeve: Don’t make developers and product teams visit your custom portal to understand requirements. They hate it and will probably find ways to avoid consuming them. Instead, publish them where they typically read requirements (a sketch of this follows below).
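As one example of publishing where developers already work, here is a sketch that posts generated requirements as a comment on the feature’s Jira ticket. It assumes Jira Cloud’s v2 REST API with an API token; the base URL, ticket key, and requirements are placeholders.

```python
# Minimal sketch: post generated security requirements onto the feature's Jira
# ticket so developers see them where they already read requirements.
# Assumes Jira Cloud's v2 REST API and an API token; URL/key are placeholders.
import os

import requests

JIRA_BASE = "https://your-company.atlassian.net"  # placeholder
AUTH = (os.environ["JIRA_EMAIL"], os.environ["JIRA_API_TOKEN"])

def publish_requirements(issue_key: str, requirements: list[str]) -> None:
    """Add all requirements as a single comment on the Jira issue."""
    body = "Security requirements for this feature:\n" + "\n".join(
        f"* {req}" for req in requirements  # Jira wiki-markup bullets
    )
    resp = requests.post(
        f"{JIRA_BASE}/rest/api/2/issue/{issue_key}/comment",
        json={"body": body},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()

publish_requirements(
    "PAY-123",  # hypothetical ticket key
    ["Require step-up auth for refund approvals over $10k",
     "Log every refund attempt with actor, amount, and outcome"],
)
```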
Phase 3: Build a feedback loop
Enable back-and-forth between the tool and the builder. Results should change if the document changes (one simple way to detect this is sketched after this list).
Make it easy for developers to answer the open questions and automatically update results based on the answers.
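Change detection can start as something as simple as hashing the document before each run. In the sketch below, `fetch_document` and `run_assessment` are hypothetical stand-ins for your document source (Confluence, Google Drive, etc.) and your SDR pipeline.

```python
# Minimal sketch of the feedback loop's trigger: re-run the assessment only
# when the planning document's content actually changes. `fetch_document` and
# `run_assessment` are hypothetical stand-ins for your document source and
# your SDR pipeline.
import hashlib
import json
from pathlib import Path

STATE_FILE = Path("sdr_state.json")  # last-seen content hash per document

def maybe_reassess(doc_id: str, fetch_document, run_assessment) -> None:
    state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    text = fetch_document(doc_id)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if state.get(doc_id) == digest:
        return  # document unchanged since the last run; keep prior results
    run_assessment(doc_id, text)  # regenerate threats/requirements/questions
    state[doc_id] = digest
    STATE_FILE.write_text(json.dumps(state))
```

Run this on a schedule or from your document platform’s change webhooks; either way, stale results get refreshed the next time the document moves.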
A few thoughts before we end:
Why hasn’t this been done before? The input to SDR is unstructured. Traditional security tools need structured data to run rules against (SAST→code, DAST→traffic, and so on). All this changes with Gen AI: LLMs can extract context from unstructured data such as documents and architecture diagrams (a small illustration follows these thoughts). A lot more needs to be done to scale SDR, but the underlying technical challenge is now a solved problem.
PoC vs. Production: Building a compelling PoC will be simple, but building a full-blown solution will require human and tech investment. Your best bet may be to build a PoC, show it to your stakeholders, and gauge whether a full-blown product will indeed be helpful for your company. If yes, commit sufficient resources and build it. If you have an ML team in your organization, involving them now may be a good idea (you don’t need them for the PoC).
Build vs. Buy: Over the last nine months, I have spoken to hundreds of folks in AppSec teams, startups, and large security companies, and they all agree that we can use Gen AI to scale threat modeling. There is no doubt that multiple companies will provide this offering. Having said that, if you have a security engineering team with reasonable dev + LLM chops, this problem can be solved internally. As discussed in edition 9, you should start with an attempt to “build” and move on to “buy” only if the trade-offs aren’t worth it.
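As a small illustration of that first point, the sketch below asks an LLM to turn free-form design text into a structured component and data-flow inventory that downstream rules or prompts can reason about. The JSON schema and model choice are assumptions; architecture diagrams would additionally need a multimodal model.

```python
# Minimal sketch: turn free-form design text into a structured architecture
# inventory (components, data flows, trust boundaries). Assumes the OpenAI
# Python SDK; the JSON schema below is illustrative, not prescriptive.
import json

from openai import OpenAI

client = OpenAI()

SCHEMA_HINT = (
    'Respond only with JSON like {"components": ["..."], '
    '"data_flows": [{"from": "...", "to": "...", "data": "..."}], '
    '"trust_boundaries": ["..."]}'
)

def extract_architecture(design_doc: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",  # any capable model your org approves works here
        messages=[
            {"role": "system",
             "content": "Extract the system architecture from this design. " + SCHEMA_HINT},
            {"role": "user", "content": design_doc},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

print(extract_architecture(
    "The mobile app calls a new /refunds API through the gateway; "
    "the API writes refund records to the payments Postgres database."
))
```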
Do you have the Gen AI chops needed to build this?
In this post, I am not going too deep into the Gen AI parts of building SDR. There’s much to consider, from eval frameworks to having a sufficient “ground truth” dataset to test your tool against. We will dive deeper into this in a separate post. For now, it’s important to know that if you are part of an internal security team with a culture of building custom solutions, automated SDR is now a solvable problem.
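As a taste of what testing against ground truth can look like, here is a deliberately naive eval sketch that scores generated threat titles against a hand-labeled set. A real eval framework would use semantic matching (an embedding or LLM judge) rather than exact strings, and the sample data here is made up.

```python
# Deliberately naive eval sketch: score generated threats against a
# hand-labeled ground-truth set. A real framework needs semantic matching
# (an embedding or LLM judge), not exact strings; the sample data is made up.
def score(generated: set[str], expected: set[str]) -> dict:
    hits = generated & expected
    return {
        # missed threats are the costly failure mode for an SDR tool
        "recall": len(hits) / len(expected) if expected else 1.0,
        # noisy output erodes developer trust just as fast
        "precision": len(hits) / len(generated) if generated else 0.0,
    }

ground_truth = {"IDOR on /refunds", "Missing rate limit on OTP endpoint"}
tool_output = {"IDOR on /refunds", "Verbose error messages"}
print(score(tool_output, ground_truth))  # {'recall': 0.5, 'precision': 0.5}
```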
Sponsor
Everyone understands the value of threat modeling, but most teams have a hard time scaling it. Seezo is on a mission to leverage Gen AI to automate threat modeling: generating security requirements for developers, potential threat reports for your AppSec team, and a penetration testing checklist for your pen testers. Sign up for early access or schedule a call to learn more about us.
In conclusion
Scaling threat modeling is an idea whose time has come. Implementing automated Security Design Reviews to provide developers with security requirements is a quick win. Beyond the obvious benefits for developers, these techniques can also help meet compliance obligations, especially in industries like healthcare and fintech.
That’s it for today. Do you think the benefits of Gen AI are overblown? Are there other ways to scale threat modeling? Let us know! You can drop me a message on Twitter (or whatever it is called these days), LinkedIn, or email. If you find this newsletter useful, share it with a friend or colleague, or post it on your social media feed.