Secure Coding Education Throughout the Entire SDLC is Critical

Dec. 18, 2023
In the race to embrace GenAI for coding, let’s not forget the importance of human reviews

It sometimes feels like generative AI is taking over the world. No one can predict what the true economic and social impact of the technology will be—although it’s already emerging as a game changer in areas like customer service, sales and marketing, R&D, and coding. On the latter, we’re already seeing tools hit the market to help supercharge the productivity of DevOps teams. But this is where the danger lies. Gartner research shows over a third of organizations are deploying AI-powered application security tools to help identify and correct any issues arising from GenAI-generated code.

There is a lack of maturity in the AI space right now, and this approach can backfire in devastating ways. If AI misses a problem in the first place, will it be able to pick it up in review? Instead of tool chasing, we should remember the basics. GenAI is great, but only if we recognize its limitations and ensure humans are on hand to perform code reviews. That, in turn, requires security-knowledgeable humans, and therefore continuous, in-depth secure coding education.

How GenAI is Transforming Software Development

How transformational could GenAI be? McKinsey estimates it could add $2.6 trillion to $4.4 trillion in value annually across 63 use cases. In software engineering specifically, its direct impact on productivity could range from 20% to 45% of current annual spending.

That comes from accelerating initial code drafts, code correction and refactoring, root-cause analysis, and the generation of new system designs, as well as freeing up engineers' time for higher-value tasks like code and architecture design. There are claims that developers using Microsoft's GitHub Copilot completed tasks 55% faster than those without it, but these success stories must not distract from the risks.

GenAI may use outdated third-party libraries containing known security vulnerabilities and fail to follow secure coding standards when generating code, leading to insecure software. A study from the University of Quebec found that in only five of 21 cases did the GenAI model produce secure code at the first request; when told to correct the code, it did so in just seven cases. A separate paper from Stanford University claimed that developers using GenAI produced less secure code than those not using the tools, yet believed their code was more secure than if they had written it by hand.
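To make that concrete, here is a minimal, hypothetical sketch (not taken from either study) of the kind of flaw reviewers routinely catch in assistant-generated code: a query built with string formatting instead of parameters. The function and table names are illustrative only.

    # Hypothetical example: an insecure pattern an AI assistant can
    # produce, and the fix a human reviewer should insist on.
    import sqlite3

    def find_user_insecure(conn: sqlite3.Connection, username: str):
        # Assistant-style output: the query is built by string formatting,
        # leaving the call open to SQL injection.
        query = f"SELECT id, email FROM users WHERE name = '{username}'"
        return conn.execute(query).fetchall()

    def find_user_reviewed(conn: sqlite3.Connection, username: str):
        # Reviewed version: a parameterized query, so the database driver
        # handles the input instead of the code trusting it.
        return conn.execute(
            "SELECT id, email FROM users WHERE name = ?", (username,)
        ).fetchall()

    if __name__ == "__main__":
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
        conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com')")
        payload = "' OR '1'='1"                    # crafted input
        print(find_user_insecure(conn, payload))   # leaks every row
        print(find_user_reviewed(conn, payload))   # returns []

Static scanners flag some of these patterns, but a reviewer with secure coding training recognizes the whole class of mistake, not just the signatures a tool happens to know.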

Chasing Tools Down a Dead-End Lane

To improve code security, organizations are increasingly investing in application security tools. The risk is that these tools simply create another layer of false comfort: organizations keep placing too much trust in tooling that should be the safety net when other processes fail, not a solution in itself. The Gartner study cited earlier claims that 34% of organizations are using or currently implementing unspecified “AI application security” tools to address risks related to GenAI code. Asked about their security concerns, most point to leaked secrets (57%) and incorrect or biased outputs (58%).

It can be tempting to see the solution to technology-related challenges like this as simply implementing more technology—this time in the form of AI-based appsec tools. But despite what vendors of such tools may proclaim in their marketing, this might create more problems than it solves.

This is just another example of the tool chasing that has become endemic in cybersecurity—leading to IT teams buckling under the weight of too many ineffective and little-used point solutions. If GenAI made a mistake that led to leaked secrets, incorrect outputs, or vulnerable code, what’s to say that the AI tool performing the review won’t do the same? The bottom line is that AI tools often make mistakes when coding because they lack the same level of context that a human expert can bring to the table. It’s also true that if large language models (LLMs) are trained on code containing mistakes, they will absorb and potentially repeat these errors.

Unfortunately, there was no recognition of these challenges in a recent Presidential Executive Order on “Safe, Secure, and Trustworthy Artificial Intelligence.” In that document, the White House pointed to the importance of establishing “an advanced cybersecurity program to develop AI tools to find and fix vulnerabilities in critical software.” It did not acknowledge that the focus should be on writing secure code in the first place, which also means ensuring GenAI-generated code is as secure as possible.

Time For Continuous Training

We need code generated by GenAI to be reviewed by human eyes. Ideally, this should be done by developers with application security expertise, who know what typical errors and omissions to look out for and are armed with a security checklist to add another level of rigor to these reviews.

Better training can help here, first by teaching developers to understand and spot typical GenAI errors, and then by equipping them to fix those errors early in the software development lifecycle (SDLC). Lessons can also empower developers to use AI assistants more productively, safely, and effectively in the first place. Understanding which queries and prompts to use, and how best to lean on vetted helper functions, can result in more secure code, which in turn means less pressure on testing and review teams. That's the kind of security by design all organizations should aspire to.
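As one illustration of the helper-function point, a team might maintain a small, vetted utility for credential handling and prompt the assistant to use it, rather than letting generated code improvise. This is a minimal sketch under assumed names (get_secret and DB_PASSWORD are hypothetical, not from any particular product):

    # A vetted helper the team steers AI assistants toward, so secrets
    # never appear as literals in generated code.
    import os

    class MissingSecretError(RuntimeError):
        """Raised when a required secret is absent from the environment."""

    def get_secret(name: str) -> str:
        # Secrets are injected into the environment by a vault or CI,
        # never hardcoded in source.
        value = os.environ.get(name)
        if not value:
            raise MissingSecretError(f"secret {name!r} is not set")
        return value

    # Assistant output often looks like:  connect(password="hunter2")
    # Prompting it to "use get_secret for all credentials" yields code
    # that is easier to review:           connect(password=get_secret("DB_PASSWORD"))

A simple convention like this also speeds up the human review: any string literal passed as a credential becomes an automatic red flag.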

This is an ongoing journey with much more road left to run, so any training program must be continuous. That will ensure DevOps teams have the latest knowledge at their fingertips to reduce cyber risk and improve code quality as GenAI and the threat landscape evolve. Training programs should also include everyone with a part to play in the SDLC, from QA to user experience, meeting each learner at their current level. This takes pressure off developers and appsec teams and builds a culture of security across the organization.

“Organizations that don’t manage AI risk will witness their models not performing as intended and, in the worst case, can cause human or property damage. This will result in security failures, financial and reputational loss, and harm to individuals from incorrect, manipulated, unethical or biased outcomes,” warned Gartner Distinguished VP Analyst Avivah Litan in the firm’s recent research.

Best not to take that risk by assuming AI security tools can mitigate GenAI coding risks. Instead, let's bring security by design back to basics, starting with continuous secure coding education for everyone involved in the SDLC.

Michael Burch is the Director of Application Security at Security Journey, where they are responsible for creating vulnerable code examples and managing a team of application security engineers. Burch's work experience spans various industries and roles; before taking on this position, they worked as an Application Security Engineer at the same company, developing security training content and leading the creation of secure coding best practices.