Protecting GenAI: A Guide to OWASP Top 10 for LLMs
The OWASP Top 10 for LLM Security, published by the Open Web Application Security Project, outlines the most important best practices for securing large language models and generative AI systems. It highlights key risks such as prompt injection, data poisoning, and insecure access controls, offering clear guidance to help organizations strengthen their AI security posture and reduce exposure to LLM-specific threats.
Large language models (LLMs), the technology that powers generative AI and agentic AI, have opened up a host of opportunities for businesses to operate more efficiently and at greater scale. But they’ve also created a variety of novel LLM security risks, such as:
- Prompt injection vulnerabilities, which attackers can use to “trick” LLMs into exposing sensitive data or performing malicious actions.
- The “poisoning” of training data with malicious information designed to manipulate model behavior.
- Insecure management of embeddings, another potential vector through which attackers could exfiltrate sensitive information.
- Lack of sufficient access controls for restricting how users interact with LLMs or AI agents.
Unfortunately, traditional application security solutions don’t adequately address these risks, because most of them simply don’t exist in traditional applications.
This is why OWASP (the Open Web Application Security Project), a nonprofit devoted to cybersecurity advocacy and guidance, has developed the OWASP Top 10 for LLMs, a list of the most critical LLM security risks and the best practices for mitigating them, as part of its GenAI security project. While OWASP’s recommendations aren’t enough to guarantee a strong security posture on their own, they’re an excellent starting point for AI security.
To provide guidance on putting the OWASP Top 10 for LLMs to use, this article breaks down each of the practices OWASP recommends for protecting LLMs and explains how to apply them in the real world.
The OWASP Top 10 for LLMs includes:
- Prompt injection
- Sensitive information disclosure
- LLM supply chain risks
- Data and model poisoning
- Improper output handling
- Excessive agency
- System prompt leakage
- Vector and embedding weaknesses
- Misinformation
- Unbounded consumption
First released in 2023, the OWASP Top 10 for LLM list covers (you guessed it!) ten areas of concern related to LLM security. The categories overlap a bit, but each mostly focuses on a different type of risk or potential attack.
These areas of focus have evolved significantly since OWASP first released its list. For example, the group has added guidance related to LLM supply chain security. Let’s take a look at each OWASP LLM security recommendation in detail.
A detailed look at OWASP Top 10 for Large Language Model apps
1. Prompt injection
Prompts are the requests or instructions that users issue to LLMs. Most LLMs include guardrails designed to prevent them from exposing sensitive data or performing harmful actions, but attackers can use carefully crafted prompts to bypass these guardrails.
Prompt injection attacks typically have one of two goals:
- Exfiltrate sensitive or private data: For instance, an attacker could use a malicious prompt to request information that the LLM knows about another user or business.
- Carry out malicious actions: If the LLM is connected to AI agents (meaning software tools or applications that can perform actions in response to instructions from an LLM), prompt injection attacks could manipulate the agents in ways that cause harm. For example, a model could be “tricked” into telling an agent to delete a sensitive file.
The simplest way to mitigate prompt injection risks is to filter prompts by inspecting them before they reach the LLM. Filtering makes it possible to identify and block malicious requests.
It’s also possible to filter model output. Doing so doesn’t prevent prompt injection per se because malicious prompts can still reach LLMs. But it can help to mitigate prompt injection’s impact by filtering out risky responses.
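As a minimal sketch of this two-sided filtering approach, the Python snippet below screens prompts against a denylist of common injection phrases and redacts risky content from responses. The pattern lists, function names, and the call_model callable are illustrative assumptions, not part of any OWASP tooling or specific LLM API.

```python
import re

# Hypothetical denylist of phrases commonly seen in prompt injection attempts.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
    r"disregard your guardrails",
]

# Hypothetical patterns for risky content in model output (e.g., credentials).
OUTPUT_PATTERNS = [
    r"api[_-]?key\s*[:=]\s*\S+",
    r"-----BEGIN (RSA )?PRIVATE KEY-----",
]

def is_suspicious_prompt(prompt: str) -> bool:
    """Return True if the prompt matches a known injection pattern."""
    return any(re.search(p, prompt, re.IGNORECASE) for p in INJECTION_PATTERNS)

def sanitize_output(response: str) -> str:
    """Redact risky content from model output before it reaches the user."""
    for pattern in OUTPUT_PATTERNS:
        response = re.sub(pattern, "[REDACTED]", response, flags=re.IGNORECASE)
    return response

def guarded_completion(prompt: str, call_model) -> str:
    """Wrap any model call (call_model: prompt -> text) with input filtering
    and output sanitization."""
    if is_suspicious_prompt(prompt):
        return "Request blocked: prompt failed security screening."
    return sanitize_output(call_model(prompt))
```

In practice, simple denylists are easy to evade, so production systems typically layer them with dedicated prompt-security classifiers; the point here is the placement of the checks on both sides of the model call.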
2. Sensitive information disclosure
Large language models frequently have access to vast amounts of private information, ranging from proprietary code to personally identifiable information (PII) and far beyond. This information may be included in their training data. It could also be made accessible to models via techniques like retrieval-augmented generation (RAG), which feeds them additional information that was not part of the training data. In addition, data that users share with models when feeding them prompts could include sensitive information.
Normally, LLM developers design models to prevent them from leaking sensitive information. For example, they may instruct models to recognize PII and avoid sharing it in output. But due to issues like flaws in model design, insufficient user access controls, or malicious prompts, sensitive information can still leak out.
One way to mitigate this risk is to control which data models can access in the first place by filtering sensitive information out of training data and RAG datasets. Model prompts can also be inspected for sensitive information, which can then be blocked before it reaches the model.
However, since it’s difficult to guarantee that models will never leak sensitive data, monitoring model output is also an important mitigation technique. Organizations should inspect output for sensitive data and, if possible, block it before it reaches its destination.
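The sketch below shows one simple way to apply this idea on both sides of the model boundary: regex-based redaction of a few common PII formats. The patterns are illustrative only; real deployments usually rely on dedicated PII classifiers or DLP services rather than a handful of regexes.

```python
import re

# Illustrative PII patterns; not exhaustive and prone to false positives.
PII_PATTERNS = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "credit_card": r"\b(?:\d[ -]?){13,16}\b",
}

def redact_pii(text: str) -> str:
    """Replace detected PII with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, f"[{label.upper()} REDACTED]", text)
    return text

def safe_exchange(prompt: str, call_model) -> str:
    """Apply the same redaction to prompts before they reach the model and to
    responses before they reach the user (call_model: prompt -> text)."""
    return redact_pii(call_model(redact_pii(prompt)))
```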
3. LLM supply chain risks
Few organizations build LLMs entirely from scratch, which is a tremendously complex endeavor. Instead, they often start with third-party “foundation models,” which they then modify to suit their unique use cases or needs. They may also use third-party data to support tasks like training.
While these practices can speed up AI development, they also introduce significant risks because vulnerabilities within third-party LLMs or related resources can impact the organizations that use them.
The best way to mitigate LLM supply chain risks is to follow the best practices that apply to software supply chain security in general: Avoid using resources from untrusted third parties, and carefully scan or inspect resources you do use (no matter how much you trust the source).
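One concrete way to apply the “inspect what you use” principle is to verify downloaded model artifacts against checksums published by their source before loading them. The sketch below assumes a hypothetical expected_sha256 value obtained from the vendor or registry; it is not tied to any particular model hub.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file without loading it all into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model_artifact(path: Path, expected_sha256: str) -> None:
    """Refuse to load a model file whose hash does not match the published value."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise RuntimeError(
            f"Checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )

# Example usage (hypothetical path and checksum):
# verify_model_artifact(Path("models/foundation-model.bin"), "ab12...ef")
```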
4. Data and model poisoning
LLMs work by parsing large volumes of data and recognizing patterns – a process known as training. The contents of the data play a key role in determining how models behave and which output they generate.
This means that, by introducing malicious data into training data sets, attackers can potentially manipulate model behavior.
Attackers may also inject malicious code into models as another way of “poisoning” their operations.
One step toward mitigating these risks is to secure LLM supply chains, since training data and model code obtained from third-party sources could be a vector for attack.
It’s also a best practice to scan the data and code for signs of malicious content. Finally, strong access controls should exist to restrict who can modify model data and code.
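As a rough illustration of data scanning, the sketch below filters a JSONL training file, dropping records that match a few suspicious patterns. The file format (a "text" field per line) and the pattern list are assumptions for the example; a real pipeline would combine heuristics, anomaly detection, and provenance checks.

```python
import json
import re
from pathlib import Path

# Illustrative indicators of potentially poisoned records.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"<script\b",
    r"curl\s+https?://\S+\s*\|\s*sh",
]

def looks_poisoned(record_text: str) -> bool:
    """Flag records containing injection-style or executable content."""
    return any(re.search(p, record_text, re.IGNORECASE) for p in SUSPICIOUS_PATTERNS)

def filter_training_file(src: Path, dst: Path) -> int:
    """Copy a JSONL training file, dropping flagged records.
    Returns the number of records removed."""
    removed = 0
    with src.open() as fin, dst.open("w") as fout:
        for line in fin:
            record = json.loads(line)
            if looks_poisoned(record.get("text", "")):
                removed += 1
                continue  # quarantine or log in a real system
            fout.write(line)
    return removed
```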
5. Improper output handling
Improper output handling occurs when a model’s output is not properly validated and sanitized before it is passed to users or downstream systems.
For example, if a model includes PII in an output and no scanning process takes place to detect the PII and assess whether it should be accessible to the model user, this would be a case of improper output handling.
Improper output handling doesn’t stem from flaws in LLMs themselves; it has to do with the way organizations manage the output generated by LLMs.
The best way to mitigate this category of risk is to ensure that model output is systematically inspected to detect sensitive data and, where necessary, block or anonymize it before it reaches users.
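A minimal sketch of such an output-handling step is shown below: it withholds responses that contain sensitive markers and HTML-escapes whatever is passed on, so the output cannot be rendered as active content if it ends up in a web page. The marker patterns are hypothetical examples, not an OWASP-defined list.

```python
import html
import re

# Hypothetical markers of sensitive data in model output.
SENSITIVE_MARKERS = [
    r"\bpassword\s*[:=]",
    r"\b\d{3}-\d{2}-\d{4}\b",  # SSN-like format
]

def handle_model_output(raw_output: str) -> str:
    """Validate and sanitize model output before it is shown to a user:
    block output containing sensitive markers, then HTML-escape the rest so it
    cannot execute as markup downstream."""
    if any(re.search(p, raw_output, re.IGNORECASE) for p in SENSITIVE_MARKERS):
        return "Response withheld: output failed sensitive-data screening."
    return html.escape(raw_output)
```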
6. Excessive agency
Excessive agency is a broad term that refers to any type of instance where an LLM performs actions that should not be available to it. The actions could be the result of malicious activity, like prompt injection. But they could also include behavior (like deleting a file or accessing a sensitive database) that is outside the scope of an LLM’s intended use case.
Models don’t perform actions on their own; they rely on integrations with external systems, like AI agents, to take action on their behalf.
For this reason, the best way to prevent excessive agency is to restrict the integrations or extensions available to models.
Monitoring model interactions with external tools can also help.
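One common way to restrict agency is an explicit tool allowlist: the agent can only invoke functions that have been deliberately registered, and destructive operations are simply never exposed. The registry and tool names below are hypothetical placeholders.

```python
from typing import Callable, Dict

# Hypothetical registry of agent tools; only allowlisted tools are exposed
# to the model, and destructive operations are deliberately excluded.
ALLOWED_TOOLS: Dict[str, Callable[[str], str]] = {
    "search_docs": lambda query: f"results for {query!r}",
    "get_weather": lambda city: f"forecast for {city}",
    # Deliberately NOT registered: delete_file, run_shell, drop_table, ...
}

def dispatch_tool_call(tool_name: str, argument: str) -> str:
    """Execute a tool requested by the model only if it is on the allowlist."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        # Refuse rather than guess; unexpected tool requests are a useful
        # signal of prompt injection or excessive agency.
        return f"Tool {tool_name!r} is not permitted for this agent."
    return tool(argument)
```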
7. System prompt leakage
LLMs often contain embedded instructions, known as system prompts, that guide overall model behavior. Normally, end users should not be able to discover or manipulate system prompts; only model developers have a legitimate reason to control this information. However, through techniques like prompt injection, attackers may be able to extract or abuse system prompts.
Mitigating system prompt leakage risks primarily involves designing models in ways that minimize the amount of sensitive information included in system prompts. Ideally, technical information, such as access credentials, will be stored externally, rather than integrated directly into a model.
In addition, filtering input and output can help to detect malicious prompts (and their resulting outputs) that aim to manipulate system prompts.
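The sketch below illustrates both ideas: secrets are read from the environment at the point of use rather than embedded in the system prompt, and responses that echo the system prompt back to the user are blocked. The prompt text and the BACKEND_API_KEY variable name are hypothetical.

```python
import os

# Keep secrets out of the system prompt entirely; it contains only
# behavioral instructions.
SYSTEM_PROMPT = (
    "You are a support assistant for ExampleCorp. "
    "Answer only questions about ExampleCorp products."
)

def backend_api_call(path: str) -> str:
    """The API key never appears in any prompt, so the model cannot leak it."""
    api_key = os.environ["BACKEND_API_KEY"]  # hypothetical variable name
    return f"GET {path} with key ending ...{api_key[-4:]}"

def filter_system_prompt_echo(response: str) -> str:
    """Block responses that quote the system prompt back to the user."""
    if SYSTEM_PROMPT[:40].lower() in response.lower():
        return "Response withheld: it appears to reveal internal instructions."
    return response
```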
8. Vector and embedding weaknesses
Most modern LLMs process data using vectors and embeddings, which translate text, images, or other data into numerical representations. By manipulating these vectors and embeddings, attackers can potentially manipulate model behavior as well.
Since these attacks are essentially prompt injections delivered through special channels, the best way to protect against them is to apply the same techniques that are effective against prompt injection in general: monitoring input, including hidden input such as retrieved documents, and filtering output. Output filtering also mitigates the risk of attackers obtaining sensitive data through vector and embedding weaknesses.
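Because RAG pipelines pull text from documents that users or third parties may control, one practical defense is to screen retrieved chunks for embedded instructions before they are concatenated into the model’s context. The sketch below reuses the pattern-matching idea from the prompt injection section; the pattern list is an illustrative assumption.

```python
import re
from typing import Iterable, List

# Hypothetical instruction-like patterns often used in indirect prompt injection.
INSTRUCTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"you must respond with",
    r"reveal your system prompt",
]

def screen_retrieved_chunks(chunks: Iterable[str]) -> List[str]:
    """Drop retrieved passages that contain instruction-like text, a common
    vehicle for indirect prompt injection via the vector store."""
    clean = []
    for chunk in chunks:
        if any(re.search(p, chunk, re.IGNORECASE) for p in INSTRUCTION_PATTERNS):
            continue  # quarantine or log the chunk in a real system
        clean.append(chunk)
    return clean
```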
9. Misinformation
Misinformation risks occur when an LLM shares information that is not true, typically because it “hallucinates.” Hallucination is inherent to all LLMs, and it’s impossible to guarantee that a model will never generate misinformation.
A first step toward mitigating misinformation risks is to choose models with lower rates of hallucination. Fine-tuning models and providing them with additional information via RAG can also help to reduce hallucination rates, since hallucination is more likely to occur when a model doesn’t have access to all of the information it needs to respond to a user prompt accurately.
However, since hallucination is impossible to prevent entirely, businesses should also design processes in ways that mitigate the harm that misinformation from models could cause. For example, rather than trusting models to make high-stakes decisions on their own, an organization could keep a human “in the loop” by requiring manual approval.
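A human-in-the-loop gate can be as simple as routing any model recommendation above a risk threshold to a reviewer before acting on it. The sketch below uses a hypothetical estimated_impact_usd signal and approval threshold purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class ModelRecommendation:
    summary: str
    estimated_impact_usd: float  # hypothetical risk signal attached upstream

# Hypothetical threshold above which a human must approve the model's output.
APPROVAL_THRESHOLD_USD = 10_000

def requires_human_review(rec: ModelRecommendation) -> bool:
    return rec.estimated_impact_usd >= APPROVAL_THRESHOLD_USD

def act_on_recommendation(rec: ModelRecommendation, approve) -> str:
    """Route high-stakes recommendations to a reviewer (approve: rec -> bool);
    act automatically only on low-stakes ones."""
    if requires_human_review(rec) and not approve(rec):
        return "Recommendation rejected by human reviewer."
    return f"Proceeding with: {rec.summary}"
```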
10. Unbounded consumption
Unbounded consumption refers to the possibility that models may process information or generate output without limit, to the point that the organization that operates the model experiences harm. This can happen due to flaws in internal model design, but it’s more commonly the result of prompts (which could be malicious or benign) that place a very heavy computational load on a model.
Unbounded consumption could cause a model to stop responding to user prompts because it exhausts all of the resources available to it. It can also cost an organization a lot of money, since companies have to pay for the CPU and memory resources that a model consumes.
Preventing unbounded consumption is relatively straightforward: When deploying models, organizations should take steps to restrict how many prompts users can issue within a given time period, and how much data they can include in each prompt. It can also help to monitor model resource consumption levels and halt the processing of prompts that are causing a model to consume excessive CPU and memory.
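The sketch below shows a simple per-user admission check combining a prompt-size cap with a sliding-window rate limit. The specific limits are placeholders to be tuned to your workload and budget; real deployments often enforce these limits at an API gateway instead of in application code.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits; tune to your workload and budget.
MAX_PROMPTS_PER_MINUTE = 20
MAX_PROMPT_CHARS = 8_000

# Per-user sliding window of recent request timestamps.
_request_log = defaultdict(deque)

def admit_prompt(user_id: str, prompt: str) -> bool:
    """Enforce per-user rate and prompt-size limits before calling the model."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False

    now = time.monotonic()
    window = _request_log[user_id]
    # Drop timestamps older than 60 seconds from the sliding window.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_PROMPTS_PER_MINUTE:
        return False

    window.append(now)
    return True
```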
Beyond OWASP Top 10: Future outlook and evolving AI threats
The OWASP Top 10 Best Practices are a great starting point for helping to secure LLMs against the threats they face today. But one of the tricky things about AI security is that the AI technology landscape is evolving very rapidly, and the protections that suffice today may not be enough in the future.
For example, as multimodal capabilities (which allow models to process information in multiple forms – text, images, video, and more) become common, threat actors could potentially find new ways of injecting prompts by uploading non-textual data to models. Likewise, AI supply chains are growing increasingly complex as more and more AI software vendors and open source projects emerge, which means organizations will need to contend with a growing range of supply chain risks.
Agentic AI, too, represents a fast-changing frontier that brings many novel security risks with it. When businesses allow AI agents to carry out actions autonomously, they face challenges like accidental deletion of critical data (which has already happened to one unfortunate company), malicious prompts against agents, and supply chain attacks that target the code that powers agents (which, like LLMs, often depend on a variety of third-party software development resources).
This is why it’s important to keep following the OWASP Top 10 recommendations for LLMs as they evolve. Tracking the landscape of generative AI security solutions (which OWASP monitors in a separate project, and which Aqua can also help to navigate) is critical, too, since the types of protections that businesses need to secure AI will evolve along with AI technology and risks.
A proactive approach to AI security
Being proactive has always been a core tenet of effective cybersecurity. But in the age of generative AI, the ability to identify and respond to threats before they turn into breaches is more important than ever – which is why IT admins, software developers, and cybersecurity professionals should pay close attention to the OWASP Top 10 LLM security risks, which represent some of the most up-to-date guidance about LLM security.
It’s also why organizations should bolster their cybersecurity arsenals by investing in both AI security posture management (AI-SPM) solutions and runtime security agents that can help protect AI-based workloads when they are in operation. Used alongside traditional application security and runtime security tools, AI security tools help to ensure that businesses can leverage AI fully to enhance efficiency and scalability, while keeping AI security risks in check.
