Aqua Blog

OPA Gatekeeper Bypass Reveals Risks in Kubernetes Policy Engines

Implementing Kubernetes securely can be a daunting task. Fortunately, the K8s toolshed includes tools that provide out-of-the-box solutions with a single click. One such tool is OPA Gatekeeper, a ready-made security checkpoint for enforcing security policies on Kubernetes. But are users using it correctly? Do they understand its limitations? Our new research says not necessarily!

In this blog, we dive into the potential risks of Kubernetes policy enforcement, focusing on how seemingly secure rules, such as those used in OPA Gatekeeper, can be bypassed if not carefully configured. We uncover ways to bypass the k8sallowedrepos policy and demonstrate how minor misconfigurations, such as a missing trailing slash, can open the door to unauthorized actions.

We explore alternative solutions like Kyverno and Kubewarden, introduce a more robust policy we developed, and share practical recommendations for securing your policies. This blog will help you proactively strengthen Kubernetes security and avoid common pitfalls that could compromise your cluster.

What is OPA Gatekeeper?

OPA Gatekeeper is an admission controller that validates requests to manage Kubernetes resources (create, update and delete) in a cluster using the Open Policy Agent (OPA). Gatekeeper allows you to apply security policies to help meet your organization’s compliance and security requirements.

Gatekeeper introduces two key concepts that provide administrators with powerful and flexible control over their cluster: Constraints and ConstraintTemplates, both of which are inherited from the Open Policy Agent.

  • Constraints represent your security policy by defining the conditions and enforcement requirements.
  • ConstraintTemplates are reusable statements, written in Rego, that provide the logic for evaluating specific fields in Kubernetes objects based on the requirements set in constraints.

Using Gatekeeper, Kubernetes administrators can improve control over their clusters by defining policies (custom constraints and constraint templates) to meet specific needs. Alternatively, they can use a standard library of constraints and templates available in the Gatekeeper repository for quick adoption and enforcement of common policies.

OPA Gatekeeper architecture for Kubernetes policy enforcement

These policies don’t only serve as guardrails for enforcing best practices, security, and compliance, but also act as a line of defense against attackers who may already have access to your cluster. By enforcing strict policies, Gatekeeper can prevent malicious actions, such as unauthorized deployments or changes to critical resources, helping to mitigate potential damage.

Research Focus: Blocking Unapproved Image Sources in Kubernetes

Let’s go over an example of a common, widely used policy that OPA Gatekeeper can help us with. Imagine you want to restrict the repositories from which container images can be pulled in your Kubernetes cluster. You can use the allowedrepos policy from the OPA Gatekeeper policy library. This policy ensures that any attempt to pull container images from unapproved sources is denied, protecting your cluster from potentially malicious containers.
After installing OPA Gatekeeper, users can apply the ConstraintTemplate for this policy to their cluster. This ConstraintTemplate contains the Rego logic that defines the restrictions.

The ConstraintTemplate for the k8sallowedrepos policy defines the structure and logic of the policy, implemented using Rego

Next, the user needs to create a Constraint YAML file that specifies the values the policy will enforce or validate. Let’s consider a very common scenario: the user wants to define the specific registries from which images may be pulled into their K8s clusters. For instance, they want to allow only two sources: the openpolicyagent repository on Docker Hub and the user’s private registry at myregistry.com. The user can define a Constraint file for the k8sallowedrepos policy:

The Constraint for the k8sallowedrepos policy specifies the registries and repositories to which the policy is enforced. These values are defined under the repos parameter

This Constraint file enforces the defined restrictions, ensuring that any attempt to pull container images from unapproved sources is blocked, thereby enhancing the security of the cluster.

The user applies the Constraint and ConstraintTemplate files (1) to their K8s cluster. Afterward, attempts to pull images from registries not listed in the Constraint file will fail. For example, in (2), deploying the Nginx image from Docker Hub fails because it isn’t from myregistry.com or the openpolicyagent repository.

How to Bypass OPA Gatekeeper’s k8sallowedrepos Policy

In the k8sallowedrepos policy, a security risk arises from how the Rego logic is written in the ConstraintTemplate file. This risk is further amplified when users define values in the Constraint YAML file that do not align with how the Rego logic processes them. This mismatch can result in policy bypasses, making the restrictions ineffective.

How the k8sallowedrepos Policy Works

The Rego logic for this policy is relatively straightforward. It retrieves the value of the image field from the pod the user is trying to deploy and verifies whether it matches any of the allowed values specified in the Constraints file. If the image value does not match the allowed list, an error is generated to notify the user.

Deploying the “malicious” image from attacker-dkr.io fails because it is not from myregistry.com or the openpolicyagent repository

For instance, in this policy, the following code is used to perform the check:

strings.any_prefix_match(container.image, input.parameters.repos)

This function checks whether the container image being deployed (container.image) starts with any of the allowed repository prefixes defined in the Constraint YAML file (input.parameters.repos). While this approach may seem effective, relying on plain string prefix matching for domains introduces significant security risks. The check has no notion of where a hostname or namespace ends, so an attacker can use a subdomain or a similarly prefixed name to get unauthorized images past the policy.
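The flow described above can be approximated in a few lines of Python. This is only a sketch of the idea, not the Rego shipped in the gatekeeper-library: the helper names, the pod dictionaries, the constraint values, and the message wording are all assumptions made for illustration.

```python
from typing import Dict, List


def any_prefix_match(image: str, repos: List[str]) -> bool:
    """Return True if the image string starts with any allowed prefix."""
    return any(image.startswith(repo) for repo in repos)


def violations(pod: Dict, repos: List[str]) -> List[str]:
    """Collect one message per container whose image matches no allowed prefix."""
    messages = []
    for container in pod["spec"]["containers"]:
        image = container["image"]
        if not any_prefix_match(image, repos):
            # Illustrative wording only; the real policy builds its own message.
            messages.append(
                f"container <{container['name']}> has an invalid image repo "
                f"<{image}>, allowed repos are {repos}"
            )
    return messages


# Hypothetical constraint values and pods, used only for this example.
allowed = ["openpolicyagent/", "myregistry.com/"]
nginx_pod = {"spec": {"containers": [{"name": "web", "image": "nginx:latest"}]}}
opa_pod = {"spec": {"containers": [{"name": "opa", "image": "openpolicyagent/opa:latest"}]}}

print(violations(nginx_pod, allowed))  # one violation: nginx matches no prefix
print(violations(opa_pod, allowed))    # empty list: the image is allowed
```

Note that the check is purely textual: it compares characters from the start of the image string and never parses the registry host or namespace.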

Scenarios to Bypass the k8sallowedrepos Policy

In our research, we found a way to bypass the k8sallowedrepos policy: if the user omits the trailing / from a value in the constraint’s repos parameter, attackers can bypass the policy completely. We found plenty of examples of this misconfiguration in the wild.

Example of a misconfigured Constraint file with repositories missing a “/”

Below are two examples of this bypass.

1. Domain Bypass

A constraint might intend to restrict users to pulling images only from my-ecr.azurecr.io. However, because the policy relies on prefix matching, if the value is not terminated with a / (that is, it is written as my-ecr.azurecr.io rather than my-ecr.azurecr.io/), an attacker can set up a hostname such as my-ecr.azurecr.io.attacker.com, which is a subdomain of attacker.com, and host malicious images such as my-ecr.azurecr.io.attacker.com/malicious. Because the image reference still begins with the allowed value, the policy is bypassed entirely.

A misconfigured repo parameter lets attackers bypass the policy by using subdomains matching allowed repos

2. DockerHub Repository Bypass

Another common scenario involves constraints designed to allow images only from a specific repository on Docker Hub, such as openpolicyagent. If the constraint specifies openpolicyagent as the allowed value without a trailing /, it inadvertently allows other namespaces or domains whose names begin with the same string. For example:

  • Namespace bypass: openpolicyagent-attacker (on Docker Hub). An attacker can create a new Docker Hub account whose name begins with the allowed repos value.
  • Subdomain bypass: openpolicyagent.attacker.com, a registry hostname under a domain the attacker controls.

Bypassing the constraint by creating a new Docker Hub account

In both cases, the lack of precision in the policy logic permits unauthorized images to be pulled, potentially introducing malicious images.
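Both scenarios can be reproduced with a short Python simulation of the prefix check. This is a sketch under stated assumptions: is_allowed is a hypothetical stand-in for the Rego prefix match, and the image tags (app:1.0, opa:latest) are made up for the example.

```python
from typing import List


def is_allowed(image: str, repos: List[str]) -> bool:
    """Hypothetical stand-in for the policy's plain string prefix check."""
    return any(image.startswith(repo) for repo in repos)


# The same two allowed sources, without and with a trailing slash.
misconfigured = ["my-ecr.azurecr.io", "openpolicyagent"]
hardened = ["my-ecr.azurecr.io/", "openpolicyagent/"]

attacker_images = [
    "my-ecr.azurecr.io.attacker.com/malicious",  # domain bypass
    "openpolicyagent-attacker/malicious",        # Docker Hub namespace bypass
    "openpolicyagent.attacker.com/malicious",    # subdomain bypass
]
legitimate_images = [
    "my-ecr.azurecr.io/app:1.0",
    "openpolicyagent/opa:latest",
]

for image in attacker_images + legitimate_images:
    print(
        f"{image}: "
        f"misconfigured={is_allowed(image, misconfigured)} "
        f"hardened={is_allowed(image, hardened)}"
    )
```

With the misconfigured values every image in the example is admitted, including all three attacker images. With the trailing slash in place, only the two legitimate images pass.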

Key Insights and Recommendations for Rego policies

Rego Function Risks: Using functions like endswith(), startswith(), strings.any_prefix_match(), and strings.any_suffix_match() can be risky when applied to domains, repositories, namespaces, and similar values. These functions compare raw strings and know nothing about where a hostname or path segment begins or ends, so attackers can create variations that still satisfy the match.
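The same problem applies to suffix checks. The Python sketch below uses a hypothetical domain, mycompany.io, to show how a bare endswith() test accepts an unrelated hostname. The dot-anchored helper is a common mitigation that we include as an illustration; it is an assumption of this example and not part of the Gatekeeper library.

```python
def naive_suffix_check(host: str, domain: str) -> bool:
    """Bare suffix match, comparable in spirit to endswith() in Rego."""
    return host.endswith(domain)


def is_same_or_subdomain(host: str, domain: str) -> bool:
    """Accept the domain itself or a true subdomain (anchored on the dot)."""
    return host == domain or host.endswith("." + domain)


allowed_domain = "mycompany.io"  # hypothetical value
hosts = ["registry.mycompany.io", "evil-mycompany.io"]

for host in hosts:
    print(
        f"{host}: naive={naive_suffix_check(host, allowed_domain)} "
        f"anchored={is_same_or_subdomain(host, allowed_domain)}"
    )
```

The naive check accepts evil-mycompany.io because the string ends with the allowed value, even though it is a different registered domain. The anchored check rejects it while still accepting registry.mycompany.io.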

The Importance of Proper Configuration: Users must be vigilant about correctly configuring constraints. When specifying domains or namespaces:

  • Always terminate domain names with a / if restricting access to a root domain (e.g., my-company-ecr.azurecr.io/).
  • Ensure namespaces are explicitly defined and scoped correctly (e.g., docker.io/openpolicyagent/ for Docker Hub).
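A quick way to review existing constraints against these two rules is to flag every allowed value that does not end with a /. The helper below is a hypothetical sketch, not an existing tool, and the sample values are invented. It is also only a heuristic: a value that is meant to name one exact image would be flagged as well and needs manual review.

```python
from typing import List


def risky_repo_values(repos: List[str]) -> List[str]:
    """Return the allowed-repo values that do not end with '/'."""
    return [repo for repo in repos if not repo.endswith("/")]


# Hypothetical values taken from a constraint's repos parameter.
repos = [
    "my-company-ecr.azurecr.io",   # missing trailing slash
    "docker.io/openpolicyagent/",  # correctly terminated
    "myregistry.com",              # missing trailing slash
]

for value in risky_repo_values(repos):
    print(f"missing trailing '/': {value}")
```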

Cloud Service Providers (CSPs) and OPA Gatekeeper

We expanded our research to CSPs to further understand the usage of OPA Gatekeeper. We found that several major cloud providers offer integrations or similar functionalities to enforce policies in Kubernetes clusters, leveraging OPA Gatekeeper or equivalent tools. Here’s an overview of how GCP, Azure, and AWS implement these capabilities:

Google Cloud Platform (GCP)

  • Google Cloud provides a managed, officially supported version of Gatekeeper called Policy Controller. The Policy Controller is built on the Gatekeeper open-source framework.
  • GCP offers a Constraint Template Library that includes templates such as k8sallowedrepos, which allows users to restrict the repositories from which container images can be pulled.

Microsoft Azure

  • Azure offers Azure Policy for Kubernetes that extends the functionality of OPA Gatekeeper.
  • Azure offers several built-in policies for Azure Kubernetes Service (AKS) to simplify governance and compliance. Examples of these policies can be found in their Azure Policy GitHub Repository.

Amazon Web Services (AWS)

  • AWS Elastic Kubernetes Service (EKS) does not provide a direct managed implementation of Gatekeeper.
  • While AWS does not include built-in support for OPA Gatekeeper, users can achieve similar functionality by manually deploying and configuring Gatekeeper or other solutions in their EKS clusters.

Alternatives to OPA Gatekeeper: Kyverno, Kubewarden and jsPolicy

We also inspected other solutions that are similar to OPA Gatekeeper, such as Kyverno, Kubewarden and jsPolicy:

  • Kyverno: A Kubernetes-native policy engine that uses YAML, providing simplicity and seamless integration with Kubernetes resources.
  • Kubewarden: A WebAssembly-based policy engine that allows policies to be written in various programming languages (any language that compiles to WebAssembly).
  • jsPolicy: A Kubernetes policy engine that uses JavaScript or TypeScript for defining policies.

We found similar misconfiguration risks in Kyverno policies. For example, let’s examine the following code snippet that could be deployed in Kyverno:

image: "{{ join '* || ' (lookup $policyName 'parameters' 'approvedRepos' nil .Values.policies) }}*"

This code appends * to every value in approvedRepos, e.g., my-registry.com becomes my-registry.com*.

This policy uses a dynamic templating mechanism to generate a list of approved container image repositories. While it seems like a straightforward approach, it introduces a significant risk if the user-provided values in approvedRepos are misconfigured.

For instance, if a user specifies a domain like myregistry.com without appending a trailing /, an attacker could exploit this by creating subdomains or similar patterns, such as myregistry.com.attacker.com, because the code would match any image with the value myregistry.com*. This would allow the attacker to bypass the policy and deploy unauthorized container images.
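The effect of the appended wildcard can be illustrated in Python. This sketch assumes that the * wildcard matches any run of characters, and it uses Python's fnmatch module as an approximation of that behavior rather than Kyverno's own matcher. The helper names and the image tag app:1.0 are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def to_patterns(approved_repos: List[str]) -> List[str]:
    """Mimic the template: append '*' to every approved value."""
    return [repo + "*" for repo in approved_repos]


def matches_any(image: str, patterns: List[str]) -> bool:
    """Return True if the image matches at least one wildcard pattern."""
    return any(fnmatchcase(image, pattern) for pattern in patterns)


loose = to_patterns(["myregistry.com"])    # -> ["myregistry.com*"]
strict = to_patterns(["myregistry.com/"])  # -> ["myregistry.com/*"]

attacker_image = "myregistry.com.attacker.com/malicious"
legitimate_image = "myregistry.com/app:1.0"

print(matches_any(attacker_image, loose))     # True: the wildcard swallows ".attacker.com/..."
print(matches_any(attacker_image, strict))    # False: the "/" anchors the hostname
print(matches_any(legitimate_image, strict))  # True
```

As with the Gatekeeper policy, the trailing / in the configured value is what stops the pattern from extending into an attacker-controlled hostname.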

Summary and Mitigations

In this blog, we explored potential risks in Kubernetes policy enforcement, focusing on the k8sallowedrepos policy in OPA Gatekeeper. Specifically, we examined the importance of automatically appending a trailing / to certain values to prevent bypasses. To address these risks, we collaborated with the OPA Gatekeeper security team to develop a new policy, k8sallowedreposv2, which supports exact image names and glob-like syntax for improved control. This enhanced policy offers an alternative to the prefix-only approach of k8sallowedrepos and is now available in the official gatekeeper-library for adoption in your cluster.

In addition to adopting the new policy, there are some important actions you should take if you are using the k8sallowedrepos policy or similar policies:

  1. Ensure Your Constraint Values Include a Trailing /: Verify that all constraint values end with a trailing / where required, such as for domains or namespaces (e.g., my-registry.com/ or openpolicyagent/). This prevents unintended matches with subdomains or similar prefixes that could bypass the policy.
  2. Avoid Over-Permissive Policies: When writing your own Rego/regex or other policies, avoid overly permissive or generic matching, as it can expose policies to bypass.
  3. Scan and Validate Your k8sallowedrepos Constraint Values: Use security scanning tools to validate your k8sallowedrepos constraints. Following our research, we enhanced Aqua Trivy to detect insecure configurations within these constraints. Trivy now flags missing trailing slashes and other risky patterns, helping you identify potential issues. Simply run a Trivy scan on your codebase to detect misconfigurations inside allowed repository constraint values.

    A Trivy scan for detecting misconfigurations in the constraint template

    Alternatively, if you’re using the k8sallowedrepos policy of OPA Gatekeeper in your cluster, you can run the following command to scan and validate your defined constraint directly within the cluster.

kubectl get K8sAllowedRepos -o yaml | yq '.items[] | {"apiVersion": .apiVersion, "kind": .kind, "metadata": {"name": .metadata.name}, "spec": .spec}' > K8sAllowedRepos.yaml

 

Yakir Kadkoda
Yakir Kadkoda is the Director of Security Research at Aqua’s research team, Team Nautilus. He specializes in vulnerability research, uncovering and analyzing emerging security threats and attack vectors in cloud-native environments, supply chain security, and open-source projects. Before joining Aqua, Yakir worked as a red teamer. He has presented his cybersecurity research at leading industry conferences, including Black Hat (USA, EU, Asia), DEF CON, RSAC, SecTor, CloudNativeSecurityCon, STACK, INTENT, and more.
Assaf Morag
Assaf is the Director of Threat Intelligence at Aqua Nautilus, where he is responsible for acquiring threat intelligence related to the software development life cycle in cloud native environments, supporting the team's data needs, and helping Aqua and the broader industry remain at the forefront of emerging threats and protective methodologies. His research has been featured in leading information security publications and journals worldwide, and he has presented at leading cybersecurity conferences. Notably, Assaf has also contributed to the development of the new MITRE ATT&CK Container Framework.

Assaf recently completed recording a course for O’Reilly, focusing on cyber threat intelligence in cloud-native environments. The course covers both theoretical concepts and practical applications, providing valuable insights into the unique challenges and strategies associated with securing cloud-native infrastructures.