Source Code Leaks: How to Avoid Them Before They Happen

Source code should be highly guarded. Still, there are numerous incidents of high-profile companies discovering their source code has been leaked.

Amit Sheps
September 7, 2022

What are Source Code Leaks?

Your source code is your most valuable asset. For software companies creating digital products, protecting that code is the number one priority. It is a blueprint of your business’s proprietary technology and the foundation of your organization, with all its internals, dependencies, and components. Since organizations compete based on the robustness of their software, source code should be highly guarded. Still, there are numerous incidents of high-profile companies discovering their source code has been leaked.

Whether exposed or stolen, leaked source code may not only give your competitors an edge in developing new products, but also allow hackers to exploit its vulnerabilities. Unauthorized disclosure of this code may give bad actors an inside look at important intellectual property and system data, allowing attackers to gather confidential user and corporate information via security exploits.


Why Should You Worry about Code Leaks?

Company Reputation: Companies unable to protect their most valued data, such as source code, develop untrustworthy reputations. Depending on the type of information exposed, the results can be disastrous, including the corruption of databases, the disclosure of confidential information, the theft of intellectual property, and the financial burden of compensating those affected. According to a Forbes report, 46 percent of organizations have suffered reputational damage due to a data breach, while 19 percent have suffered irreparable brand damage due to third-party security breaches.

Misuse of User Data: A common example occurs when employees copy confidential work files or data to their personal devices. The intent may be innocent, such as working on a project outside of normal work hours, but the consequences can be catastrophic. Making information accessible outside of an authorized, secure environment opens the possibility of user data and credentials being covertly stolen and sold on the dark web.

Intellectual Property (IP) Theft: Most valuable intellectual property exists as living, breathing files, such as source code, design files, and go-to-market strategies, that are edited, copied, shared, and advanced. If IP theft occurs, it can result in serious economic damage, including loss of competitive edge and a decrease in business growth. In fact, according to the Commission on the Theft of American Intellectual Property, as of July 2022 the total theft of U.S. trade secrets accounts for anywhere from $180 billion to $540 billion per year. These breaches can include unreleased product features, incubating ideas, and undisclosed working processes. Because this material is so varied and constantly changing, building effective policies can become incredibly complicated. Company leaders should remain well aware of the competitive edge they could lose if IP theft occurs.

Access to Core Systems: Attackers are adept at finding the weakest link. Oftentimes, this is due to human error, with careless employees inadvertently providing the easiest access. By developing increasingly aggressive and advanced tactics, attackers find inventive ways to penetrate the very core of a system, including databases and critical servers. While you may be unable to fully eliminate these risks, following the principle of least privilege, under which each user or service receives only the access it needs, can limit how far an attacker gets.

Infection of Customer Servers: Servers may be compromised in multiple ways. For example, an attacker may have obtained a user password that grants access to a server, or may have discovered a security hole in a web application or its plugins on platforms such as WordPress, Joomla, and Drupal. As in the SolarWinds attack, hackers often target a vendor’s customers, resulting in a ripple effect that can persist for years. Customers share their sensitive information with your business under the assumption that you have reliable security measures in place to protect their data. It is your responsibility to ensure they are not left vulnerable to bad actors.

As DataBreaches makes clear, “there’s no need to hack if it’s already leaking.” The truth is that, in the end, many of these breaches are avoidable.

A Look at High-Profile Leaks and Their Consequences 

Leak Analysis: Intel 

In August 2020, Intel reported a leak that exposed restricted documents and code on a public server. The code’s existence was brought to light by engineer Till Kottmann, who received the original information from an unknown source. “Most of the things here have NOT been published ANYWHERE before and are classified as confidential, under NDA or Intel Restricted Secret,” Kottmann stated.

According to Intel, someone with access to the Intel Resource and Design Center may have been responsible. The package included reference, sample, and initialization code for the company’s 7th generation microprocessor, code-named Kaby Lake. It also contained firmware, schematics, documents, tools for later unreleased platforms, and camera processing tech made for SpaceX, among other highly sensitive data. The leak’s impact was severe, with a multitude of trade secrets in the files revealed.

Alarmingly, it can take years to remove exposed source code from the internet, as Microsoft discovered when it took an incredible eleven years to remove all traces of Windows 2000 after a devastating 2004 leak.  

Leak Analysis: Mercedes-Benz 

In May 2020, Kottmann was also behind the discovery of a major code leak at automotive giant Daimler, now known as Mercedes-Benz Group.

The developer was able to register an account on Daimler’s code-hosting portal and then download 580 Git repositories from its GitLab server, which contained the source code of onboard logic units (OLUs) installed in Mercedes vans. The exposure was possible because the portal lacked an account authorization process, and it was a big wake-up call for Mercedes.

To make matters worse, after the initial leak investigators discovered passwords and API tokens for Daimler’s internal systems in the exposed repositories. Bad actors could use these passwords and keys to carry out future intrusions against Daimler’s cloud and internal network.

Further investigation found that none of the source code had previously been made public, so it was presumed to be private code containing proprietary information. Ultimately, Daimler took down the GitLab server from which Kottmann downloaded the data to minimize the potential damage.
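The passwords and API tokens found in the Daimler repositories are the kind of material an automated scan can catch before code ever leaves a developer’s machine. Below is a minimal sketch of such a check in Python. The three patterns and the names SECRET_PATTERNS, Finding, and scan_text are illustrative assumptions, not the rule set of any particular product; real scanners ship far larger rule sets and add entropy checks.

```python
import re
from typing import NamedTuple

# Illustrative patterns only (assumed for this sketch, not an exhaustive rule set).
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes a private key.
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A quoted literal assigned to a name such as password, secret, api_key, or token.
    "hardcoded_credential": re.compile(
        r"\b(?:password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*['\"][^'\"]{4,}['\"]",
        re.IGNORECASE,
    ),
}


class Finding(NamedTuple):
    line_number: int
    rule: str


def scan_text(text: str) -> list[Finding]:
    """Return one Finding per (line, rule) pair that matches a secret pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(number, rule))
    return findings


if __name__ == "__main__":
    sample = 'db_host = "localhost"\npassword = "hunter2!"\n'
    for finding in scan_text(sample):
        print(f"line {finding.line_number}: {finding.rule}")
```

Run against each file in a commit, for example from a pre-commit hook or a CI job, a check like this would flag the line before it is pushed. A flagged secret should also be rotated, since deleting it from the latest commit does not remove it from Git history.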

Leak Analysis: Nintendo 

Companies in the gaming industry have a vested interest in protecting their copyright. Of these, Nintendo has a reputation for diligently applying the law in cracking down on theft of its intellectual property. Yet in 2020, an enormous collection of files and source code was leaked from Nintendo servers, revealing details of the development process of highly popular games such as Super Mario and Pokémon. The leak seemed a treasure trove for fans, but many were unsure what to make of this exposed confidential information. Once revealed, there was no way of closing this Pandora’s box. Ultimately, Nintendo’s only recourse was to threaten those who publicly shared the information with legal action. In the end, the source code leak was so immense that the incident came to be known as the “gigaleak.”

Leak Analysis: Nissan 

Another automaker that suffered a source code leak was Nissan. This incident involved the code for many of Nissan’s mobile apps, marketing and sales tools, website information, and connected car services. The code was exposed because Nissan left one of its Git servers configured with the username and password admin/admin. The Nissan leak is a textbook example of how one oversight with access credentials can expose entire systems.
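A credential pair like admin/admin is easy to reject automatically at the moment an account is created or a password is changed. The following Python sketch shows one way to do that. The deny-list, the 12-character minimum, and the function name credential_problems are hypothetical values chosen for illustration, not settings taken from Nissan’s systems or from any specific Git server.

```python
# Hypothetical deny-list of well-known default username/password pairs.
DEFAULT_CREDENTIALS = {
    ("admin", "admin"),
    ("admin", "password"),
    ("root", "root"),
    ("root", "toor"),
    ("guest", "guest"),
}

# Assumed policy value for this sketch; set it to match your own password policy.
MIN_PASSWORD_LENGTH = 12


def credential_problems(username: str, password: str) -> list[str]:
    """Return the reasons a credential pair should be rejected (empty if none)."""
    problems = []
    user, pwd = username.lower(), password.lower()
    if (user, pwd) in DEFAULT_CREDENTIALS:
        problems.append("default credential pair")
    if pwd == user:
        problems.append("password equals username")
    if len(password) < MIN_PASSWORD_LENGTH:
        problems.append("password shorter than policy minimum")
    return problems


if __name__ == "__main__":
    for reason in credential_problems("admin", "admin"):
        print(reason)
```

Because the check needs the plaintext password, it belongs in the code path that sets the password, before it is hashed and stored. It complements two-factor authentication rather than replacing it.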

How These Leaks Could Have Been Prevented 

Preventing source code leaks takes more than security best practices documentation or a one-time security audit. It requires continuous security protocols that are enforced at every level, for every user, and at every component of the system. Here are some measures that can be implemented with a modern security solution:

Git repo config: You can check the config for your Git repos to ensure that only the appropriate ones are made public. 

Run code checks: Within public repos you need to check for accidental or intentional inclusion of confidential and sensitive information. 

Strong access credentials: Automatically check for weak passwords and ensure two-factor authentication is set up. 

Access controls: Git repos and other parts of the system should be access controlled so that users, whether human or machine, can see only what they need. 

User behavior tracking: There should be a baseline for what normal user behavior looks like, and any anomaly should trigger an immediate alert.

Privilege escalation: As bad actors attempt to escalate their privileges, such attempts should be tracked and acted upon system-wide.
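To make the user behavior tracking measure more concrete, here is a minimal Python sketch of a baseline check. It assumes you already collect a per-user daily count of some activity, such as repository clones; the function name is_anomalous, the sample numbers, and the threshold of three standard deviations are assumptions for illustration, and production systems typically use richer signals than a single count.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's count if it is more than `threshold` standard
    deviations above the user's historical average."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: any increase is a deviation from the baseline.
        return today > baseline
    return (today - baseline) / spread > threshold


if __name__ == "__main__":
    # Hypothetical daily clone counts for one account over the past week.
    clones_last_week = [3, 5, 4, 6, 4, 5, 3]
    print(is_anomalous(clones_last_week, 6))    # an ordinary day
    print(is_anomalous(clones_last_week, 580))  # a bulk download
```

An account that suddenly clones hundreds of repositories in a day stands far outside a baseline like this one, which is the kind of event that should raise an alert while the download is still in progress.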

Though source code leaks are difficult to prevent, the stakes are so high that preventing them should be on every organization’s top priority list. Rather than relying on outdated, static security processes, leveraging a security solution such as the Aqua Security platform for dynamic, timely deployment and protection of your source code may be your wisest investment.

Amit Sheps
Amit is the Director of Technical Product Marketing at Aqua. With an illustrious career spanning renowned companies such as CyberX (acquired by Microsoft) and F5, he has played an instrumental role in fortifying manufacturing floors and telecom networks. Focused on product management and marketing, Amit's expertise lies in the art of transforming applications into cloud-native powerhouses. Amit is an avid runner who relishes the tranquility of early morning runs. You may very well spot him traversing the urban landscape, reveling in the quietude of the city streets before the world awakes.