Protecting Source Code from Theft During Development

May 14, 2019


Back in 2004, one of the most embarrassing and worrying thefts was announced to the world: Microsoft source code for Windows 2000 had been stolen. At that point, around 90% of the world’s computers were running a Windows OS. The source code ended up on a public website for the world to see and for hackers to exploit. In the end, Mainsoft, a third-party contractor used by Microsoft, was found to be the source of the leak. The Microsoft source code was both patented and copyrighted, which offered a degree of control over its use even once it was publicly accessible. The question is, could Microsoft have gone further and closed the door before the horse bolted?

7 Layers to Prevent Source Code Theft by Insiders

The Microsoft W2K theft is not the only instance of insiders who, while working on source code, end up leaking or stealing the very Intellectual Property (IP) they are helping to create. In 2017, an ex-employee of Goldman Sachs was convicted of stealing source code and trade secrets from the firm to help his start-up. Insider software theft is common; one reason is that 89 percent of developers still have access to proprietary corporate data even after leaving an organization.

Developers, whether employees or outsourced contractors, have intimate access to your company’s ‘secret sauce’: the very source code that represents your Intellectual Property. How to prevent the theft of this source code is a tricky question, and one we hope to answer here.

Preventing data leaks, including source code exposure via an insider, is not a simple on/off switch. It is a problem best solved with layers of protection rather than a point solution. Below we have listed the best technologies and approaches to use in mitigating source code theft.

Access Control

Access control is the fundamental starting point for preventing source code theft. Always avoid giving global access to source code; if a developer doesn’t need to use certain parts of the code, do not grant that developer access to them. Use a ‘least privilege model’ to control who accesses your source code. Isolate areas of source code on a ‘need to know’ basis. Wherever possible, apply the tenets of the Zero Trust model of security to developer access. This model is based on verifying access at every juncture, across all devices.
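
As a rough illustration of ‘need to know’ access, the sketch below checks a developer against an explicit allow-list of repository paths and denies everything else by default. The developer names, paths, and the `can_access` helper are all hypothetical; in practice this logic usually lives in your repository host’s permission settings rather than in your own code.

```python
from fnmatch import fnmatchcase

# Hypothetical need-to-know policy: each developer maps to the repository
# path patterns they may read. All names and paths are illustrative only.
ACCESS_POLICY = {
    "alice": ["services/billing/*", "libs/common/*"],
    "bob": ["libs/common/*"],
}


def can_access(developer: str, path: str) -> bool:
    """Return True only if an explicit rule grants access (default deny)."""
    patterns = ACCESS_POLICY.get(developer, [])
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```

The important design choice is the default: a developer or path that is not listed gets no access, rather than access having to be taken away later.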

Also, ensure that access is robust. Insiders are one problem, but external hackers could also gain access through the account of a developer with extended privileges. Wherever possible, use the most robust authentication available. This should be, at a minimum, two-factor authentication. Ideally, also apply risk-based rules to manage and enforce access.
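
To make the idea of risk-based rules more concrete, here is a minimal sketch that combines a mandatory second factor with two simple risk signals. The signals, the scoring, and the `access_decision` function are assumptions chosen for illustration, not the rule set of any particular identity product.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    # Illustrative risk signals; a real system would use many more.
    known_device: bool
    usual_country: bool
    passed_second_factor: bool


def access_decision(attempt: LoginAttempt) -> str:
    """Return 'allow', 'step_up' (ask for extra verification), or 'deny'."""
    # The second factor is the minimum bar, regardless of other signals.
    if not attempt.passed_second_factor:
        return "deny"
    risk = 0
    if not attempt.known_device:
        risk += 1
    if not attempt.usual_country:
        risk += 1
    if risk == 0:
        return "allow"
    if risk == 1:
        return "step_up"
    return "deny"
```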

Data Loss Prevention (DLP)

The next port of call in your layered approach to source code theft prevention is a Data Loss Prevention tool. A DLP solution should allow you to set up rules based on the classification of data; for example, DLP software can classify IP data such as source code. The DLP platform uses a combination of methods to prevent the loss of data via insiders as well as malicious outsiders. DLP monitors data and other IP whilst at rest and in transit. It uses alerts, encryption, and other protective mechanisms, such as locking USB ports, to stop source code and other data from leaking out.
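
The sketch below gives a feel for what a classification rule can look like: it blocks text carrying a confidentiality marker and raises an alert when a large share of lines look like source code. The marker string, the patterns, and the threshold are all assumptions for the example; commercial DLP products use far richer detection than this.

```python
import re

# Heuristic patterns for lines that look like source code (illustrative).
CODE_PATTERNS = [
    re.compile(r"^\s*(def|class|import|from)\s+\w+"),
    re.compile(r"^\s*#include\s*[<\"]"),
    re.compile(r"[;{}]\s*$"),
]

# Hypothetical marker a company might stamp into proprietary files.
CONFIDENTIAL_MARKER = "ACME-CONFIDENTIAL"


def classify(text: str, threshold: float = 0.3) -> str:
    """Return 'block', 'alert', or 'allow' for a piece of outbound text."""
    if CONFIDENTIAL_MARKER in text:
        return "block"
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "allow"
    hits = sum(1 for line in lines if any(p.search(line) for p in CODE_PATTERNS))
    # Alert when the share of code-like lines reaches the threshold.
    return "alert" if hits / len(lines) >= threshold else "allow"
```

A rule like this would typically sit wherever data leaves the company, such as email, uploads, or removable media, with the action mapped to an alert or a block.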

VPN Connection for Remote Workers

Remote working is growing in popularity, and developers especially like to work remotely. A survey found that around 50 percent of developers set remote working as a priority. But remote working opens up opportunities for a developer to expose code, either accidentally or maliciously. One mechanism that can help mitigate this risk is the use of a VPN for remote workers. Routing traffic through a VPN helps you control what a person can and cannot do, and it backs up your DLP policies. For example, a VPN allows you to blacklist certain sites that could be used to post source code.
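
Site blacklisting is normally configured on the VPN gateway or firewall itself, but the matching logic is simple enough to sketch. The example below assumes a hypothetical blocklist and an `is_blocked` helper, and treats subdomains of a blocked site as blocked too.

```python
# Hypothetical blocklist of sites where code could be posted.
BLOCKED_DOMAINS = {"pastebin.com", "paste.example"}


def is_blocked(hostname: str) -> bool:
    """Return True if the hostname or any parent domain is on the blocklist."""
    host = hostname.lower().rstrip(".")
    parts = host.split(".")
    # Check "www.pastebin.com", then "pastebin.com", then "com".
    return any(".".join(parts[i:]) in BLOCKED_DOMAINS for i in range(len(parts)))
```

Matching on whole domain labels rather than substrings avoids blocking unrelated sites whose names merely contain a blocked name.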


Monitoring and Auditing

Developers are a community. They like to help each other out and share knowledge. Often, source code exposure happens without any malicious intent: a developer may simply feel it’s OK to share a snippet of code to show a fellow developer how something should be done. Use either automated or human-driven monitoring to find code that may have leaked out. Search for keywords from your code in places like Pastebin and GitHub. You should also consider delving into the darknet to look for company mentions and keywords.
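
One way to automate this kind of search is to fingerprint your own code and compare public text against the fingerprints. The sketch below hashes each non-trivial line after normalizing whitespace, so the monitoring job only needs the hashes, not the code itself. The function names and the 20-character minimum are assumptions for the example, and fetching text from the sites you monitor is left out.

```python
import hashlib


def fingerprints(source: str, min_length: int = 20) -> set:
    """Hash each whitespace-normalized line that is long enough to be distinctive."""
    result = set()
    for line in source.splitlines():
        normalized = " ".join(line.split())
        if len(normalized) >= min_length:
            result.add(hashlib.sha256(normalized.encode("utf-8")).hexdigest())
    return result


def leaked_line_count(known: set, public_text: str, min_length: int = 20) -> int:
    """Count distinct lines in public_text that match known fingerprints."""
    return len(known & fingerprints(public_text, min_length))
```

Exact line matching like this will miss code that has been renamed or reformatted beyond whitespace, so treat a hit as a prompt for human review rather than proof of a leak.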

Keep detailed audit logs of who accesses and uses your repository. You may need them to trace the source if code is leaked.
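
Audit records are most useful when someone actually looks at them. As a small sketch, assuming your repository host can export events as (user, action) pairs, the hypothetical helper below flags anyone who has performed an action such as a clone more often than a chosen limit.

```python
from collections import Counter


def flag_heavy_users(events, action="clone", limit=3):
    """Return the sorted users who performed `action` more than `limit` times.

    `events` is an iterable of (user, action) tuples, an assumed export format.
    """
    counts = Counter(user for user, act in events if act == action)
    return sorted(user for user, n in counts.items() if n > limit)
```

For example, five clones by one developer in the review window would be flagged at the default limit of three, while a colleague with a single clone and a push would not.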

Legal Frameworks

In a previous post, we talked about putting certain legal structures in place to help prevent source code theft. Legal structures are a deterrent to the theft of source code as they can be used to enforce punishment in a court of law. Legal tools such as copyright and patents should be used wherever possible to help in source code theft prevention. In the theft mentioned earlier, Microsoft had various legal instruments in place that it called upon to minimize the exposure.

Trust but Verify

Trust is a key element of controlling source code leaks. But how do you create and build upon trust? The notion of “Trust but Verify” is based on a bilateral trusted relationship. It is a component and basis of the Zero Trust model mentioned earlier, but with more emphasis on the human aspect of communication. The layers mentioned previously all go towards creating this trusted relationship. Trust but Verify needs to have:

  • Clear communication channels and expectations – including distinct and understandable code ownership rules
  • Policies and contracts that are in plain language
  • Enforcement using DLP

Be Selective

When you bring a developer on board or outsource to a development company, do your homework. This individual or company will have access to valuable resources. Ask for references and do your own checks before hiring. A little legwork up front should form part of this overall layered strategy for preventing company source code exposure.

Protecting Your Source Code Protects Your Developers

Our software developers are one of our most precious resources. But it only takes one bad apple to upset your business goals and give your competitors a leading edge. Applying various layers to the problem of source code theft will help harden your development against exposure. This benefits everyone in the long run. Creating clear boundaries of communication, as well as protected data flow, is good housekeeping. Furthermore, it lets you run a business that, over time, feeds into a trust model that we all benefit from.


