Neo Malware: Malicious Open Source Packages

By Jeff Wayman

Malware is at the top of the list among things that keep security and development organizations on edge.

The impact of malware is often focused on cost, and that makes sense, considering it makes up much of the now multi-trillion dollar cybercrime industry. However, there are even greater concerns, especially after large-scale attacks like NotPetya and SolarWinds.

Despite years of training and increased awareness, attacks via malicious attachments, drive-by downloads, and poor password habits have not abated. However, understanding malware’s ever-shifting landscape is more like trying to understand the flow of a river.

Levees are built based on known flood levels, and for a time, they hold. Unfortunately, confidence in holding nature at bay breeds hubris. Not to mention, those fortifications often cause problems further down the river in places that may never have flooded. Then, the rain comes, and the waters rise, flooding the banks and changing the shape and boundaries of the river.

Like a river, traditional malware attacks have followed the path of least resistance. As organizations build up defenses against the known methods of malware deployment, attackers look for novel ways to circumvent those defenses or change the target entirely.

This isn’t a case of abandoning approaches that continue to threaten everything from personal data to public infrastructure; it’s an expansion. And the water is rising around a new target, the software supply chain.

The Rise of the Open Source Malicious Package

Around 2015, there was a noticeable and rapid change in how malware infiltrates organizations. These new methods bypassed the traditional virus and malware scanning techniques, directly targeting developers and development infrastructure.

The trouble comes from how common scanning tools work: they depend on already knowing that a piece of malware exists and then matching the malicious application’s binary hashes or fingerprints. When a match is found, the threat is quarantined or, more often, prevented from being downloaded in the first place.

While this process has been highly effective on known malware, novel malware attempts can easily evade these tools. This weakness has driven attackers to software supply chains and, more specifically, open source components.
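The weakness of hash matching can be illustrated with a minimal sketch. The payload bytes, the blocklist, and the function name below are all hypothetical and stand in for whatever signature database a real scanner uses; the point is only that any change to the artifact produces a different digest.

```python
import hashlib

# Hypothetical payload and blocklist; a real scanner would load known
# digests from a signature database rather than compute them inline.
KNOWN_BAD_PAYLOAD = b"print('stealing credentials')"
KNOWN_BAD_HASHES = {hashlib.sha256(KNOWN_BAD_PAYLOAD).hexdigest()}


def is_known_malware(artifact: bytes) -> bool:
    """Return True only if the artifact's digest is already on the blocklist."""
    return hashlib.sha256(artifact).hexdigest() in KNOWN_BAD_HASHES


# The exact, previously seen payload is caught.
caught = is_known_malware(KNOWN_BAD_PAYLOAD)

# A single extra byte changes the digest, so the same logic slips through.
variant = KNOWN_BAD_PAYLOAD + b" "
missed = not is_known_malware(variant)
```

Both `caught` and `missed` end up true here: the scanner recognizes only what it has seen before, which is why a freshly published malicious package passes unnoticed.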

Today, most software organizations recognize that open source components make up as much as 90% of an application. Because of this, attackers have shifted their focus to this new frontier. And there’s been little resistance.

Unsuspecting development teams with poor supply chain management processes are now primary targets. As easily as opening an email or downloading malware that looks like a legitimate application, developers can unknowingly download malicious open source packages and incorporate them into their projects and automated build processes.

Real-world examples and impacts

Despite members of the open source community, security researchers, and even private companies sounding the alarm since 2015, warnings mostly went unnoticed. It’s impossible to know why, though pre-Log4Shell, many software organizations weren’t paying attention to open source consumption. However, in 2017, when typosquatting attacks in npm and Python were officially identified, this began to change.

Following this event, in March of 2018, npm credentials were compromised and used to create a malicious version of a package from a core contributor to the conventional-changelog ecosystem. The package was installed 28,000 times and successfully executed a Monero cryptominer. And this was just the beginning.

In 2019, 23 RubyGems packages were pulled from the public repository due to malware. Then, in 2020, researchers contacted GitHub, concerned about a set of GitHub-hosted repositories unintentionally serving malware. By the time the investigation of the Octopus Scanner concluded, it had identified 26 open source projects backdoored by malware and serving malicious code.

In the same year, RubyGems was once again a target of cryptocurrency-stealing malware. This was followed by the Bladabindi trojan, designed to enumerate and backdoor NetBeans projects through the NetBeans IDE.

The bad news is that the trend is not slowing; attackers’ creativity continues to catch developers off guard. While organizations may attempt to apply strict governance and controls over open source usage, malware can enter from anywhere, even through a package that carries the same name as a trusted one.

A New Trojan Horse

In 2021, the first officially identified dependency confusion attack was detected. This time, attackers exploited an internal package by publishing a package with the same name to a public repository and simply increasing the version number.

Attackers had now discovered a way to capitalize on a modern development best practice: automated build processes. By March 2021, more than ten thousand similar malicious packages were ready to automatically deliver malware directly into the development infrastructure.
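A minimal sketch can show why an automated build falls for this. It assumes a resolver that merges internal and public indexes and always takes the highest version; the package names, version numbers, and function names are hypothetical, and real package managers apply more nuanced rules.

```python
from typing import Dict, List, Tuple

# Hypothetical package indexes: name -> published versions.
INTERNAL_INDEX: Dict[str, List[str]] = {"acme-billing": ["1.2.0", "1.3.0"]}
PUBLIC_INDEX: Dict[str, List[str]] = {
    "acme-billing": ["99.0.0"],  # attacker-published package reusing the internal name
    "example-lib": ["2.4.1"],
}


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.3.0' into (1, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def highest(candidates: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Pick the (version, source) pair with the highest version."""
    if not candidates:
        raise LookupError("no matching package found")
    return max(candidates, key=lambda pair: version_key(pair[0]))


def resolve_naive(name: str) -> Tuple[str, str]:
    """Merge every index and take the highest version, wherever it lives."""
    candidates = [(v, "internal") for v in INTERNAL_INDEX.get(name, [])]
    candidates += [(v, "public") for v in PUBLIC_INDEX.get(name, [])]
    return highest(candidates)


def resolve_scoped(name: str) -> Tuple[str, str]:
    """Never consult the public index for a name that exists internally."""
    if name in INTERNAL_INDEX:
        return highest([(v, "internal") for v in INTERNAL_INDEX[name]])
    return highest([(v, "public") for v in PUBLIC_INDEX.get(name, [])])
```

With these sample indexes, `resolve_naive("acme-billing")` selects the public `99.0.0` release, while `resolve_scoped("acme-billing")` stays on the internal `1.3.0`. Pinning internal names to the internal source is one possible mitigation, not the only one.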

Unfortunately, most organizations are ill-prepared for the new strategies and tactics used to deploy malware, and attackers are just getting started. Even in cases where organizations have diligently deployed SCA solutions and seemingly protected their software supply chain, they may still be completely open to attack.

The gap between public repositories and the consumption paths in a development organization has only widened. Compounded by poor repository practices, a lack of password controls, and the absence of project security, some public repositories have become the perfect place to hide malware. There’s a clear need to push efforts even further left, beyond standard approaches to SCA and scanning, to intercept these novel approaches.

So, what can an organization do?

As this article was being written, news began to break that polyfill.io had been compromised by a Chinese company and turned potentially legitimate npm packages into hidden malware factories. This comes on the heels of the xz Utils backdoor, a most likely state-sponsored attempt to execute a malicious package. In both cases, conventional tools scanning for these threats did not detect the malicious component. In contrast, tools and teams with the latest information can defend against these attacks.

As malicious open source packages become more complex and sophisticated, the need for better processes and advanced security solutions becomes critical. Security teams and developers must collaborate more effectively, including adopting tooling that natively addresses the unique risks of open source software and the software supply chain, especially concerning novel malware threats.

Organizations should take seven critical steps:

Take a comprehensive look at their software supply chain and open source consumption.
Ensure open source consumption is managed via an artifact repository manager.
Implement a repository firewall between internal and public repositories, capable of intercepting malware attacks before they enter development ecosystems.
Adopt automated security policies to enforce the use of trusted and verified components.
Continuously monitor and analyze the behavior of integrated components for any suspicious activities.
Educate and enable security and development teams on the unique risk of malicious open source packages.
Incorporate shared knowledge into their security practices and work cross-functionally.
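Two of these steps, the repository firewall and automated security policies, lend themselves to a small illustration. The sketch below is one possible shape for such a policy check, not a description of any particular product. The component names, the blocklist, and the seven-day quarantine window are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    published: date


# Hypothetical policy data; a real deployment would pull these from
# threat intelligence feeds and the organization's own package inventory.
KNOWN_MALICIOUS = {("evil-utils", "1.0.0")}
INTERNAL_NAMES = {"acme-billing"}
QUARANTINE_DAYS = 7  # assumed waiting period for newly published releases


def evaluate(component: Component, today: date, from_public: bool) -> str:
    """Return 'block', 'quarantine', or 'allow' for a requested component."""
    # Rule 1: anything already reported as malicious never gets through.
    if (component.name, component.version) in KNOWN_MALICIOUS:
        return "block"
    # Rule 2: a public package reusing an internal name suggests dependency confusion.
    if from_public and component.name in INTERNAL_NAMES:
        return "block"
    # Rule 3: very new releases are held until they have had time to be vetted.
    if (today - component.published).days < QUARANTINE_DAYS:
        return "quarantine"
    return "allow"
```

The rules run from most to least certain: a confirmed malicious report blocks outright, while a release that is merely new is held for review instead of being rejected.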

Combined, these steps can significantly reduce the risk of introducing malicious components into development processes, ensuring a secure and resilient software supply chain. However, tooling and processes are only part of the solution; the development and security communities must leverage intelligence backed by in-person research and machine learning to understand and adapt to emerging threats.

OpenSSF’s Malicious Packages Repository

In October 2023, the OpenSSF launched the Malicious Packages Repository. The first project of its kind, the Malicious Packages Repository serves as an “open source system for collecting and publishing cross-ecosystem reports of malicious packages.” As of July 17, 2024, the project has collected 23,936 packages.
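To the best of my understanding, the project publishes its reports in the OSV format, so a build pipeline could check a dependency against them with very little code. The sketch below uses an inline sample record; the report ID, package name, and versions are hypothetical, and the matching logic handles only explicit version lists, not OSV version ranges.

```python
import json

# Hypothetical OSV-style report; the ID, name, and versions are invented
# for illustration and do not refer to a real entry.
SAMPLE_REPORT = json.loads("""
{
  "id": "MAL-0000-0000",
  "summary": "Hypothetical malicious package report",
  "affected": [
    {
      "package": {"ecosystem": "npm", "name": "evil-utils"},
      "versions": ["1.0.0", "1.0.1"]
    }
  ]
}
""")


def is_reported(report: dict, ecosystem: str, name: str, version: str) -> bool:
    """Return True if the report lists this exact package and version."""
    for affected in report.get("affected", []):
        package = affected.get("package", {})
        if package.get("ecosystem") != ecosystem or package.get("name") != name:
            continue
        versions = affected.get("versions")
        # Assumption: a report without an explicit version list is treated
        # as covering every version of the package.
        if not versions or version in versions:
            return True
    return False
```

With the sample record, `is_reported(SAMPLE_REPORT, "npm", "evil-utils", "1.0.0")` is true, while the same name in a different ecosystem, or a version not listed, is not flagged.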

The Malicious Packages Repository is an open and contributor-driven initiative. It serves a critical function in helping bridge the gap between the identification of malicious open source packages and the research necessary to prevent them in the future by ensuring comprehensive coverage and upholding a high-quality standard. Like all open source, the success of this project depends entirely on the community’s efforts to support it.

To get involved in the Malicious Packages Repository project, check out the project’s contribution guidelines. There, you can find everything you need to get started. You can also join the OpenSSF Package Analysis Slack Channel or connect with the OpenSSF Securing Critical Projects Working Group for information about the project.

About the Author

Jeff Wayman has spent over a decade leading digital content and community teams across OSS Security, DevOps, and DevSecOps roles. In his current position, he guides OSS security thought leadership and digital strategy for Sonatype’s Office of the CTO. Jeff promotes OSS security awareness through his work with the OpenSSF End Users Working Group and his contributions to the Atlantic Council’s Open Source Policy Network. Jeff is pursuing an MBA at the Gies College of Business at the University of Illinois, Urbana-Champaign, focusing on Digital Marketing and Strategic Innovation.