
By Ian Kretz and Sebastián Obregoso, Datadog
Datadog is a proud Open Source Security Foundation (OpenSSF) member, and we believe that being a part of this security community will lead us all to a safer place. Attackers are increasingly turning to supply chain attacks to distribute their malicious code, and the Open Source Vulnerabilities (OSV) database, to which OpenSSF is a leading contributor, is a valuable source of information that helps make everyone aware of packages that have been compromised or published with malicious intent.
Being part of OpenSSF also gives us an opportunity to contribute back to the community. One way we do this is by continuously monitoring npm and PyPI, two of the most widely used open source package repositories, and collecting the malicious packages we find there in a public dataset. The crucial ingredient in our continuous monitoring effort is GuardDog, an open source Python tool for identifying indicators of potential malicious intent in open source packages.
In this post, we’ll dive into how GuardDog works and how we use it at Datadog in our efforts to improve the overall state of software supply chain security.
What is GuardDog?
GuardDog analyzes package metadata and performs static analysis of the package’s code to identify common attacker techniques, risks, and other potential signs of malicious intent. We created GuardDog with the goal of identifying malicious packages published to PyPI, and we have since added support for the npm and Golang ecosystems.
GuardDog’s analysis is powered by:
- Metadata scanners, which examine non-code-related aspects of the package, such as attempts at typosquatting or the inclusion of binary executables.
- Semgrep rules, which provide language-specific semantic analysis that helps uncover complex behavior patterns, such as exfiltrating sensitive data like environment variables to a remote server.
- YARA rules, which enable complex pattern matching, such as detecting base64-encoded strings.
Both Semgrep and YARA are well-known and widely used open source tools in the security research community.
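To make these kinds of checks more concrete, here is a minimal Python sketch of two heuristics in the same spirit: a typosquatting check based on edit distance and a search for long base64-like strings. It is an illustration only and not GuardDog's actual implementation. The package list, the distance threshold, the 40-character cutoff, and all function names are assumptions chosen for the example.

```python
import re

# Hypothetical short list of popular names; a real scanner would use a far larger one.
POPULAR_PACKAGES = {"requests", "numpy", "pandas", "urllib3"}

# Assumed threshold: 40 or more base64-alphabet characters, optionally padded.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def possible_typosquat_targets(name: str, max_distance: int = 1) -> list[str]:
    """Popular packages whose names are suspiciously close to `name`."""
    name = name.lower()
    if name in POPULAR_PACKAGES:
        return []  # the genuine package is not a typosquat of itself
    return sorted(p for p in POPULAR_PACKAGES if edit_distance(name, p) <= max_distance)


def has_long_base64_run(source: str) -> bool:
    """True if the text contains a long base64-looking string."""
    return BASE64_RUN.search(source) is not None
```

Both functions produce indicators rather than verdicts: a name that is one edit away from a popular package, or a long encoded string, can have entirely benign explanations.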
GuardDog then outputs the findings to the user, but making a judgement call based on the findings is a complex task for a CLI tool like this. That’s why expert review is crucial in successfully finding malware with GuardDog. For instance, one of the most common GuardDog findings is that a package has no description or README content bundled with it—needless to say, it would be a mistake to judge a package malicious solely on this basis.
To place GuardDog in the context of related efforts at improving software supply chain security, it is instructive to compare it with OpenSSF Scorecard. Like GuardDog, OpenSSF Scorecard is a command-line tool for scanning open source projects using a set of analyzers. Whereas GuardDog is exclusively concerned with indicators of potential malicious intent, OpenSSF Scorecard focuses on analyzing packages’ software development practices and known vulnerabilities. OpenSSF Scorecard can also assign a numerical score to the risk a package poses to its potential users, a step that GuardDog currently does not take.
Scanning and triage architecture
GuardDog is the crucial component of our effort to continuously monitor PyPI and npm for malicious packages, which we began in early 2023. The fruit of this effort is a public dataset of malicious packages, each reviewed and confirmed to be malicious by a researcher, as well as a set of case studies and data that give us significant insight into the objectives and the tactics, techniques, and procedures (TTPs) of open source threat actors. Here, we give a brief overview of how we monitor PyPI and npm with GuardDog.
Like needles in the proverbial haystack, malicious packages published to these registries are sitting there just waiting to be discovered, catalogued, and added to our dataset. Our process for collecting them resembles the pipeline illustrated in the above figure. Starting at the widest part of the pipeline, we ingest every new version of every package published to PyPI and npm and give them to GuardDog to scan. For a sense of scale here, an average of around 20,000 new releases are published every day across these two ecosystems.
Based on GuardDog’s findings, we filter packages in the next part of the pipeline to identify the potentially malicious ones among them for further consideration. Some aspects of filtering are clear enough (e.g., packages with no GuardDog findings should be excluded), but choosing which packages to select is more art than science. Our approach is to select for combinations of GuardDog findings that speak to concrete adversarial objectives. For instance, code attempting to steal cryptowallets would need to access sensitive system files _and_ exfiltrate the data found there. We’ve created GuardDog rules to identify these code behaviors and select for packages that exhibit both.
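The idea of selecting for combinations of findings can be sketched in a few lines of Python. This is a simplified illustration, not the filter used in the pipeline described here. The rule names and the objective groupings below are hypothetical and do not correspond to GuardDog's real rule identifiers.

```python
# Hypothetical rule names, grouped by the adversarial objective they jointly suggest.
OBJECTIVES = {
    "credential-theft": {"reads-sensitive-files", "exfiltrates-data"},
    "remote-payload": {"runs-at-install-time", "downloads-executable"},
}


def matched_objectives(findings: set[str]) -> list[str]:
    """Objectives for which every required finding is present."""
    return sorted(
        name for name, required in OBJECTIVES.items() if required <= findings
    )


def should_triage(findings: set[str]) -> bool:
    """Send a package to manual review only if some objective is fully matched."""
    return bool(matched_objectives(findings))
```

Under these assumptions, a package that only reads sensitive files is not selected, while one that both reads sensitive files and exfiltrates data is passed on for review.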
The packages that make it through the filter are finally presented for triage to a security researcher, who identifies which ones are indeed malicious. This manual review step is critical to our success in producing a high-quality dataset, which includes nearly 5,000 distinct samples at the time of this writing. It also underscores the nature of GuardDog’s findings as mere indicators of potentially suspicious or malicious behavior rather than conclusive proof of it.
Our effort to continuously monitor open source package ecosystems with GuardDog is very similar to OpenSSF’s own Package Analysis project, which combines static and dynamic analysis methods to identify malicious packages. While dynamic analysis does figure into our package triage workflow on an ad hoc basis, our GuardDog scanning and triage architecture is fundamentally designed around static code analysis.
A year in open source malware detection with GuardDog
We’ve seen some pretty interesting case studies in the course of monitoring these open source ecosystems with GuardDog. We’ve also gathered some fascinating data-driven insights about the TTPs that open source threat actors tend to favor.
In 2024, we identified two campaigns using npm as a vehicle for malware delivery from threat actors associated with the Democratic People’s Republic of Korea, both of which landed on our radar thanks to GuardDog. The first campaign, from a threat actor we have dubbed Stressed Pungsan (we associate nation-state actors with their national dog breeds), used a Trojanized npm package as its initial access vector, taking advantage of a preinstall command hook provided by npm to download and execute a suspicious dynamic link library (DLL) on the victim system. The second campaign, courtesy of Tenacious Pungsan, used backdoored copies of several prominent npm packages to distribute an obfuscated variant of Beavertail malware, first observed in conjunction with the highly publicized Contagious Interview campaign targeting tech-sector job seekers.
Obfuscated Beavertail malware in a Tenacious Pungsan-associated npm package
These campaigns illustrate several TTPs that we routinely observe from open source threat actors. A significant majority of packages we’ve identified this year, including the Stressed Pungsan packages, use install-time hooks exposed by the package manager to achieve code execution. These threat actors regularly target software developers to steal their cryptowallets, privileged API keys, and other sensitive data; the pervasive use of these hooks in open source malware speaks to this broader trend. Other common TTPs include code obfuscation and the occurrence of shady-looking domains or hardcoded IP addresses in malware payloads.
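For npm, install-time hooks are declared in the `scripts` section of a package's `package.json` under the `preinstall`, `install`, and `postinstall` keys. The following Python sketch lists the commands a package would run at install time. It is a minimal illustration rather than GuardDog's detection logic, and the function name is an assumption made for the example.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_commands(package_json_text: str) -> dict[str, str]:
    """Map each install-time hook declared in a package.json to its command."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    if not isinstance(scripts, dict):
        return {}  # malformed manifest: nothing to report
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}
```

Many legitimate packages use these hooks, for example to compile native extensions, so their presence is a reason to look at the command being run and is not by itself a sign of malice.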
There is a lot to learn about attacker behaviors from the malware we find and flag, but there is just as much to learn from the malware we miss. We constantly improve our detection rules using the malicious packages found by other members of the software supply chain security community, including and especially OpenSSF itself. In sharing what we’ve learned about malicious package detection with GuardDog, we hope to add to the conversation about improving software supply chain security and invite community feedback and interest in this unique problem space.
Conclusion
Open source package repositories are getting more and more attention from threat actors due to their ease of use and low barrier to entry for attacks. As such, it is becoming an increasingly challenging task for developers to stay protected.
For this reason, Datadog released Supply-Chain Firewall, a drop-in command line wrapper around pip and npm that can keep you from installing malicious packages in these ecosystems. It makes use of our public malicious packages dataset, which is updated daily, and advisories published on OSV.dev to prevent known malware from being installed in your environment.
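As a rough idea of what checking a package against OSV.dev can look like, the Python sketch below builds the JSON body for OSV's query endpoint (`POST https://api.osv.dev/v1/query`) and picks out malicious-package advisories, whose identifiers start with `MAL-`, from a response. It does not show how Supply-Chain Firewall is implemented. The function names are assumptions, and sending the request is left out so that the example stays self-contained.

```python
def build_osv_query(name: str, version: str, ecosystem: str) -> dict:
    """JSON body for OSV's /v1/query endpoint; ecosystem is e.g. "PyPI" or "npm"."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def malware_advisory_ids(osv_response: dict) -> list[str]:
    """IDs of malicious-package advisories (prefix "MAL-") in an OSV query response.

    OSV returns an empty object when nothing matches, so "vulns" may be absent.
    """
    return [
        vuln["id"]
        for vuln in osv_response.get("vulns", [])
        if vuln.get("id", "").startswith("MAL-")
    ]
```

A wrapper built along these lines could refuse to proceed with an installation whenever `malware_advisory_ids` returns a non-empty list for the requested package version.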
And for those interested in vulnerabilities as well as malicious indicators, using OpenSSF Scorecard alongside GuardDog could be a fruitful approach for obtaining a more comprehensive picture of the risks associated with using an open source package.
Visit our repos to learn more and get involved with GuardDog and Supply-Chain Firewall. Both are also available as (safe, we promise) PyPI packages.
We thank OpenSSF for the opportunity to share our work. Stay safe!
- GuardDog: GitHub PyPI
- Supply-Chain Firewall: GitHub PyPI
- Malicious packages dataset: https://github.com/datadog/malicious-software-packages-dataset
About the Authors
Ian Kretz is a Security Researcher specializing in software supply-chain security at Datadog, where he maintains the open-source projects GuardDog and Supply-Chain Firewall. Ian previously worked on software supply-chain security, as well as applied cryptography and formal methods, at the MITRE Corporation.
Sebastián Obregoso is a security researcher at Datadog, where he works on supply chain security, maintains open source tools such as GuardDog, and contributes to the malicious-packages-dataset, among other projects. He holds a bachelor’s degree in Systems Engineering and has 8 years of experience in software development and another 13 years in security, during which he has led areas focused on Cloud and Application Security.