All Posts By

OpenSSF

Taking Stock of the State of European Cyber Resilience Act (CRA) Compliance: An Urgent Wake-up Call for the Open Source Ecosystem

By Blog, EU Cyber Resilience Act, Global Cyber Policy

By Christopher (CRob) Robinson, OpenSSF

For the better part of two years, discussions surrounding the European Cyber Resilience Act (CRA) have been somewhat theoretical: mapping requirements, debating definitions, and analyzing how the requirements will impact our amazing ecosystem. But folks, it’s mid-2026, and the CRA is live. Theory is officially in the rearview mirror as implementation milestones roll out over the next two years. 

I’ve just finished reviewing the finalized 2026 CRA Awareness and Readiness Report, a joint effort with LF Research experts, and to be blunt, the results are a sobering reality check. Despite tireless community work, the broader ecosystem is far from ready for CRA compliance.

CRA Awareness Has Stalled 

The most disappointing finding is that awareness surrounding this regulation has decreased year-over-year. Today, 66% of respondents remain unfamiliar with the CRA, a slight increase from 62% in 2025. That means a growing portion of the software ecosystem is unaware of a regulation with global consequences and hefty fines. 

The geographic disparity is even more alarming. In the United States and Canada, nearly 72% of respondents are unfamiliar with the regulation. It cannot be overstated: if you are a North American company selling software products into the EU market, you are legally required to comply with the CRA. Yet the majority of the neighborhood is still walking unprepared toward a September 2026 reporting deadline.

Why the “Consume and Forget” Model is No Longer Possible

For years, organizations have treated open source like a free lunch: grabbing code and assuming the lights are being kept on by someone else. Under the CRA, that posture is no longer tenable. Manufacturers now bear the legal responsibility for the security of the components they integrate. For some (read: most) this is a stark wake-up call.

Despite that, 51% of manufacturers still passively rely on upstream projects for security fixes. In the new world of the CRA, “passive” is a level 10 risk.

Private Forks Are Not the Answer (They’re Worse) 

Many of you have tried to dodge the upstream journey by maintaining private forks, but inefficient code is still inefficient code, and now we have the bill to prove it. The report shows that maintaining private workarounds is a massive form of technical debt, costing organizations an average of $258,000 in labor every single release cycle. With some release cycles as short as a matter of hours, these costs can quickly get out of hand. 

For large organizations (5,000+ employees), this burden exceeds 11,152 labor hours per cycle. Maintaining these divergent codebases is a giant bill for a strategy that actually makes supply chain transparency worse. Contributing fixes upstream isn’t just being a “good neighbor” – it’s the only financially rational path forward.

For the last several years, the OpenSSF community has observed traditional vulnerability disclosure systems buckling under the volume of discoveries being reported through them. Data from the report points to a 394% increase in Common Vulnerabilities and Exposures (CVEs) and an 811% spike in vulnerabilities that fall within the High+ severity categories in the first quarter of 2026. Several factors contribute to this trend:

  • Transparency: Open source is open and transparent, which means the community cannot hide vulnerabilities behind opaque processes or paywalls. 
  • Project Growth: Year-over-year we’re seeing an explosion of MORE open source projects.
  • Ubiquity: Open source is quite literally the majority of software used globally. 
  • AI Tools: More users are leveraging Large Language Models (LLMs) and other tools to explore and analyze software. The transparency of open source software offers a low barrier to entry for those using these new tools to test code.

Globally, regulations like the CRA are codifying long-standing security guidance into law. This shifts security from a “nice-to-have” recommendation to a legal requirement backed by heavy non-compliance fines. 

How Does Upstream Investment Improve Your Security Posture? 

On the bright-ish side, the data reveals a clear correlation: organizational diversity is a strong predictor of a project’s security posture. When more organizations invest in a project, that project becomes more resilient, making upstream investment a direct catalyst for your own compliance posture. Organizations have an important role in their own security health through their participation in open source projects.

However, the participation of small and medium-sized enterprises (SMEs) is crucial to the entire ecosystem: they are the backbone of the industry. Currently, over half of European SMEs remain unfamiliar with the CRA, creating a significant gap in project diversity. Directed investment in SME engagement is essential to prevent compliance from becoming a structural barrier to innovation. By funding the support and tools these smaller players need to remain compliant, we ensure the entire upstream supply chain remains robust and competitive.

What OpenSSF Resources Can Help Organizations Prepare for the CRA? 

While we wait for the full 2026 report to drop, the tools to succeed already exist. Our previous research, Unaware and Uncertain: The Stark Realities of Cyber Resilience Act Readiness in Open Source, highlighted these same gaps a year ago. It’s time to start acting, and practitioners who find our resources rate them highly:

This ecosystem is rife with the talent and the collaborative instincts to meet this challenge. The December 2027 deadline is a forcing function, but it’s also an opportunity to build a software supply chain that is actually secure by design.

Europe is leading the way in protecting consumers globally. Despite our geographic distance in the U.S., the oceans between us all do not provide isolation from this regulation any longer. Software and products with digital elements are built with hardware, software, and firmware created through international collaboration. That fact feeds the global economy and makes manufacturers globally responsible for CRA adherence. Events that happen “over there” DO truly affect everyone.  

The results of the CRA research conducted with our peers in LF Europe are truly grave. A significant amount of work and collaboration has occurred across the ecosystem since CRA enforcement. It is shocking to look back at all this work done by both the OpenSSF and its partners and see that 39% of manufacturers, who have BILLIONS of euros at stake in potential non-compliance penalties, are still unaware and uncertain about their requirements.

The next stage in our shared journey together unfolds in September 2026 when the vulnerability reporting obligations are enforced. There is not much time to prepare. Organizations have a narrow window to audit their upstream dependencies and establish the processes needed to report and patch new vulnerabilities as they emerge. The more complex aspects of the CRA are currently a year out, coming due December 2027. Please, take action today to protect yourselves, your companies, the upstream maintainers on whom you depend, and your customers.

The OpenSSF encourages everyone who benefits from open source software to consider the beauty and complexity of the open software world. Every day in software repositories, chat channels, and mailing lists a talented cohort of developers co-engineer the tools you use and love. We ask that organizations and their leaders understand that free software is NOT free. Being a responsible consumer and participant in the ecosystem creates benefits for everyone. With CRA in our midst, there is ample opportunity to make this shared space better and more secure for everyone. My hope is that we can rise to that opportunity.

Stay Ahead of the CRA

Be the first to read the 2026 CRA Research Report. Subscribe to our newsletter for an alert when it releases the week of June 9 (European Open Source Security Forum in Brussels).

Get involved with the OpenSSF Global Cyber Policy Working Group.

About the Author

Christopher Robinson (aka CRob) is the Chief Technical Officer and Chief Security Architect for the Open Source Security Foundation (OpenSSF). With over 25 years of experience in engineering and leadership, he has worked with Fortune 500 companies in industries like finance, healthcare, and manufacturing, and spent six years as Program Architect for Red Hat’s Product Security team.

Hack to the Future: The Impact and Legacy of the DARPA AIxCC Challenge

By AI, Blog, Global Cyber Policy, Guest Blog

By Helen Woeste

AIxCC Competition Background & Results: 

In 2023, DARPA announced a two-year competition called the Artificial Intelligence Cyber Challenge (AIxCC) with the goal of safeguarding open source software used in critical infrastructure throughout America. The intent was to hasten the development of open source AI tooling that can assist developers with finding and fixing bugs in live software with minimal cost. Open source is a drastically underfunded and under-resourced form of infrastructure. It therefore presents an exciting, practical target and opportunity for the research and development of AI in cybersecurity. Additionally, open source’s publicly observable code is ideal for competition and collaboration.

AIxCC was run in collaboration with ARPA-H and supported with contributions from Anthropic, Google, Microsoft, and OpenAI, with additional consulting around open source provided by the Linux Foundation and the Open Source Security Foundation (OpenSSF). This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA). The competition consisted of two rounds, the Semifinal Competition (ASC) and the Final Competition (AFC), where cash prizes from a pot of $30,500,000 were distributed. For the ASC, 42 team submissions were accepted across two tracks: the Open Track and the Small Business Track, which required an additional technical paper submission. The top seven teams moved forward to the AFC, which was set up to mimic a real-world CI/CD pipeline. The scoring algorithm was also designed to highlight behaviors that would make the competing systems more useful to developers. At the conclusion of AFC, the top three teams were Team Atlanta, Trail of Bits, and Theori.

For the AIxCC competition, real open source projects were selected, and their code was forked and then modified to insert artificial bugs for the Cyber Reasoning Systems (CRS) to discover and fix. However, during the execution of the competition, the CRSs discovered several real potential bugs alongside the artificial ones. This introduced the issue of how to triage and manage resolution of fixes in the projects. OpenSSF engaged a third-party open source security organization, the Open Source Technology Improvement Fund (OSTIF), to help close out the bugs identified as a result of the AIxCC competition.

OSTIF selected the team at Ada Logics for their extensive experience working with open source fuzzing, bug verification, and disclosure. With a list of potential bugs identified through the course of the competition, Ada Logics was tasked with securely submitting verified issues, ensuring that anything reported to open source project maintainers was a proven bug. The Ada Logics team was able to reproduce and confirm twenty-seven issues after multiple rounds of testing and continued coordination between AIxCC competitors, collaborators, and contributors. CRS teams, including Team Atlanta, Team Buttercup, Team FuzzingBrain, Team Shellphish, Team Theori, Team 42-b3yond-6ug, and Team Lacrosse, working together with Kudu Dynamics and the OpenSSF, continued to collaborate and meet with OSTIF around the disclosures to ensure total accuracy of the reported issue’s testing and resulting decision around disclosure. 

It was of utmost importance that any and all real bugs detected during the competition were verified before alerting the project maintainer to the issue. This is to differentiate how the competition reports issues to projects from the low-quality reports plaguing open source maintainers today. In several cases, CRS-generated patches were submitted alongside bugs, an offering to project maintainers looking to quickly resolve the finding. Additionally, feedback was sourced from the projects around their experience as a target in the competition as well as the disclosure procedure following. 

The Findings:

Teams discovered twenty-seven candidate real-world issues during the competition, and OSTIF engineers were ultimately able to replicate all of the draft bugs. The affected projects were cURL, shadowsocks-libev, healthcare-data-harmonization, hertzbeat, little-cms, and mongoose. Once the bugs were identified, the hard work of fixing them began, with CRS tooling performing the second half of its double duty to find and fix security issues.

However, some of the findings did not meet a level of security concern for various reasons. Some issues were fixed by code changes in the projects during the period between the competition and when engineers reproduced them. Others were outside of the threat model of the project and did not meet the criteria needed to incorporate into the project (for example, the Apache POI project threat model states “Expect any type of Exception when processing documents,” making any exception-based findings non-issues). One issue had actually already been found by OSS-Fuzz, but the project hadn’t fixed it yet.

Ultimately, the Cyber Reasoning Systems in this competition discovered and fixed interesting findings, and they found a lot of valid issues. Further, some projects had introduced fixes before the bugs were reported. This is likely because the AIxCC teams submitted the fuzzing harnesses to the projects before triage had taken place, and those harnesses re-discovered the same bugs before triage had completed. One significant lesson learned from this is that cyber reasoning systems may benefit from doing self-triage when discovering potential issues by checking against the project’s documentation and understanding the types of issues that the project accepts as security bugs that need to be addressed.
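To make the self-triage idea concrete, here is a minimal sketch of a pre-report filter. Everything in it is an assumption for illustration: the `Finding` type, the `OUT_OF_SCOPE` table, and the finding labels are hypothetical, and a real CRS would need to derive the out-of-scope rules from each project's actual documentation rather than from a hand-written dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A candidate issue produced by a CRS (hypothetical shape)."""
    project: str
    kind: str                    # e.g. "uncaught-exception", "heap-overflow"
    already_fixed: bool = False  # True if the bug no longer reproduces upstream


# Hypothetical, hand-written summary of what each project documents as
# outside its threat model. The Apache POI entry mirrors the example above.
OUT_OF_SCOPE = {
    "apache-poi": {"uncaught-exception"},
}


def self_triage(finding: Finding) -> str:
    """Decide whether a finding is worth a maintainer's time."""
    if finding.already_fixed:
        return "skip: fixed upstream"
    if finding.kind in OUT_OF_SCOPE.get(finding.project, set()):
        return "skip: outside threat model"
    return "report"
```

A filter like this only covers two of the cases described above (already fixed, and outside the threat model); checking for duplicates of issues that other tools have already found would need an additional data source.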

Conclusion & Looking Forward:

The AIxCC program was a massive undertaking by dozens of organizations, all working to contribute back to open source security in a meaningful way using novel AI tooling. The competition was mindfully designed and carried out, with attention given towards the open source projects and maintainers, the wide variety of competitors and interests, and the impact of the competition itself on the industry all the way down to the maintainers. 

OpenSSF is the home for extended collaboration on these new open source tools through its newly formed Cyber Reasoning Systems Special Interest Group. OSS-CRS and FuzzingBrain, two open source projects that emerged from the competition, are now hosted at OpenSSF in the Linux Foundation. A third tool applied and was accepted to the OpenSSF, and has a few remaining steps before the official transition. The group aims to foster their development and adoption, and to establish best practices that help projects use CRSs effectively and responsibly.

This work is already producing real results. For example, FuzzingBrain has since turned its AI-assisted fuzzing system on the broader open source ecosystem, discovering sixty-two vulnerabilities across twenty-six projects, from CUPS and Apache Avro to Ghidra and OpenLDAP, with forty-three confirmed by maintainers and thirty-six already patched upstream. 42-b3yond-6ug has expanded its CRS to uncover twelve vulnerabilities in the Linux kernel and related components, plus ten zero-day vulnerabilities in userspace projects including Eclipse Mosquitto and OpenLDAP. The team is also developing a platform to support more efficient training and evaluation of models and agents, with a release expected soon. Using OSS-CRS, Team Atlanta discovered twenty-five vulnerabilities across sixteen projects spanning a broad range of software including PHP, U-Boot, memcached, and Apache Ignite 3. Of those, nine have been fixed and eight more have been confirmed with fixes in progress.

The future of AI assisting maintainers in finding and fixing security vulnerabilities is bright. The challenges raised by the AIxCC competition already have solutions being developed in open source, such as LLM-based tools that build threat models by looking at the data-flow of projects, and AI agents that triage findings against threat models and documentation before reporting issues. As these tools all continue to develop, they will harmonize into reliable solutions that maintainers can use to elevate their security with far less effort than today.

Our gratitude to the folks at Ada Logics for triaging the potential bugs and working hard to reproduce the issues so maintainers didn’t have to, OpenSSF for trusting us to bring together all of the stakeholders to work on the issues together, DARPA and ARPA-H for holding the AIxCC competition and sponsoring this work, the teams that built the Cyber Reasoning Systems for the competition, Kudu Dynamics for their support in confirming the findings, and all of the maintainers that worked with us to resolve the issues.

OpenSSF and OSTIF will continue to support this kind of work by serving as human connectors between CRS tools and open source communities. The goal is to help triage and validate vulnerability reports and proposed patches before they reach maintainers, ensuring findings are accurate, actionable, and respectful of maintainers’ time.

Organizing a competition of this scale on behalf of open source maintainers and its end users takes both enormous collaboration and individual effort. Understanding the communities involved and building lightweight programs that shield maintainers from headaches while strengthening security is the best possible outcome for the ecosystem. It took everyone coming together to make this happen, and ongoing efforts will bring valuable, low-cost, low-maintenance tools to everyone and make us all safer.

As AI moves forward at breakneck speed, innovative work like this highlights how you can move fast and build things together for a better tomorrow. 

Author Bio

Helen Woeste joined OSTIF in 2023, coming from a decade of work experience in the restaurant and hospitality industries. With a passion (and degree) for writing and governance structures, Woeste quickly transitioned into an operations and communications role in technology. 

 

The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Distribution Statement “A” (Approved for Public Release, Distribution Unlimited)

Open Infrastructure Is Not Free, Part II: The Hidden Cost of Running Package Registries

By Blog

The September 2025 Working Together Towards Sustainable Open Source open letter raised the alarm about the economic sustainability of open source package registries, highlighting how rising adoption and the pace of innovation are placing new and growing pressures on them. Those pressures have only accelerated since the letter, amplified by the adoption of AI coding agents and tools.

But what are the real economics of an open source package registry? Beyond obvious infrastructure costs, there’s significant, often invisible work required to keep registries running, maintained by a small number of staff and volunteers. It’s more than just uploads and downloads. It’s strengthening security as threats evolve, continuously improving the developer experience, and more.

To ensure long-term sustainability, the registries have formed a Sustaining Package Registries Working Group hosted by the Linux Foundation to collaborate on and share community-aligned strategies and offerings. The right set of strategies will vary by registry and evolve over time, and some registries have already rolled out new approaches.

Behind the Scenes of a Package Registry

Registries today run primarily on two things: (1) infrastructure donations and credits; and (2) heroic efforts from small paid teams (themselves funded by donations and grants) and unpaid volunteers that operate and maintain registry services. The bulk of donations and grants comes graciously from a small set of donors who care about the value of package ecosystems, but even these donations don’t scale with demands on the registries.

The core job of a registry is to accept packages from open source publishers and make them available for consumers to download: simple in concept, demanding in practice. We expand on the behind-the-scenes jobs below.

Scale Drivers for Registries

For a sense of scale, the registries – npm, Maven Central, PyPI, Crates.io, RubyGems, Open VSX, Packagist, Hex, CPAN and more than a dozen others – will serve over 10 trillion open source package downloads in 2026 as the headwaters of the world’s software supply chains. 

That’s more than one billion downloads per hour or just under double the predicted number of Google searches that will be run in 2026. 
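The per-hour figure follows directly from the annual one. As a quick check, this sketch divides the 10 trillion floor from the text by the number of hours in a non-leap year (2026 is not a leap year); the variable names are our own.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year

# "Over 10 trillion" downloads per the text, so this is a lower bound.
annual_downloads = 10_000_000_000_000

downloads_per_hour = annual_downloads / HOURS_PER_YEAR

# Works out to roughly 1.14 billion downloads per hour.
print(f"{downloads_per_hour:,.0f} downloads per hour")
```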

That 10 trillion number is incomprehensibly large but believable. Modern applications contain not just some open source but hundreds of dependencies often spanning language ecosystems, e.g., your Python package manager may be written in Rust, or the continuous integration system for your Java application may be written in Ruby.

The consumption side of the registry ecosystem – those 10 trillion downloads – is part of the pressure on sustainability, and it’s tempting to look at a “click charge” as part of the solution.  

Nonetheless, the scale of adoption and commercial use places a significant infrastructure and human load on the registries to the tune of millions of dollars per year of CDN, infrastructure, and labor.

The AI Boom Presents Big Challenges

Beyond adoption driving downloads, AI is another scale driver, amplifying both legitimate and malicious activity. AI is accelerating the rate of consumption and production of open source, pushing registry management beyond the scale of human action or oversight. (The difference between Python’s 2025 report and 2026 trajectory is striking: PyPI added 130,000 new packages in 2025, nearly matching its total of 140,000 in 2018, and the registry is adding nearly 900 packages per day in 2026.)
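To put the PyPI numbers on a common footing, the sketch below converts both to the same units. It treats "nearly 900 packages per day" as a flat 900 and assumes that pace holds for a full year, so the annualized figure is a rough upper estimate rather than a forecast.

```python
DAYS_PER_YEAR = 365

packages_added_2025 = 130_000  # from the text
daily_rate_2026 = 900          # "nearly 900" per the text, rounded up

# Average daily pace in 2025: about 356 packages per day.
daily_rate_2025 = packages_added_2025 / DAYS_PER_YEAR

# If the 2026 pace held all year: 328,500 new packages.
annualized_2026 = daily_rate_2026 * DAYS_PER_YEAR

# Roughly 2.5 times the 2025 total.
growth_factor = annualized_2026 / packages_added_2025

print(f"2025 pace: {daily_rate_2025:.0f}/day")
print(f"2026 annualized: {annualized_2026:,} ({growth_factor:.1f}x 2025)")
```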

Registries also play a front-line role in supply chain security, keeping malware and vulnerabilities from entering the open source ecosystem, and the “good guys” aren’t the only ones using AI.

Attackers are using AI to create and ship more novel, more difficult-to-detect attacks more quickly. The community is still in the thick of Shai-Hulud and other recent supply chain attacks like the ones on Trivy and LiteLLM, and even more recently, the Axios compromise, which demonstrated how AI and social engineering are converging. But going back to 2021, where we can get a full picture of the end-to-end cost of a significant vulnerability, remediating the Log4Shell (CVE-2021-44228) vulnerability consumed around 10% of a year’s enterprise security effort across the industry.

Complexity Drivers for Registries

With some background on scale and AI drivers, let’s dive into the high-level jobs to be done by a registry, and it’s a long list. No registry does all of these jobs in depth today. Depending on scale, some jobs require fractional or sporadic attention while others, like site reliability engineering, might require a team.

  • Identity and Access Management: Managing publisher identities, credentials, permissions, and audit logs is essential to secure package publication and support incident response.
  • Namespace and Ownership Management: Protecting namespaces and defining publisher, maintainer, and owner roles helps prevent abuse such as brandjacking and typosquatting.
  • Package Ingestion and Validation: Registries must store packages, index metadata, and validate elements like structure, licensing, and signatures to ensure quality and trust.
  • Supply Chain Security and Risk Management: Registries help secure the supply chain by blocking, flagging, quarantining, or removing vulnerable or malicious packages and surfacing risk in package metadata.
  • Registry Security: Registries require continuous hardening, review, and monitoring because a compromise could put the entire ecosystem at risk.
  • Registry Availability: Maintaining reliable publication, discovery, and consumption services requires strong monitoring, alerting, and operational support to minimize downtime.
  • Package Discovery, Search, and Evaluation: Consumers need robust search, filtering, and quality signals to find relevant packages and assess their health and ecosystem importance.
  • Consumption, Distribution, and Mirroring: Registries must deliver packages efficiently through scalable infrastructure while keeping caches and clients aware of upstream changes such as vulnerabilities and new versions.
  • Governance, Policy, and Community Support: Operating a registry requires clear policies, transparent enforcement, and ongoing legal and community support as global regulations evolve.
  • Observability, Analytics, and Ecosystem Insights: Registries provide unique visibility into publishing and consumption patterns, enabling insights that publishers and consumers often cannot gather on their own.
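As one small illustration of the namespace job in the list above, here is a sketch of a typosquatting check that flags newly submitted names that closely resemble protected ones. It is not how any particular registry does this: the `PROTECTED` list, the separator-stripping rule, and the 0.85 threshold are all assumptions chosen for the example, and it uses Python's standard `difflib.SequenceMatcher` as a generic similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical set of well-known names a registry wants to protect.
PROTECTED = ["requests", "numpy", "pandas"]


def canonical(name: str) -> str:
    """Lowercase and drop separators so 're_quests' compares like 'requests'."""
    return name.lower().replace("-", "").replace("_", "").replace(".", "")


def lookalikes(candidate: str, threshold: float = 0.85) -> list[str]:
    """Return protected names that the candidate closely resembles."""
    hits = []
    for known in PROTECTED:
        if candidate == known:
            continue  # the protected package itself is not a lookalike
        score = SequenceMatcher(None, canonical(candidate), canonical(known)).ratio()
        if score >= threshold:
            hits.append(known)
    return hits


print(lookalikes("reqeusts"))  # transposed letters
print(lookalikes("flask"))     # unrelated name
```

A check like this would only be a first pass. Deciding what happens to a flagged name (block, hold for review, allow) is a policy question that falls under the governance job in the same list.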

Sustainability Call to Action

With massive traffic, a mountain of hard work to do, supply chain attackers at the gates, and a mission to keep access for individuals open and free, the registries need funding and paid services or models that scale with the demands. The way to get there is to bring commercial users and ecosystem stakeholders to the table as paying customers.

The Sustaining Package Registries Working Group is bringing registry leaders together to define what sustainable operation looks like across funding, operations, and transparency.

Now the ecosystem needs to meet that moment. The companies that depend on these systems must help sustain them so the next generation of software can be built on infrastructure that is not just open, but resilient.

Alpha-Omega 

Continuous Delivery Foundation (CDF)

Eclipse Foundation (OpenVSX)

OpenJS Foundation

Open Source Security Foundation (OpenSSF)

Linux Foundation

Packagist (Composer)

Perl and Raku Foundation

Python Software Foundation (PyPI)

Ruby Central (RubyGems)

Rust Foundation (crates.io)

Sonatype (Maven Central)

Why Third-Party Notices Are Breaking at Scale: What the Ecosystem Needs Next

By Blog, Guest Blog

By Devashri Datta, Independent Researcher, Software Supply Chain Security

Third-party notices (TPNs) are documents distributed to users that list open source third-party software components included in the product and key licensing information. Every time you buy a TV or router, you’ve probably seen them. Yet TPNs were never designed for the complexity, scale, and velocity of today’s software ecosystem. TPNs are one of the most widely distributed and yet least understood artifacts in modern software supply chains.

Inside nearly every appliance, firmware image, SaaS platform, and enterprise distribution, the same pattern persists: a long, unstructured PDF is expected to represent the full scope of open source license compliance.

As software systems have scaled, TPNs have quietly become a critical but increasingly fragile pillar. They are now failing technically, operationally, and structurally under the demands of modern development and distribution.

This article examines why TPNs are breaking. It also outlines what the ecosystem must do next based on large-scale analysis of real-world TPN documents and the development of an automated framework for extracting information directly from them. While traditionally viewed as compliance artifacts, Third-Party Notices (TPNs) also represent an underutilized source of security-relevant intelligence. In many real-world scenarios where Software Bills of Materials (SBOMs) are incomplete, unavailable, or restricted, TPNs may provide the only observable evidence of component usage. This positions TPNs as a critical input to software supply chain security workflows, including vulnerability management, third-party risk assessment, and incident response.

The Hidden Reality: TPNs Are the Supply Chain’s Last Mile

Despite advances in Software Bill of Materials (SBOM) formats such as SPDX and CycloneDX, TPNs remain:

  • The only compliance artifact that many vendors publicly distribute
  • The only artifact available to customers or regulators for proprietary systems
  • The only verifiable attribution record when source code and SBOMs are inaccessible

SBOMs provide structured visibility into software components, but their completeness depends on the generation methods and the availability of build-time data. In practice, SBOMs may not consistently capture full transitive dependencies or runtime-resolved components. In some cases, additional components and licensing details may appear in downstream artifacts such as third-party notices (TPNs), though these are typically not integrated into SBOM analysis pipelines. SBOM availability also varies across organizations and products and may not always be accessible to end users or external stakeholders due to policy or regulatory interpretation. Regulatory frameworks such as the EU Cyber Resilience Act (CRA) are evolving, and expectations around SBOM scope and disclosure remain subject to interpretation. As a result, relying solely on SBOM data may not provide complete visibility into whether a product contains a specific vulnerable component, depending on SBOM completeness and related artifact availability.
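One way to see the visibility gap described above is to compare the component names an SBOM lists against the names that appear in the corresponding TPN. The sketch below assumes both have already been reduced to plain lists of names, which is the hard part in practice; the function and key names are our own.

```python
def _key(name: str) -> str:
    """Normalize a component name for comparison (case and whitespace only)."""
    return name.strip().lower()


def coverage_gap(sbom_components, tpn_components):
    """Report components that appear in one artifact but not the other."""
    sbom = {_key(c) for c in sbom_components}
    tpn = {_key(c) for c in tpn_components}
    return {
        "only_in_tpn": sorted(tpn - sbom),   # disclosed in notices, missing from SBOM
        "only_in_sbom": sorted(sbom - tpn),  # in SBOM, missing from notices
    }


gap = coverage_gap(["zlib", "OpenSSL"], ["openssl", "libpng "])
print(gap)
```

Name-only matching is deliberately coarse here. Different ecosystems name the same component differently, so a real comparison would likely need an identifier scheme that both artifacts share.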

In practice, TPNs often serve as the last mile of compliance visibility, bridging internal software composition and external disclosure.

However, TPNs were never designed to operate at the scale or complexity of today’s supply chains.

Security Blind Spot in Software Supply Chains

While SBOMs and software composition analysis (SCA) tools have improved visibility during development, they assume access to structured or source-level data. In contrast, TPNs often represent the only externally available artifact in downstream consumption environments such as embedded systems, firmware, and proprietary SaaS distributions.

This creates a structural blind spot in software supply chain security: security teams are frequently forced to make risk decisions without machine-readable component intelligence. As a result, vulnerability exposure, dependency risk, and third-party software usage often remain partially or completely unobservable at the point of consumption.

Why the TPN Ecosystem Is Breaking

PDFs Are an Anti-Pattern for Machine-Readable Compliance

Most TPNs are distributed as large, heterogeneous PDFs containing:

  • Multi-column layouts
  • OCR artifacts and noise
  • Inconsistent license formatting
  • Duplicated or truncated license text

TPNs often omit component identifiers and lack specific version numbers for components.

PDFs are optimized for display, not structured data. As a result, extracting meaningful compliance information programmatically is extremely difficult.

Existing Compliance Tools Don’t Address the Problem

Current tools such as FOSSology, ScanCode, and ORT are designed to analyze source code or binaries—not TPN documents. Yet in many real-world scenarios, especially audits or vendor reviews, TPNs are the only artifact available.

This creates a fundamental gap: The most widely distributed compliance artifact is the least analyzable.

Inconsistent Generation Pipelines Lead to Data Drift

TPNs are generated through highly variable processes:

  • Custom scripts
  • Proprietary internal tooling
  • Manual aggregation from legacy systems
  • Partial or outdated SBOM exports

As a result, even TPNs from the same organization can vary significantly across releases, introducing inconsistencies, omissions, and misalignment with actual dependencies.

Scale Has Outpaced Human Review

Modern TPNs often span hundreds of pages across multiple license families and components.

Manual review has become increasingly impractical due to:

  • Repetitive license text
  • Poorly structured component mappings
  • Lack of contextual metadata
  • Hidden obligations within large text blocks

Compliance teams are effectively being asked to analyze documents at a scale that exceeds human capability.

Proposed Contribution: TPN-to-Security Intelligence Framework

This work introduces a systematic framework for transforming Third-Party Notices (TPNs) from unstructured compliance artifacts into structured security intelligence inputs. The framework addresses a critical gap in software supply chain security: the absence of machine-readable component visibility in downstream and vendor-distributed environments.

Unlike traditional software composition analysis tools that rely on source code, build artifacts, or SBOMs, this approach operates on TPNs as a primary data source. It enables the extraction, classification, and interpretation of software components and license obligations from highly unstructured documents.

The key contribution of this work is the demonstration that TPNs can be operationalized into actionable security intelligence for:

  • Vulnerability exposure identification when SBOMs are unavailable
  • Third-party risk assessment using externally visible artifacts
  • Incident response prioritization based on inferred component usage
  • Governance and compliance enforcement through structured outputs

Breaking the Logjam: Toward Automated License Intelligence

To address this systemic gap, I developed an automated end-to-end framework that treats TPNs as primary compliance artifacts, rather than secondary documentation.

The approach enables structured extraction and interpretation of license intelligence directly from unstructured documents. While TPNs may lack some information, they still provide valuable signals. For example, even without version identifiers, knowing that a product includes a component can be very valuable (e.g., when asking “which products contain a version of log4j that might be vulnerable to this attack?”).
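A minimal sketch of that kind of lookup, assuming the extraction step has already produced a mapping from product names to the component names found in each TPN. The product names and data are hypothetical, and this is an illustration rather than the framework's actual code. Because TPNs frequently lack versions, the index matches on normalized names only:

```python
from collections import defaultdict

# Hypothetical extraction results: product -> component names found in its TPN.
# Versions are often missing from TPNs, so only names are indexed here.
tpn_components = {
    "gateway-appliance": ["Apache Log4j", "zlib", "OpenSSL"],
    "reporting-service": ["log4j-core", "Jackson Databind"],
    "mobile-client": ["SQLite", "libpng"],
}


def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so name variants compare loosely."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_index(components_by_product):
    """Invert product -> components into normalized component -> products."""
    index = defaultdict(set)
    for product, components in components_by_product.items():
        for component in components:
            index[normalize(component)].add(product)
    return index


def products_mentioning(index, keyword):
    """Return products whose TPN lists a component containing the keyword."""
    key = normalize(keyword)
    hits = set()
    for component, products in index.items():
        if key in component:
            hits |= products
    return sorted(hits)


index = build_index(tpn_components)
print(products_mentioning(index, "log4j"))
```

Substring matching on names is deliberately loose: it favors finding every candidate product over precision, which suits a first-pass exposure question where a human will confirm the hits.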

Structured Extraction from Unstructured PDFs

Using normalization, segmentation, and page-level reconstruction, the system identifies and extracts coherent license blocks even from highly inconsistent documents.
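To make the normalization and segmentation steps concrete, here is a small sketch. It assumes the text has already been pulled out of the PDF by some other tool, and that license blocks are separated by ruler lines of dashes or similar characters. That separator convention is an assumption for illustration; real notices vary widely, and the actual framework may segment differently.

```python
import re
import unicodedata

# Assumed convention: a line made only of 5+ ruler characters separates blocks.
SEPARATOR = re.compile(r"^[ \t]*[-=*_]{5,}[ \t]*$", re.MULTILINE)


def normalize_text(raw: str) -> str:
    """Clean up common PDF extraction debris before segmentation."""
    text = unicodedata.normalize("NFKC", raw)     # fold ligatures such as "fi"
    text = text.replace("\r\n", "\n")             # unify line endings
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words split across lines
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces and tabs
    return text


def segment_blocks(text: str) -> list[str]:
    """Split normalized text into non-empty blocks at separator lines."""
    return [block.strip() for block in SEPARATOR.split(text) if block.strip()]


raw = (
    "Component: zlib\r\n"
    "Permission is granted to any-\r\n"
    "one to use this software.\r\n"
    "----------\r\n"
    "Component:   libpng\r\n"
    "Modi\ufb01ed versions must be plainly marked.\r\n"
)
blocks = segment_blocks(normalize_text(raw))
for block in blocks:
    print(repr(block))
```

One trade-off to be aware of: rejoining hyphenated line breaks also merges genuinely hyphenated names that happen to wrap, so a production pipeline would likely need a dictionary check or similar safeguard.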

License Identification and Classification

A hybrid approach combining rule-based methods and fuzzy matching maps extracted text into meaningful license categories:

  • Permissive
  • Weak copyleft
  • Strong copyleft
  • Proprietary
  • Public domain
  • Content licenses
  • Unknown

This approach achieves, in my testing:

  • 92–96% accuracy for permissive licenses
  • 85–90% accuracy for copyleft detection
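The sketch below shows one way a hybrid classifier of this kind could be structured. It is not the implementation behind the figures above: the rule patterns, reference snippets, and the 0.85 threshold are all assumptions chosen for illustration, and the fuzzy layer uses Python's standard `difflib` rather than whatever matcher the framework uses.

```python
import re
from difflib import SequenceMatcher

# Rule layer: explicit names or SPDX-style identifiers. The first matching rule
# wins, so the more specific weak-copyleft names are tested before plain GPL.
RULES = [
    ("weak copyleft", re.compile(
        r"\bLGPL(v\d)?\b|\bMPL\b|Lesser General Public License|Mozilla Public License", re.I)),
    ("strong copyleft", re.compile(
        r"\bA?GPL(v\d)?\b|GNU (Affero )?General Public License", re.I)),
    ("permissive", re.compile(
        r"\bMIT License\b|\bApache License\b|\bBSD [23]-Clause\b", re.I)),
    ("public domain", re.compile(r"public domain|\bCC0\b", re.I)),
]

# Fuzzy layer: characteristic opening phrases, used when no name is present.
REFERENCES = [
    ("permissive", "permission is hereby granted, free of charge, to any person "
                   "obtaining a copy of this software and associated documentation files"),
    ("permissive", "redistribution and use in source and binary forms, with or without "
                   "modification, are permitted provided that the following conditions are met"),
]


def best_window_ratio(text: str, snippet: str) -> float:
    """Slide a snippet-sized word window over text and keep the best similarity."""
    words = text.split()
    size = len(snippet.split())
    best = 0.0
    for start in range(max(1, len(words) - size + 1)):
        window = " ".join(words[start:start + size])
        ratio = SequenceMatcher(None, window, snippet, autojunk=False).ratio()
        best = max(best, ratio)
    return best


def classify(block: str, threshold: float = 0.85):
    """Return (category, method), where method is 'rule', 'fuzzy', or 'none'."""
    for category, pattern in RULES:
        if pattern.search(block):
            return category, "rule"
    text = " ".join(block.lower().split())
    best_category, best_score = "unknown", 0.0
    for category, snippet in REFERENCES:
        score = best_window_ratio(text, snippet)
        if score > best_score:
            best_category, best_score = category, score
    if best_score >= threshold:
        return best_category, "fuzzy"
    return "unknown", "none"


ocr_damaged = ("Copyright (c) 2019 Example Corp. Perrnission is hereby granted, free of "
               "charqe, to any person obtaining a copy of this software and associated "
               "docurnentation files")
print(classify("Licensed under the GNU Lesser General Public License v2.1"))
print(classify(ocr_damaged))
print(classify("All rights reserved. Contact the vendor for licensing terms."))
```

Running rules first keeps the common case cheap and explainable, while the fuzzy fallback tolerates the OCR damage typical of scanned notices. Anything below the threshold stays "unknown" so that it is routed to a person rather than guessed.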

Risk Interpretation

Each component is evaluated for compliance risk based on obligations such as:

  • Attribution requirements
  • Redistribution conditions
  • Copyleft scope
  • Source disclosure obligations
  • Ambiguous or unidentified licenses
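As a rough illustration of how such an evaluation might be encoded, the sketch below maps license categories to a review level and a list of obligations. The policy table is hypothetical and heavily simplified: it is an example of the data structure, not a statement of what any particular license requires, and a real policy would come from an organization's legal team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskProfile:
    level: int            # higher means more review effort
    label: str
    obligations: tuple


# Hypothetical policy table for illustration only.
POLICY = {
    "permissive": RiskProfile(1, "low", ("attribution",)),
    "weak copyleft": RiskProfile(2, "medium", ("attribution", "check redistribution conditions")),
    "strong copyleft": RiskProfile(3, "high", ("attribution", "check source disclosure scope")),
    "unknown": RiskProfile(3, "needs review", ("manual license identification",)),
}


def assess(components):
    """Take (name, category) pairs and return rows sorted highest level first."""
    rows = []
    for name, category in components:
        # Categories the policy does not know are treated as unidentified.
        profile = POLICY.get(category, POLICY["unknown"])
        rows.append({"component": name, "risk": profile.label,
                     "level": profile.level, "obligations": profile.obligations})
    return sorted(rows, key=lambda row: (-row["level"], row["component"]))


report = assess([
    ("zlib", "permissive"),
    ("libfoo", "strong copyleft"),
    ("mystery-lib", "unlisted-category"),
])
for row in report:
    print(row["component"], row["risk"], row["obligations"])
```

Treating unrecognized categories as "needs review" rather than low risk reflects the last bullet above: an ambiguous license is itself a finding.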

Visualization and Machine-Readable Outputs

The framework produces:

  • Interactive dashboards
  • Structured datasets
  • Outputs compatible with governance workflows and SBOM pipelines

This demonstrates that meaningful compliance intelligence can be derived even from the most constrained artifact available. This closes a long-standing visibility gap in the software supply chain.

Security Implications of TPN Breakdown

The failure of TPNs is not only a compliance problem—it has direct consequences for software supply chain security. When TPNs are inconsistent, unstructured, or incomplete, they reduce the ability of downstream stakeholders to:

  • Identify exposure to known vulnerable components
  • Trace dependency relationships in third-party software
  • Perform accurate third-party risk assessments
  • Respond quickly to emerging vulnerabilities in production systems

This makes TPN degradation a security visibility problem, not just a documentation inefficiency.

What the Ecosystem Needs Next

TPN failures are not isolated inefficiencies. They represent a structural weakness in how the global software supply chain communicates compliance.

Addressing this requires coordinated effort across standards, tooling, and ecosystem alignment.

Standardized, Machine-Readable TPN Formats

The ecosystem needs formats beyond PDFs, such as:

  • A standard TPN-JSON format
  • SPDX-aligned TPN profiles

These would enable structured, interoperable compliance disclosures.
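No TPN-JSON standard exists today, so the following is only a sketch of what a record might contain. Every field name, and the "0.1-draft" format label, is a hypothetical placeholder. The license values are real SPDX identifiers, which is one way such a format could stay aligned with SPDX.

```python
import json


def to_tpn_record(component, license_id, category, version=None, source_page=None):
    """Build one component entry; version stays None when the notice omits it."""
    return {
        "component": component,
        "version": version,
        "license": license_id,       # SPDX identifier where one was determined
        "category": category,
        "source_page": source_page,  # where in the original notice it was found
    }


document = {
    "tpn_format": "0.1-draft",       # hypothetical format label
    "product": "example-product",    # hypothetical product name
    "components": [
        to_tpn_record("zlib", "Zlib", "permissive", source_page=12),
        to_tpn_record("log4j-core", "Apache-2.0", "permissive", version="2.17.1"),
    ],
}

serialized = json.dumps(document, indent=2, sort_keys=True)
print(serialized)
```

Keeping `version` as an explicit null, rather than dropping the key, lets consumers distinguish "the notice did not say" from "the field was forgotten".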

One possible longer-term solution is to embed machine-readable data (such as an SBOM in SPDX format or a TPN in JSON format) within the PDFs, creating a “hybrid PDF”. The PDF format already permits embedding internal files (called “attached files”), and LibreOffice already supports generating PDFs that embed the source document. This would let people keep their existing process for exchanging display PDFs while also including machine-readable data. Tools that quickly extract those embedded files, and complain when they are not present, could speed adoption. However, while this approach has promise, it does nothing for the documents in circulation today, which do not embed this information.

Improved Support for Dependency Analysis

Unsurprisingly, many improvements for handling dependencies could help in processing TPNs, SBOMs, and many other related formats.

Shared reference corpora for license matching would help, because accurate license detection requires:

  • Canonical license datasets
  • Variant and legacy license mappings
  • Community-maintained reference corpora

This would significantly improve consistency across tools and organizations.

In addition, there should be open APIs for information on licensing. Standard APIs should support:

  • License extraction
  • Component-to-license mapping
  • Obligation and risk interpretation

This would enable interoperability between vendors, auditors, and regulators.

Integration Between SBOM and TPN Pipelines

Today, SBOMs and TPNs exist in disconnected workflows. Yet in many cases, TPNs provide the only information available about product components.

A unified pipeline would:

  • Eliminate duplication
  • Reduce inconsistencies
  • Ensure alignment between internal and external disclosures
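One small piece of such a pipeline is a reconciliation check between the two artifacts. The sketch below assumes both have already been reduced to lists of component names; the function and key names are hypothetical, and matching on lowercase names alone is a simplification.

```python
def reconcile(sbom_components, tpn_components):
    """Compare component names from an SBOM and a TPN for the same release."""
    sbom = {name.lower() for name in sbom_components}
    tpn = {name.lower() for name in tpn_components}
    return {
        "missing_from_tpn": sorted(sbom - tpn),   # shipped but not disclosed
        "missing_from_sbom": sorted(tpn - sbom),  # disclosed but not tracked
        "in_both": sorted(sbom & tpn),
    }


result = reconcile(
    sbom_components=["zlib", "OpenSSL", "log4j-core"],
    tpn_components=["zlib", "openssl", "libpng"],
)
print(result)
```

Running a check like this at release time would surface drift between internal and external disclosures before a customer or auditor does.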

Related Work

Prior efforts across the software supply chain ecosystem have focused on improving license detection and SBOM generation during development and build phases. However, these approaches often assume access to source code or structured metadata, leaving a visibility gap when Third‑Party Notices (TPNs) are the only available compliance artifact.

Related work on automating TPN analysis demonstrates how unstructured compliance documents can be transformed into machine‑readable license intelligence suitable for governance and audit workflows. Supporting datasets for compliance governance and SBOM alignment are described in:

Datta, D., TPN Compliance Dataset for Software Supply Chain Governance, Zenodo, 2025.

https://doi.org/10.5281/zenodo.19152619

Framework:

https://doi.org/10.5281/zenodo.19099831

Security Workflow Integration Model

The proposed framework reframes TPNs as an input layer in modern software supply chain security workflows. Rather than treating TPNs as static compliance documentation, they can be operationalized into structured security intelligence pipelines.

The extracted data can be integrated into:

  • Vulnerability management systems (to identify exposed components when SBOMs are missing)
  • Third-party risk management (TPRM) platforms (to assess supplier software risk)
  • Incident response workflows (to rapidly evaluate exposure after CVE disclosures)
  • DevSecOps pipelines (to enforce policy-based controls on software composition)

This positions TPN analysis as a bridge between compliance documentation and operational security decision-making.

Conclusion: The Future Requires Fixing TPNs

Third-party notices (TPNs) were originally designed as simple attribution mechanisms and ways to declare licenses to recipients (as required by many licenses). Today, they are expected to support audits, transparency, regulatory compliance, and supply chain security.

But they are still delivered as static documents that do not scale.

TPNs are not failing because organizations lack intent; they are failing because the ecosystem has outgrown the tools and formats upon which it relies.

If we want a more transparent, auditable, and trustworthy software supply chain, TPNs must evolve into structured, machine-readable, and interoperable artifacts.

The next phase of open source security will not be defined solely by SBOMs or scanning tools, but by how effectively we solve the last mile of compliance visibility.

Fixing TPNs is an important step toward a more reliable and verifiable software ecosystem.

 

Acknowledgments

The author acknowledges David A. Wheeler and Sally Cooper for their insightful feedback and helpful discussions during the development of this work.

Resources

The open source implementation of the prototype described in this post, including parsing logic, license-classification rules, and the interactive dashboard, is available on GitHub for anyone interested in exploring or extending the approach:

https://github.com/devashridatta-dotcom/tpn-automation

Community feedback and contributions are welcome.

Author Bio

Devashri Datta is an AI and software supply chain security researcher and enterprise security architect focused on software supply chain security, DevSecOps automation, and security governance at scale. Datta's research areas include SBOM governance, vulnerability intelligence (VEX), Third-Party Notice (TPN) analysis, AI-assisted risk modeling, and security exception management in cloud-native environments under compliance frameworks such as SOC 2, ISO 27001, and FedRAMP.

From Noise to Signal: Using Runtime Context to Win the Vulnerability Management Battle

By Blog, Guest Blog

By Jonas Rosland

Security teams in 2026 have no shortage of data, alerts, or findings. In 2025 alone, 48,185 Common Vulnerabilities and Exposures (CVEs) were published, a 20.6% increase over 2024’s already record-breaking total of 39,962. That works out to roughly 130 new vulnerabilities disclosed every single day, and for seven consecutive years, the annual count has hit a new record high.
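For readers who want to check the arithmetic, the growth rate and daily average follow directly from the two annual totals quoted above (the variable names are just for illustration):

```python
cves_2024 = 39_962
cves_2025 = 48_185

growth = (cves_2025 - cves_2024) / cves_2024  # year-over-year increase
per_day = cves_2025 / 365                     # average disclosures per day

print(f"{growth:.1%} growth, about {round(per_day)} per day")
```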

The drivers are structural: the explosive growth of open source software, the complexity of transitive dependencies hidden deep in software supply chains, and an expanding CVE ecosystem that now encompasses nearly twice as many reporting organizations as it did five years ago. With 97% of commercial applications containing open source components, inherited risk has become a routine part of working with modern software.

Only 2% of all discovered vulnerabilities are ever exploited in the wild, but nearly 29% of that small fraction were exploited on or before the day their CVE was published. Attackers are selective, but once they identify a target, the window for defenders is very narrow, and it keeps shrinking: the gap between vulnerability disclosure and confirmed exploitation was over a year in 2020 and is now just hours.

The old model of scanning everything, triaging by Common Vulnerability Scoring System (CVSS) score, and working through a queue simply cannot keep pace with this reality. Something has to change.

Most vulnerabilities will never be exploited

The vast majority of what your vulnerability scanner finds will never actually be used against you. That means the core challenge facing security teams isn’t patching speed, but knowing where to focus. When a scanner returns thousands of findings ranked only by CVSS score, what looks like a workload problem is really a prioritization problem. Critical vulnerabilities in libraries that aren’t loaded at runtime, or in containers that haven’t run in months, crowd out the findings that genuinely matter, such as exploitable vulnerabilities in running, exposed workloads. The result is alert fatigue, missed priorities, and growing friction between security and development teams.

The OpenSSF Best Practices criteria reflect this directly. At the “Passing” level, projects must not contain unpatched vulnerabilities of medium or higher severity that have been known publicly for more than 60 days, and critical vulnerabilities should be fixed rapidly after they are reported. The emphasis here isn’t on the volume of findings processed, but on the speed and accuracy with which the most dangerous vulnerabilities are addressed, a distinction that gets lost when teams are buried in undifferentiated backlogs.

Why static analysis alone isn’t enough

Static analysis is non-negotiable. The OpenSSF Best Practices criteria require it at the “Passing” level, and at “Silver,” projects must use tools that look for real vulnerabilities in code, not just style issues. Integrated into CI/CD pipelines, static analysis catches bugs early when they are cheapest to fix, and it remains a solid foundation of any security program. On its own, however, it is not enough.

The limitation is that static analysis sees everything, regardless of whether it matters in practice. It cannot tell you whether a vulnerable library is actually loaded in a running container, or whether that container ever receives external traffic. A CVSS 9.8 score looks identical whether the package is called thousands of times a day in a critical service or has never once been invoked in production. Runtime security fills that gap by observing what is actually executing in production. By tracking which processes are running, which packages are loaded, and which connections are being made, security teams gain much more precise intelligence about where risk actually lives.

Only 15% of critical and high-severity vulnerabilities with an available fix are in packages actually loaded at runtime. By isolating that subset, teams can reduce the scope of what needs immediate attention to a small fraction of their total backlog, in some cases by over 95%. That’s the practical difference between a list that overwhelms a development team and one they can actually act on. Static analysis provides breadth by catching everything possible during development, while runtime intelligence adds depth by showing what genuinely matters in production. Together, they give teams the context to make better decisions.
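The filtering step described here can be sketched in a few lines. The example assumes scanner findings and a set of runtime-loaded package names are already available from existing tooling. The `Finding` type, the placeholder CVE labels, and the package names are all hypothetical, and no particular product's API is implied.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve: str
    package: str
    severity: str        # "critical", "high", "medium", or "low"
    fix_available: bool


def prioritize(findings, loaded_packages):
    """Keep critical/high findings that have a fix and are loaded at runtime."""
    return [
        f for f in findings
        if f.severity in {"critical", "high"}
        and f.fix_available
        and f.package in loaded_packages
    ]


findings = [
    Finding("CVE-EXAMPLE-1", "openssl", "critical", True),
    Finding("CVE-EXAMPLE-2", "imagemagick", "critical", True),   # never loaded
    Finding("CVE-EXAMPLE-3", "log4j-core", "high", False),       # no fix yet
    Finding("CVE-EXAMPLE-4", "log4j-core", "high", True),
    Finding("CVE-EXAMPLE-5", "openssl", "low", True),            # low severity
]
loaded_at_runtime = {"openssl", "log4j-core"}  # as observed in production

urgent = prioritize(findings, loaded_at_runtime)
print([f.cve for f in urgent])
```

The findings that fall out of the urgent list are not discarded; they simply move to a lower-priority queue, which is what keeps the list a development team sees short enough to act on.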

Helping security and development teams speak the same language

Runtime data also changes how security teams and developers talk to each other. Telling a developer “this CVE is rated 8.1” lands very differently than “this vulnerability is in a package actively loaded in your production authentication service.” The second statement connects a finding to a tangible business risk, and that context helps developers understand urgency in a way that a severity score on its own rarely does.

When security teams can bring developers a short, contextualized list of what needs attention and why, the conversation tends to shift from friction to collaboration. The OpenSSF Best Practices framework supports this kind of working relationship structurally, requiring a published vulnerability reporting process, initial response times of 14 days or less, and release notes that explicitly identify the publicly known runtime vulnerabilities fixed in each release. These aren’t bureaucratic requirements, but the scaffolding for the kind of consistent, trust-based communication that makes vulnerability management work in practice.

Neither team can do this work alone. Security engineers don’t always know which code paths are business-critical, and developers don’t always have visibility into what their software looks like from an attacker’s perspective. Runtime data helps bridge that gap by giving both sides a shared, evidence-based view of where the real risk lives.

Shrinking the problem over time

Prioritization manages the vulnerability problem today, but reducing the attack surface is how you make the problem smaller tomorrow. Runtime intelligence supports two practical strategies that static scanning alone cannot.

  1. Build leaner, more deliberate images. Runtime analysis identifies unused packages, old utilities, and bundled libraries that never get called in production, giving teams a clear basis for stripping images down. Building from scratch or distroless base images takes this further by removing shells, package managers, and other components that have no place in a production workload, and combining that with rootless containers limits the damage an attacker can do if they do gain access. Runtime data can also flag containers running on stale base images that are actively receiving traffic, making the case for a refresh concrete rather than a task that keeps getting deprioritized.
  2. Detect and respond to unexpected behavior in production. Even with good prioritization, not every risk can be patched immediately. This is where a runtime threat detection tool like Falco becomes valuable. By defining what normal behavior looks like for a given workload, Falco can flag unexpected activity in real time, such as a process spawning a shell, a container writing to a sensitive path, or an unusual outbound connection appearing. This doesn’t replace patching, but it provides a meaningful layer of protection while remediation work is underway, and it gives teams better visibility into whether a vulnerability is being actively probed or exploited.
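To illustrate the idea behind the second strategy, the sketch below compares runtime events against a per-workload baseline. This is a conceptual toy, not Falco's rule syntax or engine: the event shape, the baseline structure, and the workload name are all hypothetical.

```python
# Hypothetical baseline: what "normal" looks like for each workload.
BASELINE = {
    "auth-api": {
        "processes": {"java"},
        "write_prefixes": ("/tmp/", "/var/log/auth-api/"),
    },
}


def unexpected(event, baseline=BASELINE):
    """Return a description if the event deviates from the baseline, else None."""
    profile = baseline.get(event["workload"])
    if profile is None:
        return "no baseline for workload"
    if event["type"] == "exec" and event["process"] not in profile["processes"]:
        return f"unexpected process: {event['process']}"
    if event["type"] == "write" and not event["path"].startswith(profile["write_prefixes"]):
        return f"write outside allowed paths: {event['path']}"
    return None


events = [
    {"workload": "auth-api", "type": "exec", "process": "java"},
    {"workload": "auth-api", "type": "exec", "process": "sh"},
    {"workload": "auth-api", "type": "write", "path": "/etc/passwd"},
    {"workload": "auth-api", "type": "write", "path": "/tmp/session.cache"},
]
alerts = [message for message in map(unexpected, events) if message]
print(alerts)
```

A real detection tool evaluates far richer signals, such as system calls, container metadata, and network activity, but the principle is the same: define expected behavior, then surface deviations as they happen.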

The OpenSSF Best Practices criteria encourage minimizing the attack surface throughout, and the logic applies equally in production environments. The best vulnerability is the one that doesn’t exist because the vulnerable component was never there, and the next best outcome is knowing quickly when something unexpected is happening around the ones that remain.

Where to go from here

The 2025 numbers make one thing very clear: the volume of vulnerabilities isn’t going down, and teams that try to treat every finding with equal urgency will continue to struggle. The more practical path is to use static analysis and runtime intelligence together, letting each do what it does best, and to use that shared context to build better working relationships between security and development teams. Finding the right vulnerabilities to fix, explaining why they matter, and making it straightforward for developers to act on them is where the real progress happens.

About the Author

Jonas Rosland is Director of Open Source at Sysdig, where he works on cloud-native security and open source strategy. Sysdig supports open source security projects, including Falco, a CNCF graduated project for runtime threat detection.