Tag

SBOM

Why Third-Party Notices Are Breaking at Scale: What the Ecosystem Needs Next

By Blog, Guest Blog

By Devashri Datta, Independent Researcher, Software Supply Chain Security

Third-party notices (TPNs) are documents distributed to users that list open source third-party software components included in the product and key licensing information. Every time you buy a TV or router, you’ve probably seen them. Yet TPNs were never designed for the complexity, scale, and velocity of today’s software ecosystem. TPNs are one of the most widely distributed and yet least understood artifacts in modern software supply chains.

Inside nearly every appliance, firmware image, SaaS platform, and enterprise distribution, the same pattern persists: a long, unstructured PDF is expected to represent the full scope of open source license compliance.

As software systems have scaled, TPNs have quietly become a critical but increasingly fragile pillar. They are now failing technically, operationally, and structurally under the demands of modern development and distribution.

This article examines why TPNs are breaking. It also outlines what the ecosystem must do next based on large-scale analysis of real-world TPN documents and the development of an automated framework for extracting information directly from them. While traditionally viewed as compliance artifacts, Third-Party Notices (TPNs) also represent an underutilized source of security-relevant intelligence. In many real-world scenarios where Software Bills of Materials (SBOMs) are incomplete, unavailable, or restricted, TPNs may provide the only observable evidence of component usage. This positions TPNs as a critical input to software supply chain security workflows, including vulnerability management, third-party risk assessment, and incident response.

The Hidden Reality: TPNs Are the Supply Chain’s Last Mile

Despite advances in Software Bill of Materials (SBOM) formats such as SPDX and CycloneDX, TPNs remain:

  • The only compliance artifact that many vendors publicly distribute
  • The only artifact available to customers or regulators for proprietary systems
  • The only verifiable attribution record when source code and SBOMs are inaccessible

SBOMs provide structured visibility into software components, but their completeness depends on the generation methods and the availability of build-time data. In practice, SBOMs may not consistently capture full transitive dependencies or runtime-resolved components. In some cases, additional components and licensing details may appear in downstream artifacts such as third-party notices (TPNs), though these are typically not integrated into SBOM analysis pipelines. SBOM availability also varies across organizations and products and may not always be accessible to end users or external stakeholders due to policy or regulatory interpretation. Regulatory frameworks such as the EU Cyber Resilience Act (CRA) are evolving, and expectations around SBOM scope and disclosure remain subject to interpretation. As a result, relying solely on SBOM data may not provide complete visibility into whether a product contains a specific vulnerable component, depending on SBOM completeness and related artifact availability.

In practice, TPNs often serve as the last mile of compliance visibility, bridging internal software composition and external disclosure.

However, TPNs were never designed to operate at the scale or complexity of today’s supply chains.

Security Blind Spot in Software Supply Chains

While SBOMs and software composition analysis (SCA) tools have improved visibility during development, they assume access to structured or source level data. In contrast, TPNs often represent the only externally available artifact in downstream consumption environments such as embedded systems, firmware, and proprietary SaaS distributions.

This creates a structural blind spot in software supply chain security: security teams are frequently forced to make risk decisions without machine readable component intelligence. As a result, vulnerability exposure, dependency risk, and third-party software usage often remain partially or completely unobservable at the point of consumption.

Why the TPN Ecosystem Is Breaking

PDFs Are an Anti-Pattern for Machine-Readable Compliance

Most TPNs are distributed as large, heterogeneous PDFs containing:

  • Multi-column layouts
  • OCR artifacts and noise
  • Inconsistent license formatting
  • Duplicated or truncated license text

TPNs often omit component identifiers and lack specific version numbers for components.

PDFs are optimized for display, not structured data. As a result, extracting meaningful compliance information programmatically is extremely difficult.

Existing Compliance Tools Don’t Address the Problem

Current tools such as FOSSology, ScanCode, and ORT are designed to analyze source code or binaries—not TPN documents. Yet in many real-world scenarios, especially audits or vendor reviews, TPNs are the only artifact available.

This creates a fundamental gap: The most widely distributed compliance artifact is the least analyzable.

Inconsistent Generation Pipelines Lead to Data Drift

TPNs are generated through highly variable processes:

  • Custom scripts
  • Proprietary internal tooling
  • Manual aggregation from legacy systems
  • Partial or outdated SBOM exports

As a result, even TPNs from the same organization can vary significantly across releases, introducing inconsistencies, omissions, and misalignment with actual dependencies.

Scale Has Outpaced Human Review

Modern TPNs often span hundreds of pages across multiple license families and components.

Manual review has become increasingly impractical due to:

  • Repetitive license text
  • Poorly structured component mappings
  • Lack of contextual metadata
  • Hidden obligations within large text blocks

Compliance teams are effectively being asked to analyze documents at a scale that exceeds human capability.

Proposed Contribution: TPN-to-Security Intelligence Framework

This work introduces a systematic framework for transforming Third-Party Notices (TPNs) from unstructured compliance artifacts into structured security intelligence inputs. The framework addresses a critical gap in software supply chain security: the absence of machine-readable component visibility in downstream and vendor-distributed environments.

Unlike traditional software composition analysis tools that rely on source code, build artifacts, or SBOMs, this approach operates on TPNs as a primary data source. It enables the extraction, classification, and interpretation of software components and license obligations from highly unstructured documents.

The key contribution of this work is the demonstration that TPNs can be operationalized into actionable security intelligence for:

  • Vulnerability exposure identification when SBOMs are unavailable
  • Third-party risk assessment using externally visible artifacts
  • Incident response prioritization based on inferred component usage
  • Governance and compliance enforcement through structured outputs

Breaking the Logjam: Toward Automated License Intelligence

To address this systemic gap, I developed an automated end-to-end framework that treats TPNs as primary compliance artifacts, rather than secondary documentation.

The approach enables structured extraction and interpretation of license intelligence directly from unstructured documents. While TPNs may lack some information, they still provide valuable signals. For example, even without version identifiers, knowing that a product includes a component can be very valuable (e.g., when asking “which products contain a version of log4j that might be vulnerable to this attack?”).

Structured Extraction from Unstructured PDFs

Using normalization, segmentation, and page-level reconstruction, the system identifies and extracts coherent license blocks even from highly inconsistent documents.

License Identification and Classification

A hybrid approach combining rule-based methods and fuzzy matching maps extracted text into meaningful license categories:

  • Permissive
  • Weak copyleft
  • Strong copyleft
  • Proprietary
  • Public domain
  • Content licenses
  • Unknown

This approach achieves, in my testing:

  • 92–96% accuracy for permissive licenses
  • 85–90% accuracy for copyleft detection

Risk Interpretation

Each component is evaluated for compliance risk based on obligations such as:

  • Attribution requirements
  • Redistribution conditions
  • Copyleft scope
  • Source disclosure obligations
  • Ambiguous or unidentified licenses

Visualization and Machine-Readable Outputs

The framework produces:

  • Interactive dashboards
  • Structured datasets
  • Outputs compatible with governance workflows and SBOM pipelines

This demonstrates that meaningful compliance intelligence can be derived even from the most constrained artifact available. This closes a long-standing visibility gap in the software supply chain.

Security Implications of TPN Breakdown

The failure of TPNs is not only a compliance problem—it has direct consequences for software supply chain security. When TPNs are inconsistent, unstructured, or incomplete, they reduce the ability of downstream stakeholders to:

  • Identify exposure to known vulnerable components
  • Trace dependency relationships in third-party software
  • Perform accurate third-party risk assessments
  • Respond quickly to emerging vulnerabilities in production systems

This makes TPN degradation a security visibility problem, not just a documentation inefficiency.

What the Ecosystem Needs Next

TPN failures are not isolated inefficiencies. They represent a structural weakness in how the global software supply chain communicates compliance.

Addressing this requires coordinated effort across standards, tooling, and ecosystem alignment.

Standardized, Machine-Readable TPN Formats

The ecosystem needs formats beyond PDFs, such as:

  • Creating a standard TPN-JSON format for use
  • SPDX-aligned TPN profiles

These would enable structured, interoperable compliance disclosures.

One possible longer-term solution is to embed machine-readable data (such as an SBOM in SPDX format or a TPN in JSON format) within the PDFs, creating a “hybrid PDF”. The PDF format already permits adding internal files (called “attached files”). LibreOffice already supports generating PDFs that embed the source document, allowing people to use their existing process for exchanging display PDF while also including machine-readable data. Tools that can quickly extract those embedded files and complain when they’re not present could speed their deployment. However, while this approach has promise, it doesn’t deal with the current documents, which do not embed this information.

Improved Support for Dependency Analysis

Unsurprisingly, many improvements for handling dependencies could help in processing TPNs, SBOMs, and many other related formats.

It would be better if there was shared reference corpora for license matching. That’s because accurate license detection requires:

  • Canonical license datasets
  • Variant and legacy license mappings
  • Community-maintained reference corpora

This would significantly improve consistency across tools and organizations.

In addition, there should be open APIs for information on licensing. Standard APIs should support:

  • License extraction
  • Component-to-license mapping
  • Obligation and risk interpretation

This would enable interoperability between vendors, auditors, and regulators.

Integration Between SBOM and TPN Pipelines

Today, SBOMs and TPNs exist in disconnected workflows. Yet in many cases, TPNs provide the only information available about product components.

A unified pipeline would:

  • Eliminate duplication
  • Reduce inconsistencies
  • Ensure alignment between internal and external disclosures

Related Work

Prior efforts across the software supply chain ecosystem have focused on improving license detection and SBOM generation during development and build phases. However, these approaches often assume access to source code or structured metadata, leaving a visibility gap when Third‑Party Notices (TPNs) are the only available compliance artifact.

Related work on automating TPN analysis demonstrates how unstructured compliance documents can be transformed into machine‑readable license intelligence suitable for governance and audit workflows. Supporting datasets for compliance governance and SBOM alignment are described in:

Datta, D., **TPN Compliance Dataset for Software Supply Chain Governance**, Zenodo, 2025. 

https://doi.org/10.5281/zenodo.19152619

Framework:

https://doi.org/10.5281/zenodo.19099831

Security Workflow Integration Model

The proposed framework reframes TPNs as an input layer in modern software supply chain security workflows. Rather than treating TPNs as static compliance documentation, they can be operationalized into structured security intelligence pipelines.

The extracted data can be integrated into:

  • Vulnerability management systems (to identify exposed components when SBOMs are missing)
  • Third-party risk management (TPRM) platforms (to assess supplier software risk)
  • Incident response workflows (to rapidly evaluate exposure after CVE disclosures)
  • DevSecOps pipelines (to enforce policy-based controls on software composition)

This positions TPN analysis as a bridge between compliance documentation and operational security decision-making.

Conclusion: The Future Requires Fixing TPNs

Third-party notices (TPNs) were originally designed as simple attribution mechanisms and ways to declare licenses to recipients (as required by many licenses). Today, they are expected to support audits, transparency, regulatory compliance, and supply chain security.

But they are still delivered as static documents that do not scale.

TPNs are not failing because organizations lack intent; they are failing because the ecosystem has outgrown the tools and formats upon which it relies.

If we want a more transparent, auditable, and trustworthy software supply chain, TPNs must evolve into structured, machine-readable, and interoperable artifacts.

The next phase of open source security will not be defined solely by SBOMs or scanning tools, but by how effectively we solve the last mile of compliance visibility.

Fixing TPNs is an important step toward a more reliable and verifiable software ecosystem.

 

Acknowledgments

The author acknowledges David A. Wheeler and Sally Cooper for their insightful feedback and helpful discussions during the development of this work.

Resources

The open source implementation of the prototype described in this post, including parsing logic, license-classification rules, and the interactive dashboard, is available on GitHub for anyone interested in exploring or extending the approach:

https://github.com/devashridatta-dotcom/tpn-automation

Community feedback and contributions are welcome.

Author Bio

Devashri Datta is an AI & Software Supply Chain Security Researcher. Security researcher and enterprise security architect focused on software supply chain security, DevSecOps automation, and security governance at scale. Research areas include SBOM governance, vulnerability intelligence (VEX), Third-Party Notice (TPN) analysis, AI-assisted risk modeling, and security exception management in cloud-native environments under compliance frameworks such as SOC 2, ISO 27001, and FedRAMP.

Rethinking Post-Deployment Vulnerability Detection

By Blog, Guest Blog

By Tracy Ragan

Over the past decade, the IT community has made significant progress in improving pre-deployment vulnerability detection. Static analysis, Software Composition Analysis (SCA), container scanning, and dependency analysis are now standard components of modern CI/CD pipelines. These tools help developers identify vulnerable libraries and insecure code before software is released.

However, security does not end at build time.

Every successful software attack ultimately exploits a vulnerability that exists in a running system. Attackers can and do target code repositories, CI pipelines, and developer environments; these supply chain attacks are serious threats. But vulnerabilities running in live production systems are among the most dangerous because, once exploited, they can directly lead to persistent backdoors, system compromise, lateral movement, and data breaches.

This reality exposes an important gap in how organizations manage vulnerabilities today. While significant attention is placed on detecting vulnerabilities before deployment, far fewer organizations have effective mechanisms for identifying newly disclosed CVEs that affect software already running in production.

Across the industry, most development teams today run some form of pre-deployment vulnerability scanning, yet relatively few maintain continuous visibility into vulnerabilities impacting deployed software after release. This imbalance creates a dangerous blind spot: the systems organizations rely on every day may become vulnerable long after the code has passed through security checks.

As the volume of vulnerability disclosures continues to increase, the industry must rethink how post-deployment vulnerabilities are detected and remediated.

The Growing Post-Deployment Vulnerability Problem

Modern software systems depend heavily on open source components. A typical application may include hundreds, or even thousands, of transitive dependencies. While security scanning tools help identify vulnerabilities during development, they cannot predict vulnerabilities that have not yet been disclosed.

New CVEs are published daily across open source ecosystems. When a vulnerability is disclosed affecting a widely used package, thousands of deployed applications may suddenly become vulnerable, even if those applications passed every security check during their build process.

This creates a persistent challenge: software that was secure at release can become vulnerable later without any code changes.

In many organizations, the detection of these vulnerabilities relies on periodic rescanning of artifacts or manual monitoring of vulnerability feeds. These approaches introduce delays between vulnerability disclosure and detection, extending the window of exposure for deployed systems.

Because attackers actively monitor vulnerability disclosures and quickly develop exploits, this detection gap creates significant operational risk.

Current Approaches to Detecting Post-Deployment CVEs

Organizations today use several methods to identify vulnerabilities affecting deployed software. While each approach has value, they are often costly and introduce operational complexity.

One common strategy involves rescanning previously built artifacts or container images stored in registries. Security teams periodically run vulnerability scanners against these artifacts to identify newly disclosed CVEs. Although this approach can detect vulnerabilities that were unknown at build time, the process cannot identify where the containers are running across system assets. 

Another approach relies on host-based security agents or runtime inspection tools deployed on production infrastructure. These tools identify vulnerable libraries by inspecting installed packages or monitoring application behavior. In practice, these solutions are most commonly implemented in large enterprise environments where dedicated operations and security teams can manage the operational complexity. They often require significant infrastructure integration, deployment planning, and ongoing maintenance.

Agent-based approaches also struggle to support edge environments, embedded systems, air-gapped deployments, satellites, or high-performance computing clusters, where installing additional runtime software may not be feasible or permitted. Even in traditional cloud environments, deploying and maintaining agents across thousands of systems can be a substantial operational lift.

This complexity stands in sharp contrast to pre-deployment scanning tools, which can often be installed in CI/CD pipelines in just minutes. Integrating a software composition analysis scanner into a build pipeline typically requires only a small configuration change or plugin installation. Because these tools are easy to adopt and operate earlier in the development lifecycle, they have seen widespread adoption across organizations of all sizes.

Post-deployment solutions, by comparison, often require significantly more effort to deploy and maintain. As a result, far fewer organizations implement comprehensive post-deployment vulnerability monitoring. While most development teams today run some form of pre-deployment vulnerability scanning, relatively few maintain continuous visibility into vulnerabilities impacting software already running in production. This leaves a critical visibility gap in the environments where vulnerabilities are ultimately exploited: live operational systems.

SBOMs Are an Underutilized Security Asset

A more efficient model for detecting post-deployment vulnerabilities already exists but is often underutilized.

Software Bill of Materials (SBOMs) provide a detailed inventory of the components included in a software release. When generated during the build process using standardized formats such as SPDX or CycloneDX, SBOMs capture critical metadata, including component names, versions, dependency relationships, and identifiers such as Package URLs.

SBOM adoption has accelerated in recent years due in part to initiatives such as Executive Order 14028 and ongoing work across the open source ecosystem. Organizations increasingly generate SBOMs as part of their software supply chain transparency efforts.

Yet in many environments, SBOMs are treated primarily as compliance documentation rather than operational security tools. Instead of being archived after release, SBOMs can serve as persistent inventories of the components running in deployed software systems.

Detecting Vulnerabilities Without Rescanning

When SBOMs are available and associated with deployed releases, detecting newly disclosed vulnerabilities becomes significantly simpler.

Vulnerability intelligence feeds, such as the OSV.dev database, the National Vulnerability Database (NVD), and other vendor advisories, identify the packages and versions affected by each CVE. By correlating this vulnerability information with stored SBOMs and release metadata, organizations can quickly determine whether a deployed asset includes an affected component.

Because the SBOM already describes the complete dependency graph, there is no need to reanalyze artifacts or rescan source code. Detection becomes a metadata correlation problem rather than a compute-intensive scanning process.

This model enables organizations to continuously monitor deployed software environments and identify newly disclosed vulnerabilities almost immediately after they are published.

Digital Twins and Continuous Vulnerability Synchronization

To operationalize this approach at scale, organizations need systems capable of continuously tracking the relationship between software releases, deployed environments, and their associated SBOMs. One emerging concept is the creation of a software digital twin, a continuously updated model that represents the software components running across operational systems.

A digital twin maintains the relationship between deployed endpoints and the SBOMs that describe the software they run. By synchronizing these SBOM inventories with vulnerability intelligence sources such as OSV.dev or the NVD at regular intervals, organizations can automatically detect when newly disclosed CVEs impact running systems.

Rather than waiting for scheduled scans or relying on agents installed on production infrastructure, this model enables continuous vulnerability awareness through metadata synchronization.

Once an affected component is identified, remediation workflows can also be automated. Modern development platforms already rely on dependency manifests such as pom.xml, package.json, requirements.txt, or container Dockerfiles. By automatically updating these dependency files and generating pull requests with patched versions, organizations can rapidly move fixes back through their CI/CD pipelines.

This type of automation has the potential to reduce vulnerability remediation times from months to days, dramatically shrinking the window of exposure. And, it is easy to scale, giving developers more control and visibility into the production threat landscape. 

Aligning with OpenSSF Security Initiatives

Efforts across the Open Source Security Foundation (OpenSSF) ecosystem have helped establish the foundational infrastructure needed for this approach.

The OSV.dev vulnerability database provides high-quality vulnerability data tailored to open source ecosystems. Standards such as SPDX and CycloneDX enable consistent representation of SBOM data across tools and platforms. Projects like OpenVEX provide mechanisms for communicating vulnerability exploitability context, helping organizations determine which vulnerabilities require immediate attention.

Together, these initiatives create the building blocks for a more efficient and scalable vulnerability management model, one that relies on accurate software inventories and continuous vulnerability intelligence rather than repeated artifact scanning.

The Future of Vulnerability Management

Pre-deployment security scanning will continue to play an important role in software development. Identifying vulnerabilities early in the development lifecycle reduces risk and improves software quality.

But the security landscape is evolving. As software ecosystems grow more complex and vulnerability disclosures increase, organizations must also strengthen their ability to detect vulnerabilities that appear after software has already been deployed.

Rethinking post-deployment vulnerability detection means shifting away from repeated artifact scanning and toward continuous monitoring of software composition.

SBOMs provide the foundation for this shift. When combined with digital twin models that track deployed software, continuous synchronization with vulnerability databases, and automated dependency remediation, organizations can dramatically improve their ability to defend operational systems.

One thing is certain: attackers ultimately focus on exploiting vulnerabilities running in live systems. Gaining clear visibility into the attack surface, understanding exactly what OSS packages are deployed, where they are running, and how quickly they can be remediated, is essential to securing live systems from cloud-native to the edge. 

Author 

Tracy Ragan is the Founder and Chief Executive Officer of DeployHub and a recognized authority in secure software delivery and software supply chain defense. She has served on the Governing Boards of the Open Source Security Foundation (OpenSSF) and currently serves as a strategic advisor to the Continuous Delivery Foundation (CDF) Governing Board. She also sits on both the CDF and OpenSSF Technology Advisory Committees. In these roles, she helps shape industry standards and pragmatic guidance for securing the software supply chain and advancing DevOps pipelines to enable safer, more effective use of open-source ecosystems at scale.

With more than 25 years of experience across software engineering, DevOps, and secure delivery pipelines, Tracy has built a career at the intersection of automation, security, and operational reality. Her work is focused on closing one of the industry’s most critical gaps: detecting and remediating high-risk vulnerabilities running in live, deployed systems, across cloud-native, edge, embedded, and HPC environments.

Tracy’s expertise is grounded in decades of hands-on leadership. She is the Co-Founder and former COO of OpenMake Software, where she pioneered agile build automation and led the development of OpenMake Meister, a build orchestration platform adopted by hundreds of enterprise teams and generating over $60M in partner revenue. That experience directly informs her current mission: eliminating security blind spots that persist long after software is released.

What’s in the SOSS? Podcast #45 – S2E22 SBOM Chaos and Software Sovereignty: The Hidden Challenges Facing Open Source with Stephanie Domas (Canonical)

By Podcast

Summary

Stephanie Domas, Canonical’s Chief Security Officer, returns to What’s in the SOSS to discuss critical open source challenges. She addresses the issues of third-party security patch versioning, the rise of software sovereignty, and how custom patches break SBOMs. Domas also explains why geographic code restrictions contradict open source principles and what the EU’s Cyber Resilience Act (CRA) means for enterprises. She highlights Canonical’s work integrating memory-safe components like sudo-rs into the next Ubuntu LTS. This episode challenges assumptions about supply chain security, software trust, and the future of collaborative development in a regulated world.

Conversation Highlights

00:00 – Welcome
01:49 – Memory safety revolution
02:00 – Black Hat reflections
03:48 – The SBOM versioning crisis
06:23 – Semantic versioning falls apart
10:06 – Software sovereignty exposed
12:33 – Trust through transparency
14:02 – The insider threat parallel
17:04 – EU CRA impact
18:50 – The manufacturer gray area
21:08 – The one-maintainer problem
22:51 – Will regulations kill open source adoption?
24:43 – Call to action

Transcript

CRob (00:07.109)
Welcome welcome welcome to what’s in the sauce where we talked to the amazing people that make up the upstream open source ecosystem these are developers maintainers researchers all manner of contributors that help make open source great today I have an Amazing friend of the show and actually this is a repeat performance. She has been with us once before

We have Stephanie Domas. She’s the chief security officer for Canonical. It’s a little company you might have heard about. So welcome for joining and thank you for joining us today, Stephanie.

Stephanie Domas (00:45.223)
Thank you for having me again for a second time.

CRob (00:48.121)
I know. It’s been a while since we chatted. So how are things going in the amazing world of the Linux distros?

Stephanie Domas (00:58.341)
Yeah, so just for people who aren’t as familiar with canonical because we do have a bit of a brand thing, we are the makers of Ubuntu Linux. So connect the dots for everyone. So the world of the distros is always a fun place. There has been a lot of recent excitement around NPM hacks, supply chain hacks. And so the world of archives, running archives, running a distro, there’s never a dull moment.

CRob (01:08.422)
Yay.

Stephanie Domas (01:28.475)
So on our distro front, right, we’ve been excited to try and on the security front, focusing on, we’re taking a fresh eye to how to introduce things like memory, save sort of core components into our operating systems. So a sudo RS, so a Rust implementation of sudo is now a part.

CRob (01:40.231)
Ooh.

Stephanie Domas (01:49.085)
Ubuntu will be a part of our next LTS, which is something we’re really excited. We’re looking for more opportunities to replace some of these fundamental components with memory safe versions of them. So that’s also exciting.

CRob (01:52.604)
Nice.

CRob (02:00.167)
That’s amazing. My hat is off to all of you for taking the leadership role and doing that. That’s great. I had the great opportunity to participate in a panel with you recently at Hacker Summer Camp. So kind of, I know it’s been a little bit of time, reflecting back, what was your favorite experience from the Black Hat Def Con, B-Sides, Diana week?

Stephanie Domas (02:24.609)
Yeah, and so I’m always one of those people who one of my favorite things is just the ability to reconnect with so many people in the industry. That’s the one time of year where actually despite the fact that you and I physically live near each other, that actually tends to be the one time a year we see each other in person. so extending that to all of just the great people I’ve known in the industry and getting to see them. The panel you spoke about was another real highlight for me. So the panel for those who didn’t get the privilege of attending wasn’t asked me anything on open source. And it was great because there was such a variety of interests right there. People who are interested in what it’s like to be a maintainer. What does it mean to use open source in enterprise? What does it mean to try and think of monetization models around open source? And so the diversity of conversation, which is really exciting to find people who are in the space really unfamiliar with open source or trying to figure out how do I reason about it and to be able to have those conversations with them was really exciting for me.

CRob (03:24.515)
I agree, I thought that was a great experience and I would love to see us do that again sometime. Excellent. Well, let’s move on to business. Today you have some things you’re interested in talking about. So let’s talk about one of my favorite topics, a software bill of materials and also talking about sovereignty within software.

Stephanie Domas (03:29.725)
I’ll put in a good word.

CRob (03:48.867)
From your perspective, how are you seeing these ideas and the tooling around SBOM, how are you seeing that adopted or used within the ecosystem today?

Stephanie Domas (04:00.421)
Yeah, so I will say, I’ll preface it with SBOMs continue to show a lot of really great promise. I do think there is a lot of value to them. I think they’re really starting to deliver on some of those benefits from trying to implement them at scale, right across our entire distro, across the archives, right? There are still some implementation challenges. So one of the things that’s been on my mind a lot recently is around versioning.

CRob (04:28.273)
Ooh.

Stephanie Domas (04:29.863)
So I feel like every week I see some new vendor or I get approached by some vendor whose business model is to create patches for you on the open source that you use to help maintain your security.

It’s a really interesting business proposition. get why these companies are popping up. But when I start to think of S-bombs and some of the difficulty we’re having around implementing them, this idea of version control in open source, when we have companies coming out of the woodwork to create patches, just creates this swirl of chaos.

To add some more, to add a specific example around there, you think about semantic versioning, right? When we release the version of something, it’s 5.0.0. When we release a patch, when the original manufacturer, the upstream releases a patch, it’s 5.0.1, right? Then 5.0.2. And so S-bombs come into play because the S-bombs goal is to understand the version of the thing that is inside of your piece of software. And then it uses that to kind of look up and say, hey, what are any known security issues in this version. But as I continue to see and have even talked to some of these businesses whose whole value proposition is, we will actually maintain and develop patches for you for the things that you are using, you’re breaking this whole idea of semantic versioning that now you are potentially carrying a tremendous number of patches for security specifically that aren’t representing that versioning number. And so then the question becomes like, what do you do with that? If I’m using internally 5.0.0 and I’ve used one of these companies to develop a security patch for me, what does that mean to my version number? What do I put in my S-bomb when I have created my own security patches that I have chosen not to upstream? What does that mean? And so…

Stephanie Domas (06:23.931)
I see this as a really unique problem to open source, right? Where there’s, there’s a lot of upsides to open source, like the tremendous amount to be specific, but the code base being open and allowing people to develop their own patches now creates this slurry of confusion of, don’t think version numbers will be the thing we can rely on in S bonds. And I don’t know what the answer is, right? I don’t actually, I don’t have a good approach to this when we’re starting to see, you know, customers using these companies create their own patches, they use some internal vulnerability scanner, and then they come back to us and say, hey, there’s these problems, these reports I’m getting from my scanner that’s scanning my S-bombs, and we’re sitting there confused of like, but because you’re carrying these other patches, I mean, it’s causing so much confusion where inside of the company, they don’t even realize. And so they get these scanning reports, so they come to us. They’re like, well, I don’t know how to answer this question because you’ve done something else internally, which is great. You like that, but your risk position, you’ve hired one of these companies to give you patches. So it’s one of these interesting things that’s been on my mind recently of, don’t know what the answer is, but I think SBOM’s inversion control and open source is going to be a really big struggle in the coming years as we see more of these offshoots of, of patches and security maintenance coming from not upstream and with no intention to upstream them.

CRob (07:50.415)
And I think, you we talk, get to deal with a lot of global public policy concerns and thinking about like the CRA where manufacturers are, have a lot of pressure to produce these patches. They’re encouraged to share them back upstream, but they’re not required. So I imagine the situation that you’re describing where these vendors are kind of setting up this little cottage practice.

You’re only going to see more of that. again, it’s going to cause a lot of confusion for the end users and a lot of, I’m sure, frustrated calls to your support line saying, I’m using package one, two, three, well, the actual, canonical version, pun intended. Ours is XYZ. Again, I bet that would cause a lot of customer downstream confusion.

Stephanie Domas (08:39.579)
Yeah, the EU CRA is one of the forcing functions that I think will bring a lot of this to light is where you start to have these obligations around patching and add a testing to this to your customers. But then again, you start to get this.

untenable spaghetti muster. How is that virgining happening when that manufacturer has potentially again use one of these businesses to create patches internally that they did not upstream. And so what does that mean for the S-bomb of the users when you give that to the users? How are they supposed to make informed risk decisions about that when the version number in that S-bomb is either follows upstream, in which case it misrepresents the security posture or they’ve come up with their own versioning system that’s unique to the internal patches that they’re carrying. And so again, the poor users are left without being able to make those informed decisions. So yes, the EU CRA, one of my favorite topics, is it’s gonna be a forcing function for that. And I think it’s gonna…

Again, there’s no answers. Like I think we’re going to be forced to try and make some of these decisions about like, what, what does that mean? Right? How do you, how do users reason about their SBOMs when the version numbers in them may not make sense anymore.

CRob (09:53.423)
It’s good to know that you’ve jumped into one of the hardest problems in software engineering, versioning. Do you want to tackle the naming issue as well while we’re at it?

Stephanie Domas (10:03.205)
I’m here to solve all, actually I’m just here to call out the problems.

CRob (10:06.247)
So somewhat adjacent, you mentioned that you wanted to talk about kind of this trend in the ecosystem around sovereignty. The EU thinks about it as digital sovereignty, but we’ve seen this concept in a lot of different places. So what are some of your observations around this, air quotes, new movement?

Stephanie Domas (10:32.121)
Yeah, so sovereignty is a really interesting one. For those who haven’t been too exposed to it, we’re seeing a big push in things like data sovereignty, which is focused on where is my data, right? Is my data contained in some, name some geographical border?

The big one that has been coming up for me a lot recently is software sovereignty. So there are some countries and some regulations that are taking the approach that the way to drive trust in a piece of software is by where it was developed, where the developers that wrote that piece of code come from. And on a surface, I…

I can see why they’re thinking that, right? I don’t necessarily agree with that. I can see why they may be thinking, hey, if this software was developed by people with the same ideals as me, maybe I can assume that it is for the same purpose that I agree with, right? I can see that argument, but I don’t love that argument. So the thought process to me is, is this actually the most effective path to true security, right? What they’re trying to achieve through this focus on software sovereignty is in fact more trust in the software. And being economical, right? That our solutions are open source, right? When this has come up for me for ambitions to sell to different federal governments, right? And so I sit in these conversations where contractors and consultants are trying to explain to me like, well, here’s what you have to do to sell to this name federal government here.

And you sit there thinking like, none of this makes any sense for open source. It’s kind of the antithesis of it, right? Attestation that the code was only worked on by people in certain countries. And I keep sitting there thinking, this is a wild way to derive trust in software. And so I sit there asking myself, why is this happening, right? On one side of the fence, there’s the, yes, I derived trust in software by who developed it.

Stephanie Domas (12:33.157)
The other side of that and the side that I think open source solves is, it not a more robust defense, right? To not think about national borders, but instead globally accessible and audible code bases. And again, they’re in these meetings. So CMMC was a reaching one for those in the U S but they have them in all over the world sitting there thinking like there’s absolutely no way for me to achieve anything beyond level one in CMMC because I can’t and have no interest in attesting to build pipelines that are only from certain citizens. And again, it’s one of these where it’s it’s confusing to me. So it’s been on my mind sitting there in these meetings about this and support sovereignty, right? Support sovereignty has actually come up as well for us. We have experience not just governments, but customers now who are seeking support to only be performed by people in name geographic border. And so we’re having to try and develop strategies again around support sovereignty. And again, I go back to if the codebase is globally accessible and auditable, I just feel like that’s a better way to solve this problem of deriving trust in what is done in the software, even from a support perspective, if you want to derive trust in what changes they’re proposing, what code they’re proposing to you. The fact that you can see it all should be how you derive trust, not where that person is physically sitting in the world. So that’s me on my snowboard. Sovereignty for the moment.

CRob (14:02.611)
And I, I like in this movement to back when I was an in-house practitioner in a InfoSec team, that the business always was dismissive of the insider threat vector. The fact that they were so focused on external actors that these, that somebody’s going to come in and hack the organization. But you look at reports like the Verizon data breach report that comes out every year and consistently every year, the most damaging cybersecurity breaches are people outside of your organizational trust boundary. And that’s, know, they cause the most damage. They are the most frequent, whether it’s ignorance or malice, you know, the, the, is the root cause of the problem, but it’s still someone inside of your organization. And to extrapolate that from a nation, you’re going to have a much broader spectrum of people, even outside of the, your small enterprise. So I always thought I was kind of scratched my head why people were so dismissive of this because there’s evidence that shows that the insiders are actually the ones that can cause that have consistently caused the most damage and then when you blow that up.

Stephanie Domas (15:11.773)
And I can tell you, yeah, as someone who spent time as an adjunct professor at The Ohio State University teaching software development.

There’s a lot of not great coders out there, right? So even if their ideals align, even if like there is nothing associated with code coming from a domestic location, it ensures quality, right? Again, even if you’re not worried about malice, there’s a lot of bad developers out there. The ability to have a portable code base should be your number one focus. And so yeah.

CRob (15:46.119)
And it’s the whole, have the thousand eyes, transparency perspective. But what I’ve always loved about open source is the fact that it’s meritocracy based. That, you know, the best ideas get submitted. have, we are able to pull ideas from everywhere that, know, anyone’s idea is potentially equally as valid as everybody else’s. So I love that aspect and that makes the software stronger because it helps ideally.

Coach those developers like you alluded to that might not be as skilled or as aware of security techniques and ideally helps them improve themselves, but also helps ideally not let those bad commits get into the code.

Stephanie Domas (16:25.777)
Yeah, and I’ll be the first to admit that not all open source is quality, right? It’s a variety of things. But the point is that you have the ability to determine that yourself before you choose to use the code. So depending on your risk posture, you get to make that decision.

CRob (16:43.43)
Mm-hmm.

Absolutely. So again, kind of staying on this topic, from your perspective and like the developers and then the customers you get to work with, how has this new uptick of global cyber legislation impacted your mission and impacted your customers?

Stephanie Domas (17:04.698)
Yeah, so I’ll actually I’m going to circle back to our favorite legislation, which is the EU CRA. So this has been a really interesting one because it’s.

CRob (17:10.01)
Yay!

Stephanie Domas (17:15.549)
It’s forced a lot more light, I think from the customers we’re working into the open source they’re using, how they’re consuming it and putting a lot more intentionality into if I continue to consume it this way, what does that mean for my, my liability associated with this piece of code? So in a way it’s been really good that it’s forcing customers to think more about that. As you mentioned earlier about patching and thinking about how am I getting this patch? How am I incorporating this patch? Part of the EU CRA is requiring for security patching for the defined lifecycle of the device. And again, that’s actually driving some really interesting conversations about, okay, yeah, that makes sense for the piece of software I wrote, but I had these pieces that I was consuming in the open source space and what does that mean for these? And so I do think it’s driving a good conversation on what does that mean? I worry that there is still some confusion in the legislation and until we get the practice guidance, we won’t have the answers to around that enterprise consumption of open source. And what does that mean from my risk perspective, right? If you consume so, and I’ll also caveat. like there is a carve out in the EU CRA for open source stewards. Canonical is defied, we make open source, but we are not actually an open source too, right? Because we have monetization models around support and because we want to support our customers, we are actually considered a manufacturer. So we’re taking a manufacturer role. So a manufacturer role over open source is also kind of a big gray area in the practice guidance and the legislation of like, well, what does that mean? Because again, despite the fact that we’re taking manufacturer on Ubuntu, the amount of upstream dependencies of us is astronomical. And so what does that mean? Right? What does that mean for us taking liability on the Debian kernel? I don’t know.

It’s bit confusing at the moment of what that means because we are in fact in manufacturer consuming open source. then to our customers who we have business relationships with, right? We are signing CRA support clauses, right? We are giving them, we are taking manufactured. They don’t have to take manufacturer on the open source they consume from us. But again, it’s a bit unclear of what entirely that means. We are actively involved in things like the Etsy working group that is defined some of these practice standards, particular and operating systems and Ubuntu is one of our flagship products. So we’re involved in that, that working group to try and define like, what exactly does this mean for operating systems? But despite being best known for Ubuntu, we actually have a very large software portfolio. And there’s a lot of software in our portfolio that doesn’t have a vertical practice guidance being made. And so this like general horizontal one that would define the requirements for those products not really see much activity on that yet. So there’s also big unknown around. don’t know what those expectations are for our products right now, which again, have tremendous upstream dependencies. And in some case, we are a fraction of that code base, right? What canonical produces in that code base is a fraction of the overall code base. But when people consume it through, what does that mean? Because we took the manufacturer role on it.

CRob (20:33.415)
Yeah. Well, and you touched on it a little bit earlier that all open source is amazing. Not all of it is great software. It’s a big spectrum. You look at a student project or someone running an experiment that you stumble across on GitHub is a very different experience than like operating with Kubernetes or the Linux kernel or anything from like the Apache foundation. like when you have these large mature teams, they have a lot of different capabilities as opposed to random project you find on the internet that somebody might not have intended for it to be commercialized.

Stephanie Domas (21:08.416)
Yeah, and I was reading an article recently on, I think it was an NPM archive that they had done their statistical analysis on, but they had done all this interesting analysis on the number of essentially the number of different handles who had committed to a piece of software and their overall conclusion right was that.

CRob (21:23.911)
Mm-hmm.

Stephanie Domas (21:26.301)
The majority of open source was one maintainer on this archive, right? That they showed of the, you know, N number of most popular downloads over 50 % of them. And I can’t remember the number. It might’ve been something closer to 80 % was one person. And so again, it’s not open source has a lot of amazing things. Not all of it is well written, not all of it is well maintained. so again, it’s sort of forcing regulations like the EU CRA are forcing people to take a much better look at their supply chain to understand, you know, where am I consuming that from? that something that is well maintained? Because I can’t just take the old version, say it’s stable and not worry about it from there. Can’t do that anymore under the CRA. I have to do security patching on this thing and now I’m potentially responsible for it and what does that mean? I do worry and I hope that we won’t see a recoil of enterprises using open source because of that fear, right? Because there are large libraries out there where there aren’t people willing to step up and take the maintainer role on them. And that makes enterprises afraid to consume those products under the EU CRA. And so that’s a fear that I do have. And again, it’s because it’s a bit of a gray area about how the liability, if that open source repo decides to take steward status, what does that mean for the enterprise consuming that? If it’s something small, it’s a tiny library, maybe the risk isn’t large, but it may be a really meaningful part of the application.

And if there’s no, if you’re not consuming it through somebody willing to take manufacturing role in your chain, will you still be willing to consume that piece of open source? So I do worry about that. I hope that as the practice guidance comes out, that is clarified so there’s better understanding of what that means and we won’t see that recoiling. But I have tried to talk to some of the legislators that are working on that and actually they haven’t been able to answer that question for me yet either. Is there going to be clarification because this is an area of concern. So hopefully between now and more practice guidance, we get more clarification so we don’t see that recoiling.

CRob (23:34.915)
And I know that at least from the horizontal and vertical standard standpoint, we are very close to starting to get public drafts being able to be delivered sometime fourth quarter in 2025. And ideally that’ll allow us to see what the legislators are thinking and where they are planning on landing on some of these issues. And hopefully we get some clarifications. And ideally we get the chance to provide feedback and say, hey, great work.

But have you considered X, Y, and Z to help make this more precise and more clear?

Stephanie Domas (24:11.387)
Absolutely, and my understanding that, hopefully I’m remembering the numbers correctly, I think there’s 15 vertical standards being written in Etsy right now. And so while that’s an astronomical amount of practice guidance, so like tremendous amount of work being done, that still won’t cover all the products. And so there’s still going to be a tremendous number of products that are sitting outside of those 15 vertical standards. And so again, the question will be like, but what does that mean for all the other products that are not in these 15?

CRob (24:43.431)
So as we wind down, what thoughts do you have? What do you want our audience to take away? What actions would you like them to take based off this conversation?

Stephanie Domas (24:55.825)
Yeah, depending on your role in the space. mean, I’m going to go ahead and plug open SSF, right? So the issues that we talked about, like what does SBOM mean as open source starts to become divergent, right? All of these things, I think will only be solved in organizations like the open SSF, right? Who has this bringing all the collective parties together to think about what it means. Same with some of these gray areas I mentioned about in the EU CRA and what does that mean? We need those types of organizations. if anyone listened to this and was like, well, maybe I have ideas on how to solve because all Stephanie did was complain about the problem, but I want to be a part of the solution. Look at joining the organizations like OpenSSF. That’s where the solutions will come from. Right. I can see my side of the problem, but I can’t, I can’t, even if I had ideas, I can’t independently solve it. Organizations like OpenSSF can do.

CRob (25:44.902)
Right.

It’s very much a community set of issues and I think collectively we’re stronger in having a better solution together. My friend, I need to arrange an ice cream social event in the near future before we’re covered in snow. But thank you for your time Stephanie and thank you to you and the team over at Canonical for all your hard work. with that, we’re going to call this a wrap. Thank you everybody.

Stephanie Domas (25:56.805)
Agreed.

CRob (26:17.019)
Happy open sourcing out there.

Trustify joins GUAC

By Blog, Guest Blog

By Ben Cotton and Dejan Bosanac

The superpower of open source is multiple people working together on a common goal. That works for projects, too. GUAC and Trustify are two projects bringing visibility to the software supply chain. Today, they’re combining under the GUAC umbrella. With Red Hat’s contribution of Trustify to the GUAC project, the two combine to create a unified effort to address the challenges of consuming, processing, and utilizing supply chain security metadata at scale.

Why Join?

The Graph for Understanding Artifact Composition (GUAC) project was created to bring understanding to software supply chains. GUAC ingests software bills of materials (SBOMs) and enriches them with additional data to create a queryable graph of the software supply chain. Trustify also ingests and manages SBOMs, with a focus on security and compliance. With so much overlap, it makes sense to combine our efforts.

The grand vision for this evolved community is to become the central hub within OpenSSF for initiatives focused on building and using supply chain knowledge graphs. This includes: defining & promoting common standards, data models, & ontologies; developing shared infrastructure & libraries; improving the overall tooling ecosystem; fostering collaboration & knowledge sharing; and providing a clear & welcoming community for contributors.

What’s Next?

Right now, we’re working on the basic logistics: migrating repositories, updating websites, merging documentation. We have created a new GUAC Steering Committee that oversees two core projects: Graph for Understanding Artifact Composition (GUAC) and Trustify, and subprojects like sw-id-core and GUAC Visualizer. These projects have their own maintainers, but we expect to see a lot of cross-collaboration as everyone gets settled in.

If you’d like to learn more, join Ben Cotton and Dejan Bosanac at OpenSSF Community Day Europe for their talk on Thursday 28 August. If you can’t make it to Amsterdam, the community page has all of the ways you can engage with our community.

Author Bios

Ben Cotton is the open source community lead at Kusari, where he contributes to GUAC and leads the OSPS Baseline SIG. He has over a decade of leadership experience in Fedora and other open source communities. His career has taken him through the public and private sector in roles that include desktop support, high-performance computing administration, marketing, and program management. Ben is the author of Program Management for Open Source Projects and has contributed to the book Human at a Distance and to articles in The Next Platform, Opensource.com, Scientific Computing, and more.

Dejan Bosanac is a software engineer at Red Hat with an interest in open source and integrating systems. Over the years he’s been involved in various open source communities tackling problems like: Software supply chain security, IoT cloud platforms and Edge computing and Enterprise messaging.

 

Member Spotlight: Datadog – Powering Open Source Security with Tools, Standards, and Community Leadership

By Blog

Datadog, a leading cloud-scale observability and security platform, joined the Open Source Security Foundation (OpenSSF) as a Premier Member in July, 2024. With both executive leadership and deep technical involvement, Datadog has rapidly become a force in advancing secure open source practices across the industry.

Key Contributions

GuardDog: Open Source Threat Detection

In early 2025, Datadog launched GuardDog, a Python-based open source tool that scans package ecosystems like npm, PyPI, and Go for signs of malicious behavior. GuardDog is backed by a publicly available threat dataset, giving developers and organizations real-time visibility into emerging supply chain risks.

This contribution directly supports OpenSSF’s mission to provide practical tools that harden open source ecosystems against common attack vectors—while promoting transparency and shared defense.

Datadog actively supports the open source security ecosystem through its engineering efforts, tooling contributions, and participation in the OpenSSF community:

  • SBOM Generation and Runtime Insights
    Datadog enhances the usability and value of Software Bills of Materials (SBOMs) through tools and educational content. Their blog, Enhance SBOMs with runtime security context, outlines how they combine SBOM data with runtime intelligence to identify real-world risks and vulnerabilities more effectively.
  • Open Source Tools Supporting SBOM Adoption
    Datadog maintains the SBOM Generator, an open source tool based on CycloneDX, which scans codebases to produce high-quality SBOMs. They also released the datadog-sca-github-action, a GitHub Action that automates SBOM generation and integrates results into the Datadog platform for improved visibility.
  • Sigstore and Software Signing
    As part of the OpenSSF ecosystem, Datadog supports efforts like Sigstore to bring cryptographic signing and verification to the software supply chain. These efforts align with Datadog’s broader commitment to improving software provenance and integrity, especially as part of secure build and deployment practices.
  • OpenSSF Membership
    As a Premier Member of OpenSSF, Datadog collaborates with industry leaders to advance best practices, contribute to strategic initiatives, and help shape the future of secure open source software.

These collaborations demonstrate Datadog’s investment in long-term, community-driven approaches to open source security.

What’s Next

Datadog takes the stage at OpenSSF Community Day North America on Thursday, June 26, 2025, in Denver, CO, co-located with Open Source Summit North America.

They’ll be presenting alongside Intel Labs in the session:

Talk Title: Harnessing In-toto Attestations for Security and Compliance With Next-gen Policies
Time: 3:10–3:30 PM MDT
Location: Bluebird Ballroom 3A
Speakers:

  • Trishank Karthik Kuppusamy, Staff Engineer, Datadog
  • Marcela Melara, Research Scientist, Intel Labs

This session dives into the evolution of the in-toto Attestation Framework, spotlighting new policy standards that make it easier for consumers and auditors to derive meaningful insights from authenticated metadata—such as SBOMs and SLSA Build Provenance. Attendees will see how the latest policy framework bridges gaps in compatibility and usability with a flexible, real-world-ready approach to securing complex software supply chains.

Register now and connect with Datadog, Intel Labs, and fellow open source security leaders in Denver.

Why It Matters

By contributing to secure development frameworks, creating open source tooling, and educating the broader community, Datadog exemplifies what it means to be an OpenSSF Premier Member. Their work is hands-on, standards-driven, and deeply collaborative—helping make open source safer for everyone.

Learn More