How to Make High-Quality SBOMs

By John Speed Meyers, Chainguard

A software bill of materials (SBOM, pronounced s-bomb), an ingredient list of software components, has rightfully become a popular topic among those interested in open source software security. By enabling software transparency, SBOMs offer the possibility of improved security (not to mention better software quality and license compliance) for any person or organization that produces or uses software. And given that anybody who produces or uses software is nearly everybody, SBOMs deserve a special seat at the open source software security table. 

This widespread use of SBOMs arguably depends, however, on SBOM quality: that SBOMs contain sufficient and accurate information for the intended user to achieve their goals. But, until recently, it has been difficult to measure SBOM quality. New SBOM quality tools, a new SBOM dataset, and new SBOM quality research change this state of affairs. What do these new tools, datasets, and research findings say about the current state of SBOM quality?

Measuring SBOM quality is possible.

Measuring SBOM quality is not an intractable problem, though different use cases will require different quality standards. For instance, using the “minimum elements” standard, previously defined by the United States Department of Commerce, is one sensible approach to ensuring adequate information for avoiding components with known vulnerabilities.

SBOMs vary widely in quality.

Some SBOMs lack key data that would make them useful. For instance, only 1 percent of nearly 3000 analyzed SBOMs appear to be compliant with the National Telecommunications and Information Administration (NTIA) minimum elements standard. But many SBOMs, including some currently being produced by open source projects, are high quality, enabling the widespread use that the “SBOMs Everywhere” stream is aiming for.

Improving SBOM quality is feasible.

These new SBOM quality tools allow open source projects, organizations, and individuals to measure SBOM quality and improve the SBOMs that their tools produce or consume.

In short, making sure the SBOMs that will be everywhere are high quality is possible. After describing a couple of SBOM quality tools, a new SBOM dataset, and related SBOM-quality research findings, this post suggests actions for those in the open source community who are interested in making high-quality SBOMs everywhere.

SBOM Quality Tools

One way to measure SBOM quality is to define a set of minimum data fields that must be present. Two relatively new open source tools take this approach.

One tool, NTIA Conformance Checker, is a project housed within the SPDX GitHub organization. Written in Python, the tool can check whether an SPDX document possesses the data fields described by the National Telecommunications and Information Administration’s “minimum elements” document. The minimum elements are intended to support basic SBOM functionality, including vulnerability management, and include fields such as component name, component version, supplier name, other unique identifiers, dependency relationships, SBOM author, and timestamp.
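To make the idea of a minimum-fields check concrete, here is a minimal Python sketch that looks for the elements listed above in an SPDX 2.x JSON document that has already been loaded into a dictionary. It does not reproduce the NTIA Conformance Checker's actual logic. The mapping from each minimum element to an SPDX JSON key (for example, treating SPDXID as the "other unique identifier"), the function name missing_minimum_elements, and the sample document are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# Assumed mapping from NTIA minimum elements to SPDX 2.x JSON package keys.
PACKAGE_FIELDS = {
    "component name": "name",
    "component version": "versionInfo",
    "supplier name": "supplier",
    "unique identifier": "SPDXID",
}


def missing_minimum_elements(doc: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []

    # Document-level elements: SBOM author, timestamp, dependency relationships.
    creation = doc.get("creationInfo", {})
    if not creation.get("creators"):
        problems.append("document: no SBOM author")
    if not creation.get("created"):
        problems.append("document: no timestamp")
    if not doc.get("relationships"):
        problems.append("document: no dependency relationships")

    # Package-level elements: a missing key or NOASSERTION counts as absent.
    for index, package in enumerate(doc.get("packages", [])):
        label = package.get("name") or f"package #{index}"
        for element, key in PACKAGE_FIELDS.items():
            value = package.get(key)
            if not value or value == "NOASSERTION":
                problems.append(f"{label}: no {element}")
    return problems


# Hypothetical document, invented for this example.
example = {
    "creationInfo": {
        "created": "2023-01-01T00:00:00Z",
        "creators": ["Tool: example-generator"],
    },
    "packages": [
        {"name": "libfoo", "SPDXID": "SPDXRef-libfoo",
         "versionInfo": "1.2.3", "supplier": "Organization: Example"},
        {"name": "libbar", "SPDXID": "SPDXRef-libbar",
         "supplier": "NOASSERTION"},
    ],
    "relationships": [
        {"spdxElementId": "SPDXRef-DOCUMENT",
         "relationshipType": "DESCRIBES",
         "relatedSpdxElement": "SPDXRef-libfoo"},
    ],
}

for problem in missing_minimum_elements(example):
    print(problem)
```

In this sketch a single absent field is enough to make a document fail, which is the property that makes an all-or-nothing conformance check strict.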

Another tool, SBOM Scorecard, is an eBay open source project created by Justin Abrahms. Written in Go, SBOM Scorecard accepts SBOM documents in both SPDX and CycloneDX formats and determines whether an SBOM is specification compliant and whether it contains package IDs (via a PURL or CPE), package versions, and licenses. In contrast to the NTIA Conformance Checker tool, the SBOM Scorecard tool includes a focus on open source licenses and on ensuring SBOMs include either a PURL or CPE for each software component listed. A web-based version of the tool can also be found at sbom-scorecard.dev.
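A scorecard-style measurement can be sketched in a similar way. The Python below is not SBOM Scorecard's implementation (that tool is written in Go and also validates specification compliance). It only illustrates computing, for an SPDX 2.x JSON document, the share of packages that carry a PURL or CPE, a version, and a license. The helper names, the choice of SPDX keys, and the sample document are assumptions for illustration.

```python
from __future__ import annotations

from typing import Any

# SPDX external reference types treated here as package IDs (PURL or CPE).
ID_REFERENCE_TYPES = ("purl", "cpe22Type", "cpe23Type")
# Values treated as "no information supplied" (an assumption of this sketch).
EMPTY_VALUES = (None, "", "NOASSERTION", "NONE")


def has_package_id(package: dict[str, Any]) -> bool:
    """True if any external reference is a PURL or CPE with a locator."""
    return any(
        ref.get("referenceType") in ID_REFERENCE_TYPES
        and bool(ref.get("referenceLocator"))
        for ref in package.get("externalRefs", [])
    )


def has_version(package: dict[str, Any]) -> bool:
    return package.get("versionInfo") not in EMPTY_VALUES


def has_license(package: dict[str, Any]) -> bool:
    """True if either the concluded or the declared license is filled in."""
    return any(
        package.get(key) not in EMPTY_VALUES
        for key in ("licenseConcluded", "licenseDeclared")
    )


def coverage(doc: dict[str, Any]) -> dict[str, float]:
    """Fraction of packages (0.0 to 1.0) that pass each check."""
    checks = {
        "package IDs": has_package_id,
        "versions": has_version,
        "licenses": has_license,
    }
    packages = doc.get("packages", [])
    if not packages:
        return {name: 0.0 for name in checks}
    return {
        name: sum(1 for package in packages if check(package)) / len(packages)
        for name, check in checks.items()
    }


# Hypothetical document, invented for this example.
example = {
    "packages": [
        {"name": "libfoo", "versionInfo": "1.0", "licenseConcluded": "MIT",
         "externalRefs": [{"referenceCategory": "PACKAGE-MANAGER",
                           "referenceType": "purl",
                           "referenceLocator": "pkg:pypi/libfoo@1.0"}]},
        {"name": "libbar", "versionInfo": "2.0",
         "licenseConcluded": "NOASSERTION",
         "licenseDeclared": "NOASSERTION"},
    ],
}

print(coverage(example))
```

Reporting a fraction per check, rather than a single pass or fail, shows which kind of data a given SBOM generator tends to omit.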

Although both tools are new, their existence suggests that measuring SBOM quality is achievable.

A New SBOM Dataset

While these new tools are welcome, they become useful only when applied to SBOMs. A new SBOM dataset, bom-shelter, provides two distinct SBOM mini-datasets to help parties that want to test out these or related SBOM analysis tools and to help researchers that want to systematically analyze a large set of SBOMs.

One dataset contains over 50 SBOMs found “in the wild,” that is, from a variety of open source projects. Another dataset contains approximately 3000 SPDX SBOMs generated by four different SBOM generation tools. While this second dataset is larger, it is potentially less representative of “real world” SBOMs since these SBOMs were created specifically for research purposes rather than gathered from real projects. Together with new SBOM quality tools, these datasets enable research on the state of SBOM quality.

SBOM Quality Research Findings

To examine SBOM quality, I conducted preliminary research (here and here) on the topic with the tools and datasets mentioned above. To put it simply, SBOM quality varies widely.

The NTIA Conformance Checker tool, when applied to the 3000-SBOM dataset, revealed that only one percent of those SBOMs complied with the NTIA minimum elements. But before the reader laments this state of affairs, it’s worth pointing out that the same analysis also found that most data fields are supplied in most of the analyzed SBOMs. These SBOMs are rich in data, but there is certainly still work to be done.
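Low overall compliance and high per-field coverage can coexist because a document is compliant only when every required field is present. The Python sketch below illustrates the arithmetic with a toy table of per-SBOM check results. The four rows are invented for illustration and are not drawn from the dataset or the research described here, and the function names are hypothetical.

```python
from __future__ import annotations


def field_coverage(results: list[dict[str, bool]]) -> dict[str, float]:
    """Share of SBOMs that supply each field, computed field by field."""
    fields = results[0].keys()
    return {
        field: sum(result[field] for result in results) / len(results)
        for field in fields
    }


def compliance_rate(results: list[dict[str, bool]]) -> float:
    """Share of SBOMs that supply every field at once."""
    return sum(all(result.values()) for result in results) / len(results)


# Toy data, invented for this example: one row per SBOM, True if the
# field is present for all of that SBOM's components.
toy_results = [
    {"name": True, "version": True, "supplier": True},
    {"name": True, "version": True, "supplier": False},
    {"name": True, "version": True, "supplier": False},
    {"name": True, "version": True, "supplier": False},
]

print(field_coverage(toy_results))
print(compliance_rate(toy_results))
```

In the toy table, two of the three fields are present in every SBOM, yet only one SBOM in four passes, because one commonly missing field drags down the all-fields rate.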

The SBOM Scorecard tool, used in conjunction with the open source project SBOMs found in the wild, came to a similar conclusion. While some of these SBOMs did lack basic information, such as component versions or component licenses, several of the analyzed SBOMs in this small dataset stood out as excellent. Some open source projects are already producing high-quality SBOMs.

How to Make High-Quality SBOMs 

For those in the open source community who seek to make high-quality SBOMs, there are a few steps that you can take.

  1. Measure the quality of the SBOMs you care about. You can try out the tools (NTIA Conformance Checker and SBOM Scorecard) described above, including in a convenient web app format; bug reports and feature requests are welcome. Or you can write your own!
  2. Work to improve the SBOMs you find, either fixing the particular SBOM or submitting issues or contributions to the SBOM generation tool you use.
  3. Contribute to the libraries that underpin SBOM development, such as SPDX’s Python library.

In the future, when SBOMs are everywhere, we’ll be thankful that those SBOMs are also high quality, enabling a more secure open source software ecosystem for all. Anyone who wants to contribute further can join OpenSSF’s SBOM Everywhere group.

 

John Speed Meyers is a principal security scientist at Chainguard, a software supply chain security company.