Skip to main content
All Posts By

OpenSSF

SBOMs in the Era of the CRA: Toward a Unified and Actionable Framework

By Blog, Guest Blog

By Madalin Neag, Kate Stewart, and David A. Wheeler

In our previous blog post, we explored how the Software Bill of Materials (SBOM) should not be a static artifact created only to comply with some regulation, but should be a decision ready tool. In particular, SBOMs can support risk management. This understanding is increasing thanks to the many who are trying to build an interoperable and actionable SBOM ecosystem. Yet, fragmentation, across formats, standards, and compliance frameworks, remains the main obstacle preventing SBOMs from reaching their full potential as scalable cybersecurity tools. Building on the foundation established in our previous article, this post dives deeper into the concrete mechanisms shaping that evolution, from the regulatory frameworks driving SBOM adoption to the open source initiatives enabling their global interoperability. Organizations should now treat the SBOM not merely as a compliance artifact to be created and ignored, but as an operational tool to support security and augment asset management processes to ensure vulnerable components are identified and updated in a timely proactive way. This will require actions to unify various efforts worldwide into an actionable whole.

Accelerating Global Policy Mandates

The global adoption of the Software Bill of Materials (SBOM) was decisively accelerated by the U.S. Executive Order 14028 in 2021, which mandated SBOMs for all federal agencies and their software vendors. This established the SBOM as a cybersecurity and procurement baseline, reinforced by the initial NTIA (2021) Minimum Elements (which required the supplier, component name, version, and relationships for identified components). Building on this foundation, U.S. CISA (2025) subsequently updated these minimum elements, significantly expanding the required metadata to include fields essential for provenance, authenticity, and deeper cybersecurity integration. In parallel, European regulatory momentum is similarly mandating SBOMs for market access, driven by the EU Cyber Resilience Act (CRA). Germany’s BSI TR-03183-2 guideline complements the CRA by providing detailed technical and formal requirements, explicitly aiming to ensure software transparency and supply chain security ahead of the CRA’s full enforcement.

To prevent fragmentation and ensure these policy mandates translate into operational efficiency, a wide network of international standards organizations is driving technical convergence at multiple layers. ISO/IEC JTC 1/SC 27 formally standardizes and oversees the adoption of updates to ISO/IEC 5962 (SPDX), evaluating and approving revisions developed by the SPDX community under The Linux Foundation. The standard serves as a key international baseline, renowned for its rich data fields for licensing and provenance and support for automation of risk analysis of elements in a supply chain. Concurrently, OWASP and ECMA International maintain ECMA-424 (OWASP CycloneDX), a recognized standard optimized specifically for security automation and vulnerability linkage. Within Europe, ETSI TR 104 034, the “SBOM Compendium,” provides comprehensive guidance on the ecosystem, while CEN/CENELEC is actively developing the specific European standard (under the PT3 work stream) that will define some of the precise SBOM requirements needed to support the CRA’s vulnerability handling process for manufacturers and stewards.

Together, these initiatives show a clear global consensus: SBOMs must be machine-readable, verifiable, and interoperable, supporting both regulatory compliance over support windows and real-time security intelligence. This global momentum set the stage for the CRA, which now transforms transparency principles into concrete regulatory obligations.

EU Cyber Resilience Act (CRA): Establishing a Legal Requirement

The EU Cyber Resilience Act (CRA) (Regulation (EU) 2024/2847) introduces a legally binding obligation for manufacturers to create, 1maintain, and retain a Software Bill of Materials (SBOM) for all products with digital elements marketed within the European Union. This elevates the SBOM from a voluntary best practice to a legally required element of technical documentation, essential for conformity assessment, security assurance, and incident response throughout a product’s lifecycle. In essence, the CRA transforms this form of software transparency from a recommendation into a condition for market access.

Core Obligations for Manufacturers under the CRA include:

  • SBOM Creation – Manufacturers must prepare an SBOM in a commonly used, machine-readable format [CRA I(II)(1)], such as SPDX or CycloneDX.
  • Minimum Scope – The SBOM must cover at least the top-level dependencies of the product  [CRA I(II)(1)]. While this is the legal minimum, including deeper transitive dependencies is strongly encouraged.
  • Inclusion & Retention – The SBOM must form part of the mandatory technical documentation and be retained for at least ten years (Art.13) after the product has been placed on the market.
  • Non-Publication Clause – The CRA requires the creation and availability of an SBOM but does not mandate its public disclosure Recital 77. Manufacturers must provide the SBOM upon request to market surveillance authorities or conformity assessment bodies for validation, audit, or incident investigation purposes.
  • Lifecycle Maintenance – The SBOM must be kept up to date throughout the product’s maintenance and update cycles, ensuring that any component change or patch is reflected in the documentation Recital 90.
  • Vulnerability Handling – SBOMs provide the foundation for identifying component vulnerabilities under the CRA, while full risk assessment requires complementary context such as exploitability and remediation data. (Annex I)

The European Commission is empowered, via delegated acts under Article 13(24), to further specify the format and required data elements of SBOMs, relying on international standards wherever possible. To operationalize this, CEN/CENELEC is developing a European standard under the ongoing PT3 work stream, focused on vulnerability handling for products with digital elements and covering the essential requirements of Annex I, Part II of the CRA. Its preparation phase includes dedicated sub-chapters on formalizing SBOM structures, which will serve as the foundation for subsequent stages of identifying vulnerabilities and assessing related threats (see “CRA workshop ‘Deep dive session: Vulnerability Handling” 1h36m35s).

In parallel, the U.S. Cybersecurity and Infrastructure Security Agency (CISA) continues to shape global SBOM practices through its “Minimum Elements” framework and automation initiatives. These efforts directly influence Europe’s focus on interoperability and structured vulnerability handling under the CRA. This transatlantic alignment helps ensure SBOM data models and processes evolve toward a consistent, globally recognized baseline. CISA recently held a public comment window ending October 2, 2025 on a draft version of a revised set of minimum elements, and is expected to publish an update to the original NTIA Minimum Elements in the coming months.

Complementing these efforts, Germany’s BSI TR-03183-2 provides a more detailed technical specification than the original NTIA baseline, introducing requirements for cryptographic checksums, license identifiers, update policies, and signing mechanisms. It already serves as a key reference for manufacturers preparing to meet CRA compliance and will likely be referenced in the forthcoming CEN/CENELEC standard together with ISO/IEC and CISA frameworks. Together, the CRA and its supporting standards position Europe as a global benchmark for verifiable, lifecycle aware SBOM implementation, bridging policy compliance with operational security.

Defining the Unified Baseline: Convergence in Data Requirements

The SBOM has transitioned from a best practice into a legal and operational requirement due to the European Union’s Cyber Resilience Act (CRA). While the CRA mandates the SBOM as part of technical documentation for market access, the detailed implementation is guided by documents like BSI TR-03183-2. To ensure global compliance and maximum tool interoperability, stakeholders must understand the converging minimum data requirements. To illustrate this concept, the following comparison aligns the minimum SBOM data fields across the NTIA, CISA, BSI, and ETSI frameworks, revealing a shared move toward completeness, verifiability, and interoperability.

Data Field NTIA (U.S., 2021 Baseline) CISA’s Establishing a Common SBOM (2024) BSI TR-03183-2 (Germany/CRA Guidance) (2024) ETSI TR 104 034 (Compendium) (2025)
Component Name Required Required Required Required
Component Version Required Required Required Required
Supplier Required Required Required Required
Unique Identifier (e.g., PURL, CPE) Required Required  Required Required
Cryptographic Hash Recommended Required Required  Optional
License Information Recommended Required  Required Optional
Dependency Relationship Required  Required  Required Required
SBOM Author Required Required  Required  Required
Timestamp (Date of Creation) Required Required Required Required
Tool Name / Generation Context Not noted Not noted Required  Optional
Known Unknowns Declaration Optional Required Optional Optional
Common Format Required  Not noted Required  Required
Depth Not noted Not noted Not noted Optional

 

  • NTIA (2021): Established the basic inventory foundation necessary to identify components (who, what, and when).
  • CISA (2024): Framing Software Component Transparency establishes a structured maturity model by defining minimum, recommended, and aspirational SBOM elements, elevating SBOMs from simple component lists to verifiable security assets. CISA is currently developing further updates expected in 2025 to extend these principles into operational, risk-based implementation guidance.
  • BSI TR-03183-2: Mirrors the CISA/NTIA structure but mandates strong integrity requirements (Hash, Licenses) from the compliance perspective of the CRA, confirming the strong global convergence of expectations.
  • ETSI TR 104 034: As a technical compendium, it focuses less on specific minimum fields and more on the necessary capabilities of a functional SBOM ecosystem (e.g., trust, global discovery, interoperability, and lifecycle management).

The growing alignment across these frameworks shows that the SBOM is evolving into a globally shared data model, one capable of enabling automation, traceability, and trust across the international software supply chain.

Dual Standard Approach: SPDX and CycloneDX

The global SBOM ecosystem is underpinned by two major, robust, and mature open standards: SPDX and CycloneDX. Both provide a machine-processable format for SBOM data and support arbitrary ecosystems. These standards, while both supporting all the above frameworks, maintain distinct origins and strengths, making dual format support a strategic necessity for global commerce and comprehensive security.

The Software Package Data Exchange (SPDX), maintained by the Linux Foundation, is a comprehensive standard formally recognized by ISO/IEC 5962 in 2021. Originating with a focus on capturing open source licensing and intellectual property in a machine readable format, SPDX excels in providing rich, detailed metadata for compliance, provenance, legal due diligence, and supply chain risk analysis. Its strengths lie in capturing complex license expressions (using the SPDX License List and SPDX license expressions) and tracking component relationships in great depth, together with its extensions to support linkage to security advisories and vulnerability information, making it the preferred standard for rigorous regulatory audits and enterprise-grade software asset management. As the only ISO-approved standard, it carries significant weight in formal procurement processes and traditional compliance environments.  It supports multiple formats (JSON, XML, YAML, Tag/Value, and XLS) with free tools to convert between the formats and promote interoperability. 

The SPDX community has continuously evolved the specification since its inception in 2010, and most recently has extended it to a wider set of metadata to support modern supply chain elements, with the publication of SPDX 3.0 in 2024. This update to the specification contains additional fields & relationships to capture a much wider set of use cases found in modern supply chains including AI. These additional capabilities are captured as profiles, so that tooling only needs to understand the relevant sets, yet all are harmonized in a consistent framework, which is suitable for supporting product line management Fields are organized into a common “core”, and there are “software” and “licensing” profiles, which cover what was in the original specification ISO/IEC 5962. In addition there is now a “security” profile, which enables VEX and CSAF use cases to be contained directly in exported documents, as well as in databases  There is also a “build” profile which supports high fidelity tracking of relevant build information for “Build” type SBOMs. SPDX 3.0 also introduced a “Data” and “AI” related profiles, which made accurate tracking of AI BOMs possible, with support for all the requirements of the EU AI Act (see table in linked report).  As of writing, the SPDX 3.0 specification is in the final stages of being submitted to ISO/IEC for consideration. 

CycloneDX, maintained by OWASP and standardized as ECMA-424, is a lightweight, security-oriented specification for describing software components and their interdependencies. It was originally developed within the OWASP community to improve visibility into software supply chains. The specification provides a structured, machine-readable inventory of elements within an application, capturing metadata such as component versions, hierarchical dependencies, and provenance details. Designed to enhance software supply chain risk management, CycloneDX supports automated generation and validation in CI/CD environments and enables early identification of vulnerabilities, outdated components, or licensing issues. Besides its inclusion with SPDX in the U.S. federal government’s 2021 cybersecurity Executive Order, its formal recognition as an ECMA International standard in 2023 underscore its growing role as a globally trusted format for software transparency. Like SPDX, CycloneDX has continued to evolve since formal standardization and the current release is 1.7, released October 2025.

The CycloneDX specification continues to expand under active community development, regularly publishing revisions to address new use cases and interoperability needs. Today, CycloneDX extends beyond traditional SBOMs to support multiple bill-of-materials types, including Hardware (HBOM), Machine Learning (ML-BOM), and Cryptographic (CBOM), and can also describe relationships with external SaaS and API services. It integrates naturally with vulnerability management workflows through formats such as VEX, linking component data to exploitability and remediation context. With multi-format encoding options (JSON, XML, and Protocol Buffers) and a strong emphasis on automation.

OpenSSF and the Interoperability Toolkit

The OpenSSF has rapidly become a coordination hub uniting industry, government, and the open source community around cohesive SBOM development. Its mission is to bridge global regulatory requirements, from the EU’s Cyber Resilience Act (CRA) to CISA’s Minimum Elements and other global mandates, with practical, open source technical solutions. This coordination is primarily channeled through the “SBOM Everywhere” Special Interest Group (SIG), a neutral and open collaboration space that connects practitioners, regulators, and standards bodies. The SIG plays a critical role in maintaining consistent semantics and aligning development efforts across CISA, BSI, NIST, CEN/CENELEC, ETSI, and the communities implementing CRA-related guidance. Its work ensures that global policy drivers are directly translated into unified, implementable technical standards, helping prevent the fragmentation that so often accompanies fast-moving regulation.

A major focus of OpenSSF’s work is on delivering interoperability and automation tooling that turns SBOM policy into practical reality:

  • Protobom tackles one of the field’s toughest challenges – format fragmentation – by providing a format-agnostic data model capable of seamless, lossless conversion between SPDX, CycloneDX, and emerging schemas.
  • BomCTL builds on that foundation and offers a powerful, developer-friendly command line utility designed for CI/CD integration. It handles SBOM generation, validation, and transformation, allowing organizations to automate compliance and security workflows without sacrificing agility. Together, Protobom and Bomctl  embody the principles shared by CISA and the CRA: ensuring that SBOM data is modular, transparent, and portable across tools, supply chains, and regulatory environments worldwide.

Completing this ecosystem is SBOMit, which manages the end-to-end SBOM lifecycle. It provides built-in support for creation, secure storage, cryptographic signing, and controlled publication, embedding trust, provenance, and lifecycle integrity directly into the software supply chain process. These projects are maintained through an open, consensus-driven model, continuously refined by the global SBOM community. Central to that collaboration are OpenSSF’s informal yet influential “SBOM Coffee Club” meetings, held every Monday, where developers, vendors, and regulators exchange updates, resolve implementation challenges, and shape the strategic direction of the next generation of interoperable SBOM specifications.

OpenSSF’s strategic support for both standards – SPDX and CycloneDX – is vital for the entire ecosystem. By contributing to and utilizing both formats, most visibly through projects like Protobom and BomCTL which enable seamless, lossless translation between the two, OpenSSF ensures that organizations are not forced to choose between SPDX and CycloneDX. This dual format strategy satisfies the global requirement for using both formats  and maximizes interoperability, guaranteeing that SBOM data can be exchanged between all stakeholders, systems, and global regulatory jurisdictions effectively.

A Shared Vision for Action

Through this combination of open governance and pragmatic engineering, OpenSSF is defining not only how SBOMs are created and exchanged, but how the world collaborates on software transparency.

The collective regulatory momentum, anchored by the EU Cyber Resilience Act (CRA) and the U.S. Executive Order 14028, supported by the CISA 2025 Minimum Elements revisions, has cemented the global imperative for Software Bill of Materials (SBOM). These frameworks illustrate deep global alignment: both the CRA and CISA emphasize that SBOMs must be structured, interoperable, and operationally useful for both compliance and cybersecurity. The CRA establishes legally binding transparency requirements for market access in Europe, while CISA’s work encourages SBOMs within U.S. federal procurement, risk management, and vulnerability intelligence workflows. Together, they define the emerging global consensus: SBOMs must be complete enough to satisfy regulatory obligations, yet structured and standardized enough to enable automation, continuous assurance, and actionable risk insight. The remaining challenge is eliminating format and semantic fragmentation to transform the SBOM into a universal, enforceable cybersecurity control.

Achieving this global scalability requires a unified technical foundation that bridges legal mandates and operational realities. This begins with Core Schema Consensus, adopting the NTIA 2021 baseline and extending it with critical metadata for integrity (hashes), licensing, provenance, and generation context, as already mandated by BSI TR-03183-2 and anticipated in forthcoming CRA standards. To accommodate jurisdictional or sector-specific data needs, the CISA “Core + Extensions” model provides a sustainable path: a stable global core for interoperability, supplemented by modular extensions for CRA, telecom, AI, or contractual metadata. Dual support for SPDX and CycloneDX remains essential, satisfying the CRA’s “commonly used formats” clause and ensuring compatibility across regulatory zones, toolchains, and ecosystems.

Ultimately, the evolution toward global, actionable SBOMs depends on automation, lifecycle integrity, and intelligence linkage. Organizations should embed automated SBOM generation and validation (using tools such as Protobom, BomCTL, and SBOMit) into CI/CD workflows, ensuring continuous updates and cryptographic signing for traceable trust. By connecting SBOM information with vulnerability data in internal databases, the SBOM data becomes decision-ready, capable of helping identify exploitable or end-of-life components and driving proactive remediation. This operational model, mirrored in the initiatives of Japan (METI), South Korea (KISA/NCSC), and India (MeitY), reflects a decisive global movement toward a single, interoperable SBOM ecosystem. Continuous engagement in open governance forums, ISO/IEC JTC 1, CEN/CENELEC, ETSI, and the OpenSSF SBOM Everywhere SIG, will be essential to translate these practices into a permanent international standard for software supply chain transparency.

Conclusion: From Compliance to Resilient Ecosystem

The joint guidance A Shared Vision of SBOM for Cybersecurity insists on these global synergies under the endorsement of 21 international cybersecurity agencies. Describing the SBOM as a “software ingredients list,” the document positions SBOMs as essential for achieving visibility, building trust, and reducing systemic risk across global digital supply chains. That document’s central goal is to promote immediate and sustained international alignment on SBOM structure and usage, explicitly urging governments and industries to adopt compatible, unified systems rather than develop fragmented, country specific variants that could jeopardize scalability and interoperability.

The guidance organizes its vision around four key, actionable principles aimed at transforming SBOMs from static compliance documents into dynamic instruments of cybersecurity intelligence:

  • Modular Architecture – Design SBOMs around a Core Schema Baseline that satisfies essential minimum elements (component identifiers, supplier data, versioning) and expand it with optional extensions for domain specific or regulatory contexts (e.g., CRA compliance, sectoral risk requirements). Support and improve OSS tooling that enables processing and sharing of this data in a variety of formats.
  • Trust and Provenance – Strengthen authenticity and metadata transparency by including details about the generation tools, context, and version lineage, ensuring trust in the accuracy and origin of SBOM data.
  • Actionable Intelligence – Integrate SBOM data with vulnerability and incident-response frameworks such as VEX and CSAF, converting static component inventories into decision ready, risk aware security data.
  • Open Governance – Encourage sustained public–private collaboration through OpenSSF, ISO/IEC, CEN/CENELEC, and other international bodies to maintain consistent semantics and prevent fragmentation.

This Shared Vision complements regulatory frameworks like the EU Cyber Resilience Act (CRA) and reinforces the Open Source Security Foundation’s (OpenSSF) mission to achieve cross-ecosystem interoperability. Together, they anchor the future of SBOM governance in openness, modularity, and global collaboration, paving the way for a truly unified software transparency model.

The primary challenge to achieving scalable cyber resilience lies in the fragmentation of the SBOM landscape. Global policy drivers, such as the EU Cyber Resilience Act (CRA), the CISA-led Shared Vision of SBOM for Cybersecurity, and national guidelines like BSI TR-03183, have firmly established the mandate for transparency. However, divergence in formats, semantics, and compliance interpretations threatens to reduce SBOMs to static artifacts generated only because some regulation requires that they be created, rather than dynamic assets that can aid in security. Preventing this outcome requires a global commitment to a unified SBOM framework, a lingua franca capable of serving regulatory, operational, and security objectives simultaneously. This framework must balance policy diversity with technical capability universality, ensuring interoperability between European regulation, U.S. federal procurement mandates, and emerging initiatives in Asia and beyond. The collective engagement of ISO/IEC, ETSI, CEN/CENELEC, BSI, and the OpenSSF provides the necessary multistakeholder governance to sustain this alignment and accelerate convergence toward a common foundation.

Building such a framework depends on two complementary architectural pillars: Core Schema Consensus and Modular Extensions. The global core should harmonize essential SBOM elements, and CRA’s legal structure, into a single, mandatory baseline. Sectoral or regulatory needs (e.g., AI model metadata, critical infrastructure tagging, or crypto implementation details) should be layered through standardized modular extensions to prevent the ecosystem from forking into incompatible variants. To ensure practical interoperability, this architecture must rely on open tooling and universal machine-processable identifiers (such as PURL, CPE, SWID, and SWHID) that guarantee consistent and accurate linkage. Equally crucial are trust and provenance mechanisms: digitally signed SBOMs, verifiable generation context, and linkage with vulnerability data. These collectively transform the SBOM from a passive unused inventory into an actively maintained, actionable cybersecurity tool, enabling automation, real-time risk management, and genuine international trust in the digital supply chain, realizing the OpenSSF vision of “SBOMs everywhere.”

SBOMs have transitioned from a best practice to a requirement in many situations. The foundation established by the U.S. Executive Order 14028 has been legally codified by the EU’s Cyber Resilience Act (CRA), making SBOMs a non-negotiable legal requirement for accessing major markets. This legal framework is now guided by a collective mandate, notably by the Shared Vision issued by CISA, NSA, and 19 international cybersecurity agencies, which provides the critical roadmap for global alignment and action. Complementary work by BSI, ETSI, ISO/IEC, and OpenSSF now ensures these frameworks converge rather than compete.

To fully achieve global cyber resilience, SBOMs must not be merely considered as a compliance artifact to be created and ignored, but instead as an operational tool to support security and augment asset management processes. Organizations must:

  • Integrate and Automate SBOMs: Achieve full lifecycle automation for SBOM creation and continuous updates, making it a seamless part of the DevSecOps pipeline.
  • Maximize SBOM Interoperability: Mandate the adoption of both SPDX and CycloneDX to satisfy divergent global and regulatory requirements and ensure maximum tool compatibility.
  • Operationalize with Open Source Software Leverage OpenSSF tools (Protobom, BomCTL, SBOMit) to rapidly implement and scale technical best practices.
  • Drive Shared Governance for SBOMs: Actively engage in multistakeholder governance initiatives (CEN/CENELEC, ISO/IEC, CISA, ETSI,  OpenSSF) to unify technical standards and policy globally.
  • Enable Decision-Ready Processes that build on SBOMs: Implement advanced SBOM processes that link component data with exploitability and vulnerability context, transforming static reports into actionable security intelligence.

By embracing this shared vision, spanning among many others the CRA, CISA, METI, KISA, NTIA, ETSI, and BSI frameworks, we can definitively move from merely fulfilling compliance obligations to achieving verifiable confidence. This collective commitment to transparency and interoperability is the essential step in building a truly global, actionable, and resilient software ecosystem.

About the Authors

Madalin Neag works as an EU Policy Advisor at OpenSSF focusing on cybersecurity and open source software. He bridges OpenSSF (and its community), other technical communities, and policymakers, helping position OpenSSF as a trusted resource within the global and European policy landscape. His role is supported by a technical background in R&D, innovation, and standardization, with a focus on openness and interoperability.

Kate is VP of Dependable Embedded Systems at the Linux Foundation. She has been active in the SBOM formalization efforts since the NTIA initiative started, and was co-lead of the Formats & Tooling working group there. She was co-lead on the CISA Community Stakeholder working group to update the minimum set of Elements from the original NTIA set, which was published in 2024. She is currently co-lead of the SBOM Everywhere SIG.

Dr. David A. Wheeler is an expert on open source software (OSS) and on developing secure software. He is the Director of Open Source Supply Chain Security at the Linux Foundation and teaches a graduate course in developing secure software at George Mason University (GMU). Dr. Wheeler has a PhD in Information Technology, is a Certified Information Systems Security Professional (CISSP), and a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE). He lives in Northern Virginia.

 

What’s in the SOSS? Podcast #42 – S2E19 New Education Course: Secure AI/ML-Driven Software Development (LFEL1012) with David A. Wheeler

By Podcast

Summary

In this episode of “What’s In The SOSS,” Yesenia interviews David A. Wheeler, the Director of Open Source Supply Chain Security at the Linux Foundation. They discuss the importance of secure software development, particularly in the context of AI and machine learning. David shares insights from his extensive experience in the field, emphasizing the need for both education and tools to ensure security. The conversation also touches on common misconceptions about AI, the relevance of digital badges for developers, and the structure of a new course aimed at teaching secure AI practices. David highlights the evolving nature of software development and the necessity for continuous learning in this rapidly changing landscape.

Conversation Highlights

00:00 Introduction to Open Source and Security
02:31 The Journey to Secure AI and ML Development
08:28 Understanding AI’s Impact on Software Development
12:14 Myths and Misconceptions about AI in Security
18:24 Connecting AI Security to Open Source and Closed Source
20:29 The Importance of Digital Badges for Developers
24:31 Course Structure and Learning Outcomes
28:18 Final Thoughts on AI and Software Security

Transcript

Yesenia (00:01)
Hello and welcome to What’s in the SOSS, OpenSSF podcast where we talk to interesting people throughout the open source ecosystem. They share their journey, expertise and wisdom. So yes, I need said one of your hosts and today we have the extraordinary experience of having David Wheeler on a welcome David. For those that may not know you, can you share a little bit about your row at the Linux Foundation OpenSSF?

David A. Wheeler (00:39)
Sure, my official title is actually probably not very illuminating. It says it’s the direct, I’m the director of open source supply chain security. But what that really means is that my job is to try to help other folks improve the security of open source software at large, all the way from it’s in someone’s head, they’re thinking about how to do it, developing it, putting it in repos, getting it packaged up, getting it distributed, receiving it just all the way through. We want to make sure that people get secure software and the software they actually intended to get.

Yesenia (01:16)
It’s always important, right? You don’t want to open up a Hershey bar that has no peanuts in the peanuts, right? So that was my analogy for the supply chain security in MySpace. Because I’m a little sensitive to peanuts. I was like, you know, you don’t want that.

David A. Wheeler (01:22)
You

David A. Wheeler (01:31)
You don’t want that. And although the food analogy is often pulled up, I think it’s still a good analogy. If you’re allergic to peanuts, you don’t want the peanuts. And unfortunately, it’s not just, hey, whether or not it’s got peanuts or not, but there was a scare involving Tylenol a while back. And to be fair, the manufacturer didn’t do anything wrong, but the bottles were tampered with by a third party.

Yesenia (01:40)
Mm-mm.

David A. Wheeler (01:57)
And so we don’t want tampered products. We want to make sure that when you request an open source program, it’s actually the one that was intended and not something else.

Yesenia (02:07)
So you have a very important job. Don’t play yourself there. We want to make sure the product you get is the one you get, right? So if you don’t know David, go ahead and message him on Slack, connect with him. Great gentleman in the open source space. And you’ve had a long time advocating for secure software development in the open source space. How did your journey lead to creating a course specifically on secure AI and ML driven development?

David A. Wheeler (02:36)
As with many journeys, it’s a complicated journey with lots of whens and ways. As you know, I’ve been interested in how do you develop secure software for a long time, decades now, frankly. And I have been collecting up over the years what are the common kinds of mistakes and more importantly, what are the systemic simple solutions you can make that would prevent that problem and eliminating it entirely ideally.

Um, and over the years it’s turned out that in fact, for a vast number, for the vast majority of problems that people have, there are well-known solutions, but they’re not well known by the developers. So a lot of this is really an education story of trying to make it so that software developers know how to do things. Now it’s a fair, you know, some would say, some would say, well, what about tools? Tools are valuable. Absolutely.

If to the extent that we can, we want to make it so that tools automatically do the secure thing. And that’s the right thing to do, but that’ll never be perfect. And people can always override tools. And so it’s not a matter of education or tools. I think that’s a false dichotomy. It’s you need tools and you need education. You need education or you can’t use the tools well as much as we can. We want to automate things so that they will handle things automatically, but you need both. You need both.

Now, to answer your specific question, I’ve actually been involved in and out with AI to some extent for literally decades as well. People have been interested in AI for years, me too. I did a lot more with symbolic based AI back in the day, wrote a lot of Lisp code. But since that time, really machine learning, although it’s not new, has really come into its own.

And all of a sudden it became quite apparent to me, and it’s not just me, to many people that software development is changing. And this is not a matter of what will happen someday in the future. This is the current reality for software development. And I’m going to give a quick shout out to some researchers in Stanford. I’ll have to go find the link. So who basically did some, I think some important studies related to this.

David A. Wheeler (04:59)
When you’re developing software from scratch and trying to create a tiny program, the AI tools are hard to beat because basically they’re just creating, know, they’re just reusing a template, but that’s a misleading measure, okay? That doesn’t really tell you what normal software development is like. However, let’s assume that you’re taking existing programming and improving it, and you’re using a language for which there’s a lot of examples for training. Okay, we’re talking Python and Java and, you know, various widely used languages, okay?

David A. Wheeler (05:28)
If you do those, turns out the AI tools can generate a lot of code. Some of it’s right. So that means you have to do more rework, frankly, when you use these tools. Once you take the rework into account, they’re coming up with a 20 % improvement in productivity. That is astounding. And I will argue that is the end of the argument. Seriously, there are definitely, there are companies where they have a single customer and the customer pays them to write some software. If the customer says never use AI, fine, the customer’s willing to pay 20 % improvement, I will charge that extra to them. But out in most commercial open source settings, you can’t throw, you can’t ignore a 20 % improvement. And that’s current tech, that’s not future tech. mean, the reality is that we haven’t seen improvements like this since the switch from hut, from assembly to high level languages, the use of, you know, the use of structure programming, I think was another case we got that kind. And you can make a very good case that open source software was that was a third case where you got that digital, productivity. Now you could also argue that’s a little unfair because open source didn’t improve your ability to write software. makes you didn’t have to write the software.

David A. Wheeler (06:53)
But that’s okay. That’s still an improvement, right? So I think that counts. But for the most part, we’ve had a lot of technologies that claim to improve productivity. I’ve worked many over the years. I’ve been very interested in how do you improve productivity? Most of them turned out not to be true. I don’t think that’s true for AI. It’s quite clear for multiple studies. mean, not all studies agree with this, by the way, but I think there’s enough studies that there’s a productivity improvement.

David A. Wheeler (07:21)
It does depend on how you employ these, that, but you know, and they’ll get better. But the big problem now is everyone is on the list. This is a case where everyone, even if you’re a professional and you’ve been doing software development for years, everybody’s new at this part of the game. These tools are new. And the problem here is that the good news is that they can help you. The bad news is they can harm you. They can do.

David A. Wheeler (07:50)
They can produce terribly insecure software. They can also end up being the security vulnerability themselves. And so we’re trying to get ahead of the game and looking around what’s the latest information, what can we learn? And it turns out there’s a lot that we can learn that we actually think is gonna stay on the test of time. And so that’s what this course is, those basics they’re gonna apply no matter what tool you use.

David A. Wheeler (08:17)
How do you make it say you’re using these tools, but you’re not immediately becoming a security vulnerability? How is it so that you’re less likely to produce vulnerable code? And that turns out to be harder. We can talk about why that is, but that’s what this course is in a nutshell.

Yesenia (08:33)
Yeah, I know I had a sneak preview at the slide deck and I was just like, this is fantastic. Definitely needed it. And I wanted to take a moment and give a kudos to the researchers because the engine, the industry wouldn’t be what it is today without the researchers. Like they’re the ones that are firsthand, like try and failing and then somebody picks it up and builds it and it is open source or industry. then boom, it becomes like this whole new field. So I know AI has been around for a minute.

David A. Wheeler (09:01)
Yeah, let me add that. I agree with you. Let me actually separate different researchers because we’re building on the first of course, the researchers who created these original AI and ML systems more generally, obviously a lot of the LLM based research. You’ve got the research specifically in developing tools for developing, for improving software development. And then you’ve got the researchers who are trying to figure out the security impacts of this. And those folks,

Yesenia (09:30)
Those are my favorite. Those are my favorite.

David A. Wheeler (09:31)
I’m Well, we need all of these folks. But the thing is, what concerns me is that remarkably, even though we’ve got this is really new tech, we do have some really interesting research results and publications about their security impacts. The problem is, most of these researchers are really good about, you know, doing the analysis, creating controls, doing the research, publishing a paper, but for most people, publishing a paper has no impact. People are not going to go out and read every paper on a topic. That’s, know, they have work to do basically. So if you’re a researcher makes these are very valuable, but what we’ve tried to do is take the research and boil it down to as a practitioner, what do you need to know? And we do cite the research because

David A. Wheeler (10:29)
You know, if you’re, if you’re interested or you say, Hey, I’m not sure I believe that. Well, great. Curiosity is fantastic. Go read the studies. there’s always limitations on studies. We don’t have infinite time and infinite money. but I think the research is actually pretty consistent. at least with Ted Hayes technology, we, can’t guess what the great grand future holds.

David A. Wheeler (10:55)
But I’m going to guess that at least for the next couple of years, we’re going to see a lot of LLMs, LLM use, they’re going to build on other tools. And so there’s things that we know just based on that, that we can say, well, given the direction of current technology, what’s okay, what’s to be concerned about? And most importantly, what can we do practically to make this work in our favor? So we get that 20 % because we’re going to want it.

Yesenia (11:24)
Yeah, at this point in time, we’re seedling within the AI ML piece. What you said is really, really important. It’s just like, so much more to this. There’s so much more that’s growing. And I want to take it back to something you had mentioned. You’re talking about the good that is coming from the AI ML. And there is the bad, of course. And for the course that you’re coming out, what is one misconception about AI in the software development security that you hope that this course will shatter? What myth are you busting?

David A. Wheeler (11:53)
What myth am I busting? I guess I’m going to cheat because I’m going to respond with two. It’s by the fact that I actually can count. guess, okay, I’m going to turn it into one, which is, guess, basically either over or under indexing on the value of A. Basically expecting too much or expecting too little. Okay, basically trying to figure out what the real expectations you should have are and not go outside that. So there’s my one. So let me talk about over and under. We’ve got people.

Yesenia (12:30)
Well, I’m going to give you another one because in software everything starts with zero. So I’ll give you another one.

David A. Wheeler (13:47)
Okay, all right, so let me talk about the under. There are some who have excessive expectations. We’ve got the, you know, I think vibe coding in particular is a symptom of this, okay? Now, there are some people who use the word vibe coding as another word for using AI. I think that’s not what the original creator of the term meant. And I actually think that’s not helpful because it’s a whole lot like talking about automated carriages.

Um, very soon we’re only going to be talking about carriages. Okay. Everybody’s going to be using automation AI except the very few who don’t. Okay. So, so there’s no point in having a special term for the normal case. Um, so what, what I mean by vibe coding is what the original creator of the term meant, which is, Hey, AI system creates some code. I’m never going to review it. I’m never going to look at it. I’m not going to do anything. I will just blindly accept it. This is a terrible idea. If it matters what the quality of the code is. Now there are cases where frankly the quality of the code is irrelevant. I’ve seen some awesome examples where you’ve got little kids, know, eight, nine year olds running around and telling a computer what to do and they get a program that seems to kind of do that. And that is great. I mean, if you want to do vibe coding with that, that’s fantastic. But if the code actually does something that matters, with current tech, this is a terrible idea. They’re not.

They can sometimes get it correct, but even humans struggle making perfect code every time and they’re not that good. The other case though is, man, we can’t ever use this stuff. I mean, again, if you’ve got a customer who’s paying you extra to never do it, that’s wonderful. Do what the customer asks and is willing to pay for it. For most of us, that’s not a reasonable industry position. What we’re going to need to do instead is learn together as an industry how to use this well. The good news is that although we will all be learning together, there’s some things already known now. So let’s run to the researchers, find out what they’ve learned, go to the practitioners, basically find what has been learned so far, start with that. And then we can build on and improve and go and other things. You don’t expect too much, don’t expect too little.

Yesenia (15:28)
Yeah, the five coding is an interesting one, because sometimes it spews out like correct code. But as somebody who’s written code and reviewed code and like done all this with the supply chain, I’m like. It’s like that extra work you gotta kind of add to it to make sure that you’re validating your testing it and it hasn’t just accidentally thrown in some security vulnerability in in that work. And I think that was. Go ahead.

David A. Wheeler (15:51)
What I can interrupt you, one of the studies that we cited, they basically created a whole bunch of functions that could be written either insecurely or securely as a test. Did this whole bunch of times. And they found that 45 % of the time using today’s current tooling, they chose the insecure approach. And there’s reason for this. ML systems are finally based on their training sets.

They’ve been trained on lots of insecure programs. What did you expect to get? You know, so this is actually going to be a challenge because when you’re trying to fight what the ML systems are training on, that is harder than going with the flow. That doesn’t mean it can’t be done, but it does require extra effort.

Yesenia (16:41)
We’re going extra left at that point. All right, so you your one and I gave you, know, one more because we started at zero. Any other misconception that is being bumped at the course.

David A. Wheeler (16:57)
Um, I guess the, uh, I guess the misconception sort is nothing can be done. And, uh, of course the whole course is a, uh, a, a, a stated disagreement with examples, uh, because in fact, there are things we can do right now. Now I would actually concede if somebody said, Hey, we don’t know everything. Well, sure. Uh, you know, I think all of us are in a life journey and we’re all learning things as we go. Uh, but that doesn’t mean that we have to, um, you know, just accept that nothing can be done. That’s a fatalistic approach that I think serves no one. There are things we can do. There are things that are known, though maybe not by you, but that’s okay. That’s what a course is for. We’ve worked to boil down, try to identify what is known, and with relatively little time, you’ll be far more prepared than you would be otherwise.

Yesenia (17:49)
It is a good course and I the course is aimed for developers, software engineers, open source contributors. So how does it connect to real world open source work like those that are working on closed source versus that open source software?

David A. Wheeler (18:04)
Well, okay, I should first quickly note that I work for the Open Source Security Foundation, Open Source is in the name, so we’re very interested in improving the security of open source software. That is our fundamental focus. That said, sometimes the materials that we create are not really unique to open source software. Where it can be applied by closed source software, we try to make that clear. Sometimes we don’t make it clear as we should, but we’re working on that.

Um, and frankly, in many cases, I think there’s also worth noting that, um, if you’re developing closed source software, the vast majority of the components you’re using are open source software. I mean, the average is 70 to 90 % of the software in a closed source system software system is actually open source software components. Uh, simple because it doesn’t make sense to rebuild everything from scratch today. That’s not a, an M E economically viable option for most folks. So.

in this particular case for the AI, it is applicable equally to open source and closed source. It applies to everybody. and this is actually true also for our LFD 121 course on how to develop secure software. And when you think about it, it makes sense. The attackers don’t care what your license is. just, you know, they just don’t care. They’re going to try to do bad things to you regardless of, of the licensing.

And so while certainly a number of things that we develop like, you know, the best practices badge are very focused on open source software, you know, other things like baseline, other things like, for example, this course on LFT 121, the general how to develop secure software course, they’re absolutely for open source and closed source. Because again, the attackers don’t care.

Yesenia (19:53)
Yeah, they just they just don’t they’re they’re actually just trying to go around all this like they’re trying to make sure they learn it so that they know what to do. Unfortunately, that’s the case. And this course you said it offers a digital badge. Why is this important for developers and employers?

David A. Wheeler (20:11)
Well, I think the short answer is that anybody can say, yeah, I learned something. But it’s, think, for, I guess I should start with the employers because that’s the easier one to answer. Employers like to see that people know things and having a digital badge is a super easy way for an employer to, to make sure that, yeah, they actually learn at least the basics of, you know, that topic. you know, certainly it’s the same for you know, university degrees and other things. You when you’re an employer, you want the, it’s very, important that people who are working for you actually know something that’s critically important. And while a degree or digital badge doesn’t guarantee it, it at least gives that additional evidence. For people, mean, obviously if you want, are trying to get employed by someone, it’s always nice to be able to prove that. But I think it’s also a way to both show you know something to others and frankly encourage others to learn this stuff. We have a situation now where way too many people don’t know how to do some what to me are pretty basic stuff. You know I’ll point back to the LFD 121 course which is how to develop secure software. Most colleges, most universities that’s not a required part of the degree. I think it should be.

David A. Wheeler (21:35)
But since it isn’t, it’s really, really helpful for everybody to know, wait, this person coming in, do they not, they’ve got this digital badge. That gives me much higher confidence going in as somebody I’m working with and that sort of thing, as well as just encouraging others to say, hey, look, I carried enough to take the time to learn this, you can too. And both LFD 121 and this new AI course are free, so that in there online so you can take it at your pace. Those roadblocks do not exist. We’re trying to get the word out because this is important.

Yesenia (22:16)
Yeah, I love that these courses are more accessible and how you touched on the students, like students applying for universities that might be more highly competitive. They’re like, hey, look, I’m taking this extra path to learn and take these courses. Here’s kind of like the proof. And it’s like Pokemon. It’s good to collect them all, know, between the badges and the certifications and the degrees.

I definitely that’s the security professional’s journeys. Collect them all at this point with the credibility and benefits.

David A. Wheeler (22:46)
Well, indeed, course, the real goal, of course, is to learn, not the badges. But I think that badges are frankly, you know, you collecting the gold star, there is nothing more human and nothing more okay than saying, Hey, I, you know, I got to I got a gold star for if you’re doing something that’s good. Yes, revel in that. Enjoy it. It’s fun.

David A. Wheeler (23:11)
And I don’t think that these are impossible courses by any means. And unlike some other things which are, know, I’m not against playing games, playing games is fun, but this is a little thing that’s both can be interesting and is going to be important long-term to not only yourself, but every who uses the code you make. Cause that’s gonna be all of us are the users of the software that all developers as a community make.

Yesenia (23:42)
Yeah, there’s a wide range impact from this, not just like even if you don’t create software, just understanding and learning about this, you’re a user to understanding that basic understanding of it. So I want to transition a little bit to the course because I know we’re spending the whole time about it. Let’s say I’m a friendly person. I signed up for this free LFEL 1012. Can you walk me through the course structure? Like what am I expected to take away from the course in that time period?

David A. Wheeler (24:09)
Okay, yeah, so let’s see here. So I think what I should do is kind of first walk through the outline. Basically, I mean, the first two parts unsurprisingly are introduction and some context. And then we jump immediately into key AI concepts for secure development. We do not assume that someone taking this course is already an expert in AI. I mean, if you are, that’s great. It doesn’t take, we’re not spending a lot of time on it, but we wanna make sure that you understand the basics, the key terms that matter for software development. And then we drill right into the security risks of using AI assistance. I want to make it clear, we’re not saying you can’t use just because something has a risk, everything has risks, okay? But understanding what the potential issues are is important because then you can start addressing them. And then we go through what I would call kind of the meat of the course, best practices for secure assistant use.

You know, how do you reduce the risk that the assistant itself doesn’t become subverted, it starts working against you, things like that. Writing more secure code with AI, if you just say, write some code, a lot of it’s gonna be insecure. There are ways to deal with that, but it’s not so simple or straightforward. For example, it’s pretty common to tell AI systems that, hey, I’m an expert in this topic and suddenly it gets better. That trick doesn’t work.

No, you may laugh, but honestly, that trick works in a lot of situations, but it doesn’t work here. And we’ve actually researched showing it doesn’t work. So there are things that work, but it’s more it’s, it’s more than that. And finally, reviewing code changes in a world with AI. Now, of course, involves reviewing proposed changes from others. And in some cases, trying to deal with the potential DDoS attacks as people start creating far more code than anybody can reasonably review. Okay. We’re have to deal with this. and frankly, biggest problem, frankly, the folks who are vibe coding, you know, they, they run some program. It tells them 10 things. I’ll just dump all 10 things at them. And no, that’s a terrible idea. you know, and the, the curl folks, for example, have had an interesting point where.

They complained bitterly about some inputs from AI systems, which were absolute garbage, wasted their time. And they’ve praised other AI submissions because somebody took the time to make sure that they were actually helpful and correct and so on. And then that’s fantastic. you know, basically you need to push back on the junk and then find and then welcome the good stuff. And then, of course, a little conclusion wrap up kind of thing.

Yesenia (27:01)
I love it. it was a good outline. was not seeing it. Is this like videos accomplished with it or is this just like a module click through?

David A. Wheeler (27:10)
Well, basically we, we group them into little chapters. forgot what their official term is. It’s chapters, section modules. I don’t remember what the right term is. I guess I should, but basically after you go that, then there’s a couple of quiz questions and then a little videos. Basically the idea is that we want people to get it quickly, but you know, if it’s just watch a video for an hour, people fall asleep, don’t remember anything. That’s the goal is to learn, not just, you know, sleep through a video.

David A. Wheeler (27:39)
So little snippets, little quiz questions, and at the end there’s a little final exam. And if you get your answers right, you get your badge. So it’s not that terribly hard. We estimate, it varies, but we estimate about an hour for people. So it’s not a massive time commitment. Do it on lunch break or something. I think this is going to be, as I said, I think this is going to be time well spent.

David A. Wheeler (28:07)
This is the world that we are all moving to, or frankly, have already arrived at.

Yesenia (28:12)
Yeah, I’m already here. think I said it’s a seedling. It’s about to grow into that big tree. Any last minute thoughts, takeaways that you want to share about the course, your experience, open source, supply chain security, all of the above.

David A. Wheeler (28:27)
My goodness. I’ll think of 20 things after we’re done with this, of course. well, no, problem is I’ll think about them later in French. believe it’s called the wisdom of the stairs. It’s as you leave the party, you come up with the point you should have made. so I guess I’ll just say that, you know, if you develop software, whether you’re not, whether you realize or not, it’s highly likely that the work that you do will influence many, many.

Yesenia (28:31)
You only get zero and one.

David A. Wheeler (28:54)
About many, many people, many more than you probably realize. So I think it’s important for all software developers to learn how to develop software, secure software in general, because whether or not you know how to do it, the attackers know how to attack it and they will attack it. So it’s important to know that in general, since we are moving and essentially have already arrived at the world of AI and software development, it’s important to learn the basics and yes.

Do keep learning. Well, all of us are going to keep learning throughout our lives. As long as we’re in this field, that’s not a bad thing. I think it’s an awesome thing. I wake up happy that I get to learn new stuff. But that means you actually have to go and learn the new stuff. And the underlying technology remark is, it’s actually remarkably stable in many things. This is a case though, where, yes, a lot of things change in the detail, but the fundamentals don’t. But this is something where, yeah, actually there is something fundamental that is changing. One time we didn’t use AI often to help us develop software. Now we do. So how do we do that wisely? And there’s a long list of specifics. The course goes through it. I’ll give a specific example so it’s not just this highfalutin, high level stuff. So for example,

Pretty much all these systems are based very much on LLMs, which is great. LLMs have some amazing stuff, but they also have some weaknesses. One is in particular, they are incredibly gullible. If they are told something, they will believe it. And if you tell them to read a document that gives them instructions on some tech, and the document includes malicious instructions, that’s what it’s going to do because it heard the malicious instructions.

David A. Wheeler (30:48)
Now that doesn’t mean you can’t use these technologies. I think that’s a road too far for most folks. But it does mean that there’s new risks that we never had to deal with before. And so there’s new techniques that we’re going to need to apply to do it well. And I don’t think they’re unreasonable. They’re just, you know, we now have a new situation and we’re have to make some changes because of new situation.

Yesenia (31:11)
Yeah, it’s like you mentioned earlier, like you can ask it to be an expert in something and then it’s like, oh, I’m an expert. That’s what I laughing. I was like, yeah, I use that a lot. I’m like, the prompt is you’re an expert in sales. You’re an expert in brand. You’re an expert in this. And it’s like, OK, once it gets in.

David A. Wheeler (31:25)
But the thing is, that really does work in some fields, remarkably. And of course, we can only speculate sometimes why LLMs do better in some areas than others. But I think in some areas, it’s quite easy to distinguish the work of experts from non-experts. And the work of experts is manifestly and obviously different. And at least so far, LLMs struggle to do that.

David A. Wheeler (31:54)
This differentiation in this particular domain. And we can speculate why but basically, the research says that doesn’t work. So don’t do that. Do there are other techniques that have far more success, do those instead. And I would say, hey, I’m sure we’ll learn more things, there’ll be more research, use those as we learn them. But that doesn’t mean that we get to excuse ourselves from ignoring the research we have now, even though we don’t know it.

David A. Wheeler (32:23)
We don’t know everything. We won’t know everything next year either. Find out what you need to know now and be prepared to learn more Seagull.

Yesenia (32:32)
It’s a journey. Always learning every year, every month, every day. It’s great. We’re going to transition into our rapid fire. All right, so I’m going to ask the question. You got to answer quiz and there’s no edit on this point. All right, favorite programming language to teach security with.

David A. Wheeler (32:56)
I don’t have a favorite language. It’s like asking what my children, know, which of my children are my favorite. I like lots of programming languages. That said, I often use Java, Python, C to teach different things more because they’re decent exemplars of those kinds of languages. But so there’s your answer.

Yesenia (33:19)
Those are good range because you have your memory based one, which is the see your Python, which more scripts in the Java, which is more object oriented. So you got a good diverse group.

David A. Wheeler (33:28)
Static type, right, you’ve got your static typing, you’ve got your scripting, you’ve got your lower level. But indeed, I love lots of different programming languages. I know over 100, I’m not exaggerating, I counted, there’s a list on my website. But that’s less impressive than you might think because after you’ve learned a couple, the others tend to not, they often are similar too. Yes, Haskell and Lisp are really different.

David A. Wheeler (33:55)
But most Burmese languages are not as different as you might think, especially after you’ve learned a few. So I hope can help.

Yesenia (34:01)
Yeah, the newer ones too are very similar in nature. Next question, Dungeon and Dragon or Lord of the Rings?

David A. Wheeler (34:11)
I love both. What are we doing? What are you doing to me? Yeah, so I play Dungeons and Dragons. I have watched, I’ve read the books and watched the movies many times. So yes.

Yesenia (34:24)
Yes, yes, yes. First open source project you ever contributed to.

David A. Wheeler (34:30)
Wow, that is too long ago. I don’t remember. Seriously, but it was before the term open source software was created because that was created much later. So it was called free software then. So I honestly don’t remember. I sure it was some small contribution to something somewhere like many folks do, but I’m sorry. It’s lost in the midst of times back in the eighties. Maybe. Yeah. The eighties somewhere, probably mid eighties.

Yesenia (35:00)
You’re going to go to sleep now. Like,

David A. Wheeler (35:01)
So, yeah, yeah, I’m sure somebody will do the research and tell me. thank you.

Yesenia (35:09)
There wasn’t GitHub, so you can’t go back to commits.

David A. Wheeler (35:11)
That’s right. That’s right. No, was long before get long before GitHub and so on. Yep. Carry on.

Yesenia (35:18)
When you’re writing code, coffee or tea?

David A. Wheeler (35:22)
Neither! Coke Zero is my preferred caffeine of choice.

Yesenia (35:26.769)
And this is not sponsored.

David A. Wheeler (35:28.984)
It is not sponsored. However, I have a whole lot of empty cans next to me.

Yesenia (35:35)
AI tool you find the most useful right now.

David A. Wheeler (35:39)
Ooh, that one’s hard. I actually use about seven or eight depending on what they’re good for. For actual code right now, I’m tending to use Claude Code. Claude Code’s really one of the best ones out there for code. And of course, five minutes later, it may change. GitHub’s not bad either. There’s some challenges I’ve had with them. They had some bugs earlier, which I suspect they fixed by now.

But in fact, I think this is an interesting thing. We’ve got a race going on between different big competitors, and this is in many ways good for all of us. The way you get good at anything is by competing with others. So I think that we’re seeing a lot of improvements because you’ve got competing. And it’s okay if the answer changes over time. That’s an awesome thing.

Yesenia (36:36)
That is awesome. That’s technology. And the last one, this is for chaos. GIF or GIF?

David A. Wheeler (36:42)
It’s GIF. Graphics. Graphics has a guck in it. And yes, I’m aware that the original perpetrator doesn’t pronounce it that way, but it’s still GIF. I did see a cartoon caption which said GIF or GIF. And of course I can hear it just reading it.

Yesenia (36:53)
There you have it.

Yesenia (37:05)
My notes is literally spelled the same.

David A. Wheeler (37:08)
Hahaha!

Yesenia (37:11)
All right, well there you have it folks, another rapid fire. David, thank you so much for your time today, for your impact contribution to open source in the past couple decades. Really appreciate your time and all the contributors that were part of this course. Check it out on the Linux Foundation website. then David, do you want to close it out with anything on how they can access the course?

David A. Wheeler (37:38)
Yeah, so basically the course is ecure AI/ML-Driven Software Development its numbers LFEL 1012 And I’m sure we’ll put a link in the show. No, I’m not gonna try to read out the URL But we’ll put a link in there to to get to it But please please take that course. We’ve got some other courses

Software development, you’re always learning and this is an easy way to get the information you most need.

Yesenia (38:14)
Thank you so much for your time today and those listening. We’ll catch you on the next episode.

David A. Wheeler (38:19)
Thank you.

OpenSSF Scorecard Audit is Complete!

By Blog, Guest Blog

This blog was originally published on the OSTIF website on October 9, 2025 by Helen Wooste

The Open Source Technology Improvement Fund is proud to share the results of our security audit of OpenSSF Scorecard. OpenSSF Scorecard is an open source automated testing resource to help projects continually assess security risks. With the help of ADA Logics and the OpenSSF, this project can continue to provide the open source community with a free and proactive security best practice that is easy for maintainers to use.

Audit Process:

This engagement was undertaken by the ADA Logics team during early summer of 2025. Within scope of review was five repositories: scorecard-webapp, scorecard-action, scorecard-monitor, scorecard, and allstar. These five projects underwent formal threat modeling, which then guided the manual code review that followed. Each repository interacts with different interfaces, handles different (potentially sensitive) data, and therefore has differing attack impacts that affect its security needs. Fuzzing work was also performed during this audit, and resulted in the uncovering of some of the reported findings.

Audit Results:

  • 9 Issues with security impact
    • 2 Medium
    • 5 Low
    • 2 Informational
  • Formal threat models of following repositories
    • scorecard-webapp
    • scorecard-action
    • scorecard-monitor
    • scorecard
    • allstar
  • Continuous fuzzing of the majority of Scorecard probes via integration onto OSS-Fuzz

The OpenSSF Scorecard maintainers were active and helpful participants in the duration of the audit. Many issues reported by this audit have been addressed, so if you are a user of OpenSSF Scorecard please update to the most recent version in order to take advantage of the work by both ADA Logics and the maintainers. Additionally if you would like to contribute to Scorecard, click this link to see the available meetings for further discussion.

Thank you to the individuals and groups that made this engagement possible:

  • OpenSSF Scorecard maintainers and community, especially: Spencer Schrock, Raghav Kaul, Stephen Augustus, and Jeff Mendoza
  • ADA Logics: David Korczynski and Adam Korczynski
  • The OpenSSF

You can read the Audit Report HERE

Everyone around the world depends on open source software. If you’re interested in financially supporting this critical work, reach out to contactus@ostif.org.

OSTIF is celebrating our 10 year anniversary! Join us for a meetup about our work, lessons learned, and where we see the future of open source security going by following our meetup calendar https://lu.ma/ostif-meetups

To get involved, visit openssf.org and join the conversation on SlackGitHub, and upcoming community events.

Helen is a Hoosier who spent her youth in West Lafayette and then Bloomington, Indiana, spending time in the latter earning her undergraduate degree in History from Indiana University. A month after graduation she moved to Chicago where she worked in hospitality and food service management, running a variety of enterprises from bakeries to high-end restaurants to a pasta food truck. In 2023, she transitioned into open source by accepting a position with OSTIF. She is grateful for the opportunity to work with a global community that prioritizes sharing free knowledge for the greater good.

OpenSSF Newsletter – September 2025

By Newsletter

Welcome to the September 2025 edition of the OpenSSF Newsletter! Here’s a roundup of the latest developments, key events, and upcoming opportunities in the Open Source Security community.

TL;DR:

🎉 Big week in Amsterdam: Recap of OpenSSF at OSSummit + OpenSSF Community Day Europe.

🥚 Golden Egg Awards shine on five amazing community leaders.

✨ Fresh resources: AI Code Assistant tips and SBOM whitepaper.

🤝 Trustify + GUAC = stronger supply chain security.

🌍 OpenSSF Community Day India: 230+ open source enthusiasts packed the room.

🎙 New podcasts: AI/ML security + post-quantum race.

🎓 Free courses to level up your security skills.

📅 Mark your calendar and join us for Community Events.

Celebrating the Community: OpenSSF at Open Source Summit and OpenSSF Community Day Europe Recap

From August 25–28, 2025, the Linux Foundation hosted Open Source Summit Europe and OpenSSF Community Day Europe in Amsterdam, bringing together developers, maintainers, researchers, and policymakers to strengthen software supply chain security and align on global regulations like the EU Cyber Resilience Act (CRA). The week included strong engagement at the OpenSSF booth and sessions on compliance, transparency, proactive security, SBOM accuracy, and CRA readiness. 

OpenSSF Community Day Europe celebrated milestones in AI security, public sector engagement, and the launch of Model Signing v1.0, while also honoring five community leaders with the Golden Egg Awards. Attendees explored topics ranging from GUAC+Trustify integration and post-quantum readiness to securing GitHub Actions, with an interactive Tabletop Exercise simulating a real-world incident response. 

These gatherings highlighted the community’s progress and ongoing commitment to strengthening open source security. Read more.

OpenSSF Celebrates Global Momentum, AI/ML Security Initiatives and Golden Egg Award Winners at Community Day Europe

At OpenSSF Community Day Europe, the Open Source Security Foundation honored this year’s Golden Egg Award recipients. Congratulations to Ben Cotton (Kusari), Kairo de Araujo (Eclipse Foundation), Katherine Druckman (Independent), Eddie Knight (Sonatype), and Georg Kunz (Ericsson) for their inspiring contributions.

With exceptional community engagement across continents and strategic efforts to secure the AI/ML pipeline, OpenSSF continues to build trust in open source at every level.

Read the full press release to explore the achievements, inspiring voices, and what’s next for global open source security.

Blogs: What’s New in the OpenSSF Community?

Here you will find a snapshot of what’s new on the OpenSSF blog. For more stories, ideas, and updates, visit the blog section on our website.

Open Source Friday with OpenSSF – Global Cyber Policy Working Group

On August 15, 2025, GitHub’s Open Source Friday series spotlighted the OpenSSF Global Cyber Policy Working Group (WG) and the OSPS Baseline in a live session hosted by Kevin Crosby, GitHub. The panel featured OpenSSF’s Madalin Neag (EU Policy Advisor), Christopher Robinson (CRob) (Chief Security Architect) and David A. Wheeler (Director of Open Source Supply Chain Security) who discussed how the Working Group helps developers, maintainers, and policymakers navigate global cybersecurity regulations like the EU Cyber Resilience Act (CRA). 

The conversation highlighted why the WG was created, how global policies affect open source, and the resources available to the community, including free training courses, the CRA Brief Guide, and the Security Baseline Framework. Panelists emphasized challenges such as awareness gaps, fragmented policies, and closed standards, while underscoring opportunities for collaboration, education, and open tooling. 

As the CRA shapes global standards, the Working Group continues to track regulations, engage policymakers, and provide practical support to ensure the open source community is prepared for evolving cybersecurity requirements. Learn more and watch the recording.

Improving Risk Management Decisions with SBOM Data

SBOMs are becoming part of everyday software practice, but many teams still ask the same question: how do we turn SBOM data into decisions we can trust? 

Our new whitepaper, “Improving Risk Management Decisions with SBOM Data,” answers that by tying SBOM information to concrete risk-management outcomes across engineering, security, legal, and operations. It shows how to align SBOM work with real business motivations like resiliency, release confidence, and compliance. It also describes what “decision-ready” SBOMs look like, and how to judge data quality. To learn more, download the Whitepaper.

Trustify joins GUAC

GUAC and Trustify are combining under the GUAC umbrella to tackle the challenges of consuming, processing, and utilizing supply chain security metadata at scale. With Red Hat’s contribution of Trustify, the unified community will serve as the central hub within OpenSSF for building and using supply chain knowledge graphs, defining standards, developing shared infrastructure, and fostering collaboration. Read more.

Recap: OpenSSF Community Day India 2025

On August 4, 2025, OpenSSF hosted its second Community Day India in Hyderabad, co-located with KubeCon India. With 232 registrants and standing-room-only attendance, the event brought together open source enthusiasts, security experts, engineers, and students for a full day of learning, collaboration, and networking.

The event featured opening remarks from Ram Iyengar (OpenSSF Community Engagement Lead, India), followed by technical talks on container runtimes, AI-driven coding risks, post-quantum cryptography, supply chain security, SBOM compliance, and kernel-level enforcement. Sessions also highlighted tools for policy automation, malicious package detection, and vulnerability triage, as well as emerging approaches like chaos engineering and UEFI secure boot.

The event highlighted India’s growing role in global open source development and the importance of engaging local communities to address global security challenges. Read more.

New OpenSSF Guidance on AI Code Assistant Instructions

In our recent blog, Avishay Balter, Principal SWE Lead at Microsoft and David A. Wheeler, Director, Open Source Supply Chain Security at OpenSSF introduce the OpenSSF “Security-Focused Guide for AI Code Assistant Instructions.” AI code assistants can speed development but also generate insecure or incorrect results if prompts are poorly written. The guide, created by the OpenSSF Best Practices and AI/ML Working Groups with contributors from Microsoft, Google, and Red Hat, shows how clear and security-focused instructions improve outcomes. It stands as a practical resource for developers today, while OpenSSF also develops a broader course (LFEL1012) on using AI code assistants securely. 

This effort marks a step toward ensuring AI helps improve security instead of undermining it. Read more.

Open Infrastructure Is Not Free: A Joint Statement on Sustainable Stewardship

Public package registries and other shared services power modern software at global scale, but most costs are carried by a few stewards while commercial-scale users often contribute little. Our new open letter calls for practical models that align usage with responsibility — through partnerships, tiered access, and value-add options — so these systems remain strong, secure, and open to all.

Signed by: OpenSSF, Alpha-Omega, Eclipse Foundation (Open VSX), OpenJS Foundation, Packagist (Composer), Python Software Foundation (PyPI), Rust Foundation (crates.io), Sonatype (Maven Central).

Read the open letter.

What’s in the SOSS? An OpenSSF Podcast:

#38 – S2E15 Securing AI: A Conversation with Sarah Evans on OpenSSF’s AI/ML Initiatives

In this episode of What’s in the SOSS, Sarah Evans, Distinguished Engineer at Dell Technologies, discusses extending secure software practices to AI. She highlights the AI Model Signing project, the MLSecOps whitepaper with Ericsson, and efforts to identify new personas in AI/ML operations. Tune in to hear how OpenSSF is shaping the future of AI security.

#39 – S2E16 Racing Against Quantum: The Urgent Migration to Post-Quantum Cryptography with KeyFactor’s Crypto Experts

In this episode of What’s in the SOSS, host Yesenia talks with David Hook and Tomas Gustavsson from Keyfactor about the race to post-quantum cryptography. They explain quantum-safe algorithms, the importance of crypto agility, and why sectors like finance and supply chains are leading the way. Tune in to learn the real costs of migration and why organizations must start preparing now before it’s too late.

Education:

The Open Source Security Foundation (OpenSSF), together with Linux Foundation Education, provides a selection of free e-learning courses to help the open source community build stronger software security expertise. Learners can earn digital badges by completing offerings such as:

These are just a few of the many courses available for developers, managers, and decision-makers aiming to integrate security throughout the software development lifecycle.

News from OpenSSF Community Meetings and Projects:

In the News:

Meet OpenSSF at These Upcoming Events!

Join us at OpenSSF Community Day in South Korea!

OpenSSF Community Days bring together security and open source experts to drive innovation in software security.

Connect with the OpenSSF Community at these key events:

Ways to Participate:

There are a number of ways for individuals and organizations to participate in OpenSSF. Learn more here.

You’re invited to…

See You Next Month! 

We want to get you the information you most want to see in your inbox. Missed our previous newsletters? Read here!

Have ideas or suggestions for next month’s newsletter about the OpenSSF? Let us know at marketing@openssf.org, and see you next month! 

Regards,

The OpenSSF Team

What’s in the SOSS? Podcast #40 – S2E17 From Manager to Open Source Security Pioneer: Kate Stewart’s Journey Through SBOM, Safety, and the Zephyr Project

By Podcast

Summary

In this episode of What’s in the SOSS, CRob has an inspiring conversation with Kate Stewart, a Linux Foundation veteran who took an unconventional path into open source as a manager rather than a developer, navigating complex legal challenges to get Motorola’s contributions upstream. Now a decade into her tenure at the Linux Foundation, Kate leads critical initiatives in safety-critical open source software, including the Zephyr RTOS project and ELISA, while being instrumental in the evolution of SPDX and Software Bill of Materials (SBOM). She breaks down the different types of SBOMs, explains how the Zephyr project became a security exemplar with gold-level OpenSSF badging, and shares practical insights on navigating the European Union’s Cyber Resilience Act (CRA). Whether you’re interested in embedded systems, security best practices, or the evolving regulatory landscape for open source, this episode offers valuable perspectives from someone who’s been shaping these conversations for years.

Conversation Highlights

00:00 Intro Music & Promo Clip
00:00 Introduction and Welcome
00:42 Kate’s Current Work at Linux Foundation
02:18 Origin Story: From Motorola Manager to Open Source Advocate
06:38 Building Global Open Source Teams and SPDX Beginnings
09:45 The Variety of Open Source Contributors
10:57 Deep Dive: What is an SBOM and Why It Matters
17:05 The Evolution of SBOM Types and Academic Understanding
19:21 Cyber Resilience Act and Zephyr as a Security Exemplar
26:46 Zephyr’s Security Journey: From Badging to CNA Status
31:05 Rapid Fire Questions
32:19 Advice for Newcomers and Closing Thoughts

Transcript

Intro Music + Promo Clip (00:00)

CRob (00:07.862)
Welcome, welcome, welcome to “What’s in the SOSS?” the OpenSSF’s podcast where we talked to the amazing people that help produce, manage and advocate this amazing movement we call open source software. Today we have a real treat – someone that I’ve been interacting with off and on over the years through many kind of industry ecosystem spanning things. So I would like to have my friend, Kate Stewart, give us a little introduction, talk about yourself.

Kate Stewart (00:42.103)
Rob, glad to be here. Right now, I work at the Linux Foundation and have been focusing on what it’s going to take to actually be able to use open source in safety critical contexts. So obviously, security is going to be a part of that. We need to make sure that there’s no vulnerabilities and things like that. But it does go beyond that. And so that’s been my focus for the last few years. I’ve been working on a couple of open source projects, which is Zephyr and the other of which is Elisa and then helping with a variety of other embedded projects like Yocto and Zan and so forth. Trying to figure out what can we do to make sure that we actually get to the stage where we can do safety analysis at scale with vulnerabilities happening all the time with open source. And it’s a really good challenge. I’ve also been involved in SBOM World a fair amount over the years for fun. I think I was involved with the SBOM World before it was called the SBOM World actually with a project called SPDX, or software package data exchange is what it’s called now, and it’s now moved to being system package data exchange because data and models and all these other lovely concepts are part of the whole transparency that we’re going to need. And the transparency that we’re going to need is what we’re going to need for safety. So these things also come together with that theme in mind.

CRob (02:02.702)
Awesome, that’s an important space that we haven’t had a lot of folks talk about. Safety is just as important, if not more so, when we’re thinking about security and just kind of protecting people. So let’s dive into your background. What’s your open source origin story, Kate? How did you get into this crazy space?

Kate Stewart (02:18.967)
Well, I’m a little bit different than most. Okay. I got into this not as a developer. I got into this as a manager. Okay. So I basically was managing a team of developers doing bring up software back in the Motorola days. I was about 20 plus years ago now. And Apple had just finished pivoting away from the PowerPC. And so we needed to be able to prove that what we were doing in the silicon was actually going to work. And so this is part of embedded story and the enablement story to me. But what I ended up having to do is we went and looked at Linux. We went, OK, yeah, let’s work with Linux. Let’s work with the GCC tool chains, and let’s use that as our enablement platform. And then I had to, you know, we had our customers saying, OK, that’s fine, but until it’s upstream, we don’t believe it. So it was one of those sorts of ones, right?

It doesn’t, your port doesn’t exist unless it’s upstream as far as they were concerned. So we could get it running in our labs, but it’s all of a sudden I had to go and deal with the lawyers and figure out how can we get it so that we can actually contribute upstream. And I pretty much was doing, basically I went with the lawyers and when I was at Motorola and worked with them, convinced them, educated them, know, education, all that to get them so that they could understand, we’re not going to take a lot of risk.

CRob (03:18.866)
Mm-hmm.

Kate Stewart (03:47.313)
And it’s in our best interest to sell our chips. So we got that happening. And then the company spun out FreeScale for Motorola. And I got to do it all over again with a new set of lawyers. So I was managing the open source teams. And then we did a lot of work with the team in Austin. And then we started scaling that out worldwide to all the other open source teams in Motorola, or FreeScale at the time then, and managing teams.

CRob (04:00.05)
Hahaha

Kate Stewart (04:16.363)
Teams in China, teams in Israel, teams in Canada, and gosh, Japan and France. Anyhow, we had a good selection of teams around the world. And so making sure that we could actually do all this properly at scale was an interesting challenge. So that was my start in open source. And this is why you sort of see me in the licensing space, because I’ve been talking to lawyers a lot. And that’s kind of where SPDX started, is because we had to keep doing the same metadata over and over and over.

And so my colleagues at Wind River were looking at the same stuff. My colleagues at Monovista were looking at the same stuff. We had no way of collaborating. So it was a language to collaborate with. that if I go and scrub this curl package and pull all the licensing information out and sanitize it, I could share it with you type of deal and vice versa. So that’s kind of how SPVX started with being able to share the metadata about projects and make things transparent so that we could do the risk analysis properly. After that. you know, I basically went and got an opportunity to join Canonical and I was a Buntus release manager for two and a half years. So all of a sudden that was a whole different view of open source, right? I was coming from a nice little embedded space and at Canonical, I learned all about dependencies and all about how you had to make sure that your full dependency chain was actually satisfied and coherent.

CRob (05:23.664)
Nice.

Kate Stewart (05:39.447)
Or you were going to have problems. I also learned a lot about zero days and making the whole security story come together so that we could ship a distribution that people were coming, that was not going to cause problems for people. And so there’s about five, a bunch of releases I was released management for in that time period. And so that taught me a lot about open source at scale in the current environment. And after that, I went to, I was doing the roadmaps at Linaro.

CRob (05:51.43)
Mm-hmm.

Kate Stewart (06:09.399)
Product management, was the director of product management at Lenaro to figure out where did the army ecosystem want to collaborate? What topics do want to collaborate on? And so I was there for a couple of years and then I joined the Linux Foundation in 2015. So this will be my 10th year and it’s been a fun night. I know. So I’ve been, yeah, yeah, it is. they brought me in because the lawyers at the Linux Foundation knew me from the SPDX world. But after I joined, they sort of they had me point to a certain problem initially when I joined and we figured out a solution. But then they realized, oh, she understands embedded. Ooh. So they basically asked me to pull together the funding pool for real-time Linux. And I said to them that that was a big problem. And now, as of last year, we finally have everything upstream. It’s only taken eight years, but we finally got it. We preempt our taste at patches. And then after I was able to get that going well, went, hmm, we’ve got this RTOS called Zephyr that Intel is contributing up and we need someone to run it and start to build a program up. And so it was sort of like, okay, let’s figure out how we can deal with this one. So it was fun. We did a lot of surveys with developers, what were the big pain points in the IoT ecosystem? Back in 2015, I don’t know if you remember, but the big joke was what’s the S in IoT? Yeah, exactly. And so…

Focusing on security in Zephyr has been the focus pretty much since the day we launched the project. We had our first security committee meetings. And every time we found a new security best practice, we did try to apply it into Zephyr and to see where we are. I think we were the fifth project to get a gold badge out of the OpenSSF Badging Program. I think there’s a few more now, but not that many. We have gone to that level.

Kate Stewart (08:02.417)
And certainly Zephyr was there before Linux was there, just as side thing. So we actually got it.

CRob (08:08.274)
I’ll tell Greg next time I see him.

Kate Stewart
He knows it. I’ve teased him for many years. So nothing new there. But yeah, so we were trying to do best practices in Zephyr. And we’ve been working towards safety in Zephyr. Now it’s taken us a while to figure out a path that would work in the context of open source. But these were the two things that we were told back in 2015 by the developers that they wanted to see in open source Arctos. And so that’s what we’ve been focusing on. the project has grown really well over the years. We’re now the fifth most active project at the Linux Foundation in terms of project velocity, especially by CNCF, not us. So we have some degree of separation there.

And we’ve actually hit the top 30 of all open source projects now, so we’re the 25th. So it’s had a pretty good trajectory. And this just goes to show, if you try to do the best practices, it makes a project successful. And developers want to be there because they can build on it. So that’s kind of the origin story of where I am to today. I’ve been starting to work on it. I continue to work on Linux-related topics with the safety, with the Elisa project.

Kate Stewart (09:20.331)
And we’re busy trying to figure out similar paths, how we can go for certifications with Linux. So that’s been growing slowly. But both of these projects are ones that grow very slowly over time. And they just sort of creep up bit by bit by bit by step by step by step. And I’ve got some great, yeah, exactly, pedal by pedal by pedal. I’ve got some great board members in both projects who are very much engaged and have been doing a lot to help us move it forward. So I’ve been very lucky in that sense too. Yeah.

CRob (09:45.362)
Awesome.

That’s an amazing journey that you’ve described and touching on so many interesting different areas. I love like through these interviews, I get to hear how people got here and the variety of skills that people bring that you’re able to contribute very meaningfully to upstream is just amazing. It really makes me happy to hear that you have again, kind of a non-traditional open source background. That’s great.

Kate Stewart (10:09.729)
Yeah, like I say, whenever the academics go do surveys about open source origins, I always enjoyed being able to do that to say, it isn’t just developers that do open source and help open source. Realistically, it’s a community. And so it’s, know, everyone has different roles and different abilities to contribute in different ways. And as we all want to go after the same vision, the right thing happens. It’s pretty much what happened. It’s pretty much what I’ve seen anyhow.

CRob (10:31.89)
So where I first got to interact with you way back in days of yore was on this little thing that you alluded to, this small little thing called Software Bill of Materials. I joined the second NTIA call and you were one of the presenters in the room. So for our audience, could you maybe give me an elevator pitch on what’s an SBOM and why is it important?

Kate Stewart (10:57.835)
Well, so Software Bill of Materials is a way of exposing the metadata about the software you’re using in a transparent fashion. And so it’s basically putting in together the key elements of what are your components and how are they linked together? What’s the relationships between them? And understanding these relationships and how these components interact is what gives you the ability to decide if something is potentially at risk or not. Is it vulnerable or is it not vulnerable? What are the transitive dependencies? What’s happening? Realistically, there was a lot of simplifications that were made at that point in time in the initial S-BOM work. They’re starting to come back to bite us now. Just saying, one of which was the only thing was includes as a dependency. Well, realistically,

Kate Stewart (11:53.067)
You need to know something statically linked is something dynamically linked. How are these, how are these things interacting? and we’ve got, you know, things emerging right now in the whole AI front of, know, what training data we’re using, you know, what model on your models and how is this all assembled? And so these are things that we’d actually dealt with in SPDX a long time ago, but the SBOM community wasn’t ready to talk about it at that time. They’re starting to move in this direction now, but you’ll find, like I say, I’m having a lot of fun right now reading some academic papers, because I’ll be talking about some of these topics at the end of the month up in Ottawa at the MSR. So I’ll be doing a keynote there about some of this stuff. And I was looking at all these academic papers to see, OK, what do they think an S-BOM is, just so that I figure out where I have to talk to them about this stuff.

Kate Stewart (12:52.263)
One of the things that they’re seeing is that there’s a lot of the interpretations and the subtleties around the S-BOM is this data of components and the relationship between these components is very much like an elephant. It depends on which part of the elephant you’re coming in at it from and what you’re feeling with with your little line hands as to what you perceive it as.

Kate Stewart (13:18.999)
And we went through the, I guess it was in the CISA working group. I was a fly on the wall on the first one and opened my mouth, which is why I was on the second NTIA call. And that’s why I basically took over co-leading the formats and tooling working group under NTIA. And I was an active participant in the framing discussions. So I’ve been pretty much involved with it all the way through. then I was working when we first had the the SBOM Everywhere. So we had the SBOM Working thing in Washington. I was there. And that was sort of the start of figuring out, okay, well, we want to have the SBOM Everywhere sync to start focusing issues. And so when there was a hiatus between NTIA and CISA, that SBOM Everywhere group was keeping the discussion going in a reasonably collective way. And we’re sort of starting to head into that again with the NTIA SBOM Everywhere, pulling voices together, and understanding how this technology is evolving and where the strengths and weaknesses are and where the gaps are, filling the gaps. So I’ve been involved with the OpenSSSF, OpenSSF, SBOM Everywhere SIG under the tooling group, basically with Josh Breshers since it started, trying to make sure that we had the different voices coming in and we had the different perspectives available so that we could start to get a good lie of the land. And we started work on clarifying what is actually happening with the types of SBOMs? Because the type of data you have available for source SBOMs is what historically the legal focus focused on for license compliance, but still useful for the safety people. But then you have the build SBOMs where you say, OK, these are the pieces I’m putting together. And the question is, you capturing, these are all the source files that went into my build image, or are you just capturing these are the components that go into my build image?

Kate Stewart (15:14.559)
But these are build types of SBOMs. And they have different purposes, and they have different ability to reduce false positives. Specifically, if you don’t compile a piece of code into your package, you’re not vulnerable to it. there’s a lot of, I’d like to get rid of a whole bunch of these false positives in the security space. I really think we should be able to, if we actually get the right level of information. So we can take that and then what you actually configure may impact whether you have vulnerability or not when you deploy it. So what have you released and where are you deploying? And then what’s happening to the system as it’s running? Are you updating your libraries? Are there potentially interactions between your runtime libraries and what you put down as images? All these are different types of data that can legitimately be put into an SBOM. And there’s different levels of trust depending on where you are in the ecosystem, how much you actually how much the people who have put the data into the format actually understand the data and have confidence. Because there’s a lot of tools out there, which is the sixth type, which is an analysis SBOM. And these are ones that are looking at a different part of the life cycle or off to the side and trying to guess what’s going on. And the guessing what’s going on is if you don’t have anything else, that’s what you need to do. No question. But if you can get it precisely from the build flows and from your whole ecosystem as it’s being used, as it’s being deployed, as it’s being monitored, you’re going to have a lot more accuracy, which removes a lot of problems. So that type of concept that we sort of picked up on in OpenSSF, Seth Swansegan, has been picked up by CISUN. And we’ve got a short white paper out about the same thing. That concept hasn’t hit academia yet.

CRob (17:03.858)
We better get that over there.

Kate Stewart (17:05.55)
Uh-huh. so this is I’ll be having some fun. So the whole concept of the different data and like they’re trying to look at like all these SBOMs everywhere and how do you work with them, et cetera. And it’s sort of like, well, first question one is how authoritative is the people who produce this SBOM on you? Is it garbage in? Is it someone just filling in the fields and not really understanding what they’re filling in?

CRob (17:28.676)
Mm-hmm.

Kate Stewart (17:31.797)
Or does it come from a source that you trust and that you think actually knows what you’re talking about so you can build on it further and augment it and everything else? So yeah, these are the challenges I think that are kind of interesting for us to be playing with right now. And so I was part of, in SysI, working on the basically the tool quality was basically we were working on looking at tooling and looking at the formats and actually ended up doing a revision to the revision of the framing document. So we basically pulled together consensus on what the next additional sets of fields were going to be that got added into the tree. And then I think CISA has plans, or at least had plans, we’ll find out what happens this year, of basically going through a more formal process. But they’ve got the stakeholder input from us now. And that was the stage of where we thought, and realistically, the lawyers managed to show up in those meetings and convince them that, they did need to keep licenses in there now. Thank you. Because it’s another form of risk at the end of the day. It’s just risk forms. And so people creating these things and doing policies, the academics are busy trying to study this because it’s an interesting area for them. And so I think we’ll see a lot of innovation in the next few years and getting towards a stage where we can actually do product lines at scale.

Um, and being able to keep things safe, I think is where we need to aim for the whole SBOM initiatives and they need to be system-wide. They can’t just be software anymore. You need to know which data sets you’ve used because you can have poison data sets all over the place. can have, um, you know, bad training on your models might have an impact if you’ve not weighted things properly. These are all factors that when people are trying to understand what went wrong, they’re going to need to be able to look at.

CRob (19:21.458)
What I love about my upstream work is that I get to collaborate with amazing people like you on these really hard challenges and know software bill of materials has been won. Let’s move on to another interesting problem we all get to work with. The European Union last year released a new piece of legislation, the CRA, the Cyber Resilience Act.

Kate Stewart (19:45.565)
Yes.

CRob (19:47.43)
And it’s a set of requirements for predominantly manufacturers, but also a little bit for open source stewards that talks about the requirements for products with digital elements. through our parallel work, we worked with our, with Hillary Carter and the LF research team. And we did two study, we sponsored two studies. The first study, which was very sad news where we polled 600 people participating in upstream open source, whether that was a manufacturer, a developer, a steward like us. And the findings there were a little scary, where a lot of people are unaware and uncertain is the title of the paper. People don’t know about this law and the requirements that are coming down the pipe. But in concert with that, LF Research also released a paper that has, I feel, is some pretty spectacular news. The title of the other paper was Pathways to Cybersecurity, Best Practices in Open Source, and a project that’s near and dear to your heart was cited as one of the exemplars of an upstream project that takes security seriously and is doing the right types of things. So could you maybe talk a little bit about Zephyr and talk about like what your personal observations of what this report kind of detailed?

Kate Stewart (21:13.749)
Yeah, we’re going to have a gap. Also, there’s a lot of things that some of these projects have done right. Zephyr, for one, like I say, we’ve taken security seriously right since the project started. Literally, the first security committee meeting happened the day after we launched the project. it’s in our blood that we make sure that we’ve tried to do this right. We’ve been, you know, we started trying to figure out, and was funny enough, was part of the Core Infrastructure Initiative Open Source Badging started at that point in time roughly as well, that’s us. And so we looked at it going like, well, how do we improve our security posture as a project? Because we were coming at this from a community and we started using the badging program as looking at the criteria, filling it out, understanding how this all works. And we initially got to passing.

Yay, we got it. We’ve got our basic passing. And then David Wheeler got busy with his community, and he came up with a silver and a gold level on us. And we just kept increasing the resilience of the project at the heart of it all. So this was also really quite educational. So we started looking at that. And as we were going for that first passing, we realized, it. What’s involved with becoming a CVE, numbering authority, or CNA?

Kate Stewart (22:37.399)
And so I started working and reaching out to the folks behind that and understanding what’s involved. And so we ended up having myself and some of the security committee, we ended up meeting with them trying to figure out, okay, well, what does it take for project to be a CNA, not a company, but a project? And so in, think 2017, 2018, we actually got our CNA criteria. We fulfilled the CNA criteria and the project itself has been a CNA with a functioning P-SIRT.

CRob (23:06.066)
Nice.

Kate Stewart (23:06.557)
We’ve been plugged into the whole P-SIRT community and the first community and everything else for the last few years. And last year at first conference, I actually gave a talk about Zephyr because a lot of the corporation folks don’t realize that projects can do this too, if they have the will and the initiative behind their membership and their developers that they want to be this way. we’ve been looking at these things and tackling that.

CRob (23:14.514)
Excellent.

Kate Stewart (23:36.311)
And then we went after, we were sort of sitting on our laurels to a certain extent before we started really going after the silver. And the automation behind that badge kicked us out because some of the website lights weren’t working anymore. Yeah. So I went, oh, it kicked us out. We’ll take it seriously now. OK. This is good. It wasn’t just a paper exercise. There’s actually something that’s keeping you honest there behind the scenes. And that checking behind the scenes motivated us then to go after silver and then gold, and we finally got gold. then, so I can say we were about the fifth, I think the fifth to get gold, fourth or the fifth. I was really, you we were pretty happy. And we’ve maintained it since then. And every year we do a periodic audit of the materials. We started looking at the scorecard practices as a project. And so we started looking at last year and the security committee, actually, this is the amusing part here. The security committee was going, ah, this really doesn’t apply to people building things with Zephyr, et cetera, et cetera. And we were going like, oh, well, OK.

Kate Stewart (24:33.761)
And then when we had this little recent incident with that exposure, all of a sudden we’re talking on weekends and they realized, shit, we better take this all seriously. So we’re improving our posture there too now. So I think we went like from 50 something up to 78 and we’ve got plans for getting ourselves up to the hundred level type of deal. So we’re continuing to improve our posture and work on that sort of stuff there. So Zephyr takes security very, very seriously as you can tell.

CRob (24:52.914)
Excellent.

Kate Stewart (25:03.031)
It really is, you know, part of the reason I think we’ve been a successful project is the fact that we do have a strong security story. We’ve got our threat models. We’ve got, know, we’ve got the things that people are saying to be looking at and we keep putting them and we have, people that meet from our members and from the community and, you know, continue to refine our posture. So I think it’ll always be that way. And so whenever I can find a new practice, we’re starting to look at, can we apply it? How does it match to what we’ve already done?

And so I’m curious when the baseline stuff rolls out, what is it going to look like? And I suspect we’ve already hit there, but we might learn something. That’s always good. And the nice thing about doing it this way, which was applying the best practices badging and forth, let us put things in place. They’re serving us well for the CRA. And that’s probably why we showed up in this report, because

You know, lot of the things that we do with the US government, once the Europeans figure out, what database are they going to be throwing things at? Who is going to be the C-CERTS? We’ve been doing it already on the US side. We should be able to do it on the European side. It’s just a matter of figuring it out a little bit. Just basically, testing a few of our processes. But we’ve already been doing these types of things in one direction. We just have to broaden the reach a little bit more.

CRob (26:26.651)
Yeah, I think it’ll be a pretty easy pivot. You need to make some small adjustments, do some documentation potentially, but it sounds like you’ll be in a pretty good spot to fulfill that any obligations underneath that anyone that uses Zephyr will be in a great space to defend themselves like the Market Surveillance Authority or other groups.

Kate Stewart (26:46.037)
Right. And, you know, one of the things that the project did when we started looking at the criteria coming down, and the needs of the criteria was it was looking like, okay, they want to have this, this longer support period, like, you know, between five years up to 25 type of deer. And so the separate projects TSC actually in March voted to extend our LTS support from 2.5 years to five years. And we’ve got that we’re doing different things in different periods of time.

CRob (27:12.551)
Yay!

Kate Stewart (27:15.863)
Okay, so what we’re doing traditionally now for the first two years, two and a half years, and then we’re basically just focusing on security and critical updates after that. So it keeps the load reasonable for our maintainers and our release team members. And so that’s kind of how we’ve attracted. So, yeah, the TSE voted on it, it was approved. And so that’s our bit we could do to help shrink the gap between the steward obligations and the manufacturer obligations. So we’re trying to make it.

Kate Stewart (27:45.013)
Zephyr is friendly as possible for the manufacturers. Now we are going to have a challenge that I don’t see a clean story for yet. We went through some bulk vulnerability assessments and things like that, and we’ve changed some processes. And what we do is we have documented in the project, we will respond to any vulnerability reports in seven days. We will basically, you know, we’ll acknowledge them, you know, start working with whoever’s reporting things in, and then we will in 30 days have it fixed upstream.

And then in 60, we’ll have another 60 day embargo window. total 90 days of embargo window before we make it public. And we do this because in the embedded space, you know, it’s a lot harder sometimes to remediate things in the field. And so we wanted to make sure that, you know, the people who are trusting Zephyr as their RTOS, you know, would have a chance to work with their customers. Now the CRA, you know, and so the CRA is asking for, especially in severe vulnerabilities, some pretty different timelines. And so I’m really going to be interested in how that’s all going to boil itself out. The other thing that’s interesting is that the CRA is calling for us to notify our customers. Well, we don’t know who’s using Zephyr. So one of the mechanisms we put in place earlier, which was a product, like basically a vulnerability notification list for our product makers.

Kate Stewart (29:12.887)
So any of our members in the project or any people who can show us that they’re using a product that’s got Zephyr in it, we’ll add them to the notification list under embargo. And so we’re trying to handle it that way. But that’s going to be the best we’re going to able to do. We won’t be able to find this end user because we don’t have the line of sight. And now the cut, the manufacturer.

CRob (29:32.028)
But exactly, and that’s not necessarily your responsibility is the upstream. That’s the manufacturer somebody that’s commercializing this.

Kate Stewart (29:36.317)
Well, it’s one of those sections that applies a little bit to the stewards.

CRob (29:45.81)
It does, but the timelines are not identical.

Kate Stewart (29:47.147)
So that’s why we’re always going to do to whoever wants to let us know about them. Yeah, let us know about them. I think sanity will prevail, and we will not be subject to various other, some of the punitive stuff. But I think we can certainly look at trying to make sure that what we’ve done is as much as we can do with the data we have available to us. And hence, the more transparency we make into this ecosystem, the better. Speaking back to S-BOMs, you know, Zephyr actually, every build of Zephyr, you can get three SBOMs out with just a couple command line tweaks. You get sources SBOM of the Zephyr sources from the files you pulled from. Of course, the sources from the application you’ve used. So you get source SBOMs for each of those. And then you get a build SBOM that links back with cryptographic encryption to the source and to the, to the, to the source SBOMs and lets you know exactly which files made it in as well as which, you know, dependency links and components you may have pulled from your ecosystem. But that level of transparency is what we’re going to need for safety. And we have it today with Zephyr from following these best practices on security. So we’re in reasonable shape, I think, for the regulatory industries as well.

CRob (31:05.414)
Good. Well, I could talk about SBOM and CRA all day, but let’s move on to the rapid fire part of our interview.

Kate Stewart (31:14.134)
Okay.

CRob (31:17.66)
got a bunch of questions here. Just give me your first emotional response. First thing that comes to mind here. First question, VI or EMACs.

Kate Stewart (31:28.599)
VI.

CRob (31:30.546)
Nice, very good, very good. These are all potentially contentious somewhat, so don’t feel bad about having an opinion. What’s your favorite open source mascot?

Kate Stewart (31:34.638)
Tux

CRob (31:45.106)
Excellent choice. What’s your favorite vegetable?

Kate Stewart (31:50.625)
Carrots.

CRob (31:52.434)
Very nice. Star Trek or Star Wars?

Kate Stewart (31:58.455)
Star Trek.

CRob (32:00.541)
Very good. There’s a pattern. So everyone I’ve asked that so far has been trekkers, which is good. And finally, mild or spicy food.

Kate Stewart (32:09.611)
Depends. Probably, at this point now more mild. There’s certain things I like spicy though.

CRob (32:14.578)
Ha

CRob (32:19.42)
Fair enough. Well, thank you for playing along there. And as we wind down here, Kate, do you have any call to action or any advice to someone trying to get into this amazing upstream ecosystem?

Kate Stewart (32:33.303)
There’s lots of free training out there. Just start taking it. Educate yourself so that you can participate in the dialogue and not feel completely overwhelmed. Each domain has its own set of jargon, be it lawyers, licensing jargon, security professionals with their jargon, safety folks with all their standards jargon. Everyone talks about certain concepts a little bit differently. so taking the free training that’s available from the Linux Foundation and other places just so that you’re aware of these concepts. actually, before you start opining on some of these things, actually do your homework. I know that’s a horrible concept, but like I said, was reading on some of these papers, these academic papers, and it was pretty clear that they hadn’t done their homework in a couple of areas. So that was a bit sort of like, yeah, OK. So yeah, do your homework before you start to opine. But do your homework. Educate yourself. Do your homework. And then find the areas that are most interesting to you, because there’s so many areas where people need help these days and there’s a lot of things we can participate in. And you don’t need to be a developer to participate as well. Obviously developers make everything come together and make it all work, but there’s a need for people doing a lot of other tasks. And if you want to try and make things move forward, there’s lots of ways.

CRob (33:49.586)
Well, thank you for sharing your story and thank you for all the amazing contributions you’re making to the ecosystem. Kate Stewart, thank you for showing up today.

Kate Stewart (33:56.919)
Thank you very much for all the work you’re doing in OpenSSF. I think this is really important. Thank you.

CRob (34:03.718)
My pleasure. And to everyone out there, thank you for listening and happy open sourcing. We’ll talk soon. Cheers.