Tag

Software Supply Chain Security

What’s in the SOSS? Podcast #53 – S3E5 AIxCC Part 3 – Buttercup’s Hybrid Approach: Trail of Bits’ Journey to Second Place in AIxCC

By Podcast

Summary

In the third episode of our AI Cyber Challenge (AIxCC) series, CRob sits down with Michael Brown, Principal Security Engineer at Trail of Bits, to discuss their runner-up cybersecurity reasoning system, Buttercup. Michael shares how their team took a hybrid approach – combining large language models with conventional software analysis tools like fuzzers – to create a system that exceeded even their own expectations. Learn how Trail of Bits made Buttercup fully open source and accessible to run on a laptop, their commitment to ongoing maintenance with prize winnings, and why they believe AI works best when applied to small, focused problems rather than trying to solve everything at once.

This episode is part 3 of a four-part series on AIxCC:

Conversation Highlights

00:04 – Introduction & Welcome
00:12 – About Trail of Bits & Open Source Commitment
03:16 – Buttercup: Second Place in AIxCC
04:20 – The Hybrid Approach Strategy
06:45 – From Skeptic to Believer
09:28 – Surprises & Vindication During Competition
11:36 – Multi-Agent Patching Success
14:46 – Post-Competition Plans
15:26 – Making Buttercup Run on a Laptop
18:22 – The Giant Check & DEF CON
18:59 – How to Access Buttercup on GitHub
21:37 – Enterprise Deployment & Community Support
22:23 – Closing Remarks

Transcript

CRob (00:04.328)
And next up, we’re talking to Michael Brown from Trail of Bits. Michael, welcome to What’s in the SOSS.

Michael Brown (ToB) (00:10.688)
Hey, thanks for having me. I appreciate being here.

CRob (00:12.7)
We love having you. So maybe could you describe a little bit about your organization you’re coming from, Trail of Bits, and maybe share a little insight into what your open source origin story is.

Michael Brown (ToB) (00:23.756)
Yeah, sure. So Trail of Bits is a small business. We’re a security R &D firm. We’ve been in existence since about 2012. I’ve personally been with the company about four years plus. I work there within our research and engineering department. I’m a principal security engineer, and I also lead up our AIML security research team. So Trail of Bits, we do quite a bit of government research. We also work for commercial clients.

And one of the common threads in all of the work that we do, not just government, not just commercial, is that we try to make it as public as much as we possibly can allow. So for example, sometimes, you know, we work on sensitive research programs for the government and they don’t let us make it public. Sometimes our commercial clients don’t want to publicize the results of every security audit, but to the maximum extent that our clients allow us to, we make our tools, we make our findings, we make them open source. And we’re really big believers in

that the work that we do should be a rising tide that raises all ships when it comes to the security posture for the critical infrastructure that we all depend upon, whether we’re working on hobbies at home and whether we’re building things for large organizations, all that stuff.

CRob (01:37.32)
love it. And how did you get into open source?

Michael Brown (ToB) (01:42.146)
Honestly, I’ve just kind of always have been there. So realistically, you know, the open source community is where a lot of the research tools that I started out my research career. That’s where you found them. So I started off a bit in academia. I got my undergrad in computer science and then went and did something completely different for eight years. And then when I kind of

Uh, you know, for context, I joined the military. flew helicopters for like eight years and basically nothing in computing. But as I was starting to get out, um, the army, I, know, I was getting married, about to have kids. I kind of decided I wanted to be, you know, around the house a little bit more often. um, you know, I started getting a master’s degree at Georgia Tech. They’re offering it online. And then after I did that, I went to go, um, do a PhD there and also work for their, um, uh, their applied research arm, Georgia Tech Research Institute.

So lot of the work that I was doing was, you know, cutting edge work on software analysis, compilers and AI ML. And a lot of the stuff that I, you know, built the tools that I did my research on, they came from the open source community. They were tools that were open sourced as part of the publication process for academic work. They were made publicly and available open source by companies like Trail of Bits before I came to work with them as the result of government research projects.

So, honestly, I guess I don’t really have much of an origin story for when I got there. I kind of just landed there when I started my career in security research and just stayed.

CRob (03:16.814)
Everybody has a different journey that gets us here. And interestingly enough, you mentioned our friends at Georgia Tech, which was a peer competitor of yours in the AICC competition, which you all on Trail of Bits team, I believe your project was called Buttercup. And you came in second place. You had some amazing results with your work. So maybe could you tell us a little bit about the…

Michael Brown (ToB) (03:33.741)
Yeah, that’s correct.

CRob (03:43.15)
What you did is part of the AI CC competition and kind of how your team approached this.

Michael Brown (ToB) (03:51.022)
Yeah. So, um, you know, at the risk of sounding a bit like a hipster, um, I’ve been working at the intersection of software security, compiler, software analysis, AI ML for, you know, basically, um, almost my entire, uh, career as a research scientist. So, you know, dating back to the earliest program I worked on, uh, for, DARPA was back in 2019. And, um, so this was, this was before the large language model was a predominant form of the technology or kind of became synonymous with AI. So.

CRob (04:04.719)
Mm.

Michael Brown (ToB) (04:20.792)
For a long time, I’ve been working and trying to understand how we can apply techniques from AI ML modeling to security problems and doing the problem formulation, make sure that we’re applying that in an intelligent way where we’re going to get good solid results that actually generalize and scale. So as the large language model came out and we started recognizing that certain problems within the security domain are good for large language models, but a lot of them aren’t.

When the AI cyber challenge came around, we always approached this and this, I was the lead designer, my co-designer, Ian Smith. And I, you know, when we sat down and make the, the original concept for what became Buttercup, we always took an approach where we were going to use the best problem solving technique for the sub problem at hand. So when we approached this giant elephant of a problem, we did what you do and you have an elephant and you’ve got to eat it, eat it one bite at a time.

So each bite we took a look at it and said, okay, you we have like these five or six things that we have to do really, really well to win this competition. What’s the best way to solve each of these five or six things? And then the rest of it became an engineering challenge to chain them together. Our approach is very much a hybrid approach. This was a similar approach taken by the first place winners at Georgia Tech, which by the way, if you’ve got to be beat by anybody being beat by your alma mater, it takes a little bit of this thing out of it. So, you know, we came in first and second place. It’s funny, I actually have another Georgia Tech PhD alumni.

CRob (05:33.832)
You

Michael Brown (ToB) (05:42.926)
on my team who worked on Buttercup. So Georgia Tech is very well represented in the AI cyber challenge. So yeah, we’ve always had a hybrid approach. The winning team had a hybrid approach. So we used AI where it was useful. We used conventional software analysis techniques where they were useful. And we put together something that ultimately performed really, really well and exceeded even my expectations.

CRob (05:45.458)
That’s awesome.

CRob (06:07.56)
I can say I mentioned in previous talks, I was initially skeptical about the value that could have been derived from this type of work. But the results that you and the other competitors delivered were absolutely stunning. You have converted me into a believer now that I think AI absolutely has a very positive role can play both in the research, but also kind of the vulnerability and operations management space. What

Looking at your buttercup. What is unique about your approach with the cyber reasoning system with the buttercup?

Michael Brown (ToB) (06:45.39)
Yeah, so it’s funny you say that we converted you. I kind of had to convert myself along the way. There was a time in this competition where I thought, you know, this whole thing was going to kind of be reliant on AI too much and was going to fall on its face. And then, you know, at that point, I’d be able to say like, see, I told you, you can’t use LLMs for everything. But then it turns out, you know, as we got through there, we use LLMs for two critical areas and they worked much better than I thought they would. I thought they would work pretty well, but they ended up working to a much better degree than I thought they actually would. you know, what makes Buttercup unique?

CRob (06:49.852)
Yeah.

CRob (07:00.678)
You

Michael Brown (ToB) (07:15.69)
is that, like I said, we take a hybrid approach. We use AIML for very good problems that are well-suited for AIML. And what I mean by that is when we employ large language models, we use them on small subproblems for which we have a lot of context. We have tools that we can install for the large language model to use to ensure that it creates valid outputs and outputs that can carry on to the next stage with a high degree of confidence that they’re correct.

CRob (07:30.076)
Mm-hmm.

CRob (07:43.912)
Mm-hmm.

Michael Brown (ToB) (07:45.934)
And then in the places where we have to create or sorry, in one of the places where we have to use conventional software analysis tools, those areas are very amenable to the conventional analysis. So, you what I mean by this? good example is, for example, we needed to produce a proof of vulnerability. We have to have a crashing test case to show that when we claim a vulnerability exists in a system, we can prove through reproduction that it actually exists. Large language models aren’t great.

at finding these crashing test cases just by asking it to look at the code and say, hey, what’s going to crash this? They don’t do very well at that. They also don’t do well at generating an input that will even get you to a particular point in a program. But fuzzers do a great job of this. So we use the fuzzer to do this. But one of the things about fuzzers is they kind of take a long time. They’re also more generally aimed at finding bugs, not necessarily vulnerabilities.

CRob (08:36.808)
Mm-hmm.

Michael Brown (ToB) (08:42.702)
So we use an AIML, or Large Language Model based accelerator, or a C generator, to help us generate inputs that were going to guide the fuzzer to either saturate the fuzzing harnesses that existed for these programs more quickly. They would help us find and shake loose more crashing inputs that correspond to vulnerabilities as opposed to bugs. And those things really, really helped us deal with some of the short analysis and short.

processing windows that we encountered in the AI cyber challenge. So was really a matter of using conventional tools, but making them work better with AI or using AI for problems like generating software patches for which there really aren’t great conventional software analysis tools to do that.

CRob (09:28.018)
So as you were going through the competition, which went through multiple rounds, are there anything that surprised you or that you learned that, again, you said your opinion changed on using AI? What were maybe some of the moments that generated that?

Michael Brown (ToB) (09:45.226)
Yeah, so there I mean there were a couple of them. I’ll start with one where I can pat myself on the back and I’ll finish with one where I was kind of surprised. So first, we had a couple of moments that were really kind of vindicating as we went through this. Our opinion going into this was that large language models, couldn’t just throw the whole problem at it and expect it to be successful. So going into this, there was a lot of things that we did that we did

CRob (09:49.405)
hehe

Michael Brown (ToB) (10:14.774)
two years ago when we first started out that, you know, that’s like two years ago, it’s like five lifetimes when it comes to the development of AI systems now. So there were some things that we did that didn’t exist before that became industry standard by the time we finished the competition. So things like putting your LLM queries or your LLM prompts in a workflow that includes like validation with tools or the ability to use tools.

CRob (10:29.298)
Mm-hmm.

Michael Brown (ToB) (10:43.062)
That was something that was not mainstream when we first started out, but that was something that we kind of built custom into Buttercup when it came particularly to patching. And then also using a multi-agent approach. know, a lot of the, you know, I don’t know, a lot of the hype around AI is that, know, you just ask it anything and it gives you the answer. You know, we’re asking a lot of AI when we say, here’s a program, tell me what vulnerabilities exist, prove they exist, and then fix them for me.

And also don’t make a mistake anywhere along the way. It’s way too much to ask. we found particularly with patching, were, back then, multi-agent systems or even agentic systems or multi-agentic systems were unheard of. were still just, we were still using chat GPT 3.5, still very much like chatbot interactions, web browser interactions.

CRob (11:16.564)
Yeah.

Michael Brown (ToB) (11:36.438)
integration into tools was certainly less widespread. So we had seen some very early work on archive about solving complex problems with multiple agents. So breaking the problem down for it. And we used this and our patcher ended up being incredibly good. was our most important and our biggest success on the project. Really want to shout out Ricardo Charon, who’s our leader, the lead developer for our patching agent.

CRob (11:47.976)
Mm-hmm.

Michael Brown (ToB) (12:06.414)
or for a patching system within for both the semi finals and finals in the ICC. He did an incredible job and we really built something that I, like I said, I regard as our biggest success. So, know, sure enough, and as we go through this two year competition, now all of a sudden, you know, multi agentic systems, multi agentic tool enabled systems, they’re all the rage. This is how we’re solving these challenging problems. And also a lot of this problem breakdown stuff that has made its way baked into the models now, the newer thinking and reasoning models from

Anthropic and open AI respectively. They you you can give it these large complicated problems and will do it will first try to break it down before trying to solve it. So you know we were building all that stuff into our system before. It came about so that that’s an area where you know, like I said, we learned along the way that we had the right approach from the beginning and it’s really easy to go back and say that that’s what we learned that we were right. So on the other side of this I will have to say, you know, I’ll reiterate I was really surprised at how well.

CRob (12:53.639)
Mm-hmm.

Michael Brown (ToB) (13:04.11)
language models were able to do some of the tasks we asked it to do. Part of it’s how we approach the problem. We didn’t ask too much of it. And I think that’s part of the reason why the large language models were successful. an area that I thought where it was going to be much more challenging was patching. But it turned out to be an area where, a certain degree, this is kind of an easier version of the problem in general because open source software, which are the targets of the AI cyber challenge, they’re ingested into the training.

CRob (13:08.924)
Mm.

Michael Brown (ToB) (13:31.404)
data for all of these large language models. the models do have some a priori familiarity with the targets. So when we give it a chunk of vulnerable code from a given program, it’s not the first time it’s seen the code. But still, they did an amazing job actually generating useful patches. The patch rate that I expected personally to see was much lower than the actual patch rate that we had, both in the semifinals and in the finals. So even in that first year window,

CRob (13:33.64)
Mm.

Michael Brown (ToB) (13:58.63)
I was really kind of blown away with how well the models were doing at code generating, code generation tasks, particularly small focused code generation tasks. So I think, think large language models are kind of getting a bad rap right now when it comes to like, you know, trying to vibe code entire applications. They’re like, gosh, this, this code is slop. It’s terrible. It’s full of bugs and stuff. Well, well, you did also ask you to build the whole thing. You know, if I asked a junior developer to build a whole thing, they probably also put together some.

CRob (14:07.366)
Yeah.

CRob (14:17.233)
Yeah.

CRob (14:26.258)
Yeah.

Michael Brown (ToB) (14:26.71)
and gross stuff. But when I ask a junior developer to give me a bug fix, much like the large language model, when I ask it for a more constrained version of problem, they tend to do a better job because there’s just fewer moving parts. So yeah, those are are two things I took away. One that, you know, like I said, I get to pat myself on the back for another that was actually surprising.

CRob (14:46.012)
That’s awesome. That’s amazing. So now that the competition is over and what does the team plan to do next beyond this competition?

Michael Brown (ToB) (14:57.098)
Yeah, so I mean, look, we spent a lot of our time over the last two years. A lot of I wouldn’t quite say blood. I don’t think anyone would bled over this, but we certainly had some tears. We certainly had a lot of anxiety. you know, we put a lot of we put a lot of ourselves into Buttercup. And so, you we want people to use it. So to that end, Buttercup is fully available and fully open source. know, DARPA made it a contingent, a contingency of participating in the competition that

CRob (15:09.917)
Mm-hmm.

Michael Brown (ToB) (15:26.892)
you had to make the code that you submitted to the semi finals and the finals open source. So we did that along with all of our other competitors, but we actually took it one step further. So the code that we submitted to the finals is great. It’s awesome, but it runs that scale. It used $40,000 of a $130,000, I think, total budget. And it ran across like an Azure subscription that had multiple nodes.

countless replicated containers. This is not something that everyone can use. We want everyone to use it. So actually in the month after we submitted our final version of the CRS, but before DEF CON occurred, where we figured out that we won, we spent a month making a version of Buttercup that’s decoupled from DARPA’s competition infrastructure. So it runs entirely standalone on its own, but more importantly, we scaled it down so it’ll run on a laptop.

CRob (16:18.696)
Mm-hmm.

Michael Brown (ToB) (16:25.154)
We left all of the hooks. We left all of the infrastructure to scale it back up if you want. So the idea now is that if you go to trail of bits.com slash buttercup, you can learn about the tool. have links to our GitHub repositories where it’s available and you can go, you can go download buttercup on your laptop and run it right now. And if you’ve got an API key, that’ll let you spend a hundred dollars. We can run a demo to show you that, we can find and patch a vulnerability and live.

CRob (16:51.496)
That’s easy.

Michael Brown (ToB) (16:53.164)
Yeah, so anyone can do this right now. So if you’re an organization that wants to use Buttercup, you can also use the hooks that we left back in to scale it up to the size of your organization, the budget that you have, and you run it at scale on your own software targets. So even users beyond the open source community, we want this to be used on closed source code too. So yeah, our goal is for, you you asked what we’re gonna do with it afterward. We made it open source, we want people to use it.

And even on top of that, we don’t want it to bit rot. So we actually are going to retain a pretty significant portion of our winnings of our $3 million prize. We’re going to retain a pretty significant portion of that. And we’re going to use it for ongoing maintenance. So we’re maintaining it. We’ve had people submit PRs that we’ve accepted. They’re tiny, know, it’s only been out for like a month, but you know, and then we’ve also made quite a few updates to the public version of Buttercup afterwards. So it’s actively maintained.

There’s money from the company’s putting its money where its mouth is. We’re actively maintaining it. The people who built it are part of the people who are maintaining it. We are taking contributions from the community. We hope they help us maintain it as well. And yeah, we’ve made it so anyone can use it. I think we’ve taken it about as far as we can possibly go in terms of reducing the barriers to adoption to the absolute minimum level for people to use Buttercup and leverage AI to help them find and patch vulnerabilities at scale.

CRob (18:16.716)
I love that approach. Thank you for doing that. How did you fit the giant check through the teller window?

Michael Brown (ToB) (18:22.574)
Fortunately, that check was a novelty and we did not actually have a larger problem than the AICC itself to solve afterward, which was getting paid. So yeah, we did have the comically large check, you know, taped up in our booth at the AICC village at DEF CON and it certainly attracted quite a few photographs from passersby.

CRob (18:26.716)
Ha ha ha!

CRob (18:31.964)
Yeah.

CRob (18:37.864)
Mm-hmm.

Michael Brown (ToB) (18:47.736)
I don’t know. think if you get on like social media or whatever and you look up AICC, if you see anything, there’s probably lots of pictures of me throwing a big smile up and you two thumbs up underneath the check that random people took.

CRob (18:59.464)
So you mentioned that Buttercup is all open source now. So if someone was interested in checking it out or possibly even contributing, where would they go do that?

Michael Brown (ToB) (19:07.564)
Yeah, so we have a GitHub organization. We’re Trail of Bits. You can find Buttercup there. You can also find our public archives of the old versions of Buttercup. So if you’re interested in what, like, the code that actually won the competitions, you can see what got us from the semifinals to the finals. You can see what won us second place in the finals. And you can also download and use the version that’s actively maintained that’ll run on your laptop. And all three of them are there. Their repository name is just Buttercup.

We are not the only people who love the princess bride. So there are other repositories named butter. Yeah. There are other, there are other repositories named butter cup on GitHub. So you might have to sift it a little bit, but yeah, basically it github.com slash trailer bits slash butter cup, I think is like 85 % of the URL there. don’t have it memorized, but, but yeah, you can find it publicly available and, along with a lot of other tools that, trailer bits has made over the years. So we encourage you to check some of those out as well. A lot of those are still actively maintained.

CRob (19:39.036)
That’s what it was.

Michael Brown (ToB) (20:03.72)
have a lot of community support. believe it or not at last, I counted like something like 1250 stars. buttercup is only like our fifth most popular tool that, that trail of bits has created. So, you know, we, we were quite, we were quite notable for creating some binary lifting tools that are up there. we have also some other tools that we’ve created recently for, parser security, analysis like that, like graftage.

And then also some kind of more conventional security tools like algo VPN, like still rank above buttercup. So as awesome as buttercup is, it’s like, it’s like only the fifth coolest tool that we’ve made as voted on by the community. So check out the other stuff while you’re there too. believe it or not, buttercup isn’t, isn’t, isn’t our most popular, offering.

CRob (20:51.56)
Pretty awesome statement to be able to say. That’s only our fifth most important tool.

Michael Brown (ToB) (20:53.966)
Yeah.

Michael Brown (ToB) (20:58.444)
I don’t know, you know, like personally, I’m kind of hoping that maybe we move up a few notches after people get time to like go find it and, you know, and start it. But, you know, we, we’ve, we’ve made some other really significant and really awesome contributions to the community, even outside of the AI cyber challenge. So I want to really, really stress all of that stuff is open source. We, you know, we, we aren’t just doing this because we have to, we actually care about the open source community. We want to secure the software infrastructure. We want people to use the tool and secure the software for, you know, before they, before they, you know,

Get it out there so that you we we we tackle this like kind of untackable problem of securing this massive ecosystem of code.

CRob (21:37.606)
Michael, thank you to Trey Labitz and your whole team for all the work you do, including the competition runner-up Buttercup, which did an amazing job by itself. Thank you for all your work, and thank you for joining us today.

Michael Brown (ToB) (21:52.802)
Yeah, thanks for having me. You one last thing to shout out there. If you’re an organization, you’re looking to employ Buttercup within your organization. Don’t be bashful about reaching out to us and asking about use cases for deploying within your organization. We know we’re happy to help out there. That’s probably an area that we focus on a little bit less in terms of getting this out the door for average folks to use or for individuals to use. So we’re definitely interested in helping to make sure Buttercup gets used.

Like I said, reach out to us, talk to us if you’re interested in Buttercup, we want to hear.

CRob (22:23.44)
Love it. All right. Have a great day.

Michael Brown (ToB) (22:25.678)
All right, thanks a lot.

What’s in the SOSS? Podcast #52 – S3E4 AIxCC Part 2 – From Skeptics to Believers: How Team Atlanta Won AIxCC by Combining Traditional Security with LLMs

By Podcast

Summary

In this 2nd episode in our series on DARPA’s AI Cyber Challenge (AIxCC), CRob sits down with Professor Taesoo Kim from Georgia Tech to discuss Team Atlanta’s journey to victory. Kim shares how his team – comprised of academics, world-class hackers, and Samsung engineers – initially skeptical of AI tools, underwent a complete mindset shift during the competition. He shares how they successfully augmented traditional security techniques like fuzzing and symbolic execution with LLM capabilities to find vulnerabilities in large-scale open source projects. Kim also reveals exciting post-competition developments, including commercialization efforts in smart contract auditing and plans to make their winning CRS accessible to the broader security community through integration with OSS-Fuzz.

This episode is part 2 of a four-part series on AIxCC:

Conversation Highlights

00:00 – Introduction
00:37 – Team Atlanta’s Background and Competition Strategy
03:43 – The Key to Victory: Combining Traditional and Modern Techniques
05:22 – Proof of Vulnerability vs. Finding Bugs
06:55 – The Mindset Shift: From AI Skeptics to Believers
09:46 – Overcoming Scalability Challenges with LLMs
10:53 – Post-Competition Plans and Commercialization
12:25 – Smart Contract Auditing Applications
14:20 – Making the CRS Accessible to the Community
16:32 – Student Experience and Research Impact
20:18 – Getting Started: Contributing to the Open Source CRS
22:25 – Real-World Adoption and Industry Impact
24:54 – The Future of AI-Powered Security Competitions

Transcript

Intro music & intro clip (00:00)

CRob (00:10.032)
All right, I’m very excited to talk to our next guest. I have Taesoo Kim, who is a professor down at Georgia Tech, also works with Samsung. And he got the great opportunity to help shepard Team Atlanta to victory in the AIxCC competition. Thank you for joining us. It’s a really pleasure to meet you.

Taesoo Kim (00:35.064)
Thank you for having me.

CRob (00:37.766)
So we were doing a bunch of conversations around the competition. I really want to showcase like the amazing early cutting edge work that you and the team have put together. So maybe, can you tell us what was your team’s approach? What was your strategy as you were kind of approaching the competition?

Taesoo Kim (00:59.858)
that’s a great question. Let me start with a little bit of a background.

CRob (00:)
Please.

Taesoo Kim (00:59)
Ourself, our team, Atlanta, we are multiple group of people in various backgrounds, including me as academics and researchers in security area. We also have world-class hackers in our team and some of the engineers from Samsung as well. So we have a little bit of background in various areas so that we bring our expertise.

Taesoo Kim (01:29.176)
to compete in this competition. It’s a two-year journey. We put a lot of effort, not just engineering side, we also tinkled with lot of research approach that we’ve been working on this area for a while. Said that, I think most important strategy that our team took is that, although it’s an AI competition…

CRob (01:51.59)
Mm-hmm.

Taesoo Kim (01:58.966)
…meaning that they promote the adoption of LLM-like techniques, we didn’t simply give up in traditional analysis technique that we are familiar with. It means we put a lot of effort to improve, like fuzzing is one of the great dynamic testing for finding vulnerability, and also traditional techniques like symbolic executions and concocted executions, even directed fuzzing. Although we suffer from lot of scalability issues in those tools, because one of themes of AIxCC is to find bugs in the real world.

And large-scale open source project. It means most of the traditional techniques do not scale in that level. We can analyze one function or a small number of code in the source code repository when it comes to, for example, Linux or Nginx. This is crazy amount of source code. Like building a whole graph in this gigantic repository itself is extremely hard. So that we start augmenting LLM in our pipeline.

One of the great examples of fuzzing is that when we are mutating input, although we leverage a lot of mutation techniques in the fuzzing side, we also leverage the understanding of LLM in a way that LLM also navigates the possibility of mutating places in the source code in a way that they can generate some of the dictionaries, providing vocabulary for fuzzer, and realize the input format that they have to mutate as well. So lot of augmentations of using LLM happen all over the places in traditional software analysis technique that we are doing.

CRob (03:43.332)
And do you feel that combination of using some of the newer techniques and fuzzing and some of the older, more traditional techniques, do you think that that was what was kind of unique and helped push you over the victory line and the cyber reasoning challenge?

Taesoo Kim (04:01.26)
It’s extremely hard to say which one contributed the most during the competition. But I want to emphasize that finding bugs in the location of the source code versus formulating input that trigger those vulnerability in our competition, what we call as proof of vulnerability. These two tasks are completely different. You can identify many bugs.

But unfortunately, in order to say this is truly the bug, you have to prove by yourself by showing or constructing the input that triggered the vulnerability. The difficulty of both tasks are, I would say people do not comprehend the challenges of formulating input versus finding a vulnerability in the source code. You can pinpoint without much difficulty the various places in the source code.

But in fact, that’s an easier job. In practice, more difficult challenge is identifying the input that actually reach the place that you like and trigger the vulnerability as a result. So we spend much more time how to construct the input correctly to show that we really prove the existence of vulnerability in the source.

CRob (05:09.692)
Mm-hmm.

CRob (05:22.94)
And I think that’s really a key to the competition as it happened versus just someone running LLM and scanners kind of randomly on the internet is the fact that you all were incented to and required to develop that fix and actually prove that these things are really vulnerable and accessible.

Taesoo Kim (05:33.718)
Exactly.

Taesoo Kim (05:42.356)
Exactly. That also highlights what practitioners care about. So you ended up having so many false positives in the security tools. No one cares. There are a of complaints about why we are not using security tools in the first place. So this is one of the important criteria of the competition. one of the strengths in traditional tools like buzzer and concord executor, everything centers around to reduce the false positives. The region people.

CRob (05:46.192)
Yes.

Taesoo Kim (06:12.258)
take Fuzzer in their workflow. So whenever Fuzzer says there is a vulnerability, indeed there is a vulnerability. There’s a huge difference. So that we start with those existing tool and recognize the places that we have to improve so that we can really scale up those traditional tool to find vulnerability in this large scale software.

CRob (06:36.568)
Awesome. As you know, the competition was a marathon, not a sprint. So you were doing this for quite some time. But as the competition was progressing, was there anything that surprised you in the team and kind of changed your thinking about the capabilities of these tools?

Taesoo Kim (06:51.502)
Ha

Taesoo Kim (06:55.704)
So as I mentioned before, we are hackers. We won Defqon CTF many times and we also won F1 competition in the past. So by nature, we are extremely skeptical about AI tool at the beginning of the competition. Two years ago, we evaluated every single existing LLM services with the benchmark that we designed. We realized they are all not usable at all.

CRob (07:09.85)
Mm-hmm.

Taesoo Kim (07:24.33)
not appropriate for the competition. Instead of spending time on improving those tools, which we felt like inferior at the beginning, so our motto at that time, our team, don’t touch those areas. We’re going to show you how powerful these traditional techniques are. So that’s why we progressed the semi-final. We did pretty well. We found many of the bugs by using all the traditional tools that we’ve been working on. But like…

Immediately after semifinal, everything changed. We reevaluated the possibility of adopting LLM. At that time, just removing or obfuscating some of the tokens in the repository, the LLM couldn’t even reason anything about. But suddenly, near or around semifinal, something happened. We realized that even after we inject or

If you think of it this way, there is a token, and you replace this token with meaningless words. LLM previously all confused about all these synthetic structures of the source code, but now, or on semifinal, they really understand. Although we tried to fool many times, you really catch up the idea, which is a source code that they never saw before, never used in the training, because we intentionally create this source code for the evaluation.

We start realizing that we actually understand. We shock everybody. So we start realizing that there are so many places, if that’s the case, there are so many places that we can improve. Right? So that’s the moment that we change our mindset. So now everything about LLM, everything about the new Asian architectures, so that we ended up putting humongous amount of efforts creating various architectures of Asian design that we have.

Also, we replaced some of software analysis techniques with LLM as well, surprisingly. For example, symbolic execution is a good example. It’s extremely hard to scale. Whenever you execute one instruction at a time, you have to create the constraint around them. But one of the big challenges in real-world software, there are so many, I would say, hard-to-analyze functions exist. Meaning that, for example, there is a

Taesoo Kim (09:46.026)
Even NGINX as an example, we thought that they probably compared the string to string at a time. But the way they perform string compare in NGINX, they map this string or do the hashing so that they can compare the hash value. Fudger, another symbolic executor, is extremely bad at those. If you hit one hashing function, you’re screwed. There are so many constraints that there is no way we can revert back by definition.

There’s no way. But if you think about how to overcome these situations by using LLM, the LLM can recognize that this is a hashing function. We don’t actually have to create a constraint around, hey, what about we replace with identity functions? It’s something that we can easily divert by using symbolic execution. So then we start recognizing the possibility of LLM role in the symbolic execution. Now see that.

Smaller execution can scale to the large software right now. So I think this is a pretty amazing outcome of the competition.

CRob (10:53.11)
Awesome. So again, the competition completed in August. So what plans do you have? What plans does the team have for your CRS now that the competition’s over?

Taesoo Kim (10:58.446)
Thank

Taesoo Kim (11:02.318)
I think that’s a great question. Many of tech companies approach our team. Some of them recently joined, other big companies. And many of our students want to quit the PhD program and start a company. For good reasons, right?

CRob (11:14.848)
I bet.

Taesoo Kim (11:32.766)
One of the team, my four PhD students recently formed and looking for commercialization opportunity. Not in the traditional cyber infrastructure we are looking at through the DARPA, but they spotted the possibility in smart contracts. that smart contracts and modernized financial industries like stable coins and whatnot

where they can apply the AI XTC like techniques in finding vulnerability in those areas. So that instead of analyzing everything by human auditor, you can analyze everything by using LLM or agents and similar techniques that we developed for AI XTC so that you can reduce the auditing time significantly. In order to get some auditing in the smart contract, traditionally you have to wait for two weeks.

In the worst case, even months with a ridiculous amount of cost. Typically, in order to get one auditing for the smart contract, $20,000 or $50,000 per case. But in fact, you can reduce down the amount of auditing time by, I’ll say, a few hours by day. This speed, the potential benefit of achieving this speed is you really open up

CRob (12:40.454)
Mm-hmm.

CRob (12:47.836)
Wow.

Taesoo Kim (12:58.186)
amazing opportunity in this area. So you can automate the auditing, you can increase the frequency of auditing in the smart contract area. Not only that we thought there is a possibility for even more like compliance checkings of the smart contracts, there’s so many opportunities that we can play immediately by using ARCC systems. That’s the one area that we’re looking at. Another one is more traditional area.

CRob (13:00.347)
Mm-hmm.

Taesoo Kim (13:25.07)
what we call cyber infrastructure, like hospitals and some government sectors. They really want to analyze, but unfortunately, or fortunately though, there are other opportunities that in ARCC, we analyze everything by source code, but they don’t have access to them. So we are creating the pipeline that given a binary or execution only environment, how to convert them.

CRob (13:28.828)
Mm-hmm.

CRob (13:38.236)
Mm-hmm.

CRob (13:49.569)
Taesoo Kim (13:52.416)
in a way that we can still leverage the existing infrastructure that we have for AICC. More interestingly, they don’t have access to the internet when they’re doing pen testings or analyzing those, so that we start incorporating some of our open source model as part of our systems. These are two commercialization efforts that we’re thinking and many of my students are currently

CRob (13:57.67)
That’s very clever.

CRob (14:05.5)
Yeah.

CRob (14:13.564)
It’s awesome.

CRob (14:20.366)
And I imagine that this is probably amazing source material for dissertations and the PhD work, right?

Taesoo Kim (14:29.242)
Yes, yes. Last two years, we are purely focused on ARCC. Our motto is that we don’t have time for publication. It’s just win the competition. Everything is coming after. This is the moment that we actually, I think we’re going to release our Tech Report. It’s over 150 pages. Next week, around next week. So we have a draft right now, but we are still publishing.

CRob (14:39.256)
Yeah.

CRob (14:51.94)
Wow.

Taesoo Kim (14:58.51)
for publication so that other people not just like source code okay that’s great but you need some explanation why you did this many of the sources is for the competition right so that the core pieces might be a little bit different for like daily usage of normal developers and operator so we kind of create a condensed technical material for them to understand

Not only that, we have a plan to make it more accessible, meaning that currently our CRS implementation tightly bound to the competition environment. Meaning that we have a crazy amount of resources in Azure side, everything is deployed and better tested. But unfortunately, most of the people, including ourselves, we don’t have resources. Like the competition have about

80,000 cloud credit that we have to use. So no one has that kind of resource. It’s not like that, not if you’re not a company. But we want to apply this one for your project in the smaller scale. That’s what we are currently working on. So discarding all these competition dependent parameter from the source code, making more containable so that you can even launch our CRS in your local environment.

This is one of the big, big development effort that we are doing right now in our lab.

CRob (16:32.155)
That’s awesome. take me a second and thinking about this from the students perspective that participated. What kind of an experience was it getting to work with professors such as yourself and then actual professional researchers and hackers? What do you see the students are going to take away from this experience?

Taesoo Kim (16:53.846)
I think exposing to the latest model because we are tightly collaborating with this OpenAI and Gemini, we are really exposed to those latest model. If you’re just working on the security, not tightly working for LLM, you probably don’t appreciate that much. But through the competition, everyone’s mindset change. And then we spend time.

and deeply take a look in what’s possible, what’s not, we now have a great sense of what type of problem we have to solve, even in the research side. And now, suddenly, after this competition, every single security project, security research that we are doing at Georgia Tech is based on LLF. Even more surprising to hear that we have some decompilation project that we are doing, the traditional possible security research you can read.

CRob (17:42.448)
Ha ha.

Taesoo Kim (17:52.162)
binary analysis, malware analysis, decompilations, crash analysis, whatnot. Now everything is LLM. Now we realize LLM is much better at decompiling than traditional tools like IDEA and Jydra. So I think these are the type of research that we previously thought impossible. We’re probably not even thinking about applying LLM. Because we spend our lifetime working on decompiling.

CRob (17:53.68)
Mm.

CRob (17:59.068)
Yeah.

Taesoo Kim (18:22.318)
But at a certain point, we realized that LLM is just doing better than what we’ve been working on. Just one day. It’s a complete mind change. In traditional program analysis perspective, many things are empty completely. There’s no way you can solve it in an easier way. So they’re not spending time. That’s our typical mindset. But now, it works in practice, amazingly.

CRob (18:29.574)
Yeah.

Taesoo Kim (18:51.807)
how to improve what we thought previously impossible by using another one. It’s the key.

CRob (18:57.404)
That’s awesome. It’s interesting, especially since you stated initially when you went into the competition, you were very skeptical about the utility of LLMs. So that’s great that you had this complete reversal.

Taesoo Kim (19:04.238)
Thank

Yeah, but I think I like to emphasize one of the problems of LLM though, it’s expensive, it’s slow in traditional sense, you have to wait a few seconds or a few minutes in certain cases like reasoning model or whatnot. So tightly binding your performance with this performance lagging component in the entire systems is often challenging.

CRob (19:17.648)
Yes.

CRob (19:21.82)
Mm-hmm.

Taesoo Kim (19:39.598)
and then just talking. But another benefit of everything is text. There’s no proper API, just text. There’s no sophisticated way to leverage it, just text. I don’t know, you’re probably familiar with all these security issues, potentially with unstructured input. It’s similar to cross-site scripting in the web space. There’s so many problems you can imagine.

CRob (19:51.984)
Okay, yeah.

CRob (20:01.979)
Mm-hmm.

Taesoo Kim (20:08.11)
But as far as you can use in a well-contained manner in the right way, we believe there are so many opportunities we can get from it.

CRob (20:18.876)
Great. So now that your CRS has been released as open source, if someone from our community was interested in joining and maybe contributing to that, what’s the best way somebody could get started and get access?

Taesoo Kim (20:28.494)
Mm-hmm.

So we’re going to release non-competition version very soon, along with several documents, we call standardization effort that we and other teams are doing right now. So we define non-competition CRS interface so that you can tightly, as far as you implement those interface, our goal is to mainstream OSS browser together with Google team.

CRob (20:36.369)
Mm-hmm.

CRob (20:58.524)
Mm-hmm.

Taesoo Kim (20:59.086)
so that you can put your CRS as part of OSS Fuzz mainstream, so that we can make it much easier, so that everyone can evaluate one at a time in their local environment as part of OSS Fuzz project. So we’re gonna release the RFC document pretty soon through our website, so that everyone can participate and share their opinion, what are the features that they think we are missing, that we’d love to hear about.

CRob (21:03.74)
Thanks.

CRob (21:18.001)
Mm-hmm.

Taesoo Kim (21:26.502)
And then after that, a month period, we’re going to release our local version so that everyone can start using. And with a very permissive license, everyone can take advantage of the public research, including companies.

CRob (21:34.78)
Awesome.

CRob (21:42.692)
It’s, I’m just amazed. when I came into this, partnering with our friends at DARPA, I was initially skeptical as well. And as I was sitting there watching the finals announced, it was just amazing. Kind of this, the innovative innovation and creativity that all the different teams displayed. again, congratulations to your team, all the students and the researchers and everyone that participated.

Taesoo Kim (21:59.79)
Mm-hmm.

CRob (22:12.6)
Well done. Do you have any parting thoughts? know, as you’re think, as we move on, do you have any kind of words of wisdom you want to share with the community or any takeaways for people curious to get in this space?

Taesoo Kim (22:25.486)
Oh, regarding commercialization, one thing I also like to mention is that in Samsung, we already took the open source version of the CRS, start applying the internal project and open source Samsung project immediately after. So we started seeing the benefit of applying the CRS in the real world immediately after the competition. A lot of people think that competition is just for competition or show

CRob (22:38.108)
Mm-hmm.

Taesoo Kim (22:55.032)
But in fact, it’s not. Everyone in industry, including at Tropic Meta and OpenAI, they all want to adopt those technologies behind the scene. And Amazon, we also working together with Amazon AWS team so that they want to support the deployment of our systems in AWS environment as well. So everyone can just one click, they can launch the systems. And they mentioned there are several.

CRob (22:55.036)
Mm-hmm.

Taesoo Kim (23:24.023)
government-backed They explicitly request to launch our CRS in their environment.

CRob (23:31.1)
I imagine so. Well, again, kudos to the team. Congratulations. It’s amazing. I love to see when researchers have these amazing creative ideas and actually are able to add actual value. And it’s great to hear that Samsung was immediately able to start to get value out of this work. And I hopefully other folks will do the same.

Taesoo Kim (23:55.18)
Yeah, exactly. I think regarding one of wisdom or general advice in general is that this competition based innovation, particularly in academic or involvement like startups or not, because of this venue, so including ourselves and startup people and other team members put their life

on this competition. It’s an objective metric, head-to-head competitions. We don’t care about your background. Just win, right? There’s your objective score. Your job is fine and fix it, I think this competition really drives a lot of efforts behind the scene in our team. We are motivated because of this entire competition is represented in broader audience. I think this is really a way to drive the innovation.

CRob (24:26.46)
Mm-hmm.

CRob (24:32.57)
Yes.

CRob (24:36.709)
Mm-hmm.

Taesoo Kim (24:54.904)
to get some public attention beyond Alphi as well. So I think we really want to see other type of competition in this space. And in the longer future, you probably see based on the current trend, CTF competitions like that, maybe not just CTF, it’s Asian-based CTF, no human involved or the Asians are now attacking each other and solving CTF challenge.

CRob (24:58.524)
Excellent.

CRob (25:19.59)
Mm-hmm.

Taesoo Kim (25:24.846)
This is not a five-year no-vote. It’s going to happen in two years or shortly. Even in this year’s live CTF, one of the teams actually leveraged Asian systems and Asians actually solved the competition quicker than humans. So think we’re going to see those types of events and breakthroughs more often than

CRob (25:55.292)
I used to be a judge at the collegiate cyber competition for one of our local schools. And I think I see a lot of interesting applicability kind of using this as to help them to teach the students that you have an aggressive attacker is doing these different techniques and it’s able to kind of apply some of these learnings that you all have. It’s really exciting stuff.

Taesoo Kim (26:00.142)
Mm-hmm.

Taesoo Kim (26:15.47)
I think one of the interesting quote from, I don’t know who actually said, but in the AI space, someone mentioned that there will be one person, one billion market cap company appear because of LLN or because of AI in general. But if you see the CTF, currently most of the team has minimum 50 people or 100 people competing each other. We’re going to see very soon.

one person or maybe five people with the help of those AI tools and they’re going to compete. Or human are just assisting AI in a way that, hey, could you bring up the Raspberry Pi for me or set up so that human just helping LLN or helping AI in general so that AI can compete. So I think we’re going to see some interesting thing happening pretty soon in our company for sure.

CRob (26:59.088)
Mm-hmm. Yeah.

CRob (27:11.804)
I agree. Well, again, Taesoo, thank you for your time. Congratulations to the team. And that is a wrap. Thank you very much.

Taesoo Kim (27:22.147)
Thank you so much.

SBOMs in the Era of the CRA: Toward a Unified and Actionable Framework

SBOMs in the Era of the CRA: Toward a Unified and Actionable Framework

By Blog, EU Cyber Resilience Act, Global Cyber Policy, Guest Blog

By Madalin Neag, Kate Stewart, and David A. Wheeler

In our previous blog post, we explored how the Software Bill of Materials (SBOM) should not be a static artifact created only to comply with some regulation, but should be a decision ready tool. In particular, SBOMs can support risk management. This understanding is increasing thanks to the many who are trying to build an interoperable and actionable SBOM ecosystem. Yet, fragmentation, across formats, standards, and compliance frameworks, remains the main obstacle preventing SBOMs from reaching their full potential as scalable cybersecurity tools. Building on the foundation established in our previous article, this post dives deeper into the concrete mechanisms shaping that evolution, from the regulatory frameworks driving SBOM adoption to the open source initiatives enabling their global interoperability. Organizations should now treat the SBOM not merely as a compliance artifact to be created and ignored, but as an operational tool to support security and augment asset management processes to ensure vulnerable components are identified and updated in a timely proactive way. This will require actions to unify various efforts worldwide into an actionable whole.

Accelerating Global Policy Mandates

The global adoption of the Software Bill of Materials (SBOM) was decisively accelerated by the U.S. Executive Order 14028 in 2021, which mandated SBOMs for all federal agencies and their software vendors. This established the SBOM as a cybersecurity and procurement baseline, reinforced by the initial NTIA (2021) Minimum Elements (which required the supplier, component name, version, and relationships for identified components). Building on this foundation, U.S. CISA (2025) subsequently updated these minimum elements, significantly expanding the required metadata to include fields essential for provenance, authenticity, and deeper cybersecurity integration. In parallel, European regulatory momentum is similarly mandating SBOMs for market access, driven by the EU Cyber Resilience Act (CRA). Germany’s BSI TR-03183-2 guideline complements the CRA by providing detailed technical and formal requirements, explicitly aiming to ensure software transparency and supply chain security ahead of the CRA’s full enforcement.

To prevent fragmentation and ensure these policy mandates translate into operational efficiency, a wide network of international standards organizations is driving technical convergence at multiple layers. ISO/IEC JTC 1/SC 27 formally standardizes and oversees the adoption of updates to ISO/IEC 5962 (SPDX), evaluating and approving revisions developed by the SPDX community under The Linux Foundation. The standard serves as a key international baseline, renowned for its rich data fields for licensing and provenance and support for automation of risk analysis of elements in a supply chain. Concurrently, OWASP and ECMA International maintain ECMA-424 (OWASP CycloneDX), a recognized standard optimized specifically for security automation and vulnerability linkage. Within Europe, ETSI TR 104 034, the “SBOM Compendium,” provides comprehensive guidance on the ecosystem, while CEN/CENELEC is actively developing the specific European standard (under the PT3 work stream) that will define some of the precise SBOM requirements needed to support the CRA’s vulnerability handling process for manufacturers and stewards.

Together, these initiatives show a clear global consensus: SBOMs must be machine-readable, verifiable, and interoperable, supporting both regulatory compliance over support windows and real-time security intelligence. This global momentum set the stage for the CRA, which now transforms transparency principles into concrete regulatory obligations.

EU Cyber Resilience Act (CRA): Establishing a Legal Requirement

The EU Cyber Resilience Act (CRA) (Regulation (EU) 2024/2847) introduces a legally binding obligation for manufacturers to create, 1maintain, and retain a Software Bill of Materials (SBOM) for all products with digital elements marketed within the European Union. This elevates the SBOM from a voluntary best practice to a legally required element of technical documentation, essential for conformity assessment, security assurance, and incident response throughout a product’s lifecycle. In essence, the CRA transforms this form of software transparency from a recommendation into a condition for market access.

Core Obligations for Manufacturers under the CRA include:

  • SBOM Creation – Manufacturers must prepare an SBOM in a commonly used, machine-readable format [CRA I(II)(1)], such as SPDX or CycloneDX.
  • Minimum Scope – The SBOM must cover at least the top-level dependencies of the product  [CRA I(II)(1)]. While this is the legal minimum, including deeper transitive dependencies is strongly encouraged.
  • Inclusion & Retention – The SBOM must form part of the mandatory technical documentation and be retained for at least ten years (Art.13) after the product has been placed on the market.
  • Non-Publication Clause – The CRA requires the creation and availability of an SBOM but does not mandate its public disclosure Recital 77. Manufacturers must provide the SBOM upon request to market surveillance authorities or conformity assessment bodies for validation, audit, or incident investigation purposes.
  • Lifecycle Maintenance – The SBOM must be kept up to date throughout the product’s maintenance and update cycles, ensuring that any component change or patch is reflected in the documentation Recital 90.
  • Vulnerability Handling – SBOMs provide the foundation for identifying component vulnerabilities under the CRA, while full risk assessment requires complementary context such as exploitability and remediation data. (Annex I)

The European Commission is empowered, via delegated acts under Article 13(24), to further specify the format and required data elements of SBOMs, relying on international standards wherever possible. To operationalize this, CEN/CENELEC is developing a European standard under the ongoing PT3 work stream, focused on vulnerability handling for products with digital elements and covering the essential requirements of Annex I, Part II of the CRA. Its preparation phase includes dedicated sub-chapters on formalizing SBOM structures, which will serve as the foundation for subsequent stages of identifying vulnerabilities and assessing related threats (see “CRA workshop ‘Deep dive session: Vulnerability Handling” 1h36m35s).

In parallel, the U.S. Cybersecurity and Infrastructure Security Agency (CISA) continues to shape global SBOM practices through its “Minimum Elements” framework and automation initiatives. These efforts directly influence Europe’s focus on interoperability and structured vulnerability handling under the CRA. This transatlantic alignment helps ensure SBOM data models and processes evolve toward a consistent, globally recognized baseline. CISA recently held a public comment window ending October 2, 2025 on a draft version of a revised set of minimum elements, and is expected to publish an update to the original NTIA Minimum Elements in the coming months.

Complementing these efforts, Germany’s BSI TR-03183-2 provides a more detailed technical specification than the original NTIA baseline, introducing requirements for cryptographic checksums, license identifiers, update policies, and signing mechanisms. It already serves as a key reference for manufacturers preparing to meet CRA compliance and will likely be referenced in the forthcoming CEN/CENELEC standard together with ISO/IEC and CISA frameworks. Together, the CRA and its supporting standards position Europe as a global benchmark for verifiable, lifecycle aware SBOM implementation, bridging policy compliance with operational security.

Defining the Unified Baseline: Convergence in Data Requirements

The SBOM has transitioned from a best practice into a legal and operational requirement due to the European Union’s Cyber Resilience Act (CRA). While the CRA mandates the SBOM as part of technical documentation for market access, the detailed implementation is guided by documents like BSI TR-03183-2. To ensure global compliance and maximum tool interoperability, stakeholders must understand the converging minimum data requirements. To illustrate this concept, the following comparison aligns the minimum SBOM data fields across the NTIA, CISA, BSI, and ETSI frameworks, revealing a shared move toward completeness, verifiability, and interoperability.

Data Field NTIA (U.S., 2021 Baseline) CISA’s Establishing a Common SBOM (2024) BSI TR-03183-2 (Germany/CRA Guidance) (2024) ETSI TR 104 034 (Compendium) (2025)
Component Name Required Required Required Required
Component Version Required Required Required Required
Supplier Required Required Required Required
Unique Identifier (e.g., PURL, CPE) Required Required  Required Required
Cryptographic Hash Recommended Required Required  Optional
License Information Recommended Required  Required Optional
Dependency Relationship Required  Required  Required Required
SBOM Author Required Required  Required  Required
Timestamp (Date of Creation) Required Required Required Required
Tool Name / Generation Context Not noted Not noted Required  Optional
Known Unknowns Declaration Optional Required Optional Optional
Common Format Required  Not noted Required  Required
Depth Not noted Not noted Not noted Optional

 

  • NTIA (2021): Established the basic inventory foundation necessary to identify components (who, what, and when).
  • CISA (2024): Framing Software Component Transparency establishes a structured maturity model by defining minimum, recommended, and aspirational SBOM elements, elevating SBOMs from simple component lists to verifiable security assets. CISA is currently developing further updates expected in 2025 to extend these principles into operational, risk-based implementation guidance.
  • BSI TR-03183-2: Mirrors the CISA/NTIA structure but mandates strong integrity requirements (Hash, Licenses) from the compliance perspective of the CRA, confirming the strong global convergence of expectations.
  • ETSI TR 104 034: As a technical compendium, it focuses less on specific minimum fields and more on the necessary capabilities of a functional SBOM ecosystem (e.g., trust, global discovery, interoperability, and lifecycle management).

The growing alignment across these frameworks shows that the SBOM is evolving into a globally shared data model, one capable of enabling automation, traceability, and trust across the international software supply chain.

Dual Standard Approach: SPDX and CycloneDX

The global SBOM ecosystem is underpinned by two major, robust, and mature open standards: SPDX and CycloneDX. Both provide a machine-processable format for SBOM data and support arbitrary ecosystems. These standards, while both supporting all the above frameworks, maintain distinct origins and strengths, making dual format support a strategic necessity for global commerce and comprehensive security.

The Software Package Data Exchange (SPDX), maintained by the Linux Foundation, is a comprehensive standard formally recognized by ISO/IEC 5962 in 2021. Originating with a focus on capturing open source licensing and intellectual property in a machine readable format, SPDX excels in providing rich, detailed metadata for compliance, provenance, legal due diligence, and supply chain risk analysis. Its strengths lie in capturing complex license expressions (using the SPDX License List and SPDX license expressions) and tracking component relationships in great depth, together with its extensions to support linkage to security advisories and vulnerability information, making it the preferred standard for rigorous regulatory audits and enterprise-grade software asset management. As the only ISO-approved standard, it carries significant weight in formal procurement processes and traditional compliance environments.  It supports multiple formats (JSON, XML, YAML, Tag/Value, and XLS) with free tools to convert between the formats and promote interoperability. 

The SPDX community has continuously evolved the specification since its inception in 2010, and most recently has extended it to a wider set of metadata to support modern supply chain elements, with the publication of SPDX 3.0 in 2024. This update to the specification contains additional fields & relationships to capture a much wider set of use cases found in modern supply chains including AI. These additional capabilities are captured as profiles, so that tooling only needs to understand the relevant sets, yet all are harmonized in a consistent framework, which is suitable for supporting product line management Fields are organized into a common “core”, and there are “software” and “licensing” profiles, which cover what was in the original specification ISO/IEC 5962. In addition there is now a “security” profile, which enables VEX and CSAF use cases to be contained directly in exported documents, as well as in databases  There is also a “build” profile which supports high fidelity tracking of relevant build information for “Build” type SBOMs. SPDX 3.0 also introduced a “Data” and “AI” related profiles, which made accurate tracking of AI BOMs possible, with support for all the requirements of the EU AI Act (see table in linked report).  As of writing, the SPDX 3.0 specification is in the final stages of being submitted to ISO/IEC for consideration. 

CycloneDX, maintained by OWASP and standardized as ECMA-424, is a lightweight, security-oriented specification for describing software components and their interdependencies. It was originally developed within the OWASP community to improve visibility into software supply chains. The specification provides a structured, machine-readable inventory of elements within an application, capturing metadata such as component versions, hierarchical dependencies, and provenance details. Designed to enhance software supply chain risk management, CycloneDX supports automated generation and validation in CI/CD environments and enables early identification of vulnerabilities, outdated components, or licensing issues. Besides its inclusion with SPDX in the U.S. federal government’s 2021 cybersecurity Executive Order, its formal recognition as an ECMA International standard in 2023 underscore its growing role as a globally trusted format for software transparency. Like SPDX, CycloneDX has continued to evolve since formal standardization and the current release is 1.7, released October 2025.

The CycloneDX specification continues to expand under active community development, regularly publishing revisions to address new use cases and interoperability needs. Today, CycloneDX extends beyond traditional SBOMs to support multiple bill-of-materials types, including Hardware (HBOM), Machine Learning (ML-BOM), and Cryptographic (CBOM), and can also describe relationships with external SaaS and API services. It integrates naturally with vulnerability management workflows through formats such as VEX, linking component data to exploitability and remediation context. With multi-format encoding options (JSON, XML, and Protocol Buffers) and a strong emphasis on automation.

OpenSSF and the Interoperability Toolkit

The OpenSSF has rapidly become a coordination hub uniting industry, government, and the open source community around cohesive SBOM development. Its mission is to bridge global regulatory requirements, from the EU’s Cyber Resilience Act (CRA) to CISA’s Minimum Elements and other global mandates, with practical, open source technical solutions. This coordination is primarily channeled through the “SBOM Everywhere” Special Interest Group (SIG), a neutral and open collaboration space that connects practitioners, regulators, and standards bodies. The SIG plays a critical role in maintaining consistent semantics and aligning development efforts across CISA, BSI, NIST, CEN/CENELEC, ETSI, and the communities implementing CRA-related guidance. Its work ensures that global policy drivers are directly translated into unified, implementable technical standards, helping prevent the fragmentation that so often accompanies fast-moving regulation.

A major focus of OpenSSF’s work is on delivering interoperability and automation tooling that turns SBOM policy into practical reality:

  • Protobom tackles one of the field’s toughest challenges – format fragmentation – by providing a format-agnostic data model capable of seamless, lossless conversion between SPDX, CycloneDX, and emerging schemas.
  • BomCTL builds on that foundation and offers a powerful, developer-friendly command line utility designed for CI/CD integration. It handles SBOM generation, validation, and transformation, allowing organizations to automate compliance and security workflows without sacrificing agility. Together, Protobom and Bomctl  embody the principles shared by CISA and the CRA: ensuring that SBOM data is modular, transparent, and portable across tools, supply chains, and regulatory environments worldwide.

Completing this ecosystem is SBOMit, which manages the end-to-end SBOM lifecycle. It provides built-in support for creation, secure storage, cryptographic signing, and controlled publication, embedding trust, provenance, and lifecycle integrity directly into the software supply chain process. These projects are maintained through an open, consensus-driven model, continuously refined by the global SBOM community. Central to that collaboration are OpenSSF’s informal yet influential “SBOM Coffee Club” meetings, held every Monday, where developers, vendors, and regulators exchange updates, resolve implementation challenges, and shape the strategic direction of the next generation of interoperable SBOM specifications.

OpenSSF’s strategic support for both standards – SPDX and CycloneDX – is vital for the entire ecosystem. By contributing to and utilizing both formats, most visibly through projects like Protobom and BomCTL which enable seamless, lossless translation between the two, OpenSSF ensures that organizations are not forced to choose between SPDX and CycloneDX. This dual format strategy satisfies the global requirement for using both formats  and maximizes interoperability, guaranteeing that SBOM data can be exchanged between all stakeholders, systems, and global regulatory jurisdictions effectively.

A Shared Vision for Action

Through this combination of open governance and pragmatic engineering, OpenSSF is defining not only how SBOMs are created and exchanged, but how the world collaborates on software transparency.

The collective regulatory momentum, anchored by the EU Cyber Resilience Act (CRA) and the U.S. Executive Order 14028, supported by the CISA 2025 Minimum Elements revisions, has cemented the global imperative for Software Bill of Materials (SBOM). These frameworks illustrate deep global alignment: both the CRA and CISA emphasize that SBOMs must be structured, interoperable, and operationally useful for both compliance and cybersecurity. The CRA establishes legally binding transparency requirements for market access in Europe, while CISA’s work encourages SBOMs within U.S. federal procurement, risk management, and vulnerability intelligence workflows. Together, they define the emerging global consensus: SBOMs must be complete enough to satisfy regulatory obligations, yet structured and standardized enough to enable automation, continuous assurance, and actionable risk insight. The remaining challenge is eliminating format and semantic fragmentation to transform the SBOM into a universal, enforceable cybersecurity control.

Achieving this global scalability requires a unified technical foundation that bridges legal mandates and operational realities. This begins with Core Schema Consensus, adopting the NTIA 2021 baseline and extending it with critical metadata for integrity (hashes), licensing, provenance, and generation context, as already mandated by BSI TR-03183-2 and anticipated in forthcoming CRA standards. To accommodate jurisdictional or sector-specific data needs, the CISA “Core + Extensions” model provides a sustainable path: a stable global core for interoperability, supplemented by modular extensions for CRA, telecom, AI, or contractual metadata. Dual support for SPDX and CycloneDX remains essential, satisfying the CRA’s “commonly used formats” clause and ensuring compatibility across regulatory zones, toolchains, and ecosystems.

Ultimately, the evolution toward global, actionable SBOMs depends on automation, lifecycle integrity, and intelligence linkage. Organizations should embed automated SBOM generation and validation (using tools such as Protobom, BomCTL, and SBOMit) into CI/CD workflows, ensuring continuous updates and cryptographic signing for traceable trust. By connecting SBOM information with vulnerability data in internal databases, the SBOM data becomes decision-ready, capable of helping identify exploitable or end-of-life components and driving proactive remediation. This operational model, mirrored in the initiatives of Japan (METI), South Korea (KISA/NCSC), and India (MeitY), reflects a decisive global movement toward a single, interoperable SBOM ecosystem. Continuous engagement in open governance forums, ISO/IEC JTC 1, CEN/CENELEC, ETSI, and the OpenSSF SBOM Everywhere SIG, will be essential to translate these practices into a permanent international standard for software supply chain transparency.

Conclusion: From Compliance to Resilient Ecosystem

The joint guidance “A Shared Vision of SBOM for Cybersecurity” insists on these global synergies under the endorsement of 21 international cybersecurity agencies. Describing the SBOM as a “software ingredients list,” the document positions SBOMs as essential for achieving visibility, building trust, and reducing systemic risk across global digital supply chains. That document’s central goal is to promote immediate and sustained international alignment on SBOM structure and usage, explicitly urging governments and industries to adopt compatible, unified systems rather than develop fragmented, country specific variants that could jeopardize scalability and interoperability.

The guidance organizes its vision around four key, actionable principles aimed at transforming SBOMs from static compliance documents into dynamic instruments of cybersecurity intelligence:

  • Modular Architecture – Design SBOMs around a Core Schema Baseline that satisfies essential minimum elements (component identifiers, supplier data, versioning) and expand it with optional extensions for domain specific or regulatory contexts (e.g., CRA compliance, sectoral risk requirements). Support and improve OSS tooling that enables processing and sharing of this data in a variety of formats.
  • Trust and Provenance – Strengthen authenticity and metadata transparency by including details about the generation tools, context, and version lineage, ensuring trust in the accuracy and origin of SBOM data.
  • Actionable Intelligence – Integrate SBOM data with vulnerability and incident-response frameworks such as VEX and CSAF, converting static component inventories into decision ready, risk aware security data.
  • Open Governance – Encourage sustained public–private collaboration through OpenSSF, ISO/IEC, CEN/CENELEC, and other international bodies to maintain consistent semantics and prevent fragmentation.

This Shared Vision complements regulatory frameworks like the EU Cyber Resilience Act (CRA) and reinforces the Open Source Security Foundation’s (OpenSSF) mission to achieve cross-ecosystem interoperability. Together, they anchor the future of SBOM governance in openness, modularity, and global collaboration, paving the way for a truly unified software transparency model.

The primary challenge to achieving scalable cyber resilience lies in the fragmentation of the SBOM landscape. Global policy drivers, such as the EU Cyber Resilience Act (CRA), the CISA-led Shared Vision of SBOM for Cybersecurity, and national guidelines like BSI TR-03183, have firmly established the mandate for transparency. However, divergence in formats, semantics, and compliance interpretations threatens to reduce SBOMs to static artifacts generated only because some regulation requires that they be created, rather than dynamic assets that can aid in security. Preventing this outcome requires a global commitment to a unified SBOM framework, a lingua franca capable of serving regulatory, operational, and security objectives simultaneously. This framework must balance policy diversity with technical capability universality, ensuring interoperability between European regulation, U.S. federal procurement mandates, and emerging initiatives in Asia and beyond. The collective engagement of ISO/IEC, ETSI, CEN/CENELEC, BSI, and the OpenSSF provides the necessary multistakeholder governance to sustain this alignment and accelerate convergence toward a common foundation.

Building such a framework depends on two complementary architectural pillars: Core Schema Consensus and Modular Extensions. The global core should harmonize essential SBOM elements, and CRA’s legal structure, into a single, mandatory baseline. Sectoral or regulatory needs (e.g., AI model metadata, critical infrastructure tagging, or crypto implementation details) should be layered through standardized modular extensions to prevent the ecosystem from forking into incompatible variants. To ensure practical interoperability, this architecture must rely on open tooling and universal machine-processable identifiers (such as PURL, CPE, SWID, and SWHID) that guarantee consistent and accurate linkage. Equally crucial are trust and provenance mechanisms: digitally signed SBOMs, verifiable generation context, and linkage with vulnerability data. These collectively transform the SBOM from a passive unused inventory into an actively maintained, actionable cybersecurity tool, enabling automation, real-time risk management, and genuine international trust in the digital supply chain, realizing the OpenSSF vision of “SBOMs everywhere.”

SBOMs have transitioned from a best practice to a requirement in many situations. The foundation established by the U.S. Executive Order 14028 has been legally codified by the EU’s Cyber Resilience Act (CRA), making SBOMs a non-negotiable legal requirement for accessing major markets. This legal framework is now guided by a collective mandate, notably by the Shared Vision issued by CISA, NSA, and 19 international cybersecurity agencies, which provides the critical roadmap for global alignment and action. Complementary work by BSI, ETSI, ISO/IEC, and OpenSSF now ensures these frameworks converge rather than compete.

To fully achieve global cyber resilience, SBOMs must not be merely considered as a compliance artifact to be created and ignored, but instead as an operational tool to support security and augment asset management processes. Organizations must:

  • Integrate and Automate SBOMs: Achieve full lifecycle automation for SBOM creation and continuous updates, making it a seamless part of the DevSecOps pipeline.
  • Maximize SBOM Interoperability: Mandate the adoption of both SPDX and CycloneDX to satisfy divergent global and regulatory requirements and ensure maximum tool compatibility.
  • Operationalize with Open Source Software Leverage OpenSSF tools (Protobom, BomCTL, SBOMit) to rapidly implement and scale technical best practices.
  • Drive Shared Governance for SBOMs: Actively engage in multistakeholder governance initiatives (CEN/CENELEC, ISO/IEC, CISA, ETSI,  OpenSSF) to unify technical standards and policy globally.
  • Enable Decision-Ready Processes that build on SBOMs: Implement advanced SBOM processes that link component data with exploitability and vulnerability context, transforming static reports into actionable security intelligence.

By embracing this shared vision, spanning among many others the CRA, CISA, METI, KISA, NTIA, ETSI, and BSI frameworks, we can definitively move from merely fulfilling compliance obligations to achieving verifiable confidence. This collective commitment to transparency and interoperability is the essential step in building a truly global, actionable, and resilient software ecosystem.

About the Authors

Madalin Neag works as an EU Policy Advisor at OpenSSF focusing on cybersecurity and open source software. He bridges OpenSSF (and its community), other technical communities, and policymakers, helping position OpenSSF as a trusted resource within the global and European policy landscape. His role is supported by a technical background in R&D, innovation, and standardization, with a focus on openness and interoperability.

Kate is VP of Dependable Embedded Systems at the Linux Foundation. She has been active in the SBOM formalization efforts since the NTIA initiative started, and was co-lead of the Formats & Tooling working group there. She was co-lead on the CISA Community Stakeholder working group to update the minimum set of Elements from the original NTIA set, which was published in 2024. She is currently co-lead of the SBOM Everywhere SIG.

Dr. David A. Wheeler is an expert on open source software (OSS) and on developing secure software. He is the Director of Open Source Supply Chain Security at the Linux Foundation and teaches a graduate course in developing secure software at George Mason University (GMU). Dr. Wheeler has a PhD in Information Technology, is a Certified Information Systems Security Professional (CISSP), and a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE). He lives in Northern Virginia.

 

What’s in the SOSS? Podcast #42 – S2E19 New Education Course: Secure AI/ML-Driven Software Development (LFEL1012) with David A. Wheeler

By Podcast

Summary

In this episode of “What’s In The SOSS,” Yesenia interviews David A. Wheeler, the Director of Open Source Supply Chain Security at the Linux Foundation. They discuss the importance of secure software development, particularly in the context of AI and machine learning. David shares insights from his extensive experience in the field, emphasizing the need for both education and tools to ensure security. The conversation also touches on common misconceptions about AI, the relevance of digital badges for developers, and the structure of a new course aimed at teaching secure AI practices. David highlights the evolving nature of software development and the necessity for continuous learning in this rapidly changing landscape.

Conversation Highlights

00:00 Introduction to Open Source and Security
02:31 The Journey to Secure AI and ML Development
08:28 Understanding AI’s Impact on Software Development
12:14 Myths and Misconceptions about AI in Security
18:24 Connecting AI Security to Open Source and Closed Source
20:29 The Importance of Digital Badges for Developers
24:31 Course Structure and Learning Outcomes
28:18 Final Thoughts on AI and Software Security

Transcript

Yesenia (00:01)
Hello and welcome to What’s in the SOSS, OpenSSF podcast where we talk to interesting people throughout the open source ecosystem. They share their journey, expertise and wisdom. So yes, I need said one of your hosts and today we have the extraordinary experience of having David Wheeler on a welcome David. For those that may not know you, can you share a little bit about your row at the Linux Foundation OpenSSF?

David A. Wheeler (00:39)
Sure, my official title is actually probably not very illuminating. It says it’s the direct, I’m the director of open source supply chain security. But what that really means is that my job is to try to help other folks improve the security of open source software at large, all the way from it’s in someone’s head, they’re thinking about how to do it, developing it, putting it in repos, getting it packaged up, getting it distributed, receiving it just all the way through. We want to make sure that people get secure software and the software they actually intended to get.

Yesenia (01:16)
It’s always important, right? You don’t want to open up a Hershey bar that has no peanuts in the peanuts, right? So that was my analogy for the supply chain security in MySpace. Because I’m a little sensitive to peanuts. I was like, you know, you don’t want that.

David A. Wheeler (01:22)
You

David A. Wheeler (01:31)
You don’t want that. And although the food analogy is often pulled up, I think it’s still a good analogy. If you’re allergic to peanuts, you don’t want the peanuts. And unfortunately, it’s not just, hey, whether or not it’s got peanuts or not, but there was a scare involving Tylenol a while back. And to be fair, the manufacturer didn’t do anything wrong, but the bottles were tampered with by a third party.

Yesenia (01:40)
Mm-mm.

David A. Wheeler (01:57)
And so we don’t want tampered products. We want to make sure that when you request an open source program, it’s actually the one that was intended and not something else.

Yesenia (02:07)
So you have a very important job. Don’t play yourself there. We want to make sure the product you get is the one you get, right? So if you don’t know David, go ahead and message him on Slack, connect with him. Great gentleman in the open source space. And you’ve had a long time advocating for secure software development in the open source space. How did your journey lead to creating a course specifically on secure AI and ML driven development?

David A. Wheeler (02:36)
As with many journeys, it’s a complicated journey with lots of whens and ways. As you know, I’ve been interested in how do you develop secure software for a long time, decades now, frankly. And I have been collecting up over the years what are the common kinds of mistakes and more importantly, what are the systemic simple solutions you can make that would prevent that problem and eliminating it entirely ideally.

Um, and over the years it’s turned out that in fact, for a vast number, for the vast majority of problems that people have, there are well-known solutions, but they’re not well known by the developers. So a lot of this is really an education story of trying to make it so that software developers know how to do things. Now it’s a fair, you know, some would say, some would say, well, what about tools? Tools are valuable. Absolutely.

If to the extent that we can, we want to make it so that tools automatically do the secure thing. And that’s the right thing to do, but that’ll never be perfect. And people can always override tools. And so it’s not a matter of education or tools. I think that’s a false dichotomy. It’s you need tools and you need education. You need education or you can’t use the tools well as much as we can. We want to automate things so that they will handle things automatically, but you need both. You need both.

Now, to answer your specific question, I’ve actually been involved in and out with AI to some extent for literally decades as well. People have been interested in AI for years, me too. I did a lot more with symbolic based AI back in the day, wrote a lot of Lisp code. But since that time, really machine learning, although it’s not new, has really come into its own.

And all of a sudden it became quite apparent to me, and it’s not just me, to many people that software development is changing. And this is not a matter of what will happen someday in the future. This is the current reality for software development. And I’m going to give a quick shout out to some researchers in Stanford. I’ll have to go find the link. So who basically did some, I think some important studies related to this.

David A. Wheeler (04:59)
When you’re developing software from scratch and trying to create a tiny program, the AI tools are hard to beat because basically they’re just creating, know, they’re just reusing a template, but that’s a misleading measure, okay? That doesn’t really tell you what normal software development is like. However, let’s assume that you’re taking existing programming and improving it, and you’re using a language for which there’s a lot of examples for training. Okay, we’re talking Python and Java and, you know, various widely used languages, okay?

David A. Wheeler (05:28)
If you do those, turns out the AI tools can generate a lot of code. Some of it’s right. So that means you have to do more rework, frankly, when you use these tools. Once you take the rework into account, they’re coming up with a 20 % improvement in productivity. That is astounding. And I will argue that is the end of the argument. Seriously, there are definitely, there are companies where they have a single customer and the customer pays them to write some software. If the customer says never use AI, fine, the customer’s willing to pay 20 % improvement, I will charge that extra to them. But out in most commercial open source settings, you can’t throw, you can’t ignore a 20 % improvement. And that’s current tech, that’s not future tech. mean, the reality is that we haven’t seen improvements like this since the switch from hut, from assembly to high level languages, the use of, you know, the use of structure programming, I think was another case we got that kind. And you can make a very good case that open source software was that was a third case where you got that digital, productivity. Now you could also argue that’s a little unfair because open source didn’t improve your ability to write software. makes you didn’t have to write the software.

David A. Wheeler (06:53)
But that’s okay. That’s still an improvement, right? So I think that counts. But for the most part, we’ve had a lot of technologies that claim to improve productivity. I’ve worked many over the years. I’ve been very interested in how do you improve productivity? Most of them turned out not to be true. I don’t think that’s true for AI. It’s quite clear for multiple studies. mean, not all studies agree with this, by the way, but I think there’s enough studies that there’s a productivity improvement.

David A. Wheeler (07:21)
It does depend on how you employ these, that, but you know, and they’ll get better. But the big problem now is everyone is on the list. This is a case where everyone, even if you’re a professional and you’ve been doing software development for years, everybody’s new at this part of the game. These tools are new. And the problem here is that the good news is that they can help you. The bad news is they can harm you. They can do.

David A. Wheeler (07:50)
They can produce terribly insecure software. They can also end up being the security vulnerability themselves. And so we’re trying to get ahead of the game and looking around what’s the latest information, what can we learn? And it turns out there’s a lot that we can learn that we actually think is gonna stay on the test of time. And so that’s what this course is, those basics they’re gonna apply no matter what tool you use.

David A. Wheeler (08:17)
How do you make it say you’re using these tools, but you’re not immediately becoming a security vulnerability? How is it so that you’re less likely to produce vulnerable code? And that turns out to be harder. We can talk about why that is, but that’s what this course is in a nutshell.

Yesenia (08:33)
Yeah, I know I had a sneak preview at the slide deck and I was just like, this is fantastic. Definitely needed it. And I wanted to take a moment and give a kudos to the researchers because the engine, the industry wouldn’t be what it is today without the researchers. Like they’re the ones that are firsthand, like try and failing and then somebody picks it up and builds it and it is open source or industry. then boom, it becomes like this whole new field. So I know AI has been around for a minute.

David A. Wheeler (09:01)
Yeah, let me add that. I agree with you. Let me actually separate different researchers because we’re building on the first of course, the researchers who created these original AI and ML systems more generally, obviously a lot of the LLM based research. You’ve got the research specifically in developing tools for developing, for improving software development. And then you’ve got the researchers who are trying to figure out the security impacts of this. And those folks,

Yesenia (09:30)
Those are my favorite. Those are my favorite.

David A. Wheeler (09:31)
I’m Well, we need all of these folks. But the thing is, what concerns me is that remarkably, even though we’ve got this is really new tech, we do have some really interesting research results and publications about their security impacts. The problem is, most of these researchers are really good about, you know, doing the analysis, creating controls, doing the research, publishing a paper, but for most people, publishing a paper has no impact. People are not going to go out and read every paper on a topic. That’s, know, they have work to do basically. So if you’re a researcher makes these are very valuable, but what we’ve tried to do is take the research and boil it down to as a practitioner, what do you need to know? And we do cite the research because

David A. Wheeler (10:29)
You know, if you’re, if you’re interested or you say, Hey, I’m not sure I believe that. Well, great. Curiosity is fantastic. Go read the studies. there’s always limitations on studies. We don’t have infinite time and infinite money. but I think the research is actually pretty consistent. at least with Ted Hayes technology, we, can’t guess what the great grand future holds.

David A. Wheeler (10:55)
But I’m going to guess that at least for the next couple of years, we’re going to see a lot of LLMs, LLM use, they’re going to build on other tools. And so there’s things that we know just based on that, that we can say, well, given the direction of current technology, what’s okay, what’s to be concerned about? And most importantly, what can we do practically to make this work in our favor? So we get that 20 % because we’re going to want it.

Yesenia (11:24)
Yeah, at this point in time, we’re seedling within the AI ML piece. What you said is really, really important. It’s just like, so much more to this. There’s so much more that’s growing. And I want to take it back to something you had mentioned. You’re talking about the good that is coming from the AI ML. And there is the bad, of course. And for the course that you’re coming out, what is one misconception about AI in the software development security that you hope that this course will shatter? What myth are you busting?

David A. Wheeler (11:53)
What myth am I busting? I guess I’m going to cheat because I’m going to respond with two. It’s by the fact that I actually can count. guess, okay, I’m going to turn it into one, which is, guess, basically either over or under indexing on the value of A. Basically expecting too much or expecting too little. Okay, basically trying to figure out what the real expectations you should have are and not go outside that. So there’s my one. So let me talk about over and under. We’ve got people.

Yesenia (12:30)
Well, I’m going to give you another one because in software everything starts with zero. So I’ll give you another one.

David A. Wheeler (13:47)
Okay, all right, so let me talk about the under. There are some who have excessive expectations. We’ve got the, you know, I think vibe coding in particular is a symptom of this, okay? Now, there are some people who use the word vibe coding as another word for using AI. I think that’s not what the original creator of the term meant. And I actually think that’s not helpful because it’s a whole lot like talking about automated carriages.

Um, very soon we’re only going to be talking about carriages. Okay. Everybody’s going to be using automation AI except the very few who don’t. Okay. So, so there’s no point in having a special term for the normal case. Um, so what, what I mean by vibe coding is what the original creator of the term meant, which is, Hey, AI system creates some code. I’m never going to review it. I’m never going to look at it. I’m not going to do anything. I will just blindly accept it. This is a terrible idea. If it matters what the quality of the code is. Now there are cases where frankly the quality of the code is irrelevant. I’ve seen some awesome examples where you’ve got little kids, know, eight, nine year olds running around and telling a computer what to do and they get a program that seems to kind of do that. And that is great. I mean, if you want to do vibe coding with that, that’s fantastic. But if the code actually does something that matters, with current tech, this is a terrible idea. They’re not.

They can sometimes get it correct, but even humans struggle making perfect code every time and they’re not that good. The other case though is, man, we can’t ever use this stuff. I mean, again, if you’ve got a customer who’s paying you extra to never do it, that’s wonderful. Do what the customer asks and is willing to pay for it. For most of us, that’s not a reasonable industry position. What we’re going to need to do instead is learn together as an industry how to use this well. The good news is that although we will all be learning together, there’s some things already known now. So let’s run to the researchers, find out what they’ve learned, go to the practitioners, basically find what has been learned so far, start with that. And then we can build on and improve and go and other things. You don’t expect too much, don’t expect too little.

Yesenia (15:28)
Yeah, the five coding is an interesting one, because sometimes it spews out like correct code. But as somebody who’s written code and reviewed code and like done all this with the supply chain, I’m like. It’s like that extra work you gotta kind of add to it to make sure that you’re validating your testing it and it hasn’t just accidentally thrown in some security vulnerability in in that work. And I think that was. Go ahead.

David A. Wheeler (15:51)
What I can interrupt you, one of the studies that we cited, they basically created a whole bunch of functions that could be written either insecurely or securely as a test. Did this whole bunch of times. And they found that 45 % of the time using today’s current tooling, they chose the insecure approach. And there’s reason for this. ML systems are finally based on their training sets.

They’ve been trained on lots of insecure programs. What did you expect to get? You know, so this is actually going to be a challenge because when you’re trying to fight what the ML systems are training on, that is harder than going with the flow. That doesn’t mean it can’t be done, but it does require extra effort.

Yesenia (16:41)
We’re going extra left at that point. All right, so you your one and I gave you, know, one more because we started at zero. Any other misconception that is being bumped at the course.

David A. Wheeler (16:57)
Um, I guess the, uh, I guess the misconception sort is nothing can be done. And, uh, of course the whole course is a, uh, a, a, a stated disagreement with examples, uh, because in fact, there are things we can do right now. Now I would actually concede if somebody said, Hey, we don’t know everything. Well, sure. Uh, you know, I think all of us are in a life journey and we’re all learning things as we go. Uh, but that doesn’t mean that we have to, um, you know, just accept that nothing can be done. That’s a fatalistic approach that I think serves no one. There are things we can do. There are things that are known, though maybe not by you, but that’s okay. That’s what a course is for. We’ve worked to boil down, try to identify what is known, and with relatively little time, you’ll be far more prepared than you would be otherwise.

Yesenia (17:49)
It is a good course and I the course is aimed for developers, software engineers, open source contributors. So how does it connect to real world open source work like those that are working on closed source versus that open source software?

David A. Wheeler (18:04)
Well, okay, I should first quickly note that I work for the Open Source Security Foundation, Open Source is in the name, so we’re very interested in improving the security of open source software. That is our fundamental focus. That said, sometimes the materials that we create are not really unique to open source software. Where it can be applied by closed source software, we try to make that clear. Sometimes we don’t make it clear as we should, but we’re working on that.

Um, and frankly, in many cases, I think there’s also worth noting that, um, if you’re developing closed source software, the vast majority of the components you’re using are open source software. I mean, the average is 70 to 90 % of the software in a closed source system software system is actually open source software components. Uh, simple because it doesn’t make sense to rebuild everything from scratch today. That’s not a, an M E economically viable option for most folks. So.

in this particular case for the AI, it is applicable equally to open source and closed source. It applies to everybody. and this is actually true also for our LFD 121 course on how to develop secure software. And when you think about it, it makes sense. The attackers don’t care what your license is. just, you know, they just don’t care. They’re going to try to do bad things to you regardless of, of the licensing.

And so while certainly a number of things that we develop like, you know, the best practices badge are very focused on open source software, you know, other things like baseline, other things like, for example, this course on LFT 121, the general how to develop secure software course, they’re absolutely for open source and closed source. Because again, the attackers don’t care.

Yesenia (19:53)
Yeah, they just they just don’t they’re they’re actually just trying to go around all this like they’re trying to make sure they learn it so that they know what to do. Unfortunately, that’s the case. And this course you said it offers a digital badge. Why is this important for developers and employers?

David A. Wheeler (20:11)
Well, I think the short answer is that anybody can say, yeah, I learned something. But it’s, think, for, I guess I should start with the employers because that’s the easier one to answer. Employers like to see that people know things and having a digital badge is a super easy way for an employer to, to make sure that, yeah, they actually learn at least the basics of, you know, that topic. you know, certainly it’s the same for you know, university degrees and other things. You when you’re an employer, you want the, it’s very, important that people who are working for you actually know something that’s critically important. And while a degree or digital badge doesn’t guarantee it, it at least gives that additional evidence. For people, mean, obviously if you want, are trying to get employed by someone, it’s always nice to be able to prove that. But I think it’s also a way to both show you know something to others and frankly encourage others to learn this stuff. We have a situation now where way too many people don’t know how to do some what to me are pretty basic stuff. You know I’ll point back to the LFD 121 course which is how to develop secure software. Most colleges, most universities that’s not a required part of the degree. I think it should be.

David A. Wheeler (21:35)
But since it isn’t, it’s really, really helpful for everybody to know, wait, this person coming in, do they not, they’ve got this digital badge. That gives me much higher confidence going in as somebody I’m working with and that sort of thing, as well as just encouraging others to say, hey, look, I carried enough to take the time to learn this, you can too. And both LFD 121 and this new AI course are free, so that in there online so you can take it at your pace. Those roadblocks do not exist. We’re trying to get the word out because this is important.

Yesenia (22:16)
Yeah, I love that these courses are more accessible and how you touched on the students, like students applying for universities that might be more highly competitive. They’re like, hey, look, I’m taking this extra path to learn and take these courses. Here’s kind of like the proof. And it’s like Pokemon. It’s good to collect them all, know, between the badges and the certifications and the degrees.

I definitely that’s the security professional’s journeys. Collect them all at this point with the credibility and benefits.

David A. Wheeler (22:46)
Well, indeed, course, the real goal, of course, is to learn, not the badges. But I think that badges are frankly, you know, you collecting the gold star, there is nothing more human and nothing more okay than saying, Hey, I, you know, I got to I got a gold star for if you’re doing something that’s good. Yes, revel in that. Enjoy it. It’s fun.

David A. Wheeler (23:11)
And I don’t think that these are impossible courses by any means. And unlike some other things which are, know, I’m not against playing games, playing games is fun, but this is a little thing that’s both can be interesting and is going to be important long-term to not only yourself, but every who uses the code you make. Cause that’s gonna be all of us are the users of the software that all developers as a community make.

Yesenia (23:42)
Yeah, there’s a wide range impact from this, not just like even if you don’t create software, just understanding and learning about this, you’re a user to understanding that basic understanding of it. So I want to transition a little bit to the course because I know we’re spending the whole time about it. Let’s say I’m a friendly person. I signed up for this free LFEL 1012. Can you walk me through the course structure? Like what am I expected to take away from the course in that time period?

David A. Wheeler (24:09)
Okay, yeah, so let’s see here. So I think what I should do is kind of first walk through the outline. Basically, I mean, the first two parts unsurprisingly are introduction and some context. And then we jump immediately into key AI concepts for secure development. We do not assume that someone taking this course is already an expert in AI. I mean, if you are, that’s great. It doesn’t take, we’re not spending a lot of time on it, but we wanna make sure that you understand the basics, the key terms that matter for software development. And then we drill right into the security risks of using AI assistance. I want to make it clear, we’re not saying you can’t use just because something has a risk, everything has risks, okay? But understanding what the potential issues are is important because then you can start addressing them. And then we go through what I would call kind of the meat of the course, best practices for secure assistant use.

You know, how do you reduce the risk that the assistant itself doesn’t become subverted, it starts working against you, things like that. Writing more secure code with AI, if you just say, write some code, a lot of it’s gonna be insecure. There are ways to deal with that, but it’s not so simple or straightforward. For example, it’s pretty common to tell AI systems that, hey, I’m an expert in this topic and suddenly it gets better. That trick doesn’t work.

No, you may laugh, but honestly, that trick works in a lot of situations, but it doesn’t work here. And we’ve actually researched showing it doesn’t work. So there are things that work, but it’s more it’s, it’s more than that. And finally, reviewing code changes in a world with AI. Now, of course, involves reviewing proposed changes from others. And in some cases, trying to deal with the potential DDoS attacks as people start creating far more code than anybody can reasonably review. Okay. We’re have to deal with this. and frankly, biggest problem, frankly, the folks who are vibe coding, you know, they, they run some program. It tells them 10 things. I’ll just dump all 10 things at them. And no, that’s a terrible idea. you know, and the, the curl folks, for example, have had an interesting point where.

They complained bitterly about some inputs from AI systems, which were absolute garbage, wasted their time. And they’ve praised other AI submissions because somebody took the time to make sure that they were actually helpful and correct and so on. And then that’s fantastic. you know, basically you need to push back on the junk and then find and then welcome the good stuff. And then, of course, a little conclusion wrap up kind of thing.

Yesenia (27:01)
I love it. it was a good outline. was not seeing it. Is this like videos accomplished with it or is this just like a module click through?

David A. Wheeler (27:10)
Well, basically we, we group them into little chapters. forgot what their official term is. It’s chapters, section modules. I don’t remember what the right term is. I guess I should, but basically after you go that, then there’s a couple of quiz questions and then a little videos. Basically the idea is that we want people to get it quickly, but you know, if it’s just watch a video for an hour, people fall asleep, don’t remember anything. That’s the goal is to learn, not just, you know, sleep through a video.

David A. Wheeler (27:39)
So little snippets, little quiz questions, and at the end there’s a little final exam. And if you get your answers right, you get your badge. So it’s not that terribly hard. We estimate, it varies, but we estimate about an hour for people. So it’s not a massive time commitment. Do it on lunch break or something. I think this is going to be, as I said, I think this is going to be time well spent.

David A. Wheeler (28:07)
This is the world that we are all moving to, or frankly, have already arrived at.

Yesenia (28:12)
Yeah, I’m already here. think I said it’s a seedling. It’s about to grow into that big tree. Any last minute thoughts, takeaways that you want to share about the course, your experience, open source, supply chain security, all of the above.

David A. Wheeler (28:27)
My goodness. I’ll think of 20 things after we’re done with this, of course. well, no, problem is I’ll think about them later in French. believe it’s called the wisdom of the stairs. It’s as you leave the party, you come up with the point you should have made. so I guess I’ll just say that, you know, if you develop software, whether you’re not, whether you realize or not, it’s highly likely that the work that you do will influence many, many.

Yesenia (28:31)
You only get zero and one.

David A. Wheeler (28:54)
About many, many people, many more than you probably realize. So I think it’s important for all software developers to learn how to develop software, secure software in general, because whether or not you know how to do it, the attackers know how to attack it and they will attack it. So it’s important to know that in general, since we are moving and essentially have already arrived at the world of AI and software development, it’s important to learn the basics and yes.

Do keep learning. Well, all of us are going to keep learning throughout our lives. As long as we’re in this field, that’s not a bad thing. I think it’s an awesome thing. I wake up happy that I get to learn new stuff. But that means you actually have to go and learn the new stuff. And the underlying technology remark is, it’s actually remarkably stable in many things. This is a case though, where, yes, a lot of things change in the detail, but the fundamentals don’t. But this is something where, yeah, actually there is something fundamental that is changing. One time we didn’t use AI often to help us develop software. Now we do. So how do we do that wisely? And there’s a long list of specifics. The course goes through it. I’ll give a specific example so it’s not just this highfalutin, high level stuff. So for example,

Pretty much all these systems are based very much on LLMs, which is great. LLMs have some amazing stuff, but they also have some weaknesses. One is in particular, they are incredibly gullible. If they are told something, they will believe it. And if you tell them to read a document that gives them instructions on some tech, and the document includes malicious instructions, that’s what it’s going to do because it heard the malicious instructions.

David A. Wheeler (30:48)
Now that doesn’t mean you can’t use these technologies. I think that’s a road too far for most folks. But it does mean that there’s new risks that we never had to deal with before. And so there’s new techniques that we’re going to need to apply to do it well. And I don’t think they’re unreasonable. They’re just, you know, we now have a new situation and we’re have to make some changes because of new situation.

Yesenia (31:11)
Yeah, it’s like you mentioned earlier, like you can ask it to be an expert in something and then it’s like, oh, I’m an expert. That’s what I laughing. I was like, yeah, I use that a lot. I’m like, the prompt is you’re an expert in sales. You’re an expert in brand. You’re an expert in this. And it’s like, OK, once it gets in.

David A. Wheeler (31:25)
But the thing is, that really does work in some fields, remarkably. And of course, we can only speculate sometimes why LLMs do better in some areas than others. But I think in some areas, it’s quite easy to distinguish the work of experts from non-experts. And the work of experts is manifestly and obviously different. And at least so far, LLMs struggle to do that.

David A. Wheeler (31:54)
This differentiation in this particular domain. And we can speculate why but basically, the research says that doesn’t work. So don’t do that. Do there are other techniques that have far more success, do those instead. And I would say, hey, I’m sure we’ll learn more things, there’ll be more research, use those as we learn them. But that doesn’t mean that we get to excuse ourselves from ignoring the research we have now, even though we don’t know it.

David A. Wheeler (32:23)
We don’t know everything. We won’t know everything next year either. Find out what you need to know now and be prepared to learn more Seagull.

Yesenia (32:32)
It’s a journey. Always learning every year, every month, every day. It’s great. We’re going to transition into our rapid fire. All right, so I’m going to ask the question. You got to answer quiz and there’s no edit on this point. All right, favorite programming language to teach security with.

David A. Wheeler (32:56)
I don’t have a favorite language. It’s like asking what my children, know, which of my children are my favorite. I like lots of programming languages. That said, I often use Java, Python, C to teach different things more because they’re decent exemplars of those kinds of languages. But so there’s your answer.

Yesenia (33:19)
Those are good range because you have your memory based one, which is the see your Python, which more scripts in the Java, which is more object oriented. So you got a good diverse group.

David A. Wheeler (33:28)
Static type, right, you’ve got your static typing, you’ve got your scripting, you’ve got your lower level. But indeed, I love lots of different programming languages. I know over 100, I’m not exaggerating, I counted, there’s a list on my website. But that’s less impressive than you might think because after you’ve learned a couple, the others tend to not, they often are similar too. Yes, Haskell and Lisp are really different.

David A. Wheeler (33:55)
But most Burmese languages are not as different as you might think, especially after you’ve learned a few. So I hope can help.

Yesenia (34:01)
Yeah, the newer ones too are very similar in nature. Next question, Dungeon and Dragon or Lord of the Rings?

David A. Wheeler (34:11)
I love both. What are we doing? What are you doing to me? Yeah, so I play Dungeons and Dragons. I have watched, I’ve read the books and watched the movies many times. So yes.

Yesenia (34:24)
Yes, yes, yes. First open source project you ever contributed to.

David A. Wheeler (34:30)
Wow, that is too long ago. I don’t remember. Seriously, but it was before the term open source software was created because that was created much later. So it was called free software then. So I honestly don’t remember. I sure it was some small contribution to something somewhere like many folks do, but I’m sorry. It’s lost in the midst of times back in the eighties. Maybe. Yeah. The eighties somewhere, probably mid eighties.

Yesenia (35:00)
You’re going to go to sleep now. Like,

David A. Wheeler (35:01)
So, yeah, yeah, I’m sure somebody will do the research and tell me. thank you.

Yesenia (35:09)
There wasn’t GitHub, so you can’t go back to commits.

David A. Wheeler (35:11)
That’s right. That’s right. No, was long before get long before GitHub and so on. Yep. Carry on.

Yesenia (35:18)
When you’re writing code, coffee or tea?

David A. Wheeler (35:22)
Neither! Coke Zero is my preferred caffeine of choice.

Yesenia (35:26.769)
And this is not sponsored.

David A. Wheeler (35:28.984)
It is not sponsored. However, I have a whole lot of empty cans next to me.

Yesenia (35:35)
AI tool you find the most useful right now.

David A. Wheeler (35:39)
Ooh, that one’s hard. I actually use about seven or eight depending on what they’re good for. For actual code right now, I’m tending to use Claude Code. Claude Code’s really one of the best ones out there for code. And of course, five minutes later, it may change. GitHub’s not bad either. There’s some challenges I’ve had with them. They had some bugs earlier, which I suspect they fixed by now.

But in fact, I think this is an interesting thing. We’ve got a race going on between different big competitors, and this is in many ways good for all of us. The way you get good at anything is by competing with others. So I think that we’re seeing a lot of improvements because you’ve got competing. And it’s okay if the answer changes over time. That’s an awesome thing.

Yesenia (36:36)
That is awesome. That’s technology. And the last one, this is for chaos. GIF or GIF?

David A. Wheeler (36:42)
It’s GIF. Graphics. Graphics has a guck in it. And yes, I’m aware that the original perpetrator doesn’t pronounce it that way, but it’s still GIF. I did see a cartoon caption which said GIF or GIF. And of course I can hear it just reading it.

Yesenia (36:53)
There you have it.

Yesenia (37:05)
My notes is literally spelled the same.

David A. Wheeler (37:08)
Hahaha!

Yesenia (37:11)
All right, well there you have it folks, another rapid fire. David, thank you so much for your time today, for your impact contribution to open source in the past couple decades. Really appreciate your time and all the contributors that were part of this course. Check it out on the Linux Foundation website. then David, do you want to close it out with anything on how they can access the course?

David A. Wheeler (37:38)
Yeah, so basically the course is ecure AI/ML-Driven Software Development its numbers LFEL 1012 And I’m sure we’ll put a link in the show. No, I’m not gonna try to read out the URL But we’ll put a link in there to to get to it But please please take that course. We’ve got some other courses

Software development, you’re always learning and this is an easy way to get the information you most need.

Yesenia (38:14)
Thank you so much for your time today and those listening. We’ll catch you on the next episode.

David A. Wheeler (38:19)
Thank you.

OpenSSF Hosts 2025 Policy Summit in Washington, D.C. to Tackle Open Source Security Challenges

By Blog, Global Cyber Policy, Press Release

WASHINGTON, D.C. – March 11, 2025 – The Open Source Security Foundation (OpenSSF) successfully hosted its 2025 Policy Summit in Washington, D.C., on Tuesday, March 4. The summit brought together industry leaders and open source security experts to address key challenges in securing the software supply chain, with a focus on fostering harmonization for open source software (OSS) development and consumption in critical infrastructure sectors.

The event featured keynotes from OpenSSF leadership and industry experts, along with panel discussions and breakout sessions covering the latest policy developments, security frameworks, and industry best practices for open source software security. 

“The OpenSSF is committed to tackling the most pressing security challenges facing the consumption of open source software in critical infrastructure and beyond,” said Steve Fernandez, General Manager, OpenSSF. “Our recent Policy Summit highlighted the shared responsibility, common goals, and interest in strengthening the resilience of the open source ecosystem by bringing together the open source community, government, and industry leaders.” 

Key Themes and Discussions from the Summit

  1. AI, Open Source, and Security
  • AI security remains an emerging challenge: Unlike traditional software, AI has yet to experience a major security crisis akin to Heartbleed, leading to slower regulatory responses.
  • Avoid premature regulation: Experts advised policymakers to allow industry-led security improvements before introducing regulation.
  • Security guidance for AI developers: There is an increasing need for dedicated security frameworks for AI systems, akin to SLSA (Supply Chain Levels for Software Artifacts) in traditional software.
  1. Software Supply Chain Security and OSS Consumption
  • Balancing software repository governance: The summit explored whether package repositories should actively limit the use of outdated or vulnerable software, recognizing both the risks and ethical concerns of software curation.
  • Improving package security transparency: Participants discussed ways to provide better lifecycle risk information to software consumers and whether a standardized framework for package deprecation and security backports should be introduced.
  • Policy recommendations for secure OSS consumption: OpenSSF emphasized the need for cross-sector collaboration to align software security policies with global regulatory frameworks, such as the EU Cyber Resilience Act (CRA) and U.S. federal cybersecurity initiatives.

“The OpenSSF Policy Summit reaffirmed the importance of industry-led security initiatives,” said Jim Zemlin, Executive Director of the Linux Foundation. “By bringing together experts from across industries and open source communities, we are ensuring that open source security remains a collaborative effort, shaping development practices that drive both innovation and security.”

Following the summit, OpenSSF will continue to refine security guidance, best practices, and policy recommendations to enhance the security of open source software globally. The discussions from this event will inform ongoing initiatives, including the OSS Security Baseline, software repository security principles, and AI security frameworks.

For more information on OpenSSF’s policy initiatives and how to get involved, visit openssf.org.

Supporting Quotes

“The 2025 Policy Summit was an amazing day of mind share and collaboration across different teams, from security, to DevOps, and policy makers. By uniting these critical voices, the day resulted in meaningful progress toward a more secure and resilient software supply chain that supports innovation across IT Teams.” – Tracy Ragan, CEO and Co-Founder DeployHub

“I was pleased to join the Linux Foundation OpenSSF Policy Summit “Secure by Design” panel and share insights on improving the open source ecosystem via IBM’s history of creating secure technology solutions for our clients,” said Jamie Thomas, General Manager, Technology Lifecycle Services & IBM Enterprise Security Executive. “Open source has become an essential driver of innovation for artificial intelligence, hybrid cloud and quantum computing technologies, and we are pleased to see more regulators recognizing that the global open source community has become an essential digital public good.” – Jamie Thomas, General Manager, Technology Lifecycle Services & IBM Enterprise Security Executive

“I was delighted to join this year’s OpenSSF Summit on behalf of JFrog as I believe strongly in the critical role public/private partnerships and collaboration plays in securing the future of open source innovation. Building trust in open source software requires a dedicated focus on security and software maturity. Teams must be equipped with tools to understand and vet open source packages, ensuring we address potential vulnerabilities while recognizing the need for ongoing updates. As the value of open source grows, securing proper funding for these efforts becomes essential to mitigate risks effectively.” – Paul Davis, U.S. Field CISO, JFrog

“Great event. I really enjoyed the discussions and the idea exchange between speakers, panelists and the audience.  I especially liked the afternoon breakout discussion on AI, open source, and security.” – Bob Martin, Senior Software and Supply Chain Assurance Principal Engineer at the MITRE Corporation

“The Internet is plagued by chronic security risks, with a majority of companies relying on outdated and unsupported open source software, putting consumer privacy and national security at risk. As explored at the OpenSSF Policy Summit, we are at an inflection point for open source security and sustainability, and it’s time to prioritize and invest in the open source projects that underpin our digital public infrastructure.” – Robin Bender Ginn, Executive Director, OpenJS Foundation

“It is always a privilege to speak at the OpenSSF Policy Summit in D.C. and converse with some of the brightest minds in security, government, and open source. The discussions we had about the evolving threat landscape, software supply chain security, and the policies needed to protect critical infrastructure were timely and essential. As the open source ecosystem expands with skyrocketing open source AI adoption, it’s vital that we work collaboratively across sectors to ensure the tools and frameworks developers rely on are secure and resilient. I look forward to continuing these important conversations and furthering our collective mission of keeping open source safe and secure.” – Brian Fox, CTO and Co-Founder, Sonatype

“The OpenSSF Policy Summit highlighted the critical intersection of policy, technical innovation, and collaborative security efforts needed to protect our software supply chains and address emerging AI security challenges. By bringing together policy makers and technical practitioners, we’re collectively building a more resilient open source ecosystem that benefits everyone, we look forward to future events and opportunities to collaborate with the OpenSSF to help strengthen this ecosystem.” – Jim Miller, Engineering Director of Blockchain and Cryptography, Trail of Bits

***

About the OpenSSF

The Open Source Security Foundation (OpenSSF) is a cross-industry initiative by the Linux Foundation that brings together the industry’s most important open source security initiatives and the individuals and companies that support them. The OpenSSF is committed to collaboration and working both upstream and with existing communities to advance open source security for all. For more information, please visit us at openssf.org.

Media Contact
Noah Lehman
The Linux Foundation
nlehman@linuxfoundation.org