
What’s in the SOSS? Podcast #56 – S3E8 Empowering New Maintainers: Inside the OpenSSF Mentorship Program

By Podcast

Summary

In this episode of What’s in the SOSS? host Sally Cooper sits down with Yesenia Yser, co-lead of the OpenSSF Mentorship Program and the BEAR Working Group, and Kairo De Araujo, open source software engineer and mentor for the Repository Service for TUF (RSTUF). They dive into the success of the OpenSSF Mentorship Program, which focuses on bringing underrepresented voices into software security. Kairo shares an incredible outcome from the last cycle, where two out of three mentees became project maintainers, while Yesenia discusses the evolution of the BEAR Working Group (Belonging, Empowerment, Allyship, and Representation) mentorship program. Whether you are a potential mentor or a mentee looking to break into open source, this episode provides a roadmap for the upcoming paid mentorship cycle.

Important Dates for the 2026 Mentorship Cycle:

  • Applications Open: March 24, 2026
  • Applications Close: April 12, 2026
  • Selection Period: April 13 – April 30, 2026
  • Notification Date: May 1, 2026
  • Onboarding: May 5 – May 29, 2026
  • Mentorship Period: June 1 – August 21, 2026

Conversation Highlights

00:01 – Welcome
01:43 – Kairo on his work with the Repository Service for TUF (RSTUF).
02:30 – Yesenia on the BEAR Working Group and making open source accessible.
04:30 – The “Why” behind mentorship: Solving the barrier to entry for security beginners.
07:28 – Success strategies: Working as a team across time zones with multiple mentees.
09:28 – The ultimate goal: Moving mentees from learners to official project maintainers.
10:58 – Challenges and growing pains: Managing deadlines and interview chaos.
13:48 – Advice for Mentors: The importance of clear communication and flexibility.
15:02 – Advice for Mentees: Don’t be afraid to join; focus on “pre-onboarding”.
17:13 – Key Dates for the 2026 Mentorship Cycle.
20:15 – Call to Action: Get to know this year’s participating projects (gittuf, RSTUF, SBOMit, Minder) and how to get involved.

Transcript

00:00 – Music & Intro clip

Sally Cooper (00:24)
Hello, hello and welcome back to What’s in the SOSS? An OpenSSF podcast where we get to talk to some amazing people who are involved in open source software and open source software security. And today we have a very special treat: two repeat offenders coming back, and they do some critical work in the OpenSSF community.

They have firsthand knowledge of the mentorship program, which we’re going to talk about today. It’s a hands-on initiative designed to help underrepresented voices break into software security. So first, we have Kairo. Hi, Kairo, an open source software engineer who served as one of the key mentors during last year’s program.

And we’re here to talk about the powerful impact of that mentorship program and also dive into the important work of the BEAR Working Group. So we have Yesenia also joining us from the perspective as a co-lead of the mentorship program in the BEAR Working Group. And I just have to say, Yesenia, it’s super nice to have you on this side of the microphone as a guest. So Kairo, Yesenia, welcome back and introduce yourselves.

Kairo De Araujo (01:43)
Yeah, thank you. Well, my name is Kairo, as you said. I’m based in the Netherlands and have been working as a software engineer for quite a few years. For the past six years, I’ve really focused on supply chain security, and I’m an author of Repository Service for TUF (RSTUF). That’s a project to help secure the software supply chain, and it’s part of the OpenSSF. I’m also a maintainer of other critical open source projects in supply chain security. And last year, as you mentioned, and as we’ll talk more about, I participated in the mentorship program with the RSTUF project.

Yesenia (02:30)
Oh, how the tables have turned. Hey everyone, soy Yesenia, not your co-host today but a guest on today’s episode. I have an extensive background in security. I usually like to say I’ve been a Jacqueline of all trades, cyber master of none, working under various umbrellas, and I’ve made my way into open source. I love it and do a lot of advocacy and outreach for it because of the amazing folks I’ve met here who have done amazing things, as you’ve heard in many other episodes.

I’m based out in the sunny state of Florida. I’ve seen snow once, and then another time through a window, so I always forget that winter’s here. If you show me snow in the background, I’m going to be so surprised. But I love the work that we’re doing with BEAR. It was originally our DEI group, but since the man banned the word DEI, we chose BEAR. That’s our Belonging, Empowerment, Allyship, and Representation

group, in which we are making it more accessible for folks to enter open source, giving them opportunities like these mentorships. I’ve seen it firsthand from a mentorship I hosted several years ago: folks who came through that mentorship have entered the field at very fancy Fortune 100 companies, doing amazing work that aligns with their careers. So that’s a little about me, and I’m excited for today’s episode.

Sally Cooper (04:02)
Wow, that’s incredible. That’s a lot to unpack. First off, the winter comment: I will get you back on that one. I’m going to send you lots of pictures of snow. But in all seriousness, it’s such great work that’s going on in the BEAR Working Group. For the mentorship program, maybe Kairo, can you tell me a little bit about the why behind your mentorship in open source security? What is the problem you were trying to solve by showing up for this?

Kairo De Araujo (04:30)
Well, besides security being something really important, there are a lot of people who are a little bit afraid to step into open source projects like that. Sometimes it’s because they don’t have a huge security background, or they’re just coming out of university or learning to code, but they are not, let’s say, comfortable stepping into open source.

It happened with me as well. Everybody here was once a beginner in something, and we need to trust ourselves. And I think it’s really important for projects like RSTUF to get new people in, with new ideas, and also to look to the future: keeping more contributors in the project and spreading the security knowledge, what the project is doing, because we know how it works. Spreading the knowledge is not only talking about the project but getting people involved in it. And we have some good engineers who would like to try but don’t have the opportunity. I think this mentorship is a very good opportunity.

And as a maintainer, I need more contributors for my project, because we know that success, and how the project can grow, is based on contributions, right? People really writing code and understanding how the project works. And this was our goal: let’s try to give people the opportunity to step into the project.

And give people opportunities to understand how we can do security in different ways. Because RSTUF is a really specific project, based on The Update Framework (TUF), which is really complex. And maybe it can also give knowledge to others, so they can contribute to a project in a way where they participate, grow, gain knowledge, and maybe get a new job.

Or just learn a little bit more about the language or the service, or how they can help with security and participate in the OpenSSF as well. Because it can be an entry point for people to get into the community, as we have many other projects inside it.

Sally Cooper (07:11)
Yeah, I love that. When you think about last year’s program, what would you say stands out the most? What worked better than you expected about the mentoring? And what lessons really stuck with you?

Kairo De Araujo (07:28)
Well, I have done a few mentorships before, not in the OpenSSF, but at other companies I worked at, helping junior engineers and so on. And usually we’re afraid to have multiple mentees, because how will I manage different people all together at once?

What we experimented with last year, which was really nice, was to take not just one mentee but three mentees into the project, so that we could work as a team. The point was not only to teach them how to do open source, but to let them really feel how open source works: how they can work across different time zones and on different tasks. We had one person working on documentation and another working on a specific feature, and they could see how these overlap with each other and how we can work as a team.

This was something we really did differently in the project: giving a lot of freedom to them. We had the projects we wanted to accomplish as the goal of the mentorship, but we tried to work as a team in a very flexible way where we could help each other. And really, this was very positive for the project last year. Everybody did very well.

Sally Cooper (09:06)
Yeah, it sounds like you set it up with the freedom, the flexibility, and the education to really do well. That’s great. I love to hear that. Just thinking back to last year’s program, is there anything that surprised you? How did that shape the experience, and how do you think it will shape the experience going forward?

Kairo De Araujo (09:28)
What really surprised me was the commitment from the mentees. The process of selecting the mentees with the proper basic skills, without it being a real job interview, right? Understanding what background they could bring to us was very nice.

But what surprised me in the end, and I think we will have another podcast about that in the future, is that two of the three mentees have become maintainers of the project. It means they started as mentees, then jumped in as contributors to the project, and right now they are helping to run it. For me, as a maintainer, this is amazing. It’s really a relief, right? Because I have more people to help me run the project.

Sally Cooper (10:30)
Right, best-case scenario there. That’s an incredible outcome. I love to hear it. Okay, let’s shift gears. Yesenia, what would you say are some of the challenges or growing pains you encountered in running this program? And what did these lessons teach you about how to build a sustainable mentorship program in an open source community that you could share with our audience here?

Yesenia (10:58)
Good question. First, I want to say that what Kairo just mentioned, the fact that these mentees grow into maintainers, is the ultimate goal. It doesn’t have to be the only goal, but one of the reasons we set up the mentorship was to allow folks to come in and see these new projects they may never have heard of or had visibility into. And really just dive in, become part of a team, and see that, in my experience with open source, it’s the same team dynamic as what you would feel and see in a corporate space, just in a different aspect.

So, hats off to you, Kairo, and the maintainers. I’m very excited to interview them, or at least hear the podcast once they’re on. So, future plug for when it will be released. And when we think back on last year’s mentorship: well, I had already done one with the OpenSSF and the Linux Foundation.

One of the biggest challenges was the amount of time, right? It was the first time we had to do this. We had the deadlines, and the maintainers had about a week to put together the project description. They had a week to sift through all the mentee applications, interview them, and make their selection. So there was a lot of chaos up front. Big kudos to them for pushing through, finding the mentees, and getting the program running.

I think once the program got running, within a few weeks it just kind of smoothed out; there were a lot fewer questions and a lot less friction. And what we decided to do this year is simply start early. We won’t release the dates just yet; you’ve got to listen a little further into the episode. But we are looking to run the next iteration of the mentorships, starting the program early and giving the mentees enough time before the official, quote-unquote, start date to get onboarded, so that they can really take advantage of those 12 weeks. That’s what we’re thinking, so keep an eye out for those dates and important information.

Kairo De Araujo (13:04)
Yeah, I want to say that this year I will be participating again. And the good thing is that the mentees from last year will help me run the mentorship, so we will distribute the tasks. So you can see how beneficial it can be for our project, and for mentors as well, to engage in this.

Sally Cooper (13:26)
Yeah, that’s really full circle. Thanks for sharing that. Well, since the mentorship cycle is on the horizon and expectations are being set, if someone wanted to participate and join this year’s cycle for the first time, what key advice would you give a potential mentor based on what you did last time?

Kairo De Araujo (13:48)
Well, communication is key, because everybody is remote and everybody has different backgrounds. So I advise really making the communication with the mentees clear: understand what the goals are, understand what their backgrounds are, right? And we have preset tasks within the project to be done by the mentees.

Be flexible as well, because maybe you need to shape those tasks a little to fit the mentees well. That’s the key advice I can give. And focus on the communication with the folks, because very good outcomes can come from that.

Sally Cooper (14:48)
Yeah, the outcomes will definitely come from communication, and that’s key. So for a potential mentee, what expectations would you set to help them make the most of this structured and paid opportunity?

Kairo De Araujo (15:02)
What I can say to the mentees is: don’t be afraid to join. Don’t worry about your background or where you are coming from; you have something to contribute to the project. Work together with your mentor, and if there are other mentees with you in the mentorship program, work together with them too, because everybody is here to help. Be relaxed, try to do the best you can, and stay committed to what you want to deliver.

But as I said to the mentors: communication. Communicate well, ask questions, and try to help as well. And try to do what I always call a good pre-onboarding: try to understand the project. I’m not saying you need to know everything, just understand what you are doing, what the project is, and what you need to deliver. And enjoy it, because it’s really, really good.

Yesenia (16:17)
Yeah, one thing I wanted to jump in and add is for the project maintainers: if you need help with some of your onboarding documents, the BEAR Working Group is working through a process to create onboarding documents. So it’s an added bonus that we can help you with that, especially if you’re a single maintainer or your team is just over capacity right now. We are working through onboarding documents for OpenBao, so we could expand that process to other teams to make it a lot easier.

Sally Cooper (16:51)
Wow, that’s so helpful. It’s really great to know that the BEAR Working Group is doing that. For those listening who are excited to join the next cycle, can you walk us through those dates, Yesi, that we were talking about? If you’ve stayed long enough for this, here’s your payoff, because we’re going to learn all about these upcoming dates.

Yesenia (17:13)
Yes. So the application will be released around March 24th and will be open for a few weeks, closing April 12th. So that’s a good amount of time. Check us out at OpenSSF on our socials and on Slack, and follow me, Kairo, and the OpenSSF on LinkedIn; I will be reposting this and blasting it everywhere. Then, from April 13th to April 30th, the mentors are going to be reviewing the applications. By May 1st, you should expect an accept or decline notification.

From May 5th to the 29th, we’ll be working on getting you onboarded to the LF platform and onto the project. This is when you’ll be getting your environment set up and going through any documentation that could take some quiet time, so we’ll ask you to be a part of it. And then the mentorship kicks off, running June 1st to August 21st. And one thing we forgot to mention: this is a paid mentorship.

So, there are two evaluation points. July 10th will be your first evaluation; after that, you get half of the stipend. And then August 28th, that’s an important date for me, you get the final stipend after you complete your evaluation. The cool part of this is that not only do you get to do your mentorship program, but you’ll also be part of a BEAR welcome call, where we showcase your project.

So you’ll get a public recording where you present the work you’ve done over the mentorship. And as an option, as you heard previously, the mentees who became mentors will be on podcasts. So, as an added option, if you’re willing to share your voice, we would also love to interview you after the project for the OpenSSF What’s in the SOSS? podcast.

These are great because you get to put them on your resume and on your LinkedIn, show your parents, your mom, your dad, your grandma, your grandpa, your dog, whoever it is you want to share it with. I definitely show my dogs my podcast episodes because they’re very proud of me. But those are just some key highlights, and if you have more questions, find us on Slack and ask us, and we’ll let you know.

Sally Cooper (19:43)
Love it. And cats too. Plug to my cats.

Yesenia (19:47)
My cats are always here, so they hear everything anyway.

Sally Cooper (19:50)
Okay. Yeah, they hear it all. I love it. Well, thank you both so much. This has been a really interesting conversation; I learned a lot. I’m really excited for this next session and to see all the great work that’s going to come out of it. Thank you. But before we wrap, are there any other calls to action for the audience?

I know you gave the dates, but what’s the next best step for them?

Yesenia (20:15)
From the BEAR Working Group perspective, we didn’t name the projects yet. We have gittuf, RSTUF, SBOMit, and Minder coming on board as mentor projects. So if you’re not sure what those are, take a moment: go to the openssf.org/getinvolved page, look at the working groups, check out their GitHub, get on Slack, and check out the groups. Join one of the public calls, even if you’re too nervous or introverted (I dropped after my first call, so don’t worry). Find different resources so you can get familiar with a project that you might like and enjoy.

We also have a BEAR welcome call that we did in January that walks through all the working groups, so that’s also a good avenue to start. And let’s say you look at the projects and none of them really excites you (mind you, they are paid): you can check out some of the other working groups and start getting involved there as well.

Kairo De Araujo (21:18)
Yeah, and even beyond the mentorship, if you are not able to join this summer, or if you don’t feel comfortable joining yet, our project, Repository Service for TUF (RSTUF), is really looking for new contributors. And just like in the mentorship, we’ll guide you as you join the project and get into the community; we’ll help you through that.

You can make a lot of difference out there if you want to collaborate with us. So everybody is welcome in our project as well.

Sally Cooper (21:54)
Fantastic. Well, Yesenia, Kairo, thank you so much for your time today and for all the work you’re doing for the mentorship program and the BEAR Working Group. We appreciate you both. And to everyone listening, happy open sourcing, and that’s a wrap.

What’s in the SOSS? Podcast #55 – S3E7 The Gemara Project: GRC Engineering Model for Automated Risk Assessment

By Podcast

Summary

Hannah Braswell and Jenn Power, security engineers from Red Hat and contributors to the OpenSSF, join host Sally Cooper to discuss the Gemara project. Gemara, an acronym for GRC Engineering Model for Automated Risk Assessment, is a seven-layer logical model that aims to solve the problem of incompatibility in the GRC (Governance, Risk, and Compliance) stack. By outlining a separation of concerns, the project seeks to enable engineers to build secure and compliant systems without needing to be compliance experts. The speakers explain how Gemara grew organically to seven layers and is leveraged by other open source initiatives like the OpenSSF Security Baseline and FINOS Common Cloud Controls. They also touch on the ecosystem of tools being built, including CUE schemas and a Go SDK, and how new people can get involved.

Conversation Highlights

00:00 Welcome music + promo clip
00:22 Introductions
02:17 What is Gemara and what problem does it address?
03:58 Why do we need a model for GRC engineering?
05:50 The seven-layer structure of Gemara
07:40 How Gemara connects to other open source projects
10:14 Tools available to help with Gemara model adoption
11:39 How to get involved in the Gemara projects
13:59 Rapid Fire
16:03 Closing thoughts and call to action

Transcript

Sally Cooper (00:22)
Hello, hello, and welcome to What’s in the SOSS, where we talk to amazing people that make up the open source ecosystem. These are developers, security engineers, maintainers, researchers, and all manner of contributors that help make open source secure. I’m Sally, and today I have the pleasure of being joined by two fantastic security engineers from Red Hat. We have Hannah and Jenn.

Thank you both so much for joining me today and to get us started, can you tell us a little bit about yourselves and the work that you do at Red Hat? I’ll start with Jenn.

Jenn Power (00:58)
Sure. I am Jenn Power. I’m a principal product security engineer at Red Hat. My whole life is compliance automation, let’s say that. And outside of Red Hat, I participate in the OpenSSF Orbit Working Group, and I’m also a maintainer of the Gemara project.

Sally Cooper (01:18)
Amazing. Thank you, Jenn. And Hannah, how about you?

Hannah Braswell (01:21)
Hey, Sally. Thanks for the nice introduction. I’m Hannah Braswell, and I’m an associate product security engineer at Red Hat. I work with Jenn on the same team, and I primarily focus on compliance automation and on enablement for compliance analysts to actually take advantage of that automation. Within the OpenSSF, I’m involved in the Gemara project; I’m the community manager there.

And then I’m kind of a fly on the wall at a lot of the community meetings, whether it’s the Gemara meeting or the Orbit Working Group. I like to go to a lot of them.

Sally Cooper (02:01)
We love to hear that. I heard Orbit Working Group from both of you; that’s exciting. And I also really want to dive into the Gemara project. But before we dive into those details, let’s make sure everyone’s starting from the same place. For listeners who are hearing about Gemara for the first time, what is Gemara, and what problem is it designed to address?

Jenn Power (02:23)
Sure, I can start there. It’s actually secretly an acronym: it stands for GRC Engineering Model for Automated Risk Assessment. That’s kind of a mouthful, so we just shorten it to Gemara. The official description I’ll give, and then I can go into a more relatable example, is that it provides a logical model for describing categories of compliance activities and how they interact,

and it has schemas to enable automated interoperability between them. So what does that mean? If we anchor this in an analogy, we could call Gemara the OSI model for the GRC stack. In fact, that was one of the primary inspirations for the categorical layers of Gemara, and Gemara also happens to have seven categorical layers, just like the OSI model.

So think about networking: if I want to send an email, I don’t have to understand packet routing; I can just send my email. In GRC, we can’t really do that today. We have security engineers who also have to be compliance experts to be successful. So with Gemara, we want to outline the separation of concerns within the GRC stack, so that each specialist can contain their complexity in their own layer while still exchanging information with specialists completing activities in other layers.

So if I could give one takeaway: we want to make it so engineers can build secure and compliant systems without having to understand the nuance of every single compliance framework out there.

Sally Cooper (04:14)
I love that. So we have a baseline now. Let’s talk about the problem and dig a little deeper into it. Gemara is responding to a problem you touched upon: why do we need a model for GRC engineering, and what incompatibility issue are you trying to solve?

Jenn Power (04:34)
Sure. So I think sharing resources in GRC is just really hard today. Sharing content, sharing tools: those tools and that content don’t work together today, if I could say that. Engineers are typically having to reinterpret security controls, and they’re having to create a lot of glue code to make sure that tools like a GRC system and a vulnerability scanner can actually talk to each other.

So we’re trying to solve that incompatibility issue on the technical side. But this is also a human problem. And I think that’s kind of the sneakiest part about it. A lot of times, we’re not even saying the same things when we use the same terms. And so that’s another thing that we’re trying to solve within the Gemara project.

This one comes up all the time: take the word policy. If you say that to an engineer, they’re immediately thinking policy-as-code, like a Rego file or something you’re going to use with your policy engine. But if you’re talking to someone in the compliance space, they’re thinking of their 40-page document that outlines organizational objectives. So we created definitions within the Gemara project to go along with the model, to solve the human problem while we’re also solving the technical problem.

Sally Cooper (06:05)
That’s interesting. Okay, I heard you say something about a seven-layer structure. Can you tell me why you chose a seven-layer structure for Gemara?

Jenn Power (06:17)
So this actually stemmed from an initiative under the CNCF called the Automated Governance Maturity Model. That started as four concepts, actually: policy, evaluation, enforcement, and audit. And that established the initial lexicon the project had been using.

And it initially got some adoption in the ecosystem, specifically in projects under the Linux Foundation, like FINOS Common Cloud Controls (CCC) and the Open Source Project Security Baseline (OSPS Baseline). And through the application of that lexicon, we found that there needed to be more granularity within that policy layer. So it expanded to two new layers called guidance and controls.

And I haven’t mentioned yet that we were creating a white paper, but we do have a white paper. Through the creation of that white paper, for which Eddie Knight did so much work on the initial draft, we actually found that we were missing a layer. That seventh layer is something we called sensitive activities, and it’s sandwiched in the middle of the Gemara layers. So with that, we organically grew to seven layers. That, I think, is the origin story of how the layers got to seven.

Sally Cooper (07:54)
I love that. And you’re really talking about how Gemara is not built in isolation, that you’re working with other open source projects. For example, you mentioned Baseline and the FINOS Common Cloud Controls. Can you tell me how Gemara connects to those projects?

Hannah Braswell (08:09)
Yeah. So in terms of Gemara connecting to other open source projects, the first thing that comes to mind is really the CRA, the EU Cyber Resilience Act, because of how prominent it is right now and the future of its impact. And I really think Gemara is going to be a catalyst for open source projects in general that need some kind of mechanism to implement security controls and align with potential compliance requirements.

And the good thing about Gemara is that you don’t have to be a compliance expert to make sure your open source project is secure. I would say the OSPS Baseline is a great example of Gemara’s Layer 2, because it provides a set of security controls that engineers can actually implement. So in that case, other projects can reuse the Baseline controls and fit them to their needs.

It also goes to say that anyone building a tool they want to sell or distribute in the European Union market that uses open source components is going to have to think about what’s in scope, and having something like the OSPS Baseline to understand how to effectively assess your open source components and their risks is really, really valuable and just going to be super useful. And in terms of the FINOS Common Cloud Controls, I think that’s also another great example of the use case and implementation of Gemara, because they have their core catalog, which has its own definitions of threats and controls that are then imported into their technology-specific catalogs. So that’s a great implementation within the financial sector.

And where we’re trying to expand the ecosystem for Gemara is in the Cloud Native Security Controls catalog refresh. That’s actually an initiative that Jenn is leading; I’ve made a few contributions to it. It’s essentially an effort to take the controls catalog that currently exists as a spreadsheet and make it available as a Gemara Layer 1 machine-readable guidance document. So Gemara is really connecting to projects that are all great to have on your radar, especially with the CRA coming up.

Sally Cooper (10:26)
Wow, that sounds great. But I’m just thinking about our listeners. They’re probably wondering, like, what does this look like in practice? And I’m curious if there are any tools available to help with the Gemara model adoption.

Jenn Power (10:39)
So we’re actually working on an ecosystem of tools. We want to bridge the theory we’re creating in the Gemara white paper to things that are actually implementable, to make sure you don’t have to start from scratch if you’re trying to adopt the Gemara model.

We have a couple of tools within the ecosystem. One is our implementation of the model: we’re using CUE schemas to let users create the models in YAML. For instance, if you wanted to create your Layer 2 document, you would write it in YAML, and you could use our CUE schemas to validate that your document is in fact a Gemara-compliant document. We’re also building SDKs; right now we have a Go SDK, so you can build tooling around programmatic access to and manipulation of Gemara documents. A tool in the ecosystem using this currently is Privateer, which automates the Layer 5 evaluations.

Sally Cooper (11:47)
Wow, that’s great. And of course, none of this works without the people. So I know you mentioned a few of them. How can new people get involved in the Gemara project?

Hannah Braswell (11:58)
So for anyone that’s new and interested in getting involved in the Gemara project, my first piece of advice would just be to jump into a community meeting and listen in on what’s happening. I know I started out just by joining those meetings, and I didn’t necessarily have much to say, but I appreciated the culture and the open discussion, just bouncing ideas back and forth off of one another.

And there were also plenty of times when I joined a community meeting, still trying to understand the project, while some kind of group opinion was being formed. I think it’s perfectly fine to say, you know, I don’t have the information right now, I don’t have an opinion, I’m still trying to learn about the project. But if something piques your interest and you want to contribute, then volunteer for it or show you’re interested, because people are not going to forget your willingness to step up and be part of the community.

But I started joining those meetings before we were rolling out the white paper, which kind of brings me back to my first piece of advice: I’d really suggest reading the white paper first, because it describes the problem and the trajectory of the project so well, and in a really clear way, that I think it’s super important context for anyone that wants to start contributing. And from there, I mean, I’m the community manager, but I started with small contributions that ended up supporting the community in terms of documentation and some other aspects of the project I was excited about and could contribute to. So I really think the contributions are dependent on what you’re interested in. And even if there’s some difference in opinion, perspective, or background, all of that can make a huge difference for the community, and anything from documentation to code or discussion and collaboration counts as a valid contribution and effort. So I’d say to anyone that’s wanting to join the Gemara community and start contributing: find an area that truly interests you and makes you excited, and get involved.

Sally Cooper (14:02)
Oh, that’s great. Well, thanks so much. And before we wrap, we’re going to do rapid fire, so I hope you’re ready, because this is the fun part. No overthinking, no explanations, just the first instinct that comes to you. I’m going to bounce back and forth and ask you both some questions. Ready?

Jenn Power (14:17)
I’ll close my eyes then.

Sally Cooper (14:25)
Okay, Hannah, you’re up first. Star Wars or Star Trek?

Hannah Braswell (14:29)
Star Wars.

Sally Cooper (14:30)
Nice, I love it.
And Jenn, same question, Star Wars or Star Trek?

Jenn Power (14:35)
Star Wars.

Sally Cooper (14:36)
Okay, we’re all friends here.
Okay, back to Hannah, coffee or tea?

Hannah Braswell (14:42)
Definitely coffee.

Sally Cooper (14:43)
Yay, cheers. That’s solid.
Jenn, morning person or night owl?

Jenn Power (14:49)
Night Owl.

Sally Cooper (14:50)
Ohh that tracks. Hannah, beach vacation or mountains?

Hannah Braswell (14:56)
Hmm beach vacation.

Sally Cooper (14:58)
Nice choice. Jenn, books or movies?

Jenn Power (15:02)
Movies.

Sally Cooper (15:03)
Nice. All right, last round. Hannah, favorite open source mascot?

Hannah Braswell (15:08)
Oh… Zarf. I think it looks like an axolotl. I used to be obsessed with axolotls. And I mean, ever since I saw that, I was like, that’s the mascot.

Sally Cooper (15:18)
I love Zarf too. Cool. Okay. That’s a really strong pick.
Jenn, I’m going to give you the same question. Favorite open source mascot?

Jenn Power (15:26)
I actually love the OpenSSF goose. I think it’s so cute.

Sally Cooper (15:30)
Teehee, honk! He’s the best. Okay, let’s bring it home. Hannah, sweet or savory?

Hannah Braswell (15:38)
Savory.

Sally Cooper (15:39)
Interesting. Okay, and Jenn? Spicy or mild?

Jenn Power (15:46)
Mild. I can’t handle any spice. I’m a baby.

Sally Cooper (15:51)
Love it. That’s amazing. Well, thank you both so much for playing along. And as we wind things down, do you have any other calls to action for our audience? If someone’s listening and they want to learn more or get involved, what is the best next step for them?

Jenn Power (16:05)
I would say read the white paper. We are looking for feedback on it, and it’s really a way to understand the philosophy and the architectural goals of Gemara. And if you’re thinking, hey, I want to learn GRC, that’s a good first step too. So I think that’s what I would say.

Sally Cooper (16:28)
Fantastic. Hannah, Jenn, thank you so much for your time today and for the work you’re doing for the open source security community. We appreciate you both. And to everyone listening, happy open sourcing and that’s a wrap.

What’s in the SOSS? Podcast #54 – S3E6 AIxCC Part 4 – Cyber Reasoning Systems: The Real-World Journey After AIxCC

By Podcast

Summary

In this final episode of our AI Cyber Challenge (AIxCC) series, CRob and Jeff Diecks wrap up the journey from DARPA’s groundbreaking two-year competition to the exciting collaborative phase happening now. Discover how winning teams are taking their AI-powered vulnerability detection systems into the real world, finding actual bugs in projects like the Linux kernel and CUPS. Learn about the innovative OSS-CRS project that aims to create a standard infrastructure for mixing and matching the best components from different systems, and hear valuable lessons about how to responsibly introduce AI-generated security findings to open source maintainers. The competition may be over, but the real work—and collaboration—is just beginning.

This episode is part 4 of a four-part series on AIxCC:

Conversation Highlights

00:00 – Welcome and Introduction to AIxCC
01:37 – OpenSSF’s AI Security Mission: Two Lenses
03:54 – Competition Highlights: What the Teams Discovered
07:43 – Real-World Impact: From Research to Production
10:44 – Lessons Learned: Working with Open Source Maintainers
13:13 – OSS-CRS: Building a Standard Infrastructure
14:29 – Breaking Down Walls: Post-Competition Collaboration
15:39 – How to Get Involved

Transcript

CRob (00:09.408)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF’s podcast where I get to talk to the most amazing people on the planet who are either involved in or on the outskirts of open source software and open source security. Today, we have a treat. We get to talk to one of my dear friends and teammates, Jeff, and we’re gonna dive into a topic that I really don’t know a lot about today.

So Jeff, why don’t you introduce yourself to the audience and kind of describe what you do for the foundation.

Jeff Diecks (00:44.686)
Yeah, thanks, CRob. Hello, I’m Jeff Diecks. I’m a technical project manager with OpenSSF, and I’ve been involved in open source for 20-plus years now. Goodness. I am OpenSSF’s lead on the AI Cyber Challenge program that we work on. And CRob is sort of telling you the truth: he’s been on the three episodes prior to this, where he’s learned plenty about AIxCC, but we’re here to talk a little bit more about this and wrap up the series today.

CRob (01:17.582)
Yeah, these words you use, AI, that isn’t something I hear a lot about. Wink. Could you maybe recap for us what the OpenSSF is doing around AI security, and then maybe give a brief recap of AIxCC?

Jeff Diecks (01:37.028)
Yeah, for sure. So for OpenSSF in the world of AI, we have our AI/ML Security Working Group that looks at security and AI through kind of two lenses. The first is AI for security, which is what we’ll be talking about here today: projects that help you use AI to improve the security of projects. The second is security for AI, which is securing this whole new world of AI things with all the lessons we’ve learned about securing software. AI is software too, and it needs securing. We have a whole suite of projects and work that focuses on that too. Specific to AIxCC, again, it’s the AI Cyber Challenge. It was a two-year competition run by DARPA and ARPA-H. If you’re hearing this episode first, I encourage you to go back to the first episode in this series with Andrew Carney from DARPA and ARPA-H for an overview of the program. And then we got into some good conversations with a couple of the team leads from some of the winning teams. But the purpose of the competition was to use AI and develop new systems to both find and fix vulnerabilities in open source software that’s important to our critical infrastructure. An interesting part about the competition: written into the rules, any of the competitors accepting prize money were obligated to release their software as open source.

CRob (03:07.214)
Nice. That’s awesome. Well, yeah, again, I say that a bit tongue-in-cheek, but there has been quite a lot of activity, whether it’s in the working group or specific to the follow-ons to the AIxCC competition, which is what we’re here about today: to kind of put a bow on these conversations and help encourage the community to engage and go forward. So we talked to a couple of the teams, and we talked to Andrew, who gave us an overview of the program.

From your perspective and your engagement with the community, we have a new cyber reasoning special interest group within the foundation. So what have the teams been up to since the August close of the competition?

Jeff Diecks (03:54.414)
Yeah, there’s really two parts of this, and I’ve had the great honor of meeting with and speaking with a lot of the teams and learning about what they’ve been doing. We first started with just conversations about their experiences with the competition and what they learned, similar to the couple of episodes that we did. And what was really interesting is that for every single team, there’s something of value that came out of their system. They each excelled in at least one specific area.

They were all finding bugs in real-world software. Just a couple of highlights from some of the other teams. Team Theori, which was among the three winners, the third-place winner, had a unique approach. Their system, unlike the others, did not use fuzzing. It used pure LLM-based AI. So just an interesting variation, and there’s potential there for that system to be super flexible, because it doesn’t come with some of the requirements of projects already being set up with fuzzing. So it’ll be interesting to see what becomes of their system there. And then in a couple of other cases, it was really interesting: teams that have systems that are extremely capable, but for one reason or another,

CRob (05:08.142)
Yeah.

Jeff Diecks (05:18.096)
there was maybe a specific part of their system that just didn’t work well with the scoring mechanisms of the competition itself. So we had a team that was one of the best, most effective, at generating patches for the issues that it found. But as these things go, there was kind of a late change in the architecture of their system during the competition, and the part of their system that was supposed to submit all these generated patches into the competition for scoring didn’t function correctly and didn’t submit everything. So they didn’t get credit for all their great work. But, well, it’s still a capable system, and now we can use it not for a competition, but for real stuff that’s potentially even more valuable. There was another that was great at generating proofs of vulnerability.

CRob (06:13.87)
Mm-hmm.

Jeff Diecks (06:15.328)
But they had made some assumptions based on the competition infrastructure, and the system just didn’t perform well within the confines of the competition. What was interesting about the architecture of their system was that they would generate a potential result that they thought might be a finding from part of their system, and then they would submit that potential result out to several other LLMs and have them feed back

CRob (06:24.718)
Mm-hmm.

Jeff Diecks (06:45.176)
a verdict on whether they thought it might be effective or not. We kind of made the joke that it was like getting eight out of nine dentists to agree before deciding on the submission. So those are some highlights from the competition. But you asked about what the teams have been up to in recent months, and that’s been really interesting. DARPA has kind of extended the incentives from their program, and they’re offering
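The "submit it to several other LLMs for a verdict" pattern Jeff describes is essentially majority voting over independent reviewers. A minimal sketch in Python, with stub functions standing in for the real model calls; the 8-of-9 threshold just echoes the dentist joke, not that team’s actual configuration:

```python
def consensus(candidate: str, reviewers, threshold: int) -> bool:
    """Accept a candidate finding only if enough reviewers vote yes."""
    votes = sum(1 for review in reviewers if review(candidate))
    return votes >= threshold

# Nine stub "reviewers": eight accept anything mentioning an overflow,
# one is a permanent skeptic. Real reviewers would be separate LLM calls.
reviewers = [lambda c: "overflow" in c for _ in range(8)] + [lambda c: False]

print(consensus("possible heap overflow in parser", reviewers, 8))  # True
print(consensus("stylistic nit in parser", reviewers, 8))           # False
```

The design point is that each reviewer sees the same evidence independently, so one model’s hallucinated finding is unlikely to survive the vote.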

CRob (06:48.846)
Hmm.

Jeff Diecks (07:13.552)
incentives and rewards for the teams now taking these systems and using them in the real world against real open source projects and demonstrating that they’re effective there. And if they can demonstrate that, they earn additional reward money, which is encouraging the adoption and the transition of this research into real world usage. So we’ve had some interesting findings there and a few examples there.

CRob (07:27.374)
Awesome.

Jeff Diecks (07:43.248)
We’ve got Team 42 that we’ve been working with. They’ve focused a lot on the Linux kernel, where their system seems to be very effective, specifically on some of the out-of-tree subsystems. They’ve found and reported several bugs and had some of their patches accepted. And actually, later this week, we’ve scheduled a consultation for that team with a kernel maintainer

to give them feedback and help their research move forward and any guidance they can give on how to make their system more effective. So we’re looking forward to that conversation.

CRob (08:23.36)
Excellent.

So today we have a mixture of projects that are being donated. They’re all open source, but some of them are being donated, like here, for example. Where would we go from here? What are the group’s thoughts about these cyber reasoning systems broadly, and what other interests or ideas are floating around to keep the momentum?

Jeff Diecks (08:52.91)
Yeah, there’s a few things. So one, OpenSSF is involved with the teams, and we’ve formed, as you mentioned, the special interest group, the Cyber Reasoning System Special Interest Group, as the kind of continued home from the competition for all the teams to keep collaborating and working together. Some interesting developments so far: we’ve hosted the teams presenting to one another their work with real open source projects, with examples of bugs they’ve found, the process they’ve been following, and how what they’ve been submitting has been received by open source projects. So for example, Team Fuzzing Brain, who has donated their CRS system for OpenSSF to host and support.

They’ve been working against a bunch of projects, but specifically they shared some examples of their work with the CUPS project, where they found some bugs, reported them, and had some patches accepted. And they’ve gotten great feedback from the CUPS maintainers, who are very appreciative of their work, both finding bugs and submitting patches, but also helping to generate and expand the fuzzing

CRob (09:56.408)
God.

Jeff Diecks (10:17.028)
harness coverage of the project itself, which the systems are pretty capable of. So we’ve been learning a lot about the reporting process, because it’s one thing to have these capable systems, but it’s a world where you and I and everybody else are a bit skeptical of pure AI things, right? So we’re working our way through and kind of learning from one another about

CRob (10:19.82)
very nice.

Jeff Diecks (10:44.078)
What’s the best way to keep humans in the mix, and how are projects receiving these things? What are some lessons learned? So for example, we had a conversation in a SIG meeting where we were talking about the patch submission process, and some of the projects were kind of reacting that it was perhaps a bit too aggressive to just go ahead and introduce yourself by submitting a patch

CRob (10:46.51)
Mm-hmm.

Jeff Diecks (11:12.784)
right into the pull request queue of a project. And the group suggested that maybe for the next go-round, a more polite way to introduce yourself is by opening an issue, reporting what was found, how it was found, and all the supporting information, and then attaching a patch to be considered, versus just, hey, here’s a PR, and by the way, it came from AI.

So some interesting.

CRob (11:43.022)
Right. And we’ve heard a lot of feedback from upstream about their disinterest in that approach.

Jeff Diecks (11:50.772)
Yeah, well, that was a big focus of the scoring of the competition itself. That was among the feedback that we gave consistently for a couple of years: make sure you’re incentivizing the development of systems that don’t make life more difficult for maintainers, but hopefully make things easier, and think about how these things will be received, not just their technical capabilities.

But you mentioned donated projects, and the one that I think is of real interest for folks to follow along with: Team Atlanta led the development of a project that we’re calling OSS-CRS, and bundled in with it, they intend to have something called CRS Benchmark.

And what these are for: it intends to be a standard infrastructure for building, running, and evaluating all these CRSs, and being able to kind of mix and match and use different parts of different ones for a combined solution.

So, you know, think of a future where we’ve got a system like we talked about that’s most effective at generating patches, but we’ve got a different one that’s best at finding vulnerabilities.

CRob (12:58.755)
Wow.

Jeff Diecks (13:13.488)
And the hope is that through this standard interface, folks can leverage and kind of fine-tune things to get the best performance and the best results out of a combination of systems, rather than relying upon a single one. You can think of it in terms of the way it’s intended to run: if you just imagine yourself at a command-line prompt, you issue an OSS-bugfind-CRS build, then give it a configuration and a compatible project, and that’ll build a system to run against. And then you can issue the same kind of thing, OSS-bugfind-CRS run, with a config, a project, and the name of a harness, and it’ll go off and do its thing. So again, you’re specifying which configurations you want to use, which subsystems you want it to

CRob (14:01.666)
Mm-hmm.

Jeff Diecks (14:14.01)
pull from. So they’ve got an interesting roadmap. You know, we’re talking with them and, you know, hoping to bring our community and perspective to help support that project and its development, you know, and adoption into the real world.

CRob (14:29.176)
And I remember us talking around DEF CON last year. And I think the competition and the prize money are great. That was very exciting. But I’m most excited about this kind of phase we’re in now, where we’re seeing the teams with that ethical wall down between them from the competition. Now they’re actually able to talk and collaborate and share ideas. I’m really excited to see the community come together, helping support these students on these ideas.

Jeff Diecks (14:58.916)
Yeah, and that’s been the interesting part, and part of why it’s taken us a bit to get this whole podcast series out. Through the course of the competition, for competition-integrity reasons, we were advising the competition organizers, but we weren’t interacting with the teams themselves. So we had to go through a whole process after the finals to introduce ourselves to all of the individual teams and let them know that we’re here and about, and the things we offer to help support them in the further development of their systems. It’s been an interesting few months of lots of great conversations, and seeing these teams come together within our working group and special interest group.

CRob (15:39.842)
So you’ve inspired me. I sure would like to know more on how to get involved. How can I do that?

Jeff Diecks (15:41.488)
Ha

Well, if your Monday afternoons at 1 p.m. Eastern are free, we have two different meetings that are basically in that time slot on alternating weeks. Our full AI/ML Security Working Group meets on Mondays at 1 p.m. Eastern time on a biweekly basis, and on the alternating weeks, the Cyber Reasoning System Special Interest Group meets in that time slot. You can find them.

CRob (16:09.326)
Peace.

Jeff Diecks (16:11.844)
You know, both of those meeting series are on our community calendar at openssf.org/calendar.

CRob (16:18.958)
Well, I want to thank you for helping shepherd and guide the folks in the competition. We’re seeing some great results come out of this, and I’m really excited to see what our community and these amazing students come up with to further the use of AI to help improve security. And with that, we’ll say this is a wrap.

Jeff Diecks (16:42.308)
Sounds good. Thanks, CRob, and we’ll see you in the meetings.

CRob (16:48.911)
I for one welcome our new robot overlords and I wish everybody a happy open sourcing. Have a great day.

What’s in the SOSS? Podcast #53 – S3E5 AIxCC Part 3 – Buttercup’s Hybrid Approach: Trail of Bits’ Journey to Second Place in AIxCC

By Podcast

Summary

In the third episode of our AI Cyber Challenge (AIxCC) series, CRob sits down with Michael Brown, Principal Security Engineer at Trail of Bits, to discuss their runner-up cybersecurity reasoning system, Buttercup. Michael shares how their team took a hybrid approach – combining large language models with conventional software analysis tools like fuzzers – to create a system that exceeded even their own expectations. Learn how Trail of Bits made Buttercup fully open source and accessible to run on a laptop, their commitment to ongoing maintenance with prize winnings, and why they believe AI works best when applied to small, focused problems rather than trying to solve everything at once.

This episode is part 3 of a four-part series on AIxCC:

Conversation Highlights

00:04 – Introduction & Welcome
00:12 – About Trail of Bits & Open Source Commitment
03:16 – Buttercup: Second Place in AIxCC
04:20 – The Hybrid Approach Strategy
06:45 – From Skeptic to Believer
09:28 – Surprises & Vindication During Competition
11:36 – Multi-Agent Patching Success
14:46 – Post-Competition Plans
15:26 – Making Buttercup Run on a Laptop
18:22 – The Giant Check & DEF CON
18:59 – How to Access Buttercup on GitHub
21:37 – Enterprise Deployment & Community Support
22:23 – Closing Remarks

Transcript

CRob (00:04.328)
And next up, we’re talking to Michael Brown from Trail of Bits. Michael, welcome to What’s in the SOSS.

Michael Brown (ToB) (00:10.688)
Hey, thanks for having me. I appreciate being here.

CRob (00:12.7)
We love having you. So maybe could you describe a little bit about your organization you’re coming from, Trail of Bits, and maybe share a little insight into what your open source origin story is.

Michael Brown (ToB) (00:23.756)
Yeah, sure. So Trail of Bits is a small business. We’re a security R&D firm. We’ve been in existence since about 2012, and I’ve personally been with the company about four-plus years. I work within our research and engineering department. I’m a principal security engineer, and I also lead up our AI/ML security research team. So at Trail of Bits, we do quite a bit of government research, and we also work for commercial clients.

And one of the common threads in all of the work that we do, not just government, not just commercial, is that we try to make as much of it public as our clients possibly allow. Sometimes, for example, we work on sensitive research programs for the government, and they don’t let us make it public. Sometimes our commercial clients don’t want to publicize the results of every security audit. But to the maximum extent that our clients allow, we make our tools and our findings open source. And we’re really big believers

that the work that we do should be a rising tide that raises all ships when it comes to the security posture of the critical infrastructure that we all depend upon, whether we’re working on hobbies at home or building things for large organizations.

CRob (01:37.32)
love it. And how did you get into open source?

Michael Brown (ToB) (01:42.146)
Honestly, I’ve just kind of always been there. Realistically, you know, the open source community is where you found a lot of the research tools that I started my research career with. I started off a bit in academia: I got my undergrad in computer science and then went and did something completely different for eight years.

For context, I joined the military and flew helicopters for like eight years, doing basically nothing in computing. But as I was starting to get out of the Army, you know, I was getting married, about to have kids, and I kind of decided I wanted to be around the house a little bit more often. So I started getting a master’s degree at Georgia Tech, which they offer online, and after I did that, I went to do a PhD there and also work for their applied research arm, the Georgia Tech Research Institute.

So a lot of the work that I was doing was cutting-edge work on software analysis, compilers, and AI/ML. And a lot of the tools that I built my research on came from the open source community. They were tools that were open sourced as part of the publication process for academic work, or made publicly available as open source by companies like Trail of Bits, before I came to work with them, as the result of government research projects.

So, honestly, I guess I don’t really have much of an origin story for when I got there. I kind of just landed there when I started my career in security research and just stayed.

CRob (03:16.814)
Everybody has a different journey that gets us here. And interestingly enough, you mentioned our friends at Georgia Tech, which was a peer competitor of yours in the AIxCC competition. You all on the Trail of Bits team, your project, I believe, was called Buttercup, and you came in second place. You had some amazing results with your work. So maybe could you tell us a little bit about the…

Michael Brown (ToB) (03:33.741)
Yeah, that’s correct.

CRob (03:43.15)
what you did as part of the AIxCC competition, and kind of how your team approached this.

Michael Brown (ToB) (03:51.022)
Yeah. So, at the risk of sounding a bit like a hipster, I’ve been working at the intersection of software security, compilers, software analysis, and AI/ML for basically almost my entire career as a research scientist. You know, dating back to the earliest program I worked on for DARPA, back in 2019. And this was before the large language model was the predominant form of the technology, or kind of became synonymous with AI. So.

CRob (04:04.719)
Mm.

Michael Brown (ToB) (04:20.792)
For a long time, I’ve been working on and trying to understand how we can apply techniques from AI/ML modeling to security problems, and doing the problem formulation to make sure that we’re applying them in an intelligent way, where we’re going to get good, solid results that actually generalize and scale. So as the large language model came out, we started recognizing that certain problems within the security domain are good for large language models, but a lot of them aren’t.

When the AI Cyber Challenge came around, I was the lead designer, along with my co-designer, Ian Smith. When we sat down and made the original concept for what became Buttercup, we always took an approach where we were going to use the best problem-solving technique for the sub-problem at hand. So when we approached this giant elephant of a problem, we did what you do when you have an elephant and you’ve got to eat it: eat it one bite at a time.

So for each bite, we took a look at it and said, okay, we have these five or six things that we have to do really, really well to win this competition. What’s the best way to solve each of them? And then the rest of it became an engineering challenge to chain them together. Our approach is very much a hybrid approach. A similar approach was taken by the first-place winners at Georgia Tech, and by the way, if you’ve got to be beat by anybody, being beat by your alma mater takes a little bit of the sting out of it. So, you know, we came in first and second place. It’s funny, I actually have another Georgia Tech PhD alumnus

CRob (05:33.832)
You

Michael Brown (ToB) (05:42.926)
on my team who worked on Buttercup. So Georgia Tech is very well represented in the AI cyber challenge. So yeah, we’ve always had a hybrid approach. The winning team had a hybrid approach. So we used AI where it was useful. We used conventional software analysis techniques where they were useful. And we put together something that ultimately performed really, really well and exceeded even my expectations.

CRob (05:45.458)
That’s awesome.

CRob (06:07.56)
As I’ve mentioned in previous talks, I was initially skeptical about the value that could be derived from this type of work. But the results that you and the other competitors delivered were absolutely stunning. You have converted me into a believer now: I think AI absolutely has a very positive role to play, both in the research space and in the vulnerability and operations management space.

Looking at Buttercup, what is unique about your approach with your cyber reasoning system?

Michael Brown (ToB) (06:45.39)
Yeah, it’s funny you say that we converted you. I kind of had to convert myself along the way. There was a time in this competition where I thought this whole thing was going to be too reliant on AI and was going to fall on its face, and at that point, I’d be able to say, see, I told you, you can’t use LLMs for everything. But then it turns out, as we got through it, we used LLMs for two critical areas, and they worked much better than I thought they would. I thought they would work pretty well, but they ended up working to a much better degree than I expected. So, you know, what makes Buttercup unique?

CRob (06:49.852)
Yeah.

CRob (07:00.678)
You

Michael Brown (ToB) (07:15.69)
is that, like I said, we take a hybrid approach. We use AI/ML for problems that are well-suited for AI/ML. And what I mean by that is, when we employ large language models, we use them on small subproblems for which we have a lot of context. We have tools that we can install for the large language model to use, to ensure that it creates valid outputs, outputs that can carry on to the next stage with a high degree of confidence that they’re correct.

CRob (07:30.076)
Mm-hmm.

CRob (07:43.912)
Mm-hmm.

Michael Brown (ToB) (07:45.934)
And then in the places where we have to use conventional software analysis tools, those areas are very amenable to conventional analysis. A good example: we needed to produce a proof of vulnerability. We have to have a crashing test case to show that when we claim a vulnerability exists in a system, we can prove through reproduction that it actually exists. Large language models aren’t great

at finding these crashing test cases just by looking at the code when you ask, hey, what’s going to crash this? They don’t do very well at that. They also don’t do well at generating an input that will even get you to a particular point in a program. But fuzzers do a great job of this, so we used a fuzzer. One of the things about fuzzers, though, is they kind of take a long time, and they’re also more generally aimed at finding bugs, not necessarily vulnerabilities.

CRob (08:36.808)
Mm-hmm.

Michael Brown (ToB) (08:42.702)
So we used an AI/ML, or large language model based, accelerator, a seed generator, to help us generate inputs that would guide the fuzzer: either saturating the fuzzing harnesses that existed for these programs more quickly, or shaking loose more crashing inputs that correspond to vulnerabilities as opposed to bugs. Those things really helped us deal with some of the short analysis and processing windows that we encountered in the AI Cyber Challenge. So it was really a matter of using conventional tools but making them work better with AI, or using AI for problems like generating software patches, for which there really aren’t great conventional software analysis tools.
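The seed-generation idea described here can be sketched roughly as follows. This is a minimal illustration, not Buttercup’s actual interfaces: the `query_llm` stub, the prompt wording, and the corpus layout are all assumptions standing in for a real model client and fuzzer setup.

```python
# Sketch: ask a model for plausible inputs for a fuzz harness, then drop
# them into the fuzzer's seed corpus directory so fuzzing starts from
# meaningful inputs instead of random bytes.
import json
import pathlib
import tempfile

def query_llm(prompt: str) -> str:
    # Stand-in for a real model call (e.g. an OpenAI/Anthropic client).
    # A real system would send `prompt` and parse the completion.
    return json.dumps(["GET / HTTP/1.1\r\n\r\n", "POST /a HTTP/1.0\r\n\r\n"])

def seed_corpus(harness_source: str, corpus_dir: pathlib.Path) -> int:
    """Ask the model for candidate inputs and write them as seed files."""
    prompt = (
        "Given this fuzz harness, propose initial inputs as a JSON list "
        "of strings:\n" + harness_source
    )
    seeds = json.loads(query_llm(prompt))
    corpus_dir.mkdir(parents=True, exist_ok=True)
    for i, seed in enumerate(seeds):
        (corpus_dir / f"llm_seed_{i}").write_bytes(seed.encode())
    return len(seeds)

corpus = pathlib.Path(tempfile.mkdtemp()) / "corpus"
n = seed_corpus("int LLVMFuzzerTestOneInput(const uint8_t *d, size_t n);", corpus)
```

A fuzzer such as libFuzzer or AFL++ would then be pointed at `corpus` as its input directory; the LLM only proposes seeds, while the fuzzer does the mutation and crash detection.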

CRob (09:28.018)
So as you were going through the competition, which went through multiple rounds, was there anything that surprised you or that you learned? You said your opinion changed on using AI. What were some of the moments that generated that?

Michael Brown (ToB) (09:45.226)
Yeah, so there were a couple of them. I’ll start with one where I can pat myself on the back, and I’ll finish with one where I was kind of surprised. So first, we had a couple of moments that were really vindicating as we went through this. Our opinion going into this was that with large language models, you couldn’t just throw the whole problem at them and expect them to be successful. So there were a lot of things we did

CRob (09:49.405)
hehe

Michael Brown (ToB) (10:14.774)
two years ago when we first started out, and two years ago is like five lifetimes when it comes to the development of AI systems now. So there were some things we did that didn’t exist before and that became industry standard by the time we finished the competition. Things like putting your LLM queries, your LLM prompts, in a workflow that includes validation with tools or the ability to use tools.

CRob (10:29.298)
Mm-hmm.

Michael Brown (ToB) (10:43.062)
That was something that was not mainstream when we first started out, but we built it custom into Buttercup, particularly when it came to patching. And then also using a multi-agent approach. A lot of the hype around AI is that you just ask it anything and it gives you the answer. But we’re asking a lot of AI when we say: here’s a program, tell me what vulnerabilities exist, prove they exist, and then fix them for me.

And also don’t make a mistake anywhere along the way. It’s way too much to ask. We found that particularly with patching. Back then, multi-agent systems, or even agentic systems, were unheard of. We were still using ChatGPT 3.5, still very much chatbot interactions, web browser interactions.
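The “LLM in a validated workflow” pattern described here can be sketched as a small loop: each candidate the model proposes is checked by a concrete tool, and the model is re-queried on failure. This is a toy, not Buttercup’s patching agent; `propose_patch` is a stub standing in for a real model call, and the “test suite” is a single assertion.

```python
# Sketch: never trust a single completion -- validate each candidate
# patch with a tool (here, executing it and running a check), and retry
# on failure so only validated output moves to the next stage.

def propose_patch(attempt: int) -> str:
    # Stub: a real agent would send the buggy code plus the failing test
    # output back to the model. Here the second attempt "succeeds".
    if attempt > 0:
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"

def validate(patch_src: str) -> bool:
    # Tool-based validation: load the candidate and run a concrete test.
    ns: dict = {}
    exec(patch_src, ns)           # compile/load the candidate patch
    return ns["add"](2, 3) == 5   # the "test suite"

def patch_loop(max_attempts: int = 3):
    for attempt in range(max_attempts):
        candidate = propose_patch(attempt)
        if validate(candidate):
            return candidate      # only validated output is accepted
    return None

result = patch_loop()
```

In a real system the validator would compile the patched program, re-run the crashing input, and run the project’s tests; the structure of the loop stays the same.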

CRob (11:16.564)
Yeah.

Michael Brown (ToB) (11:36.438)
integration into tools was certainly less widespread. We had seen some very early work on arXiv about solving complex problems with multiple agents, breaking the problem down for the model. We used this, and our patcher ended up being incredibly good. It was our most important and biggest success on the project. I really want to shout out Ricardo Charon, the lead developer for our patching agent

CRob (11:47.976)
Mm-hmm.

Michael Brown (ToB) (12:06.414)
and our patching system, for both the semifinals and finals in the AIxCC. He did an incredible job, and we really built something that, like I said, I regard as our biggest success. So sure enough, as we go through this two-year competition, now all of a sudden multi-agent, tool-enabled systems are all the rage. This is how we’re solving these challenging problems. And a lot of this problem-breakdown stuff has been baked into the models now, the newer thinking and reasoning models from

Anthropic and OpenAI. You can give them these large, complicated problems and they will first try to break them down before trying to solve them. So we were building all that stuff into our system before it came about. That’s an area where, like I said, we learned along the way that we had the right approach from the beginning, and it’s really easy to go back and say that what we learned is that we were right. On the other side of this, I’ll reiterate, I was really surprised at how well

CRob (12:53.639)
Mm-hmm.

Michael Brown (ToB) (13:04.11)
language models were able to do some of the tasks we asked of them. Part of it is how we approached the problem: we didn’t ask too much of them, and I think that’s part of the reason the large language models were successful. An area I thought was going to be much more challenging was patching. But it turned out to be an area where, to a certain degree, this is an easier version of the problem in general, because open source software, which made up the targets of the AI Cyber Challenge, is ingested into the training

CRob (13:08.924)
Mm.

Michael Brown (ToB) (13:31.404)
data for all of these large language models. So the models do have some a priori familiarity with the targets. When we give one a chunk of vulnerable code from a given program, it’s not the first time it has seen the code. But still, they did an amazing job actually generating useful patches. The patch rate I personally expected to see was much lower than the actual patch rate we had, both in the semifinals and in the finals. So even in that first-year window,

CRob (13:33.64)
Mm.

Michael Brown (ToB) (13:58.63)
I was really kind of blown away with how well the models were doing at code generation tasks, particularly small, focused code generation tasks. So I think large language models are getting a bad rap right now when it comes to trying to vibe code entire applications. People say, gosh, this code is slop, it’s terrible, it’s full of bugs. Well, you did also ask it to build the whole thing. If I asked a junior developer to build the whole thing, they’d probably also put together some

CRob (14:07.366)
Yeah.

CRob (14:17.233)
Yeah.

CRob (14:26.258)
Yeah.

Michael Brown (ToB) (14:26.71)
pretty gross stuff. But when I ask a junior developer for a bug fix, much like the large language model, when I ask for a more constrained version of the problem, they tend to do a better job because there are just fewer moving parts. So yeah, those are the two things I took away. One where, like I said, I get to pat myself on the back, and another that was actually surprising.

CRob (14:46.012)
That’s awesome. That’s amazing. So now that the competition is over, what does the team plan to do next?

Michael Brown (ToB) (14:57.098)
Yeah, so look, we spent a lot of our time over the last two years on this. I wouldn’t quite say blood, I don’t think anyone bled over this, but we certainly had some tears, and we certainly had a lot of anxiety. We put a lot of ourselves into Buttercup, and we want people to use it. So to that end, Buttercup is fully available and fully open source. DARPA made it a condition of participating in the competition that

CRob (15:09.917)
Mm-hmm.

Michael Brown (ToB) (15:26.892)
you had to make the code that you submitted to the semifinals and the finals open source. So we did that, along with all of our other competitors, but we actually took it one step further. The code that we submitted to the finals is great, it’s awesome, but it runs at scale. It used $40,000 of a $130,000, I think, total budget, and it ran across an Azure subscription with multiple nodes and

countless replicated containers. This is not something that everyone can use, and we want everyone to use it. So in the month after we submitted our final version of the CRS, but before DEF CON occurred, where we found out that we won, we spent a month making a version of Buttercup that’s decoupled from DARPA’s competition infrastructure. It runs entirely standalone, but more importantly, we scaled it down so it’ll run on a laptop.

CRob (16:18.696)
Mm-hmm.

Michael Brown (ToB) (16:25.154)
We left all of the hooks, all of the infrastructure to scale it back up if you want. So the idea now is that if you go to trailofbits.com/buttercup, you can learn about the tool. We have links to our GitHub repositories where it’s available, and you can go download Buttercup on your laptop and run it right now. And if you’ve got an API key that’ll let you spend a hundred dollars, we can run a demo to show you that we can find and patch a vulnerability live.

CRob (16:51.496)
That’s easy.

Michael Brown (ToB) (16:53.164)
Yeah, so anyone can do this right now. And if you’re an organization that wants to use Buttercup, you can also use the hooks we left in to scale it up to the size of your organization and the budget that you have, and run it at scale on your own software targets. So even users beyond the open source community: we want this to be used on closed source code too. So yeah, you asked what we’re going to do with it afterward. We made it open source; we want people to use it.

And even on top of that, we don’t want it to bit-rot. So we’re actually going to retain a pretty significant portion of our winnings, of our $3 million prize, and we’re going to use it for ongoing maintenance. So we’re maintaining it. We’ve had people submit PRs that we’ve accepted. They’re tiny, it’s only been out for about a month, but we’ve also made quite a few updates to the public version of Buttercup since. So it’s actively maintained.

The company is putting its money where its mouth is. We’re actively maintaining it, and the people who built it are among the people maintaining it. We are taking contributions from the community, and we hope they help us maintain it as well. We’ve made it so anyone can use it. I think we’ve taken it about as far as we can possibly go in reducing the barriers to adoption to the absolute minimum for people to use Buttercup and leverage AI to help them find and patch vulnerabilities at scale.

CRob (18:16.716)
I love that approach. Thank you for doing that. How did you fit the giant check through the teller window?

Michael Brown (ToB) (18:22.574)
Fortunately, that check was a novelty, so we did not actually face a problem larger than the AIxCC itself afterward, namely getting paid. But yeah, we did have the comically large check taped up in our booth at the AIxCC village at DEF CON, and it certainly attracted quite a few photographs from passersby.

CRob (18:26.716)
Ha ha ha!

CRob (18:31.964)
Yeah.

CRob (18:37.864)
Mm-hmm.

Michael Brown (ToB) (18:47.736)
I don’t know, I think if you get on social media and you look up AIxCC, there are probably lots of pictures of me with a big smile and two thumbs up underneath the check that random people took.

CRob (18:59.464)
So you mentioned that Buttercup is all open source now. So if someone was interested in checking it out or possibly even contributing, where would they go do that?

Michael Brown (ToB) (19:07.564)
Yeah, so we have a GitHub organization, Trail of Bits. You can find Buttercup there. You can also find our public archives of the old versions of Buttercup. So if you’re interested in the code that actually won the competitions, you can see what got us from the semifinals to the finals, you can see what won us second place in the finals, and you can also download and use the version that’s actively maintained and will run on your laptop. All three of them are there. The repository name is just Buttercup.

We are not the only people who love The Princess Bride, so there are other repositories named Buttercup on GitHub. You might have to sift a little bit, but it’s basically github.com/trailofbits/buttercup, I think, I don’t have it memorized. You can find it publicly available, along with a lot of other tools that Trail of Bits has made over the years. We encourage you to check some of those out as well. A lot of those are still actively maintained.

CRob (19:39.036)
That’s what it was.

Michael Brown (ToB) (20:03.72)
They have a lot of community support. Believe it or not, at last count, something like 1,250 stars, Buttercup is only about our fifth most popular tool that Trail of Bits has created. We were quite notable for creating some binary lifting tools that are up there. We also have some other tools that we’ve created recently for parser security, analysis like that, like Graphtage.

And then some more conventional security tools, like Algo VPN, still rank above Buttercup. So as awesome as Buttercup is, it’s only the fifth coolest tool that we’ve made, as voted on by the community. So check out the other stuff while you’re there too. Believe it or not, Buttercup isn’t our most popular offering.

CRob (20:51.56)
That’s a pretty awesome statement to be able to make: that’s only our fifth most popular tool.

Michael Brown (ToB) (20:53.966)
Yeah.

Michael Brown (ToB) (20:58.444)
I don’t know, personally I’m kind of hoping that maybe we move up a few notches after people get time to go find it and star it. But we’ve made some other really significant and really awesome contributions to the community, even outside of the AI Cyber Challenge. So I really want to stress that all of that stuff is open source. We aren’t just doing this because we have to; we actually care about the open source community. We want to secure the software infrastructure. We want people to use the tool and secure their software before they

get it out there, so that we tackle this kind of untackable problem of securing this massive ecosystem of code.

CRob (21:37.606)
Michael, thank you to Trail of Bits and your whole team for all the work you do, including the competition runner-up Buttercup, which did an amazing job. Thank you for all your work, and thank you for joining us today.

Michael Brown (ToB) (21:52.802)
Yeah, thanks for having me. One last thing to shout out: if you’re an organization looking to employ Buttercup, don’t be bashful about reaching out to us and asking about use cases for deploying it within your organization. We’re happy to help out there. That’s probably an area we focus on a little bit less, compared with getting this out the door for individuals to use. So we’re definitely interested in helping to make sure Buttercup gets used.

Like I said, reach out to us, talk to us if you’re interested in Buttercup. We want to hear from you.

CRob (22:23.44)
Love it. All right. Have a great day.

Michael Brown (ToB) (22:25.678)
All right, thanks a lot.

What’s in the SOSS? Podcast #52 – S3E4 AIxCC Part 2 – From Skeptics to Believers: How Team Atlanta Won AIxCC by Combining Traditional Security with LLMs

By Podcast

Summary

In this 2nd episode in our series on DARPA’s AI Cyber Challenge (AIxCC), CRob sits down with Professor Taesoo Kim from Georgia Tech to discuss Team Atlanta’s journey to victory. Kim shares how his team – comprised of academics, world-class hackers, and Samsung engineers – initially skeptical of AI tools, underwent a complete mindset shift during the competition. He shares how they successfully augmented traditional security techniques like fuzzing and symbolic execution with LLM capabilities to find vulnerabilities in large-scale open source projects. Kim also reveals exciting post-competition developments, including commercialization efforts in smart contract auditing and plans to make their winning CRS accessible to the broader security community through integration with OSS-Fuzz.

This episode is part 2 of a four-part series on AIxCC:

Conversation Highlights

00:00 – Introduction
00:37 – Team Atlanta’s Background and Competition Strategy
03:43 – The Key to Victory: Combining Traditional and Modern Techniques
05:22 – Proof of Vulnerability vs. Finding Bugs
06:55 – The Mindset Shift: From AI Skeptics to Believers
09:46 – Overcoming Scalability Challenges with LLMs
10:53 – Post-Competition Plans and Commercialization
12:25 – Smart Contract Auditing Applications
14:20 – Making the CRS Accessible to the Community
16:32 – Student Experience and Research Impact
20:18 – Getting Started: Contributing to the Open Source CRS
22:25 – Real-World Adoption and Industry Impact
24:54 – The Future of AI-Powered Security Competitions

Transcript

Intro music & intro clip (00:00)

CRob (00:10.032)
All right, I’m very excited to talk to our next guest. I have Taesoo Kim, who is a professor down at Georgia Tech and also works with Samsung. And he got the great opportunity to help shepherd Team Atlanta to victory in the AIxCC competition. Thank you for joining us. It’s a real pleasure to meet you.

Taesoo Kim (00:35.064)
Thank you for having me.

CRob (00:37.766)
So we were doing a bunch of conversations around the competition. I really want to showcase like the amazing early cutting edge work that you and the team have put together. So maybe, can you tell us what was your team’s approach? What was your strategy as you were kind of approaching the competition?

Taesoo Kim (00:59.858)
That’s a great question. Let me start with a little bit of background.

CRob (00:)
Please.

Taesoo Kim (00:59)
Our team, Team Atlanta, is a group of people from various backgrounds, including me, as an academic and researcher in the security area. We also have world-class hackers on our team, and some engineers from Samsung as well. So we have backgrounds in various areas, and we bring that expertise

Taesoo Kim (01:29.176)
to compete in this competition. It’s been a two-year journey. We put in a lot of effort, not just on the engineering side; we also tinkered with a lot of research approaches we’ve been working on in this area for a while. That said, I think the most important strategy our team took is that, although it’s an AI competition…

CRob (01:51.59)
Mm-hmm.

Taesoo Kim (01:58.966)
…meaning that they promote the adoption of LLM-like techniques, we didn’t simply give up on the traditional analysis techniques that we are familiar with. We put a lot of effort into improving them: fuzzing, which is one of the great dynamic testing techniques for finding vulnerabilities, and also traditional techniques like symbolic execution and concolic execution, even directed fuzzing. We do suffer from a lot of scalability issues in those tools, because one of the themes of AIxCC is to find bugs in the real world,

in large-scale open source projects. Most of the traditional techniques do not scale to that level. We can analyze one function or a small amount of code in a source repository, but when it comes to, for example, Linux or Nginx, that’s a crazy amount of source code. Building a whole graph of that gigantic repository is itself extremely hard. So we started augmenting our pipeline with LLMs.

One of the great examples is fuzzing. When we are mutating input, we leverage a lot of mutation techniques on the fuzzing side, but we also leverage the understanding of the LLM: it navigates the possible mutation points in the source code, it can generate dictionaries, providing vocabulary for the fuzzer, and it recognizes the input format that has to be mutated as well. So a lot of augmentation using LLMs happens all over the place in the traditional software analysis techniques we use.
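The dictionary idea mentioned here can be sketched as follows: have a model read the parsing code and emit the tokens it expects, then write them in the AFL-style dictionary format so the fuzzer’s mutations favor meaningful keywords. This is an illustrative sketch, not Team Atlanta’s pipeline; `ask_llm_for_tokens` is a stub standing in for a real model call.

```python
# Sketch: turn LLM-proposed tokens into an AFL/libFuzzer dictionary file.
import pathlib
import tempfile

def ask_llm_for_tokens(source: str) -> list[str]:
    # Stub: a real call would prompt a model with the parser's source code
    # and ask for the keywords/delimiters it handles.
    return ["Content-Length:", "HTTP/1.1", "\r\n\r\n"]

def write_fuzz_dict(source: str, path: pathlib.Path) -> int:
    tokens = ask_llm_for_tokens(source)
    lines = []
    for i, tok in enumerate(tokens):
        # AFL dict syntax: name="value", with bytes \x-escaped.
        hexed = tok.encode().hex()
        value = "".join(f"\\x{hexed[j:j+2]}" for j in range(0, len(hexed), 2))
        lines.append(f'llm_token_{i}="{value}"')
    path.write_text("\n".join(lines) + "\n")
    return len(tokens)

dict_path = pathlib.Path(tempfile.mkdtemp()) / "http.dict"
count = write_fuzz_dict("/* hypothetical http request parser source */", dict_path)
```

The resulting file would be passed to the fuzzer (e.g. `afl-fuzz -x http.dict …` or libFuzzer’s `-dict=` flag) so mutation inserts whole protocol keywords instead of discovering them byte by byte.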

CRob (03:43.332)
And do you feel that combination, using some of the newer techniques alongside fuzzing and some of the older, more traditional techniques, was what was unique and helped push you over the victory line in the cyber reasoning challenge?

Taesoo Kim (04:01.26)
It’s extremely hard to say which one contributed the most during the competition. But I want to emphasize that finding a bug at a location in the source code versus formulating an input that triggers that vulnerability, what in our competition we call a proof of vulnerability, these are two completely different tasks. You can identify many bugs.

But in order to say this is truly a bug, you have to prove it yourself by constructing the input that triggers the vulnerability. I would say people do not comprehend the relative difficulty of formulating an input versus finding a vulnerability in the source code. You can pinpoint, without much difficulty, the various suspicious places in the source code.

But in fact, that’s the easier job. In practice, the more difficult challenge is identifying the input that actually reaches the place you want and triggers the vulnerability as a result. So we spent much more time on how to construct the input correctly, to show that we really prove the existence of the vulnerability in the source.
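The distinction drawn here can be shown with a toy example: spotting a suspicious line is not the same as producing an input that reaches and triggers it. The vulnerable function below is invented for illustration, not taken from any competition target.

```python
# Toy: "finding" a bug vs. a proof of vulnerability (PoV).
def parse(data: bytes) -> int:
    if data[:2] != b"OK":        # input must first pass validation
        return -1
    length = data[2]
    buf = bytearray(8)
    for i in range(length):      # the "spotted" bug: no bound check
        buf[i] = 0               # raises IndexError when length > 8
    return length

# Static review can flag the unchecked loop above. A proof of
# vulnerability is the concrete input that both passes the validation
# and drives execution into the out-of-bounds write:
pov = b"OK" + bytes([200])
try:
    parse(pov)
    crashed = False
except IndexError:
    crashed = True               # crash reproduced -> bug proven
```

Even in this tiny case the PoV must satisfy the `b"OK"` prefix check before the bug is reachable; in a real target there may be dozens of such conditions, which is why constructing the input is the hard part.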

CRob (05:09.692)
Mm-hmm.

CRob (05:22.94)
And I think that’s really a key to the competition as it happened, versus just someone running LLMs and scanners randomly on the internet: you all were incentivized and required to develop the fix and actually prove that these things are really vulnerable and reachable.

Taesoo Kim (05:33.718)
Exactly.

Taesoo Kim (05:42.356)
Exactly. That also highlights what practitioners care about. If you end up with too many false positives in a security tool, no one cares. There are a lot of complaints about why people are not using security tools in the first place. So this is one of the important criteria of the competition. And one of the strengths of traditional tools, like a fuzzer or a concolic executor, is that everything centers around reducing false positives. That’s the reason people

CRob (05:46.192)
Yes.

Taesoo Kim (06:12.258)
take fuzzers into their workflow. Whenever a fuzzer says there is a vulnerability, indeed there is a vulnerability. That’s a huge difference. So we started with those existing tools and recognized the places we had to improve, so that we could really scale those traditional tools up to find vulnerabilities in this large-scale software.

CRob (06:36.568)
Awesome. As you know, the competition was a marathon, not a sprint, so you were doing this for quite some time. But as the competition progressed, was there anything that surprised you and the team and changed your thinking about the capabilities of these tools?

Taesoo Kim (06:51.502)
Ha

Taesoo Kim (06:55.704)
So as I mentioned before, we are hackers. We won DEF CON CTF many times, and we also won the F1 competition in the past. So by nature, we were extremely skeptical about AI tools at the beginning of the competition. Two years ago, we evaluated every single existing LLM service with a benchmark that we designed, and we realized they were not usable at all,

CRob (07:09.85)
Mm-hmm.

Taesoo Kim (07:24.33)
not appropriate for the competition. Instead of spending time improving those tools, which felt inferior at the beginning, our team’s motto at that time was: don’t touch those areas. We’re going to show you how powerful these traditional techniques are. That’s how we progressed through the semifinal. We did pretty well; we found many of the bugs using all the traditional tools we’d been working on. But…

Immediately after the semifinal, everything changed. We reevaluated the possibility of adopting LLMs. Earlier, if you just removed or obfuscated some of the tokens in a repository, the LLM couldn’t reason about it at all. But suddenly, around the semifinal, something happened. We realized that even after we injected changes…

Think of it this way: there is a token, and you replace this token with meaningless words. Previously, the LLM was completely confused by the syntactic structure of such source code, but now, around the semifinal, it really understood. Although we tried to fool it many times, it really caught the idea, on source code it had never seen before, never used in training, because we intentionally created this source code for the evaluation.

We started realizing that it actually understands. It shocked everybody. And we started realizing that, if that’s the case, there are so many places we can improve. That’s the moment we changed our mindset. Now everything is about LLMs, everything is about the new agent architectures, and we ended up putting a humongous amount of effort into creating the various agent designs that we have.

Also, surprisingly, we replaced some software analysis techniques with LLMs as well. Symbolic execution is a good example. It’s extremely hard to scale: whenever you execute one instruction at a time, you have to create constraints around it. And one of the big challenges in real-world software is that there are so many hard-to-analyze functions. For example,

Taesoo Kim (09:46.026)
take NGINX as an example. We thought they probably compared strings character by character. But the way NGINX performs string comparison, they map the string, do hashing, so that they can compare hash values. A fuzzer, or a symbolic executor, is extremely bad at that. If you hit one hashing function, you’re screwed. There are so many constraints that there is no way to invert them, by definition.

There’s no way. But if you think about how to overcome these situations using an LLM: the LLM can recognize that this is a hashing function. We don’t actually have to create constraints around it; what about replacing it with an identity function? That’s something we can easily invert using symbolic execution. So we started recognizing the possible role of LLMs in symbolic execution. Now we see that

symbolic execution can scale to large software now. I think this is a pretty amazing outcome of the competition.
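The hash-as-identity trick can be illustrated with a toy: when a path condition involves a hash, blind search (or a constraint solver) cannot invert it, but once the call is recognized as a hash and modeled as the identity function, the condition becomes trivially solvable. The djb2 hash and the guarded branch below are invented for the example; a real implementation would hook the function inside a symbolic execution engine.

```python
# Toy: inverting a hash-guarded branch is infeasible; modeling the hash
# as identity makes the same branch trivially solvable.
def djb2(s: bytes) -> int:
    h = 5381
    for c in s:
        h = ((h * 33) + c) & 0xFFFFFFFF
    return h

TARGET = djb2(b"admin")  # the stored hash guarding the buggy path

def reaches_bug(inp: bytes, model_hash_as_identity: bool) -> bool:
    if model_hash_as_identity:
        # Identity model: the "hash" is treated as the string itself,
        # so the condition compares inputs directly and is invertible.
        return inp == b"admin"
    # Real condition: effectively uninvertible for a solver or search.
    return djb2(inp) == TARGET

# Exhaustive search over all 2-byte inputs never inverts the real hash...
found_raw = any(
    reaches_bug(bytes([a, b]), False)
    for a in range(256) for b in range(256)
)
# ...but under the identity model, solving the branch is immediate.
found_modeled = reaches_bug(b"admin", True)
```

In practice this is done by hooking the hash function in the symbolic executor (e.g. an angr `SimProcedure`) so the engine tracks the concrete preimage instead of accumulating unsolvable constraints.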

CRob (10:53.11)
Awesome. So again, the competition completed in August. So what plans do you have? What plans does the team have for your CRS now that the competition’s over?

Taesoo Kim (10:58.446)
Thank you.

Taesoo Kim (11:02.318)
I think that’s a great question. Many tech companies have approached our team; some of our members recently joined other big companies. And many of our students want to quit the PhD program and start a company. For good reason, right?

CRob (11:14.848)
I bet.

Taesoo Kim (11:32.766)
One team of my four PhD students recently formed and is looking for commercialization opportunities. Not in the traditional cyber infrastructure we looked at through DARPA; they spotted a possibility in smart contracts. Smart contracts and modernized financial industries, like stablecoins and whatnot,

are areas where they can apply AIxCC-like techniques to find vulnerabilities. Instead of having a human auditor analyze everything, you can analyze everything using LLMs or agents and techniques similar to those we developed for AIxCC, so that you reduce the auditing time significantly. Traditionally, to get an audit of a smart contract, you have to wait two weeks.

In the worst case, even months, at ridiculous cost. Typically, one audit of a smart contract runs $20,000 or $50,000 per case. But in fact, you can reduce the auditing time to, I’d say, a few hours or a day. The potential benefit of achieving this speed is that you really open up

CRob (12:40.454)
Mm-hmm.

CRob (12:47.836)
Wow.

Taesoo Kim (12:58.186)
amazing opportunities in this area. You can automate auditing, you can increase the frequency of auditing in the smart contract area. Beyond that, we thought there is a possibility for things like compliance checking of smart contracts. There are so many opportunities we can pursue immediately using AIxCC systems. That’s one area we’re looking at. Another is a more traditional area,

CRob (13:00.347)
Mm-hmm.

Taesoo Kim (13:25.07)
what we call cyber infrastructure: hospitals and some government sectors. They really want analysis, but, unfortunately or fortunately, there’s a different opportunity there. In AIxCC we analyzed everything from source code, but these sectors don’t have access to source. So we are creating a pipeline that, given a binary or an execution-only environment, converts them

CRob (13:28.828)
Mm-hmm.

CRob (13:38.236)
Mm-hmm.

CRob (13:49.569)

Taesoo Kim (13:52.416)
in a way that lets us still leverage the existing infrastructure we built for AIxCC. More interestingly, they don’t have access to the internet when they’re doing pen testing or analysis, so we are starting to incorporate some open source models as part of our systems. These are the two commercialization efforts we’re thinking about, and many of my students are currently working on them.

CRob (13:57.67)
That’s very clever.

CRob (14:05.5)
Yeah.

CRob (14:13.564)
It’s awesome.

CRob (14:20.366)
And I imagine this is probably amazing source material for dissertations and PhD work, right?

Taesoo Kim (14:29.242)
Yes, yes. For the last two years, we were purely focused on AIxCC. Our motto was: we don’t have time for publication, just win the competition. Everything else comes after. This is the moment that, I think, we’re going to release our tech report. It’s over 150 pages, coming around next week. We have a draft right now, but we are still preparing it

CRob (14:39.256)
Yeah.

CRob (14:51.94)
Wow.

Taesoo Kim (14:58.51)
for publication, so that other people don’t just get source code. Okay, source code is great, but you need some explanation of why we did things this way. Many of the sources are specific to the competition, so the core pieces might be a little different for the daily usage of normal developers and operators. So we’re creating condensed technical material for them to understand.

Not only that, we have a plan to make it more accessible. Currently our CRS implementation is tightly bound to the competition environment, meaning we had a crazy amount of resources on the Azure side, where everything was deployed and battle-tested. But unfortunately, most people, including ourselves, don’t have those resources. The competition provided about

80,000 in cloud credits that we had to use. No one has that kind of resource if you’re not a company. But we want people to apply this to their own projects at a smaller scale. That’s what we are currently working on: discarding all these competition-dependent parameters from the source code and making it more containerized, so that you can launch our CRS in your local environment.

This is one of the big development efforts we are doing right now in our lab.

CRob (16:32.155)
That’s awesome. Let me take a second and think about this from the perspective of the students who participated. What kind of experience was it, getting to work with professors such as yourself and with professional researchers and hackers? What do you think the students will take away from this experience?

Taesoo Kim (16:53.846)
I think it was being exposed to the latest models. Because we were tightly collaborating with OpenAI and Gemini, we were really exposed to those latest models. If you’re just working on security and not working closely with LLMs, you probably don’t appreciate that so much. But through the competition, everyone’s mindset changed. And then we spent time

deeply looking at what’s possible and what’s not, so we now have a great sense of what types of problems we have to solve, even on the research side. And now, suddenly, after this competition, every single security project, every piece of security research that we are doing at Georgia Tech is based on LLMs. Even more surprising, we have a decompilation project that we are doing, the most traditional security research you can imagine:

CRob (17:42.448)
Ha ha.

Taesoo Kim (17:52.162)
binary analysis, malware analysis, decompilation, crash analysis, whatnot. Now everything is LLM. We now realize LLMs are much better at decompiling than traditional tools like IDA and Ghidra. These are the types of research problems that we previously thought impossible. We probably weren’t even thinking about applying LLMs, because we’ve spent our lifetimes working on decompilation.

CRob (17:53.68)
Mm.

CRob (17:59.068)
Yeah.

Taesoo Kim (18:22.318)
But at a certain point, we realized that the LLM just does better than what we’ve been working on. Just one day. It’s a complete mind change. From a traditional program analysis perspective, many of these things are NP-complete. There’s no way you can solve them in an easier way, so you don’t spend time on them. That was our typical mindset. But now, it works in practice, amazingly:

CRob (18:29.574)
Yeah.

Taesoo Kim (18:51.807)
how to make possible what we previously thought impossible by using an LLM. That’s the key.

CRob (18:57.404)
That’s awesome. It’s interesting, especially since you stated that when you went into the competition, you were very skeptical about the utility of LLMs. So it’s great that you had this complete reversal.

Taesoo Kim (19:04.238)
Thank you.

Yeah, but I’d like to emphasize one of the problems of LLMs, though: they’re expensive, and they’re slow in the traditional sense. You have to wait a few seconds, or a few minutes in certain cases like reasoning models. So tightly binding your performance to this performance-lagging component in the entire system is often challenging.

CRob (19:17.648)
Yes.

CRob (19:21.82)
Mm-hmm.

Taesoo Kim (19:39.598)
And then another thing: everything is text. There’s no proper API, just text. There’s no sophisticated way to leverage it, just text. You’re probably familiar with all the security issues that potentially come with unstructured input. It’s similar to cross-site scripting in the web space. There are so many problems you can imagine.

CRob (19:51.984)
Okay, yeah.

CRob (20:01.979)
Mm-hmm.

Taesoo Kim (20:08.11)
But as long as you use it in a well-contained manner, in the right way, we believe there are so many opportunities we can get from it.

CRob (20:18.876)
Great. So now that your CRS has been released as open source, if someone from our community was interested in joining and maybe contributing to that, what’s the best way somebody could get started and get access?

Taesoo Kim (20:28.494)
Mm-hmm.

So we’re going to release the non-competition version very soon, along with several documents, part of what we call a standardization effort that we and other teams are working on right now. We defined a non-competition CRS interface, so as long as you implement that interface, you can plug in. Our goal is to mainstream this into OSS-Fuzz together with the Google team,

CRob (20:36.369)
Mm-hmm.

CRob (20:58.524)
Mm-hmm.

Taesoo Kim (20:59.086)
so that you can put your CRS into the OSS-Fuzz mainstream, making it much easier for everyone to evaluate it in their local environment as part of the OSS-Fuzz project. We’re going to release the RFC document pretty soon through our website, so that everyone can participate and share their opinions on what features they think we are missing. We’d love to hear about that.

CRob (21:03.74)
Thanks.

CRob (21:18.001)
Mm-hmm.

Taesoo Kim (21:26.502)
And then after that, within about a month, we’re going to release our local version so that everyone can start using it. And with a very permissive license, everyone can take advantage of this public research, including companies.

CRob (21:34.78)
Awesome.

CRob (21:42.692)
I’m just amazed. When I came into this, partnering with our friends at DARPA, I was initially skeptical as well. And as I was sitting there watching the finals being announced, it was just amazing: the innovation and creativity that all the different teams displayed. So again, congratulations to your team, all the students and the researchers and everyone that participated.

Taesoo Kim (21:59.79)
Mm-hmm.

CRob (22:12.6)
Well done. Do you have any parting thoughts? As we move on, do you have any words of wisdom you want to share with the community, or any takeaways for people curious to get into this space?

Taesoo Kim (22:25.486)
Oh, regarding commercialization, one thing I’d also like to mention is that at Samsung, we already took the open source version of the CRS and started applying it to internal projects and open source Samsung projects immediately after. So we started seeing the benefit of applying the CRS in the real world immediately after the competition. A lot of people think that a competition is just for competition, or for show.

CRob (22:38.108)
Mm-hmm.

Taesoo Kim (22:55.032)
But in fact, it’s not. Everyone in industry, including at Anthropic, Meta, and OpenAI, wants to adopt those technologies behind the scenes. And Amazon: we are also working together with the Amazon AWS team, because they want to support the deployment of our systems in the AWS environment as well, so that everyone can launch the systems with one click. And they mentioned there are several

CRob (22:55.036)
Mm-hmm.

Taesoo Kim (23:24.023)
government-backed organizations that explicitly requested to launch our CRS in their environment.

CRob (23:31.1)
I imagine so. Well, again, kudos to the team. Congratulations. It’s amazing. I love to see researchers have these amazing creative ideas and actually be able to deliver real value. And it’s great to hear that Samsung was immediately able to start getting value out of this work. Hopefully other folks will do the same.

Taesoo Kim (23:55.18)
Yeah, exactly. In terms of words of wisdom, or general advice: this competition-based innovation, particularly with academic involvement, or with startups involved or not, because of this venue, people including ourselves, startup folks, and other team members put their lives

into this competition. It’s an objective metric, a head-to-head competition. We don’t care about your background. Just win, right? There’s an objective score. Your job is to find bugs and fix them. I think this competition really drove a lot of effort behind the scenes in our team. We were motivated because this entire competition was presented to a broader audience. I think this is really a way to drive innovation

CRob (24:26.46)
Mm-hmm.

CRob (24:32.57)
Yes.

CRob (24:36.709)
Mm-hmm.

Taesoo Kim (24:54.904)
and to get some public attention beyond our own field as well. So I think we really want to see other types of competitions in this space. And in the longer term, based on the current trend, you’ll probably see agent-based CTF competitions: maybe not just CTF, but agent-based CTFs with no humans involved, where the agents are attacking each other and solving the CTF challenges.

CRob (24:58.524)
Excellent.

CRob (25:19.59)
Mm-hmm.

Taesoo Kim (25:24.846)
This is not five years out. It’s going to happen in two years or sooner. Even in this year’s LiveCTF, one of the teams actually leveraged agent systems, and the agents actually solved the challenges quicker than humans. So I think we’re going to see those types of events and breakthroughs more often than not.

CRob (25:55.292)
I used to be a judge at the collegiate cyber competition for one of our local schools. And I see a lot of interesting applicability in using this to help teach students: you have an aggressive attacker using these different techniques, and you’re able to apply some of the learnings that you all have developed. It’s really exciting stuff.

Taesoo Kim (26:00.142)
Mm-hmm.

Taesoo Kim (26:15.47)
There’s an interesting quote from, I don’t know who actually said it, but in the AI space someone mentioned that there will be a one-person, billion-dollar-market-cap company because of LLMs, or because of AI in general. But if you look at CTFs, currently most teams have a minimum of 50 or 100 people competing against each other. Very soon we’re going to see

one person, or maybe five people, competing with the help of those AI tools. Or humans just assisting the AI, in the sense of, hey, could you bring up the Raspberry Pi for me, or set things up, so the humans are just helping the LLM, or helping the AI in general, so that the AI can compete. So I think we’re going to see some interesting things happening pretty soon in our community for sure.

CRob (26:59.088)
Mm-hmm. Yeah.

CRob (27:11.804)
I agree. Well, again, Taesoo, thank you for your time. Congratulations to the team. And that is a wrap. Thank you very much.

Taesoo Kim (27:22.147)
Thank you so much.

What’s in the SOSS? Podcast #51 – S3E3 AIxCC Part 1 – From Skepticism to Success: The AI Cyber Challenge (AIxCC) with Andrew Carney

By Podcast

Summary

This episode of What’s in the SOSS features Andrew Carney from DARPA and ARPA-H, discussing the groundbreaking AI Cyber Challenge (AIxCC). The competition was designed to create autonomous systems capable of finding and patching vulnerabilities in open source software, a crucial effort given the pervasive nature of open source in the tech ecosystem. Carney shares insights into the two-year journey, highlighting the initial skepticism from experts that ultimately turned into belief, and reveals the surprising efficiency of the competing teams, who collectively found over 80% of inserted vulnerabilities and patched nearly 70%, with remarkably low compute costs. The discussion concludes with a look at the next steps: integrating these cyber reasoning systems into the open source community to support maintainers and supercharge automated patching in development workflows.

This episode is part 1 of a four-part series on AIxCC.

Conversation Highlights

00:00 – Introduction and Guest Welcome
00:59 – Guest Background: Andrew Carney’s Role at DARPA/ARPA-H
02:20 – Overview of the AI Cyber Challenge (AIxCC)
03:48 – Competition History and Structure
04:44 – The Value of Skepticism and Surprising Learnings
07:11 – Surprising Efficiency and Low Compute Costs
08:15 – Major Competition Highlights and Results
13:09 – What’s Next: Integrating Cyber Reasoning Systems into Open Source
16:55 – A Favorite Tale of “Robots Gone Bad”
18:37 – Call to Action and Closing Thoughts

Transcript

Intro music & intro clip (00:00)

CRob (00:23)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF podcast where I talk to people that are in and around the amazing world of open source software, open source software security and AI security. I have a really amazing guest today, Andrew.

He was one of the leaders that helped oversee this amazing AI competition we’re going to talk about. So let me start off: Andrew, welcome to the show. Thanks for being here.

Andrew Carney (00:57)
Thank you so much for having me, CRob. Really appreciate it.

CRob (00:59)
Yeah, so maybe for our audience that might not be as familiar with you as I am, could you maybe tell us a little bit about yourself, kind of where you work and what types of problems are you trying to solve?

Andrew Carney (01:12)
Yeah, I’m a vulnerability researcher. That’s been the core of my career for the last 20 years. And part of that has had me at DARPA. And now I’m at DARPA and ARPA-H, where I sort of work on cybersecurity research problems focused on national defense and/or health care. So it’s sort of the space that I’ve been living in for the past few years.

CRob (01:28)
That’s an interesting collaboration between those two worlds.

Andrew Carney (01:43)
Yeah, you know, I think the vulnerability research and reverse engineering community is pretty tight, pretty small. And a lot of folks across lots of different industries and sectors have similar problems that we’re able to help with. So yeah, it’s exciting to see how folks in finance or the automotive industry or the energy sector all deal with similar-ish problems, but at different scales and with different flavors of concerns.

CRob (02:20)
That’s awesome. And so, as I mentioned, we were introduced through the AIxCC competition. For our audience that might not be as familiar, could you give us an overview of AIxCC, the competition, and why you felt this effort was so important that we’ve spent so many years working through it?

Andrew Carney (02:42)
Absolutely. I mean, AIxCC is a competition to create autonomous systems that can find and patch vulnerabilities in source code. A big part of this competition was focusing on open source software, because of how critical it is across our tech ecosystem. It really is sort of the font of all software.

And so DARPA and ARPA-H and other partners across the federal government, we saw this kind of need to support the open source community and also leverage kind of new technologies on the scene like LLMs. So how do we take these new technologies and apply them in a very principled way to help solve this massive problem? And working with the Linux Foundation and OpenSSF has been a huge piece of that as well. So I really appreciate everything you guys have done throughout the competition.

CRob (03:41)
Thank you.

CRob (03:48)
And maybe could you give us just a little history of when did the competition start and kind of how it was structured?

Andrew Carney (03:54)
Yeah. So the competition was announced at Black Hat in August of 2023. The competition was structured into two main sections. We had a qualifying event at DEF CON in 2024, and then we had our final event this past DEF CON, in August 2025. And throughout that two-year period, we designed a competition that kept pushing the competitors ahead of wherever the current models and the current agentic technologies were. Whatever bar they were setting, we continued to push the competitors past it. So it’s been a really dynamic competition, because that technology has continued to evolve.

CRob (04:44)
I have to say, when I initially heard about the competition, I was very skeptical about what the results would be. I’ve been doing cybersecurity a very long time. Not to bury the lede, so to speak, but I was very surprised by the results that you all shared with the world this summer in Las Vegas. We’ll get to that in a minute. But this competition ran over multiple years, and as it progressed, could you share what you learned that surprised you, things you didn’t expect when this all kicked off?

Andrew Carney (05:21)
Yeah, I think there have been a lot of surprises along the way. And I’ll also say that skepticism, especially from informed experts, is a really good sign for a DARPA challenge. For a lot of projects at DARPA generally, if you’re kind of waffling between “this is insanely hard and there’s no way we’ll be successful” and “there’s an easy solution to this,” if you’re constantly in that space of uncertainty, like, no, I really think this is really, really hard, and I’m getting skepticism from people that know a lot about this space: for us, that’s fuel. That’s okay. There’s a question to answer here. And so that really was part of what drove us. Even competitors that ended up making it to finals were skeptical themselves even as they were competing.

So I love that. I love that. Like, you know, we want to try to do really hard things and, you know, criticism helps us improve. Like that’s super beneficial.

CRob (06:33)
Yeah. I’ve had the opportunity to talk with many of the teams, and now we’re in the post-competition phase where we’re actually starting to figure out how to share the results with the upstream projects and how to build communities around these tools. You assembled a really amazing group of folks in these competing teams, some super top-notch minds. And again,

You made me a believer now, where I really do believe that AI does have a place and can legitimately offer some real value to the world in this space.

Andrew Carney (07:11)
Yeah, I think one of the biggest surprises for me was the efficiency. A lot of times, especially with DARPA programs, we expect that technical miracles will come with a pretty hefty price tag, and then you’ll have to find a way to scale down, to economize, to make that technology more useful and more widely distributable.

With AIxCC, we found the teams pushing so hard on the core research questions, but at the same time, woven into that, they were using their resources efficiently. And so even the competition results themselves were pleasantly surprising in terms of the compute costs for these systems to run. We’re talking tens to hundreds of dollars per

vulnerability discovered or patch emitted, which is really quite amazing.

CRob (08:15)
Yeah, so maybe could you just give me some highlights of kind of what the competition discovered, what the competitors achieved?

Andrew Carney (08:24)
Yeah. So when we’re trying to tackle these really challenging research questions, we examine them from all angles and are extremely critical of even our own approach, as well as the competitors’ approaches. Initially, back in August of 2024, we had this amazing proof-of-life moment where the teams demonstrated, with only a few hundred dollars in total compute budget,

that they were able to analyze large open source projects and find real issues. One of the teams found a real issue in SQLite, which we disclosed at the time to the maintainers. And they found it, once again, with this very limited compute budget, across multiple millions of lines of code in these projects. So that was sort of the “okay, there’s a there there” moment: there’s something here, and we can keep pushing. That was a really exciting moment for everyone. And then over the following year, up to August 2025, we had a series of non-scoring events where the teams would be given challenges that looked very similar to what we’d give them for finals, with an increasing level of scale and difficulty.

So you can think of these as extreme integration events, where we’re still giving the teams hundreds of thousands or millions of lines of code, we’re giving them eight to 12 hours per task, and we’re seeing what they can do. This was important to ensure that the final competition went off without a hitch, and also because the models they were leveraging continued to evolve and change.

So it was really exciting. In that process, the teams found and disclosed hundreds of vulnerabilities and produced hundreds of potential patches that they would offer up to the maintainers of the projects they were doing their own internal development on. It was really exciting to see that the SQLite bug wasn’t a fluke, and that the teams could consistently perform. As we pushed them to move further and faster and deal with more complex code, they were able to adapt and find a way forward.

CRob (11:02)
That’s awesome. And I know it was a long journey that you and the team and all the support folks went through, but is there any particular moment that makes you smile when you reflect on the course of the competition?

Andrew Carney (11:20)
Oh, man, so many. I think there’s an equal number of those smiling moments and premature gray hairs that the team and I have created. But one of the big moments: there were a number of just outstanding experts in the field, on social media and

in talks, who were very skeptical in the way they talked about AI-powered program analysis. Then, near the end, leading up to semi-finals, we had this lovely moment where the Google Project Zero team and the Google DeepMind team penned a blog post saying that they were inspired by one of the teams’ discoveries, by the SQLite bug. And that was huge, I think, both for that team and for the competition as a whole. And after that, seeing people’s opinions change, seeing people that were, like I said, top-tier experts in the field change their perspective pretty drastically, that was helpful signal for us that we were being successful. Converting a critic, I think, is one of the best victories you can have, because now they can be a collaborator, right? We can still spar over different perspectives or ideas, but now we’re working together. That’s very exciting.

CRob (13:09)
That’s awesome. So what’s next? The hard work of the competition is over and now we’re in kind of the after action phase where we’re trying to integrate all this great work and kind of get these projects out to the world to use. So from your perspective or from DARPA or the competition, what’s next for you?

Andrew Carney (13:29)
Yeah, so one of the biggest challenges with DARPA programs is that when you’re successful, sometimes you have that technological miracle, you have that accomplishment, and maybe the world’s not entirely ready for it yet. Or maybe there’s additional development that needs to happen to get it into the real world. With AIxCC, we made the competition as realistic as possible. The automated systems, these cyber reasoning systems, were being given bug reports, they were being given patch diffs, they were being given artifacts that we would consume and review as human developers. So we modeled all the tasks very closely on the real things that we would want these systems to do. And they demonstrated incredible performance. Collectively, the teams were able to find over 80% of the vulnerabilities that we’d synthetically inserted, and they patched nearly 70% of those vulnerabilities. That patching piece is so critical. What we didn’t want to do was create systems that made open source maintainers’ lives more problematic.

CRob (14:54)
Thank you.

Andrew Carney (14:56)
We wanted to demonstrate: this is a reachable bug, and here’s a candidate patch. And in the months after the competition, we’ve incentivized the teams, beyond just the original prize money, to go out into the open source community and support open source maintainers with their tools. And we’ve had folks come back and literally document in their reports that the patch they suggested to a maintainer was nearly identical to what the maintainer actually committed. And those reports are coming in daily. So we have this constant feed of engagement, and the tools are obviously still being improved and developed. But it’s really exciting to see. So when I think about what’s next: we’re already in the what’s next, getting the technology out there, using government funding to support open source maintainers wherever we can, especially if their code is part of widely used applications or code used in critical infrastructure. So that’s where we find ourselves now. And then we’re thinking a lot about how we supercharge that effort.

The federal government supports a lot of actively used open source projects, right? And we’ve been working with all these partner agencies across the federal government, making sure that we’re supporting the existing programs when we find them. And then where we see a gap, figuring out what it would take to fill that gap for a community that could use more support.

CRob (16:55)
So on a slightly different note: we’re both technologists and we love the field, but as I was going through this journey on the sidelines with you all, I was reflecting. Do you have a favorite tale of robots gone bad? Like Terminator’s Skynet, or HAL 9000, or the Butlerian Jihad?

Andrew Carney (17:22)
You know, I don’t know that this is my favorite, but it is one of the most recent ones that I’ve read. There’s a series called Dungeon Crawler Carl. It’s been really entertaining reading, and I just think the tension between the primal AIs and the corporations that rely on those independent entities, but are also constantly trying to rein them in, has been really interesting to watch as that narrative evolves.

CRob (18:08)
I’ve always enjoyed science fiction and fantasy’s ability to hold a mirror up to society and put these questions in a safe space, where you can think about 1984 and Big Brother or these other things, but it’s just on paper or on your iPad or whatever. So it’s a nice experiment over there, and we don’t want that happening here.

Andrew Carney (18:29)
Yes, yes. Yeah, the fiction as thought experimentation, right?

CRob (18:37)
Right, exactly. So as we wind down, do you have a particular call to action or anything you want to highlight to the audience that they should maybe investigate a little further or participate in?

Andrew Carney (18:50)
Yeah, a big one is that we would love for open source maintainers to reach out to us directly at aixcc@darpa.mil. That’s the email address that our team uses. We’ve been looking for more maintainers to connect with, so that if we can provide resources to them, one, those resources are right-sized for the challenges that those maintainers are having (or that maintainer, right? sometimes it’s just one person), and two, we’re engaging with them in the way that they would prefer to be engaged with. We want to be helpful help, not unhelpful help. So that’s a big one. And then, more generally, I would love to see more patching added to the vulnerability research lifecycle. There are so many opportunities for commercial and open source tools that have that discovery capability as their big selling point. And now with AIxCC, and with the technology that the competitors themselves open sourced, since all of their systems were open sourced after the competition, there’s this real potential that I don’t think we’ve seen realized the way it could be. So I would love to see more of that automated patching added to tools and development workflows.

CRob (20:29)
I’ll say my personal favorite experience out of all this: during the competition there was an ethical wall between your administrators, us, and the different competition teams. But the minute the competition was over, we observed the competitors looking at each other’s work, asking questions of each other, and collaborating. I’m so super excited to see what comes next, now that all these smart people have proven themselves, found kindred spirits, and are going to start working together on even more amazing things.

Andrew Carney (21:07)
Absolutely. I think we’re expecting a state of knowledge paper with all the teams as authors. That’s something they’ve organized independently, to your point. And yeah, I cannot wait to see what they come out with collaboratively.

CRob (21:23)
Yeah. And anyone that’s interested in learning more, or in potentially interacting directly with some of these competition experts, whether they’re in academia or industry: the OpenSSF, as part of our AI/ML working group, has created a cyber reasoning special interest group specifically for the competition and all the competitors, to have public discussions and collaboration around these things. We invite everybody to show up, listen, participate as they feel comfortable, and learn.

Well, Andrew and the whole DARPA and ARPA-H team, everyone that was involved in the competition: thank you. Thank you to our competitors. And we actually are going to have a series of podcasts talking to the individual competitors, learning a little bit about the unique flavors and challenges each team had. But thank you for sponsoring this and for delivering something I think is going to have a ton of utility and value to the ecosystem.

Andrew Carney (21:47)
Thank you for working with us on this journey and we definitely look forward to more collaboration in the future.

CRob (21:54)
Well, and with that, we’ll wrap it up. I just want to tell everybody happy open sourcing. We’ll talk to you soon.

What’s in the SOSS? Podcast #50 – S3E2 Demystifying the CFP Process with KubeCon North America Keynote Speakers

By Podcast

Summary

Ever wondered what it takes to get your talk accepted at a major open source tech conference, or even to land a keynote slot? Join What’s in the SOSS’s new co-host Sally Cooper as she sits down with Stacey Potter and Adolfo “Puerco” GarcĂ­a Veytia, fresh off their viral KubeCon keynote “Supply Chain Reaction.” In this episode, they pull back the curtain on the CFP review process, share what makes a strong proposal stand out, and offer honest advice about overcoming imposter syndrome. Whether you’re a first-time speaker or a seasoned presenter, you’ll learn practical tips for crafting compelling abstracts, avoiding common pitfalls, and why your unique voice matters more than you think.

Conversation Highlights

00:00 – Introduction and Guest Welcome
01:40 – Meet the Keynote Speakers
05:27 – Why CFPs Matter for Open Source Communities
08:29 – Inside the Review Process: What Reviewers Look For
14:29 – Crafting a Strong Abstract: Dos and Don’ts
21:05 – From Regular Talk to Keynote: What Changed
25:24 – Conquering Imposter Syndrome
29:11 – Rapid Fire CFP Tips
30:45 – Upcoming Speaking Opportunities
33:08 – Closing Thoughts

Transcript

Music & Soundbite 00:00
Puerco: Stop trying to blend in or to mimic what you think the industry or your community wants from you. Represent: always show up as who you are, where you came from. That is super valuable, and that’s why people will always want to have you as part of their program.

Sally Cooper (00:20)
Hello, hello, and welcome back to What’s in the SOSS, an OpenSSF podcast. I’m Sally and I’ll be your host today. And we have a very, very special episode with two amazing guests and they are returning guests, which is my favorite, Stacey and Puerco. Welcome back by popular demand. Thank you for joining us for a second time on the podcast.

And since we last talked, you both delivered one of the most talked-about keynotes at KubeCon. Wow. So in today’s episode, we’re going to talk to you about CFPs. And this is really an episode for anyone who has ever hesitated to submit a CFP, wondered how their talk gets reviewed in the CFP process, asked themselves, “am I ready to speak?”, or dreamed about what it might take to keynote a major event.

We’re gonna focus on practical advice: what works, what doesn’t, and how to show up confidently. And I’m just so excited to talk to you both. So for anyone who’s listening for the first time, Stacey, Puerco, can you tell us a little bit about yourselves and about the keynote? Stacey?

Stacey (01:48)
Hey everyone, I’m Stacey Potter. I am the Community Manager here at OpenSSF. And my job, in a nutshell, is basically to make security less scary and more accessible for everyone in open source, right? I’ve spent the last six or seven years in open source community building, mainly across CNCF projects: Flux, Flagger, OpenFeature, and Keptn, to name a few.

And now focusing on open source security here at OpenSSF. Basically helping people connect, learn, and just do cool things together. And yeah, I delivered a keynote at KubeCon North America that is honestly still surreal to talk about. It was called Supply Chain Reaction, a cautionary tale in case security, and it was theatrical. It was…slightly ridiculous. It was basically the story of a DevOps engineer, and I played the DevOps engineer, even though I’m not a DevOps engineer, frantically troubleshooting a compromised deployment. And Puerco literally kaboomed onto the stage as a Luchador superhero to save the day. We had him in costume and we had drama.

And then we taught people a little bit about supply chain security through like B-movie antics and theatrics. But it turns out people really responded to making security fun and approachable instead of terrifying.

Adolfo García Veytia (@puerco) (03:23)
Yeah. Well, hi, and thanks everybody for listening. My name is Adolfo García Veytia. I am a software engineer working out of Mexico City. I’ve been working on open source security for, I don’t know, the past eight years or so, mainly on Kubernetes, and I maintain a couple of the technical initiatives here in the OpenSSF.

I am now part of the Governing Board starting this year, which is a great honor to have been voted into that position. But my real passion is really helping build tools that secure open source while being unobtrusive to developers, and also raising awareness in the open source community about why security is important.

Because sometimes you will see that executives, CISOs especially, are compelled by legal frameworks or other requirements to make their products or projects secure. And in open source, we’re always so resource constrained that security tends to be not the first thing on people’s minds. But the good news is that here in the OpenSSF and other groups, we’re working to make that easy and transparent for the regular person as much as possible.

Sally Cooper (04:57)
Wow, thank you both so much. Okay, so getting back to call for proposals, CFPs. From my perspective, they can seem really intimidating, but they’re also one of the most important ways for new voices to enter a community. So I just have a couple questions. Basically, why are they important? Not just for going to a conference.

Why would a CFP be important to an open source community and not just a conference? Stacey, maybe you could kick that off.

Stacey (05:32)
Sure, I think this is a really important question. I think CFPs aren’t just about filling conference slots. They’re really about who gets to shape the narrative in our communities and within these conferences. So when we hear the same voices over and over and they show up repeatedly, right, you get the same perspectives, the same solutions, the same energy, which, you know, is also great. You know, we love our regular speakers, they’re brilliant, but

communities always need new and fresh perspectives, right? We need the people who just solved a weird edge case that nobody’s talking about. We need like a maintainer from a smaller project who has insights that maybe big projects haven’t considered, or, you know, we need people from different backgrounds, different use cases and different parts of the world as well. CFPs are honestly one of the most democratic ways we have to surface new leaders, right?

Sometimes someone doesn’t need to be well-connected or have a huge social media following. They just need a good idea and the courage to submit a talk about it, right? And that’s really powerful. And I think when someone gives their first talk and does well, they often become a mentor, a maintainer, a leader in that community, right? CFPs are literally how we build the next generation of contributors and speakers. So every talk is a potential origin story for someone’s open source journey.

Sally Cooper (07:08)
Puerco, what are your thoughts on that?

Sally Cooper (07:11)
And the question again is call for proposals can feel really intimidating, but they’re also one of the most important ways for new voices to enter a community.

Adolfo García Veytia (@puerco) (07:20)
Yeah. So, I would say that intimidating is a very big word, especially for new people. Sometimes it’s difficult to ramp up the courage, and I don’t want to mislead people into thinking it’s going to be easy. The first ones that you do, you will get up there, sweat, stutter, and basically your emotions will control your delivery and your body, so be prepared for that.

But it’s going to be fine. The next times you do it, it will get better. And most importantly, people will not be judging you. In fact, it’s sometimes even more refreshing to see new voices getting up on stage.

Sally Cooper (08:13)
That’s really helpful. Thank you. I love it. The authenticity that you bring really helps and helps demystify the CFP process. But now let’s pull back the curtain on the review process. How does that work? And Stacey, have you been on a review panel before? Maybe you could talk about like, when you’re reviewing a CFP, what are you actually looking for?

Stacey (08:39)
Yeah, I’ve been on program committees. I’ve been a program chair or co-chair on different programs and things like that. And yeah, it’s a totally different experience, but I think it gives you a lot of insight on how to prepare a talk once you’ve reviewed 75, 80 per session, right? Sometimes these calls are really big. I know KubeCon has really huge calls, right? But here’s what we’re actually looking for:

So first, is this topic relevant and useful to our audience? Like, will people learn something they can actually apply? And second, can this person deliver on what they’re promising? And honestly, we’re not looking for perfection, right? We’re looking for clarity and genuine expertise or experience with that topic.

I would say be clear, be specific with your value proposition in the first two sentences of a CFP. When the program committee can read your abstract and immediately think, “oh, that’s exactly what our attendees need,” right? That’s gold. Also, show that you understand the audience that you’re submitting to, right? Are you speaking to beginners or experienced practitioners? Be explicit about that.

Adolfo García Veytia (@puerco) (10:16)
Yeah, I think it’s important for applicants to understand who is going to be reviewing your papers. There are many kinds of conferences. Even though, of course, there’s a commercial side behind ours, because you have to sustain the event, especially in the Linux Foundation conferences, I feel

we put a lot of effort into making the conferences really community events. And I would like to make a clear distinction between academic conferences, purely trade show conferences, and these community events. Especially in academia, there’s this hierarchical view of peers

assessing what you’re doing. In pure trade show conferences, it’s mostly pay to play, I would say. And when you get down to community, especially if you have ever applied to present or submit papers to the other kinds of conferences, you will be expecting completely different things. It’s easy to forget that the people looking at your work, at your proposals, at your ideas, are very, very close and very, very similar to you.

So don’t expect to be talking to some higher being that understands things much better than you. First of all, it’s not one person. It’s all of us reading your CFPs. So keeping that in mind, what you need to consider when submitting is: what makes my proposal unique? I think that’s a key question. And we can talk more about that in the later topics, but to me, when I understood that it was sometimes even my friends reviewing my proposal, it became so much easier.

Stacey (12:20)
Yeah, I think that’s a really, really good point Puerco makes: knowing who reviews for whatever conference you’re submitting to. And I say this as if it’s a Linux Foundation event, right? Because those are the ones that I’ve been most involved with. The program committee members are from within the community. They submit an application to say, hey, yes, I would love to review talks. This is like me volunteering my time to help out this conference. Maybe they’re not able to make the conference.

Maybe they are, maybe they’re also submitting a talk. But usually the panel of reviewers is like five, six, up to 10 people, I would say, depending on the size of the conference. So you’re getting a wide range of perspectives reading through your submissions. And I think that’s really important. When I’m trying to select the program committee, I think it’s really important to diversify as well, right? So have voices from all over: different backgrounds, different expertise, different genders. Just as much variance as you can have within the program committee panel, I think, also makes a difference with the CFP reviews themselves, right?

But that’s kind of how it’s set up: you pick these five to 10 people to review all of these CFPs, they usually have about a week to review everything, and then they rate it on a scale. And that’s how the program chairs then arrange the schedule, based off of all that feedback. You can make notes in each of the talks that you’re reviewing, put those in there, and that’s basically how they’re all chosen. They’re ranked and they have notes, right, within that system.

Sally Cooper (14:08)
Wow, this is really educational. Thank you so much. For folks that are staring at a CFP right now, because there are some coming up, and I think we’re going to get into that, let’s get practical. What makes a strong abstract? How technical is too technical? How much storytelling belongs in a CFP? And what are some red flags that you might see in submissions?

Adolfo García Veytia (@puerco) (14:34)
So, the first big no-no in community events is don’t pitch your product, even if you try to disguise it as a community talk. You have to keep in mind that reviewers have a lot of work in front of them. There are all sorts of reviewers, but usually as a reviewer, you see that folks put a lot of effort into crafting their proposals.

If you pitch your product, which is against the rules in most community conferences, the reviewer will instantly mark your proposal down. We can sniff it right away. You have to understand that for us, the more invalid proposals we can get out of the way as soon as possible, the better. If it is a product pitch, just don’t.

And then the next one is you have to be clear and concise in the first paragraph, or even the first sentence. When a reviewer reads your proposal, make sure that the first paragraph gives them the idea: this is what I’ll talk about, and it’s going to inspect the problem from this side, or whatever. Give me that idea. Then you can develop the idea a little bit more in the next couple of paragraphs, but make sure that the idea of the talk is delivered right away. I have more, but I don’t know, Stacey, if you want to jump in.

Stacey (16:20)
Yeah, no, I think that’s really good advice. I would say, having been on so many different program committees, I’ve seen the same talk submitted to every conference that has an open CFP, regardless of whether the talk is specific to that conference or not. So I think key number one is make sure that what you’re submitting fits within the conference itself.

I think not doing a product pitch is key, especially within an open source community open CFP, right? Those are only for open source, for non-product pitches. I think Puerco makes a really good point with that. But, you know, is this conference that I’m submitting this talk to higher level? Is it super technical? Adjust for those differences, right? A lot of times you’ll find in the CFPs that there is room to submit a beginner level, an intermediate level, an advanced level. But with the conference description and the categories and things like this, you want to be very specific when you’re writing your CFP. Sometimes you can reuse the same CFP you’ve submitted to another conference, but you want to tailor it to each specific conference that you are submitting for.

Don’t just submit the same talk to five different conferences, because they are unique, they are specific, and if you want your talk accepted, these are the little changes that make a big difference in really getting down to the brass tacks of what that conference is about and what they’re really looking for. So when I’m writing something for a conference, I have the CFP page up, I have the about page up for that conference, and I’m making sure that it fits within what they’re asking me for, really.

Adolfo García Veytia (@puerco) (18:20)
Yeah. And I just remembered another one. This happens most in the bigger ones, like the KubeCons and so on. Don’t try to slop your way into the conference. I mean, I’d rather see a proposal with bad English or typos than something that was generated with AI. And I’ll tell you why.

It’s not because of pure hatred of AI or whatever, no. The problem with running your proposal through an LLM is that, especially in the big conferences, you have to keep in mind that you will be submitting a proposal about a subject that other people will probably also be trying to talk about. And what will get you picked is your capability of expressing things, getting into the problem in a unique way, your personality, all of those things.

When you run the proposal through the LLM, it just erases them. All of that personal uniqueness that you can give it will just be removed. And then it’ll be like looking at a hollow doll of the person, and you will not stand out.

Stacey (19:38)
Yeah, I agree completely. And is it a terrible thing to have AI help you with some of the editing? No, not at all. But write your proposal first. Write it from your heart. Write it from your point of view. Write it from your angle. But do not create it in AI, in the chatbots. Create it from yourself first, and then ask for editing help. That’s fine.

I think a lot of us do that and a lot of people out there are using it for that extra pair of eyes. Do I sound crazy here? Does this make any sense? I don’t know how to word this one particular sentence. That’s fine. But yeah, don’t start that way.

Adolfo García Veytia (@puerco) (20:19)
Exactly. I mean, just to make it super clear, especially for people whose first language is not English, like me: I of course use the help of some of those tools to at least not introduce many typos or whatnot. But just as Stacey said, don’t create it there.

Sally Cooper (20:41)
This is great advice. Thank you both so much. Okay. How about getting accepted for a keynote? Your KubeCon keynote really stood out. It was technical, really funny, memorable, engaging. How does someone prepare a keynote that differs from a regular talk?

Stacey (21:03)
Well, I want to start off by saying that we weren’t submitting our talk for a keynote, right? We didn’t even know that was in the realm of possibility for KubeCon North America. We just submitted a talk that we thought would be fun, would be good, would give, you know, some real-world kind of vibes, and we wanted to create a fun yet educational talk.

We had literally no idea that we could possibly have that talk accepted as a keynote. I didn’t know that. And this was my first real big talk. So it was a complete shock to me. I don’t know if you have other thoughts about that, but…

Adolfo García Veytia (@puerco) (21:50)
Yeah, it sort of messes up your plans, because you had the talk planned for, say, 35 minutes, and then you have 15, and you already had like 10 times more jokes than could fit into the 35 minutes. And then there’s also, of course, all of those things that we talked about, like getting nervous. They not only come back, but they multiply in a huge way. I mean, you’ve been there. I don’t know. You get over it.

Stacey (22:28)
I would also say that when we first found out that our talk was accepted, we were like, yay, our talk got accepted. And then I think it was a few days later, they were like, no, no, your talk is now a keynote. So we freaked out, right? We had our little moment of panic. But then we just worked on it. And we worked on it, and we worked on it, and we worked on it, right? So don’t wait till the last minute, I would say, to prep your talk.

But I think my main goal with this talk, and I have to give so much credit to Puerco because he’s such a good storyteller and he does it in such a humorous but really technically sound way. We worked on this script. We wrote out an entire script because we only had 15 minutes. We went from a 25-minute talk to a 15-minute talk.

And so pacing was really important, storytelling was really important, but also being funny was something that I really wanted us to have, which Puerco was really good at too. Trying to squash all of these things down into 15 minutes was really tough. But I think that’s important to remember about keynotes versus talks: keynotes are more like, what is the experience of this talk about? Versus, let’s get down to really technical details, right? You can do a technical talk that’s 25, 35, 45 minutes, but in a keynote, people aren’t going to remember anything if you’re getting too deep in the weeds, right? So that was my focus. And I don’t know, Puerco, if you have anything else to add to that.

Adolfo García Veytia (@puerco) (24:10)
Yeah, the other thing is that the audience is so much bigger, so your responsibility just grows, especially to deliver, right? So as Stacey said, we actually wrote the script and rehearsed online and in person before the conference. And the experience at the conference is very different too, because you have to show up early, and you have to do a rehearsal in the days before your actual talk. That said, it’s not like it went perfect.
We still fumbled here and there and messed up some of the details and the pacing and whatnot. But, I don’t know, at least in our case, it was about having fun and trying to get some of that fun into the attendees.

Sally Cooper (25:01)
Yeah, you really did. It was so fun. I think that’s what stood out.

Okay, one of the biggest barriers to submitting a CFP isn’t skill, it’s confidence. So what would you say to someone who feels like, I’m not expert enough, I don’t know if I have permission to do this? How do you personally deal with imposter syndrome? And why is it important to make sure that those new and diverse voices do submit a CFP?

Adolfo García Veytia (@puerco) (25:27)
Oh, I’m an expert. So the first thing to remember, kids, is that imposter syndrome will never go away. In fact, you don’t want it to ever go away. Because imposter syndrome tells you something very, very important, and that is: you are being critical of yourself, of your work, of your ideas. And if you ever stop doing that,

it means, one, you don’t really understand the problem, or the vastness of the problem, that you’re trying to speak about in your talk. And two, you will stop looking for new and innovative ideas. So no matter where you get to, that imposter syndrome will always be with you.

Stacey (26:20)
I agree. I don’t think it ever goes away. I feel like, you know, I was an imposter at the keynote. I absolutely was, right? I didn’t know what the heck I was doing. I didn’t know what the heck I was saying half the time. I mean, I tried to memorize my lines and do the right thing and come off as this expert. I never, ever feel like an expert about anything, right? Unless I’m talking, I guess, about my cats or my kid or something.

Adolfo García Veytia (@puerco) (26:47)
Yeah, exactly.

Stacey (26:49)
But yeah, I think that’s it: you’re pushing yourself to grow, and that’s a good thing, right? So if you feel like an imposter, you know, that’s okay. We all feel like that.

Adolfo García Veytia (@puerco) (27:04)
Yeah. And the other very important thing is to think about what you are proposing to talk about in your talk. It’s supposed to be new, cutting-edge stuff, something interesting, something unique. So it’s okay to feel that way, because it’s a problem that you’re still researching, that you’re trying to understand. Think about it this way:
if you propose any subject for your talk, anybody that goes there is more or less assuming that they want to know and learn more about it. And if you feel confident enough to speak about it, people will respond with a willingness to attend your talk. That means you are already a little bit of a level above, because you’ve done that research, you’ve done that in-depth dive into the subject. So it’s fine.

It’s fine to feel it. Realize that it’s a natural thing.

Stacey (28:05)
And most of the people in the audience are there to support you, to cheer you on, and are not gonna harp on you or say, oh gosh, you messed up this thing or that thing. They’re really there to give you kudos and really support you and be willing to hear and listen to what you have to say.

Sally Cooper (28:25)
Love that. Okay, let’s close the advice portion with a quick round of CFP tips, rapid-fire style. I’m going to go back and forth so each person can answer. Stacey, we’ll start with you. One thing every CFP should do.

Stacey (28:43)
I mean, get to the point as quickly as you possibly can. That would be my thing, right?

Sally Cooper (28:48)
Love it. Puerco, one thing people should stop doing in CFPs.

Adolfo García Veytia (@puerco) (28:55)
Stop trying to blend in or to mimic what you think the industry or your community wants from you. Represent. Always show off who you are, where you came from. That is super valuable and that’s why people will always want to have you as part of a program.

Sally Cooper (29:13)
Stacey, one piece of advice you wish you’d received earlier.

Stacey (29:18)
Gosh, I would say rejection is normal and not personal. I wish someone had told me that earlier, but that is one big experience. Speakers get rejected all the time, right? It’s not about your worth. It’s about program balance, timing, and fit. So keep submitting.

Sally Cooper (29:39)
Okay, Puerco and Stacey, you both got famous after this. Puerco: selfie or autograph?

Adolfo García Veytia (@puerco) (29:44)
Selfie with a crazy face, at least get your tongue out or something.

Sally Cooper (29:50)
Stacey. KubeCon or KoobCon?

Stacey (29:54)
Oh gosh, I feel like this is like JIF or GIF. And I’m in the GIF camp, by the way. Even though I know it’s “Koo”-bernetes, I still say KubeCon, so.

Adolfo García Veytia (@puerco) (30:07)
KubeCon, please.

Sally Cooper (30:09)
Okay, before we wrap up, Stacey, as the OpenSSF Community Manager, can you share some upcoming CFPs and speaking opportunities people should keep an eye on?

Stacey (30:19)
Yeah, so Open Source Summit North America is a pretty large event. I think it’s taking place in Minneapolis in May this year. There’s multiple tracks and there’s lots of opportunities for different types of talks. The CFP is currently open right now, but it does close February 9th. So go and check out the Linux Foundation Open Source Summit North America for that one.

We also have OpenSSF Community Days, which are typically co-located events at Open Source Summit North America. These are events that we hold around the world, and honestly, they’re perfect for first-time speakers. They’re smaller, they’re more intimate, and the community is super supportive. Our CFP for Community Day North America closes February 15th. So go ahead and search for that online, and we’ll put the links in the description of this podcast so you can find it.

And then be on the lookout for key conferences later on in the year as well. KubeCon North America will be coming up later. Open Source Summit Europe is coming up later in the year. So be on the lookout for those. There’s also within the security space, I know there’s a lot of B-sides conferences and KCDs, which are Kubernetes community days and DevOps days.

If you’re in our OpenSSF Slack, we have a #cfp-announce channel where we try to promote and put out as many CFPs as we can, to let people know that if you’re in our community and you want to submit talks regarding some of our projects or working groups or just OpenSSF in general, that #cfp-announce channel is really a great place to keep checking.

Sally Cooper (32:13)
Amazing. Thank you both so much, not just for the insights, but for really making the CFP process feel more approachable and human. If you’re listening to this and you’ve been on the fence about submitting a CFP, let this be your sign. We really need your voice and thank you both so much.

Stacey (33:32)
Thank you.

Adolfo García Veytia (@puerco) (33:33)
Thank you.

What’s in the SOSS? Podcast #49 – S3E1 Why Marketing Matters in Open Source: Introducing Co-Host Sally Cooper

By Podcast

Summary

In this special episode, the What’s in the SOSS podcast welcomes Sally Cooper as an official co-host. Sally, who leads OpenSSF’s marketing efforts, shares her journey from hands-on technical roles in training and documentation to becoming a bridge between complex technology and everyday understanding. The conversation explores why marketing matters in open source, how personal branding connects to community building, and the importance of personas in serving diverse stakeholders. Sally also reveals OpenSSF’s 2026 marketing themes and explains how newcomers can get involved in the community, whether through Slack, working groups, or contributing content.

Conversation Highlights

00:09 – Welcoming Sally Cooper as Co-Host
01:28 – From Technical Training to Marketing Leadership
03:54 – Bridging Technology and Understanding
06:19 – Why Marketing Makes Open Source Uncomfortable
08:11 – Personal Branding and Career Growth
10:42 – Understanding Community Personas
12:33 – Getting Started with OpenSSF
14:44 – OpenSSF’s 2026 Marketing Themes
16:18 – Rapid Fire Round
17:09 – How to Get Involved

Transcript

CRob (00:09.502)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF podcast where we talk to people, projects, and we talk about the ideas that are shaping our upstream open source ecosystem. And today we have a real treat. It’s a very special episode where we’re welcoming a new friend. And this is somebody that you probably know if you’ve been involved in our community for any period of time.

This young lady gets to help us with our messaging and how we present ourselves to the outside world, how we get our messaging out to all those interested open source community contributors around the globe. And today she’s officially joining Yesenia and me as a co-host of What’s in the SOSS. So I am proud and pleased to welcome Sally Cooper.

Yesenia (01:02.916)
Woo!

CRob (01:07.488)
Sally has been helping lead our marketing efforts for the last several years. So before we jump into what you do within that marketing function, Sally, we would like to hear a little bit about your open source origin story and how you got into technology.

Sally Cooper (01:28.549)
Wow. Well, thank you so much, Yesenia and CRob. I’m super excited to be here. And yeah, I started my career a very long time ago. I actually started in tech with hands-on technical roles, working in training, documentation, and support, and really helping people understand systems, tools, and workflows.

Yesenia (01:52.21)
Yeah, I want to welcome Sally. It’s great to have another voice on this podcast, putting out there the hard work of our open source ecosystem and getting more of these other voices heard. But you were saying that you started in tech early, and that’s new for me. I would love for you to dive into those technical roles. I think understanding your technical background, how you’ve gotten into marketing, and your work with OpenSSF is going to relate to folks.

You don’t always have to be technical or work in a technical field to support security. So I’d love to understand your background and how you’ve connected your technical background to the transitions you’ve had in your career.

Sally Cooper (02:35.611)
Oh, that’s such a good question. Yeah, I think you really nailed it there, because you don’t always need to be technical, and sometimes, like for me, you can be technical and end up in something like marketing. So when I say I started in tech, I mean this was really entry level, hands-on, learn it from the ground up. I worked in finance in my first job out of college. I was working at a data processing center, and it was really operational:

accuracy, lots of responsibility, really not a lot of glamour. The turning point was that we went through a major systems upgrade and moved from a legacy system to entirely new software. So suddenly people who had been doing their jobs a certain way for years were expected to work differently, often overnight. And I became one of the people who could help bridge the gap,

because I understood the technology and how to explain complex systems in an easy-to-understand manner. And I ended up in training. I became a software trainer and trained the whole organization on how to use the software to do their jobs.

Yesenia (03:52.776)
That’s very useful.

Sally Cooper (03:54.649)
Yeah, thanks. It’s funny because we all have to get started somewhere, right? And that’s how it worked out for me. After that, I worked at a startup in B2B e-commerce and continued on with educational software training, writing technical guides, books, some of the first e-learning programs. So I’m definitely dating myself here. But looking back, the title marketer wasn’t something that I thought of.

CRob (04:17.772)
Yeah

Sally Cooper (04:24.131)
But I was doing a lot of work in marketing without knowing it, just helping people understand complex topics. So yeah, that’s how I got here. Thanks for asking.

Yesenia (04:37.906)
Yeah, we all date ourselves very easily. I mean, we’re in tech. It already ages us the minute we walk in. But I think that’s a great background, right? One of the most important skills in tech is: can you bring a high-level technical concept down into something that everyday folks can understand, and then draw them in? I’m curious, from there, how did you get involved with marketing?

Sally Cooper (05:06.713)
Yeah, great question. So around the time when my career sort of took off with the technical education, there was something happening in the background. In the early 2000s, this was the dawn of YouTube, smartphones were starting to emerge, and companies were beginning to realize that technology wasn’t just about features, it was about an experience. And I find this a very full-circle moment, because before the smartphone, I had an iPod.

It was a pink metallic iPod, and I got really obsessed with podcasts. Podcasts were new. It wasn’t just about the music for me — it was really about listening to, you know, a conversation that was educational. And I could do that while raising a family: going for a walk, getting exercise, making dinner. You could have headphones on and just bring yourself into a whole other world.

So yeah, that’s when I really started. I also loved the campaign — looking at the billboards and seeing the silhouettes with, you know, the iPod and the headphones, all of that. So it’s kind of full circle.

CRob (06:13.484)
Yeah.

Yesenia (06:19.934)
And it’s really lovely, especially when you see those nice billboards and think about how much thought someone has put into them. And when you think of open source — it’s people’s hobby projects, there’s just no profit. And marketing, in a sense — I’ve learned it through my own personal and professional growth — I realized I was doing marketing without realizing I was doing marketing.

But marketing can just make some people uncomfortable, especially in the open source space. Like, what do you think about that?

Sally Cooper (06:53.463)
Yeah, that’s really valid. Open source is really personal. A lot of projects start off as a hobby, a passion, a side project built on nights and weekends. The word “marketing” can feel a little uncomfortable — like it doesn’t really belong there. I’ve definitely heard that feedback from developers. In open source, we’re not selling software, so it’s a completely new concept for me. I did have some marketing jobs after the educational jobs, and

CRob (07:04.014)
Right.

Sally Cooper (07:23.479)
so I’m still learning — from all of you and from our community. We’re sharing ideas, tools, and practices, and the currency is really people’s time, attention, and trust. Without marketing, great projects stay invisible, maintainers get burnt out, users struggle in silence, and the people who could contribute never even find the door.

CRob (07:50.142)
And this is extremely interesting to me, because I’ve observed Yesenia over the trajectory of her career, and so much of your online persona is that you do a lot of work branding yourself and providing advocacy and outlets to help empower other people.

Yesenia (07:58.589)
Yeah.

CRob (08:11.522)
It seems like a really big part of what you do outside of your day job and outside of your foundation work. So from your perspective, Yesi, how do you see these worlds connecting?

Yesenia (08:17.359)
Absolutely.

Yesenia (08:23.39)
I think it’s an interesting area. I recently heard this quote from a coworker — I would love to credit her, but I don’t have her name on hand. It was like: your branding should be getting you the next job, right? Your next step, your next opportunity. And as I started in my career, I was really thinking about that, because

I kept being seen and told that I wasn’t technical — but if you looked at my background, it’s in my education. Like, how am I not technical, right? So I really started thinking about branding as where people start meeting you. Your resume is a form of branding; your LinkedIn page is a form of branding. And I really saw it as sharing a story about yourself, your impact, your value — letting them know what they’re getting into before they even reach out to you.

It just naturally happened as a way for me to leave a toxic work environment and get into the next space. And as I realized I was doing it — like I said earlier, I didn’t realize I was doing marketing until somebody was like, “You’re marketing.” And I’m like, cool.

CRob (09:30.102)
I think what you do is very effective.

Yesenia (09:32.338)
Thank you.

Sally Cooper (09:33.345)
Yeah, I agree. Yesenia, you were an inspiration to me when I first started at OpenSSF because you were so good at branding. You had “the cybersecurity big sister” — I saw that somewhere. And then you started tagging me on LinkedIn, and you just made me feel like I was welcome. And I know that you do that for the community. You make people feel like there’s someone who is technical but also human, who leads with authenticity. So I was super impressed, and I always learn so much from you.

Yesenia (09:37.448)
No.

Yesenia (10:02.462)
What, you guys gonna make me cry? No emotion. No, there’s no crying — there’s no crying in baseball. I just aged myself there. But yeah, I think it’s really about creating those personas. And this is something that you can do for yourself, that you do for your community, that you do for your projects. It was just something where I realized we needed to connect people and get them moving. And personas have been talked about a lot today

CRob (10:05.006)
There’s no crying in open source.

Yesenia (10:31.39)
in this conversation. Sally, I love your expert opinion on this. Why do you think they’re so important when it comes to open source marketing?

Sally Cooper (10:42.189)
Yeah, well, CRob and I ran a project along with the OpenSSF staff where, about a year ago, we polled our community and asked a few questions to try to identify who they were, what their job titles were, what was important to them, how they learned about OpenSSF, and how we could serve them better. And we came up with a list of personas.

I will link the personas in this transcript — hopefully I can figure that out. But we have software developers and maintainers, open source professionals, the OSPOs, security engineers, executives and C-suite — there’s a whole bunch of titles there. And then we came up with a new one that we hadn’t thought about before, which is funny now that we’re talking a lot about marketing: there’s a product marketer

Yesenia (11:13.146)
Ooh.

CRob (11:36.91)
Mm-hmm.

Sally Cooper (11:36.985)
who is very much someone interested in open source software and open source security software. They’re typically a member, or looking to become a member, of the OpenSSF, and they want to help elevate the people they work with, the projects they’re working on, all the great work their companies are doing in open source. So really, personas help us move from “here’s a project” to “here’s how you ship secure code,” or “here’s how we can help you manage risk,” or “here’s how we can help you meet policy requirements.” Marketing has really become a service, and that’s where personas fit into the mix.

CRob (12:17.794)
Very nice. And thinking about this — you know, we’re three insiders for the foundation — if someone’s brand new to the OpenSSF and wants to learn more, what does that journey look like for them, Sally?

Sally Cooper (12:33.429)
Yeah, that’s such a good question. So first of all, we’re all really nice and welcoming, and you’re all welcome here. If you have an idea, marketing can help bring that to light. If you’re new to OpenSSF, you can join many of our working groups — actually, all of them; we have an open community. One that would be really beneficial is the BEAR Working Group — Belonging, Empowerment, Allyship, and Representation. They meet frequently and record their meetings on YouTube, so if you’re unsure, you can watch a few and learn a little more about what it would be like to be in a working group at OpenSSF. I strongly encourage you also to join our Slack channel — we will link that — and to follow us on social media. You can sign up for our newsletter. We try to meet people where they’re at.

So when we were talking about the personas, we learned that people are on different platforms. Some people would prefer to watch a video or read a blog. And so we try to cater to that, but we’re also always looking for feedback. So join the Slack, make yourself known. Again, if you have an idea, we can help you bring that to light. So we’d love to hear from you.

Yesenia (13:53.181)
And, you know, no personal bias — says the co-lead — but the BEAR group does do some awesome work. We also have a few blog posts released last year, which Sally and her team helped put out, that go into how to get started in open source. I know the community as a whole has been sharing them with new members as they come into the Slack channel like, “I’m new, how do I get started?” So there are great resources there.

So we’re kicking into 2026, even though my mind keeps thinking it’s 2016. I had to figure out what’s going on there, but, you know, one day we’ll go back there. Sally, as an insider, I’d love to know: what is marketing working on this year for OpenSSF’s mission and the growth of the community?

Sally Cooper (14:44.078)
Thank you — yeah, great question. So OpenSSF exists to make it easier to sustainably secure the development, maintenance, release, and consumption of the world’s open source software. We do that through collaboration, shared best practices, and solutions. And so our themes are showing up quarterly in 2026 to help people in our community meet these needs. For Q1, which we’re in now,

we’re focused on AI/ML security. For Q2, we’re going to talk about CVEs and vulnerability transparency.

CRob (15:25.432)
I’ve heard of that.

Sally Cooper (15:27.289)
Q3 is policy and CRA alignment. And Q4 is going to be all about that Base — so, the Baseline and security best practices.

Yesenia (15:41.01)
Very big fancy buzzwords there. So if anyone’s playing bingo as they listen, you got a few.

CRob (15:48.014)
Well, that has been an interesting overview of what’s been going on. But more importantly, let’s move on to the rapid-fire part of the show. We have a series of short questions, so just give us the first thing that comes off the top of your head — I want that visceral reaction. Slack or async docs?

Sally Cooper (16:18.092)
Async docs.

Yesenia (16:21.15)
Favorite open source mascot?

Sally Cooper (16:24.947)
The Base. Honk as The Base.

CRob (16:27.79)
Nice. Love that one. What do you prefer? Podcasts or audiobooks?

Yesenia (16:27.934)
Go, baby.

Sally Cooper (16:33.273)
Podcasts.

CRob (16:35.662)
Star Trek or Star Wars?

Sally Cooper (16:38.489)
Star Wars.

CRob (16:40.43)
And finally, what’s your food preference? Do you like it mild or do you like it hot?

Sally Cooper (16:48.939)
Medium.

CRob (16:50.188)
Medium? Well, thanks for playing along. So, Sally, if somebody’s interested in getting involved — whether it’s contributing to a project or potentially considering joining as a member on some level — how do they learn more and do that?

Yesenia (16:52.658)
That’s your question.

Sally Cooper (16:55.033)
Great question.

Sally Cooper (17:09.995)
Amazing. So go to openssf.org — from there, you can find everything you need. We referenced a blog earlier; you can go check out our blog and find out how to contribute one yourself. Everyone can join our Slack, join a working group, follow us on social media, subscribe to our newsletter. And we would love to see you at our events — those are open to all. And if you are a member, please get involved. Submit a blog.

Join us on the podcast — we would love to have you. We have a case study program, and we also do quarterly tech talks. If you can dream it, we can build it. And the best place to plug in is our Marketing Advisory Council; it meets the third Thursday of every month at 12 p.m. Eastern time. You can also reach out to us at marketing@openssf.org.

CRob (18:02.392)
Fantastic. And may I state how thrilled I am to be adding you as a voice of our community and having you join us as a co-host, Sally.

Sally Cooper (18:13.133)
Woohoo!

Yesenia (18:13.374)
Yeah, I’m very excited for a new voice to help offload some of this work — and for the stories that you’re going to bring, the guests we’re going to have on, and, as you shared earlier, our marketing for 2026.

Sally Cooper (18:27.982)
Well, thank you so much both for having me. It’s been a pleasure.

CRob (18:31.662)
Excellent. With that, we’ll call it a wrap. I want to wish everybody a great day and happy open sourcing.

Yesenia (18:35.718)
You’re welcome.

What’s in the SOSS? Podcast #48 – S2E25 2025 Year End Wrap Up: Celebrating 5 Years of Open Source Security Impact!

By Podcast

Summary

Join co-hosts CRob and Yesenia for a special season finale celebrating OpenSSF’s fifth anniversary and recapping an incredible year of innovation in open source security! From launching three free educational courses on the EU Cyber Resilience Act, AI/ML security, and security for software development managers, to the groundbreaking DARPA AI Cyber Challenge where competitors achieved over 90% accuracy in autonomous vulnerability discovery, 2025 has been transformative. We reflect on standout interviews with new OpenSSF leaders Steve Fernandez and Stacey, deep dives into game-changing projects like the Open Source Project Security Baseline and AI model signing, and the vibrant community conversations around SBOM, supply chain security, and developer education. With nearly 12,000 total podcast downloads and exciting Season 3 plans including AI Cyber Challenge competitor interviews, CFP writing workshops, and expanded global community initiatives in Africa, we’re just getting started. Tune in for behind-the-scenes insights, friendly competition stats on our most popular episodes, and a sneak peek at what’s coming in 2026!

Conversation Highlights

00:00 – Celebrating OpenSSF’s Fifth Anniversary
02:52 – Educational Growth and New Initiatives
05:51 – Community Voices and Leadership Changes
08:45 – The Role of Community Manager
11:44 – Open Source Project Security Baseline
14:47 – AI and Machine Learning in Open Source
17:47 – Software Bill of Materials (SBOM) Discussions
20:34 – Podcast Highlights and Listener Engagement
22:26 – Looking Ahead to Season Three

Episode Links

Transcript

CRob (00:05.428)
Welcome, welcome, welcome to What’s in the SOSS. Today I’m joined by my co-host, Yesi, and we’ve got a really great recap for everybody. We’re gonna be talking about the whole last season of What’s in the SOSS and some of the amazing people that she and I got to interview. Yesi, I’m excited to actually get to talk with you today.

Yesenia (00:13.58)
Hello.

Yesenia (00:30.318)
I know — I became co-host and never got to co-host with you, and here we go. But today’s exciting because it’s not just celebrating everyone’s impact and everything awesome that’s been done in the open source community — this year is actually OpenSSF’s fifth-year anniversary. That’s amazing. I just found out. I was like, whoa, good episode.

CRob (00:47.44)
Wait!

CRob (00:53.646)
Yeah, some of us have been around the whole five years, so it’s not quite a surprise, but hey. That’s right. So, looking back over the last year, we had so many amazing things that our community did and that we’ve highlighted through the podcast. You know, we had a whole section where we worked with our

Yesenia (00:58.798)
But at least we’ve made it longer than COVID. That’s fine.

CRob (01:18.704)
Linux Foundation education team on a whole Cybersecurity Skills Framework, to try to help coach new people into the profession and identify the skills that employers would want to hire for. And I know this has been talked about a little bit in the BEAR Working Group, right?

Yesenia (01:36.598)
Yes, it’s something we’re also considering as we bring in more contributors who are newer to this space. It’s a really good framework and a functional structure for how we can bring in these folks and help them skill up, as well as helping existing open source contributors.

CRob (01:53.368)
Right, and as we’re upskilling, you know, the crew in the back was really busy. We issued three whole new courses this year. All three, exactly.

Yesenia (02:02.318)
Free courses, and across very important spaces — because who isn’t talking about the CRA and AI? It’s right there for you. You’ve got an hour-long video on each, and you get a nice little badge at the end. And for our software development managers, we can also talk about security. So those are, you know, three new courses if you’re looking to expand your education.

You have LFD125, which is your Security for Software Development Managers. Then two on my bucket list because they impact my work: Understanding the EU Cyber Resilience Act, which is LFEL1001 — I wonder, my binary math is a little rusty, but I’m curious what that converts to. And then our secure AI/ML-driven development course. This one, I know a few people in the…

the BEAR Working Group have taken it, with good feedback. And BEARRRR! But beyond even these new courses, the group in general — we have new OpenSSF team members joining us.

CRob (03:14.96)
That was pretty cool. And I think you actually got the opportunity to interview Stacey when she started, right? She’s our new Community Manager.

Yesenia (03:23.456)
Yeah, if you haven’t worked with the OpenSSF, you haven’t met Stacey, our great community manager. I wanted to say a word, but we’re live, so I can’t. But she’s really driving things. It’s a good episode too — she got on our podcast and shared a little bit of her background. And I know she works closely with the BEAR community, helping drive a lot of the operations. But we also had a new General Manager, who you got to interview.

CRob (03:51.384)
Right, yeah. My new boss, Steve Fernandez, joined us around the first quarter, and he brings with him a real business- and corporate-focused background. So he’s really helped mature a lot of the stuff we do around here and enhanced the scope of the services that we offer the community.

Yesenia (04:13.006)
And there was one more. I don’t know, I can’t put my finger on it. There was one more new member. Hmm.

CRob (04:16.75)
Hmm

Well, we did have a new co-host this year. Hello?

Yesenia (04:22.306)
That’s right, it’s me! Yes! Sound more now!

CRob (04:28.747)
Very exciting. Yeah. And overall, the podcast focused on current topics and new and interesting projects like our Security Baseline. We had a couple of talks around the CRA, and — we’ll save this a little bit as a teaser for next year — we did several talks about AI.

Yesenia (04:49.902)
And we did also talk about the AIxCC, which, you know, is pushing security into the future with autonomous vulnerability discovery. I know from my past work that autonomous vulnerability discovery is such a complex, huge issue, so I’m excited somebody’s digging deeper into it and working with OpenSSF.

CRob (05:12.752)
And I think I mentioned in some of the podcasts, I came into the whole AIxCC competition incredibly skeptical — I was unsure of the value that AI tools would bring to this space. But after we got the results, I was just floored. The fact that the top team had over a 90% accuracy rate in finding and writing fixes for vulnerabilities —

Yesenia (05:39.078)
Wow.

CRob (05:41.836)
The second-place team was “only” in the high-80-percent success ratio… only. Yeah, there’s some amazing stuff there, and that really convinced me that there’s value in this space. I’m really looking forward to the collaboration around the cyber reasoning systems and a lot of the new things we’re doing in the AI space right now.

Yesenia (05:59.663)
Do you know if they’re continuing it next year?

CRob (06:06.116)
The competition isn’t continuing, but we will keep working with DARPA and ARPA-H and the different competitors. We’ve already lined things up — you’ll see some podcasts coming out early next year where we talk to the different competition teams. And several of those groups are already working to donate their software to the OpenSSF, to help continue to grow a community and continue the development and refinement of these systems. There’s going to be some amazing stuff out of the AI/ML Working Group next year.

Yesenia (06:24.087)
Nice.

Yesenia (06:34.68)
Yeah, because I can just imagine, with percentages like that, the intrigue — the research and the technical architecture of how they designed this to be able to produce such results. I know it’s going to have a huge impact on open source and on security overall. But in just one year, you know, we had educational growth, governance maturity, policy collaboration, our supply chain security — that’s one of my favorite phrases. I got earrings for it.

CRob (06:51.631)
Yeah.

Yesenia (07:04.864)
And then you’ve got AI, you know, the flow of that that’s come in. It’s really hit open source in a real way. And I’m excited, and I love that the podcast is capturing how we’re evolving in these spaces with voices from our community.

CRob (07:19.172)
Mm-hmm. So let’s talk about those community voices a little bit. You mentioned that I had the opportunity to talk with Steve, our new General Manager. It was interesting — Steve spent some time in his podcast, titled “Enterprise to Open Source,” talking about his journey. And he really focused in on how his decades of being a consumer of open source are informing his current role as a steward of open source.

Yesenia (07:56.931)
Yeah, after listening to that, you understand why he got the position, considering his background in the space — and just, since he started, the changes that have happened in open source. Steve’s vision bridges the enterprise risk mindset with that of open source, and that is something we definitely need to consider, because one of open source’s major consumers is the enterprise.

And I know he’s played a big role in the Baseline and in maturing that foundation. From listening to the episode, he talks about those decades of consuming, then stepping into this role, and really calling security a “hidden greatness” — the work you only notice when it’s missing or when you get impacted, right? Even for the everyday person: you won’t realize you need that security and privacy until, you know, your credit cards are stolen. So he’s really coming in and turning those enterprise pain points into the OpenSSF roadmap this year, and the greater goal is really helping organizations ship safer software.

CRob (09:09.88)
I agree. Now let’s talk about showing up fully. This was the interview you did with Stacey, our new Community Manager. What highlights would you like to share out of that conversation?

Yesenia (09:19.874)
This one — I love this, this is one of my favorites. Stacey came in with this background of being a community manager for open source communities, and she really hit the ground running and was pushing that train — like, she’s behind that train moving it. But her real focus was around belonging, authenticity, inclusion, and connecting BEAR with DevRel. And even though they’re two different working groups under us,

we have a very similar mission, just a different scope. So she was able to come in as a community manager and really ground how much community work underpins all the technical work. I’ve seen her show up fully to the calls — not just BEAR, but the other working groups — making sure she drives that community-first mindset. She connects with the maintainers, the members, the newcomers, making sure everyone is being heard and felt. So I absolutely love that, and, you know, there’s so much more in that episode.

CRob (10:24.24)
She also does a pretty amazing keynote.

CRob (10:29.968)
You’ve got to watch the video. It’s amazing.

Yesenia (10:45.43)
I didn’t get to see the keynote, but — you gotta watch it — I’ve heard so much about it, with her and Puerco.

CRob (10:53.368)
And that’s, I think, another interesting thing, pivoting around the community manager role. We have so many things going on across all the technical initiatives and working groups, it’s hard to keep track of all of it. And that’s why having this community manager role is so important — to be that connective tissue between the folks in the community who are contributing and the staff, the Board, and the TAC. So it’s really important to have that role to help keep us balanced and focused.

Yesenia (11:22.306)
Yes. And let’s not forget, the podcast wouldn’t be the podcast without Stacey sitting here, listening to us, editing, and publishing it. If you ever see Stacey, give her big kudos and please let her know — she’s working really hard behind the scenes. She’s listening to us right now. So tons of kudos to her.

CRob (11:39.352)
Absolutely. Well, thinking about what came up next: the Open Source Project Security Baseline was a big effort for us, both in our community and within the whole broader LF. Yeah — and we did a great podcast with two of the maintainers, Eddie Knight and Ben Cotton, titled “A Deep Dive into the Open Source Project Security Baseline.” And, you know, I thought that was

Yesenia (11:53.932)
You helped push a lot of that.

CRob (12:08.752)
Pretty amazing little chat because both Eddie and Ben approached this project from the perspective of an upstream maintainer. We want to do whatever we can to remove work and burden from upstream and allow them to focus on creating amazing software and not necessarily have them have to worry about a compliance checklist, so to speak.

Yesenia (12:33.998)
And from what I know of the Baseline, it ties together several projects.

CRob (12:39.596)
Yeah — the Baseline itself is the catalog, which is the brains of the whole operation, and it details a list of requirements that should be met in the course of software development, publication, and consumption. And then we have the ORBIT Working Group, which is the home for the Baseline, and the ORBIT Working Group has a series of software projects

that help automate or enable a lot of these different techniques. So we have things around making policy-based decisions in your CI pipeline, like Minder or Gemara. We have the Security Insights spec, which is all part of the ORBIT Working Group, and that’s a way for people to express how they are achieving some of these requirements. So, for example, if you’re a project and you issue SBOMs,

you can make a Security Insights file to tell people how to find your SBOM, so they don’t have to keep emailing you asking for more information.
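As a rough illustration of the idea CRob describes — a small, machine-readable file that points consumers at a project’s SBOM — such a file might look something like this. This is a simplified sketch with hypothetical URLs, not the exact Security Insights schema; consult the spec for the real field names:

```yaml
# SECURITY-INSIGHTS.yml — simplified, illustrative sketch (not the exact spec)
header:
  schema-version: "1.0.0"
  project-url: https://github.com/example/project   # hypothetical project
dependencies:
  sbom:
    - sbom-file: https://github.com/example/project/releases/latest/download/sbom.spdx.json
      sbom-format: SPDX   # or CycloneDX
```

A consumer (or an automated checker) can then fetch the SBOM from the advertised URL instead of emailing the maintainer.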

Yesenia (13:47.119)
And I heard a very quotable quote come out of this podcast — oh, we’ve got to put this on a t-shirt: “Give maintainers a way to show their security work, not just promise it.” Because that’s a huge thing. You’re working on these projects day and night in the stereotypical basement, and no one really notices unless they’re impacted. But it’s nice that we have a way for maintainers to show their security work, and to give themselves kudos and acknowledgement for the hard work of putting it together.

CRob (14:19.279)
Right.

CRob (14:23.21)
And that’s where I’m very excited — this ties in with Steve’s vision and strategy: projects like the Baseline or SLSA are things that help downstream consumers meet boardroom expectations, but all of these things are created and curated by the community. So again, we try wherever possible to focus in on the maintainer experience and on making things easier. And I just love that kind of dual purpose — we’re trying to help both upstream and downstream at the same time.

Yesenia (14:56.824)
Yeah. And then this year, going back to those educational pieces, some other episodes covered, you know, David Wheeler’s new secure AI/ML development course, and the Cybersecurity Skills Framework that we talked about earlier. And from there, we had that conversation with Sarah — I think you interviewed Sarah on AI model signing. What was your takeaway from that?

CRob (15:25.168)
Yeah, that was really great. So we have an AI/ML Working Group — one of our technical initiatives — and they’ve been around for about three years. It was a little bit of a slow start, where they did a lot of talking and evaluating and setting up liaison relationships with the whole cast of characters involved in AI security in the upstream ecosystem. And when I talked with Sarah, it was right after they’d had two publications.

The first was the AI Model Signing project, where they were leveraging Sigstore and in-toto to help consumers understand: here is a signed model or a signed artifact; on this day, they created this artifact, and it’s been untampered with since. So again, it’s trying to provide more information into the pipeline so people can make risk-based decisions.

Yesenia (16:02.837)
interesting.

CRob (16:21.774)
And then right after that, they also released a white paper talking about how to integrate DevSecOps practices into machine learning and LLM development. That’s been a really important artifact — it’s helped us recognize that there are a lot of people involved in creating, air quotes here, “AI stuff,” whether it’s an application, training a model, or trying to go to market with something. There are a lot of personas involved, and most of them aren’t classically trained software engineers or cybersecurity practitioners. So the white paper highlights these other people who participate in the creation process and talks about techniques that are old — from AppSec, things we’ve done for 25, 30 years that have worked well — that could be applicable in the AI space. But they also talk about some new ideas, because these technologies are a little different, and it does require some new ways of thinking, of being able to interrogate the different gizmos, whether it’s GPUs or agentic tech. So each technology requires somewhat different tools to help protect it.

Yesenia (17:33.487)
Yeah, I’m glad you brought up the white paper, because I was about to be like, I read the white paper! It was actually a good piece of knowledgeable guidance on how to do model signing, and I’m bringing it into my own industry. It’s a good read, you know. And then we have other reads, like the CRA compliance conversations we had with Alpha-Omega and the Erlang group. Those are also two good episodes to watch or listen to.

Yesenia (18:03.522)
When it comes to the CRA, that is. But, you know, we’ve talked about the Baseline, we’ve talked about GUAC, we’ve talked about SLSA — and the other square on the 2025 bingo card is SBOM. What episodes do we have on that?

CRob (18:14.746)
That’s right.

What episodes did we have on software bill of materials? 

CRob (18:51.608)
Right, we did do several things around SBOM. We had the opportunity to talk with Kate Stewart, who’s been a leader within the software bill of materials space almost since the beginning. She represents SPDX, which is one of the two formats most people use to create software bills of materials, the other being CycloneDX, which our friends over at OWASP caretake. And it was really interesting hearing Kate’s perspective on the evolution of these things.

And then more recently, I had the opportunity to talk with the Chief Security Officer of Canonical, my former coworker Stephanie Domas. We talked about a bunch of different things, and SBOM was wrapped up in that conversation — the challenges within the current regulatory space that both commercial entities like

Yesenia (19:26.445)
Ooh.

CRob (19:41.546)
...Canonical will face, but also upstream open source maintainers as well. So really engaging conversations around supply chain and software bills of materials. The GUAC conversation was also really good and kind of important. That’s a very useful tool to help you get wisdom out of your SBOMs. Wisdom.

Yesenia (20:00.601)
Wisdom. Word of the day. It’s awesome. Considering it’s OpenSSF’s fifth year, just this year’s reflection on podcasts, we’ve really covered multiple areas the community has been working on. And just my favorite thing about this whole thing is the little competition that’s going on across these podcast episodes, where our guests have come in and asked, what’s my number? What’s my view count? So as of today’s recording, we have Mike Lieberman’s talk on GUAC, SLSA, and securing open source at 611. GitHub’s Mike Hanley, transforming the “Department of No,” at 406.

Yesenia (20:55.886)
Eric Brewer and the future of open source software at 370, Vincent Danen and the art of vulnerability management at 328. I’m so glad my dyslexia is not switching these numbers. And lastly, we have Sonatype’s Brian Fox and the perplexing phenomenon of downloading known vulnerabilities at 327. So if you want to help these folks out,

Yesenia (21:25.644)
Give it a listen and let’s see if we can change the top episodes by the end of the year.

CRob (21:31.024)
It’s kind of a curious peek behind the scenes, where guests will come in and do their podcast. And they’re very interested. It’s not vanity, but people like to hear that their work is valued. And so there is very healthy competition and some little bragging rights, where Mr. Lieberman will kind of say, well, I have the most downloaded OpenSSF podcast. So it’s just kind of fun, a friendly little healthy competition. And again, it focuses in on some of these key areas of supply chain security, application development, software bills of materials, and such.

Yesenia (22:06.19)
Yeah, it’s crazy to see that, across all the episodes, we’ve had about 11,800 total downloads, and just 6,000 in 2025. So a big thank you to our listeners and our supporters for that. I think it’s the first year of this podcast, or the second.

CRob (22:24.526)
Second, the second. And that actually kind of gives us our segue towards the end here. We’re talking about a lot of things that happened during 2025. And we are about to publish our annual report, where you can kind of dive in and double-click on some of these details. We’ll provide a link as this podcast is published so that you can look at the report, which will link into things like our five-year anniversary, our work with DARPA on AIxCC, and all these amazing things around the Baseline. So I’m really excited to share that annual report with everybody; it touches on a lot of the topics that Yesenia and I have talked through, and many others. And that kind of moves us on. We’re going on to bigger and better things. 2026 is going to be season three.

CRob (23:17.88)
And I think we’ve got some really interesting topics kind of queued up.

Yesenia (23:21.464)
Are we gonna share? Are we gonna share? Are we gonna be nice to our listeners?

CRob (23:24.944)
I think everyone’s on the nice list. We can share that with them. Yeah, you’re going to see us starting off the year with kind of a full-court press around AIxCC and AI and ML security topics. We have a bunch of work queued up with some of the cyber reasoning system competitors. We’ll talk with some of the competition organizers, again, talking more with our community experts around these very important topics and maybe unveiling some new projects that our AI team has in the hopper. That’s going to be very exciting. We’re going to have some very special guests from around the world of open source, public policy, and research. And we’re going to have some very recognizable names that may have been on the show or are part of our community’s orbit, whom we would love to reengage with and talk more with.

CRob (24:21.358)
So thinking ahead, you’re going to see multiple series of episodes around the AIxCC competition in particular. We’re going to be focusing in on industry and research stars. So we’re going to try to find some well-known voices out in the research community, joining some of our maintainers and having some big-picture conversations about the ecosystem. And then you’ll see many more things around our education efforts.

Would you like to talk about some of the stuff I know that Bear’s preparing to do?

Yesenia (24:51.822)
For BEAR, we have very exciting things for next year. Not associated with that sports team. We have the next mentorship for the summer that we’re going to be producing. We’re working through those details. We’re working with a group out in Africa, what is it called? The Open Source Security and Software Africa Group, with the primary focus on doing speaking engagements and holding meetups and conferences in Africa. Because there’s a huge community there that really has nowhere to go; with global restrictions and visas, they’re very limited. So we’re helping them kind of grow that out, sharing some tips and tricks that we’ll be posting on social just to drive more awareness of these projects and these teams. And of course our community office hours, which have also had a lovely set of community members come in and share their journeys, plus education pieces and blogs that have been recently produced, like the ones Sal and Ejiro have written about newcomers to open source. We’re working on getting part three released, but you can find parts one and two on OpenSSF’s main blog page.

CRob (26:39.576)
Excellent. And I’m also excited that we’re going to be doing some special education segments on the podcast around how to write a good call-for-papers abstract, and then how to build your first conference talk, which is something that, again, a lot of these newcomers haven’t had experience with. Some of us who have been around the block a little bit can help share some of the wisdom we’ve earned over the last couple of... Right.

Yesenia (26:46.83)
Yes.

Yesenia (27:02.358)
My try on era.

CRob (27:07.512)
And with that, I want to thank you for coming on board and being our co-host. You’ve really brought a nice energy and a fresh perspective when you’re talking with our community members. And I wanted to remind everybody, as we are preparing for season three, if you have ideas or suggestions for topics, please email marketing at openssf.org. We would love to hear your episode pitches and your CFP stories, or if you want to do some demos or have case studies.

Yesenia (27:12.119)
Absolutely.

CRob (27:35.822)
Or if you just have general projects that help move the broader OpenSSF mission of improving the security of open source software for everybody forward. So thank you again, Yesenia. It’s been a pleasure. I’m looking forward to another exciting year of talking with you. All right. Happy open sourcing, everybody.

Yesenia (27:50.776)
Thanks, CRob. On to the next episode.