
What’s in the SOSS? Podcast #9 – Sonatype’s Brian Fox and the Perplexing Phenomenon of Downloading Known Vulnerabilities


Summary

Brian Fox is Co-founder and Chief Technology Officer at Sonatype, bringing over 28 years of hands-on experience driving software development for organizations of all sizes, from startups to large enterprises.

A recognized figure in the Apache Maven ecosystem and a longstanding member of the Apache Software Foundation, Brian has played a crucial role in creating popular plugins like the maven-dependency-plugin and maven-enforcer-plugin. His leadership includes overseeing Maven Central, the world’s largest repository of open-source Java components, which recently surpassed a trillion downloads annually.

As a Governing Board member for the Open Source Security Foundation, Brian actively contributes to advancing cybersecurity. Working with other industry leaders, he helped create The Open Source Consumption Manifesto, urging organizations to elevate their awareness of the Open Source Software (OSS) components they use.

Conversation Highlights

  • 00:57 Brian shares his background
  • 03:56 The confusing trend of downloading assets on Maven Central with known vulnerabilities
  • 08:16 How this trend continues in other repos
  • 11:08 Brian and CRob discuss Log4Shell
  • 16:54 Brian answers CRob’s rapid-fire questions
  • 18:46 Brian’s advice for up-and-coming security professionals
  • 19:50 Brian’s call to action

Transcript

Brian Fox soundbite (00:01)
The customer is not gonna accept as an excuse, like, why did you get hacked? Well, my direct dependencies were fine, it’s those indirect ones that were bad. Like, nobody cares, right? Like if you buy a pie from a bakery and you get sick, it’s not acceptable for them to go, well it was my sugar provider that did a bad job here.

CRob (00:18)
Hello everybody, welcome to What’s in the SOSS? I’m CRob. I do security stuff on the internet. I’m also a community leader within the Open Source Security Foundation, and I’m one of the hosts of this amazing podcast. Today I’ve got my friend Brian Fox. He’s the CTO and co-founder of Sonatype and he is an amazing open source contributor and he’s a great collaborator with us and a lot of industry efforts. So I want to say, Brian, welcome.

Brian Fox (00:49)
Yeah, thanks for having me.

CRob (00:50)
Could you maybe share a little bit about your origin story? How you got into open source and kind of the stuff you’re into today?

Brian Fox (00:57)
Oh, sure, my origin story. That’s a long one, but the short version is I dabbled in open source before it was, I think, called open source, in college actually, some Perl CGI open source stuff way, way back in the day. But I would say my first real introduction into like durable open source, let’s say — I had dabbled with a JSP library at SourceForge. And I don’t really count that, but you can see they’re all connected in a straight line. I got involved with Apache Maven super early, in the 2.0 alpha days. I wrote some plugins. I was basically doing development management during the day, had some folks working for me, and we were trying to transform our builds from Ant to Maven and move from CVS, I guess it was, to Subversion.

Yeah, I’m dating myself a little bit. And we were running into some issues, and so because I didn’t have much opportunity to code during the day, I was coding at night. So I would go home and start working on fixing bugs and writing plugins that the folks working for me would then use the next day to move forward. And so I wrote some pretty popular plugins, the Dependency plugin and the Enforcer plugin, that came from the Codehaus project — Codehaus was sort of an adjacent open source forge, if you will — that eventually got pulled into Apache Maven proper. And I came along with it as a committer, was later a project management committee member, and for a while after we started Sonatype was even the chair of that project.

And for historical reasons that, you know, we could have a whole podcast on, the Maven Central repository has always been a separate project from the Apache Maven software itself. And basically, a handful of us, and then later Sonatype, have been the stewards of that all along. So I know where all the bodies are buried and have lots of war stories from open source Java and Maven Central in general, for the last 20-plus years, I suppose.

CRob (03:03)
So I’m a little bit of a data nerd and I know you share that kind of passion for data and numbers. Your organization, Sonatype, puts out an annual report, well, many annual reports, but the one I’m most interested in is the State of the Software Supply Chain Report. I’ve referenced that one for many, many years. It’s pretty amazing. And I want to talk about some of your findings from the most recent one. Maybe that’ll get us into this topic. But you know, when I was looking at the 2023 report, I noticed one statement in particular: you noted that 96% of all vulnerable downloads from Maven Central had known fixes available. What’s up with that? Why are folks still downloading known vulnerable packages when there’s a fix available?

Brian Fox (03:56)
Yeah, yeah, this has been my soapbox, as you know, for a couple of years. 2022 was the first year we published that stat. Sadly, it was unchanged in 2023. It did not get better. What does this stat mean exactly? Off the top of my head I’m not certain, but I think around 12% of the things in Central have a known vulnerability, and they’re skewed, obviously, towards more popular components. So when those things are downloaded, looking at the point in time each one was downloaded, was there already a fix for that vulnerability available? In other words, it’s not a case of the vulnerability being out there and people using it because it’s not fixed.
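To make that measurement concrete, here’s a minimal sketch of the kind of point-in-time check involved, using the public OSV.dev query API. The Maven coordinate and version below are only an illustration, and the parsing assumes OSV’s documented schema, where each vulnerability lists affected ranges with “fixed” events.

```python
# Ask OSV.dev: does this exact Maven version have known vulnerabilities,
# and if so, was a fixed version already published?
import json
import urllib.request

def known_vulns(coordinate: str, version: str) -> list:
    query = {"version": version,
             "package": {"name": coordinate, "ecosystem": "Maven"}}
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

def fixed_versions(vuln: dict) -> set:
    """Collect the 'fixed' events OSV records for a vulnerability."""
    fixes = set()
    for affected in vuln.get("affected", []):
        for rng in affected.get("ranges", []):
            fixes.update(e["fixed"] for e in rng.get("events", []) if "fixed" in e)
    return fixes

for v in known_vulns("org.apache.logging.log4j:log4j-core", "2.14.1"):
    print(v["id"], "fixes available:", sorted(fixed_versions(v)) or "none")
```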

After Log4Shell, which I’m sure we’ll get into a bit more, you know, we saw a lot of people talking about how open source should do a better job of fixing their stuff. And it’s like, well, when you sit where I am and you look at that, you go, well, wait a minute, when these things are being consumed from the repository, 96% of the time the fix is already available. So what does that tell you? It tells you that the consuming organizations are not choosing to update for the fix. It shouldn’t be a 0% stat, right? There’s always going to be vulnerabilities that in certain contexts just don’t apply, and so you can continue to use the thing, or you have other mitigating controls.

But 96% is crazy. And the fact that it doesn’t change. So why does that happen? There’s lots of reasons. And I’ve recently, in blog posts and articles, been exploring more the psychological side of things. I think humans are wired to procrastinate and hope that the future will be better. And I think there’s an element of that, that people look at it and go, well, you know, we’ve lived with these things so long, you know, I’m afraid to do an update. Dr. Wheeler and I have kind of toyed with the idea that API changes introduce a bunch of these problems, or at least that the fear and the history of API breakages have caused people to be afraid to make an update.

And so there’s lots of little reasons that cause organizations to just sit there and not update, and therefore continue to consume these known vulnerable things. And so I’ll ask the obvious question, because when I point this stat out, everybody turns around and asks the same thing, which is, Brian, why do you make vulnerable versions available for download in Maven Central at all? Right? And so, you know, part of that is precedent and history, that Maven Central is known for its stability, that we don’t allow authors to just up and decide to unpublish something. For those of you that remember left-pad on npm years ago, it was a little bit of a tit-for-tat, and a maintainer removed a package and it broke, like, the internet, right? And so for reasons like that, we don’t allow things to just up and disappear.

Most of the time, like I said, not everything is universally applicable. So I use an analogy. For some people, peanut butter can kill them. They are deathly allergic to peanut butter. I can go into the store right now and buy peanuts, buy peanut butter, buy things with peanut butter in them. Why? Because I happen to, fortunately, not be allergic to peanut butter. I do not have that vulnerability, or at least that vulnerability does not apply to me. So like that analogy, why should we take every single component down just because it has a vulnerability, and make it impossible to reproduce old things, break things that don’t need to be broken?

And I draw a distinct line between a vulnerability and a known malicious package. A known malicious package is like saying we’re going to continue to sell salmonella-tainted food because some people luckily don’t get affected by it. Like, no, that’s very, very different, right? So salmonella components, yeah, they’re coming out immediately. Peanut butter, they should be labeled, they should be understood, so that people that have vulnerabilities can know about it, right? And so that’s why we don’t just disappear components merely because there’s a vulnerability.

CRob (08:04)
You have a lot of observability into Maven Central. Do you feel that the pattern you’re seeing probably exists in other repositories as well? Do you think it’s similar?

Brian Fox (08:16)
I know it does. Yeah, I know it does. How do I wanna phrase this? There are similar problems, but they don’t have the same root origin. So, for example, Maven has a strong namespace in the group ID, right? Typically that would be the reverse DNS of your company, just like Java package standards. We didn’t invent it; we just followed the Java standard. And when somebody comes to Maven Central, we validate that they control that domain and that namespace, or alternatively they can use their project URL at GitHub, as an example, right? And we do the validation so that somebody can’t show up to the repository and pretend to be Eclipse and start publishing things. You can’t pretend to be Apache and start publishing things.

That’s an important piece of this. The other piece of it is that Maven prefers stability over auto-updates. So it prefers staying on a version number. So it’s sort of an anti-pattern in Maven land to say, just get me the latest version of this thing all the time. So that choice, that design choice, has unintentionally created the issue that I talked about, that people aren’t upgrading, right? That’s an unintended side effect of that. But now let’s look at what’s happening on ecosystems like NPM, Python, Ruby, where the tool has made the choice that we are always gonna check and fetch the latest version unless I’m told not to, right? 

That’s why you have the concept of a lock file to lock it down, because if you don’t lock it, it’s going to fetch it. And also, unfortunately, those ecosystems don’t have strong namespace enforcement. So what happened, starting around 2017, is the attackers figured this out and started publishing components into the repositories with confusingly similar names, basically typosquats. And since there’s no namespace enforcement, it’s very hard for consumers to tell whether they’re getting the legit component or not.

And you couple that with the fact that the tool is going to auto-update these versions all the time, which means if somebody were to compromise an actual legitimate JavaScript project and put a fake component up there — and this has happened many times — the consumers will fetch it almost immediately. So you’ve created this situation where it’s easy for the attackers to put stuff in the repo and everybody will consume it almost immediately, so you have a ready, willing audience. Which means you’re seeing on those ecosystems this massive stat we talk about: 350,000 known malicious component attacks in the last couple of years, and it’s been doubling year over year.
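As a rough illustration of the defensive side of this, here’s a hedged sketch of a pre-install check that flags names suspiciously close to well-known packages; the popular-package list is a made-up stand-in, and real tooling uses far more signals than string distance.

```python
# Flag package names that are close to, but not exactly, a popular package.
import difflib

POPULAR = {"react", "lodash", "express", "requests", "numpy", "webpack"}

def typosquat_warning(name: str, cutoff: float = 0.8):
    if name in POPULAR:
        return None  # exact match: presumably the package you meant
    close = difflib.get_close_matches(name, POPULAR, n=1, cutoff=cutoff)
    if close:
        return f"'{name}' is suspiciously close to '{close[0]}': possible typosquat"
    return None

print(typosquat_warning("lodahs"))   # flags: close to 'lodash'
print(typosquat_warning("requets"))  # flags: close to 'requests'
print(typosquat_warning("react"))    # None: exact match
```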

That’s only happening in those ecosystems. So we don’t have this problem in Maven because there’s not the auto-update and there is the namespace. But we’re having the massive attacks on these other ones because they do, right? So it’s sort of like these design decisions lead to different unintended consequences, I guess I would say.

CRob (11:08)
So let’s talk about a specific example: our friend Log4Shell, that ruined many, many IT folks’ holidays in December a year or two back. I have an old stat: there were almost 250,000 known downloads of the vulnerable versions of Log4j, which was about 29% of all the downloads at that time. How is it that people are still grabbing those known vulnerable versions? Why is that, with such a widely known vulnerability?

Brian Fox (11:45)
(Laughs) Yeah, I wish I knew, honestly. So I’m looking at the latest stats, and anybody listening to the podcast can see them yourself. We have a resource center that’s been up since 2021. It’s at sonatype.com/log4j, so you can see those stats yourself. As we sit here right now, we’re looking at 405 million downloads since it was announced, and worse, the last seven-day average is that 35% of those are of the vulnerable versions. So it’s actually gone backwards in the last year. We were down at one point, I think it was in October when we published the report, to the twenties, and it took multiple years to get there, which was really disappointing, but it slid backwards for reasons that I have not been able to explain.
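For readers who want the version test itself, here’s a rough sketch of the comparison a scanner makes for Log4Shell (CVE-2021-44228): log4j-core 2.x below 2.15.0 is in the affected range, and public guidance after the follow-on CVEs was to move to 2.17.1 or later. A real comparator must also handle qualifiers like 2.0-beta9; this toy parser only handles plain numeric versions.

```python
# Classify a log4j-core version against the Log4Shell (CVE-2021-44228) ranges.
def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

def log4shell_status(version: str) -> str:
    v = parse(version)
    if v < (2,):
        return "1.x: outside the CVE-2021-44228 range (but long end-of-life)"
    if v < (2, 15, 0):
        return "VULNERABLE to Log4Shell: upgrade"
    if v < (2, 17, 1):
        return "patched for CVE-2021-44228, but below the 2.17.1 guidance"
    return "at or above 2.17.1"

for version in ["1.2.17", "2.14.1", "2.15.0", "2.17.1"]:
    print(version, "->", log4shell_status(version))
```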

And so, I also use this one sort of as the perfect bellwether for the reasons that you outlined, right? That it’s fairly easy to exploit. It’s fairly publicized. Everybody knows about it. And it was sort of one of these situations very early on that every vendor was getting asked by their customers, did you remediate this? And they would have to explain why. 

And so it created this situation where it was easier to just upgrade than it was to explain why you weren’t upgrading. So like I said before, I wouldn’t expect the uptake of every vulnerability to be 100% updates. For Log4Shell, I think that is closer to true, and so we should expect the uptake here to be, you know, closer to 80, 90, 95%, I would expect. I’ve done some research recently to look into the origin of this, because people will often try to justify it as, it’s just a few people with broken builds that are running all the time, right? Because it does seem incredible.

I’ve looked into the vulnerable downloads within the last month and they’re coming from millions of IPs worldwide. You can’t pin it down to, this is like a provisioning script that’s running on EC2 or something like this, right? Or some serverless configuration that might be happening in a Lambda, you know, weird things like that. No, it’s all over the map. 

And when I’ve taken the IP numbers that are fetching these vulnerable versions and I look at the cohort of dependencies, to see, is it one application everybody’s building, or something weird? Is it one place that’s causing this as a transitive dependency? I’ve not been able to put my finger on it. It is all over the map, which just tells me it’s kind of what I suspected in the beginning. It’s just average usage, and folks for whatever reason just freaking haven’t upgraded. (Laughs)

So it’s really disappointing, because if it were any of the other things I mentioned, we could go attack that at its source and have a big impact. Instead, it’s messaging like this to get people to pay attention. And this kind of intersects the SBOM area. If you can’t produce an SBOM, it kind of implies you don’t know what’s inside your software. If you don’t know what’s inside your software, why should we expect that you’re doing a good job of updating it?

CRob (15:03)
So thinking about solutioning this a bit, do you have any advice for what developers or downstream consumers can do as they’re ingesting open source components to avoid this continual downloading of known vulnerable packages?

Brian Fox (15:15)
Yeah, I mean, tooling has existed for a long time to help you understand the entire transitive hull of your application. You know, Sonatype’s been providing this for, what year is it now? 14 years. (Laughs) So this is not a new thing. And you know, in the regulated world, we’re seeing legislation all over the place, from the US government, from the European Union, PCI standards. Everybody’s pushing towards, you need to be able to produce an SBOM. And this is part of the reason why: it’s no longer acceptable to just blindly ignore what’s inside your software. And we saw for years that organizations were only focused on the direct dependencies, the things their developers use, and ignored the transitive dependencies.

But there are more transitive dependencies than direct ones. So your odds of pulling in something bad are even higher further down. It’s quite literally the iceberg: 70% of it is coming from these open source transitive dependencies. You need to have visibility into that, because your customer is not going to accept as an excuse, like, why did you get hacked? Well, my direct dependencies were fine, it’s those indirect ones that were bad. Nobody cares. If you buy a pie from a bakery and you get sick, it’s not acceptable for them to go, well, it was my sugar provider that did a bad job here. Like, I’m sorry, you know, you still created this thing and sold it to me. You’re still responsible.
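To see the iceberg shape in miniature, here’s a sketch that walks a CycloneDX-style dependency graph outward from an application’s root and counts direct versus transitive dependencies; the graph itself is made up.

```python
# Count direct vs. transitive dependencies by walking the graph from the root.
from collections import deque

# CycloneDX-style "dependencies" entries: each ref lists its direct deps.
graph = {
    "my-app": ["lib-a", "lib-b"],
    "lib-a": ["lib-c", "lib-d"],
    "lib-b": ["lib-d", "lib-e"],
    "lib-c": [], "lib-d": ["lib-f"], "lib-e": [], "lib-f": [],
}

direct = set(graph["my-app"])
reachable, queue = set(), deque(direct)
while queue:
    dep = queue.popleft()
    if dep not in reachable:
        reachable.add(dep)
        queue.extend(graph.get(dep, []))

print(f"direct: {len(direct)}, transitive: {len(reachable - direct)}")
# -> direct: 2, transitive: 4 (and the ratio only grows in real projects)
```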

And I think that ultimately that’s what it’s gonna take: some form of liability reform, to the point where organizations are actually responsible for the outcomes, is what’s needed to change this behavior.

CRob (16:54)
Well, I think this is gonna be an intermediary step in an ongoing conversation. And hopefully we can help raise that awareness and get folks focused in on this unique problem. Let’s move on to the rapid-fire section of the interview here. Got a couple questions and we’re gonna see what your thoughts are. First question, Brian: spicy or mild food?

Brian Fox (17:20)
Depends on the day, but let’s go with spicy today.

CRob (17:23)
(Sound effect: “Oh, that’s spicy!”) Nice. Being a developerologist by background, what’s your preference: Vi or Emacs?

Brian Fox (17:34)
Oh, Vi.

CRob (17:35)
Huzzah! What’s your favorite type of whiskey?

Brian Fox (17:41)
Ooh, ooh. I don’t know. I like variety. How’s that? I like to explore new ones. I have one that they make here — actually, it’s in Vermont, just over the border — they make it from maple syrup. And that one is really unique. So I’m gonna go with that. How’s that?

CRob (17:58)
All right, that sounds like a delight. And our final, most controversial question: tabs or spaces?

Brian Fox (18:06)
(Sighs) Spaces.

CRob (18:09)
Spaces, very good. There are no wrong answers, but that’s interesting how that causes a lot of controversy.

Brian Fox (18:17)
Yeah, you’re not going to ask about, you know, Allman versus K&R? I mean, that always is related to tabs and spaces.

CRob (18:24)
(Laughter) Maybe in season two.

Brian Fox (18:28)
Yeah!

CRob (18:30)
Well, thanks for playing along. (Sound effect: That’s so sweet!) As we close out, what advice do you have for someone that’s thinking about entering this field, whether it’s being an open source developer or getting into the field of cybersecurity? What advice do you have for newcomers?

Brian Fox (18:46)
I think my advice is always the same. Pick something that you’re interested in, whether it’s robotics or build software, whatever, and go find an open source project and try to get involved. It’s a great way to learn a whole bunch of skills. I mean, you learn how to write better code, but also how to navigate, you know, the inevitable politics when you have more than two humans involved. The communication skills, the politics, the collaboration.

I think you can learn a lot from open source, especially if you’re, let’s say, in high school or college and you don’t necessarily have the ability to get a job yet. You can go get some real-world experience through open source projects, and there’s open source for basically everything. So if you’re into rockets or planes, whatever, go find something; it’s out there. It’s even easier today than it was 20 years ago, right? And that would be my advice.

CRob (19:44)
And finally, do you have a call to action for our listeners to help kind of inspire them?

Brian Fox (19:50)
If I were to tell you about a new vulnerability right now that you’ve never heard of, and your organization can’t immediately assess whether you’re even using that component anywhere, or further, can’t quickly produce the list of applications that are using it, then you’re basically powerless to respond to this next problem, right? That situation I just described is what most of the world went through on, what, December 9th, 2021, when Log4Shell dropped on them.

We have studies, it was in the 2022 report, and I think we probably repeated it in ’23, that showed organizations that had tooling in place remediated their portfolios of thousands of applications within days, versus other organizations that spent six months doing triage. So if you don’t have a system that can immediately tell you, are you using this component anywhere, any version of it, and you can’t get to the point of saying, we are using this precise version in these applications, then you need to solve that problem immediately. If only to prepare for the inevitable next thing, and for the legislation that is going to require you to have the SBOMs.
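Here’s a minimal sketch of the “are we using this anywhere?” query Brian describes, assuming you keep one CycloneDX JSON SBOM per application in a directory; the field names follow the CycloneDX schema, but the directory layout and filenames are hypothetical.

```python
# Which applications ship a given component, and at which versions?
import json
from pathlib import Path

def apps_using(sbom_dir: str, component_name: str) -> dict:
    hits = {}
    for sbom_path in Path(sbom_dir).glob("*.json"):
        sbom = json.loads(sbom_path.read_text())
        versions = [c.get("version", "unknown")
                    for c in sbom.get("components", [])
                    if c.get("name") == component_name]
        if versions:
            hits[sbom_path.name] = versions
    return hits

# e.g. the Log4Shell drill: who ships log4j-core, and at which versions?
print(apps_using("./sboms", "log4j-core"))
```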

And I would add, if you have solved that, then you need to be looking at these intentionally malicious problems, because those require different solutions. They’re not going to show up in SBOMs. They’re trickier, because they’re attacking the developer as soon as they download them. So if you think you have your inventory under control, you need to ask yourself, how would I know if a developer downloaded one of these typosquatted components that may have dropped a backdoor, or might have had custom code that simply exfiltrated data directly from the open source component? Your traditional scanners are not going to pick this up. So I guess that’s two things, depending on where you sit on that spectrum.

CRob (21:39)
Well, excellent. I appreciate it; you have some amazing advice, and those are some interesting things to think about and act on. As always, I really appreciate your partnership and your ongoing contributions to help make the whole ecosystem better, Brian. You’re an amazing partner.

Brian Fox (21:53)
Yeah. Likewise. Thanks, CRob, for inviting me.

CRob (21:55)
Cheers!

Announcer (21:56)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at openssf.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #8 – Intel’s Arun Gupta and Giving Back to Security Communities


Summary

Arun Gupta is vice president and general manager of Open Ecosystem Initiatives at Intel Corporation and the OpenSSF Governing Board Chair. Arun has been an open source strategist, advocate, and practitioner for nearly two decades. He has taken companies such as Apple, Amazon, and Sun Microsystems through systemic changes to embrace open source principles, contribute, and collaborate effectively.

On July 9-10, the OpenSSF will attend the 2024 OSPOs for Good symposium hosted by the UN. Arun and What’s in the SOSS? co-host Omkhar Arasaratnam will lead a session called “Engaging the Open Source Community.”

Following the symposium on July 11, attendees are invited to come to a secondary event, What’s Next for Open Source? It will feature a collection of curated workshops to discover how to build and gather the skills you need to move forward with open source. Omkhar is coordinating the security track and presenting opening remarks. Arun will offer closing remarks. 

Conversation Highlights

  • 02:13 – Arun’s general outlook on security and life
  • 03:39 – Arun shares his personal background and illustrious career history
  • 09:04 – Comparing the OpenSSF and the Cloud Native Computing Foundation (CNCF)
  • 13:30 – Arun details his work with the United Nations
  • 16:42 – Areas that a lot of security professionals are getting wrong
  • 18:20 – Arun answers Omkhar’s rapid-fire questions
  • 19:08 – Advice Arun would give to aspiring security professionals
  • 20:40 – Arun’s call to action for listeners

Transcript

Announcer (00:01)
A quick programming note: On July 9th and 10th, the OpenSSF will attend the 2024 OSPOs for Good symposium hosted by the UN. What’s in the SOSS? co-host Omkhar Arasaratnam and today’s guest Arun Gupta of Intel will lead a session called “Engaging the Open Source Community.”

Following the symposium on July 11th, attendees are invited to attend a secondary event, What’s Next for Open Source? It will feature a collection of curated workshops to discover how to build and gather the skills you need to move forward with Open Source. Omkhar is coordinating the security track and presenting opening remarks. Arun will offer closing remarks. 

For more information, check out the links in this podcast’s episode description.

Arun Gupta soundbite (00:45)
Sometimes security folks focus too much on the technology. Take a step back. Have that empathy for the customer. Does the customer understand that language? Are you talking in their language? Is the intent of your dialog landing the impact on the customer who’s listening to that discussion? 

Omkhar Arasaratnam (01:03)
Welcome to What’s in the SOSS? I am your host and the general manager of the OpenSSF, Omkhar Arasaratnam. And with us today, we have Arun Gupta. Arun, tell us what you do.

Arun Gupta (01:18)
Omkhar, thank you for having me here. I’m really happy and excited to be here. I am the Vice President and General Manager for the Open Ecosystem Team at Intel, and that’s my day job. As part of that, I coordinate open source strategy across the entire company, whether it’s software, hardware, all through the stack: why we contribute, how we contribute, how we bring alignment across different business units.

So that’s quite an exciting venture actually. In addition, because of Intel’s legacy, it allows me to do a lot of chop wood, carry water kind of work in the community. So I’m really fortunate and very grateful to be the chair of the OpenSSF governing board, in addition to the chair of the Cloud Native Computing Foundation governing board as well.

Omkhar Arasaratnam (02:07)
Holy cow, where do you find time to sleep? And you run as well, you’re a runner, is that right?

Arun Gupta (02:13)
I like to run. I think the running is what gives me… I was listening to a podcast this morning and it talks about self-compassion. And I think that’s something that I’m really big on. So it’s very important to be compassionate to yourself. Make sure you’re taking care of yourself so that you can do all of these other things. 

I must say I’m really blessed and fortunate in that sense, that people like the way I think, like the way I operate, like the way I treat them, going back to Maya Angelou’s quote, really. And I think that’s what has helped me get into these leadership positions. There are a lot of wonderful people in this world, but you know, making sure you are listening to people, engaging with people, taking care of them, being empathetic to them, those are some of the traits that you really need in a leadership position. But it really gives me a satisfying feeling at the end of the day, being the governing chair of two of the largest Linux Foundation sub-foundations, essentially, and driving them forward.

Omkhar Arasaratnam (03:15)
Your contributions to the community have been numerous, and this certainly isn’t your first day in open source. And I think your numerous contributions to the community are part of the reason why you’ve been elected to such prestigious positions within these two foundations. Can you talk to us a little bit about your history in open source? How long have you been doing it? How’d you get your start?

Arun Gupta (03:39)
Yeah, I grew up in India and moved to the United States back in ‘98. And I was very fortunate to literally go to sun.com/jobs and apply for a job. And I was one of the original JDK team members. And gosh, over two decades ago, we started changing the culture at Sun Microsystems at that point in time. It was a very closed source company: Solaris, Netscape application server, all that.

And then that’s when we started changing the culture at Sun. How do we take this closed source application server and make it an open source application server? And we realized it’s not just about putting the source code over the firewall; it’s really bringing that people, process, culture change, all of that kind of coming together, essentially. So over 20 years ago is when that bug got into me, and I found it very exciting. It’s like, wow, you know, this is a core competency of the company and you’re putting that out in the open, and yet that allows you to collaborate with your partners and be able to compete with them as well. That was quite exciting. So back in the 2003, 2004 timeframe is when I started getting into that movement, and it was still new at that point in time.

But then, over the last 20 years, that’s the only way I’ve lived and operated, exclusively. From Sun, I went to Red Hat, where you will see on the walls of their offices, “First they laugh at you, then they fight with you, and then you win.” And that kind of mantra gets into your blood, because that’s the open source philosophy, right? Then I was at Couchbase, then I was at Amazon, part of the open source strategy team, where I was on loan to multiple service teams crafting their open source strategy.

I remember launching Amazon EKS, Amazon’s managed Kubernetes service, back in 2017, and educating the service team: hey, how do you participate in the open source community? What does it mean? There’s a concept of social dynamics, social engineering, that you need to understand. You can’t just submit a pull request and expect it to be accepted. So I think that’s the norm that I taught. And then I was on loan to multiple service teams.

After Amazon, I spent a couple of years at Apple, and I was fortunate enough to craft their first open source program office. I built that office and went all the way up to multiple executives, building the case for why Apple should contribute to open source. A lot of fun over there.

But over the last couple of years, Greg Lavender, our CTO, reached out and said, “Arun, we would like to build an open ecosystem culture back at Intel.” And so I’m very fortunate here. After a very long time, I feel very happy and excited that, all through my management chain up to Pat Gelsinger, I don’t have to explain what an open ecosystem is. They are the ones that are really pushing the boundary, and the entire company is built on it. We believe walled gardens prohibit innovation. We believe an open ecosystem creates an equitable playground for multiple players to collaborate, and also increases the total addressable market so that you can do more fun things on top of that. So I think in that sense, I’m very fortunate to be working at Intel, and very fortunate and blessed to be working in this open source movement for the last couple of decades.

Omkhar Arasaratnam (07:07)
What an inspirational story. And I will second that Intel is definitely one of the examples of an organization that really gets open source. As an old kernel guy, it always used to make me smile to see that the new bits for whatever the new processor was would hit the kernel well ahead of the silicon being released to the street. And there was a big focus on upstream, as well as maintaining the ecosystem that we all enjoy. So thank you, Intel, and thank you for the work that you do there, Arun.

Arun Gupta (07:41)
It keeps it sustainable. The reason we contribute is because, as you said, Intel has been the largest corporate contributor to the Linux kernel for 15-plus years. We contribute there because our customers, when they buy a laptop from Fry’s or Best Buy or an online retail store, expect that when they download Ubuntu or whatever operating system of their choice, it will work out of the box and be able to leverage the latest processors.

And that’s the reason, honestly, we contribute to 300 plus open source projects, whether it’s Linux kernel, PyTorch, TensorFlow, Kubernetes, OpenJDK, and a wide range of projects, because it’s a customer obsession that truly gets us there. And that’s what makes open source sustainable as well.

Omkhar Arasaratnam (08:24)
See, I know you’ve been doing this a long time because you mentioned Fry’s and they’ve been defunct for three years.

Arun Gupta (08:30)
(Laughter) I still love that place. It’s funny because in our neighborhood, the Fry’s has been converted into a pickleball court now.

Omkhar Arasaratnam (08:38)
(Laughter) No kidding. We’ll have to play pickleball the next time I see you. 

Arun Gupta (08:42)
That’s right, yeah!

Omkhar Arasaratnam (08:43)
Switching gears slightly, let’s talk a little bit about the work that you do within Linux Foundation as the board chair for both CNCF as well as the OpenSSF. These are big tasks. I’d love to understand what similarities you see between the security community at the OpenSSF and the cloud native community at CNCF.

Arun Gupta (09:04)
A lot of commonality. They are both what we at Intel call BHAGs: Big Hairy Audacious Goals. Both these foundations have those BHAGs, essentially. I mean, if you think about it, CNCF is about how do we make cloud native computing ubiquitous, no matter where you are? And similarly, the OpenSSF, the Open Source Security Foundation, is about how do we secure open source software for the greater public good?

But there is definitely a lot of similarity between the two foundations. They’re both Linux Foundation sub-foundations. They both have a governing board; there are 28 members in CNCF and 23 in OpenSSF, per my count this morning. They both have a technical body: CNCF has the Technical Oversight Committee, and OpenSSF has the Technical Advisory Council. So both have that element. Now, CNCF also has a technical advisory group about security, where they dig into the details of how do you secure cloud native infrastructure. Security is the most boring thing, right? I mean, it works until it doesn’t work, and then everything breaks. So I think that’s a super important element. So you could…

Omkhar Arasaratnam (10:13)
When it’s done well, it’s very boring.

Arun Gupta (10:15)
(Laughter) Right, exactly. (Laughter) So I think it’s very important that security is job number one, even in cloud native computing. You can make it ubiquitous, but if it’s not secure, it’s absolutely useless in that sense. So I think that’s the way they think about it. There is a TAG Security where there is deep focus on how do we make sure that we are making this secure. But so far, that focus has been only on cloud native computing. And I think that’s exactly where OpenSSF shines.

OpenSSF is filling a gap, which is looking at a bigger, broader landscape to identify how do we secure the broader open source software. That’s where tools like OpenSSF Scorecard, SLSA and Sigstore come in. There is no need for CNCF or any other foundation to create those. That’s where OpenSSF is bringing out these tools that plug right into the gaps that CNCF, and any other foundation, is feeling.
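As one concrete example of those shared tools, here’s a hedged sketch of pulling a project’s Scorecard results from the public REST API at api.securityscorecards.dev; the repository below is just an example, and the fields read from the response reflect the API’s published shape.

```python
# Fetch OpenSSF Scorecard results for a GitHub repository.
import json
import urllib.request

def scorecard(owner: str, repo: str) -> dict:
    url = f"https://api.securityscorecards.dev/projects/github.com/{owner}/{repo}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

result = scorecard("ossf", "scorecard")
print("aggregate score:", result["score"])
for check in result["checks"]:
    print(f'{check["name"]}: {check["score"]}')
```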

Within OpenSSF and CNCF, of course, there is a lot of collaboration, but the tools that OpenSSF is creating are available for the broader open source community. So whether you are Apache or Eclipse or any other community, for that matter, those tools are widely available. And let’s be deliberate, let’s be conscious about what kind of interactions can be done to make cloud native computing more secure, so that it fulfills both of our joint agendas. A win-win situation.

And honestly, the way OpenSSF looks at it is, as we are creating tools, we could create the tools in a silo, but if those tools are not implemented or agreed upon by other communities, again, they’re going to be meaningless. So we’re really making sure, as we are creating this OpenSSF Scorecard, that it can be adopted across a wide range of CNCF projects, along with whatever specifications we come up with. We created secure software development guiding principles, like, how do you make sure that your software is built using a secure covenant? Now, we could come up with a covenant, but it’s really about working with CNCF and saying, okay folks, as you are building your project, here are these guiding principles that you should be following.

So I think in that sense, there’s a very strong cohesion between the stuff that is being done by OpenSSF and then implemented by CNCF. And again, the idea is that if there are gaps identified, there is a clear communication channel, which is more important, so that they can give feedback to us. There is of course a public channel, but there is a strong back channel as well, which enables that high-bandwidth communication for the leaders to communicate and share details.

Omkhar Arasaratnam (12:54)
Absolutely, and we have definitely benefited from that back channel, and I think the community has definitely benefited from the cohesion, as you put it, that’s been brought together. One of the reasons that many of us get involved with open source, and a lot of us are passionate about open source, is the fact that it’s a public good. I know you’ve done a lot of work with the UN as well, and I would love to hear your thoughts on the intersection of open source as a public good and how the UN envisages open source helping the globe.

Arun Gupta (13:30)
Yeah, when the United Nations created these Millennium Development Goals (what they used to call MDGs) at the turn of the millennium, smack at the beginning of the century, those goals were, again, BHAGs, you know, big hairy audacious goals. No poverty, no hunger, no crime, minimizing racial injustice, gender equality, beautiful climate policies, all really wonderful, audacious goals.

And as I’ve been involved with the UN for the last year or so, it’s been a really exciting and very humbling experience, because it’s very clear: you have these goals, and part of the way to solve them, of course, is a human element, but a large part of it is a technology element. So last year, I was involved with TED AI, which is a brand new conference, a type of TED conference that was started in San Francisco last year.

So last year we worked with TED and the UN to run a hackathon. The hackathon had about 130 participants from all over the country, who took a shot at how can we solve these UN Sustainable Development Goals using open source technologies, leveraging AI and cloud native technologies, essentially. So that was pretty exciting.

A couple of months ago, we had KubeCon Paris, and that’s where we again had a very tight collaboration with the United Nations and the Office of Technology within the UN. Really, really good discussion. There were folks from the United Nations who came to Cloud Native Hacks, which is basically the hackathon that we did at KubeCon, where they talked about the importance and the relevance of the Sustainable Development Goals. These started as the Millennium Development Goals, but in 2015 they realized it’s not just about the millennium, it’s about the sustainability of humankind. So the name was changed from MDGs to SDGs.

A very beautiful, very humbling effort, and I’m very excited to continue that partnership with the UN going forward. Looking ahead, we are going to KubeCon Salt Lake City, so we’re going to have Cloud Native Hacks over there. I highly, highly encourage bringing that UN hackathon to more and more events, and making an impact on the SDGs, essentially making the world a bit more sustainable.

Omkhar Arasaratnam (15:55)
Those are definitely some big, hairy, audacious goals, but also, I think, goals that are good for humankind. And it’s very encouraging to hear this kind of collaboration. I’ve been doing security for a long time. I’ve been doing security for about 20 years. But I always self-identify as a software engineer first that happens to have been doing security for a very long time. 

With that perspective, I personally find there are a lot of things that security folks, I guess in their intent of being incredibly security-oriented, miss. From your perspective, as a software engineer for a very long time, what are security folks getting wrong?

Arun Gupta (16:42)
Yeah, when I think about a conversation, I always think in terms of empathy: who is my end customer? What do they want? What is the problem that I’m solving for them? That’s super important. Sometimes security folks focus too much on the technology. Take a step back. Have that empathy for the customer. Does the customer understand that language? Are you talking in their language? Is the intent of your dialogue landing the impact on the customer who’s listening to that discussion?

The second problem, funnily enough, is not the technology. Humans are often the weakest link in security. So as security professionals, we sometimes overlook the importance of training and awareness programs for employees, or we underestimate the potential impact of social engineering attacks, how people could just maneuver their way in, particularly given how prevalent open source is, how 90 to 95% of infrastructure relies upon open source. The idea that, for two years, somebody could just social engineer their way into a project and then plant something in the software is pretty dramatic. So I think, how do we understand the social engineering part of it?

And I guess the last part really is the comms part of it. We need to work very closely with other departments — IT, legal, management, developers — making sure the comms are being sent out on a regular basis and the trainings are being done regularly. So focusing on these elements would only make it that much more impactful.

Omkhar Arasaratnam (18:20)
Valuable insight as always, Arun. We’re gonna move into the rapid-fire section now. Okay, spicy or mild food?

Arun Gupta (18:29)
I would say spicy. I’ve always been a spicy person. I like that.

Omkhar Arasaratnam (18:32)
All right, text editors, Vi, VS Code or Emacs?

Arun Gupta (18:36)
Vim, actually.

Omkhar Arasaratnam (18:38)
Vim is the winner! Now this is a highly controversial question. Tabs versus spaces? 

Arun Gupta (18:44)
Oh, yeah, spaces, baby, spaces.

Omkhar Arasaratnam (18:47)
Spaces, all right!

Arun Gupta (18:48)
Yeah, I’m not gonna lose a relationship over it, but spaces it is.

Omkhar Arasaratnam (18:50)
(Laughter) All right, to close us out, Arun: for somebody that’s entering our field today, maybe somebody that just graduated from an undergrad in comp sci, or somebody that’s making a career change to move into our field, what advice would you have for them?

Arun Gupta (19:08)
Yeah, I was talking to a friend’s son actually. You know, this kid is in high school, and up until now he wanted to be a lawyer. And then one day he just comes to the house and says, I want to be a cybersecurity professional. And my eyes immediately lit up. I said, “Oh, that’s fantastic! What do you want to do?” And, like, I had a very interesting conversation with him. And of course I pointed him to all the training and the certifications and the courses that are offered by the OpenSSF.

My general advice is: with ChatGPT, with so many internet resources available, which were not available when I was in college or when I was growing up, there is no lack of knowledge. Have that genuine curiosity, dig into it. Don’t be afraid of AI. Embrace it, use tools like ChatGPT to bounce your ideas off, build that prompt engineering skill.

What do you want to really do? Dig into the why of it. Look under the hood, see what’s going on and what could you do? And most importantly, if you find there is a place where you can scratch your itch, do it, contribute. And the more you contribute, the more you collaborate, the more you get your name out there, the more you build the credibility. And, remember, it’s a marathon, it’s not a sprint. So be in it for the long haul.

Omkhar Arasaratnam (20:35)
That’s great advice, Arun. What’s your call to action for our listeners? What would you have them do following this episode?

Arun Gupta (20:40)
I would really encourage them, and again, this seems self-serving, but I would really encourage them to take a look at openssf.org. Look at all the wonderful resources, training, education and certifications that we provide over there. Come to an event. Go to your local meetup. And the last one, which is very important, particularly for people who are graduating out of college: don’t have that imposter syndrome. You know, I was exactly where you are right now 25, 30 years ago, and all it takes is perseverance, grit, and resilience. So just have that in you and roll with it.

Omkhar Arasaratnam (21:22)
Thank you so much, Arun. It’s been a pleasure having you and hope to speak to you again very soon.

Arun Gupta (21:26)
Thank you so much.

Announcer (22:27)
Once again, for more information about OpenSSF’s activities at the OSPOs for Good symposium and the follow-up event, What’s Next for Open Source?, check out the links in the episode description of this podcast. And be sure to catch every episode of What’s in the SOSS? by subscribing to the podcast on Spotify, Apple, Amazon or wherever you get your podcasts. And to learn more about the OpenSSF community, visit openssf.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #7 – Stacklok’s Adolfo García Veytia Digs Into SBOMs and VEX


Summary

The world of software bills of materials (SBOMs) is both complex and fascinating. And few people know the SBOM community better than Adolfo García Veytia — aka Puerco — Staff Software Engineer at Stacklok. Puerco is also a Technical Lead with Kubernetes SIG Release, specializing in supply chain improvements to the software that drives the automation behind the release process.

He’s also one of the original authors of OpenVEX, an OpenSSF project working towards a minimal implementation of VEX that can be easily embedded and attested. Puerco is also a contributor to the SPDX project and a maintainer of several SBOM OSS tools. He’s passionate about writing software with friends, helping new contributors and amplifying the Latinx presence in the cloud-native community.

Conversation Highlights

  • 01:04 – Puerco shares his background
  • 02:21 – What SBOMs are and why they’re so important
  • 06:42 – An overview of standards in the SBOM space
  • 09:58 – Puerco details his work on VEX projects
  • 14:05 – Puerco enters the rapid-fire portion of the interview
  • 15:06 – Advice Puerco would offer aspiring open source or security professionals
  • 16:12 – Puerco’s call to action for listeners

Transcript

Adolfo García Veytia soundbite (00:01)
So imagine if you were looking at the video and you see the cook not washing their hands when they cook. That would suck, right? We see a transparent supply chain kind of in the same spirit. The more information you have about your supper, the better the decisions you may be able to make on whether or not to consume it.

CRob (00:18)
Hello everyone, I’m CRob. I do security stuff on the internet and I’m also a community leader at the OpenSSF. And one of the cool things I get to do as part of the OpenSSF is I get to co-host the What’s in the SOSS? podcast. With us this week, we have my friend Adolfo, who goes by the handle Puerco out there on the internet. Puerco, bienvenido al show.

Adolfo García Veytia (00:41)
Gracias, CRob. It’s super exciting to be in a podcast, but also in a podcast with one of my favorite people in the world. So thank you.

CRob (00:50)
I know, right? For those uninitiated in the audience who might not be aware of your origin story, could you maybe explain a little bit of what you do with open source and upstream, and maybe some of the projects you’ve worked on?

Adolfo García Veytia (01:04)
Yeah, for sure. I’ve been working on open source projects for probably over 10 years or so. I started contributing back in the era of Perl, you know, filing bug reports and writing documentation for Perl modules. Then I did some contributions for PHP. And then when I really started doing more contributions was when I joined the Kubernetes project.

I started going up the ladder and became one of the technical leads for Kubernetes SIG Release, where I specialized in the release process and worked on the security features that we have in our releases. And then from there, I started creating lots of different tools that we needed to secure Kubernetes itself, some of which became projects of their own. And now I’m working on some of the same stuff here in the OpenSSF.

CRob (01:59)
Nice, excellent. So I have a very important topic that you and I get to talk about all the time, but the audience might not be as familiar with. Let’s talk a little bit about software bills of materials, SBOM. Could you maybe describe why SBOMs are important for both developers and also downstream consumers?

Adolfo García Veytia (02:21)
Yeah, for sure. So SBOM, the software bill of materials, to give a description of what it is in a short sentence, is just a list of the components that make up your software. That’s the most basic definition of it. Some people may be familiar with SBOM, or if not with what it is, at least with the term, because of some legal requirements and regulation that have come up.

But the fact is that SBOM is kind of the base of the transparent supply chain. So if you’ve seen the news in the past couple of years, there’s a big push to make our software supply chain more secure. And that means: can I make good decisions about the risk I’m taking when I’m ingesting third-party software? And to me and to some other SBOM enthusiasts, SBOM is kind of the base of that transparent supply chain. So the way we’re trying to make the supply chain more secure is combating it with information.

So imagine if you went to a restaurant and, before you tried your dinner, you could know exactly the ingredients that went into it, who cooked it. Imagine if you could see a video of them cooking it. That would give you the ultimate assurance about your plate, right? Your dinner plate. So imagine if you were looking at the video and you see the cook not washing their hands when they cook. That would suck, right? So we see a transparent supply chain kind of in the same spirit. The more information you have about your software, the better the decisions you may be able to make on whether or not to consume it.

And this one is kind of the first layer of that transparent supply chain. As a developer, when you have an SBOM for your project, you kind of have a key to go to a lot of different services that are starting to come up and ask for information about your dependencies. So for example, I think the most basic use case is you go to a security scanner and you present an SBOM and say, okay, scanner, give me all of the vulnerabilities that you know are present in this SBOM. And based on the information in your SBOM, the scanner replies back. So that’s kind of the basic use case I see for it.
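Here’s a minimal sketch of that exchange: feeding the package URLs from a CycloneDX SBOM into OSV.dev’s public batch query API. The SBOM filename is hypothetical, and the request and response shapes follow OSV’s documented schema.

```python
# "Okay, scanner, what do you know about the components in this SBOM?"
import json
import urllib.request

with open("my-app.cdx.json") as f:          # hypothetical CycloneDX SBOM
    sbom = json.load(f)

purls = [c["purl"] for c in sbom.get("components", []) if "purl" in c]
batch = {"queries": [{"package": {"purl": p}} for p in purls]}

req = urllib.request.Request(
    "https://api.osv.dev/v1/querybatch",
    data=json.dumps(batch).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    results = json.load(resp)["results"]

for purl, result in zip(purls, results):
    vulns = result.get("vulns") or []
    if vulns:
        print(purl, "->", [v["id"] for v in vulns])
```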

CRob (04:38)
That’s pretty cool. And am I correct in remembering that there are different types of SBOM? SBOM isn’t just one monolithic thing. You might issue or create an SBOM at different points within the development process, right?

Adolfo García Veytia (04:50)
Yeah, exactly. So as the software lifecycle moves forward, from idea to software repository or code base, to builds and deployment, more information becomes available. And some of the information that goes into your SBOM may not be possible to know at the earlier stages of the software lifecycle.

So to give an example, if your project requires OpenSSL and it’s dynamically linked, you will not know the effective version until you deploy that binary and it links against your system libraries. So based on that idea, that information becomes available as the software moves forward, different kinds of SBOMs have been developed. So there’s the design SBOM, which captures more or less the plan of what you want to build.

There’s the source SBOM, where a generator looks at your code base, extracts the information it can, and gives it back to you. And then there’s the build SBOM, where once you run the build, your compiler or interpreter may make the decision on which versions of the dependencies it’s gonna pull down, and even the dependencies of the dependencies, because those may change as your dependencies get new releases. And that gets captured.

And then, I think the next one is the analyzed SBOM, where basically you take something like an SCA scanner, point it at your artifact, and it makes some deductions by looking at it from the outside. That gives you another kind of SBOM. And finally, there’s the deployed SBOM, which looks at your software once it’s installed and running in a system.

CRob (06:32)
Wow. It’s a lot to keep track of. What types of tools or what are some of the standards that people might bump into in this SBOM space?

Adolfo García Veytia (06:42)
Yeah, so there are two main SBOM standards. One is SPDX, from the Linux Foundation, and the other one is CycloneDX, which is hosted at OWASP and currently undergoing standardization as well. Both standards are more than enough to capture that list of materials. Both share more or less the same abstractions regarding the list of components. And both have also grown to capture much more than just a list of dependencies: they can capture things like build provenance, and information about machine learning and AI workloads, among others.
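For readers who haven’t seen one, here’s a sketch of a minimal CycloneDX document, written as the Python dict you’d serialize to JSON. Real documents carry much more (a serial number, metadata, hashes, licenses); the single component shown is just an example.

```python
# The smallest useful shape of a CycloneDX SBOM: format, version, components.
import json

minimal_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "group": "org.apache.logging.log4j",
            "name": "log4j-core",
            "version": "2.17.1",
            "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1",
        }
    ],
}

print(json.dumps(minimal_sbom, indent=2))
```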

CRob (07:21)
And I know that there are a couple of tools that you’re personally involved with. Can you talk about your project Protobom, and then bomctl?

Adolfo García Veytia (07:30)
Protobom was born out of a contract, I think the Radware contract from DHS and CISA. They put out a call to design a way to exchange information between those formats, and the company I was working at back at the time, we looked at it and decided it was a good opportunity to create one abstraction that can handle any SBOM data, so that you could basically work with any kind of present or future SBOM format without having to care about the implementations. There are a bunch of reasons why; I could happily go on, but it’s probably too boring for people who aren’t SBOM nerds like me.

CRob (08:19)
No, I think that’s really cool that such a thing exists. Pretty awesome.

Adolfo García Veytia (08:23)
If you think about SBOM tooling as a stack, at the very bottom you have the very strong foundations of both formats, CycloneDX and SPDX, and then Protobom is kind of the next layer in the stack. So that gives you a universal I/O layer to write and read between the two formats in your application, which sits on top of Protobom.

And on top of Protobom, we’ve seen a number of tools starting to use it to interact with the formats, but also to work with SBOM data. One of those is bomctl. So, full disclosure, I’m not yet part of the bomctl project; I work closely with them because it’s based on Protobom. And the idea of bomctl is that it will be a CLI tool to do the most basic operations that most people need to do with SBOMs: things like updating data in your SBOMs, mixing them, changing formats. Those basic operations that most people need to do when they get an SBOM, process an SBOM or share an SBOM are going to be handled by bomctl.

The aim is that bomctl will provide that great user experience in the CLI, while Protobom houses the required functionality to perform those operations.
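
As a rough sketch of what that layering looks like from an application's point of view, the snippet below reads an SBOM in either format and re-serializes it in the other, with the application sitting on top of Protobom as described. The package paths and function names are assumptions based on the protobom Go module and may differ between releases, so treat this as pseudocode against that API rather than a definitive usage example.

```go
package main

import (
	"os"

	"github.com/protobom/protobom/pkg/formats" // assumed package paths
	"github.com/protobom/protobom/pkg/reader"
	"github.com/protobom/protobom/pkg/writer"
)

func main() {
	// Parse an SBOM; the reader detects whether it is SPDX or CycloneDX
	// and returns a single format-neutral document.
	doc, err := reader.New().ParseFile("sbom.spdx.json")
	if err != nil {
		panic(err)
	}

	// Serialize the same document back out as CycloneDX 1.5 JSON.
	w := writer.New(writer.WithFormat(formats.CDX15JSON))
	if err := w.WriteStream(doc, os.Stdout); err != nil {
		panic(err)
	}
}
```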

CRob (09:43)
So let's talk about another SBOM-adjacent effort that you and I get to collaborate on together: VEX, the Vulnerability Exploitability eXchange. Could you talk about what VEX is and how it plugs into or complements an SBOM?

Adolfo García Veytia (09:58)
Yeah, for sure. The way I see VEX helping the overall situation of the secure supply chain is that its main goal is to reduce noise. Part of the work for anybody trying to assess the risk in their supply chain involves triaging vulnerabilities. And when you have a super transparent supply chain, as enabled by SBOM and other technologies, you start to get more information.

And with that information, you get things like false positives in scanners. When your SBOM starts to capture things like your build environment, your build tooling and the dependencies in your container image, you start getting more and more components, which leads to more and more false positives. So to combat this, the idea of VEX came up in the SBOM circles organized mainly by CISA. VEX, the Vulnerability Exploitability Exchange, is a way for software producers (I hate the word producers; maybe software authors or maintainers) to communicate the impact that vulnerabilities in their components have on their software.

So to give you an idea: if I share a container image and it has some old operating system package that I'm not using and that cannot be triggered in any way, that vulnerability may show up in my users' scanners when they scan my container image, but they may not be affected. So VEX is a means for me, as a software publisher, to create a small bit of information that gets communicated to my consumers, ideally to their scanners, so that those warnings can be, if not turned off, given more context. That helps them make better decisions and, especially, triage those vulnerabilities in a more efficient way.
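
As a concrete illustration of that "small bit of information," here is a minimal OpenVEX-style document for the container-image scenario described above. The IDs, CVE number and product identifier are hypothetical, and the field names follow the OpenVEX spec as best understood here.

```json
{
  "@context": "https://openvex.dev/ns/v0.2.0",
  "@id": "https://example.com/vex/2024-0001",
  "author": "Example Image Maintainers",
  "timestamp": "2024-06-01T12:00:00Z",
  "version": 1,
  "statements": [
    {
      "vulnerability": { "name": "CVE-2024-12345" },
      "products": [ { "@id": "pkg:oci/example-image@sha256:abc123" } ],
      "status": "not_affected",
      "justification": "vulnerable_code_not_in_execute_path"
    }
  ]
}
```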

CRob (12:00)
And when folks are issuing VEX statements, there's a couple of different types, a couple of different states that a statement can represent. What are those?

Adolfo García Veytia (12:10)
Well, we tend to think about VEX as a one-shot message that turns off the lights in my security scanner, so to speak. But in reality, VEX is designed to be a communications channel to inform downstream consumers of the whole life cycle of the assessment of a vulnerability in my project. So when a new vulnerability gets discovered or reported in one of my dependencies, VEX gives me different statuses I can use to inform them about the evolution of the assessment. You start by issuing a VEX statement that says the vulnerability is under investigation, letting them know that you're aware of it and you're looking at it, so it's not getting ignored.

Then, once you have an assessment, you can send another message telling them they're affected. So if it pops up in their scanner, it's a true positive; but more importantly, you can send through the VEX channel some extra information about how to mitigate, or other details. Or you can let them know that they're not affected, and then you can inform them why not, and that message can potentially turn off some lights or warnings in scanners and alerts. And the last one is fixed. So if I put out a new release but that new release is showing up as vulnerable, you can send a fixed message to let them know that this new release is not affected.
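
Those are the four statuses the VEX work defines: under_investigation, affected, not_affected and fixed. A sketch of how statements about the same vulnerability might evolve over time, abbreviated to the status-bearing fields and with hypothetical identifiers:

```
# First statement, right after the report comes in:
{ "vulnerability": { "name": "CVE-2024-12345" },
  "status": "under_investigation" }

# A later statement, once the patched release ships:
{ "vulnerability": { "name": "CVE-2024-12345" },
  "products": [ { "@id": "pkg:golang/example.com/mylib@1.3.0" } ],
  "status": "fixed" }
```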

CRob (13:35)
It sounds like a really useful kind of emerging tool. I’m looking forward to watching this develop.

Adolfo García Veytia (13:42)
Yeah, I mean, we're excited about how this is evolving, because it's a really simple communications channel.

CRob (13:50)
Excellent. Well, let’s move on to the rapid-fire part of our questions. I’m gonna have a couple questions, and generally they’re binary, but if you want to add a little embellishment, please do. First question, spicy or mild food?

Adolfo García Veytia (14:05)
Oh, spicy, of course. I’m Mexican, what else?

CRob (14:12)
(Laughter) Well played, sir. Next question, VI or Emacs?

Adolfo García Veytia (14:18)
VI but not by choice, just by default on my distro.

CRob (14:23)
Do you have an alternative, better alternative?

Adolfo García Veytia (14:25)
Yes, I use JOE.

CRob (14:27)
I haven’t heard
of that one. I’ll have to look that up. Very nice. And our last question, very controversial out there in the ecosystem, tabs or spaces?

Adolfo García Veytia (14:37)
So my thinking is tabs, but I've been finding out that spaces play better with others. I like tabs because you get visual control of the indentation, but almost everywhere it doesn't work as expected. So I end up defaulting to spaces.

CRob (14:57)
Excellent, excellent. So now, as we wind down: do you have any advice for someone, a young developer or someone getting into open source or cybersecurity? Any advice for these newcomers to our field?

Adolfo García Veytia (15:06)
Oh yeah. So, the two most important pieces of advice that I can give: one, do not be afraid to take on projects or to ask questions; most importantly, ask your questions. You'll find that most open source nowadays is very friendly. And the other is: show up. Most of those communities are built by people who recognize each other. Even just showing up, showing your face, hearing about the problems, giving simple or complex opinions, everything is super valid. Sometimes just listening to others rant is super valuable. And that's how you find yourself super quickly on a track to become a maintainer of some of the most important projects out there. Yeah.

CRob (15:59)
That’s awesome. Thank you. That’s good advice for newbies. And final question here. Do you have some kind of call to action? Are you looking to inspire our listeners to maybe take up some causes or help out somewhere?

Adolfo García Veytia (16:12)
Yeah, for sure. I probably have some calls to action for both projects. For Protobom, I think we're looking for folks who maintain SBOM tools. Right now, the strongest implementation is in Go. So if you have an SBOM tool in Go, or you're planning to start a new SBOM project in Go, come talk to us or look at our repos at github.com/protobom.

We hope that you'll find the project very compelling and helpful for your new or existing project. So let us know if something is missing. Also, if you're familiar with SBOMs and work in another language, we would like to see more implementations of Protobom in other languages. That's one.

And for OpenVEX: help us bootstrap the ecosystem. We're trying to spread little VEX feeds wherever we can, so that when you build artifacts, we can start recognizing those VEX feeds and issuing more accurate vulnerability scans. So if you want to help out, let us know and we can help you kick off your VEX stream.

CRob (17:29)
Excellent. Well, thank you so much for joining us, Adolfo. I really appreciate your collaboration and your leadership in the upstream ecosystem. Thank you for joining What’s in the SOSS? today.

Adolfo García Veytia (17:39)
No, thank you for inviting me, CRob. I’m always happy to chat with you.

CRob (17:43)
Excellent.

Announcer (17:44)
Thank you for listening to What's in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We'll talk to you next time on What's in the SOSS?

What’s in the SOSS? Podcast #6 – A Man Called CRob: Introducing the Newest Co-host of What’s in the SOSS?

By Podcast

Summary

Christopher Robinson (aka CRob) is the Director of Security Communications at Intel Product Assurance and Security. He also serves as the Open SSF’s Technical Advisory Committee (TAC) Chair. And soon, CRob will step into another role: co-host of What’s in the SOSS? With 25 years of enterprise-class engineering, architectural, operational and leadership experience, Chris has worked at several Fortune 500 companies with experience in the financial, medical, legal, and manufacturing verticals. He also spent six years helping lead the Red Hat Product Security team as their Program Architect.

Conversation Highlights

  • 00:57 – CRob’s day-to-day activities and his affiliation with the OpenSSF
  • 03:15 – The insight CRob will bring to the podcast as co-host
  • 05:46 – What developers writing “post-bang” code should be considering
  • 08:40 – Lessons open source could learn from corporate and vice versa
  • 12:17 – CRob explores the evolution of open source
  • 14:22 – CRob answers Omkhar's rapid fire questions
  • 15:57 – CRob’s advice to people entering the cybersecurity field
  • 18:18 – CRob’s call to action for listeners: give back

Transcript

CRob soundbite: (00:01)
First and foremost, open source is agile. And that’s something that corporations need to understand. And that it moves at a totally different velocity and isn’t necessarily beholden to month-end change freezes or a year-end close when you’re trying to balance the books. So open source is always moving.

Omkhar Arasaratnam (00:19)
Welcome to What’s in the SOSS? I’m your host Omkhar Arasaratnam and the general manager of the Open Source Security Foundation, the OpenSSF. Joining us today is a good friend of mine. He sometimes goes by Christopher Robinson, he sometimes goes by the security Lorax, but most often he goes by CRob. CRob, welcome to the show!

CRob (00:41)
Hey, thanks for having me Omkhar.

Omkhar Arasaratnam (00:43)
It is a pleasure.

CRob (00:46)
The pleasure’s mine, sir.

Omkhar Arasaratnam (00:48)
So, other than being a security Lorax, can you tell us your title and what you do in your day job as well as the work you do with the OpenSSF?

CRob (00:57)
So I like to say that I really don't do anything; that's my claim to fame. I just write, and I talk on podcasts like this. But my title is Director of Security Communications for Intel. So I help make the internet a little less sad about vulnerabilities that impact our portfolio, I do crisis communications, and I work with our PSIRT and whatnot. And then the other half of my time is spent on upstream community work, like our collaboration at the OpenSSF.

Omkhar Arasaratnam (01:28)
For our listeners that may not have attended a TAC meeting, things of that nature, do you want to talk to what your role is as TAC Chair, what the TAC does and how the TAC works in the OpenSSF to help make open source software a little more secure?

CRob (01:45)
Right. So the Technical Advisory Council, or TAC, is a technical body. We are voted on every year, and some of our seats are appointed by the governing board. But the duty of the TAC is to take a look at the technical initiatives of the foundation. So things like software projects or work on specifications and standards or guides. And we help steer that, making sure things are aligning with the strategic direction of the foundation and they all support the general overall kind of ecosystem uplifting of open source security for everybody.

Omkhar Arasaratnam (02:21)
Amazing, and thank you for all the work that you do. So much of the amazing projects that we have under the OpenSSF thrive under the tutelage of the TAC, and thank you for that. Now as the saying goes this ain’t your first rodeo. You’ve been doing security for a while. 

CRob (02:40)
Are you saying I’m old?

Omkhar Arasaratnam (02:42)
It was implied. (Laughter) And for our audience, the many years of wisdom and experience that CRob brings to the table is one of the reasons he's our new co-host on What's in the SOSS? We all believe he's going to bring a lot of that experience to the interviews with some of our guests. CRob, I'm going to put you on the spot, my friend: some thoughts as the newest minted co-host of What's in the SOSS? What do we have in store?

CRob (03:15)
I’m really excited. I’ve got some, a whole list of folks lined up, so we’re going to talk about things like why are people still installing known vulnerable packages? We’re going to talk about large corporations adopting a lot of the projects of the OpenSSF and kind of understanding how an actual person could do some of that. We’re going to talk about coordinated vulnerability disclosure. We’re going to talk about the Linux distros and how kind of their role in helping support security of the ecosystem. So again, I think we have a lot of amazing content, some amazing people that are contributors to the community that we’re going to talk to.

Omkhar Arasaratnam (03:49)
I’m very excited. I can’t wait to hear all the interviews that you have lined up and, and learn more about how people are adopting the work that we’ve been working so hard on. In your career in which you’ve done many things, you spent a lot of time on what I would call the post-bang side. So vulnerability comes out or an exploit occurs in the field. Can you help orient our audience to some of the work that you’ve done in that arena and some of the unique experiences that you’ve had?

CRob (04:22)
Yeah, absolutely. I've spent almost seven years of my career working as part of Red Hat product security, where, under Mark Cox, I helped run the program with him and then eventually partnered with Vincent Danen, who's now the current leader there. So I have a lot of experience on both pre- and post-bang, but my particular skillset is the post-bang piece: trying to help make sure that when that public disclosure goes out and the world is aware of something going on, people have adequate information and access to the fixes so that they can treat the problem that's being disclosed.

And it’s important that it’s sometimes overlooked as when the responders or developers are in the heat of the moment trying to fix the problem. They’re not thinking about the downstream consequences or what happens when this thing goes public. You don’t want to be on the front page of The Register or The New York Times. So it’s trying to think about things like that and making sure that people can defend themselves.

Omkhar Arasaratnam (05:22)
You have a very particular set of skills, Mr. CRob. What should our developers be thinking of as they’re writing code to make the role of that person that has to deal with the after the fact, the post-bang, a little bit easier? What can they be doing in development so that it’s a little less, the incident is a little less exciting?

CRob (05:46)
I have a couple pieces of advice. First and foremost, all projects, whether it's a single maintainer or a team, need to sit down and think about how, when you get a vulnerability report, you're going to handle it and ultimately how you're going to disclose it. In the industry, we call that a vulnerability disclosure program or a security policy. So it's important to have that. It helps set expectations for when a finder comes to you making a demand that something get fixed, and it also helps your downstream understand exactly what you're gonna do. Are you going to fix something? How quickly are you gonna fix it? Having that documentation upfront and that agreement helps set the expectation so that when things go public, ideally it goes smoothly for your downstream.
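
As one sketch of what such a policy can look like in practice, many projects publish it as a SECURITY.md file in the repository root. The contact address, timelines and versions below are placeholders to adapt, not recommendations.

```markdown
# Security Policy

## Reporting a Vulnerability
Please report suspected vulnerabilities privately to security@example.org
(or via the repository's private vulnerability reporting feature).
Please do not open a public issue for security reports.

## What to Expect
- Acknowledgement of your report within 3 business days (placeholder).
- A fix or mitigation plan, and a coordinated disclosure date agreed
  with you before anything is published.

## Supported Versions
| Version | Supported           |
| ------- | ------------------- |
| 2.x     | yes                 |
| 1.x     | security fixes only |
```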

Next, you need to understand what your project depends on. Very few projects are self-contained and use only code written within that project; you have dependencies, libraries and other components that you're calling out to. Understanding what you depend on helps you react when one of those components has a vulnerability, and it ideally keeps your downstream from asking, are you affected? Are you affected? So understand your dependencies, and consider writing an SBOM for your team's and your project's use, but also consider sharing it with your downstreams, because that's valuable information to them and it helps with their vulnerability remediation.

Next up, I would say: find a big brother or a big sister in the community. Open source has been around for three decades now, since Linus released the kernel. Find someone that you look up to, that you respect, and ask them for advice. Line up people you can talk to when you're in the middle of a crisis, people who might have more security expertise than you, so that you can ask questions and get help if you need it. Having that community big brother or big sister is invaluable for advice, guidance and suggestions on how you might be able to do something.

And then finally, I would encourage all developers to embrace the mentality of Kaizen, that continuous improvement. What you do today isn't necessarily going to make you successful tomorrow. So with every release, and with every vulnerability that you fix, think about how you can adjust your practices or tweak your tools so that maybe you can avoid that situation going forward.

Omkhar Arasaratnam (08:08)
That’s some great advice and certainly speaks from those — how many decades of wisdom was it that we were up to — so, sorry, I’ve said too much. With your experience, I’m wondering between the time that you’ve spent on open source and in corporate, what lessons can be learned from either side? Like what should corporate security be thinking of that open source does incredibly well? And what does open source need to think about that maybe corporate security has been doing a little bit better of a job?

CRob (08:40)
That’s a really interesting question. I think, and I can address both points of that. First off, for those of you that aren’t aware, at its core, open source software is about sharing. It’s about openness and transparency and meritocracy, where the best ideas win. So, first and foremost, open source is agile. And that’s something that corporations need to understand, that it moves at a totally different velocity and isn’t necessarily beholden to month-end change freezes or year-end close when you’re trying to balance the books. So open source is always moving. So just being aware of that and that agility not only is in the process, it’s also in their thinking where they are able to ingest new diverse ideas, different perspectives. 

And that’s something that corporations potentially can learn from kind of seeing this attitude and embracing the fact that, you know, somebody, a grizzled veteran of a quarter century in development and security might not have all the answers. And somebody, a new junior person might have just as valid an insight into the problem as everybody else. Next up, where developers are incredibly creative people, but kind of like me, they tend to be a little lazy. So wherever possible, they love automation. And that helps them become more efficient, helps them be that creative source of inspiration and ideas. So open source is all about automating. And that’s something I would really recommend. 

Back in the several corporations I was at, we always looked for opportunities to automate a process, whether that's part of your CI/CD pipeline, part of deploying servers or configuring Kubernetes clusters, whatever. If there's a repetitive, low-value task, make a robot do it for you; write automation to make it go away. And then again, open source very much avoids monoculture and monolithic thinking. So you're going to have really great diversity of thought and diversity of background, and I would again encourage corporations to embrace that.

Now flipping this on its head: corporations are generally (hopefully) good at making money, at directing resources, and at managing time and priorities. That's where open source falls down a little bit; we're not quite as organized and disciplined as we could be. The first and foremost thing (and you'll see this, it varies widely by project and community) is that documentation and process are key, because they not only help you draw in new members, who can then understand what your processes are and how to interact with them.

They also help your downstream understand what you're doing and how you're doing it. The same goes for procedures like tabletop exercises; we went through this effort in the vulnerability disclosures working group at the foundation, where we've been partnering to help create these resources so that an upstream project can understand the value of doing a tabletop exercise, or of going through a threat modeling exercise to understand how their software can be broken.

So these processes, that discipline and rigor (not that I'm saying you would want to be totally inflexible) are things open source definitely can adopt to help improve the quality of life of the maintainers and the downstream.

Omkhar Arasaratnam (11:52)
Make the incidents boring.

CRob (11:54)
Right, exactly. It should be just a non-event. It’s just another patch.

Omkhar Arasaratnam (11:58)
That’s definitely some interesting lessons that can be learned from either side. Now, in your experience in using open source, I’m sure over the years you’ve seen it evolve. How have you seen it from your earliest days engaged with open source to where we are now? Have you seen that evolution?

CRob (12:17)
So when I started off, I ran a corporate engineering and operations team, and we were responsible first for web security at a large financial institution. Then I moved over to the distributed server side, where I ran the Unix and Linux team. Back then, when I first got into this space, the idea of large companies using open source was almost exclusively a cost play. We had feature parity (Unix and Linux were generally the same) and it was pure cost savings, where you could deploy a server for one quarter the cost of a traditional Unix device.

And, you know, there wasn't a lot of innovation, but getting those ideas from Linux and open source into these large financial institutions helped them understand and harness some of the innovation and velocity that happens upstream. Gradually, over the last decade or so, I've seen companies go from "we're just trying to save a couple bucks" to "hey, there are some amazing innovations here; I want to integrate these features into my own development practices or my own portfolio so that I can take advantage of some of these cutting-edge things."

And that’s like how things like Kubernetes and containers all came about because that was a way of trying to help improve operational efficiency. So now again, it’s kind of transitioned to where we are now where people are participating upstream and helping set some of that innovation being very engaged upstream to help steer things to have good outcomes downstream. It’s a nice evolution and open source software development. It used to be we were all kind of a Waterfall waterfall style of development model or it was very slow and regimented.

And now we’ve changed to agile methodologies and that’s the de facto way how software is developed today.

Omkhar Arasaratnam (14:14)
It’s certainly amazing to see how adoption has changed over the years, and I can’t wait to see what’s coming next. 

CRob (14:21)
Yeah, me too.

Omkhar Arasaratnam (14:22)
So now, CRob is when we transition to rapid fire. So here’s how rapid fire works. Can I ask you a few questions? 

CRob (14:30)
You may.

Omkhar Arasaratnam (14:31)
And in asking those questions, I’ll provide a set of pre-scripted answers or the right answer may be, no, Omkhar, you got that wrong. Here’s what I think. Are you ready to go?

CRob (14:43)
I am, sir.

Omkhar Arasaratnam (14:44)
Alright, first one. Spicy or mild food?

CRob (14:48)
Spicy, no other answer is valid.

Omkhar Arasaratnam (14:50)
Totally agree. Spicy or nothing. And I recall from a few meals that we’ve shared and a few drinks that we’ve shared, you know how to hold your spice, man. You do all right. All right, this next one’s a bit trickier. Favorite text editor: Vim, VS Code, or Emacs?

CRob (15:08)
VI, colon q, baby!

Omkhar Arasaratnam (15:15)
There you go! Exclamation mark. Quit without saving. Now the last one is incredibly controversial. Tabs or spaces?

CRob (15:21)
(Exasperated sigh) I like tabs, but I understand that spaces have their place and are useful. So again, I’m not a purist. I would lean towards tabs, but you’ll see some occasional spaces. 

Omkhar Arasaratnam (15:33)
That’s an incredibly diplomatic answer. 

CRob (15:36)
I do my best. 

Omkhar Arasaratnam (15:38)
In wrapping things up, CRob, what advice would you have for somebody entering our field today? Be they somebody graduating from college versus maybe somebody who’s made a career change and decided for whatever reason to get in a cyber. I mean, that’s, I don’t know why they’d make that decision, but what advice would you give somebody entering our field?

CRob (15:57)
I have so much advice. It’s hard to choose just one, but I will say this. In my experiences in the last (intentional murmur) century of doing cybersecurity and software development–

Omkhar Arasaratnam (16:08)
What was that, three centuries?

CRob (16:10)
Pretty close. I think that there are so many interesting things going on, so many different and unique aspects of the trade, and it's very easy to get overwhelmed. It's like you're a kid in the candy shop and you want a little of everything. And then when you do that, you get sick and you vomit.

So my advice for new folks is find things that you’re interested in, that you’re passionate about, that excite you, and go find people that have that similar interest. You’ll do better work when it’s focused on things that spark joy for you. And then secondly, once you find your people, so to speak, you find your community, whatever the type of specific nuance you’re interested in, whether it’s GRC or application security or risk management, talk to people.

Find someone that is a mentor or a role model. Reach out to them and engage with them. It’s been my experience that people within both open source and cybersecurity, they like to talk and they like to talk about themselves and share their experiences. And it might be intimidating. You might think of someone as an air quotes rock star that they’re superhuman, but they’re not. They’re just people just like you. And they like to share their experiences. 

And I have personally benefited from having a long trail of mentors in this space where people have been incredibly generous with their time and helped teach me concepts that I was unfamiliar with, or I was able to, you know, in reverse, I helped mentor a lot of people as well. I also work with an organization called ISC2, and I’m a certified cybersecurity person. And part of our code of ethics, part of the idea behind those certifications is improving the trade. So, my contribution to that is helping groom the next generation of folks. 

So, I feel obliged to do this, and many people in my space do. So again: find people that you look up to, talk to them, have them share their experiences, and you'll learn more over coffee or an adult beverage than you ever will from a book or a classroom.

Omkhar Arasaratnam (18:08)
Some sage wisdom and for what it’s worth, CRob, I consider you to be one of our rock stars. 

CRob (18:13)
Aww, thank you.

Omkhar Arasaratnam (18:14)
Last question. What’s your call to action for our listeners? 

CRob (18:18)
I talk a lot at cybersecurity conferences, and every time it's generally about open source security. In every presentation I have a similar slide where I say: the chances are very good, 90 to 98%, that you're using open source software. Why aren't you contributing back? And contribution isn't necessarily development or money.

But show up to those communities, give those communities feedback, help them out with some docs, take some notes in a meeting, just participate in the conversation. Be that sounding board as a developer’s testing out new features. If you have the skill set and you are a development person, contribute some patches, fix some bugs, look at their backlog. 

And that is a huge help to a project: somebody coming in and either providing feedback or helping work on the backlog. That's the best way to give back for all the value you get from this free software. These are just simple, easy ways to give back.

Omkhar Arasaratnam (19:18)
Patches are welcome.

CRob (19:19)
Patches are always welcome.

Omkhar Arasaratnam (19:21)
CRob, thank you for joining us today on What's in the SOSS? It's my pleasure that you're going to be joining us as co-host. So I'm really looking forward to hearing your episodes as well. And yeah, thanks for all the support and all that you do for the community.

CRob (19:34)
Very welcome. I’m looking forward to it. Thank you.

Announcer (19:37)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #5 – OpenAI’s Matt Knight and Exploring the Intersection of AI and Open Source Security

By Podcast

Summary

Matt Knight is Head of Security at OpenAI, where he builds IT, privacy and security programs. His teams also collaborate on security research with teams across OpenAI and with the broader security research community. Their goal is to explore the frontier of AI, understand its impacts and maximize its benefits, especially in the cybersecurity domain.

Conversation Highlights

  • 00:40 – Matt’s duties at OpenAI
  • 01:52 – Matt’s accidental journey into cybersecurity
  • 05:18 – The intersection of AI and open source
  • 06:45 – Matt’s thoughts on how AI can help security professionals
  • 08:53 – Details on the AI Cyber Challenge (AIxCC)
  • 10:53 – Matt answers Omkhar’s rapid-fire questions
  • 12:29 – Advice Matt would give to aspiring security professionals
  • 13:00 – Matt’s call-to-cation for listeners

Transcript

Matt Knight soundbite (00:01)
AI has the potential to help cybersecurity practitioners where they’re constrained. That’s important because cybersecurity engineers face a lot of constraints. Every security team is constrained by capabilities, is always up against pressure to be faster and is always up against pressure to access greater scale. 

Omkhar Arasaratnam (00:18)
Welcome to What’s in the SOSS? I am your host Omkhar Arasaratnam. I am also the general manager of the OpenSSF. Today we have a good friend of mine Matt Knight Matt. Why don’t you tell us what you do?

Matt Knight (00:31)
Hey, my name is Matt Knight. I’m the head of security at OpenAI.

Omkhar Arasaratnam (00:34)
I feel like you’re burying the lead here. What does the head of security at OpenAI do? I mean, it doesn’t sound like a boring job.

Matt Knight (00:40)
Yeah, it keeps me on my toes. So I joined OpenAI back in 2020 and have been building the security, privacy and IT programs since then. Before OpenAI, I spent most of my career protecting companies and institutions that had comparable high-value research and technology. I also co-founded a company called Agitator that focused on security research. And if you go far enough back, I started my career as an electrical engineer before getting into security.

But these days I spend most of my time focused on security engineering and building the systems for developing and deploying advanced AI. My teams and I also collaborate on security research with teams across OpenAI and with the broader security research community to explore the frontier of this technology, understand its impacts and also maximize its benefits, specifically as far as I’m concerned, on the cybersecurity domain. 

And internally this involves using large language models wherever we can to enable our security program. And yes, even doing some open source work of our own, too. So it’s great to be here and I look forward to a great discussion.

Omkhar Arasaratnam (01:46)
Wonderful. Thanks, Matt. Sounds like you’ve had quite a journey in terms of security. Why don’t we start at the beginning? How’d you get into security?

Matt Knight (01:52)
To put the bottom line up front: accidentally. So I started my career as an electrical engineer, as I mentioned. I studied EE in college, and EE is a pretty big field: on one end of the spectrum you can be doing analog electronics, and on the other end you can do digital, coupled with software engineering.

And I was always more on the digital side. So my first job out of college, I was working as an embedded software engineer, writing software for wireless networking stacks. And it was pretty interesting work, but I got to a point in my work where I found that I needed a spectrum analyzer to debug a system I was working on. And if you’ve ever had to buy lab equipment, you know that it’s really expensive. 

But right around that time, there was this open source project called GNU Radio that was getting a lot of buzz. And GNU Radio was really cool because it was this powerful open source signal processing toolkit that enabled basically like using software to implement all these different radio engineering and signal processing tools.

And between GNU Radio and some low-cost commodity hardware, I was able to get my hands on, I was able to basically build my own spectrum analyzer to help me do my work in developing and debugging these wireless systems. So I had this toolkit for monitoring the spectrum and it was pretty useful for that, but I kind of kept playing with it and found that you could use it not only to, you know, capture and analyze signals, you could also use it to replace signals, to generate your own signals. 

And, you know, I realized that if you captured a signal and replayed it, a lot of devices would just accept it and treat it as valid. And that really freaked me out. It was also sort of my first contact with how vulnerable much of that ecosystem was. And I kind of couldn't look away from it.

So I started doing security research on my own nights and weekends, wound up having the opportunity to make a career out of it, and wound up doing a lot of work in that wireless security space, a lot of which was supported by a really vibrant open source community at the time. And I did that for a while, and then made the choice a couple of years later to make a career transition into what I'm doing now.

And I’ve been spending roughly the last decade playing defense. I still have a lot of passion for the wireless space, but these days I’m spending my time protecting companies rather than doing wireless research.

Omkhar Arasaratnam (04:16)
You might have come up with the Flipper Zero before the Flipper Zero.

Matt Knight (04:19)
Flipper Zero is pretty cool. No, I was working with other sort of open source and some proprietary bits of hardware, but really underpinning all of it was GNU Radio. GNU Radio is a really, really powerful open source tool. There’s a ton of great academic and commercial work and research being done on it.

They have a great community around it. I’ve spoken to their conference a couple of times. And if you go far enough back, I open sourced at least one GNU Radio module myself based on research that I’ve done. So quick shout out to that community. It’s still going strong. And I’m always impressed with the great folks who work on that toolchain are coming up with.

Omkhar Arasaratnam (04:57)
I’m happy to hear that. So open source, it’s been with you for a really long time. Let’s talk about your day job now. So you’re working on a lot of cutting-edge stuff. As we think about AI, large language models, generative AI, how much of that world is supported by open source? What’s that look like?

Matt Knight (05:18)
Quite a bit of it is derived from open source. And I’d say that most companies are, to some degree, leveraging open source and also building their own. If we look at many of the frameworks that the AI industry leverages, think things like PyTorch and TensorFlow, they started in various ways within companies but now are open source and are sort of robustly supported by broad communities.

And if you go beyond the frameworks, there are, of course, myriad dependencies that companies rely on to do their research and also to run their infrastructure. And of course, much of the world's AI training infrastructure runs on Linux, which is itself open source. So I'd say that, by and large, open source is pretty important to AI research and innovation.

But beyond AI, many of the tools that the security industry uses have open source connections too. There's a network security tool called Zeek that most security teams (well, I shouldn't say most, but many) use in different ways; it's really powerful. And then in other domains, like code scanning, we've got some newer tools like Semgrep and CodeQL that are really powerful.

Omkhar Arasaratnam (06:25)
So we talked about how open source is a lot of foundational components of what we use in terms of AI today and how open source is a foundational component in a lot of the security tools we use. What if we inverted that? How can we use AI to improve the security of open source? Do you have any thoughts on that?

Matt Knight (06:45)
I do. So my teams and I spend a portion of our time studying and analyzing how advances in AI may impact cybersecurity. And we want those benefits to be, of course, as broadly distributed as possible. And what more deserving beneficiary of that than the open source software ecosystem?

And a thesis that I’ve sort of been refining here is that AI has the potential to help cybersecurity practitioners where they’re constrained. And that’s important because cybersecurity engineers face a lot of constraints. Every security team, to some degree, is constrained by capabilities, is always up against pressure to move faster, and is always up against pressure to access greater scale. 

Do you have the expertise to find and fix the security problems wherever they may lie? Can you find and fix the problems fast enough to mitigate issues before they turn into real problems? Can you fix the problems wherever they happen to exist? There’s always more code to analyze, there’s always more logs to analyze, there’s always more that you can do to get leverage. Can you access them efficiently? 

So we are finding that AI is broadly useful to alleviate some of these pressures. And we as a team look to incorporate language models into our own work wherever we can. Now, of course, it is necessary to be aware that these tools are very imperfect on their own. They have things that they’re really good at and they also have a lot of downsides. So we’re looking for places in which we can implement these tools to benefit our work while also managing the drawbacks and downsides. 

Omkhar Arasaratnam (08:30)
Sounds good. So we’re recording this just after the Open Field Competition for AIxCC closed. It closed on April 30th. Can you share with the audience a little bit about the AIxCC, the AI Cyber Challenge, how OpenAI is involved, obviously, OpenSSF is a supporter as well.

Matt Knight (08:53)
I’m happy to and Omkhar, I think we first met at DEF CON last year in connection with the AI Cyber Challenge and I’m glad to be supporting this initiative along with OpenSSF. So the AI Cyber Challenge really has a great mission at its core, which is to find and fix vulnerabilities in the open source software that powers and underpins the critical infrastructure that we all rely on.

And it’s very timely because we’ve seen this great explosion in AI capabilities that’s largely been driven by language models. And while we’ve seen so much capability growth in language models, I believe that static analysis, that is finding and fixing vulnerabilities in source code, is an area where language models have historically underperformed. I think it’s a rich area for research, but because the capabilities are still emergent, I think success in this challenge is gonna involve a lot more than just like clever prompt engineering to get results. 

But the challenge is great because it engages a robust security research community. It brings a whole bunch of folks who wouldn’t necessarily participate in a program like this into the fold. And it’s also gonna happen and play out publicly. I think the semi-finals and finals are slated to be at DEF CON, which will be a great way to get even more of the community involved. And I’m pretty enthusiastic about what it’s gonna produce. If we look at where conventional static analysis tools fall short, AI and language models have the potential to really bring different capabilities to this domain to help, I think, fill in some areas that could really benefit the sort of static analysis tool ecosystem.

Omkhar Arasaratnam (10:36)
I’m really looking forward to seeing what our competitors come up with as well. We’re going to move into the rapid-fire section. So I’m ready when you’re ready. And the right answer for any of these may be one of the answers I provide or no, Omkhar, I actually, I think it’s something else. So are you ready?

Matt Knight (10:55)
Let’s go, hit me.

Omkhar Arasaratnam (10:57)
Spicy or mild food?

Matt Knight (11:00)
Okay, so I have Irish heritage and I grew up in a family and household where salt was an exotic spice. So my answer may surprise you. I am a spicy food guy all the way.

Omkhar Arasaratnam (11:14)
All right, man. Well, we’ve got…we’re going to be grabbing a meal soon. I hope you’re…I may bring some hot sauce with me. Now, the next couple of questions are very engineering-focused. Text editor of choice: VI, VS Code or Emacs?

Matt Knight (11:32)
I am a VI or Vim person all the way. And, I mean, beyond it just being the first thing that I learned, it's on everything. It's on your Linux laptop, it's on all the servers you're going to jump onto, but it's also on a lot of embedded systems; you've got the small, low-profile embedded versions of it that you see in various places. So it's a pretty useful editor to fall back on.

Omkhar Arasaratnam (11:59)
I’m also a Vim guy, so full support here. Last but not least, tabs or spaces?

Matt Knight (12:05)
Spaces and specifically two of them.

Omkhar Arasaratnam (12:09)
(Laughter) Excellent. Thanks so much for going through that. Now Matt, as we close out the podcast, what advice do you have for somebody entering our field today, somebody that's either a new grad who just completed their undergrad in comp sci, or somebody that may be switching careers? What advice do you have for somebody entering today?

Matt Knight (12:29)
Love this, love this question. The world is changing beneath our feet very quickly, whether it's the emergence of AI or, more generally, the pace at which the software and security ecosystems move. So my advice to anyone who's getting started is really to stay curious. And if you commit to that and a lifetime of learning, just enjoy the ride.

Omkhar Arasaratnam (12:55)
And the last question for you, what’s your call to action for our listeners?

Matt Knight (13:00)
A couple of things here will be exciting to the listeners. The first is that we at OpenAI are hiring. So if you're interested in AI or language models or security, please do give us a look. I also want to briefly mention our cybersecurity grant program; this is something that all your listeners should feel encouraged to participate in. We're giving out a million dollars in cash incentives plus API credits to fuel innovation and research in defense of cybersecurity.

We love open source as part of that. So if you are working on, or want to work on, some sort of open source innovation to help benefit the ecosystem, we'd love to take a look and fund it. Just some ideas of things that we're excited about: porting code to memory-safe languages, and if you want to look at applying AI to that, that would be awesome. We think that confidential computing for GPUs could be a pretty important layer for protecting AI services going forward, and we'd love to fund some work around that, and other ideas too. We're always looking for research collaborations with the broader community, so we'd love to hear from you. And just lastly, two of my colleagues, Paul McMillan and Fotios Chantzis, were at Black Hat Asia. They actually open sourced some of their work coming out of that: some automation that we built at OpenAI to help enable our work and help our teams move faster. So if that sounds interesting, I encourage you to go check it out.

Omkhar Arasaratnam (14:25)
Thanks so much, Matt. Really appreciate your time. Thank you for joining us on the podcast.

Matt Knight (14:29)
My pleasure, thanks for having me.

Announcer (14:31)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #4 – Eric Brewer and the Future of Open Source Security

By Podcast

Summary

In this episode, Omkhar talks to Eric Brewer, professor emeritus of computer science at the University of California, Berkeley and vice president of infrastructure at Google. He’s also on the Governing Board of the OpenSSF. His research interests include operating systems and distributed computing. He is known for formulating the CAP theorem about distributed network applications in the late 1990s.

Conversation Highlights

  • 01:15 – Eric discusses his background
  • 03:14 – Improving security in a corporate vs. open source environment
  • 05:58 – Advancements Eric has noticed in open source in recent years
  • 07:17 – How to make software repositories more secure
  • 08:58 – The next big hurdle in open source security
  • 11:12 – Rapid-fire questions: Mild or spicy food? Vim or Emacs? Spaces or tabs?
  • 12:42 – Eric’s advice for aspiring security professionals
  • 14:45 – The importance of being active in security communities

Transcript

Eric Brewer (soundbite) (00:01)
I do think we need to start tackling build services and automated testing. And the reason these are harder is because they cost money. Even if you’re willing to volunteer time to work on open source, you may not be willing to pay for extensive building and testing. 

Omkhar Arasaratnam (00:18)
Hi everyone. Welcome to What's in the SOSS?, a podcast by the OpenSSF. I'm the OpenSSF General Manager, Omkhar Arasaratnam. And with us this week, we have Eric Brewer from Google. Eric, can you give us a little bit about your title and what you do at Google?

Eric Brewer (00:38)
Happy to. Nice to be on this podcast. I am a VP and fellow at Google. I’ve been working on cloud-related things for a long time, really since the 90s, before we had a cloud. At Google, I’ve been working on things like Kubernetes and open source security, and that’s what led me to help start the OpenSSF.

Omkhar Arasaratnam (00:58)
And for our listeners, a long time ago, Eric and I both worked together. And in fact, how we got to know each other was through my interest in open source security. And of course, the leadership that you were providing in open source security at Google. Can you talk a little bit about that aspect for our listeners?

Eric Brewer (01:15)
Yes, so through my work in Kubernetes, I came to realize in around 2018 that, you know, this was obviously a successful project, but it was also very complicated. At the time, it had like 1,200 dependencies. So it touched a whole bunch of code and that if you just look around, it wasn’t hard to find things that we depended on that realistically weren’t all that trustworthy.

And people were not very careful at the time about what kinds of things got included or how they would get used. And so that was a bit of a wake-up call for me. Then I started to look at a whole bunch of other things. I’m like, well, this is not just a dependency problem. It’s actually a supply chain problem more broadly. 

I started giving internal talks on supply chain security, and by the way, that term didn't exist at the time either; we didn't really call it that. It's the right term, though: how do we build our software? How do we know it's built correctly? And especially at the time, lots of times people would include packages directly off the internet, where you don't even know what's in there. My friend Kelsey Hightower often said it's kind of like picking up a USB drive off the street and sticking it in your laptop. Everyone knows not to do that, but we don't know not to take random packages off the internet, even though it's a very apt analogy.

I started thinking about that in 2018. I started thinking about what Google could do, but at the end of the day, this is not a Google problem. As much impact as Google can have, this is an industry-wide problem and needs an industry-wide solution.

Omkhar Arasaratnam (02:41)
As a grumpy old guy that's been doing security for about two decades, the very thought of somebody randomly plugging in a USB stick they found in the parking lot sends chills down my spine. And I totally get the equivalence in terms of, hey, what's the software you're consuming off the internet?

Now, Eric, you’ve worked in large organizations and you’ve also done a lot of work in open source. Can you speak to the differences in how you would improve security in a corporate environment like when you’re shipping code at Google versus what’s possible in an open source project?

Eric Brewer (03:14)
They are pretty different, which is something I hope to change eventually. I give lots of talks to Google Cloud customers on this kind of topic, for the corporate view at least, which is roughly: have control of the code you're using, which often means private copies of things. The private copies give you two advantages. One, you're not taking the code directly off the internet, even if it came indirectly from the internet. Two, when you have some kind of problem, you know exactly what code you used. Because another problem with pulling stuff off the internet is that code changes, and you may not know which version you have or when it was pulled. So you may not even know exactly what you're running.

So there’s some basic things like know the code you’re using. And then of course, the next big one after that is use a build service, right? You’re not allowed to build production code on your laptop because that device is probably in most cases not trustworthy for its supply chain attributes. And so those are kind of two basic examples of things that corporations can easily do and should be expected to do. 

Open source maintainers don't have it so easy. For example, clearly they're gonna use dependencies off the internet; they're not gonna keep a private copy of all the dependencies, because that kind of misses the point. And they typically have to build and test packages at their own expense, using whatever means they have. So to ask them to pay for a build service to do that is pretty unreasonable. In fact, one of the things I'm looking at is how we could build or rebuild packages in a secure, signed way for free for open source maintainers. Maybe not all the time, but when you want to do a major release or something like that. It's unclear where that will go, but there is plenty we could do to make it easier to be a maintainer of an open source package, especially one that's critical to the rest of the world.
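
As one concrete example of the "private copies" practice on the corporate side, here is how it can look with Go's module tooling; this is just an illustration, and other ecosystems have lockfiles and artifact mirrors that serve the same purpose.

```sh
# go.sum already pins every dependency to a cryptographic hash;
# vendoring additionally keeps a private copy inside your repository.
go mod vendor         # copy all dependencies into ./vendor
go build -mod=vendor  # build only from the vendored, known copies
```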

Omkhar Arasaratnam (05:02)
Absolutely. The notion of these kind of hermetically sealed build environments that are kept absolutely pristine, versus some of the horror stories we can think of, like a random laptop mining bitcoin on three of the cores while trying to do a production build of a package: the hermetically sealed build environment definitely has some obvious benefits, of course.

It does sound like we've got quite a few things we need to be focusing on. But since your involvement, which goes way back to 2017, 2018 and even predates the OpenSSF's inception in 2020, what are some accomplishments we've had in the field of open source security? It's been about five, six years. What have we done so far that you've been proud of?

Eric Brewer (05:58)
I think there’s a bunch of things. The biggest one is probably just awareness, right? The US now has an open source strategy, right? There’s a cybersecurity executive order. Both of those things I had some influence on, but I didn’t cause them, right? The importance of these issues caused them. And so that awareness is just symptomatic of broader awareness from all the ecosystems like Python or Rust where they’re now taking their role seriously in how to secure supply chains, how to, in general, improve the hygiene of development in open source and also how to make it easier on maintainers. Those are all good goals. And our best approach is to prove the tooling that maintainers use so they don’t have to do the right thing by some kind of behavioral change. That’s hard to communicate, hard to educate.

If they can use the tools they like and it happens to do the right thing, that would be the best outcome.

Omkhar Arasaratnam (06:55)
That sounds like some great accomplishments that we’ve had and completely agree with you. Let’s move now to talk a little bit about the watering holes of our community, the software repositories, the package repos. Can you talk a little bit about how improving the security around those software repositories can help improve the security of open source software in general?

Eric Brewer (07:17)
Those are critical, critical components. Now, source code for these things is mostly in GitHub, and that does help in the sense that GitHub is at least a group that cares about improving security issues, so I thank them for that. Package managers are more complicated because there’s many of them, and each ecosystem has its kind of own culture and philosophy around these things, and so we’re not gonna have some universal agreement on exactly how to do supply chain security. But I think we can agree on some basic things. 

Like, you would like to know which are your critical projects and which are not. We're not trying to raise the security burden on all things open source. Most things open source, frankly, aren't relevant to national security. But a surprising number are. So I do expect we'll have some kind of natural dichotomy over time: those projects that know and actively accept that they're part of critical infrastructure and take it seriously, and those that really just want to make their own website or create a new programming language, doing things for exploration or fun or all the great reasons one might do open source, right? But they're not committing to say this is a viable thing to use in critical infrastructure.

I’d kind of like to get to the point where most projects know which camp they’re in, right? Because we have a bunch in the middle right now where it’s mostly fun, but oh, by the way, it’s included in very important things, right? Not by their choice, by the way. Being included in something is often not the choice of the maintainer, and that also causes many problems.

Omkhar Arasaratnam (08:49)
Yeah, that dependency graph can be very interesting. Looking ahead, what do you see as the next big hurdle in open source security, and what can we do about it?

Eric Brewer (08:58)
It’s a great question. We’ve done a lot of focus recently on package managers and SBOM generation. There’s still plenty to do there. I do think we need to start tackling build services and automated testing. And the reason these are harder is because they cost money. Even if you’re willing to volunteer time to work on open source, you may not be willing to pay for extensive building and testing. And how that gets paid for is I think one of the great open questions that we need to sort out the next several years. 

The point, though, is that we can't quickly fix security bugs today, in part because we don't know, when you make a change to an open source package, if it's gonna break users of that package or not. So there's some uncertainty, even for pure security changes, about what impact they might have on the rest of the ecosystem. Now the good news is most of the time these changes are small, and most of the time they don't break the ecosystem. But we don't really know, because we don't have that many test cases and, in general, we don't have test cases for all the dependents either. All the things that use Log4j may not have test cases that can tell if the new version of Log4j is gonna break them or not. So we have to kind of put forward changes, hope they don't break stuff and cross our fingers. And that's not a great place to be.

I think if we had more automated testing, more test cases, controlled builds, all of those would improve our chance to deploy security patches quickly and effectively when the time came. So that is an important medium-term agenda, for sure. And this is a place, actually, where machine learning may be a great benefit, because it looks like we've had some success already with generating test harnesses for fuzzing and improving test coverage. I think we can generate test cases, and we can probably do more on the front of actually making simple patches easy for a maintainer to accept: accept this patch to do a version bump to get the new Log4j, say. Some of that happens already, but these are mundane things, and if we can do mundane things automatically, that will save everyone a lot of unfun time.
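
To picture the kind of fuzzing harness Eric alludes to, here is a minimal sketch using the Rust libfuzzer-sys crate (the style cargo-fuzz generates); the `parse_config` function is a hypothetical stand-in for whatever code is under test, not something from the conversation:

```rust
// Minimal cargo-fuzz style harness: the fuzzer feeds random bytes into the
// function under test and watches for panics, overflows, and sanitizer hits.
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Only fuzz valid UTF-8 inputs; everything else is simply skipped.
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = parse_config(s);
    }
});

// Hypothetical function under test; a real harness would import it from the
// crate being fuzzed instead of defining it inline.
fn parse_config(s: &str) -> Option<(String, String)> {
    let (key, value) = s.split_once('=')?;
    Some((key.trim().to_string(), value.trim().to_string()))
}
```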

Omkhar Arasaratnam (11:07)
Absolutely, reduce toil wherever possible, right?

Eric Brewer (11:11)
Absolutely.

Omkhar Arasaratnam (11:12)
So we’re going to shift gears a bit and now we’re going to go into what I hope are a bunch of fun rapid-fire questions for you. So first out the gate, and I’ll provide you two options. The third option of course is, Omkar, it’s neither of those. Here’s my answer. So the first question is, spicy food or mild food?

Eric Brewer (11:33)
Sadly mild.

Omkhar Arasaratnam (11:34)
Oh no. Oh no. Would you like to say more on that?

Eric Brewer (11:38)
Well, I’ve been to lots of countries where there is no mild. Hyderabad in India comes to mind. The mildest they have is near the top of my spice chart. I do have, for better or worse, a very sensitive palate, which is great for wine tasting and a few other things, but it’s not good for eating spicy food.

Omkhar Arasaratnam (11:58)
Okay, well, good to know. Second question for the rapid fire. Vim or Emacs or other?

Eric Brewer (12:06)
Emacs still. Picked it up in grad school in the 90s and know certain keyboard shortcuts that are just deep in my brain.

Omkhar Arasaratnam (12:08)
Oh boy.

Omkhar Arasaratnam (12:16)
Emacs is a great operating system that has an editor attached to it. All right. Number three in the rapid-fire, and the final one for the rapid-fire, Eric. Tabs or spaces?

Eric Brewer (12:26)
Spaces.

Omkhar Arasaratnam (12:27)
All right! We’re moving to our closeout now. So you’ve been in the industry for quite a while. You’re very well-regarded and have accomplished a number of things. What advice do you have for somebody that’s entering our field today?

Eric Brewer (12:42)
Well, the good news is I think open source is a great way to boost a career in computer science, or tech more broadly. And it's amazing how often I've wanted to hire people because of what I saw them write in open source. And I'm not saying that's an easy path, but boy, it is a good path, because it's very easy to tell what you care about by what you choose to work on, and I can see the code you write and how you feel about lots of different topics, for example tabs versus spaces, which is not a criterion I would use for hiring, but I might notice.

Omkhar Arasaratnam (13:17)
Good to know.

Eric Brewer (13:19)
So there’s that role of open source, which is as a proving ground or a way to increase visibility. And by the way, if you’re helping those projects, that’s a great thing to do, regardless of even if you’re doing it for reasons to get exposure to learn about a project or space.

The second thing I would say is that, you know, it's worth learning how to interact with many different projects. The easy thing to do is pick a project and work on it, and there's certainly value in that. But I also feel like I've learned more when I'm working on five or ten different things at a time. They have slightly different cultures, or different rules about what you do to submit code, or readability, things like that. And those choices are worth understanding. I'm not even saying that they're good or bad. The point is they are diverse and the communities are diverse, and it's much more important, I think, to be able to contribute to someone else's project than to your own. Most of what the world needs is interconnections of projects and glue. And that is itself a skill that's worth acquiring.

Omkhar Arasaratnam (14:26)
I completely agree, and the old adage, I suppose, holds up: we can't solve tomorrow's problems using yesterday's thinking. So encouraging that diversity of thought is certainly paramount, especially in security. Last question for you, Eric, and thank you for having been so generous with your time. What's your call to action for our listeners?

Eric Brewer (14:45)
It’s a great question. It depends a little bit on the listeners, but I think I would start with all of the communities typically based around languages like Python or Rust, and what can you do to make your community have better support for security? And that can be all kinds of things. It can be two-factor authentication. It can be helping with automation for SBOMs or for signing things correctly or being able to know for sure which source goes with the built artifact. There’s so many things that communities can do, and kind of need to do, that that’s an area of great need right now, where they aren’t hard things to do. They just need people to think about for their community, how do they want to solv d then go do it. And that would be a huge benefit to the community but also to the world.

Omkhar Arasaratnam (15:35)
Thank you so much, Eric, for your time. That’s it for What’s in the SOSS. Stay tuned for our next episode.

Announcer (15:40)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #3 – Mark Russinovich and AI’s Impact on Software Engineering and Open Source Software Security

By Podcast

Summary

In this episode, Omkhar talks to Mark Russinovich, CTO of Microsoft Azure. Mark oversees the technical strategy and architecture of Microsoft’s cloud computing platform. Mark is also on the Governing Board of the OpenSSF. He’s a widely recognized expert in distributed systems, operating system internals, and cybersecurity. Mark’s also the author of the Jeff Aiken cyberthriller novels Zero Day, Trojan Horse and Rogue Code, and co-author of the Microsoft Press Windows Internals books.

Conversation Highlights

  • 00:36 – Mark on his role at Azure
  • 01:30 – Where AI is headed and its impact on enterprises
  • 04:06 – The task of teaching a machine learning model to unlearn Harry Potter
  • 06:32 – The good and bad of AI hallucinations
  • 10:35 – The promise of more secure open source software via AI
  • 13:05 – Mark answers Omkhar’s “rapid-fire” questions: mild or spicy food, Vim, Emacs or VS Code and tabs or spaces
  • 15:01 – Why aspiring software engineers should still learn to code

Transcript

Mark Russinovich soundbite (00:01)
I think we’re still a ways away from AI completely just taking over coding. Just like, you can have people that might be able to get away with never learning how to code, and just always prompting AI. When things go wrong, knowing what’s going on underneath will make you more effective than the person that doesn’t.

Omkhar Arasaratnam (00:18)
Hi everyone, and welcome to What’s in the SOSS? I’m your host Omkhar Arasaratnam, the general manager of the OpenSSF. Today we have a good friend of mine, Mr. Mark Russinovich, Azure CTO. What does it mean to be the Azure CTO? Let’s get into that.

Mark Russinovich (00:36)
What I tell people, the short version is: lead technical strategy and architecture for the Azure platform. There's a lot behind that, though. I work with engineering teams. I do work on architecture. I also, as part of it, focus on security, and helped co-found the Open Source Security Foundation as part of looking at how we can improve our industry's supply chain for open source.

Omkhar Arasaratnam (00:56)
And thank you for that. It is certainly a challenging mission we're on. Now, you buried the lede a bit. You didn't talk about the continued work that you're doing on Sysinternals.

Mark Russinovich (01:04)
(Laughter) Yeah, I’m also known as Mr. Sysinternals. I still do occasional side work on Sysinternals. My favorite tool, by the way, and if you haven’t seen it and your’re Windows user, is Zoomit, which lets you annotate the screen and it’s great for demos and presentations.

Omkhar Arasaratnam (01:21)
And if I recall, for those whose eyesight has suffered over the years as mine has, it helps with that too.

Mark Russinovich (01:26)
Yeah, I use it frequently myself for that.

Omkhar Arasaratnam (01:28)
(Laughter) Absolutely. You know, other than the leadership that you’ve provided in security, one of the other areas that you’ve been focusing on in terms of the leading edge of our industry is in AI and machine learning. Generative AI, in particular, holds a lot of promise. Where do you see this heading?

Mark Russinovich (01:48)
First of all, it’s hard to predict where things are heading because the rise of generative AI and the capabilities that we see in it took just about everybody by surprise. And I think that there’s probably more surprises in store for us. So there’s going to be some discontinuities, but generally, the trajectory is that we’re going to have AI assistance that are our personal assistants that help us in all aspects of our life. And then in the enterprise scenarios, we’ll have AI assistants that are automating a lot of the work that today humans are required to do and helping humans make decisions in all aspects of their work across enterprises.

Omkhar Arasaratnam (02:28)
On a personal level, if you don’t mind getting into it, how have you been using AI personally? How has that helped your day? What kind of toilsome tasks has AI been able to automate for you, and where are you seeing the limits? Like, where are we not quite there yet?

Mark Russinovich (02:42)
Well, there’s basically three different ways that I use AI. One of them is if there’s a topic that I’m not that familiar with and I want to know more, rather than going to a web search for it, I just go and ask an AI assistant to teach me about it. And the nice thing is I can tell it, hey, teach me as if I’m a high school student. Teach me as if I’m an expert in this, these other holes that I might have in my knowledge. And it crafts an appropriate response at the right altitude.

The other way is summarization. So looking at lots of papers, saying to the AI, summarize this paper for me. And that gives me a high-level view of what's in it. And then I can go dive into specific sections that I want to learn more about. And then the final way is programming, both for Sysinternals and the AI programming work that I do on the side. I use GitHub Copilot, and it has transformed the way that I code. It has turned me into an expert at things that I don't really know much about, like Python and PyTorch, and changed the way that I approach programming, to the point that I really don't want to have to type any code anymore. I want to just tell AI to type it for me.

Omkhar Arasaratnam (03:51)
Now, we were chatting a little while ago. You'd actually taken a sabbatical last year. And while you had a lot of wonderful quality time with the family, you also used some of that time to start picking up on generative AI. What kind of projects did you get into at that time?

Mark Russinovich (04:06)
So I wanted to get my hands dirty during the sabbatical, where I had more time, exploring something that was novel and where I'd learn a lot in the process. And one of the things that I recognized early on, looking at the rise of generative AI and the cost of these large models (they can cost millions to tens of millions of dollars or more to train), is the issue of problematic training data that you discover only after you finish training the model.

For example, copyrighted information, GDPR-covered data, poisoned data: you want to have a version of the model that reflects not having been trained on that, without spending, again, the millions of dollars to retrain it from scratch. And so I thought unlearning would be a fantastic tool for these kinds of scenarios. So Ronen Eldan, another researcher at Microsoft, and I decided to see if we could get a large language model, specifically Llama 7B from Meta, to forget Harry Potter, because these models know Harry Potter really deeply. That was the summer project, and we succeeded in getting Llama 7B to forget Harry Potter. So when you ask it to complete a sentence like, "Harry went back to school that fall and saw his friends..." the pre-trained model would say, "Of course, Ron and Hermione," even though there was no Hogwarts reference there other than the indirect reference to school and the name Harry. That's how deeply these models are trained on Harry Potter content. Now the version that we made, where it forgets the Harry Potter universe, will say something like, "went back to school to see his friends Sarah and Joe and take a class from their favorite professor," with some generic names.

Omkhar Arasaratnam (05:48)
That’s, that’s really cool. That’s incredibly interesting. And I think addresses some of the, at least what I’ve heard of as some concerns, especially adversarial use of improperly trained models and things of that nature. One of the things that I have a personal concern about —and it’s probably just being keenly aware of my own limitations —you’d mentioned one of the use cases that you use AI for today is in quickly coming up to speed on a subject that you’re unaware of.

What do you think about this notion of hallucinations and the possibility that the AI may not give you the most accurate information today? And I recognize it's a point-in-time statement. How do you get past that?

Mark Russinovich (06:32)
It seems like you might have read my mind, or maybe we had a conversation about this that I don't recall. Hallucination, I think, is actually the biggest challenge for use in high-impact scenarios, and many enterprise scenarios where you'd want to apply AI are high impact, where if the model hallucinates something, you could have a big problem. Let me just also say that hallucination actually has positive attributes to it.

So when you want to be creative, when you want to write a document that actually flows nicely, hallucination, which is just the model going off its training distribution a bit, helps it be more creative and easier to read. Now in the kind of enterprise scenario where you're automating a workflow, a hallucination can cause a problem. And especially when you get into multi-agent systems, it can really pose a problem: agents might have a dozen interactions, and somewhere along the line one of the agents hallucinates something, and the workflow continues with the others not being aware of what happened, making decisions and continuing orchestration based off of the incorrect assumptions in that hallucination.

So by the end of the workflow, you've got an output that is completely wrong, but you can't tell why it's wrong or where it went wrong. That's an example of where I think taming hallucination is key. And there are a few approaches to taming hallucination that just leverage the existing capabilities of the models, like grounding with RAG content, like meta prompts that tell the model to check its work, like having another model, or the model itself, go back and review its work with a separate prompt.

But I think we need other techniques, too. Because even while that drives down hallucinations, depending on the scenario, you'll see hallucination rates between 5% and 20% or 30%, depending on what the model is being asked to do and how far it is off its training distribution. So I think there's a need for AI models that detect hallucinations, that correct hallucinations, and then even just kind of old-fashioned validation of what the model is producing. And code is a great example of this, where you could see the models generating code.

Now, a lot of times it's going to be correct. A lot of times, though, it's got bugs in it, like referencing packages that don't exist, because it thinks, oh, there should be a package named this that does this. And so it'll put it in, and it actually doesn't exist. And so your code doesn't even compile or run. So a simple validation is: just compile it or run it. Are there any errors? A more sophisticated validation is: don't just compile and run it, but check to see if it actually produces output.

So the first level of validation is just to look at it and see if there are any problems. The second would be to run it and see if it actually runs. And the third would be to create unit tests for it and then validate against the unit tests. So I think there's this need for domain-specific validators, with degrees of validation based on how much cost you want to spend on validation, which is relative to the impact of a hallucination.
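
As a rough sketch of what that validation ladder can look like in practice, here is a Rust example that compiles a generated snippet and then runs its unit tests. The scratch-crate layout, paths, and the snippet itself are illustrative assumptions, not a pipeline Mark describes:

```rust
// Compile-then-test validation for generated code. Assumes `cargo` is on
// PATH and a scratch crate exists at ./scratch_crate (both are assumptions).
use std::fs;
use std::process::Command;

fn validate(generated_src: &str) -> std::io::Result<bool> {
    // Step 1: drop the generated code into the scratch crate.
    fs::write("scratch_crate/src/lib.rs", generated_src)?;

    // Step 2: does it compile? This alone catches hallucinated packages and APIs.
    let built = Command::new("cargo")
        .args(["build", "--manifest-path", "scratch_crate/Cargo.toml"])
        .status()?
        .success();
    if !built {
        return Ok(false);
    }

    // Step 3: run the unit tests (the snippet's own, or ones generated separately).
    Ok(Command::new("cargo")
        .args(["test", "--manifest-path", "scratch_crate/Cargo.toml"])
        .status()?
        .success())
}

fn main() -> std::io::Result<()> {
    // A toy "generated" snippet carrying its own unit test.
    let snippet = "pub fn add(a: i32, b: i32) -> i32 { a + b }\n\
                   #[cfg(test)]\nmod t { #[test] fn adds() { assert_eq!(super::add(2, 2), 4); } }";
    println!("accepted: {}", validate(snippet)?);
    Ok(())
}
```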

Omkhar Arasaratnam (09:35)
Yeah, that makes a lot of sense. I think we've all heard about the Canadian airline a few months ago, where their AI chat bot had made a particular statement about a ticket that somebody had purchased that had to do with them traveling for bereavement. It had given them incorrect information, and the Canadian court system ended up finding in favor of the passenger. I mean, it was the airline's chat bot, which they took as gospel.

It’s very interesting that you brought up the notion of different regression tests or unit tests that we could take when writing software. Turning the focus of how we may apply AI now to a challenge that you and I both face on a daily basis, what are your thoughts about AI helping to secure open source software, whether it be challenges like the DARPA AI Cyber Challenge that we’re helping out on or maybe in more general?

Mark Russinovich (10:35)
So AI is going to do a tremendous amount for open source software over the next few years. And there are a few things that you can see right away that it can do, like code reviews: look for bugs. Already it's good enough to detect certain kinds of bugs just by that, and we can continue to fine-tune it to learn how to better spot security vulnerabilities in software just through code reviews. The other one is through this kind of validation that I talked about.

And people have already started to explore this, like fuzz testing based on an AI-driven fuzzer that is more sophisticated about looking for problems, by inspecting the code and then deciding the best way to fuzz it. So it's kind of combining the human type of reasoning with automated fuzzing.

The other one is helping to generate the code, like we talked about. But one thing that I think is going to be a great boon, and can already be done today, is documenting code. There are tremendous amounts of code out there, in Linux and elsewhere, that have no documentation. The code is the documentation. So somebody that is new to the code comes and says, "I want to contribute to this, but I have no idea what's going on. It takes me a long time to come up to speed and learn the code."

An AI model can inspect the code and generate comments, in the header for a function or inline, to describe exactly what's going on. And while that's not rocket science, it can probably save tons of hours of work for somebody coming up to speed with the code base. Not just that, but you can have an AI chatbot that sees the code, and you can ask it questions about the code to learn how the code works more quickly. I think that one is a very near, here-and-now capability that AI can offer to help security and open source contributions.

Omkhar Arasaratnam (12:24)
I think that makes a lot of sense. In fact, as we think about things like self-documenting code, where previously the code was the documentation, I think use cases like that, to the point that you made earlier, also provide almost a semi-automatic method, right?

So even if the quality isn't quite there today, and even if there's a slight hallucination or imagination in terms of the LLM's inference of what the code's supposed to do, presumably you have enough familiarity with the syntax of the language that you would be able to judge the correctness of that interpretation. But even still, the error rate is probably lower than if you had to manually grok through all the code yourself. I see that.

I think we’re gonna move into the rapid-fire section now, Mark. And with any of these questions, I’m gonna give you an either-or but the reality is there could be a third answer, which is, “No, Omkhar, I actually, I feel this way.” So I think the first one is quite binary, but spicy or mild food?

Mark Russinovich (13:23)
Spicy.

Omkhar Arasaratnam (13:25)
All right. You know, some of our other guests have been leaning mild, and we have a dinner coming up and I’m going to find someplace with spicy food for you. I think I know the answer to this, but I’m going to ask it anyway. Vim, Emacs or VS Code?

Mark Russinovich (13:39)
VS Code. But I think the true question here, the pure question, is Emacs or Vim or vi?

Omkhar Arasaratnam (13:48)
You know, I’d like to say that, but I’ve been messing around with VS Code lately, and it’s not just because you’re on this recording, Mark. I’m starting to dig VS Code.

Mark Russinovich (13:57)
Now, I love VS Code, that's what I use. But prior to VS Code, when there was very much the Vim camp and the Emacs camp, I was strongly in the Emacs camp. In fact, I used Emacs until the late 90s, before I started to use Microsoft Visual C and its own editor. Like, I just don't understand the vi people at all, I'm just baffled.

Omkhar Arasaratnam (14:18)
Well, I’m definitely a VI guy and you know, Emacs is a great operating system that happens to have an editor attached to it. (Laughter) Last one for the rapid –

Mark Russinovich (14:31)
I still have trouble quitting vi.

Omkhar Arasaratnam (14:33)
(Laughter) I presume you can get Emacs key bindings for VS Code.

Mark Russinovich (14:38)
You know, I grew up on Emacs key bindings. And by the way, I just don't know why Emacs key bindings aren't the default for shells either. The key bindings that shells come with by default are just ridiculous.

Omkhar Arasaratnam (14:51)
They are. They are. Tabs or spaces?

Mark Russinovich (14:54)
I don’t really care that much. That’s one where I’m like, whatever, as long as it formats correctly visually.

Omkhar Arasaratnam (15:01)
Makes sense, as long as it's consistent, I guess. So to close us out, Mark, what advice do you have for somebody entering our field today? We've both been in the field for quite a while. We've seen a lot of stuff, and a lot of stuff has changed. With that wealth of knowledge, how would you guide somebody that's entering our field today?

Mark Russinovich (15:18)
Entering our field meaning software engineering. I guess the elephant-in-the-room aspect of that question is: should they learn to code or not? Is that where things are going? Do they need to? And I would actually say, yes, go ahead and learn how to code. And I'd say that for a couple of reasons. One, it's a way to give you critical thinking about an end-to-end process, from the high-level objective down to how you actually implement it.

And that translates to other domains as well. And even if AI is going to be doing some of the low-level lifting, you still need to have the high-level sense of how things fit together and flow, which you're going to get as you learn to code. The other reason is that I think we're still a ways away from AI completely taking over coding. And just like you can have people that might be able to get away with never learning how to code and just always prompting AI, when things go wrong, knowing what's going on underneath will make you ten times more effective than the person that doesn't. So for that reason, but certainly also the first one I mentioned, I would say you're not wasting your time by learning how to code.

Omkhar Arasaratnam (16:31)
I think that’s great advice for people entering software engineering today. Last question for you. What’s your call to action for our listeners? What would you have them do immediately following this show?

Mark Russinovich (16:41)
Go check out all the learning materials that the Open Source Security Foundation offers and learn how to secure your open source supply chains.

Omkhar Arasaratnam (16:48)
Thanks very much, Mark. I can't thank you enough for being a guest on our show. I look forward to catching up with you shortly, and thank you for joining What's in the SOSS?

Mark Russinovich (17:00)
Yeah, thanks for having me. I’m a great conversation.

Announcer (17:02)
Thank you for listening to What's in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We'll talk to you next time on What's in the SOSS?

What’s in the SOSS? Podcast #2 – Christoph Kern and the Challenge of Keeping Google Secure

By Podcast

Summary

In this episode, Omkhar talks to Christoph Kern, Principal Software Engineer in Google’s Information Security Engineering organization. Christoph helps to keep Google’s products secure and users safe. His main focus is on developing scalable, principled approaches to software security.

Conversation Highlights

  • 00:42 – Christoph offers a rundown of his duties at Google
  • 01:38 – Google’s general approach to security
  • 03:02 – What Christoph describes as “stubborn vulnerabilities” and how to stop them
  • 06:42 – An overview of Google’s security ecosystem
  • 10:00 – Why memory safety is so important
  • 12:23 – Solving memory safety problems via languages
  • 16:23 – Omkhar’s rapid-fire questions
  • 18:28 – Why Christoph thinks this may be a great time for young professionals to enter the cybersecurity industry

Transcript

Christoph Kern soundbite (00:01)
The White House just put out a memo talking about memory safety and formal methods for security. I would have never believed this a couple of years ago, right? It's becoming a more important, table-stakes part of the conversation. It might actually be a very interesting time to get into this space without having to sort of swim upstream the whole time.

Omkhar Arasaratnam (00:17)
Hi everyone, it’s Omkhar Arasaratnam. I am the general manager of the OpenSSF and the host of the What’s in the Sauce? podcast. With us this week, we have Christoph Kern. Christoph, welcome.

Christoph Kern (00:30)
Thank you, Omkhar, for having me. It’s an honor to be here, and I’m looking forward to this conversation.

Omkhar Arasaratnam (00:34)
It’s a pleasure, Christoph. So, background. Tell us a little bit about where you work and what you do.

Christoph Kern (00:42)
I’m a principal engineer at Google. I’ve been there about 20 years and a bit, so quite a long while. I work in our information security engineering team, which is basically product security. So we look after the security posture of all the services and applications that Google offers to our users and customers. And a lot of that time, I focused on essentially trying to figure out scalable ways of providing security posture across hundreds of applications and to a high degree of assurance at that.

Omkhar Arasaratnam (01:13)
Well, I think if memory serves, we spoke a couple of times when I was at Google, a couple of times after Google. I mean, securing Google full stop, no caveat, no asterisk. That’s a lot of stuff. So what are some of the ways that y’all have thought about securing all the things within Google? I presume you just don’t have a fleet of security engineers that descend upon every project.

Christoph Kern (01:38)
Right, exactly. To make this scale, you really have to think about invariants that you want to hold for every application, and also classes of common defects that you want to avoid having in any of these hundreds of applications. And the traditional way of doing this has been to try to educate developers and to use sort of after-the-fact code reviews and testing and penetration testing. And, you know, in our experience, this has not actually worked all that well. And we, over the years, sort of realized that we really need to think about the environments in which these applications are being built. And usually there are many applications that are fairly similar, right? Like, we have hundreds of web front ends, and they have many aspects of their threat model that are actually the same for all of them, right?

Cross-site scripting, for instance, is an issue for every web app, irrespective of whether it’s a photo editor or a banking app or an online email system. And so we can kind of take advantage of this observation to scale the prevention of these types of problems by actually building that into the application framework and the underlying libraries and the entire developer ecosystem, really, that developers use to build applications. And that has turned out to work really quite well.

Omkhar Arasaratnam (02:53)
Now, in the past, you’ve referred to this class of stubborn vulnerabilities. Can you say a little bit more about stubborn vulnerabilities and what makes them so stubborn and hard to eliminate?

Christoph Kern (03:02)
Yeah, there’s a list of vulnerabilities that the folks who make this common weakness enumeration, the CWE, put out. So they’ve been putting out the, sort of, top 25 most dangerous vulnerabilities list for years. And recently, they started also making a list of the ones that consistently appear near the top of these lists over many years. And those are then, evidently, classes of problems that are in principle well understood.

We know exactly why they happen and what the, sort of, individual root cause is, and yet it turns out to be extremely difficult to actually get rid of them at scale and consistently. And this is then evidenced in the fact that they just keep reappearing, even though there’s been guidance on how to, in principle, in theory, avoid them for years, right? And it’s well understood what you, in principle, need to do. But applying that consistently is very difficult.

Omkhar Arasaratnam (03:52)
Software engineer to software engineer: what's the right way of fixing these vulnerabilities? I mean, we've thrown WAFs at them, we've tried all kinds of input validation techniques. What would you recommend? Like, how does Google stop those?

Christoph Kern (04:06)
I think the systemic root cause for these vulnerabilities being so prevalent is that there is an underlying API that developers use that puts the burden on developers to use it correctly and safely. Essentially, all of the APIs in this class of injection vulnerabilities consume a string, a sequence of characters, that is then interpreted in some language. It could be SQL in the case of SQL APIs, then leading to SQL injection, or JavaScript embedded in HTML in the case of XSS, right?

And the burden is on developers to make sure that when they write code that assembles strings to be passed to one of those APIs, the way they're assembled follows secure coding guidelines. In this case, that means questions of how you escape or sanitize an untrusted string that's embedded in HTML markup, for instance. And you have to do this hundreds of times over in a large application, because there's lots of code in a typical web app that assembles smaller strings into HTML markup that is then shipped to a browser and rendered. And it's extremely difficult not to forget in one of those places, or apply the wrong rule, or apply it inconsistently. This is just really, really difficult, right? And this is why those vulnerabilities keep appearing.

Now, to get rid of them, what we found, the only thing that actually works, is to really rethink the design of the API and change it. And so we just went ahead effectively and changed the API so it no longer consumes a string, but rather consumes a specific type that is dedicated to that API and essentially holds the type contract, the promise, that its value is actually safe to use in that context. And then we provide libraries of builders and constructors that are written by experts, by security engineers, that actually follow safe coding rules.

And then as an application developer, you really don't have the opportunity to incorrectly use that API anymore, because the only way to make a value that will be accepted by the API is to use those expert-built libraries, right? And then effectively the type system of the language just glues everything together. It also ties together places that are otherwise very difficult to understand because they're very far apart: a value might be constructed in one module, maybe even in a backend, that makes a snippet of HTML markup that's shipped to a browser and then embedded into the DOM, and those two places might be written by different teams. The type system ties those two things together and actually makes sure that the underlying coding rules are followed consistently and always.
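
To make the pattern concrete, here is a minimal, hypothetical sketch of the idea in Rust. It is not Google's actual API, and the escaping is deliberately simplified (real libraries bind parameters rather than splicing strings):

```rust
// Sketch of "the API consumes a type, not a string." The type contract of
// SafeSql is that its value is safe to use as a query; only the expert-built
// constructors below can create one, because the inner String is private.
pub mod safesql {
    pub struct SafeSql(String);

    impl SafeSql {
        // Only compile-time constant templates may enter unchecked.
        pub fn from_constant(template: &'static str) -> Self {
            SafeSql(template.to_string())
        }

        // Untrusted input goes through an escaping builder, never raw.
        // (Simplified for illustration; real APIs bind parameters instead.)
        pub fn bind(mut self, value: &str) -> Self {
            let quoted = format!("'{}'", value.replace('\'', "''"));
            self.0 = self.0.replacen('?', &quoted, 1);
            self
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

use safesql::SafeSql;

// The query API accepts only SafeSql, so an attacker-controlled String can
// never reach it directly: passing a raw string simply wouldn't type-check.
fn run_query(query: &SafeSql) {
    println!("executing: {}", query.as_str());
}

fn main() {
    let q = SafeSql::from_constant("SELECT * FROM users WHERE name = ?")
        .bind("o'brien"); // untrusted input, escaped by the builder
    run_query(&q);
}
```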

Omkhar Arasaratnam (06:42)
Other than SQL injection and cross-site scripting, can you provide any other practical examples or maybe just to reflect back on how this has shown up in the security properties of Google products? Has this been broadly adopted by Google developers? Has there been some resistance? Can you talk a little bit about that from a developer experience perspective?

Christoph Kern (07:06)
The way Google’s developer ecosystem evolved for different reasons, really for productivity and quality reasons, the design of that ecosystem actually helped us greatly, right? So Google has this single monorepo where all the common infrastructure, including compilers and toolchains and libraries, are provided by central teams to all the application developers. And there’s really no practical way for somebody to build code without using that. It would be just very expensive and outlandish to even think of. And so if we build these things into those centrally-provided components, and we do it in a way that doesn’t cause undue friction, most people just don’t even notice.

They’ll just use a different API to make strings that get sent to a SQL query, and it just works, right? If it doesn’t work, then they’ll read the document and say, “Oh, this API wants a trusted SQL string instead of a string, so I’ll have to figure out how to make that and here’s the docs.” And once they figure this out once, they’re on their way. And so we’ve actually seen fairly little resistance to that. And of course, we’ve designed it so that it’s easy to use, right, otherwise we would see complaints.

One interesting thing we’ve done, I think that actually sort of in hindsight helped a lot is that we’ve chosen ourselves to make the maintainers and developers of these APIs, the security engineers, the first line customer support, so to speak, for developers using them. So we have this internal, sort of,  equivalent of Stack Overflow, where people can ask questions. And our team actually monitors the questions about these APIs. And that inherently requires us to, so we don’t get drowned in questions or problems, to design them and iterate them on an ongoing basis to make them easier to use. So that in almost all use cases, developers can just be on their way by themselves without needing any help. And so that’s really helped to sort of tune these APIs and figure out their corner cases in their usability and make them both easy to use and secure at the same time.

Omkhar Arasaratnam (09:00)
That’s a wonderful overview. And just to summarize, by baking these kind of protections right into the tooling that the developers use, they don’t have to waste mental effort on trying to figure out how to sanitize a string. It’s already there. It’s already present. If you have a new developer coming in from the outside who maybe doesn’t have experience with using these trusted types, the actual API that they would call won’t accept a raw string. So they’re forced into it.

And I guess the counterbalance to ensure that you have a usable API for your tens of thousands of developers within Google is that essentially the people that write this also have to support it. So it’s in their best interest to make it as friction-free as possible for the average developer inside of Google. I think that’s, that’s excellent.

We’re going to switch gears. Google recently published a paper on memory safety of which you were one of the co-authors. So let’s talk about memory safety a little bit. Can you explain to the listeners why it is important?

Christoph Kern (10:00)
Yes, I think memory safety, or the memory safety problem, is essentially an instance of the sort of problem we just talked about, right? It is due to the design, in this case, of a programming language or programming languages that have language primitives that are inherently unsafe to use, where the burden is on the developer to make sure that the surrounding code ensures the safety precondition for that primitive. So for instance, if you're dereferencing a pointer in C or C++, it's your responsibility as a programmer to be sure that anytime execution gets to that point, that pointer still points to validly allocated memory, and it hasn't been deallocated by some other part of the code before you got here, right?

And if that's not the case, it leads to a temporal safety violation, because you have, for instance, a use-after-free vulnerability. Similarly, when you're indexing into an array, it's your responsibility to make sure that the index is in bounds and you're not reading off the end of the array or before the beginning. Otherwise, you have a spatial safety issue.

And I think what makes memory safety particularly stubborn is that the density of uses of potentially unsafe APIs or language features is orders of magnitude higher than for some of these other vulnerability classes. So if you look at SQL injection, in a large program you might have tens of places where the code is making a SQL query, versus in a large C or C++ program, you'll have, you know, thousands or tens of thousands of places that dereference pointers. Literally every other line of code is a potential safety violation, right? And so with that density of potential mistakes, there will be mistakes.

There’s absolutely no way around it. And that sort of is borne out by experience in that code that is written in an unsafe language tends to have its vulnerabilities be memory safety vulnerabilities.

Omkhar Arasaratnam (11:50)
Many languages nowadays, be it Python, JavaScript, or, for lower-level software development, Golang or Rust, all proclaim these memory safety properties and are often referred to with absolutes, as in solving an entire class of problems. I think you and I have been around software engineering long enough that such bold claims are often met with a bit of cynicism. Can you talk about how these languages actually solve these entire classes of memory safety problems?

Christoph Kern (12:23)
Yes, I think the key to that is that if you use them to good effect, you can design your overall program to enable modular reasoning about the safety of the whole thing, and in particular, confine to small fragments the code that does need to use unsafe features. So in Rust, you might need unsafe in a module. For instance, if you want to implement a doubly linked list, you need unsafe, right? Or you need reference counting.

But what you can do is write this one module that uses potentially unsafe features so that it is self-contained and its correctness and safety can be reasoned about without having to think about the rest of the program. So basically when you write this linked list implementation, for instance, in Rust, you will write it in a way such that the assumptions it needs to make about the rest of the program are entirely captured in the type signatures of its API.

And you can then validate, by really thinking about it hard (it's a small piece of code, and you might get your experts in unsafe Rust to look at it), that the module will behave safely in any well-typed program it is embedded in, right? And once you are in that kind of a place, then you actually get a very high assurance of safety of the whole program, just out of the fact that it type-checks, right?

Because the components that use unsafe features are safe for any well-typed caller, and the rest of it is inherently safe due to the design of the language, there really is very little opportunity for a potential mistake. And that's, I think, again, borne out by practice: in Java or JVM-based languages, memory safety really is a very rare problem. We've had some buffer overflows in, I don't know, image parsers that use native code and stuff like that.

But it’s otherwise a relatively rare problem compared to the density of this type of bug in code that’s written in a language where unsafety is basically everywhere across the entire code base.

Omkhar Arasaratnam (14:23)
So, I mean, the obvious thing seems to be, OK, let’s wave our magic wands and rewrite everything in a memory-safe language. Obviously, things aren’t that simple. So what are the challenges with simply shifting languages, and how do you address large legacy code bases?

Christoph Kern (14:39)
Unless there is some breakthrough in, like, ML-based automated rewriting, I think we have to live with the assumption that the vast majority of C++ code in existence now will keep running as C++ until it reaches its natural end of life. And so, as we make this transition to memory safety, probably over a span of decades, really, we have to think about where we put our energy to get the best benefit for our investments in terms of increased security posture.

And so I think there’s a couple of areas where we can look at, right? So for instance, there’s some types of code that are particularly risky, it’s most likely very valuable to focus on those and replace them with a memory-safe implementation. So we might replace an image parser that’s written in C or C++ with one that’s written in Rust, for instance.

And then beyond that, if we have a large C++ code base that we can't feasibly rewrite and we can't just stop using because we need it, we'll have to look at incremental ways to improve its security posture. And there are some interesting approaches. For instance, I think we are somewhat confident that it's possible to achieve reasonable assurance of spatial safety in C++ through approaches like safe buffers, basically adding runtime checks.

For temporal safety, it's much more difficult. There are some ideas, you know, there's, like, some work in Chrome, for instance, using these wrapper types for pointers called MiraclePtr. There might be some hardware mechanisms, like MTE. And there are a lot of trade-offs between cost, performance impact and achievable security improvement that will probably take some time to shake out. But, you know, we'll get there at some point.

Omkhar Arasaratnam (16:23)
I’m glad to hear that the problem is at least tractable now. Moving over to the next part of our podcast Christoph we’re gonna go through a series of rapid-fire questions. Some of these will have one or two options, but the last option is always, “No Omkhar. Actually, I think it’s this.” So we’re gonna start off with one that’s quite binary, which is spicy or mild food.

Christoph Kern (16:45)
I don’t think it’s actually that binary. In the winter, when it’s cold out, I tend to gravitate to more like sort of, you know, savory  German type cooking. That’s my cultural background. And then in the summer, I’m more leaning towards the like zesty, more spicy flavors, you know, so it maybe varies throughout the year.

Omkhar Arasaratnam (17:02)
Interesting. For me, I tend to gravitate to spicy as a default, but then when the weather gets cooler, I find that spicy is an even higher priority for me, as it helps me feel a bit warm. OK, the next one's a bit of a controversial one, based on some previous guests: VI, VS Code or Emacs?

Christoph Kern (17:22)
For me, it really depends on what I'm working on. I'll use whatever code editor is best supported for the language. So for, say, Rust it might be VS Code, but then at Google we have our own thing that's supported by a central team. My muscle memory is definitely VI key bindings, but at the age of, like, 45 or something, I decided to finally learn Emacs so I could use org mode. I do use it with VI key bindings, though.

Omkhar Arasaratnam (17:48)
Excellent. Tabs or spaces?

Christoph Kern (17:51)
You know, I haven’t thought about that in a long time. I think many years ago, the language platform teams at Google basically decided that all code needs to be automatically formatted. And so basically, you do whatever the thing does and what’s built into the editors. And it never really occurs even as a question anymore.

Omkhar Arasaratnam (18:09)
Makes it easier. One less thing to worry about. To close it out, what advice do you have for somebody entering our field today, somebody that’s just graduating with their undergrad in comp sci or maybe just transitioning into an engineering field from another field that’s interested in tackling this problem of security?

Christoph Kern (18:28)
You know, maybe it’s actually a particularly good time to get into this field. I think I’ve been very fortunate to have worked in an organization that really does make security a priority. And so usually when you approach somebody you want to work with on improving security posture of a product, it’s rarely a question of whether or not this should be done at all.

You don’t have to justify your existence, right? It’s really usually questions about the engineering of exactly how to do it. And that’s a very nice place to be, right? You’re not constantly arguing to even be there, right? At the table. And I think maybe I’m a little hopeful that this is now changing for other organizations where that’s not so obvious, right? Like, you hear a lot more talk about security and security design.

I mean, the White House just put out a memo talking about memory safety and formal methods for security. I would have never believed this if you'd told me a couple of years ago, right? So I think it's becoming a more important and sort of obvious table-stakes part of the conversation. And so it might actually be a very interesting time to get into this space without having to sort of swim upstream the whole time.

Omkhar Arasaratnam (19:34)
We’ve talked about stubborn vulnerabilities, safe coding, memory safety. What is your call to action for our listeners, having absorbed all this new information?

Christoph Kern (19:44)
I think it is well past time to no longer put the burden on developers, and to really view these problems as systemic outcomes of the design of the application frameworks and production environments, the entire developer ecosystem. In the article we recently published, we put it this way: security posture is an emergent property of the entire developer ecosystem, and you can't actually change the outcome by not focusing on that and only blaming developers. It's not going to work.

Omkhar Arasaratnam (20:18)
Christoph, thank you for joining us, and it was a pleasure to have you. Look forward to speaking to you again soon.

Christoph Kern (20:23)
Thank you, yeah, it was a pleasure to be here.

Announcer (20:25)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

What’s in the SOSS? Podcast #1 – Vincent Danen and the Art of Vulnerability Management

By Podcast

Summary

In this episode, Omkhar talks to Vincent Danen, Vice President of Product Security at Red Hat, responsible for security and compliance activities for all Red Hat products and services. He’s also on the Governing Board of the OpenSSF. Vincent has been involved with open source and software security for over 20 years, leading security teams and participating in open source communities and development.

Conversation Highlights

  • 00:39 – Vincent shares his background in security and responsibilities at Red Hat
  • 03:36 – The importance of maintaining a sense of calm during security incidents
  • 05:18 – Omkhar and Vincent discuss their experiences learning about the infamous Heartbleed Bug
  • 09:05 – Vincent offers advice on how to address vulnerability management and the importance of trusting your vendors
  • 11:34 – Not every threat or vulnerability requires swift and immediate action
  • 12:46 – Pitfalls organizations should avoid in vulnerability management
  • 15:40 – Vincent answers Omkhar’s “rapid-fire” questions: mild vs. spicy food, text editor or choice and tabs vs. spaces
  • 16:32 – Advice Vincent would give to aspiring security professionals and the importance of being open-minded

Transcript

Vincent Danen soundbite (00:01)
I want somebody to come out and create a bug scanner. Go tell me all the bugs that are in the software that I have. Not the security issues but the bugs. Because that list is gonna be way longer. And I guarantee you that some of those bugs are far more impactful for you as a user than some of these security issues.

Omkhar Arasaratnam (00:17)
Welcome to What’s in the SOSS? I’m your host Omkar Arasaratnam and with me this week we have fellow Canadian Vincent Danen. Vincent, how are you doing my friend?

Vincent Danen (00:28)
Good, Omkhar. How are you?

Omkhar Arasaratnam (00:29)
I’m doing just dandy. So for our audience, I would love to do a quick intro. Why don’t you give them your name, title and what you do?

Vincent Danen (00:39)
Sure, so Vincent Danen, Vice President of Product Security at Red Hat. I just actually celebrated 15 years at Red Hat a month ago.

Omkhar Arasaratnam (00:47)
Congratulations.

Vincent Danen (00:49)
Thank you. Prior to that, I was at Mandriva for those long-time listeners who know the history of Linux. I was doing security work for them for about eight years. So I’ve been knee-deep in open source security for over 20 years now, and it just makes me feel old.

Omkhar Arasaratnam (01:04)
You’re, you’re an O.G. as the kids say, and let me let me drop some street cred: You know, I used to be a Red Hat certified engineer in Red Hat 7.2 And I didn’t say RHEL, I said Red Hat 7.2 because I’m an old guy, too.

Vincent Danen (01:20)
Yeah, well you got some street cred for sure.

Omkhar Arasaratnam (01:23)
That’s a really cool title. Sounds incredibly important. Can you give our listeners a bit of an overview as to, you know, being the person in charge of product security? What does that mean at Red Hat?

Vincent Danen (01:35)
Yeah, I mean, product security at Red Hat, that name kind of gives it away, right? It is about the security of our products. Our remit is effectively all of the proactive and reactive security concerns around our portfolio of products. So if you think about it, you'd mentioned RHEL, that's one. OpenShift, Ansible, Middleware, EAP, a ton of products. And of course, we like to support these things for a very long time, so multiple versions of the same product. So effectively, my team ingresses a lot of vulnerability information. New CVEs are discovered, and either they're reported directly to us, under embargo or not.

We get information from CVE and other reporters. You're familiar with the Linux distros mailing list, so we get information that way as well. So we're kind of ingressing all of these vulnerabilities. We triage them and determine whether and how they affect our products. And then we kind of go through the whole process of rating the vulnerability in terms of its severity and how it impacts the products.

And then we kind of just follow that through with engineering, who are going to fix these things and test them and release them out to our customers. We provide a ton of information about CVEs, because customers really like to know, "What does this thing do, and should I be sweating, or is this okay?" We also focus a lot on, say, our internal build pipelines, how we curate the open source, how we interact with upstream. We do a lot on the compliance front as well. So it's a very robust view of security, kind of from end to end, for all of our products.

Omkhar Arasaratnam (03:14)
That sounds like an incredibly broad scope. And at some point, you'll have to tell the listeners when you find time to sleep. It sounds like you're on all the time, like most of us are in cybersecurity.

Vincent Danen (03:25)
Yes, although I do sleep, and actually sleep pretty well. That's one of the benefits of having a fantastic team. I don't have to worry about everything; I have a great team to work with, and they do a lot of the heavy lifting.

Omkhar Arasaratnam (03:36)
That’s wonderful to hear. And I certainly get that. Back in the day, when I first started in cybersecurity, incident response was one of the things that I had. And an old manager of mine often said, whenever we have to deal with an incident, there should be a sense of urgency, but it shouldn’t be panic. And what I’m hearing from you is you’ve got a team that’s really set up to handle that sense of urgency properly without the panic that could be a negative force.

Vincent Danen (04:03)
It’s actually interesting that you mentioned that because one of our goals is to, particularly with a lot of these named vulnerabilities, so those have been a phenomenon for at least the last dozen years. Because Heartbleed actually just celebrated a 10-year anniversary, I think it was earlier this week or last week.

Omkhar Arasaratnam (04:19)
Yeah, I didn’t get a cake, but I remember.

Vincent Danen (04:22)
I didn’t get a cake either, but I do remember when it happened. I was hip-deep in that as well. But one of our goals is to maybe quell that sense of panic that our customers or other people in the industry have. So we really try to take a look at these vulnerabilities from the perspective of what does it actually do and do I need to be worried? And then convey that information as clearly and concisely as possible to our customers so that we’re not seeing undue panic.

I mean, there are certain things we should absolutely be panicking about, right? Like, these are things where if we produce a patch, I mean, we want you to apply it as quickly as possible. There is that sense of urgency. But when we’re looking and analyzing these things, I kind of think of it more akin to a firefighter. If you’re in the middle of a blaze trying to put that fire out and you’re panicking, you’re not going to be very effective, right? So we want to be as kind of calm, cool, collected, measured, as clear as possible.

Omkhar Arasaratnam (05:18)
The analog that I often use to describe that same concept: a neighbor of mine is a paramedic, and one of the things he pointed out to me was, you'll notice that paramedics never run at an accident scene. I mean, they certainly move with urgency, but they don't run, because they don't want to cause more harm by running into the proverbial scene rather than acting in a stoic and measured manner. Of course, we see that on TV all the time, but TV is not reality.

I do want to come back to the Heartbleed thing for just a moment. It’s said that when you look back on your life, there are certain key moments that everybody remembers. And for those that were maybe the generation prior to us, it was the JFK assassination. For our generation to betray our age to the viewers — or the listeners — it was probably the Challenger explosion. It was probably, you know, 9/11.

I have that indelible kind of memory of Heartbleed, and the reason is that I have very poor discipline when it comes to turning off work. I wish I had better discipline. My wife also wishes I had better discipline. But ten years ago, I had promised my wife and kids we were going to go to Hawaii for the first time. We were in Maui, and this was still back in the days when everybody had a BlackBerry. I left my BlackBerry at home, turned off, and we were on the beach in Maui. I came back in, turned on the TV, and I was like, “Oh boy, what a day to be disconnected from work.” What was your experience?

Vincent Danen (07:04)
First, I’ll say you were one of the lucky ones to be disconnected from it.

Omkhar Arasaratnam (07:07)
By total coincidence.

Vincent Danen (07:10)
Yeah, yeah. No, there are two things that stick out for me the most. One is that our Red Hat Summit was about a month later, and Heartbleed was all anyone wanted to talk about. And that's not what I was there for, right? So that was interesting, and it kind of sticks in my head. The other one was, I actually remember my mother phoning me, and she's completely, sorry Mom if you hear this, completely computer illiterate, right? I have to go to her house to help her fix the remote because she did something to the TV, and it's literally one button, right? But she phones me, and she's like, “Hey, I heard about this computer thing on the radio.”

And I was like, “What are you even talking about?” First, you picked up on this and knew it was somehow relevant to me, which was shocking. And secondly, it was on the radio. Up to that point, I had never heard of a security issue in software getting that kind of airplay on the local news. This thing was really noisy.

Omkhar Arasaratnam (08:05)
My family's non-technical analog is my dad, and again, apologies to my dad if he hears this. My dad likes to send me articles about, you know, the latest scam that's out there, don't get title-scammed out of your house, stuff like that, plus the odd meme, and not much else. But when Dad starts sending me stuff that's like, “Hey, do you know about this?” then it really puts things in perspective as to how this affects society.

So I think the conclusion I'm drawing from all this is that vulnerability management is hard to do properly. And being able to filter signal from noise and get down to something that's actually actionable shouldn't depend on whether Vincent's mom hears it on the radio or Omkhar's dad finds it on a news website. What are some key considerations for our listeners? What should they be thinking about when they start thinking about vulnerability management?

Vincent Danen (09:05)
That’s a great question because it’s something I think about a lot. I actually talk about it a lot as well. The caveat here being I work for Red Hat, and so this is my day job, right? And so I deal with a lot of customers who have a lot of questions, particularly about this topic, right? So the first thing that I would say is you have to know your vendor before you pick them. There’s a fundamental trust factor that comes into play with your vendor. And I’m not even talking just from a security perspective, right? Like, you have to be able to trust the software that you’re using or the vendor who puts it out, right? And there’s a couple of reasons for that.

Vendors typically will assess a vulnerability themselves, right? I know we have things like NVD and OSV and other kinds of CVE aggregation systems, but a vendor typically rates the severity of a vulnerability in terms of their product. I've heard in the past, I haven't heard it recently, but somebody actually accused me of lowballing a vulnerability because I didn't want to have to fix it. I was like, well, that's really weird. You trust me to run your workloads, to do all this work that you're doing, to build value in your business, right? To run your platforms and whatnot. But you're not going to trust me when I say that this vulnerability doesn't matter for these particular reasons? Which is a little weird. You trust me for one thing, but you don't trust me for the other.

Omkhar Arasaratnam (10:20)
It is strange.

Vincent Danen (10:22)
So, I mean, there is a trust relationship with your vendor, and I think that extends to when they say something is impactful or not; you have to believe that, right? And it's really important, because I was looking at, I think it was a GRUB vulnerability, a couple of months ago.

Omkhar Arasaratnam (10:39)
The bootloader?

Vincent Danen (10:41)
Yeah, the bootloader. And when I was looking at the CVSS ratings for that GRUB vulnerability, we had it rated one way. I think Debian had it rated a different way. SUSE rated it the same. F5 rated it really low, right? In the context of their environment and how accessible it is in their devices. So, I mean, they rated it in the context of the way that they use it and the environment around it. And that's typically what vendors do. So I wouldn't sit there and say, “Yeah, go look at how Debian rates things, and that's exactly how it works for Red Hat.” Because it's not true.

Omkhar Arasaratnam (11:16)
And presumably there may be, I mean, there could be mitigations in your build chain that you include. It could be, to your point, is this an appliance? Is this a vulnerability that's materially accessible remotely, or something of that nature, based on your usage?

Vincent Danen (11:34)
A hundred percent. RHEL being an operating system, you can do whatever you want with it; we don't know. OpenShift is more of an appliance platform: it's built a very specific way, and there's a limited amount you can do with it in terms of messing around with the different components. The same component might be present in both. In RHEL, I can use it however I want: as part of my own application, on the system, whatever. In OpenShift, that might be one very specific piece of plumbing with one very specific use, where either the vulnerable code isn't being used or there's literally no way for a user or an attacker to access it. So the fact that the vulnerability is there, okay, yes, technically it's there, but in any possible use of OpenShift, it's not going to be material. You'd have to break OpenShift really, really badly in order to even access it, and then you've got bigger problems.
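To see the mechanics behind those differing ratings, here is a minimal sketch using the open source cvss Python package (pip install cvss). The vectors are hypothetical, chosen only to show how CVSS environmental metrics let a vendor re-score the same flaw for its own context; they are not the metrics of the actual GRUB CVE:

```python
# Hypothetical vectors for illustration; not the real GRUB CVE's metrics.
from cvss import CVSS3  # open source "cvss" package: pip install cvss

# A generic base vector a scorer might publish for a local flaw.
base = CVSS3("CVSS:3.1/AV:L/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H")

# An appliance vendor re-scores with environmental metrics: in its
# product, the component is only reachable with physical access (MAV:P)
# and the attack is harder to pull off there (MAC:H).
appliance = CVSS3("CVSS:3.1/AV:L/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H/MAV:P/MAC:H")

# scores() returns (base, temporal, environmental).
print("base score:         ", base.scores()[0])       # 6.7
print("environmental score:", appliance.scores()[2])  # lower, reflecting context
```

Same flaw, same base metrics, but the environmental score drops once the vendor's deployment context is taken into account, which is exactly why the ratings from Red Hat, Debian, SUSE, and F5 can all legitimately differ.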

Omkhar Arasaratnam (12:30)
Absolutely. So the notion of reachability or exploitability is obviously key and a huge part of how people should be triaging these vulnerabilities as they do come up. What are some other pitfalls that people should avoid in vulnerability management?

Vincent Danen (12:46)
Well, I think one of them is just the notion, as we were discussing here, that every vulnerability matters. Most of them don't. So don't sweat the fact that your scanner is turning up a bunch of low, medium, or moderate vulnerabilities. That's probably fine, right?

I would worry more about the critical and important (or high) vulnerabilities it's showing, because those are the ones that are more likely to be exploited and more likely to be damaging if they are. Interestingly enough, Red Hat produces a risk report on an annual basis. Last year, out of the roughly 1,600 vulnerabilities that impacted us, only 1.2% were actually known to be exploited; that works out to only about 19 vulnerabilities. The prior year it was 0.4%. Now, the majority of those were critical and important vulnerabilities, and there was a handful at the moderate level, I think three.

So think about it: about a thousand moderates, and two of them are exploited. Why are we panicking over the other 998 that are effectively immaterial and not actually being used? Now, a little plug for Red Hat here: when we find out that something is being exploited, that raises it to the level of, “OK, this is actually an issue,” and if we hadn't fixed it already, we're going to fix it. So we'll always proactively do the criticals and the importants, because it could be any one of those that gets exploited and causes damage.

But we're not worrying about all of them. Frankly, I actually had this thought the other day, because I hear a lot about these vulnerability scanners, right? They're very noisy, sometimes they're not very accurate, and they show a lot of things. I want somebody to come out and create a bug scanner. Go tell me all the bugs that are in the software that I have. Not the security issues, the bugs. Because that list is gonna be way longer. And I guarantee you that some of those bugs are far more impactful for you as a user than some of these security issues, particularly the low vulnerabilities.
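One way to put that advice into practice is to filter scanner output down to what actually deserves urgency. Here is a minimal sketch; the scanner export format (a JSON list with "cve" and "severity" fields) and the file of known-exploited CVE IDs (for example, one distilled from the CISA Known Exploited Vulnerabilities catalog) are assumptions for illustration:

```python
# Minimal triage filter: keep findings that are high severity or known
# to be exploited. The input formats here are assumptions, not any
# particular scanner's real schema.
import json

PRIORITY_SEVERITIES = {"critical", "important", "high"}


def load_known_exploited(path: str) -> set[str]:
    """One CVE ID per line, e.g. CVE-2021-44228."""
    with open(path) as fh:
        return {line.strip() for line in fh if line.strip()}


def triage(findings: list[dict], exploited: set[str]) -> list[dict]:
    """Surface only the findings worth acting on now."""
    return [
        finding for finding in findings
        if finding["severity"].lower() in PRIORITY_SEVERITIES
        or finding["cve"] in exploited
    ]


if __name__ == "__main__":
    exploited = load_known_exploited("known_exploited.txt")   # hypothetical path
    with open("scanner_findings.json") as fh:                 # hypothetical path
        findings = json.load(fh)
    for finding in triage(findings, exploited):
        print(finding["cve"], finding["severity"])
```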

Omkhar Arasaratnam (14:44)
Absolutely. I mean, the security properties of a program are essentially an aspect of quality, and looking at them holistically alongside all other quality issues is an interesting view. One of the ways I've described this in the past is that security is an infinite problem space, and if you don't have a way of reasoning about what's actually important, you're going to be chasing down rabbit holes forever and a day. Some of the work we're doing in the Security Toolbelt group within the OpenSSF is around doing this kind of threat modeling and risk assessment to really pick out, “Look, OK, in the fullness of time we should address all the things, but what do I need to address now? And how do I need to address it?”

Vincent, with all that said, I think we're going to jump into the rapid-fire round. Are you ready?

Vincent Danen (15:39)
Absolutely.

Omkhar Arasaratnam (15:40)
All right. Spicy or mild food?

Vincent Danen (15:44)
Mild. Although my mother likes spicy food, and I think that turned me off as a youngster. I’m starting to get back into handling a little bit of heat.

Omkhar Arasaratnam (15:51)
I’d like to be the Sherpa on your journey.

Vincent Danen (15:54)
Thank you.

Omkhar Arasaratnam (15:55)
Text editor of choice: Vim, VS Code, Emacs or other. That’s an option as well.

Vincent Danen (16:01)
Vim.

Omkhar Arasaratnam (16:03)
Yes! All right. You know, you slipped on the spicy food, but you redeemed yourself on the text editor. This next one is incredibly influential: tabs or spaces?

Vincent Danen (16:14)
I’m a spaces guy.

Omkhar Arasaratnam (16:15)
Yes! All right. We will continue to be good friends, Vincent.

Vincent Danen (16:20)
Awesome!

Omkhar Arasaratnam (16:21)
In closing out, thank you so much for all your great advice. But for somebody who's entering our field today, what would you tell them? What sage wisdom would you impart?

Vincent Danen (16:32)
Probably two things. One, as you and I are both aware, I’ve been here for a long time. It’s very easy to be burnt out and stressed out and everything else by this work. Not to take anything away from the fantastic firefighters and paramedics and everything else, but it feels a lot like first responder-type work. So I say, take care of yourself first. If you don’t take care of yourself, you’re no good to anybody else. And we’re here to be good to other people, right?

And then the other part I would say, which I think is actually really, really important, is for people to stay curious. Right? If we think about this XZ backdoor that we just had recently, it was curiosity that found it. At the end of the day, that's what it was: this thing is a little bit weird, and I don't understand it, so I'm gonna go digging. We have to be curious. I don't really care how you build it, I wanna know how you break it. Right? And I think that's a very important mindset for security people, so being curious is super important.

Omkhar Arasaratnam (17:25)
That’s some great advice. Last but not least, what’s your call to action for our audience?

Vincent Danen (17:32)
Be open-minded. Find a good, reputable vendor to enable you on your — I hate the term digital transformation — but your digital transformation journey, right? Find a reputable vendor to work with there, and then trust them. There are a lot of great software vendors out there, and a lot of great open source communities, projects, et cetera, who are earnestly trying to do the right thing for those around them. I think that should inspire trust, and it has earned trust. We have to trust the people we work with.

Omkhar Arasaratnam (18:02)
Vincent, thank you so much for being generous with your time. Be safe, and thank you so much for coming on What’s in the SOSS?

Vincent Danen (18:10)
Thanks, Omkhar.

Announcer (18:11)
Thank you for listening to What’s in the SOSS? An OpenSSF podcast. Be sure to subscribe to our series of conversations on Spotify, Apple, Amazon or wherever you get your podcasts. And to keep up to date on the Open Source Security Foundation community, join us online at OpenSSF.org/getinvolved. We’ll talk to you next time on What’s in the SOSS?

OpenSSF Newsletter: January 2024

By Newsletter

Open Source Security Foundation (OpenSSF) – Who We Are

The OpenSSF is a diverse global community dedicated to making the world a better place through open source software. Join us in enhancing the security of open source, and together, let’s create a safer world. Check out our new video!

OpenSSF Election Results for Technical Advisory Council and Representatives to the Governing Board

We are thrilled to kick off 2024 by announcing the OpenSSF representatives to the Governing Board and the establishment of a new and expanded Technical Advisory Council elected by the community. Congratulations, and we look forward to a great year ahead!
