What’s in the SOSS? Podcast #53 – S3E5 AIxCC Part 3 – Buttercup’s Hybrid Approach: Trail of Bits’ Journey to Second Place in AIxCC

Summary

In the third episode of our AI Cyber Challenge (AIxCC) series, CRob sits down with Michael Brown, Principal Security Engineer at Trail of Bits, to discuss their runner-up cyber reasoning system, Buttercup. Michael shares how their team took a hybrid approach – combining large language models with conventional software analysis tools like fuzzers – to create a system that exceeded even their own expectations. Learn how Trail of Bits made Buttercup fully open source and accessible enough to run on a laptop, hear about their commitment to ongoing maintenance funded by their prize winnings, and find out why they believe AI works best when applied to small, focused problems rather than trying to solve everything at once.

This episode is part 3 of a four-part series on AIxCC.

Conversation Highlights

00:04 – Introduction & Welcome
00:12 – About Trail of Bits & Open Source Commitment
03:16 – Buttercup: Second Place in AIxCC
04:20 – The Hybrid Approach Strategy
06:45 – From Skeptic to Believer
09:28 – Surprises & Vindication During Competition
11:36 – Multi-Agent Patching Success
14:46 – Post-Competition Plans
15:26 – Making Buttercup Run on a Laptop
18:22 – The Giant Check & DEF CON
18:59 – How to Access Buttercup on GitHub
21:37 – Enterprise Deployment & Community Support
22:23 – Closing Remarks

Transcript

CRob (00:04.328)
And next up, we’re talking to Michael Brown from Trail of Bits. Michael, welcome to What’s in the SOSS.

Michael Brown (ToB) (00:10.688)
Hey, thanks for having me. I appreciate being here.

CRob (00:12.7)
We love having you. So maybe could you describe a little bit about your organization you’re coming from, Trail of Bits, and maybe share a little insight into what your open source origin story is.

Michael Brown (ToB) (00:23.756)
Yeah, sure. So Trail of Bits is a small business. We're a security R&D firm. We've been in existence since about 2012, and I've personally been with the company about four years plus. I work there within our research and engineering department. I'm a principal security engineer, and I also lead our AI/ML security research team. At Trail of Bits, we do quite a bit of government research, and we also work for commercial clients.

And one of the common threads in all of the work that we do, not just government, not just commercial, is that we try to make it as public as we possibly can. So, for example, sometimes we work on sensitive research programs for the government and they don't let us make it public. Sometimes our commercial clients don't want to publicize the results of every security audit. But to the maximum extent that our clients allow us to, we make our tools and our findings open source. And we're really big believers in

the idea that the work we do should be a rising tide that raises all ships when it comes to the security posture of the critical infrastructure that we all depend upon, whether we're working on hobbies at home or building things for large organizations, all that stuff.

CRob (01:37.32)
I love it. And how did you get into open source?

Michael Brown (ToB) (01:42.146)
Honestly, I've just kind of always been there. Realistically, the open source community is where a lot of the research tools that I started my research career with came from. That's where you found them. So I started off a bit in academia. I got my undergrad in computer science and then went and did something completely different for eight years. And then when I kind of

You know, for context, I joined the military. I flew helicopters for like eight years and did basically nothing in computing. But as I was starting to get out of the Army, I was getting married, about to have kids, and I kind of decided I wanted to be around the house a little bit more often. So I started getting a master's degree at Georgia Tech, since they were offering it online. And then after I did that, I went to do a PhD there and also work for their applied research arm, the Georgia Tech Research Institute.

So a lot of the work that I was doing was cutting-edge work on software analysis, compilers, and AI/ML. And a lot of the tools that I built and did my research on came from the open source community. They were tools that were open sourced as part of the publication process for academic work, or made publicly available and open source by companies like Trail of Bits, before I came to work with them, as the result of government research projects.

So, honestly, I guess I don’t really have much of an origin story for when I got there. I kind of just landed there when I started my career in security research and just stayed.

CRob (03:16.814)
Everybody has a different journey that gets us here. And interestingly enough, you mentioned our friends at Georgia Tech, which was a peer competitor of yours in the AIxCC competition. You all on the Trail of Bits team, with your project, which I believe was called Buttercup, came in second place. You had some amazing results with your work. So maybe could you tell us a little bit about the…

Michael Brown (ToB) (03:33.741)
Yeah, that’s correct.

CRob (03:43.15)
what you did as part of the AIxCC competition and kind of how your team approached this.

Michael Brown (ToB) (03:51.022)
Yeah. So, at the risk of sounding a bit like a hipster, I've been working at the intersection of software security, compilers, software analysis, and AI/ML for basically almost my entire career as a research scientist, dating back to the earliest program I worked on for DARPA back in 2019. And this was before the large language model was the predominant form of the technology, or kind of became synonymous with AI. So.

CRob (04:04.719)
Mm.

Michael Brown (ToB) (04:20.792)
For a long time, I've been working and trying to understand how we can apply techniques from AI/ML modeling to security problems, and doing the problem formulation to make sure that we're applying them in an intelligent way where we're going to get good, solid results that actually generalize and scale. So as large language models came out, we started recognizing that certain problems within the security domain are good for large language models, but a lot of them aren't.

When the AI Cyber Challenge came around (I was the lead designer, along with my co-designer, Ian Smith), when we sat down and made the original concept for what became Buttercup, we always took an approach where we were going to use the best problem-solving technique for the subproblem at hand. So when we approached this giant elephant of a problem, we did what you do when you have an elephant and you've got to eat it: you eat it one bite at a time.

So for each bite, we took a look at it and said, okay, we have like these five or six things that we have to do really, really well to win this competition. What's the best way to solve each of these five or six things? And then the rest of it became an engineering challenge to chain them together. Our approach is very much a hybrid approach. This was similar to the approach taken by the first-place winners at Georgia Tech, which, by the way, if you've got to be beat by anybody, being beat by your alma mater takes a little bit of the sting out of it. So, you know, we came in first and second place. It's funny, I actually have another Georgia Tech PhD alum

CRob (05:33.832)
You

Michael Brown (ToB) (05:42.926)
on my team who worked on Buttercup. So Georgia Tech is very well represented in the AI cyber challenge. So yeah, we’ve always had a hybrid approach. The winning team had a hybrid approach. So we used AI where it was useful. We used conventional software analysis techniques where they were useful. And we put together something that ultimately performed really, really well and exceeded even my expectations.

CRob (05:45.458)
That’s awesome.

CRob (06:07.56)
As I've mentioned in previous talks, I was initially skeptical about the value that could be derived from this type of work. But the results that you and the other competitors delivered were absolutely stunning. You have converted me into a believer now; I think AI absolutely has a very positive role it can play, both in research and also in kind of the vulnerability and operations management space. Looking at Buttercup, what is unique about your approach with your cyber reasoning system?

Michael Brown (ToB) (06:45.39)
Yeah, so it's funny you say that we converted you. I kind of had to convert myself along the way. There was a time in this competition where I thought this whole thing was going to be too reliant on AI and was going to fall on its face. And at that point, I'd be able to say, see, I told you, you can't use LLMs for everything. But it turns out, as we got through it, we used LLMs for two critical areas and they worked much better than I thought they would. I thought they would work pretty well, but they ended up working to a much better degree than I expected. So, you know, what makes Buttercup unique

CRob (06:49.852)
Yeah.

CRob (07:00.678)
You

Michael Brown (ToB) (07:15.69)
is that, like I said, we take a hybrid approach. We use AI/ML for problems that are well suited for AI/ML. And what I mean by that is, when we employ large language models, we use them on small subproblems for which we have a lot of context. We have tools that we can install for the large language model to use to ensure that it creates valid outputs, outputs that can carry on to the next stage with a high degree of confidence that they're correct.
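
As a rough illustration of the "validate model output with tools" idea Michael describes, here is a minimal Python sketch. The `llm_complete` callable is a placeholder for whatever chat-completion client you use, not part of Buttercup, and the parser check is just one example of a validating tool.

```python
import ast

def validated_python_snippet(prompt: str, llm_complete, max_retries: int = 3):
    """Ask an LLM for a small piece of Python and only accept output that parses.

    `llm_complete(prompt) -> str` is a stand-in for any model client. The parser
    is the "tool" here; a real pipeline might instead compile the code, run
    tests, or replay a crashing input before letting the output move on.
    """
    for _ in range(max_retries):
        candidate = llm_complete(prompt)
        try:
            ast.parse(candidate)  # tool-based check that the output is valid Python
            return candidate
        except SyntaxError as err:
            # Feed the error back so the next attempt can correct itself.
            prompt += f"\n\nYour previous answer did not parse ({err}). Please fix it."
    return None
```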

CRob (07:30.076)
Mm-hmm.

CRob (07:43.912)
Mm-hmm.

Michael Brown (ToB) (07:45.934)
And then in the places where we have to use conventional software analysis tools, those areas are very amenable to conventional analysis. A good example of what I mean is that we needed to produce a proof of vulnerability. We have to have a crashing test case to show that when we claim a vulnerability exists in a system, we can prove through reproduction that it actually exists. Large language models aren't great

at finding these crashing test cases just by being asked to look at the code and say, hey, what's going to crash this? They don't do very well at that. They also don't do well at generating an input that will even get you to a particular point in a program. But fuzzers do a great job of this. So we use the fuzzer to do this. But one of the things about fuzzers is they kind of take a long time. They're also more generally aimed at finding bugs, not necessarily vulnerabilities.
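
For context, a proof of vulnerability in this setting is essentially a crashing input found by a fuzz harness. A minimal sketch of such a harness, using the Atheris Python fuzzer against a stand-in JSON-parsing target (not one of the competition targets), might look like this:

```python
import sys
import atheris

with atheris.instrument_imports():
    import json  # stand-in target; a real harness would wrap project code

def TestOneInput(data: bytes) -> None:
    # Atheris mutates `data`; any uncaught exception is reported as a crash and
    # the offending input is saved to disk, which is the "crashing test case".
    fdp = atheris.FuzzedDataProvider(data)
    text = fdp.ConsumeUnicodeNoSurrogates(1024)
    try:
        json.loads(text)
    except json.JSONDecodeError:
        pass  # expected parse failures are not bugs

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```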

CRob (08:36.808)
Mm-hmm.

Michael Brown (ToB) (08:42.702)
So we used an AI/ML, or large language model based, accelerator, a seed generator, to help us generate inputs that would guide the fuzzer to saturate the fuzzing harnesses that existed for these programs more quickly. They would help us find and shake loose more crashing inputs that correspond to vulnerabilities as opposed to bugs. And those things really, really helped us deal with some of the short analysis and short

processing windows that we encountered in the AI Cyber Challenge. So it was really a matter of using conventional tools but making them work better with AI, or using AI for problems like generating software patches, for which there really aren't great conventional software analysis tools.
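
One way to picture the accelerator Michael describes is an LLM that proposes seed inputs for an existing harness. Here is a hedged sketch, where `llm_complete` again stands in for an arbitrary model client; nothing here is Buttercup's actual code.

```python
import json
import pathlib

def llm_seed_corpus(harness_source: str, corpus_dir: str, llm_complete) -> int:
    """Ask an LLM for plausible inputs for a fuzz harness and write them as seeds.

    The fuzzer (libFuzzer, AFL++, etc.) then starts from these seeds instead of
    an empty corpus, which is the acceleration effect described above.
    """
    prompt = (
        "Here is a fuzz harness. Reply with a JSON list of 10 short example "
        "inputs (strings) that exercise the input format it expects.\n\n"
        + harness_source
    )
    try:
        seeds = json.loads(llm_complete(prompt))  # validate before trusting the model
    except json.JSONDecodeError:
        return 0
    out = pathlib.Path(corpus_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    for i, seed in enumerate(s for s in seeds if isinstance(s, str)):
        (out / f"llm_seed_{i:03d}").write_text(seed)
        written += 1
    return written
```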

CRob (09:28.018)
So as you were going through the competition, which went through multiple rounds, was there anything that surprised you or that you learned? Again, you said your opinion changed on using AI. What were maybe some of the moments that generated that?

Michael Brown (ToB) (09:45.226)
Yeah, so I mean there were a couple of them. I'll start with one where I can pat myself on the back, and I'll finish with one where I was kind of surprised. So first, we had a couple of moments that were really kind of vindicating as we went through this. Our opinion going into this was that with large language models, you couldn't just throw the whole problem at them and expect them to be successful. So going into this, there were a lot of things that we did

CRob (09:49.405)
hehe

Michael Brown (ToB) (10:14.774)
two years ago when we first started out, and, you know, two years ago is like five lifetimes when it comes to the development of AI systems now. So there were some things that we did that didn't exist before and that became industry standard by the time we finished the competition. Things like putting your LLM queries or your LLM prompts in a workflow that includes validation with tools or the ability to use tools.

CRob (10:29.298)
Mm-hmm.

Michael Brown (ToB) (10:43.062)
That was something that was not mainstream when we first started out, but it was something that we kind of built custom into Buttercup, particularly when it came to patching. And then also using a multi-agent approach. You know, a lot of the hype around AI is that you just ask it anything and it gives you the answer. We're asking a lot of AI when we say, here's a program, tell me what vulnerabilities exist, prove they exist, and then fix them for me.

And also, don't make a mistake anywhere along the way. It's way too much to ask. We found that particularly with patching. Back then, multi-agent systems, or even agentic systems or multi-agentic systems, were unheard of. We were still using ChatGPT 3.5, still very much chatbot interactions, web browser interactions.

CRob (11:16.564)
Yeah.

Michael Brown (ToB) (11:36.438)
Integration into tools was certainly less widespread. So we had seen some very early work on arXiv about solving complex problems with multiple agents, breaking the problem down for the model. We used this, and our patcher ended up being incredibly good. It was our most important and our biggest success on the project. I really want to shout out Ricardo Charon, who's the lead developer for our patching agent,

CRob (11:47.976)
Mm-hmm.

Michael Brown (ToB) (12:06.414)
or rather for our patching system, for both the semifinals and finals in AIxCC. He did an incredible job, and we really built something that, like I said, I regard as our biggest success. So sure enough, as we went through this two-year competition, now all of a sudden multi-agentic systems, multi-agentic tool-enabled systems, are all the rage. This is how we're solving these challenging problems. And a lot of this problem breakdown stuff has made its way, baked in, into the models now, the newer thinking and reasoning models from

Anthropic and OpenAI, respectively. You can give them these large, complicated problems, and they will first try to break them down before trying to solve them. So, you know, we were building all that stuff into our system before it came about. That's an area where, like I said, we learned along the way that we had the right approach from the beginning, and it's really easy to go back and say that what we learned was that we were right. On the other side of this, I'll reiterate, I was really surprised at how well

CRob (12:53.639)
Mm-hmm.

Michael Brown (ToB) (13:04.11)
language models were able to do some of the tasks we asked them to do. Part of it is how we approached the problem. We didn't ask too much of them, and I think that's part of the reason why the large language models were successful. An area that I thought was going to be much more challenging was patching. But it turned out to be an area where, to a certain degree, this is kind of an easier version of the problem in general, because open source software, which was the target of the AI Cyber Challenge, is ingested into the training

CRob (13:08.924)
Mm.

Michael Brown (ToB) (13:31.404)
data for all of these large language models. So the models do have some a priori familiarity with the targets. When we give one a chunk of vulnerable code from a given program, it's not the first time it has seen the code. But still, they did an amazing job actually generating useful patches. The patch rate that I personally expected to see was much lower than the actual patch rate that we had, both in the semifinals and in the finals. So even in that first-year window,

CRob (13:33.64)
Mm.

Michael Brown (ToB) (13:58.63)
I was really kind of blown away with how well the models were doing at code generation tasks, particularly small, focused code generation tasks. So I think large language models are kind of getting a bad rap right now when it comes to, you know, trying to vibe code entire applications. People are like, gosh, this code is slop. It's terrible. It's full of bugs. Well, you did also ask it to build the whole thing. You know, if I asked a junior developer to build the whole thing, they'd probably also put together some

CRob (14:07.366)
Yeah.

CRob (14:17.233)
Yeah.

CRob (14:26.258)
Yeah.

Michael Brown (ToB) (14:26.71)
and gross stuff. But when I ask a junior developer to give me a bug fix, much like the large language model, when I ask them for a more constrained version of the problem, they tend to do a better job because there are just fewer moving parts. So yeah, those are the two things I took away: one that, like I said, I get to pat myself on the back for, and another that was actually surprising.
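
A very rough sketch of the "small, constrained problem plus validation" patching loop described above might look like the following. The `propose_patch` agent, the `make` build step, and the `./run_pov` reproducer script are all placeholders for illustration, not Buttercup's actual components.

```python
import subprocess

def patch_until_fixed(vuln_report: str, crash_input: str, propose_patch,
                      max_attempts: int = 5):
    """Ask a patching agent for a diff, then validate it with real tools.

    `propose_patch(report, feedback) -> str` stands in for the LLM agent and
    returns a unified diff. `./run_pov` is assumed to exit non-zero while the
    crashing input still reproduces.
    """
    feedback = ""
    for _ in range(max_attempts):
        diff = propose_patch(vuln_report, feedback)
        check = subprocess.run(["git", "apply", "--check", "-"],
                               input=diff, text=True, capture_output=True)
        if check.returncode != 0:
            feedback = "Patch did not apply:\n" + check.stderr
            continue
        subprocess.run(["git", "apply", "-"], input=diff, text=True, check=True)
        build = subprocess.run(["make"], capture_output=True, text=True)
        if build.returncode != 0:
            subprocess.run(["git", "checkout", "--", "."])
            feedback = "Build failed:\n" + build.stderr[-2000:]
            continue
        repro = subprocess.run(["./run_pov", crash_input])
        if repro.returncode == 0:
            return diff  # crash no longer reproduces
        subprocess.run(["git", "checkout", "--", "."])
        feedback = "The crashing input still reproduces after your patch."
    return None
```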

CRob (14:46.012)
That's awesome. That's amazing. So now that the competition is over, what does the team plan to do next beyond this competition?

Michael Brown (ToB) (14:57.098)
Yeah, so I mean, look, we spent a lot of our time on this over the last two years. I wouldn't quite say blood, I don't think anyone bled over this, but we certainly had some tears, and we certainly had a lot of anxiety. We put a lot of ourselves into Buttercup, and so we want people to use it. To that end, Buttercup is fully available and fully open source. DARPA made it a condition of participating in the competition that

CRob (15:09.917)
Mm-hmm.

Michael Brown (ToB) (15:26.892)
you had to make the code that you submitted to the semifinals and the finals open source. So we did that, along with all of our other competitors, but we actually took it one step further. The code that we submitted to the finals is great. It's awesome, but it runs at scale. It used $40,000 of a $130,000, I think, total budget. And it ran across an Azure subscription that had multiple nodes and

countless replicated containers. This is not something that everyone can use, and we want everyone to use it. So in the month after we submitted our final version of the CRS, but before DEF CON occurred, where we found out that we won, we spent a month making a version of Buttercup that's decoupled from DARPA's competition infrastructure. It runs entirely standalone on its own, but more importantly, we scaled it down so it'll run on a laptop.

CRob (16:18.696)
Mm-hmm.

Michael Brown (ToB) (16:25.154)
We left all of the hooks, all of the infrastructure, to scale it back up if you want. So the idea now is that if you go to trailofbits.com/buttercup, you can learn about the tool. We have links to our GitHub repositories where it's available, and you can go download Buttercup on your laptop and run it right now. And if you've got an API key that'll let you spend a hundred dollars, we can run a demo to show you that we can find and patch a vulnerability live.

CRob (16:51.496)
That’s easy.

Michael Brown (ToB) (16:53.164)
Yeah, so anyone can do this right now. And if you're an organization that wants to use Buttercup, you can also use the hooks that we left in to scale it up to the size of your organization and the budget that you have, and run it at scale on your own software targets. So even for users beyond the open source community, we want this to be used on closed source code too. So yeah, you asked what we're going to do with it afterward: we made it open source, and we want people to use it.

And on top of that, we don't want it to bit-rot. So we're actually going to retain a pretty significant portion of our winnings from our $3 million prize, and we're going to use it for ongoing maintenance. So we're maintaining it. We've had people submit PRs that we've accepted. They're tiny, you know, it's only been out for like a month, but we've also made quite a few updates to the public version of Buttercup since. So it's actively maintained.

There's money behind it; the company is putting its money where its mouth is. We're actively maintaining it. The people who built it are among the people maintaining it. We are taking contributions from the community, and we hope they help us maintain it as well. And yeah, we've made it so anyone can use it. I think we've taken it about as far as we can possibly go in terms of reducing the barriers to adoption to the absolute minimum for people to use Buttercup and leverage AI to help them find and patch vulnerabilities at scale.

CRob (18:16.716)
I love that approach. Thank you for doing that. How did you fit the giant check through the teller window?

Michael Brown (ToB) (18:22.574)
Fortunately, that check was a novelty, and we did not actually end up with an even larger problem than AIxCC itself to solve afterward, which would have been getting paid. So yeah, we did have the comically large check, you know, taped up in our booth at the AIxCC village at DEF CON, and it certainly attracted quite a few photographs from passersby.

CRob (18:26.716)
Ha ha ha!

CRob (18:31.964)
Yeah.

CRob (18:37.864)
Mm-hmm.

Michael Brown (ToB) (18:47.736)
I don't know. I think if you get on social media or whatever and you look up AIxCC, there are probably lots of pictures of me with a big smile and two thumbs up underneath the check that random people took.

CRob (18:59.464)
So you mentioned that Buttercup is all open source now. So if someone was interested in checking it out or possibly even contributing, where would they go do that?

Michael Brown (ToB) (19:07.564)
Yeah, so we have a GitHub organization; we're Trail of Bits. You can find Buttercup there. You can also find our public archives of the old versions of Buttercup. So if you're interested in the code that actually competed, you can see what got us from the semifinals to the finals, you can see what won us second place in the finals, and you can also download and use the version that's actively maintained and will run on your laptop. All three of them are there. The repository name is just Buttercup.

We are not the only people who love The Princess Bride, so there are other repositories named Buttercup on GitHub. You might have to sift a little bit, but basically it's github.com/trailofbits/buttercup; I think that's like 85% of the URL there. I don't have it memorized, but you can find it publicly available, along with a lot of other tools that Trail of Bits has made over the years. So we encourage you to check some of those out as well. A lot of those are still actively maintained and

CRob (19:39.036)
That’s what it was.

Michael Brown (ToB) (20:03.72)
have a lot of community support. Believe it or not, at last count, Buttercup had something like 1,250 stars, and it is only like our fifth most popular tool that Trail of Bits has created. You know, we were quite notable for creating some binary lifting tools that are up there. We also have some other tools that we've created recently for parser security and analysis like that, like Graphtage.

And then some more conventional security tools, like Algo VPN, still rank above Buttercup. So as awesome as Buttercup is, it's only like the fifth coolest tool that we've made, as voted on by the community. So check out the other stuff while you're there too. Believe it or not, Buttercup isn't our most popular offering.

CRob (20:51.56)
Pretty awesome statement to be able to say. That’s only our fifth most important tool.

Michael Brown (ToB) (20:53.966)
Yeah.

Michael Brown (ToB) (20:58.444)
I don't know, you know, personally I'm kind of hoping that maybe we move up a few notches after people get time to go find it and, you know, star it. But we've made some other really significant and really awesome contributions to the community, even outside of the AI Cyber Challenge. So I want to really stress that all of that stuff is open source. We aren't just doing this because we have to; we actually care about the open source community. We want to secure the software infrastructure. We want people to use the tool and secure their software before they, you know,

get it out there, so that we tackle this kind of untackable problem of securing this massive ecosystem of code.

CRob (21:37.606)
Michael, thank you to Trail of Bits and your whole team for all the work you do, including the competition runner-up Buttercup, which did an amazing job by itself. Thank you for all your work, and thank you for joining us today.

Michael Brown (ToB) (21:52.802)
Yeah, thanks for having me. One last thing to shout out: if you're an organization looking to employ Buttercup, don't be bashful about reaching out to us and asking about use cases for deploying it within your organization. We're happy to help out there. That's probably an area that we've focused on a little bit less than getting this out the door for average folks or individuals to use. So we're definitely interested in helping make sure Buttercup gets used.

Like I said, reach out to us, talk to us if you're interested in Buttercup. We want to hear from you.

CRob (22:23.44)
Love it. All right. Have a great day.

Michael Brown (ToB) (22:25.678)
All right, thanks a lot.

What’s in the SOSS? Podcast #52 – S3E4 AIxCC Part 2 – From Skeptics to Believers: How Team Atlanta Won AIxCC by Combining Traditional Security with LLMs

Summary

In this second episode in our series on DARPA's AI Cyber Challenge (AIxCC), CRob sits down with Professor Taesoo Kim from Georgia Tech to discuss Team Atlanta's journey to victory. Kim shares how his team – made up of academics, world-class hackers, and Samsung engineers – was initially skeptical of AI tools but underwent a complete mindset shift during the competition. He explains how they successfully augmented traditional security techniques like fuzzing and symbolic execution with LLM capabilities to find vulnerabilities in large-scale open source projects. Kim also reveals exciting post-competition developments, including commercialization efforts in smart contract auditing and plans to make their winning CRS accessible to the broader security community through integration with OSS-Fuzz.

This episode is part 2 of a four-part series on AIxCC.

Conversation Highlights

00:00 – Introduction
00:37 – Team Atlanta’s Background and Competition Strategy
03:43 – The Key to Victory: Combining Traditional and Modern Techniques
05:22 – Proof of Vulnerability vs. Finding Bugs
06:55 – The Mindset Shift: From AI Skeptics to Believers
09:46 – Overcoming Scalability Challenges with LLMs
10:53 – Post-Competition Plans and Commercialization
12:25 – Smart Contract Auditing Applications
14:20 – Making the CRS Accessible to the Community
16:32 – Student Experience and Research Impact
20:18 – Getting Started: Contributing to the Open Source CRS
22:25 – Real-World Adoption and Industry Impact
24:54 – The Future of AI-Powered Security Competitions

Transcript

Intro music & intro clip (00:00)

CRob (00:10.032)
All right, I'm very excited to talk to our next guest. I have Taesoo Kim, who is a professor down at Georgia Tech and also works with Samsung. And he got the great opportunity to help shepherd Team Atlanta to victory in the AIxCC competition. Thank you for joining us. It's a real pleasure to meet you.

Taesoo Kim (00:35.064)
Thank you for having me.

CRob (00:37.766)
So we were doing a bunch of conversations around the competition. I really want to showcase like the amazing early cutting edge work that you and the team have put together. So maybe, can you tell us what was your team’s approach? What was your strategy as you were kind of approaching the competition?

Taesoo Kim (00:59.858)
That's a great question. Let me start with a little bit of background.

CRob
Please.

Taesoo Kim (00:59)
Our team, Team Atlanta, is made up of multiple groups of people with various backgrounds, including myself and other academics and researchers in the security area. We also have world-class hackers on our team and some engineers from Samsung as well. So we have backgrounds in various areas, and we each bring our expertise

Taesoo Kim (01:29.176)
to compete in this competition. It was a two-year journey. We put in a lot of effort, not just on the engineering side; we also tinkered with a lot of the research approaches that we've been working on in this area for a while. That said, I think the most important strategy that our team took is that, although it's an AI competition…

CRob (01:51.59)
Mm-hmm.

Taesoo Kim (01:58.966)
…meaning that they promote the adoption of LLM-like techniques, we didn't simply give up on the traditional analysis techniques that we are familiar with. We put a lot of effort into improving them: fuzzing, which is one of the great dynamic testing techniques for finding vulnerabilities, and also traditional techniques like symbolic execution and concolic execution, even directed fuzzing. Although we suffered from a lot of scalability issues in those tools, because one of the themes of AIxCC is to find bugs in the real world

and in large-scale open source projects, most of the traditional techniques do not scale at that level. We can analyze one function or a small amount of code in a source code repository, but when it comes to, for example, Linux or Nginx, that is a crazy amount of source code. Just building a whole graph of such a gigantic repository is itself extremely hard. So we started augmenting our pipeline with LLMs.

One of the great examples is fuzzing: when we are mutating input, although we leverage a lot of mutation techniques on the fuzzing side, we also leverage the understanding of the LLM. The LLM navigates the possible places to mutate in the source code, it can generate dictionaries providing vocabulary for the fuzzer, and it recognizes the input format that has to be mutated as well. So a lot of augmentation using LLMs happens all over the place in the traditional software analysis techniques that we are doing.
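
As a concrete picture of the dictionary idea, here is a small hedged sketch that turns LLM-suggested format tokens into an AFL/libFuzzer-style dictionary file. `llm_complete` is a placeholder client, and the prompt wording is illustrative rather than what Team Atlanta actually used.

```python
def llm_fuzzing_dictionary(parser_source: str, dict_path: str, llm_complete) -> None:
    """Ask an LLM for format keywords and emit them as a fuzzer dictionary.

    AFL and libFuzzer both accept entries of the form name="value"; the fuzzer
    then splices these tokens into mutated inputs, which helps it get past
    keyword and magic-byte checks in the parser.
    """
    prompt = (
        "List, one per line, up to 30 literal tokens (keywords, delimiters, "
        "magic strings) used by the input format handled by this code:\n\n"
        + parser_source
    )
    tokens = [t.strip() for t in llm_complete(prompt).splitlines() if t.strip()]
    with open(dict_path, "w") as fh:
        for i, tok in enumerate(tokens):
            escaped = tok.replace("\\", "\\\\").replace('"', '\\"')
            fh.write(f'kw_{i}="{escaped}"\n')
```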

CRob (03:43.332)
And do you feel that combination of some of the newer techniques with some of the older, more traditional techniques like fuzzing was what was kind of unique and helped push you over the victory line in the cyber reasoning challenge?

Taesoo Kim (04:01.26)
It's extremely hard to say which one contributed the most during the competition. But I want to emphasize that finding the location of a bug in the source code versus formulating the input that triggers that vulnerability, what in our competition we call a proof of vulnerability, are two completely different tasks. You can identify many bugs.

But unfortunately, in order to say this is truly a bug, you have to prove it yourself by showing or constructing the input that triggers the vulnerability. As for the difficulty of the two tasks, I would say people do not comprehend the challenge of formulating the input versus finding a vulnerability in the source code. You can pinpoint, without much difficulty, various places in the source code.

But in fact, that's the easier job. In practice, the more difficult challenge is identifying the input that actually reaches the place you want and triggers the vulnerability as a result. So we spent much more time on how to construct the input correctly, to show that we can really prove the existence of the vulnerability in the source.

CRob (05:09.692)
Mm-hmm.

CRob (05:22.94)
And I think that's really a key to the competition as it happened, versus just someone running LLMs and scanners kind of randomly on the internet: the fact that you all were incented to, and required to, develop that fix and actually prove that these things are really vulnerable and accessible.

Taesoo Kim (05:33.718)
Exactly.

Taesoo Kim (05:42.356)
Exactly. That also highlights what practitioners care about. If you end up with so many false positives in a security tool, no one cares; there are a lot of complaints about that, and it's why people don't use security tools in the first place. So this is one of the important criteria of the competition. One of the strengths of traditional tools like fuzzers and concolic executors is that everything centers around reducing false positives. The reason people

CRob (05:46.192)
Yes.

Taesoo Kim (06:12.258)
take a fuzzer into their workflow is that whenever the fuzzer says there is a vulnerability, indeed there is a vulnerability. There's a huge difference. So we started with those existing tools and recognized the places we had to improve, so that we could really scale up those traditional tools to find vulnerabilities in this large-scale software.

CRob (06:36.568)
Awesome. As you know, the competition was a marathon, not a sprint, so you were doing this for quite some time. But as the competition was progressing, was there anything that surprised you and the team and kind of changed your thinking about the capabilities of these tools?

Taesoo Kim (06:51.502)
Ha

Taesoo Kim (06:55.704)
So as I mentioned before, we are hackers. We won DEF CON CTF many times, and we also won the F1 competition in the past. So by nature, we were extremely skeptical about AI tools at the beginning of the competition. Two years ago, we evaluated every single existing LLM service with a benchmark that we designed. We realized they were all not usable at all,

CRob (07:09.85)
Mm-hmm.

Taesoo Kim (07:24.33)
not appropriate for the competition. So instead of spending time on improving those tools, which we felt were inferior at the beginning, our team's motto at that time was: don't touch those areas. We're going to show you how powerful these traditional techniques are. That's how we approached the semifinal. We did pretty well. We found many of the bugs by using all the traditional tools that we've been working on. But…

Immediately after the semifinal, everything changed. We reevaluated the possibility of adopting LLMs. Earlier, just removing or obfuscating some of the tokens in the repository meant the LLM couldn't reason about anything. But suddenly, near or around the semifinal, something happened, even after we injected changes.

Think of it this way: there is a token, and you replace this token with meaningless words. LLMs previously got all confused about these syntactic structures of the source code, but now, around the semifinal, they really understood. Although we tried to fool them many times, they really caught on to the idea, even on source code that they had never seen before and that was never used in training, because we intentionally created this source code for the evaluation.

We started realizing that they actually understand. That shocked everybody. So we started realizing that, if that's the case, there are so many places that we can improve. Right? That's the moment we changed our mindset. Now it was everything about LLMs, everything about the new agent architectures, and we ended up putting a humongous amount of effort into creating the various agent architecture designs that we have.

Also, we replaced some software analysis techniques with LLMs as well, surprisingly. Symbolic execution is a good example. It's extremely hard to scale: you execute one instruction at a time, and you have to create constraints around each one. But one of the big challenges in real-world software is that there are so many, I would say, hard-to-analyze functions. Meaning that, for example…

Taesoo Kim (09:46.026)
Take NGINX as an example. We thought it probably compared strings string to string. But the way they perform string comparison in NGINX, they map the string, they do hashing, so that they can compare hash values. A fuzzer or a symbolic executor is extremely bad at those. If you hit one hashing function, you're screwed. There are so many constraints that there is no way you can revert back, by definition.

There's no way. But if you think about how to overcome these situations by using an LLM, the LLM can recognize that this is a hashing function. We don't actually have to create constraints around it; hey, what about replacing it with an identity function? That's something we can easily handle in symbolic execution. So then we started recognizing the possibility of the LLM's role in symbolic execution. Now we see that

symbolic execution can scale to large software right now. So I think this is a pretty amazing outcome of the competition.
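
To make the hash-function trick concrete, here is a hedged sketch of the planning step: an LLM classifier decides which functions to model as identity stubs before symbolic execution. `llm_classify` is a placeholder, and the actual hooking mechanism depends on whichever symbolic execution engine you use; this is not Team Atlanta's actual code.

```python
from typing import Callable, Dict

def plan_identity_stubs(functions: Dict[str, str], llm_classify) -> Dict[str, Callable]:
    """Pick functions to replace with trivial stubs before symbolic execution.

    `functions` maps a function name to its source text; `llm_classify(src)` is a
    stand-in that returns a label such as "hash", "checksum", or "other". The
    returned mapping would then be handed to the engine's hooking API.
    """
    stubs = {}
    for name, src in functions.items():
        if llm_classify(src) in ("hash", "checksum"):
            # Constraints over a real hash are effectively unsolvable for the
            # solver, so model the call as the identity on its first argument.
            stubs[name] = lambda *args: args[0]
    return stubs
```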

CRob (10:53.11)
Awesome. So again, the competition completed in August. So what plans do you have? What plans does the team have for your CRS now that the competition’s over?

Taesoo Kim (10:58.446)
Thank

Taesoo Kim (11:02.318)
I think that's a great question. Many tech companies have approached our team. Some of our members recently joined other big companies, and many of our students want to quit the PhD program and start a company. For good reasons, right?

CRob (11:14.848)
I bet.

Taesoo Kim (11:32.766)
One team that four of my PhD students recently formed is looking for commercialization opportunities. Not in the traditional cyber infrastructure we are looking at through DARPA, but they spotted a possibility in smart contracts: smart contracts and modernized financial industries like stablecoins and whatnot,

where they can apply AIxCC-like techniques to finding vulnerabilities in those areas. So instead of having a human auditor analyze everything, you can analyze everything by using LLMs or agents and similar techniques that we developed for AIxCC, and reduce the auditing time significantly. To get an audit of a smart contract, traditionally you have to wait two weeks,

in the worst case even months, with a ridiculous amount of cost. Typically, getting one audit for a smart contract costs $20,000 or $50,000 per case. But in fact, you can reduce the auditing time down to, I'd say, a few hours or a day. The potential benefit of achieving this speed is that you really open up

CRob (12:40.454)
Mm-hmm.

CRob (12:47.836)
Wow.

Taesoo Kim (12:58.186)
amazing opportunities in this area. You can automate the auditing, you can increase the frequency of auditing in the smart contract area. Not only that, we thought there is a possibility for even more, like compliance checking of the smart contracts. There are so many opportunities we can pursue immediately by using AIxCC systems. That's one area we're looking at. Another one is a more traditional area,

CRob (13:00.347)
Mm-hmm.

Taesoo Kim (13:25.07)
what we call cyber infrastructure, like hospitals and some government sectors. They really want to analyze their systems, but, unfortunately or fortunately, there are other opportunities there: in AIxCC, we analyzed everything via source code, but they don't have access to the source. So we are creating a pipeline that, given a binary or an execution-only environment, converts them

CRob (13:28.828)
Mm-hmm.

CRob (13:38.236)
Mm-hmm.

Taesoo Kim (13:52.416)
in a way that we can still leverage the existing infrastructure that we have for AIxCC. More interestingly, they don't have access to the internet when they're doing pen testing or analyzing those systems, so we are starting to incorporate some of our open source models as part of our systems. These are two commercialization efforts that we're thinking about, and many of my students are currently…

CRob (13:57.67)
That’s very clever.

CRob (14:05.5)
Yeah.

CRob (14:13.564)
It’s awesome.

CRob (14:20.366)
And I imagine that this is probably amazing source material for dissertations and the PhD work, right?

Taesoo Kim (14:29.242)
Yes, yes. For the last two years, we were purely focused on AIxCC. Our motto was that we don't have time for publication; just win the competition, everything else comes after. This is the moment that we are actually, I think, going to release our tech report. It's over 150 pages. Next week, around next week. We have a draft right now, but we are still preparing it

CRob (14:39.256)
Yeah.

CRob (14:51.94)
Wow.

Taesoo Kim (14:58.51)
for publication, so that other people don't just get the source code. Okay, the source code is great, but you need some explanation of why we did things this way. Much of the source was written for the competition, right? So the core pieces might be a little bit different from what normal developers and operators would use daily. So we are creating condensed technical material for them to understand.

Not only that, we have a plan to make it more accessible. Currently our CRS implementation is tightly bound to the competition environment, meaning we had a crazy amount of resources on the Azure side, where everything was deployed and battle-tested. But unfortunately, most people, including ourselves, don't have those resources. The competition gave about

$80,000 in cloud credits that we had to use. No one has that kind of resource, not unless you're a company. But we want people to be able to apply this to their own projects at a smaller scale. That's what we are currently working on: discarding all these competition-dependent parameters from the source code and making it more self-contained, so that you can even launch our CRS in your local environment.

This is one of the big, big development efforts that we are doing right now in our lab.

CRob (16:32.155)
That's awesome. Let me take a second and think about this from the perspective of the students that participated. What kind of an experience was it, getting to work with professors such as yourself and actual professional researchers and hackers? What do you see the students taking away from this experience?

Taesoo Kim (16:53.846)
I think exposure to the latest models. Because we were tightly collaborating with OpenAI and Gemini, we were really exposed to those latest models. If you're just working on security and not working closely with LLMs, you probably don't appreciate that much. But through the competition, everyone's mindset changed. And then we spent time

deeply looking at what's possible and what's not, so we now have a great sense of what types of problems we have to solve, even on the research side. And now, suddenly, after this competition, every single security project, all the security research we are doing at Georgia Tech, is based on LLMs. It may be even more surprising to hear that we have some decompilation projects we are doing, the most traditional kind of security research you can think of:

CRob (17:42.448)
Ha ha.

Taesoo Kim (17:52.162)
binary analysis, malware analysis, decompilation, crash analysis, whatnot. Now everything is LLM-based. We now realize the LLM is much better at decompiling than traditional tools like IDA and Ghidra. So I think these are the types of research we previously thought impossible. We probably weren't even thinking about applying LLMs, because we spent our lifetimes working on decompiling.

CRob (17:53.68)
Mm.

CRob (17:59.068)
Yeah.

Taesoo Kim (18:22.318)
But at a certain point, we realized that the LLM is just doing better than what we've been working on. Just like that, one day. It's a complete mind change. From a traditional program analysis perspective, many things are NP-complete. There's no way you can solve them in an easier way, so you don't spend time on them. That's our typical mindset. But now, it works in practice, amazingly.

CRob (18:29.574)
Yeah.

Taesoo Kim (18:51.807)
How to improve on what we previously thought impossible by using LLMs, that's the key.

CRob (18:57.404)
That’s awesome. It’s interesting, especially since you stated initially when you went into the competition, you were very skeptical about the utility of LLMs. So that’s great that you had this complete reversal.

Taesoo Kim (19:04.238)
Thank

Yeah, but I'd like to emphasize one of the problems with LLMs, though: they're expensive, and they're slow in the traditional sense. You have to wait a few seconds, or a few minutes in certain cases like reasoning models or whatnot. So tightly binding your performance to this performance-lagging component in the entire system is often challenging.

CRob (19:17.648)
Yes.

CRob (19:21.82)
Mm-hmm.

Taesoo Kim (19:39.598)
And then there's just talking to it. Another thing is that everything is text. There's no proper API, just text. There's no sophisticated way to leverage it, just text. I don't know, you're probably familiar with all the security issues that potentially come with unstructured input. It's similar to cross-site scripting in the web space. There are so many problems you can imagine.

CRob (19:51.984)
Okay, yeah.

CRob (20:01.979)
Mm-hmm.

Taesoo Kim (20:08.11)
But as long as you use it in a well-contained manner, in the right way, we believe there are so many opportunities we can get from it.

CRob (20:18.876)
Great. So now that your CRS has been released as open source, if someone from our community was interested in joining and maybe contributing to that, what’s the best way somebody could get started and get access?

Taesoo Kim (20:28.494)
Mm-hmm.

So we're going to release a non-competition version very soon, along with several documents from what we call a standardization effort that we and other teams are doing right now. We've defined a non-competition CRS interface so that, as long as you implement that interface, you can plug in. Our goal is to upstream this into OSS-Fuzz together with the Google team,

CRob (20:36.369)
Mm-hmm.

CRob (20:58.524)
Mm-hmm.

Taesoo Kim (20:59.086)
so that you can put your CRS into the OSS-Fuzz mainstream. That way we can make it much easier, so that everyone can evaluate them one at a time in their local environment as part of the OSS-Fuzz project. We're going to release the RFC document pretty soon through our website, so that everyone can participate and share their opinions on what features they think we are missing; we'd love to hear about that.

CRob (21:03.74)
Thanks.

CRob (21:18.001)
Mm-hmm.

Taesoo Kim (21:26.502)
And then after that, after about a month, we're going to release our local version so that everyone can start using it. And with a very permissive license, everyone can take advantage of this public research, including companies.

CRob (21:34.78)
Awesome.

CRob (21:42.692)
I'm just amazed. When I came into this, partnering with our friends at DARPA, I was initially skeptical as well. And as I was sitting there watching the finals being announced, it was just amazing, the innovation and creativity that all the different teams displayed. So again, congratulations to your team, all the students and the researchers and everyone that participated.

Taesoo Kim (21:59.79)
Mm-hmm.

CRob (22:12.6)
Well done. Do you have any parting thoughts? You know, as we move on, do you have any kind of words of wisdom you want to share with the community, or any takeaways for people curious to get into this space?

Taesoo Kim (22:25.486)
Oh, regarding commercialization, one thing I'd also like to mention is that at Samsung, we already took the open source version of the CRS and started applying it to internal projects and open source Samsung projects immediately afterward. So we started seeing the benefit of applying the CRS in the real world immediately after the competition. A lot of people think that a competition is just for competition, or for show,

CRob (22:38.108)
Mm-hmm.

Taesoo Kim (22:55.032)
but in fact, it's not. Everyone in industry, including at Anthropic, Meta, and OpenAI, wants to adopt those technologies behind the scenes. And with Amazon, we are also working together with the AWS team; they want to support the deployment of our systems in the AWS environment as well, so everyone can launch the systems with just one click. And they mentioned there are several

CRob (22:55.036)
Mm-hmm.

Taesoo Kim (23:24.023)
government-backed organizations that explicitly request to launch our CRS in their environment.

CRob (23:31.1)
I imagine so. Well, again, kudos to the team. Congratulations. It's amazing. I love to see researchers have these amazing creative ideas and actually be able to add real value. And it's great to hear that Samsung was immediately able to start getting value out of this work. Hopefully other folks will do the same.

Taesoo Kim (23:55.18)
Yeah, exactly. Regarding words of wisdom or general advice in general, I think it's about this competition-based innovation, particularly with academic involvement, or startups, or neither. Because of this venue, everyone, including ourselves and the startup people and other team members, put their lives

into this competition. It's an objective metric, a head-to-head competition. We don't care about your background. Just win, right? There's your objective score. Your job is to find it and fix it. I think this competition really drove a lot of effort behind the scenes in our team. We were motivated because this entire competition was presented to a broader audience. I think this is really a way to drive innovation

CRob (24:26.46)
Mm-hmm.

CRob (24:32.57)
Yes.

CRob (24:36.709)
Mm-hmm.

Taesoo Kim (24:54.904)
and to get some public attention beyond Alphi as well. So I think we really want to see other types of competition in this space. And in the longer term, based on the current trend, you'll probably see CTF competitions like that, maybe not just CTFs but agent-based CTFs, with no humans involved, where the agents are attacking each other and solving CTF challenges.

CRob (24:58.524)
Excellent.

CRob (25:19.59)
Mm-hmm.

Taesoo Kim (25:24.846)
This is not five years out. It's going to happen in two years, or shortly thereafter. Even in this year's LiveCTF, one of the teams actually leveraged agent systems, and the agents actually solved challenges quicker than the humans. So I think we're going to see those types of events and breakthroughs more often.

CRob (25:55.292)
I used to be a judge at the collegiate cyber competition for one of our local schools. And I see a lot of interesting applicability in kind of using this to help teach the students that you have an aggressive attacker doing these different techniques, and being able to apply some of these learnings that you all have. It's really exciting stuff.

Taesoo Kim (26:00.142)
Mm-hmm.

Taesoo Kim (26:15.47)
There's an interesting quote, I don't know who actually said it, but in the AI space, someone mentioned that there will be a one-person, one-billion-dollar-market-cap company appearing because of LLMs, or because of AI in general. But if you look at CTFs, currently most teams have a minimum of 50 or 100 people competing against each other. Very soon, we're going to see

one person, or maybe five people, competing with the help of those AI tools. Or humans just assisting the AI, like, hey, could you bring up the Raspberry Pi for me, or set things up, so that the human is just helping the LLM, or helping the AI in general, so that the AI can compete. So I think we're going to see some interesting things happening pretty soon in our community, for sure.

CRob (26:59.088)
Mm-hmm. Yeah.

CRob (27:11.804)
I agree. Well, again, Taesoo, thank you for your time. Congratulations to the team. And that is a wrap. Thank you very much.

Taesoo Kim (27:22.147)
Thank you so much.

What’s in the SOSS? Podcast #51 – S3E3 AIxCC Part 1 – From Skepticism to Success: The AI Cyber Challenge (AIxCC) with Andrew Carney

Summary

This episode of What’s in the SOSS features Andrew Carney from DARPA and ARPA-H, discussing the groundbreaking AI Cyber Challenge (AIxCC). The competition was designed to create autonomous systems capable of finding and patching vulnerabilities in open source software, a crucial effort given the pervasive nature of open source in the tech ecosystem. Carney shares insights into the two-year journey, highlighting the initial skepticism from experts that ultimately turned into belief, and reveals the surprising efficiency of the competing teams, who collectively found over 80% of inserted vulnerabilities and patched nearly 70%, with remarkably low compute costs. The discussion concludes with a look at the next steps: integrating these cyber reasoning systems into the open source community to support maintainers and supercharge automated patching in development workflows.

This episode is part 1 of a four-part series on AIxCC.

Conversation Highlights

00:00 – Introduction and Guest Welcome
00:59 – Guest Background: Andrew Carney’s Role at DARPA/ARPA-H
02:20 – Overview of the AI Cyber Challenge (AIxCC)
03:48 – Competition History and Structure
04:44 – The Value of Skepticism and Surprising Learnings
07:11 – Surprising Efficiency and Low Compute Costs
08:15 – Major Competition Highlights and Results
13:09 – What’s Next: Integrating Cyber Reasoning Systems into Open Source
16:55 – A Favorite Tale of “Robots Gone Bad”
18:37 – Call to Action and Closing Thoughts

Transcript

Intro music & intro clip (00:00)

CRob (00:23)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF podcast where I talk to people that are in and around the amazing world of open source software, open source software security and AI security. I have a really amazing guest today, Andrew.

He was one of the leaders that helped oversee this amazing AI competition we're going to talk about. So let me start off: Andrew, welcome to the show. Thanks for being here.

Andrew Carney (00:57)
Thank you so much for having me, CRob. Really appreciate it.

CRob (00:59)
Yeah, so maybe for our audience that might not be as familiar with you as I am, could you maybe tell us a little bit about yourself, kind of where you work and what types of problems are you trying to solve?

Andrew Carney (01:12)
Yeah, I’m a vulnerability researcher. That’s been the core of my career for the last 20 years. And part of that has had me at DARPA. And now I’m at DARPA and ARPA-H, where I sort of work on cybersecurity research problems focused on national defense and/or health care. So it’s sort of the space that I’ve been living in for the past few years.

CRob (01:28)
That’s an interesting collaboration between those two worlds.

Andrew Carney (01:43)
Yeah, I think the vulnerability research and reverse engineering community is pretty tight, pretty small. And a lot of folks across lots of different industries and sectors have similar problems that we're able to help with. So it's exciting to kind of see how folks in finance or the automotive industry or the energy sector all deal with similar-ish problems, but at different scales and with different flavors of concerns.

CRob (02:20)
That's awesome. And so, as I mentioned, we were introduced through the AIxCC competition. Maybe for our audience that might not be as familiar, could you give us an overview of AIxCC, the competition, and kind of why you felt this effort was so important, and why we've spent so many years working through it?

Andrew Carney (02:42)
Absolutely. AIxCC is a competition to create autonomous systems that can find and patch vulnerabilities in source code. A big part of this competition was focusing on open source software, because of how critical it is across our tech ecosystem. It really is sort of the font of all software.

And so DARPA and ARPA-H and other partners across the federal government, we saw this kind of need to support the open source community and also leverage kind of new technologies on the scene like LLMs. So how do we take these new technologies and apply them in a very principled way to help solve this massive problem? And working with the Linux Foundation and OpenSSF has been a huge piece of that as well. So I really appreciate everything you guys have done throughout the competition.

CRob (03:41)
Thank you.

CRob (03:48)
And maybe could you give us just a little history of when the competition started and how it was structured?

Andrew Carney (03:54)
Yeah. So the competition was announced at Black Hat in August of 2023. The competition was structured into two main sections: we had a qualifying event at DEF CON in 2024, and then we had our final event this past DEF CON, August 2025. And throughout that two-year period, we designed a competition that kept pushing the competitors ahead of wherever the current models, the current agentic technologies, were. Whatever bar they were setting, we continued to push the competitors past that. So it’s been a really dynamic competition, because that technology has continued to evolve.

CRob (04:44)
I have to say, when I initially heard about the competition, I’ve been doing cybersecurity a very long time, and I was very skeptical about what the results would be, not to bury the lede, so to speak. But I was very surprised with the results that you all shared with the world this summer in Las Vegas. We’ll get to that in a minute. But again, this competition ran over many years, and as it progressed, could you maybe share what you learned that surprised you, that you didn’t expect when this all kicked off?

Andrew Carney (05:21)
Yeah, I think so. I think there have been a lot of surprises along the way. And I’ll also say that skepticism, especially from informed experts, is a really good sign for a DARPA challenge. For a lot of projects at DARPA generally, if you’re kind of waffling between “this is insanely hard and there’s no way we’ll be successful” and “there’s an easy solution to this,” if you’re constantly in that space of uncertainty, like, no, I really think this is really, really hard, and I’m getting skepticism from people that know a lot about this space, for us, that’s fuel. That’s okay, there’s a question to answer here. And so that really was part of what drove us. Even competitors, competitors that ended up making it to finals themselves, were skeptical even as they were competing.

So I love that. I love that. Like, you know, we want to try to do really hard things and, you know, criticism helps us improve. Like that’s super beneficial.

CRob (06:33)
Yeah. And I’ve had the opportunity to talk with many of the teams, and now we’re in the post-competition phase where we’re actually starting to figure out how to share the results with the upstream projects and how to build communities around these tools. You assembled a really amazing group of folks in these competing teams, some super top-notch minds. And again,

You made me a believer now, where I really do believe that AI does have a place and can legitimately offer some real value to the world in this space.

Andrew Carney (07:11)
Yeah, I think one of the biggest surprises for me was the efficiency. I think a lot of times, especially with DARPA programs, we expect that technical miracles will come with a pretty hefty price tag, and then you’ll have to find a way to scale down, to economize, to make that technology more useful, more widely distributable.

With AIxCC, we found the teams pushing so hard on the core research questions, but at the same time, sort of woven into that, was using their resources efficiently. And so even the competition results themselves were pleasantly surprising in terms of the compute costs for these systems to run. We’re talking tens to hundreds of dollars per vulnerability discovered or patch emitted, which is really quite amazing.

CRob (08:15)
Yeah, so maybe could you just give me some highlights of kind of what the competition discovered, what the competitors achieved?

Andrew Carney (08:24)
Yeah. So, when we’re trying to tackle these really challenging research questions, and we’re examining it from all angles and being extremely critical of even our own approach, as well as the competitors’ approaches, initially, back in August of 2024, we had this amazing proof-of-life moment where the teams demonstrated, with only a few hundred dollars in total compute budget, that they were able to analyze large open source projects and find real issues. One of the teams found a real issue in SQLite that we disclosed at the time to the maintainers. And they found that, once again, with this very limited compute budget, across multiple millions of lines of code in these projects. So that was sort of the, okay, there’s a there there, there’s something here and we can keep pushing. So that was a really exciting moment for everyone. And then over the following year, up to August 2025, we had a series of these non-scoring events where the teams would be given challenges that looked very similar to what we’d give them for finals, with an increasing level of scale and difficulty.

So you can think of these as like extreme integration events where we’re still giving the teams hundreds of thousands or millions of lines of code. We’re giving them, you know, eight to 12 hours per kind of task. And we’re seeing what they can do. This was important to ensure that the final competition went off without a hitch. And also because the models they were leveraging continue to evolve and change.

So it was really exciting. In that process, the teams found and disclosed hundreds of vulnerabilities and produced hundreds of potential patches that they would offer up to the maintainers of the projects they were doing their own internal development on. So that was really exciting, just to see that the SQLite bug wasn’t a fluke and that the teams could consistently perform. As we pushed them to move further and faster and deal with more complex code, they were able to adapt and find a way forward.

CRob (11:02)
That’s awesome. And I know it was a long journey that you and the team and all the support folks went through, but is there any particular moment that you kind of smile about when you reflect back over the course of the competition?

Andrew Carney (11:20)
Oh, man, so many. I think there’s an equal number of those smiling moments and also, you know, premature gray hairs that the team and myself have created. But one of the big moments: there were a number of just outstanding experts in the field, on social media and in talks, who talked about AI-powered program analysis in a very skeptical way. Near the end, leading up to semi-finals, we had this lovely moment where the Google Project Zero team and the Google DeepMind team penned a blog post that said they were inspired by the SQLite bug, by one of the teams’ discoveries. And that was huge, I think, both for that team and for the competition as a whole. And then after that, seeing people’s opinions change, and seeing people that were, like I said, top-tier experts in the field change their perspective pretty drastically, that was helpful signal for us to demonstrate that we were being successful. Converting a critic, I think, is one of the best kinds of victories that you can have, because now they can be a collaborator, right? Now we can still kind of spar over different perspectives or ideas, but we’re working together. That’s very exciting.

CRob (13:09)
That’s awesome. So what’s next? The hard work of the competition is over and now we’re in kind of the after action phase where we’re trying to integrate all this great work and kind of get these projects out to the world to use. So from your perspective or from DARPA or the competition, what’s next for you?

Andrew Carney (13:29)
Yeah, so one of the biggest challenges with DARPA programs is when you’re successful, sometimes you have that technological miracle, you have that accomplishment, and maybe the world’s not entirely ready for it yet. Or maybe there’s additional development that needs to happen to get it into the real world. With AIxCC, we made the competition as realistic as possible. The automated systems, these cyber reasoning systems, were being given bug reports, patch diffs, artifacts that we would consume and review as human developers. So we modeled all the tasks very closely on the real things that we would want these systems to do. And they demonstrated incredible performance. Collectively, the teams were able to find over 80% of the vulnerabilities that we’d synthetically inserted, and they patched nearly 70% of those vulnerabilities. And that patching piece is so critical. What we didn’t want to do was create systems that made open source maintainers’ lives more problematic.

CRob (14:54)
Thank you.

Andrew Carney (14:56)
We wanted to demonstrate that this is a reachable bug and here’s a candidate patch. And in the months after the competition, we’ve incentivized the teams, beyond the original prize money, to go out into the open source community and support open source maintainers with their tools. And we’ve had folks come back and, literally in their reports, document that the patch they suggested to a maintainer was nearly identical to what the maintainer actually committed. Yeah. And those reports are coming in daily. So we have this constant feed of engagement, and the tools are still obviously being improved and developed. But it’s really exciting to see. So when I think about what’s next: we’re already in the what’s next, getting the technology out there, using government funding to support open source maintainers wherever we can, especially if their code is part of widely used applications or code used in critical infrastructure. So that’s where we find ourselves now. And then we’re thinking a lot about how we supercharge that effort to the…

There have been, you know, the federal government supports a lot of actively used open source projects, right? And we’ve been working with all these partner agencies across the federal government, just making sure that we’re supporting the existing programs when we find them. And then where we see a gap, figuring out what it would take to fill that gap for a community that could use more support.

CRob (16:55)
So on a slightly different note, we’re both technologists and we love the field, but as I was going through this journey, kind of on the sidelines with you all, I was reflecting: do you have a favorite tale of robots gone bad? Like Terminator’s Skynet or HAL 9000 or the Butlerian Jihad?

Andrew Carney (17:22)
You know, I don’t know that this is my favorite, but it is one of the most recent ones that I’ve read. There’s a series called Dungeon Crawler Carl. Yeah. And it’s been really entertaining reading. And I just think the tension between the primal AIs and the corporations that rely on said independent entities, but also are constantly trying to rein them in, is, I don’t know, it’s been really interesting to see that narrative evolve.

CRob (18:08)
I’ve always enjoyed science fiction and fantasy’s ability to kind of hold a mirror up to society and put these questions in a safe space, where you can think about 1984 and Big Brother or these other things, but it’s just on paper or on your iPad or whatever. So it’s a nice experiment over there. And we don’t want that to be happening here.

Andrew Carney (18:29)
Yes, yes. Yeah, the fiction as thought experimentation, right?

CRob (18:37)
Right, exactly. So as we wind down, do you have a particular call to action or anything you want to highlight to the audience that they should maybe investigate a little further or participate in?

Andrew Carney (18:50)
Yeah, I think a big one is, you know, we would love for open source maintainers to reach out to us directly: AIxCC@darpa.mil. That’s the email address that our team uses. And we’ve been looking for more maintainers to connect with, so that we can make sure that if we can provide resources to them, one, they’re right-sized for the challenges that those maintainers are having, or maintainer, right? Sometimes it’s just one person. And two, that we’re engaging with them in the way that they would prefer to be engaged with. We want to be helpful help, not unhelpful help. So that’s a big one. And then, more generally, I would love to see more patching added into the vulnerability research lifecycle. I think there are so many opportunities for commercial and open source tools that have that discovery capability, and that’s really their big selling point. And now, with AIxCC and with the technology that the competitors open sourced themselves, since all of their systems were open sourced after the competition, there’s this real potential that I think we haven’t seen realized the way that it really could be. And so I would love to see more of that kind of automated patching added to tools and development workflows.

CRob (20:29)
I’ll say my personal favorite experience out of all this: up until the minute the competition was over, there was an ethical wall between your administrators, us, and the different competition teams. But now we’ve observed the competitors looking at each other’s work, asking questions of each other, and collaborating. I’m so super excited to see what comes next, now that all these smart people have proven themselves and found kind of connected spirits, and they’re going to start working together on even more amazing things.

Andrew Carney (21:07)
Absolutely. I think we’re expecting a state of knowledge paper with all the teams as authors. That’s something they’ve organized independently, to your point. And yeah, I cannot wait to see what they come out with collaboratively.

CRob (21:23)
Yeah. And anyone that’s interested in learning more or potentially interacting directly with some of these competition experts, whether they’re in academia or industry: as part of our AI/ML Working Group, the OpenSSF is sponsoring a Cyber Reasoning Special Interest Group created specifically for the competition and all the competitors, just to have public discussions and collaboration around these things. And we would invite everybody to show up, listen, participate as they feel comfortable, and learn.

Well, Andrew and the whole DARPA and ARPA-H team, everyone that was involved in the competition, thank you. Thank you to our competitors. And we actually are going to have a series of podcasts talking to the individual competitors, learning a little bit about the unique flavors and challenges each of them had. But thank you for sponsoring this and really delivering something I think is going to have a ton of utility and value to the ecosystem.

Andrew Carney (21:47)
Thank you for working with us on this journey and we definitely look forward to more collaboration in the future.

CRob (21:54)
Well, and with that, we’ll wrap it up. I just want to tell everybody happy open sourcing. We’ll talk to you soon.

What’s in the SOSS? Podcast #50 – S3E2 Demystifying the CFP Process with KubeCon North America Keynote Speakers

By Podcast

Summary

Ever wondered what it takes to get your talk accepted at a major open source tech conference – or even land a keynote slot? Join What’s in the SOSS? new co-host Sally Cooper as she sits down with Stacey Potter and Adolfo “Puerco” García Veytia, fresh off their viral KubeCon keynote “Supply Chain Reaction.” In this episode, they pull back the curtain on the CFP review process, share what makes a strong proposal stand out, and offer honest advice about overcoming imposter syndrome. Whether you’re a first-time speaker or a seasoned presenter, you’ll learn practical tips for crafting compelling abstracts, avoiding common pitfalls, and why your unique voice matters more than you think.

Conversation Highlights

00:00 – Introduction and Guest Welcome
01:40 – Meet the Keynote Speakers
05:27 – Why CFPs Matter for Open Source Communities
08:29 – Inside the Review Process: What Reviewers Look For
14:29 – Crafting a Strong Abstract: Dos and Don’ts
21:05 – From Regular Talk to Keynote: What Changed
25:24 – Conquering Imposter Syndrome
29:11 – Rapid Fire CFP Tips
30:45 – Upcoming Speaking Opportunities
33:08 – Closing Thoughts

Transcript

Music & Soundbyte 00:00
Puerco: Stop trying to blend or to mimic what you think the industry or your community wants from you. Represent – always show up who you are, where you came from – that is super valuable and that’s why people will always want to have you as part of their program.

Sally Cooper (00:20)
Hello, hello, and welcome back to What’s in the SOSS, an OpenSSF podcast. I’m Sally and I’ll be your host today. And we have a very, very special episode with two amazing guests and they are returning guests, which is my favorite, Stacey and Puerco. Welcome back by popular demand. Thank you for joining us for a second time on the podcast.

And since we last talked, you both delivered one of the most talked-about keynotes at KubeCon. Wow. So today’s episode, we’re going to talk to you about CFPs. And this is really an episode for anyone who has ever hesitated to submit a CFP, wondered how their talk gets reviewed through the CFP process, asked themselves, am I ready to speak, or dreamed about what it might take to keynote a major event.

We’re gonna focus on practical advice, what works, what doesn’t, and how to show up confidently. And I’m just so excited to talk to you both. So for anyone who’s listening for the first time, Stacey, Puerco, can you tell us a little bit about yourselves and about the keynote? Stacey?

Stacey (01:48)
Hey everyone, I’m Stacey Potter. I am the Community Manager here at OpenSSF. And my job, I mean, in a nutshell, is basically to make security less scary and more accessible for everyone in open source, right? I’ve spent the last six or seven years in open source community building, mainly across CNCF projects: Flux, Flagger, OpenFeature, Keptn, to name a few.

And now I’m focusing on open source security here at OpenSSF, basically helping people connect, learn, and just do cool things together. And yeah, I delivered a keynote at KubeCon North America that, honestly, is still surreal to talk about. It was called Supply Chain Reaction, a cautionary tale in case security, and it was theatrical. It was slightly ridiculous. And it was basically a story of a DevOps engineer, who I played, even though I’m not a DevOps engineer, frantically troubleshooting a compromised deployment. And Puerco literally kaboomed onto the stage as a Luchador superhero to save the day. We had him in costume and we had drama.

And then we taught people a little bit about supply chain security through like B-movie antics and theatrics. But it turns out people really responded to making security fun and approachable instead of terrifying.

Adolfo García Veytia (@puerco) (03:23)
Yeah. Well, hi, and thanks everybody for listening. My name is Adolfo García-Veytia. I am a software engineer working out of Mexico City. I’ve been working on open source security for, I don’t know, the past eight years or so, mainly on Kubernetes, and I maintain a couple of the technical initiatives here in the OpenSSF.

I am now part of the Governing Board as of the start of this year, which is a great honor, to have been voted into that position. But my real passion is really helping build tools that secure open source while being unobtrusive to developers, and also raising awareness in the open source community about why security is important.

Because sometimes you will see that executives and CISOs, especially, are compelled by legal frameworks or other requirements to make their products or projects secure. And in open source, we’re always so resource-constrained that security tends to not be the first thing on people’s minds. But the good news is that here in the OpenSSF and other groups, we’re working to make that easy and transparent for the real person as much as possible.

Sally Cooper (04:57)
Wow, thank you both so much. Okay, so getting back to calls for proposals, CFPs. From my perspective, they can seem really intimidating, but they’re also one of the most important ways for new voices to enter a community. So I just have a couple of questions. Basically, why are they important? Not just for getting to a conference: why would a CFP be important to an open source community and not just a conference? Stacey, maybe you could kick that off.

Stacey (05:32)
Sure, I think this is a really important question. I think CFPs aren’t just about filling conference slots. They’re really about who gets to shape the narrative in our communities and within these conferences. So when we hear the same voices over and over and they show up repeatedly, right, you get the same perspectives, the same solutions, the same energy, which, you know, is also great. You know, we love our regular speakers, they’re brilliant, but

communities always need new and fresh perspectives, right? We need the people who just solved a weird edge case that nobody’s talking about. We need like a maintainer from a smaller project who has insights that maybe big projects haven’t considered, or, you know, we need people from different backgrounds, different use cases and different parts of the world as well. CFPs are honestly one of the most democratic ways we have to surface new leaders, right?

Sometimes someone doesn’t need to be well-connected or have a huge social media following. They just need a good idea and the courage to submit a talk about it, right? And that’s really powerful. And I think when someone gives their first talk and does well, they often become a mentor, a maintainer, a leader in that community, right? CFPs are literally how we build the next generation of contributors and speakers. So every talk is a potential origin story for someone’s open source journey.

Sally Cooper (07:08)
Puerco, what are your thoughts on that?

Sally Cooper (07:11)
And the question again is: calls for proposals can feel really intimidating, but they’re also one of the most important ways for new voices to enter a community.

Adolfo García Veytia (@puerco) (07:20)
Yeah. So, I would say that intimidating is a very big word, especially for new people, maybe. Sometimes it’s difficult to ramp up the courage, and I don’t want to mislead people into thinking it’s going to be easy. The first ones that you do, you will get up there, sweat, stutter, and basically your emotions will control your delivery and your body, so be prepared for that.

But it’s going to be fine. The next times you’ll do it, it will get better. And most importantly, people will not be judging you. In fact, it’s sometimes even more refreshing to see new voices getting up on stage.

Sally Cooper (08:13)
That’s really helpful. Thank you. I love it. The authenticity that you bring really helps and helps demystify the CFP process. But now let’s pull back the curtain on the review process. How does that work? And Stacey, have you been on a review panel before? Maybe you could talk about like, when you’re reviewing a CFP, what are you actually looking for?

Stacey (08:39)
Yeah, I’ve been on program committees. I’ve been a program chair or co-chair on different programs and things like that. And yeah, it’s a totally different experience, but I think it gives you a lot of insight on how to prepare a talk once you’ve reviewed 75, 80 per session, right? Sometimes these calls are really big. I know KubeCon has really huge calls, right? But I would say, you know, what we’re actually looking for:

So first, is this topic relevant and useful to our audience? Like, will people learn something they can actually apply? And second, can this person deliver on what they’re promising? And honestly, we’re not looking for perfection, right? We’re looking for clarity and genuine expertise or experience with that topic.

I would say be clear, be specific with your value proposition in the first two sentences of a CFP. When the program committee can read your abstract and immediately think, “oh, that’s exactly what our attendees need,” that’s gold, right? Also, show that you understand the audience that you’re submitting to, right? Are you speaking to beginners or experienced practitioners, and are you being explicit about that?

Adolfo García Veytia (@puerco) (10:16)
Yeah, I think it’s important for applicants to understand who is going to be reviewing your papers. There are many kinds of conferences. And ours, even though, of course, there’s a commercial side behind it, because you have to sustain the event… Especially in the Linux Foundation conferences, I feel

we put a lot of effort into making the conferences really community events. And I would like to distinguish, really make a clear cut, between academic conferences, purely trade show conferences, and these community events. Especially in academia, there’s this hierarchical view of peers

assessing what you’re doing. In pure trade show conferences, it’s mostly pay to play, I would say. And when you get down to community events, especially if you’ve ever applied to present or submit papers to the other kinds of conferences, you should expect completely different things. It’s easy to forget that the people looking at your work, at your proposals, at your ideas are very, very close and very, very similar to you.

So don’t expect to be talking to some higher being that understands things much better than you. First of all, it’s not one person, it’s all of us reading your CFPs. Keeping that in mind, what you need to consider when submitting is: what makes my proposal unique? I think that’s a key question. And we can talk more about that in the later topics, but to me, when I understood that it was sometimes even my friends reviewing my proposals, it made it so much easier.

Stacey (12:20)
Yeah, I think that’s a really, really good point Puerco makes: knowing who reviews for whatever conference you’re submitting to. And I say this as if it’s a Linux Foundation event, right? Because those are the ones that I’ve been most involved with. The program committee members are from within the community. They submit an application to say, hey, yes, I would love to review talks. This is me volunteering my time to help out this conference. Maybe they’re not able to make the conference.

Maybe they are, maybe they’re also submitting a talk. But usually the panel of reviewers is like five, six, up to 10 people, I would say, depending on the size of the conference. So you’re getting a wide range of perspectives reading through your submissions. And I think that’s really important. When I’m trying to select the program committee, I think it’s really important to diversify as well, right? So have voices from all over – different backgrounds, different expertise, different genders, just as much variance as you can have within the program committee panel, I think also makes a difference with the CFP reviews themselves, right?

But that’s kind of how it’s set up: you pick these five to 10 people to review all of these CFPs, they usually have a week or something like that to review everything, and then they rate each one on a scale. And that’s how the program chairs then arrange the schedule, based off of all that feedback. You can make notes in each of the talks that you’re reviewing, put those in there, and that’s basically how they’re all chosen. They’re ranked and they have notes, right, within that system.

Sally Cooper (14:08)
Wow, this is really educational. Thank you so much. For folks that are staring at a CFP right now, because there are some coming up, and I think we’re going to get into that, let’s get practical. What makes a strong abstract? How technical is too technical? How much storytelling belongs in a CFP? And what are some red flags that you might see in submissions?

Adolfo García Veytia (@puerco) (14:34)
So, the first big no-no in community events is: don’t pitch your product, even if you’re trying to disguise it as a community talk. You have to keep in mind that reviewers have a lot of work in front of them. There are all sorts of reviewers, but usually, as a reviewer, you see that folks put a lot of effort into crafting their proposals.

If you pitch your product, which is against the rules in most community conferences, the reviewer will instantly mark your proposal down. We can sniff it right away. You have to understand that for us, the goal is to get invalid proposals out of the way as soon as possible. If it is a product pitch, just don’t.

And then the next one is: you have to be clear and concise in the first paragraph, or even the first sentence. When a reviewer reads your proposal, make sure that the first paragraph gives them the idea: I’ll talk about this, and it’s going to inspect the problem from this side, or whatever. Give me that idea. Then you can develop the idea a little bit more in the next couple of paragraphs, but make sure that the idea of the talk is delivered right away. I have more, but I don’t know, Stacey, if you want to.

Stacey (16:20)
Yeah, no, I think that’s really good advice. I would say, whatever conference you’re submitting to, having been on so many different program committees, I’ve seen the same talk submitted to every conference that has an open CFP, regardless of whether the talk is specific to that conference or not. So I think key number one is to make sure that what you’re submitting fits within the conference itself.

I think not doing a product pitch is key – especially within an open source community, open CFP, right? Those are only for open source, for non-product pitches. I think Puerco makes a really good point with that. But, you know, is this conference that I’m submitting this talk to higher level? Is it super technical? And adjusting for those differences, right? A lot of times you’ll find in the CFPs that there is room to submit a beginner level, an intermediate level, an advanced level, but look at the conference description and the categories and things like this; you want to be very specific when you’re writing your CFP. You could sometimes reuse the same CFP you’ve submitted to another conference, but you want to tailor it to each specific conference that you are submitting for.

Don’t just submit the same talk to five different conferences because they are unique, they are specific and you want to make sure that if you want your talk accepted, these are the little changes that make a big difference on really getting down to the brass tacks of what that conference is about and what they’re really looking for. So I always have to, when I’m writing something and when I’m looking at a conference to write it for, I have the CFP page up, I have the about page up for that conference and I’m making sure that it fits within what they’re asking me for, really.

Adolfo García Veytia (@puerco) (18:20)
Yeah. And I just remembered another one. And this happens most in the bigger ones, like the KubeCons and so on. Don’t try to slop your way into the conference. I mean, I’d rather see a proposal with bad English or typos than something that was generated with AI. And I’ll tell you why.

It’s not because of, like, pure hate of AI or whatever, no. The problem with running your proposal through an LLM is that, most of the time, and you have to keep in mind this especially in the big conferences, you will be submitting a proposal about a subject that probably other people will also be trying to talk about. And what will get you picked is your capability of expressing, of getting into the problem in a unique way, your personality, all of those things.

When you run the proposal through the LLM, it just erases them. All of that personal uniqueness that you can give it will just be removed. And then it’ll be just like looking at a hollow doll of the person, and you will not stand out.

Stacey (19:38)
Yeah, I agree completely – and…is it a terrible thing to have AI help you with some of the editing? No, not at all. But write your proposal first. Write it from your heart. Write it from your point of view. Write it from your angle. But do not create it in AI, in the chatbots. Create it from yourself first, and then ask for editing help. That’s fine.

I think a lot of us do that and a lot of people out there are using it for that extra pair of eyes. Do I sound crazy here? Does this make any sense? I don’t know how to word this one particular sentence. That’s fine. But yeah, don’t start that way.

Adolfo García Veytia (@puerco) (20:19)
Exactly. I mean, and just to make it super clear, it’s not that… Especially for people whose first language is not English, like me, I of course use the help of some of those things to at least not introduce many typos or whatnot. But just as Stacey said, don’t create it there.

Sally Cooper (20:41)
This is great advice. Thank you both so much. Okay. How about getting accepted for a keynote? Your KubeCon keynote really stood out. It was technical. It was really funny, memorable, engaging. How does preparing a keynote differ from a regular talk?

Stacey (21:03)
Well, I want to start off by saying that we didn’t know, we weren’t submitting our talk for a keynote, right? We didn’t even know that that was like in the realm of possibility that could happen for KubeCon North America. We just submitted a talk that we thought would be fun, would be good, would give like, you know, some real world kind of vibes and that we wanted to have fun and we wanted to, you know, create a fun yet educational talk.

We had literally no idea that we could possibly have that talk accepted as a keynote. I didn’t know that. And this was my first real big talk. So it was a complete shock to me. I don’t know if you have other thoughts about that, but…

Adolfo García Veytia (@puerco) (21:50)
Yeah, it sort of messes with your plans, because you had the talk planned for, say, 35 minutes, and then you have 15, and you already had like 10 times more jokes than could fit into the 35 minutes. So, well… and then there are also, of course, all of those things that we talked about, like getting nervous. Well, they not only come back, but they multiply in a huge way. I mean, you’ve been there. I don’t know. You get over it.

Stacey (22:28)
I would also say that once we found out that our talk was accepted, at first we were like, yay, our talk got accepted. And then I think it was a few days later, they were like, no, no, your talk is now a keynote. So we freaked out, right? We had our little moment of panic. But then we just worked on it. And we worked on it, and we worked on it, and we worked on it, right? So not waiting till the last minute, I would say, to prep your talk.

But I think my main goal with this talk… and I have to give so much credit to Puerco, because he’s such a good storyteller and he does it in such a humorous but really technical and sound way. And we worked on this script. We wrote out an entire script because we only had 15 minutes. We went from a 25-minute talk to a 15-minute talk.

And so… pacing was really important, storytelling was really important, but also being funny was something that I really wanted us to have, which Puerco was really good at too. And trying to squash all of these things down into 15 minutes was really tough. But I think that’s important to remember about keynotes versus talks: keynotes are more about, what is the experience of this talk? Versus, let’s get down to really technical details, right? You can do a technical talk that’s 25, 35, 45 minutes, but a keynote is different. People aren’t going to remember anything from a keynote if you’re getting too deep in the weeds, right? So that was my focus. And I don’t know, Puerco, if you have anything else to add to that.

Adolfo García Veytia (@puerco) (24:10)
Yeah, the other thing is that the audience is so much bigger, so your responsibility just grows, especially to deliver, right? So as Stacey said, we actually wrote the script and rehearsed, online and in person, before the conference. And the experience at the conference is very different too, because you have to show up early, and you have to do a rehearsal in the days before your actual talk. And that said, it’s not like it went perfect. We still fumbled here and there and messed up some of the details and the pacing and whatnot. But, I don’t know, at least in our case, it was about having fun and trying to get some of that fun across to the attendees.

Sally Cooper (25:01)
Yeah, you really did. It was so fun. I think that’s what stood out.

Okay, one of the biggest barriers to submitting a CFP isn’t skill, it’s confidence. So what would you say to someone who feels like, I’m not expert enough, I don’t know if I have permission to do this? You know, how do they deal with that? How do you personally deal with imposter syndrome? And why is it important to make sure that those new and diverse voices do submit a CFP?

Adolfo García Veytia (@puerco) (25:27)
Oh, I’m an expert. So the first thing to remember, kids, is that imposter syndrome will never go away. In fact, you don’t want it to ever go away. Because imposter syndrome tells you something very, very important, and that is that you are being critical of yourself, of your work, of your ideas. And if you ever stop doing that…

It means, one, you don’t really understand the problem, or the vastness of the problem, that you’re trying to speak about in your talk. And the other is that you will stop looking for new and innovative ideas. So no matter where you get to, that imposter syndrome will always be with you.

Stacey (26:20)
I agree. I don’t think it ever goes away. I feel like, you know, I was an imposter at the keynote. Absolutely was, right? Like, I didn’t know what the heck I was doing. I didn’t know what the heck I was saying half the time. I mean, I tried to memorize my lines and do the right thing and come off as this expert. I never, ever feel like an expert about anything, right? Unless I’m talking, I guess, about my cats or my kid or something.

Adolfo García Veytia (@puerco) (26:47)
Yeah, exactly.

Stacey (26:49)
But yeah, I think that’s it, yeah: you’re pushing yourself to grow, and that’s a good thing, right? So if you feel like an imposter, you know, that’s okay. We all feel like that.

Adolfo García Veytia (@puerco) (27:04)
Yeah. And the other very important thing is to think about what you are proposing to talk about in your talk. It’s supposed to be new cutting-edge stuff, something interesting, something unique. So it’s okay to feel that way, because it’s a problem that you’re still researching, that you’re trying to understand. Especially, think about it this way.
If you propose any subject for your talk, anybody that goes there is more or less assuming that they want to know and learn more about it. If you feel confident enough to speak about it, people will respond with a willingness to attend your talk. That means you are already a little bit of a level above, because you’ve done that research, you’ve done that in-depth dive into the subject. So it’s fine.

It’s fine to feel it. I realized that it’s a natural thing.

Stacey (28:05)
And most of the people in the audience are there to support you, to cheer you on, and are not gonna harp on you or say, oh gosh, you messed up this thing or that thing. They’re really there to give you kudos and really support you and be willing to hear and listen to what you have to say.

Sally Cooper (28:25)
Love that. Okay, let’s close the advice portion with a quick round of CFP tips, rapid-fire style. I’m going to go back and forth so each person can answer. Stacey, we’ll start with you. One thing every CFP should do.

Stacey (28:43)
I mean, get to the point as quickly as you possibly can. That would be my thing, right?

Sally Cooper (29:48)
Love it. Puerco, one thing people should stop doing in CFPs.

Adolfo García Veytia (@puerco) (28:55)
Stop trying to blend or to mimic what you think the industry or your community wants from you. Represent. Always show up who you are, where you came from. That is super valuable and that’s why people will always want to have you as part of their program.

Sally Cooper (29:13)
Stacey, one piece of advice you wish you’d received earlier.

Stacey (29:18)
Gosh, I would say rejection is normal and not personal. I wish someone had told me that earlier, but that is a big one. Speakers get rejected all the time, right? It’s not about your worth. It’s about program balance, timing, and fit. So keep submitting.

Sally Cooper (29:39)
Okay, Puerco and Stacey, you both got famous after this. Puerco: selfie or autograph?

Adolfo García Veytia (@puerco) (29:44)
Selfie with a crazy face, at least get your tongue out or something.

Sally Cooper (29:50)
Stacey. KubeCon or KoobCon?

Stacey (29:54)
Oh gosh, I feel like this is like JIFF or GIF. And I’m in the GIF camp, by the way. I say KubeCon, even though I know it’s “Coo”-bernetes, I still say CubeCon, so.

Adolfo García Veytia (@puerco) (30:07)
CubeCon, please.

Sally Cooper (30:09)
Okay, before we wrap up, Stacey, as the OpenSSF Community Manager, can you share some upcoming CFPs and speaking opportunities people should keep an eye on?

Stacey (30:19)
Yeah, so Open Source Summit North America is a pretty large event. I think it’s taking place in Minneapolis in May this year. There are multiple tracks and lots of opportunities for different types of talks. The CFP is currently open right now, but it does close February 9th. So go and check out the Linux Foundation Open Source Summit North America for that one.

We also have OpenSSF Community Days, which are typically co-located events at Open Source Summit North America. And these are events that we hold around the world, but honestly, they’re perfect for first-time speakers as well. They’re smaller, they’re more intimate, and the community is super supportive. Our CFP deadline for Community Day North America is February 15th. So go ahead and search for that online. You can find them, and we’ll put the links in the description of this podcast so you can find that.

And then be on the lookout for key conferences later in the year as well. KubeCon North America will be coming up later, and Open Source Summit Europe is coming up later in the year. So be on the lookout for those. Within the security space, there are also a lot of BSides conferences, as well as KCDs, which are Kubernetes Community Days, and DevOps Days.

If you’re in our OpenSSF Slack, we have a #cfp-announce channel where we try to promote and put out as many CFPs as we can, to let people know that if you’re in our community and you want to submit talks regarding some of our projects or working groups, or just OpenSSF in general, that CFP Announce channel is really a great place to keep checking.

Sally Cooper (32:13)
Amazing. Thank you both so much, not just for the insights, but for really making the CFP process feel more approachable and human. If you’re listening to this and you’ve been on the fence about submitting a CFP, let this be your sign. We really need your voice and thank you both so much.

Stacey (33:32)
Thank you.

Adolfo García Veytia (@puerco) (33:33)
Thank you.

OpenSSF Newsletter – January 2026

By Newsletter

Welcome to the January 2026 edition of the OpenSSF Newsletter. This issue highlights new research, community priorities, and upcoming events across the open source security ecosystem.

TL;DR:

📊 2026 Cyber Resiliency Survey → Measure awareness of the CRA

🧭 OpenSSF 2026 Themes → What’s ahead and how to get involved

🔎 OSS Africa, VEX, AI & OSPS Baseline → Practical blogs and podcast highlights

🌍 Events & Community → GVIP Summit, EU Policy Summit, FOSDEM, Open Source SecurityCon Europe, CFPs, and project updates

OpenSSF and Linux Foundation Research: 2026 Cyber Resiliency Survey

As cybersecurity legislation such as the EU Cyber Resilience Act (CRA) takes effect, open source communities are beginning to feel its impact, from maintainers and contributors to organizations that rely on open source every day. Building on last year’s inaugural study, Linux Foundation Research and OpenSSF are again inviting the community to share perspectives through a new survey focused on awareness and readiness for cybersecurity regulation.

Your perspective matters. By participating, you help strengthen shared understanding, surface real community needs, and support the open source ecosystem as it navigates emerging regulatory challenges. Take the Survey.

OpenSSF at FOSDEM 2026: From Policy to Practical Security

OpenSSF is heading to Brussels for FOSDEM 2026 and Open Source Week, building on last year’s momentum around practical open source security, CRA readiness, and community-driven solutions. Expect a strong presence across policy and technical devrooms, a joint booth with Linux Foundation Europe (K2-A-03), and active participation in key events like the GVIP Summit and EU Open Source Policy Summit. The focus this year: turning regulation and security best practices into real, usable tooling and guidance for maintainers and projects. Read the blog.

OpenSSF’s 2026 Themes: A Community Roadmap for Securing the Future of Open Source

Curious about what security topics will shape the open source world in 2026 and how you can be part of it? Read about OpenSSF’s quarterly themes from AI and ML security to vulnerability transparency, global policy alignment, and Baseline adoption. This blog also highlights key events, community activities, and how to get involved. Read more.

Signal in the Noise: An Industry-Wide Perspective on the State of VEX

Key stakeholders, Aubrey Olandt (Red Hat), Brandon Lum (Google), Charl de Nysschen (Google), Christoph Plutte (Ericsson), Georg Kunz (Ericsson), Jonathan Douglas (Microsoft), Jautau “Jay” White (Microsoft), Martin Prpič (Red Hat), and Rao Lakkakula (Microsoft), look at how VEX is developing across the software industry. VEX provides structured, machine-readable statements about whether a vulnerability affects a product. It can reduce false positives and cut down the workload for security teams, but adoption is still uneven. This report reviews the main VEX formats (CSAF, OpenVEX, CycloneDX, and SPDX) and highlights gaps in tooling, trust, and distribution. Read more.
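
To make the idea of a machine-readable VEX statement concrete, here is a small illustrative sketch in Python that builds and prints a minimal OpenVEX-style document. The CVE identifier, package URL, and author are made-up placeholders, and the field names follow the published OpenVEX conventions as we understand them; consult the OpenVEX specification (or the CSAF, CycloneDX, and SPDX equivalents) for the authoritative schemas rather than treating this as a reference implementation.

```python
import json
from datetime import datetime, timezone

# Illustrative OpenVEX-style document. The CVE, package URL, and author are
# placeholders, not taken from the report above; check the OpenVEX spec for
# the current context URI and full schema.
vex_document = {
    "@context": "https://openvex.dev/ns/v0.2.0",
    "@id": "https://example.com/vex/example-2026-001",
    "author": "Example Security Team",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "version": 1,
    "statements": [
        {
            # The vulnerability this statement is about.
            "vulnerability": {"name": "CVE-2026-0001"},
            # The product(s) the statement applies to, identified by package URL.
            "products": [{"@id": "pkg:pypi/example-lib@1.2.3"}],
            # The key signal for consumers: this product is not affected...
            "status": "not_affected",
            # ...and why, so downstream scanners can suppress the false positive.
            "justification": "vulnerable_code_not_present",
        }
    ],
}

print(json.dumps(vex_document, indent=2))
```

A statement like this is what lets a security team automatically filter out findings that do not actually apply to a shipped product, which is the workload reduction the report describes.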

Catching Malicious Package Releases Using a Transparency Log

In this guest blog from Trail of Bits, learn how transparency logs like Rekor, combined with tools such as rekor-monitor, help package maintainers spot tampering and unauthorized signatures in real time. With support from OpenSSF, new improvements make monitoring easier, more reliable, and ready for production, an important step toward securing the open source software supply chain.

Read the full blog to see how transparency logs work, why they matter, and what’s coming next.
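
As a rough sketch of the underlying idea (not the rekor-monitor implementation itself), the Python snippet below polls the public Rekor instance for its current checkpoint and reports when the tree has grown, which is the point at which a real monitor would fetch and inspect the new entries. The endpoint path and response field names are assumptions based on the public Rekor REST API, and a production monitor would also verify consistency proofs between checkpoints and match entries against the identities a maintainer cares about; see the rekor-monitor documentation for the supported workflow.

```python
import json
import time
import urllib.request

REKOR_URL = "https://rekor.sigstore.dev"  # public Sigstore transparency log


def get_checkpoint():
    """Fetch the log's current tree size and root hash.

    Endpoint and field names are based on the public Rekor REST API
    (GET /api/v1/log); verify against current docs before relying on them.
    """
    with urllib.request.urlopen(f"{REKOR_URL}/api/v1/log", timeout=30) as resp:
        info = json.load(resp)
    return info["treeSize"], info["rootHash"]


def watch(poll_seconds=300):
    """Naive monitor loop: report whenever new entries land in the log.

    A real monitor additionally verifies consistency proofs between
    checkpoints and checks whether any new entry claims to be signed by
    this project's identity.
    """
    last_size, _ = get_checkpoint()
    print(f"starting at tree size {last_size}")
    while True:
        time.sleep(poll_seconds)
        size, root = get_checkpoint()
        if size > last_size:
            print(f"{size - last_size} new entries (root hash {root[:12]}...)")
            # A full monitor would fetch entries [last_size, size) here and
            # flag any unexpected signatures over this project's artifacts.
            last_size = size


if __name__ == "__main__":
    watch()
```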

AI, Software Development, Security, Tips, and the Future (Part 1 & 2)

How is AI really changing software development today? In “AI, Software Development, Security, Tips, and the Future (Part 1)”, David A. Wheeler notes that AI use during software development has become the norm because “productivity is king,” even though AI-generated results are frequently wrong, and discusses the security risks around development environments and insecure generated code. In Part 2, he continues by offering practical tips on how developers can better use AI, touches on licensing and “vibe coding,” and looks toward the future, explaining that AI won’t replace developers anytime soon, but will increase both attack and defense capabilities in software security. If you haven’t read both blogs yet, they provide a clear, realistic view of how AI is affecting software today and what developers should be thinking about next.

Your Guide to the OpenSSF OSPS Baseline for More Secure Open Source Projects

What does good security actually look like for open source projects? This new blog walks through the community-developed OSPS Baseline, a catalog of practical security controls that helps projects understand expectations, improve over time, and meet users where they are. With FOSS in up to 96% of modern codebases and relied on across nearly every industry, the blog explains why shared security practices matter and how the Baseline connects to standards like NIST SSDF, the EU Cyber Resilience Act, and ISO 27001. It also links to keynotes, a tech talk, a podcast, a real project case study, and FAQs so you can see how the Baseline works in practice. Read the blog.

Collecting Badges, Building Bridges: Representing OpenSSF and Linux Foundation Across Europe

How does it feel to represent a global open source security community across Europe? In his blog, Madalin Neag reflects on attending key open source, cybersecurity, and standardization meetings on behalf of OpenSSF throughout 2025. He describes how each conference badge represents conversations, collaboration, and the growing understanding that open source security is becoming an essential part of Europe’s cybersecurity future. The blog highlights the connections formed between maintainers, policymakers, standards groups, and community leaders, and shows how work in open source security bridges policy and practice across many different environments. Read more.

Strengthening Open Source Security Through Community: Introducing OSSAfrica

OSSAfrica is a new community-led initiative working to strengthen open source security across Africa by connecting contributors, maintainers, developers, and security practitioners. Operating as a Special Interest Group under the OpenSSF BEAR Working Group, OSSAfrica focuses on community building, security awareness, locally relevant solutions, and creating clear pathways for African contributors to engage in global open source security efforts. Learn why this work matters, what’s being built, and how you can get involved. Read the blog.

Preserving Open Source Sustainability While Advancing CRA Compliance

This blog looks at how voluntary security attestation models under the EU Cyber Resilience Act could unintentionally shift risk and responsibility onto open source developers. It argues that CRA compliance should stay focused on downstream manufacturers and rely on automation and verifiable security metadata rather than upstream attestations that could undermine open source sustainability.

What’s in the SOSS? An OpenSSF Podcast:

#47 – S2E24 Teaching the Next Generation: Software Supply Chain Security in Academia with Justin Cappos

This episode goes inside academia with NYU’s Justin Cappos, who explains why universities struggle to teach software supply chain security and how his course is producing highly skilled professionals. He and Yesenia Yser talk about curriculum, real-world open source collaboration, and how the Linux Foundation’s Academic Computing Acceleration Program could reshape security education.

#48 – S2E25 2025 Year End Wrap Up: Celebrating 5 Years of Open Source Security Impact!

CRob and Yesenia close out the year with a special wrap-up celebrating OpenSSF’s fifth anniversary and a huge year in open source security. They look back at new free training courses, highlights from the DARPA AI Cyber Challenge, standout interviews, major projects such as OSPS Baseline and AI model signing, and community conversations across SBOMs and supply chain security. With nearly 12,000 downloads and big plans for Season 3, this episode is a fun look at how far the community has come and what’s ahead in 2026.

#49 – S3E1 Why Marketing Matters in Open Source: Introducing Co-Host Sally Cooper

In this Season 3 premiere, What’s in the SOSS? welcomes Sally Cooper as an official co-host. Sally shares her path from technical training and documentation to marketing leadership at OpenSSF, and explains why marketing matters in open source communities. Joined by CRob and Yesenia Yser, the conversation explores personas, personal branding, trust, and how marketing helps great projects get discovered, supported, and sustained. The episode also offers a preview of OpenSSF’s 2026 marketing themes and practical ways for newcomers to get involved.

News from OpenSSF Community Meetings and Projects:

In the News:

Meet OpenSSF at These Upcoming Events!

Connect with the OpenSSF Community at these key events:

Ways to Participate:

There are a number of ways for individuals and organizations to participate in OpenSSF. Learn more here.

You’re invited to…

See You Next Month! 

We want to get you the information you most want to see in your inbox. Missed our previous newsletters? Read here!

Have ideas or suggestions for next month’s newsletter about the OpenSSF? Let us know at marketing@openssf.org, and see you next month! 

Regards,

The OpenSSF Team