What’s in the SOSS

Feb 09

What’s in the SOSS? Podcast #52 – S3E4 AIxCC Part 2 – From Skeptics to Believers: How Team Atlanta Won AIxCC by Combining Traditional Security with LLMs

By OpenSSF Podcast

Summary

In this 2nd episode in our series on DARPA’s AI Cyber Challenge (AIxCC), CRob sits down with Professor Taesoo Kim from Georgia Tech to discuss Team Atlanta’s journey to victory. Kim shares how his team – comprised of academics, world-class hackers, and Samsung engineers – initially skeptical of AI tools, underwent a complete mindset shift during the competition. He shares how they successfully augmented traditional security techniques like fuzzing and symbolic execution with LLM capabilities to find vulnerabilities in large-scale open source projects. Kim also reveals exciting post-competition developments, including commercialization efforts in smart contract auditing and plans to make their winning CRS accessible to the broader security community through integration with OSS-Fuzz.

This episode is part 2 of a four-part series on AIxCC:

Listen on Apple Podcasts Listen on Spotify Listen on Overcast Listen on Pocket Casts

Conversation Highlights

00:00 – Introduction
00:37 – Team Atlanta’s Background and Competition Strategy
03:43 – The Key to Victory: Combining Traditional and Modern Techniques
05:22 – Proof of Vulnerability vs. Finding Bugs
06:55 – The Mindset Shift: From AI Skeptics to Believers
09:46 – Overcoming Scalability Challenges with LLMs
10:53 – Post-Competition Plans and Commercialization
12:25 – Smart Contract Auditing Applications
14:20 – Making the CRS Accessible to the Community
16:32 – Student Experience and Research Impact
20:18 – Getting Started: Contributing to the Open Source CRS
22:25 – Real-World Adoption and Industry Impact
24:54 – The Future of AI-Powered Security Competitions

Episode Links

Transcript

Intro music & intro clip (00:00)

CRob (00:10.032)
All right, I’m very excited to talk to our next guest. I have Taesoo Kim, who is a professor down at Georgia Tech, also works with Samsung. And he got the great opportunity to help shepard Team Atlanta to victory in the AIxCC competition. Thank you for joining us. It’s a really pleasure to meet you.

Taesoo Kim (00:35.064)
Thank you for having me.

CRob (00:37.766)
So we were doing a bunch of conversations around the competition. I really want to showcase like the amazing early cutting edge work that you and the team have put together. So maybe, can you tell us what was your team’s approach? What was your strategy as you were kind of approaching the competition?

Taesoo Kim (00:59.858)
that’s a great question. Let me start with a little bit of a background.

CRob (00:)
Please.

Taesoo Kim (00:59)
Ourself, our team, Atlanta, we are multiple group of people in various backgrounds, including me as academics and researchers in security area. We also have world-class hackers in our team and some of the engineers from Samsung as well. So we have a little bit of background in various areas so that we bring our expertise.

Taesoo Kim (01:29.176)
to compete in this competition. It’s a two-year journey. We put a lot of effort, not just engineering side, we also tinkled with lot of research approach that we’ve been working on this area for a while. Said that, I think most important strategy that our team took is that, although it’s an AI competition…

CRob (01:51.59)
Mm-hmm.

Taesoo Kim (01:58.966)
…meaning that they promote the adoption of LLM-like techniques, we didn’t simply give up in traditional analysis technique that we are familiar with. It means we put a lot of effort to improve, like fuzzing is one of the great dynamic testing for finding vulnerability, and also traditional techniques like symbolic executions and concocted executions, even directed fuzzing. Although we suffer from lot of scalability issues in those tools, because one of themes of AIxCC is to find bugs in the real world.

And large-scale open source project. It means most of the traditional techniques do not scale in that level. We can analyze one function or a small number of code in the source code repository when it comes to, for example, Linux or Nginx. This is crazy amount of source code. Like building a whole graph in this gigantic repository itself is extremely hard. So that we start augmenting LLM in our pipeline.

One of the great examples of fuzzing is that when we are mutating input, although we leverage a lot of mutation techniques in the fuzzing side, we also leverage the understanding of LLM in a way that LLM also navigates the possibility of mutating places in the source code in a way that they can generate some of the dictionaries, providing vocabulary for fuzzer, and realize the input format that they have to mutate as well. So lot of augmentations of using LLM happen all over the places in traditional software analysis technique that we are doing.

CRob (03:43.332)
And do you feel that combination of using some of the newer techniques and fuzzing and some of the older, more traditional techniques, do you think that that was what was kind of unique and helped push you over the victory line and the cyber reasoning challenge?

Taesoo Kim (04:01.26)
It’s extremely hard to say which one contributed the most during the competition. But I want to emphasize that finding bugs in the location of the source code versus formulating input that trigger those vulnerability in our competition, what we call as proof of vulnerability. These two tasks are completely different. You can identify many bugs.

But unfortunately, in order to say this is truly the bug, you have to prove by yourself by showing or constructing the input that triggered the vulnerability. The difficulty of both tasks are, I would say people do not comprehend the challenges of formulating input versus finding a vulnerability in the source code. You can pinpoint without much difficulty the various places in the source code.

But in fact, that’s an easier job. In practice, more difficult challenge is identifying the input that actually reach the place that you like and trigger the vulnerability as a result. So we spend much more time how to construct the input correctly to show that we really prove the existence of vulnerability in the source.

CRob (05:09.692)
Mm-hmm.

CRob (05:22.94)
And I think that’s really a key to the competition as it happened versus just someone running LLM and scanners kind of randomly on the internet is the fact that you all were incented to and required to develop that fix and actually prove that these things are really vulnerable and accessible.

Taesoo Kim (05:33.718)
Exactly.

Taesoo Kim (05:42.356)
Exactly. That also highlights what practitioners care about. So you ended up having so many false positives in the security tools. No one cares. There are a of complaints about why we are not using security tools in the first place. So this is one of the important criteria of the competition. one of the strengths in traditional tools like buzzer and concord executor, everything centers around to reduce the false positives. The region people.

CRob (05:46.192)
Yes.

Taesoo Kim (06:12.258)
take Fuzzer in their workflow. So whenever Fuzzer says there is a vulnerability, indeed there is a vulnerability. There’s a huge difference. So that we start with those existing tool and recognize the places that we have to improve so that we can really scale up those traditional tool to find vulnerability in this large scale software.

CRob (06:36.568)
Awesome. As you know, the competition was a marathon, not a sprint. So you were doing this for quite some time. But as the competition was progressing, was there anything that surprised you in the team and kind of changed your thinking about the capabilities of these tools?

Taesoo Kim (06:51.502)
Ha

Taesoo Kim (06:55.704)
So as I mentioned before, we are hackers. We won Defqon CTF many times and we also won F1 competition in the past. So by nature, we are extremely skeptical about AI tool at the beginning of the competition. Two years ago, we evaluated every single existing LLM services with the benchmark that we designed. We realized they are all not usable at all.

CRob (07:09.85)
Mm-hmm.

Taesoo Kim (07:24.33)
not appropriate for the competition. Instead of spending time on improving those tools, which we felt like inferior at the beginning, so our motto at that time, our team, don’t touch those areas. We’re going to show you how powerful these traditional techniques are. So that’s why we progressed the semi-final. We did pretty well. We found many of the bugs by using all the traditional tools that we’ve been working on. But like…

Immediately after semifinal, everything changed. We reevaluated the possibility of adopting LLM. At that time, just removing or obfuscating some of the tokens in the repository, the LLM couldn’t even reason anything about. But suddenly, near or around semifinal, something happened. We realized that even after we inject or

If you think of it this way, there is a token, and you replace this token with meaningless words. LLM previously all confused about all these synthetic structures of the source code, but now, or on semifinal, they really understand. Although we tried to fool many times, you really catch up the idea, which is a source code that they never saw before, never used in the training, because we intentionally create this source code for the evaluation.

We start realizing that we actually understand. We shock everybody. So we start realizing that there are so many places, if that’s the case, there are so many places that we can improve. Right? So that’s the moment that we change our mindset. So now everything about LLM, everything about the new Asian architectures, so that we ended up putting humongous amount of efforts creating various architectures of Asian design that we have.

Also, we replaced some of software analysis techniques with LLM as well, surprisingly. For example, symbolic execution is a good example. It’s extremely hard to scale. Whenever you execute one instruction at a time, you have to create the constraint around them. But one of the big challenges in real-world software, there are so many, I would say, hard-to-analyze functions exist. Meaning that, for example, there is a

Taesoo Kim (09:46.026)
Even NGINX as an example, we thought that they probably compared the string to string at a time. But the way they perform string compare in NGINX, they map this string or do the hashing so that they can compare the hash value. Fudger, another symbolic executor, is extremely bad at those. If you hit one hashing function, you’re screwed. There are so many constraints that there is no way we can revert back by definition.

There’s no way. But if you think about how to overcome these situations by using LLM, the LLM can recognize that this is a hashing function. We don’t actually have to create a constraint around, hey, what about we replace with identity functions? It’s something that we can easily divert by using symbolic execution. So then we start recognizing the possibility of LLM role in the symbolic execution. Now see that.

Smaller execution can scale to the large software right now. So I think this is a pretty amazing outcome of the competition.

CRob (10:53.11)
Awesome. So again, the competition completed in August. So what plans do you have? What plans does the team have for your CRS now that the competition’s over?

Taesoo Kim (10:58.446)
Thank

Taesoo Kim (11:02.318)
I think that’s a great question. Many of tech companies approach our team. Some of them recently joined, other big companies. And many of our students want to quit the PhD program and start a company. For good reasons, right?

CRob (11:14.848)
I bet.

Taesoo Kim (11:32.766)
One of the team, my four PhD students recently formed and looking for commercialization opportunity. Not in the traditional cyber infrastructure we are looking at through the DARPA, but they spotted the possibility in smart contracts. that smart contracts and modernized financial industries like stable coins and whatnot

where they can apply the AI XTC like techniques in finding vulnerability in those areas. So that instead of analyzing everything by human auditor, you can analyze everything by using LLM or agents and similar techniques that we developed for AI XTC so that you can reduce the auditing time significantly. In order to get some auditing in the smart contract, traditionally you have to wait for two weeks.

In the worst case, even months with a ridiculous amount of cost. Typically, in order to get one auditing for the smart contract, $20,000 or $50,000 per case. But in fact, you can reduce down the amount of auditing time by, I’ll say, a few hours by day. This speed, the potential benefit of achieving this speed is you really open up

CRob (12:40.454)
Mm-hmm.

CRob (12:47.836)
Wow.

Taesoo Kim (12:58.186)
amazing opportunity in this area. So you can automate the auditing, you can increase the frequency of auditing in the smart contract area. Not only that we thought there is a possibility for even more like compliance checkings of the smart contracts, there’s so many opportunities that we can play immediately by using ARCC systems. That’s the one area that we’re looking at. Another one is more traditional area.

CRob (13:00.347)
Mm-hmm.

Taesoo Kim (13:25.07)
what we call cyber infrastructure, like hospitals and some government sectors. They really want to analyze, but unfortunately, or fortunately though, there are other opportunities that in ARCC, we analyze everything by source code, but they don’t have access to them. So we are creating the pipeline that given a binary or execution only environment, how to convert them.

CRob (13:28.828)
Mm-hmm.

CRob (13:38.236)
Mm-hmm.

CRob (13:49.569)
Taesoo Kim (13:52.416)
in a way that we can still leverage the existing infrastructure that we have for AICC. More interestingly, they don’t have access to the internet when they’re doing pen testings or analyzing those, so that we start incorporating some of our open source model as part of our systems. These are two commercialization efforts that we’re thinking and many of my students are currently

CRob (13:57.67)
That’s very clever.

CRob (14:05.5)
Yeah.

CRob (14:13.564)
It’s awesome.

CRob (14:20.366)
And I imagine that this is probably amazing source material for dissertations and the PhD work, right?

Taesoo Kim (14:29.242)
Yes, yes. Last two years, we are purely focused on ARCC. Our motto is that we don’t have time for publication. It’s just win the competition. Everything is coming after. This is the moment that we actually, I think we’re going to release our Tech Report. It’s over 150 pages. Next week, around next week. So we have a draft right now, but we are still publishing.

CRob (14:39.256)
Yeah.

CRob (14:51.94)
Wow.

Taesoo Kim (14:58.51)
for publication so that other people not just like source code okay that’s great but you need some explanation why you did this many of the sources is for the competition right so that the core pieces might be a little bit different for like daily usage of normal developers and operator so we kind of create a condensed technical material for them to understand

Not only that, we have a plan to make it more accessible, meaning that currently our CRS implementation tightly bound to the competition environment. Meaning that we have a crazy amount of resources in Azure side, everything is deployed and better tested. But unfortunately, most of the people, including ourselves, we don’t have resources. Like the competition have about

80,000 cloud credit that we have to use. So no one has that kind of resource. It’s not like that, not if you’re not a company. But we want to apply this one for your project in the smaller scale. That’s what we are currently working on. So discarding all these competition dependent parameter from the source code, making more containable so that you can even launch our CRS in your local environment.

This is one of the big, big development effort that we are doing right now in our lab.

CRob (16:32.155)
That’s awesome. take me a second and thinking about this from the students perspective that participated. What kind of an experience was it getting to work with professors such as yourself and then actual professional researchers and hackers? What do you see the students are going to take away from this experience?

Taesoo Kim (16:53.846)
I think exposing to the latest model because we are tightly collaborating with this OpenAI and Gemini, we are really exposed to those latest model. If you’re just working on the security, not tightly working for LLM, you probably don’t appreciate that much. But through the competition, everyone’s mindset change. And then we spend time.

and deeply take a look in what’s possible, what’s not, we now have a great sense of what type of problem we have to solve, even in the research side. And now, suddenly, after this competition, every single security project, security research that we are doing at Georgia Tech is based on LLF. Even more surprising to hear that we have some decompilation project that we are doing, the traditional possible security research you can read.

CRob (17:42.448)
Ha ha.

Taesoo Kim (17:52.162)
binary analysis, malware analysis, decompilations, crash analysis, whatnot. Now everything is LLM. Now we realize LLM is much better at decompiling than traditional tools like IDEA and Jydra. So I think these are the type of research that we previously thought impossible. We’re probably not even thinking about applying LLM. Because we spend our lifetime working on decompiling.

CRob (17:53.68)
Mm.

CRob (17:59.068)
Yeah.

Taesoo Kim (18:22.318)
But at a certain point, we realized that LLM is just doing better than what we’ve been working on. Just one day. It’s a complete mind change. In traditional program analysis perspective, many things are empty completely. There’s no way you can solve it in an easier way. So they’re not spending time. That’s our typical mindset. But now, it works in practice, amazingly.

CRob (18:29.574)
Yeah.

Taesoo Kim (18:51.807)
how to improve what we thought previously impossible by using another one. It’s the key.

CRob (18:57.404)
That’s awesome. It’s interesting, especially since you stated initially when you went into the competition, you were very skeptical about the utility of LLMs. So that’s great that you had this complete reversal.

Taesoo Kim (19:04.238)
Thank

Yeah, but I think I like to emphasize one of the problems of LLM though, it’s expensive, it’s slow in traditional sense, you have to wait a few seconds or a few minutes in certain cases like reasoning model or whatnot. So tightly binding your performance with this performance lagging component in the entire systems is often challenging.

CRob (19:17.648)
Yes.

CRob (19:21.82)
Mm-hmm.

Taesoo Kim (19:39.598)
and then just talking. But another benefit of everything is text. There’s no proper API, just text. There’s no sophisticated way to leverage it, just text. I don’t know, you’re probably familiar with all these security issues, potentially with unstructured input. It’s similar to cross-site scripting in the web space. There’s so many problems you can imagine.

CRob (19:51.984)
Okay, yeah.

CRob (20:01.979)
Mm-hmm.

Taesoo Kim (20:08.11)
But as far as you can use in a well-contained manner in the right way, we believe there are so many opportunities we can get from it.

CRob (20:18.876)
Great. So now that your CRS has been released as open source, if someone from our community was interested in joining and maybe contributing to that, what’s the best way somebody could get started and get access?

Taesoo Kim (20:28.494)
Mm-hmm.

So we’re going to release non-competition version very soon, along with several documents, we call standardization effort that we and other teams are doing right now. So we define non-competition CRS interface so that you can tightly, as far as you implement those interface, our goal is to mainstream OSS browser together with Google team.

CRob (20:36.369)
Mm-hmm.

CRob (20:58.524)
Mm-hmm.

Taesoo Kim (20:59.086)
so that you can put your CRS as part of OSS Fuzz mainstream, so that we can make it much easier, so that everyone can evaluate one at a time in their local environment as part of OSS Fuzz project. So we’re gonna release the RFC document pretty soon through our website, so that everyone can participate and share their opinion, what are the features that they think we are missing, that we’d love to hear about.

CRob (21:03.74)
Thanks.

CRob (21:18.001)
Mm-hmm.

Taesoo Kim (21:26.502)
And then after that, a month period, we’re going to release our local version so that everyone can start using. And with a very permissive license, everyone can take advantage of the public research, including companies.

CRob (21:34.78)
Awesome.

CRob (21:42.692)
It’s, I’m just amazed. when I came into this, partnering with our friends at DARPA, I was initially skeptical as well. And as I was sitting there watching the finals announced, it was just amazing. Kind of this, the innovative innovation and creativity that all the different teams displayed. again, congratulations to your team, all the students and the researchers and everyone that participated.

Taesoo Kim (21:59.79)
Mm-hmm.

CRob (22:12.6)
Well done. Do you have any parting thoughts? know, as you’re think, as we move on, do you have any kind of words of wisdom you want to share with the community or any takeaways for people curious to get in this space?

Taesoo Kim (22:25.486)
Oh, regarding commercialization, one thing I also like to mention is that in Samsung, we already took the open source version of the CRS, start applying the internal project and open source Samsung project immediately after. So we started seeing the benefit of applying the CRS in the real world immediately after the competition. A lot of people think that competition is just for competition or show

CRob (22:38.108)
Mm-hmm.

Taesoo Kim (22:55.032)
But in fact, it’s not. Everyone in industry, including at Tropic Meta and OpenAI, they all want to adopt those technologies behind the scene. And Amazon, we also working together with Amazon AWS team so that they want to support the deployment of our systems in AWS environment as well. So everyone can just one click, they can launch the systems. And they mentioned there are several.

CRob (22:55.036)
Mm-hmm.

Taesoo Kim (23:24.023)
government-backed They explicitly request to launch our CRS in their environment.

CRob (23:31.1)
I imagine so. Well, again, kudos to the team. Congratulations. It’s amazing. I love to see when researchers have these amazing creative ideas and actually are able to add actual value. And it’s great to hear that Samsung was immediately able to start to get value out of this work. And I hopefully other folks will do the same.

Taesoo Kim (23:55.18)
Yeah, exactly. I think regarding one of wisdom or general advice in general is that this competition based innovation, particularly in academic or involvement like startups or not, because of this venue, so including ourselves and startup people and other team members put their life

on this competition. It’s an objective metric, head-to-head competitions. We don’t care about your background. Just win, right? There’s your objective score. Your job is fine and fix it, I think this competition really drives a lot of efforts behind the scene in our team. We are motivated because of this entire competition is represented in broader audience. I think this is really a way to drive the innovation.

CRob (24:26.46)
Mm-hmm.

CRob (24:32.57)
Yes.

CRob (24:36.709)
Mm-hmm.

Taesoo Kim (24:54.904)
to get some public attention beyond Alphi as well. So I think we really want to see other type of competition in this space. And in the longer future, you probably see based on the current trend, CTF competitions like that, maybe not just CTF, it’s Asian-based CTF, no human involved or the Asians are now attacking each other and solving CTF challenge.

CRob (24:58.524)
Excellent.

CRob (25:19.59)
Mm-hmm.

Taesoo Kim (25:24.846)
This is not a five-year no-vote. It’s going to happen in two years or shortly. Even in this year’s live CTF, one of the teams actually leveraged Asian systems and Asians actually solved the competition quicker than humans. So think we’re going to see those types of events and breakthroughs more often than

CRob (25:55.292)
I used to be a judge at the collegiate cyber competition for one of our local schools. And I think I see a lot of interesting applicability kind of using this as to help them to teach the students that you have an aggressive attacker is doing these different techniques and it’s able to kind of apply some of these learnings that you all have. It’s really exciting stuff.

Taesoo Kim (26:00.142)
Mm-hmm.

Taesoo Kim (26:15.47)
I think one of the interesting quote from, I don’t know who actually said, but in the AI space, someone mentioned that there will be one person, one billion market cap company appear because of LLN or because of AI in general. But if you see the CTF, currently most of the team has minimum 50 people or 100 people competing each other. We’re going to see very soon.

one person or maybe five people with the help of those AI tools and they’re going to compete. Or human are just assisting AI in a way that, hey, could you bring up the Raspberry Pi for me or set up so that human just helping LLN or helping AI in general so that AI can compete. So I think we’re going to see some interesting thing happening pretty soon in our company for sure.

CRob (26:59.088)
Mm-hmm. Yeah.

CRob (27:11.804)
I agree. Well, again, Taesoo, thank you for your time. Congratulations to the team. And that is a wrap. Thank you very much.

Taesoo Kim (27:22.147)
Thank you so much.

Dec 16

Love0

What’s in the SOSS? Podcast #47 – S2E24 Teaching the Next Generation: Software Supply Chain Security in Academia with Justin Cappos

By OpenSSF Podcast

NYU professor Justin Cappos joins the OpenSSF podcast to discuss why software supply chain security is missing from most university curricula -- and how hands-on, open source-first education can change…

Oct 21

Love0

What’s in the SOSS? Podcast #43 – S2E20 Building Trust in Open Source: Seth Larson’s Journey from Maintainer to Security Leader

By OpenSSF Podcast

Seth Larson, Security Developer-in-Residence at the Python Software Foundation, joins What’s in the SOSS? to discuss trust, documentation, and the evolution of secure-by-default practices in open source.

Oct 07

Love0

What’s in the SOSS? Podcast #41 – S2E18 The Remediation Revolution: How AI Agents Are Transforming Open Source Security with John Amaral of Root.io

By Jeff Diecks Podcast

Summary

In this episode of What’s in the SOSS, CRob sits down with John Amaral from Root.io to explore the evolving landscape of open source security and vulnerability management. They discuss how AI and LLM technologies are revolutionizing the way we approach security challenges, from the shift away from traditional “scan and triage” methodologies to an emerging “fix first” approach powered by agentic systems. John shares insights on the democratization of coding through AI tools, the unique security challenges of containerized environments versus traditional VMs, and how modern developers can leverage AI as a “pair programmer” and security analyst. The conversation covers the transition from “shift left” to “shift out” security practices and offers practical advice for open source maintainers looking to enhance their security posture using AI tools.

Listen on Apple Podcasts Listen on Spotify Listen on Overcast Listen on Pocket Casts

Conversation Highlights

00:25 – Welcome and introductions
01:05 – John’s open source journey and Root.io’s SIM Toolkit project
02:24 – How application development has evolved over 20 years
05:44 – The shift from engineering rigor to accessible coding with AI
08:29 – Balancing AI acceleration with security responsibilities
10:08 – Traditional vs. containerized vulnerability management approaches
13:18 – Leveraging AI and ML for modern vulnerability management
16:58 – The coming “remediation revolution” and fix-first approach
18:24 – Why “shift left” security isn’t working for developers
19:35 – Using AI as a cybernetic programming and analysis partner
20:02 – Call to action: Start using AI tools for security today
22:00 – Closing thoughts and wrap-up

Episode Links

Transcript

Intro Music & Promotional clip (00:00)

CRob (00:25)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF’s podcast where I talk to upstream maintainers, industry professionals, educators, academics, and researchers all about the amazing world of upstream open source security and software supply chain security.

Today, we have a real treat. We have John from Root.io with us here, and we’re going to be talking a little bit about some of the new air quotes, “cutting edge” things going on in the space of containers and AI security. But before we jump into it, John, could maybe you share a little bit with the audience, like how you got into open source and what you’re doing upstream?

John (01:05)
First of all, great to be here. Thank you so much for taking the time at Black Hat to have a conversation. I really appreciate it. Open source, really great topic. I love it. Been doing stuff with open source for quite some time. How do I get into it? I’m a builder. I make things. I make software been writing software. Folks can’t see me, but you know, I’m gray and have no hair and all that sort of We’ve been doing this a while. And I think that it’s been a great journey and a pleasure in my life to work with software in a way that democratizes it, gets it out there. I’ve taken a special interest in security for a long time, 20 years of working in cybersecurity. It’s a problem that’s been near and dear to me since the first day I ever had my like first floppy disk, corrupted. I’ve been on a mission to fix that. And my open source journey has been diverse. My company, Root.io, we are the maintainers of an open source project called Slim SIM (or SUM) Toolkit, which is a pretty popular open source project that is about security and containers. And it’s been our goal, myself personally, and as in my latest company to really try to help make open source secure for the masses.

CRob (02:24)
Excellent. That is an excellent kind of vision and direction to take things. So from your perspective, I feel we’re very similar age and kind of came up maybe in semi-related paths. But from your perspective, how have you seen application development kind of transmogrify over the last 20 or so years? What has gotten better? What might’ve gotten a little worse?

John (02:51)
20 years, big time frame talking about modern open source software. I remember when Linux first came out. And I was playing with it. I actually ported it to a single board computer as one of my jobs as an engineer back in the day, which was super fun. Of course, we’ve seen what happened by making software available to folks. It’s become the foundation of everything.

Andreessen said software will eat the world while the teeth were open source. They really made software available and now 95 or more percent of everything we touch and do is open source software. I’ll add that in the grand scheme of things, it’s been tremendously secure, especially projects like Linux. We’re really splitting hairs, but security problems are real. as we’ve seen, proliferation of open source and proliferation of repos with things like GitHub and all that. Then today, proliferation of tooling and the ability to build software and then to build software with AI is just simply exponentiating the rate at which we can do things. Good people who build software for the right reasons can do things. Bad people who do things for the bad reasons can do things. And it’s an arms race.

And I think it’s really both benefiting software development, society, software builders with these tremendously powerful tools to do things that they want. A person in my career arc, today I feel like I have the power to write code at a rate that’s probably better than I ever have. I’ve always been hands on the keyboard, but I feel rejuvenated. I’ve become a business person in my life and built companies.

And I didn’t always have the time or maybe even the moment to do coding at the level I’d like. And today I’m banging out projects like I was 25 or even better. But at the same time that we’re getting all this leverage universally, we also noticed that there’s an impending kind of security risk where, yeah, we can find vulnerabilities and generate them faster than ever. And LLMs aren’t quite good yet at secure coding. I think they will be. But also attackers are using it for exploits and really as soon as a disclosed vulnerability comes out or even minutes later, they’re writing exploits that can target those. I love the fact that the pace and the leverage is high and I think the world’s going to do great things with it, the world of open source folks like us. At the same time, we’ve got to be more diligent and even better at defending.

CRob (05:44)
Right. I heard an interesting statement yesterday where folks were talking about software engineering as a discipline that’s maybe 40 to 60 years old. And engineering was kind of the core noun there. Where these people, these engineers were trained, they had a certain rigor. They might not have always enjoyed security, but they were engineers and there was a certain kind of elegance to the code and that was people much like artists where they took a lot of pride in their work and how the code you could understand what the code is. Today and especially in the last several years with the influx of AI tools especially that it’s a blessing and a curse that anybody can be a developer. Not just people that don’t have time that used to do it and now they get to of scratch that itch. But now anyone can write code and they may not necessarily have that same rigor and discipline that comes from like most of them engineering trades.

John (06:42)
I’m going to guess. I think it’s not walking out too far on limb that you probably coded in systems at some point in your life where you had a very small amount of memory to work with. You knew every line of code in the system. Like literally it was written. There might have been a shim operating system or something small, but I wrote embedded systems early in my career and we knew everything. We knew every line of code and the elegance and the and the efficiency of it and the speed of it. And we were very close to the CPU, very close to the hardware. It was slow building things because you had to handcraft everything, but it was very curated and very beautiful, so to speak. I find beauty in those things. You’re exactly right. I think I started to see this happen around the time when JVM started happening, Java Virtual Machines, where you didn’t have to worry about Java garbage collection. You didn’t have to worry about memory management.

And then progressively, levels of abstraction have changed right to to make coding faster and easier and I give it more you know more power and that’s great and we’ve built a lot more systems bigger systems open source helps. But now literally anyone who can speak cogently and describe what they want and get a system and. And I look at the code my LLM’s produce. I know what good code looks like. Our team is really good at engineering right?

Hmm, how did it think to do it that way? Then go back and we tell it what we want and you can massage it with some words. It’s really dangerous and if you don’t know how to look for security problems, that’s even more dangerous. Exactly, the level of abstraction is so high that people aren’t really curating code the way they might need to to build secure production grade systems.

CRob (08:29)
Especially if you are creating software with the intention of somebody else using it, probably in a business, then you’re not really thinking about all the extra steps you need to take to help protect yourself in your downstream.

John (08:44)
Yeah, yeah. think it’s an evolution, right? And where I think of it like these AI systems we’re working with are maybe second graders. When it comes to professional code authoring, they can produce a lot of good stuff, right? It’s really up to the user to discern what’s usable.

And we can get to prototypes very quickly, which I think is greatly powerful, which lets us iterate and develop. In my company, we use AI coding techniques for everything, but nothing gets into production, into customer hands that isn’t highly vetted and highly reviewed. So, the creation part goes much faster. The review part is still a human.

CRob (09:33)
Well, that’s good. Human on the loop is important.

John (09:35)
It is.

CRob (09:36)
So let’s change the topic slightly. Let’s talk a little bit more about vulnerability management. From your perspective, thinking about traditional brick and mortar organizations, how have you seen, what key differences do you see from someone that is more data center, server, VM focused versus the new generation of cloud native where we have containers and cloud?

What are some of the differences you see in managing your security profile and your vulnerabilities there?

John (10:08)
Yeah, so I’ll start out by a general statement about vulnerability management. In general, the way I observe current methodologies today are pretty traditional.

It’s scan, it’s inventory – What do I have for software? Let’s just focus on software. What do I have? Do I know what it is or not? Do I have a full inventory of it? Then you scan it and you get a laundry list of vulnerabilities, some false positives, false negatives that you’re able to find. And then I’ve got this long list and the typical pattern there is now triage, which are more important than others and which can I explain away. And then there’s a cycle of remediation, hopefully, a lot of times not, that you’re cycling work back to the engineering organization or to whoever is in charge of doing the remediation. And this is a very big loop, mostly starting with and ending with still long lists of vulnerabilities that need to be addressed and risk managed, right? It doesn’t really matter if you’re doing VMs or traditional software or containerized software. That’s the status quo, I would say, for the average company doing vulnerability maintenance. And vulnerability management, the remediation part of that ends up being some fractional work, meaning you just don’t have time to get to it all mostly, and it becomes a big tax on the development team to fix it. Because in software, it’s very difficult for DevSec teams to fix it when it’s actually a coding problem in the end.

In traditional VM world, I’d say that the potential impact and the velocity at which those move compared to containerized environments, where you have

Kubernetes and other kinds of orchestration systems that can literally proliferate containers everywhere in a place where infrastructure as code is the norm. I just say that the risk surface in these containerized environments is much more vast and oftentimes less understood. Whereas traditional VMs still follow a pattern of pretty prescriptive way of deployment. So I think in the end, the more prolific you can be with deploying code, the more likely you’ll have this massive risk surface and containers are so portable and easy to produce that they’re everywhere. You can pull them down from Docker Hub and these things are full of vulnerabilities and they’re sitting on people’s desks.

They’re sitting in staging areas or sitting in production. So proliferation is vast. And I think that in conjunction with really high vulnerability reporting rates, really high code production rates, vast consumption of open source, and then exploits at AI speed, we’re seeing this kind of almost explosive moment in risk from vulnerability management.

CRob (13:18)
So there’s been, over the last several, like machine intelligence, which has now transformed into artificial intelligence. It’s been around for several decades, but it seems like most recently, the last four years, two years, it has been exponentially accelerating. We have this whole spectrum of things, AI, ML, LLM, GenAI, now we have Agentic and MCP servers.

So kind of looking at all these different technologies, what recommendations do you have for organizations that are looking to try to manage their vulnerabilities and potentially leveraging some of this new intelligence, these new capabilities?

John (13:58)
Yeah, it’s amazing at the rate of change of these kinds of things.

CRob (14:02)
It’s crazy.

John (14:03)
I think there’s a massively accelerating, kind of exponentially accelerating feedback loop because once you have LLMs that can do work, they can help you evolve the systems that they manifest faster and faster and faster. It’s a flywheel effect. And that is where we’re going to get all this leverage in LLMs. At Root, we build an agentic platform that does vulnerability patching at scale. We’re trying to achieve sort of an open source scale level of that.

And I only said that because I believe that rapidly, not just us, but from an industry perspective, we’re evolving to have the capabilities through agentic systems based on modern LLMs to be able to really understand and modify code at scale. There’s a lot of investment going in by all the major players, whether it’s Google or Anthropic or OpenAI to make these LLM systems really good at understanding and generating code. At the heart of most vulnerabilities today, it’s a coding problem. You have vulnerable code.

And so, we’ve been able to exploit the coding capabilities to turn it into an expert security engineer and maintainer of any software system. And so I think what we’re on the verge of is this, I’ll call it remediation revolution. I mentioned that the status quo is typically inventory, scan, list, triage, do your best. That’s a scan for us kind of, you know, I’ll call it, it’s a mode where mostly you’re just trying to get a comprehensive list of the vulnerabilities you have. It’s going to get flipped on its head with this kind of technique where it’s going to be just fix everything first. And there’ll be outliers. There’ll be things that are kind of technically impossible to fix for a while. For instance, it could be a disclosure, but you really don’t know how it works. You don’t have CWEs. You don’t have all the things yet. So you can’t really know yet.

That gap will close very quickly once you know what code base it’s in and you understand it maybe through a POC or something like that. But I think we’re gonna enter into the remediation revolution of vulnerability management where at least for third party open source code, most of it will be fixed – a priority.

Now, zero days will start to happen faster, there’ll be all the things and there’ll be a long tail on this and certainly probably things we can’t even imagine yet. But generally, I think vulnerability management as we know it will enter into this phase of fix first. And I think that’s really exciting because in the end it creates a lot of work for teams to manage those lists, to deal with the re-engineering cycle. It’s basically latent rework that you have to do. You don’t really know what’s coming. And I think that can go away, which is exciting because it frees up security practitioners and engineers to focus on, I’d say more meaningful problems, less toil problems. And that’s good for software.

CRob (17:08)
It’s good for the security engineers.

John (17:09)
Correct.

CRob (17:10)
It’s good for the developers.

John (17:11)
It’s really good for developers. I think generally the shift left revolution in software really didn’t work the way people thought. Shifting that work left, it has two major frictions. One is it’s shifting new work to the engineering teams who are already maximally busy.

CRob (17:29)
Correct.

John (17:29)
I didn’t have time to do a lot of other things when I was an engineer. And the second is software engineers aren’t security engineers. They really don’t like the work and maybe aren’t good at the work. And so what we really want is to not have that work land on their plate. I think we’re entering into an age where, and this is a general statement for software, where software as a service and the idea of shift left is really going to be replaced with I call shift out, which is if you can have an agentic system do the work for you, especially if it’s work that is toilsome and difficult, low value, or even just security maintenance, right? Like lot of this work is hard. It’s hard. That patching things is hard, especially for the engineer who doesn’t know the code. If you can make that work go away and make it secure and agents can do that for you, I think there’s higher value work for engineers to be doing.

CRob (18:24)
Well, and especially with the trend with open source, kind of where people are assembling composing apps instead of creating them whole cloth. It’s a very rare engineer indeed that’s going to understand every piece of code that’s in there.

John (18:37)
And they don’t. I don’t think it’s feasible. don’t know one except the folks who write node for instance, Node works internally. They don’t know. And if there’s a vulnerability down there, some of that stuff’s really esoteric. You have to know how that code works to fix it. As I said, luckily, agent existing LLM systems with agents kind of powering them or using or exploiting them are really good at understanding big code bases. They have like almost a perfect memory for how the code fits together. Humans don’t, and it takes a long time to learn this code.

CRob (19:11)
Yeah, absolutely. And I’ve been using leveraging AI in my practice is there are certain specific tasks AI does very well. It’s great at analyzing large pools of data and providing you lists and kind of pointers and hints. Not so great making it up by its own, but generally it’s the expert system. It’s nice to have a buddy there to assist you.

John (19:35)
It’s a pair programmer for me, and it’s a pair of data analysts for you, and that’s how you use it. I think that’s a perfect. We effectively have become cybernetic organisms. Our organic capabilities augmented with this really powerful tool. I think it’s going to keep getting more and more excellent at the tasks that we need offloaded.

CRob (19:54)
That’s great. As we’re wrapping up here, do you have any closing thoughts or a call to action for the audience?

John (20:02)
Call to action for the audience – I think it’s again, passion play for me, vulnerability management, security of open source. A couple of things. same. Again, same cloth – I think again, we’re entering an age where think security, vulnerability management can be disrupted. I think anyone who’s struggling with kind of high effort work and that never ending list helps on the way techniques you can do with open source projects and that can get you started. Just for instance, researching vulnerabilities. If you’re not using LLMs for that, you should start tomorrow. It is an amazing buddy for digging in and understanding how things work and what these exploits are and what these risks are. There are tooling like mine and others out there that you can use to really take a lot of effort away from vulnerability management. I’d say that for any open source maintainers out there, I think you can start using these programming tools as pair programmers and security analysts for you. And they’re pretty good. And if you just learn some prompting techniques, you can probably secure your code at a level that you hadn’t before. It’s pretty good at figuring out where your security weaknesses are and telling you what to do about them. I think just these things can probably enhance security open source tremendously.

CRob (24:40)
That would be amazing to help kind of offload some of that burden from our maintainers and let them work on that excellent…

John (21:46)
Threat modeling, for instance, they’re actually pretty good at it. Yeah. Which is amazing. So start using the tools and make them your friend. And even if you don’t want to use them as a pair of programmer, certainly use them as a adjunct SecOps engineer.

CRob (22:00)
Well, excellent. John from Root.io. I really appreciate you coming in here, sharing your vision and your wisdom with the audience. Thanks for showing up.

John (22:10)
Pleasure was mine. Thank you so much for having me.

CRob (22:12)
And thank you everybody. That is a wrap. Happy open sourcing everybody. We’ll talk to you soon.

What’s in the SOSS? Podcast #52 – S3E4 AIxCC Part 2 – From Skeptics to Believers: How Team Atlanta Won AIxCC by Combining Traditional Security with LLMs

Summary

Conversation Highlights

Episode Links

Transcript

What’s in the SOSS? Podcast #47 – S2E24 Teaching the Next Generation: Software Supply Chain Security in Academia with Justin Cappos

What’s in the SOSS? Podcast #43 – S2E20 Building Trust in Open Source: Seth Larson’s Journey from Maintainer to Security Leader

What’s in the SOSS? Podcast #41 – S2E18 The Remediation Revolution: How AI Agents Are Transforming Open Source Security with John Amaral of Root.io

Summary

Conversation Highlights

Episode Links

Transcript

We envision a future where OSS is universally trusted, secure, and reliable. Join us in making open source more secure.

Get the latest announcements, event info, and the community news in your inbox