Tag

DARPA

Hack to the Future: The Impact and Legacy of the DARPA AIxCC Challenge

By AI, Blog, Global Cyber Policy, Guest Blog

By Helen Woeste

AIxCC Competition Background & Results: 

In 2023, DARPA announced a two-year long competition called the Artificial Intelligence Cyber Challenge (AIxCC) with the goal to safeguard open source software used in critical infrastructure throughout America. The intent is to hasten the development of open source AI tooling that can assist developers with finding and fixing bugs in live software with minimal cost. Open source is a drastically underfunded and underresourced form of infrastructure. It therefore presents an exciting, practical target, and opportunity for the research and development of AI in cybersecurity. Additionally, open source’s publicly observable code is ideal for competition and collaboration. 

AIxCC was run in collaboration with ARPA-H and supported with contributions from Anthropic, Google, Microsoft, and OpenAI, with additional consulting around open source provided by the Linux Foundation and the Open Source Security Foundation (OpenSSF). This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA). The competition consisted of two rounds, the Semifinal Competition (ASC) and the Final Competition (AFC), where cash prizes from a pot of $30,500,000 were distributed. For the ASC, 42 team submissions were accepted across two tracks; the Open Track and the Small Business Track, which required an additional technical paper submission. The top seven teams moved forward to the AFC which was set up to mimic a real world CI/CD pipeline. The scoring algorithm was also designed to highlight behaviors that would make the competing systems more useful to developers. At the conclusion of AFC, the top three teams were Team Atlanta, Trail of Bits, and Theori

For the AIxCC competition, real open source projects were selected, and their code was forked and then modified to insert artificial bugs for the Cyber Reasoning Systems (CRS) to discover and fix. However, during the execution of the competition, the CRSs discovered several real potential bugs alongside the artificial ones. This introduced the issue of how to triage and manage resolution of fixes in the projects. OpenSSF engaged third party open source security organization Open Source Technology Improvement Fund (OSTIF) to get involved with the closing out of the bugs identified as a result of the AIxCC competition. 

OSTIF selected the team at Ada Logics for their extensive experience working with open source fuzzing, bug verification, and disclosure. With a list of potential bugs identified through the course of the competition, Ada Logics was tasked with securely submitting verified issues, ensuring that anything reported to open source project maintainers was a proven bug. The Ada Logics team was able to reproduce and confirm twenty-seven issues after multiple rounds of testing and continued coordination between AIxCC competitors, collaborators, and contributors. CRS teams, including Team Atlanta, Team Buttercup, Team FuzzingBrain, Team Shellphish, Team Theori, Team 42-b3yond-6ug, and Team Lacrosse, working together with Kudu Dynamics and the OpenSSF, continued to collaborate and meet with OSTIF around the disclosures to ensure total accuracy of the reported issue’s testing and resulting decision around disclosure. 

It was of utmost importance that any and all real bugs detected during the competition were verified before alerting the project maintainer to the issue. This is to differentiate how the competition reports issues to projects from the low-quality reports plaguing open source maintainers today. In several cases, CRS-generated patches were submitted alongside bugs, an offering to project maintainers looking to quickly resolve the finding. Additionally, feedback was sourced from the projects around their experience as a target in the competition as well as the disclosure procedure following. 

The Findings:

Teams discovered twenty-seven candidate real-world issues during the competition and OSTIF engineers were ultimately able to replicate all of the draft bugs. The affected projects were cURL, shadowsocks-libev, healthcare-data-harmonization, hertzbeat, little-cms, and mongoose. Once identified, the hard work began of fixing those bugs, implementing CRS tooling to perform the second half of its double duty to find and fix security issues. 

However, some of the findings did not meet a level of security concern for various reasons. Some issues were fixed by code changes in the projects during the time-period in between the competition and when engineers reproduced them. Others were outside of the threat model of the project and did not meet the criteria needed to incorporate into the project (for example, the Apache Poi project threat model states “Expect any type of Exception when processing documents,” making any exception-based findings non-issues). One issue had actually already been found by OSS-Fuzz, but the project hadn’t fixed it yet.

Ultimately, interesting findings were discovered and fixed by the Cyber Reasoning Systems in this competition, and the systems found a lot of valid issues. Further, some projects had introduced fixes before the bugs were reported. This is likely because the AIxCC teams submitted the fuzzing harnesses to the projects before triage had taken place, which re-discovered the same bugs before triage had completed. One significant lesson learned from this is that cyber reasoning systems may benefit from doing self-triage when discovering potential issues by checking against the project’s documentation and understanding the types of issues that the project accepts as security bugs that need to be addressed.

Conclusion & Looking Forward:

The AIxCC program was a massive undertaking by dozens of organizations, all working to contribute back to open source security in a meaningful way using novel AI tooling. The competition was mindfully designed and carried out, with attention given towards the open source projects and maintainers, the wide variety of competitors and interests, and the impact of the competition itself on the industry all the way down to the maintainers. 

OpenSSF is the home for extended collaboration on these new open source tools through its newly formed Cyber Reasoning Systems Special Interest Group. OSS-CRS and FuzzingBrain, two open source projects that emerged from the competition, are now hosted at OpenSSF in the Linux Foundation. A third tool applied and was accepted to the OpenSSF, and has a few remaining steps before the official transition. The group aims to foster their development and adoption, and to establish best practices that help projects use CRSs effectively and responsibly.

This work is already producing real results. For example, FuzzingBrain has since turned its AI-assisted fuzzing system on the broader open source ecosystem, discovering sixty-two vulnerabilities across twenty-six projects, from CUPS and Apache Avro to Ghidra and OpenLDAP, with forty-three confirmed by maintainers and thirty-six already patched upstream. 42-b3yond-6ug has expanded its CRS to uncover twelve kernel-related vulnerabilities in the Linux kernel and related components, plus ten zero-day vulnerabilities in userspace projects including Eclipse Mosquitto and OpenLDAP. The team is also developing a platform to support more efficient model training and evaluation of models and agents, with a release expected soon. Using OSS-CRS, Team Atlanta discovered twenty-five vulnerabilities across sixteen projects spanning a broad range of software including PHP, U-Boot, memcached, and Apache Ignite 3. Of those, nine have been fixed and eight more have been confirmed with fixes in progress.

The future of AI assisting maintainers in finding and fixing security vulnerabilities is bright. The challenges raised by the AIxCC competition already have solutions being developed in open source, such as LLM-based tools that build threat models by looking at the data-flow of projects, and AI agents that triage findings against threat models and documentation before reporting issues. As these tools all continue to develop, they will harmonize into reliable solutions that maintainers can use to elevate their security with far less effort than today.

Our gratitude to the folks at Ada Logics for triaging the potential bugs and working hard to reproduce the issues so maintainers didn’t have to, OpenSSF for trusting us to bring together all of the stakeholders to work on the issues together, DARPA and ARPA-H for holding the AIxCC competition and sponsoring this work, the teams that built the Cyber Reasoning Systems for the competition, Kudu Dynamics for their support in confirming the findings, and all of the maintainers that worked with us to resolve the issues.

OpenSSF and OSTIF will continue to support this kind of work by serving as human connectors between CRS tools and open source communities. The goal is to help triage and validate vulnerability reports and proposed patches before they reach maintainers, ensuring findings are accurate, actionable, and respectful of maintainers’ time.

Organizing a competition of this scale on behalf of open source maintainers and its end users takes both enormous collaboration and individual effort. Understanding the communities involved, and building lightweight programs that shield maintainers from headaches while strengthening security is the best possible outcome for the ecosystem. It took everyone coming together to make this happen, and ongoing efforts will bring low-cost and low-maintenance tools to everyone that are valuable and make us all safer. 

As AI moves forward at breakneck speed, innovative work like this highlights how you can move fast and build things together for a better tomorrow. 

Author Bio

Helen Woeste joined OSTIF in 2023, coming from a decade of work experience in the restaurant and hospitality industries. With a passion (and degree) for writing and governance structures, Woeste quickly transitioned into an operations and communications role in technology. 

 

The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Distribution Statement “A” (Approved for Public Release, Distribution Unlimited)

From AIxCC to OpenSSF: Welcoming OSS-CRS to Advance AI Driven Open Source Security

By AI, Blog, Global Cyber Policy

By Jeff Diecks

Artificial intelligence is changing how we approach software security. Open source is at the center of that shift.

Over the past year, DARPA’s Artificial Intelligence Cyber Challenge (AIxCC) showed that cyber reasoning systems (CRS) can go beyond finding vulnerabilities. These systems can analyze code, confirm issues, and generate patches. This brings us closer to a future where security is more automated and scalable.

When the competition ended, one question remained. How do we take these breakthroughs and make them usable in the real world?

Today, we are taking an important step forward.

The Open Source Security Foundation (OpenSSF) is welcoming OSS-CRS as a new open source project under the AI / ML Security Working Group.

OSS-CRS emerged from AIxCC and is a standard orchestration framework for building and running LLM-based autonomous bug-finding and bug-fixing systems.

The open framework is designed to make CRS practical outside of the AIxCC environment. During the competition, teams built powerful systems that were released as open source. However, many of them depended on the competition infrastructure, which made them difficult to reuse or extend. OSS-CRS addresses that gap.

OSS-CRS Features include:

  • Standard CRS Interface: OSS-CRS defines a unified interface for CRS development. Build your CRS once following the development guide, and run it across different environments (local, Azure, …) without any modification.
  • Effortless Targeting: Run any CRS against projects in OSS-Fuzz format. If your project is compatible with OSS-Fuzz, OSS-CRS can orchestrate CRSs against it out of the box.
  • Ensemble Multiple CRSs: Compose and run multiple CRSs together in a single campaign to combine their strengths and maximize bug-finding and bug-fixing coverage.
  • Resource Control: Manage CPU limits and LLM budgets per CRS to keep costs and resources in check.

Read the OSS-CRS research paper: https://doi.org/10.48550/arXiv.2603.08566

From Competition to Community

The move of OSS-CRS into OpenSSF marks a clear transition from research and competition to open collaboration and long term development.

OpenSSF provides a neutral home where projects like OSS-CRS can grow. Contributors can work together to improve the tools, validate results, and support adoption across the ecosystem.

OSS-CRS is already producing real results. Using OSS-CRS, Team Atlanta discovered twenty-five vulnerabilities across sixteen projects spanning a broad range of software including PHP, U-Boot, memcached, and Apache Ignite 3.

OpenSSF will continue to support this important work by providing human connectors between CRS tools and open source communities. The goal is to help triage and validate vulnerability reports and proposed patches before they reach maintainers, ensuring findings are accurate, actionable, and respectful of maintainers’ time.

Recent research from the OSS-CRS team validates the necessity of having a human in the loop. The team manually reviewed a set of 630 AI-generated patches and found 20-40% of the patches to be semantically incorrect. The incorrect patches pass all automated validation but are actually wrong — a dangerous failure mode only catchable by manual review.

A key benefit of the OSS-CRS project is its Ensemble feature. The Ensemble feature enhances accuracy and reliability by combining patches from multiple CRS approaches and using a selection process to pick the one most likely to be correct. The research showed this approach consistently matches or outperforms the best single component in improving semantic correctness, which is hard to eliminate at the single-agent level. This collaboration of systems helps produce more robust results for open source defenders.

Get Involved

With projects like OSS-CRS, OpenSSF will continue to support AI-driven security work to help turn innovation into practical outcomes for open source.

We offer several options to get involved including:

Author Bio

Jeff Diecks is a Senior Technical Program Manager at The Linux Foundation. He has more than two decades of experience in technology and communications with a diverse background in operations, project management and executive leadership. A participant in open source since 1999, he’s delivered digital products and applications for universities, sports leagues, state governments, global media companies and non-profits.

What’s in the SOSS? Podcast #51 – S3E3 AIxCC Part 1 – From Skepticism to Success: The AI Cyber Challenge (AIxCC) with Andrew Carney

By Podcast

Summary

This episode of What’s in the SOSS features Andrew Carney from DARPA and ARPA-H, discussing the groundbreaking AI Cyber Challenge (AIxCC). The competition was designed to create autonomous systems capable of finding and patching vulnerabilities in open source software, a crucial effort given the pervasive nature of open source in the tech ecosystem. Carney shares insights into the two-year journey, highlighting the initial skepticism from experts that ultimately turned into belief, and reveals the surprising efficiency of the competing teams, who collectively found over 80% of inserted vulnerabilities and patched nearly 70%, with remarkably low compute costs. The discussion concludes with a look at the next steps: integrating these cyber reasoning systems into the open source community to support maintainers and supercharge automated patching in development workflows.

This episode is part 1 of a four-part series on AIxCC:

Conversation Highlights

00:00 – Introduction and Guest Welcome
00:59 – Guest Background: Andrew Carney’s Role at DARPA/ARPA-H
02:20 – Overview of the AI Cyber Challenge (AIxCC)
03:48 – Competition History and Structure
04:44 – The Value of Skepticism and Surprising Learnings
07:11 – Surprising Efficiency and Low Compute Costs
08:15 – Major Competition Highlights and Results
13:09 – What’s Next: Integrating Cyber Reasoning Systems into Open Source
16:55 – A Favorite Tale of “Robots Gone Bad”
18:37 – Call to Action and Closing Thoughts

Transcript

Intro music & intro clip (00:00)

CRob (00:23)
Welcome, welcome, welcome to What’s in the SOSS, the OpenSSF podcast where I talk to people that are in and around the amazing world of open source software, open source software security and AI security. I have a really amazing guest today, Andrew.

He was one of the leaders that helped oversee this amazing AI competition we’re going to talk to. So let me start off, Andrew, welcome to the show. Thanks for being here.

Andrew Carney (00:57)
Thank you for having me so much, CRob. Really appreciate it.

CRob (00:59)
Yeah, so maybe for our audience that might not be as familiar with you as I am, could you maybe tell us a little bit about yourself, kind of where you work and what types of problems are you trying to solve?

Andrew Carney (01:12)
Yeah, I’m a vulnerability researcher. That’s been the core of my career for the last 20 years. And part of that has had me at DARPA. And now I’m at DARPA and ARPA-H, where I sort of work on cybersecurity research problems focused on national defense and/or health care. So it’s sort of the space that I’ve been living in for the past few years.

CRob (01:28)
That’s an interesting collaboration between those two worlds.

Andrew Carney (01:43)
Yeah, it’s, you know, it’s, I think the vulnerability research and reverse engineering community is, pretty tight, you know, pretty, pretty small. And, a lot of folks across lots of different industries and sectors have similar problems that, you know, we’re able to help with. So, yeah, it’s, it’s exciting to kind of see, see how, how, you know, folks in finance or automotive industry or the energy sector kind of all deal with similar-ish problems, but different scales with different kind of flavors of concerns.

CRob (02:20)
That’s awesome. And so as I mentioned, we were introduced through the AIxCC competition. Maybe for our audience that might not be as familiar, could you maybe give us an overview of AIxCC, the competition, and kind of why you felt this effort was so important and we’ve spent so much time working through this, years.

Andrew Carney (02:42)
Absolutely. I mean, AIxCC, uh, is a competition to create autonomous systems that can find and patch vulnerabilities in source code. Uh, a big part of this competition was focusing on open source software, um, because of how critical it is kind of across our tech ecosystem. It really is sort of like the font of all software.

And so DARPA and ARPA-H and other partners across the federal government, we saw this kind of need to support the open source community and also leverage kind of new technologies on the scene like LLMs. So how do we take these new technologies and apply them in a very principled way to help solve this massive problem? And working with the Linux Foundation and OpenSSF has been a huge piece of that as well. So I really appreciate everything you guys have done throughout the competition.

CRob (03:41)
Thank you.

CRob (03:48)
And maybe could you give us just a little history of when did the competition start and kind of how it was structured?

Andrew Carney (03:54)
Yeah. So the competition was announced at Black Hat in August of 2023. The competition was structured into two main sections. We had a qualifying event at DEF CON in 2024. And then we had our final event this past DEF CON, August 2025. And throughout that two-year period, we designed a competition that kept pushing the competitors sort of ahead of wherever the current models, the current kind of agentic technologies were, whatever that bar they were setting, we continued to push the competitors past that. So it’s been a really dynamic sort of competition because that technology has continued to evolve.

CRob (04:44)
I have to say when I initially heard about the competition, I’ve been doing cybersecurity a very long time. I was very skeptical about what the results will be, not to bury, to bury the lead, so to speak. But I was very surprised with the results that you all shared with the world this summer in Las Vegas. We’ll get to that in a minute. But again, this competition went over many years and as it progressed, could you maybe share what you learned that maybe surprised you, you didn’t expect from when this all kicked off.

Andrew Carney (05:21)
Yeah, think so. I think there have been a lot of surprises along the way. And I’ll also say that, you know, skepticism, especially from, you know, informed experts is a really good sign for a DARPA challenge. So for a lot of projects at DARPA generally, you know, if you’re kind of waffling between this is insanely hard and there’s no way we’ll be successful and this is kind of a much easy, like, you know, there’s an easy solution to this. If you’re constantly in that space of uncertainty, like, no, I really think this is really, really hard. And I’m getting skepticism from people that know a lot about this space. For us, that’s fuel. That’s okay. There is, you know, there’s a question to answer here. And so that really was part of driving us, even competitors, competitors that ended up making it to finals themselves were skeptical even as they were competing.

So I love that. I love that. Like, you know, we want to try to do really hard things and, you know, criticism helps us improve. Like that’s super beneficial.

CRob (06:33)
Yeah, it was, and I’ve had the opportunity to talk with many of the teams and now we’re in the phase post-competition where we’re actually starting to figure out how to share the results with the upstream projects and how to build communities around these tools. you assembled a really amazing group of folks in these competitive teams, some super top-notch minds. again,

You made me a believer now, where I really do believe that AI does have a place and can legitimately offer some real value to the world in this space.

Andrew Carney (07:11)
Yeah, think one of the biggest surprises for me was the efficiency. I think a lot of times, especially with DARPA programs, we expect that technical miracles will come with a pretty hefty price tag. And then you’ll have to find a way to scale down, to economize, to make that technology more useful, more more widely kind of distributable.

With AIxCC, we found the teams pushing so hard on the core kind of research questions, but at the same time, sort of woven into that was using their resources efficiently. And so even the competition results themselves were pleasantly surprising in terms of the compute costs for these systems to run. We’re talking tens to hundreds of dollars.

vulnerability discovered or patch emitted, which is really quite amazing.

CRob (08:15)
Yeah, so maybe could you just give me some highlights of kind of what the competition discovered, what the competitors achieved?

Andrew Carney (08:24)
Yeah. So I think when we’re trying to tackle these really challenging research questions and we’re examining it from all angles and being extremely critical of even our own approach, as well as the competitors’ approaches, that initially back in August of 2024, we had this amazing proof of life moment where the teams demonstrated with only a few hundred dollars in total compute budget.

that they were able to analyze large open source projects and find real issues. One of the teams found a real issue in SQLite that we had disclosed at the time to the maintainers. And they found that, once again, with this very limited compute budget across multiple millions of lines of code in these projects. So that was sort of the OK, there’s a there there, like there’s something here and we can keep pushing. So that was a really exciting moment for everyone. And then over the following year, up to August 2025, we had a series of these non-scoring events where the teams would be given challenges that looked very similar to what we’d give them for finals with an increasing level of scale and difficulty.

So you can think of these as like extreme integration events where we’re still giving the teams hundreds of thousands or millions of lines of code. We’re giving them, you know, eight to 12 hours per kind of task. And we’re seeing what they can do. This was important to ensure that the final competition went off without a hitch. And also because the models they were leveraging continue to evolve and change.

So it was really exciting. In that process, the teams found and disclosed hundreds of vulnerabilities and produced hundreds of potential patches that they would offer up to maintainers of the projects that they were doing their own internal kind of development on. So that was really exciting just to see that the SQLite bug wasn’t a fluke and that the teams could consistently kind of perform and keep pushing as we push them to move further and faster and deal with more complex code, they were able to adapt and find a way forward.

CRob (11:02)
That’s awesome. And I know you had, it was a long journey that you and the team and all the support folks went through, but is there any particular moment that kind of you smile on when you reflect on over the course of the competition?

Andrew Carney (11:20)
Oh, man, so many. I think there’s an equal number of like those smiling moments and also, you know, premature gray hairs that the team and myself have created. But I think one of the big moments, there were a number of just outstanding kind of experts in the field on social media.

in talks that would, the way that they talked about kind of AI powered program analysis was very skeptical. near the end, leading up to semi-finals, we had this lovely moment where the Google project zero team and the Google deep mind teams penned a blog post that said that they were inspired by one of the teams, by the SQL light bug, by one of the team’s discoveries. And that was huge, I think both for that team and just the competition as a whole. And then after that, seeing people’s opinions change and seeing people that had held, that were, like I said, top tier experts in the field, change their perspective pretty drastically, which that was, you know, that was helpful signal for us to demonstrate that we were being successful. Like converting a critic, I think, is one of the best kind of victories that you can have. Because now they can be a collaborator, right? Like now we can still kind of spar over different perspectives or ideas, but now we’re working together. That’s very exciting.

CRob (13:09)
That’s awesome. So what’s next? The hard work of the competition is over and now we’re in kind of the after action phase where we’re trying to integrate all this great work and kind of get these projects out to the world to use. So from your perspective or from DARPA or the competition, what’s next for you?

Andrew Carney (13:29)
Yeah, so one of the biggest challenges with DARPA programs is when you’re successful, sometimes you have that technological miracle, you have that accomplishment, and maybe the world’s not entirely ready for it yet. Or maybe there’s additional development that needs to happen to get it kind of into the real world. With AIxCC, we made the competition as realistic as possible. The automated systems, these cyber reasoning systems, were being given bug reports, they’re being given patch diffs, they’re being given artifacts that we would consume and review as human developers. So we modeled all the tasks very closely to the real things that we would want these systems to do. And they demonstrated incredible kind of performance. Collectively, the teams were able to find over 80 % of the vulnerabilities that we’d synthetically kind of inserted. And they patched nearly 70 % of those vulnerabilities. And that patching piece is so critical. What we didn’t want to do was create systems that made open source maintainers lives more problematic.

CRob (14:54)
Thank you.

Andrew Carney (14:56)
We wanted to demonstrate that this is a reachable bug and here’s a candidate patch. And in the months after the competition, we’ve incentivized the teams further than just the original prize money to go out into the open source community and support open source maintainers with their tools. And we’ve had folks come back and literally in their kind of reports, document that the patch they suggested to a maintainer was nearly identical to what the maintainer actually committed. Yeah. And those reports are coming in daily. So we’re getting, we have this constant feed of engagement and the tools are still obviously being improved and developed. But it’s really exciting to see it. So when I think about what’s next is like we’re already in the what’s next like getting the technology out there, using government funding to support open source maintainers wherever we can, especially if their code is part of widely used applications or code used in critical infrastructure. So that’s where we find ourselves now. And then we’re thinking a lot about how we supercharge that effort to the…

there have been, you the federal government supports a lot of actively used open source projects, right? And we’ve been working with all these partner agencies across the federal government and just making sure that we’re supporting the existing programs when we find them. And then where we see a gap, kind of figuring out what it would take to fill that gap that community that could use more support.

CRob (16:55)
So on a slightly different note, we’re both technologists and we love the field, but as I was going through this journey, kind of on the sidelines with you all, I was reflecting, do you have a a favorite tale of robots gone bad? Like Terminator’s Skynet or HAL 9000 or the Butlerian Jihad?

Andrew Carney (17:22)
That’s a, you know, I think I, I’ll, I don’t know that this is my favorite, but it is one of the most recent ones that I’ve read. There’s a series called Dungeon Crawler Carl. Yeah. And it’s been really like entertaining reading. And I just think the tension between the primal AIs and the corporations that rely on said independent entities, but also are constantly trying to rein them in is, I don’t know, it’s been really interesting to see that narrative evolve.

CRob (18:08)
I’ve always enjoyed science fiction and fantasy’s ability to kind of hold a mirror up to society and kind of put these questions in a safe space where you can kind of think about 1984 and Big Brother or these other things, but it’s just in paper or on your iPad or whatever. So it’s a nice experiment over there. And we don’t want that to be happening here.

Andrew Carney (18:29)
Yes, yes. Yeah, the fiction as thought experimentation, right?

CRob (18:37)
Right, exactly. So as we wind down, do you have a particular call to action or anything you want to highlight to the audience that they should maybe investigate a little further or participate in?

Andrew Carney (18:50)
Yeah, I think so a big one is, you know, we would love for open source maintainers to reach out to us directly. AIXCC at DARPA.mil. That’s the email address that our team uses. And we’ve been looking for more maintainers to connect with so that we can make sure that if we can provide resources to them, one, that they’re right sized for the challenges that those maintainers are having, or maintainer, right? Sometimes it’s just one person. And then two, that we’re engaging with them in the way that they would prefer to be engaged with. We want to be helpful help, not unhelpful help. So that’s a big one. And then I think in more generally, I would love to see more patching added into the kind of vulnerability research lifecycle. I think there’s so many opportunities for commercial and open source tools that have that discovery capability and that’s really their big selling point. And now with AIxCC and with the technology that the competitors open source themselves, since all of their systems were open sourced after the competition, there’s this real potential, I think that we haven’t seen it realized the way that it really could be. And so that’s, I would love to see more of that kind of automated patching added to tools and kind of development workflows.

CRob (20:29)
I’ll say my personal favorite experience out of all this is now that the competition, the minute the competition was over, then there was an ethical wall up between, you your administrators and us and the different competition teams. But now I’ve, we’ve observed the competitors, like looking at each other’s work and asking questions to each other and collaborating. that is, I’m so super excited to see what comes next. Now that all these smart people have proven themselves. and they found kind of connected spirits and they’re gonna start working together for even more amazing things.

Andrew Carney (21:07)
Absolutely. I think we’re expecting a state of knowledge paper with all the teams as authors. That’s something they’ve organized independently, to your point. And yeah, I cannot wait to see what they come out with collaboratively.

CRob (21:23)
Yeah. And anyone that’s interested to learn more or potentially directly interact with some of these competition experts, whether they’re in academia or industry, the OpenSSF is sponsoring as part of our AI ML working group. We’ve created a cyber reasoning special interest group specifically for the competition, all the competitors, and just to have public discussions and collaboration around these things. And we would invite everybody to show up and listen and participate as they feel comfortable and learn.

Well, Andrew and the whole DARPA and ARPA-H team, everyone that was involved in the competition, thank you. Thank you to our competitors. And we actually are going to have a series of podcasts talking to the individual competitors, kind of learning a little bit of the unique flavors and challenges these had. But thank you for sponsoring this and kind of really delivering something I think is going to have a ton of utility and value to the ecosystem.

Andrew Carney (21:47)
Thank you for working with us on this journey and we definitely look forward to more collaboration in the future.

CRob (21:54)
Well, and with that, we’ll wrap it up. I just want to tell everybody happy open sourcing. We’ll talk to you soon.

OpenSSF at DEF CON 33: AI Cyber Challenge (AIxCC), MLSecOps, and Securing Critical Infrastructure

By Blog

By Jeff Diecks

The OpenSSF team will be attending DEF CON 33, where the winners of the AI Cyber Challenge (AIxCC) will be announced. We will also host a panel discussion at the AIxCC village to introduce the concept of MLSecOps.

AIxCC, led by DARPA and ARPA-H, is a two-year competition focused on developing AI-enabled software to automatically identify and patch vulnerabilities in source code, particularly in open source software underpinning critical infrastructure.

OpenSSF is supporting AIxCC as a challenge advisor, guiding the competition to ensure its solutions benefit the open source community. We are actively working with DARPA and ARPA-H to open source the winning systems, infrastructure, and data from the competition, and are designing a program to facilitate their successful adoption and use by open source projects. At least four of the competitors’ Cyber Resilience Systems will be open sourced on Friday, August 8 at DEF CON. The remaining CRSs will also be open sourced soon after the event.

Join Our Panel: Applying DevSecOps Lessons to MLSecOps

We will be hosting a panel talk at the AIxCC Village, “Applying DevSecOps Lessons to MLSecOps.” This presentation will delve into the evolving landscape of security with the advent of AI/ML applications.

The panelists for this discussion will be:

  • Christopher “CRob” Robinson – Chief Security Architect, OpenSSF
  • Sarah Evans – Security Applied Research Program Lead, Dell Technologies
  • Eoin Wickens – Director of Threat Intelligence, HiddenLayer

Just as DevSecOps integrated security practices into the Software Development Life Cycle (SDLC) to address critical software security gaps, Machine Learning Operations (MLOps) now needs to transition into MLSecOps. MLSecOps emphasizes integrating security practices throughout the ML development lifecycle, establishing security as a shared responsibility among ML developers, security practitioners, and operations teams. When thinking about securing MLOps using lessons learned from DevSecOps, the conversation includes open source tools from OpenSSF and other initiatives, such as Supply-Chain Levels for Software Artifacts (SLSA) and Sigstore, that can be extended to MLSecOps. This talk will explore some of those tools, as well as talk about potential tooling gaps the community can partner to close. Embracing this methodology enables early identification and mitigation of security risks, facilitating the development of secure and trustworthy ML models.  Embracing MLSecOps methodology enables early identification and mitigation of security risks, facilitating the development of secure and trustworthy ML models.

We invite you to join us on Saturday, August 9, from 10:30-11:15 a.m. at the AIxCC Village Stage to learn more about how the lessons from DevSecOps can be applied to the unique challenges of securing AI/ML systems and to understand the importance of adopting an MLSecOps approach for a more secure future in open source software.

About the Author

JeffJeff Diecks is the Technical Program Manager for the AI Cyber Challenge (AIxCC) at the Open Source Security Foundation (OpenSSF). A participant in open source since 1999, he’s delivered digital products and applications for dozens of universities, six professional sports leagues, state governments, global media companies, non-profits, and corporate clients.