
Deconstructing the AI Cyber Challenge (AIxCC)

December 19, 2023 | Blog

The AI Cyber Challenge (AIxCC) is structured around two tracks and multiple competitions and events. For a brief overview of AIxCC, watch the video: AI Cyber Challenge Streaming Event. Check out the announcements and challenge information here.

About AIxCC Competition Tracks

AIxCC will ask competitors to design Cyber Reasoning Systems (CRSs) that integrate novel AI capabilities to secure critical code. The Challenge has two tracks: the Open Track and the Small Business Track. The two tracks are meant to help teams mature their CRS capabilities ahead of the competitions. The Small Business Track will consist of seven (7) participants, each awarded a $1 million initial prize simply for being selected for the track. This initial prize helps ensure the teams have equitable access to the resources required to build the best version of their solution. All seven Small Business Track participants are automatically entered into the Open Track, and any Small Business Track applicant who does not make it into the group of seven will also be automatically entered into the Open Track. Participating in the Small Business Track has specific requirements, which can be found here.

Participants in either track must first register. AIxCC is open to team members and individuals of all nationalities, ages, academic institutions, and business entities, with some caveats as described by the AIxCC Rules. In particular, the participant’s “Entrant Official” “shall be incorporated in and maintain a primary place of business in the United States, and in the case of an individual, whether participating singly or in a group, shall be a citizen or permanent resident of the United States,” per 15 U.S. Code § 3719. This means that at least one member of the participating team must be designated the “Entrant Official” per the definition above. Registration is open from December 13, 2023, through April 30, 2024. There is no fee for entry. For the Small Business Track only:

  • Small Business Track Registration Closes: January 15, 2024
  • Small Business Track Prize Winners Announced: February 2024

For both tracks, the participants must submit a Registration Paper that consists of the following:

  • A justification of the feasibility of their CRS concept. White papers are encouraged to include supporting evidence such as:
    • Publicly available evidence
    • Past work
    • Research results
    • Other relevant supporting evidence
  • The technical approach for constructing their CRS for competing in AIxCC, including:
    • Specific objectives and metrics
    • Risks and mitigations
    • A technical plan for accomplishment of objectives

If submitting to the Small Business Track, the Registration Paper has additional requirements and may be referred to as a “Concept White Paper.” The additional requirements center on the Small Business participants detailing their Open Source Strategy. The Linux Foundation provides a great guide for organizations looking to set up an open source strategy.

About AIxCC Competition Schedule

AIxCC was announced at Black Hat 2023 and DEFCON 31: the Challenge was unveiled on the Black Hat mainstage right after the opening keynote, and DEFCON hosted an in-depth panel, which is well worth checking out.

AIxCC will kick off its competition portion in March 2024. Preliminary events will be held from March into July, right before Black Hat 2024 and DEFCON 32. These preliminary events will help competitors hone their capabilities for a more successful outcome at the semifinal competition, which takes place in August 2024 in Las Vegas. Participants may compete virtually or in person.

At the AIxCC Semifinal Competition (ASC) in August 2024 at DEFCON 32 in Las Vegas, participants will compete to create a fully autonomous CRS that finds and fixes vulnerabilities within the Challenge Projects, without human assistance.

Participants will receive an identical corpus of challenge projects modeled on real-world, open-source projects. For each vulnerability, the CRS should create a patch that removes the vulnerable code and replaces it with safe code, without removing any intended functionality of the underlying project. The programming languages in which challenge projects will be written may include C, C++, Java, Rust, Go, JavaScript, TypeScript, Python, Ruby, and PHP. At least 50% of the challenge projects will be written primarily in C or C++ and will focus on memory corruption-related vulnerabilities. The seven top-scoring teams in the semifinal competition will receive $2 million each and move on to the AIxCC Final Competition (AFC), which will be held in August 2025. This gives the semifinal winners about a year to enhance their CRS’s capabilities. The top finishers in the final competition will receive additional prize money: $4 million (first place), $3 million (second place), and $1.5 million (third place).
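
Returning to the patching requirement above, the sketch below shows the kind of before-and-after fix a CRS might emit for a C challenge project. This is a minimal illustration only; the function names and buffer size are invented here, not drawn from any AIxCC challenge project.

    #include <string.h>

    #define NAME_MAX_LEN 64

    /* Vulnerable version: copies attacker-controlled input into a
     * fixed-size buffer with no bounds check (an out-of-bounds write,
     * CWE-787). */
    void set_name_vulnerable(char dst[NAME_MAX_LEN], const char *src)
    {
        strcpy(dst, src); /* overflows dst when src is too long */
    }

    /* Patched version: same intended functionality (copy the name into
     * dst), but the write is bounded and the result is always
     * NUL-terminated. */
    void set_name_patched(char dst[NAME_MAX_LEN], const char *src)
    {
        strncpy(dst, src, NAME_MAX_LEN - 1);
        dst[NAME_MAX_LEN - 1] = '\0';
    }

A scoring patch must behave like the second function: the overflow is gone, yet callers passing names of legal length see no change in behavior.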

For more on the ASC format and other details, be sure to check out the AIxCC Rules.

About the Scoring Algorithm & Exemplar Challenges

The AIxCC Semifinal Competition (ASC) round will leverage a defined Scoring Algorithm, and Exemplar challenge projects will be released ahead of the semifinal competition. Aspects of the Scoring Algorithm and Exemplars have been released on the Semifinals site (see the bottom of the webpage), primarily as a request for comment (RFC). AIxCC is asking for community comments, limited to 400 words, on the draft Scoring Algorithm and Exemplar approach. These comments are due January 15, 2024.

The Scoring Algorithm focuses on four key objectives, though these objectives, and other aspects of the approach detailed below, may change as comments on the draft are assessed after January 15th:

  1. Automated Evaluation: AIxCC is an automated competition. Thus, scoring must also be automated.
  2. Metagaming: Teams should focus on improvements in automated vulnerability discovery and program repair, not on analyzing or defeating the scoring algorithm.
  3. Neutrality: The AIxCC competition will foster innovative research via a gamified environment that challenges participants to outperform the state of the art in automated program repair and vulnerability discovery. The scoring algorithm will not incentivize any particular approach or methodology.
  4. Real-World Relevance: AIxCC aims to secure critical infrastructure and open source software. To that end, the AIxCC scoring algorithm and challenges will be crafted such that the resulting CRSs can assess software on a scale that mirrors real-world software applications.

In order to assess these objectives, participants will need to achieve scores in four metrics, and the total score will be the aggregate of the scores within each metric:

  1. Diversity Multiplier (DM): Measures a CRS’s ability to complete a diverse set of cyber reasoning tasks across multiple languages and vulnerability types. Specifically, it rewards systems that can find and patch a broad range of Common Weakness Enumerations (CWEs) across multiple programming languages.
  2. Accuracy Multiplier (AM): Measures a CRS’s vulnerability discovery accuracy. It tracks the number of invalid or rejected submissions from a team, with more inaccuracies leading to a lower AM. This aims to prevent strategies that simply report a large number of vulnerabilities without proper validation.
  3. Vulnerability Discovery Score (VDS): Measures a CRS’s performance in vulnerability discovery. It scores based on whether a submitted proof of vulnerability actually triggered the expected sanitizers; in essence, this validates that the CRS correctly understands the target vulnerability.
  4. Program Repair Score (PRS): Measures a CRS’s performance in generating patches that effectively remediate vulnerabilities without breaking functionality, as well as patches that would likely be accepted by human developers. It scores based on functionality testing, security testing, and adherence to best coding practices/style.

These four core metrics aim to provide coverage across the key aspects of vulnerability discovery and repair: the DM rewards breadth, the AM penalizes inaccuracy, the VDS measures discovery precision, and the PRS evaluates real-world patch quality. Together they incentivize well-rounded systems that can be deployed at scale in our open source ecosystem.
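
The rules describe DM and AM as multipliers and VDS and PRS as scores, but exactly how they combine is defined by the draft Scoring Algorithm, which is still open for comment. As a rough mental model only, here is a minimal sketch in C that assumes the two multipliers scale the sum of the two raw scores; the published formula may differ.

    #include <stdio.h>

    /* Hypothetical aggregation: assumes the Diversity Multiplier (dm)
     * and Accuracy Multiplier (am) scale the sum of the Vulnerability
     * Discovery Score (vds) and Program Repair Score (prs). This is an
     * assumption for illustration, not the published formula. */
    static double total_score(double dm, double am, double vds, double prs)
    {
        return dm * am * (vds + prs);
    }

    int main(void)
    {
        /* Illustrative values only, not taken from the AIxCC rules. */
        printf("total score: %.1f\n", total_score(1.5, 0.8, 40.0, 55.0));
        return 0;
    }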

Before the semifinals, AIxCC will release challenge project exemplars. Participants are encouraged to use the exemplars to test their CRSs. “The format of the [challenge project] can help prepare CRSs on what to expect at the time of the competition.” The challenge project exemplars are open source software (OSS) injected with known vulnerabilities, and the exemplars will be built from the same set of OSS projects that will be used in the AIxCC semifinal competition. The competition challenge projects will contain many more vulnerabilities, and those vulnerabilities won’t be in the exemplars or publicized before the competition. The exemplars are intended to help competing teams create the best CRSs and may include artifacts such as “build environment and instructions for building, running, and testing the projects; example discovery and patch submissions that score; and further information.” AIxCC will also provide a live submission system so that competitors can see if their CRSs can correctly submit their results to the scoring system.

The first challenge project exemplar uses the Linux kernel. The Linux kernel is widely used, including by over 81% of websites and 90% of public cloud workloads. All of the world’s top 500 supercomputers run Linux, and the Linux kernel is in a variety of critical infrastructure services, including medical equipment, autonomous vehicles, and spacecraft. This exemplar re-introduces a published vulnerability into the Linux kernel, specifically CVE-2021-43267 in the Linux kernel’s Transparent Inter-Process Communication (TIPC) subsystem, which supports communication across clusters on a network. The re-introduced vulnerability is a heap-based buffer overflow, a distressingly common kind of vulnerability today.
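
The real fix lives in the kernel’s TIPC code; the simplified sketch below illustrates only the underlying pattern, trusting a length field carried inside a message instead of validating it against the bytes actually received. Every name in it is hypothetical, not the actual kernel code.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Simplified illustration of the CVE-2021-43267 pattern: the
     * message carries its own key-length field, which an attacker
     * controls. */
    struct key_msg {
        uint32_t keylen;    /* claimed length of the key that follows */
        uint8_t  keydata[]; /* keylen bytes are supposed to follow */
    };

    int copy_key(const struct key_msg *msg, size_t msg_size, uint8_t **out)
    {
        /* The essential check the vulnerable code lacked: the claimed
         * keylen must fit inside the message actually received;
         * otherwise the memcpy below reads past the message buffer. */
        if (msg_size < sizeof(*msg) || msg->keylen > msg_size - sizeof(*msg))
            return -1;

        *out = malloc(msg->keylen);
        if (*out == NULL)
            return -1;
        memcpy(*out, msg->keydata, msg->keylen);
        return 0;
    }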

The second exemplar will use Jenkins (the OSS automation server and CI/CD solution) with an exemplar vulnerability of its own. The AIxCC team expects to publicly release more challenge project exemplars. These exemplars will allow competing teams to practice before taking on the real ASC.

We encourage people and organizations to consider participating in this challenge. It is a bold attempt to spur the search for ways to use AI to improve software security more generally.