Mitigating Attack Vectors in GitHub Workflows

GitHub Actions is commonly used to automate processes in repositories, for example by running CI (continuous integration) tests on pull requests. It can also make a package release process more secure simply by automating it. However, workflows must be written carefully so that they are safe and do not expose the project to attacks. Understanding how workflows can become part of a GitHub project’s attack surface helps us understand how to protect them.

Note that we’ll focus on threats to workflows running on GitHub-hosted runners, so threats targeting self-hosted runners are out of scope. For a real-life example of threats to self-hosted runners, see the PyTorch vulnerability report case.

TL;DR

This document provides an overview of the most common attack vectors on GitHub workflows and recommendations on how to secure them. In particular, it covers:

Running untrusted code in privileged workflows,
Code injections,
Vulnerable Actions,
Malicious releases,
Tag-Renaming attacks,
Imposter commits,
Unsafe use of caches.

Relevant Concepts

To start, it is essential that we define some concepts regarding GitHub Workflows.

Privileged workflow: a workflow with any of the following:

Write permissions to any GitHub resource, like a repository or branch. A contents: write block grants a workflow write permission over the repository content. For projects created before February 2023, a workflow without explicit permissions has write-all permissions (unless configured otherwise).
Access to secrets, such as ${{ secrets.ACTIONS_TOKEN }}
The ability to produce any sensitive artifact, such as workflows used for build and/or release.

Workflow triggers

In this blogpost we’ll discuss the following triggers:

pull_request: runs on every pull request change. When running on external pull requests, which anyone can submit through a fork, it is not allowed to access secrets or to have write permissions.
pull_request_target: similar to pull_request, but runs as a privileged workflow for external pull requests.
workflow_run: privileged workflow that is triggered after another workflow runs.

All set up, we can dig into the possible ways an attacker could try using our workflows against us.

Untrusted Inputs

When code, variables or any other information used can be manipulated by external sources, we consider them untrusted. This is because an attacker might exploit these elements to jeopardize the repository’s integrity.

In terms of open source, this exploitation is usually done through external pull requests, by either submitting malicious code, or by using information that can be modified by external contributors (such as pull request’s title or commit messages) to perform a code injection.

Privileged triggers

A privileged trigger allows a privileged workflow to run untrusted code, without revoking its privileges.

One example of a non-privileged trigger is pull_request: although it can run with privileges on internal pull requests (trusted code), its privileges are revoked when running on external pull requests (untrusted code). The pull_request_target and workflow_run triggers (like many other triggers) are privileged.

Attack Vectors

By successfully compromising a GitHub workflow, an attacker can poison the repository in many different ways, such as:

Stealing secrets: secrets leaked can be exploited by the attacker to compromise different parts of the development process.
Committing code to the repository: if the workflow running has contents: write permission, the attacker could exploit this permission and force code into the repository’s main branch—unless branch protection is enabled.
Changing pull request checks: this can be used to make a malicious contribution look like a healthy one by overwriting the result of the checks to success instead of failure.
Compromising release artifacts: if the workflow compromised is a release workflow, the attacker can compromise the artifact by making it build from a modified code source, changing dependencies to malicious versions, etc.
Compromising default branch cache: if a workflow uses cache, it may be exposed to cache poisoning if there is any other workflow in the repository (privileged or not) exposed to code injection that runs in the context of the default branch.

To prevent this kind of poisoning, we will look at how to stop attackers from compromising our GitHub workflows.

Running untrusted code

Checking out a pull request’s head in a workflow that uses a privileged trigger makes the workflow run untrusted code with privileges. This is a problem because workflows started by privileged triggers may have access to secrets or have write permissions.

That said, there are some relevant use cases that need to access and run pull request content while also requiring write permissions. A common example is a workflow that labels or comments on a pull request depending on test results.

Let’s see one example of how this attack vector can be used by an attacker. Consider the following workflow:

name: CI
on:
    pull_request_target:
        branches: [ "main" ]
jobs:
    my-tests:
        runs-on: ubuntu-latest
        steps:
        # check out the pull request code (untrusted)
        - uses: actions/checkout@v4
          with:
              ref: ${{ github.event.pull_request.head.sha }}
        - run: ./bash.sh
          env:
              TOKEN: ${{ secrets.MY_SECRET }}

Considering that no permission is specified to the workflow above and that the repository is not configured to limit workflow permissions by default, it will run with write-all permissions.

Thus, this workflow has write permissions and access to secrets, so it is privileged. But it checks out the pull request branch, so all checked-out code is untrusted. For example, an attacker can submit a PR that changes bash.sh to print the secret variable handed to it, gaining access to your secret once the workflow runs.

GitHub redacts secrets from the logs, but it cannot catch reversible transformations such as base32 encoding. With a simple decode, anyone can turn “JFKF6SKTL5AV6U2FINJEKVAK” back into “IT_IS_A_SECRET”.
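To make the leak concrete, here is a minimal shell sketch of what a malicious bash.sh and the attacker's follow-up could look like. The function names leak_secret and recover_secret are hypothetical, and the sketch assumes the GNU coreutils base32 tool is available.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the attack; function names are illustrative only.

# What the attacker's bash.sh would do with the secret it receives:
# print an encoded form that log redaction does not recognize.
leak_secret() {
    echo "$1" | base32
}

# What the attacker does locally with the value copied from the logs.
recover_secret() {
    echo "$1" | base32 --decode
}

leak_secret "IT_IS_A_SECRET"                 # prints JFKF6SKTL5AV6U2FINJEKVAK
recover_secret "JFKF6SKTL5AV6U2FINJEKVAK"    # prints IT_IS_A_SECRET
```

Any reversible transformation would work the same way, which is why redaction alone should not be treated as protection.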

This can be used to expose any secrets on the workflow, such as GitHub PAT tokens or AWS tokens. But, even if the workflow does not use any secrets, if it has write permissions, the attacker might abuse those permissions through the implicit secret GITHUB_TOKEN that is always available on any workflow.

Remediation

When running workflows on untrusted code (e.g. external pull requests), be sure to restrict the permissions to read-only, or divide the workflow into two workflows that share information through artifacts.

For example, this unsafe workflow:

name: Unsafe Workflow
on:
    pull_request_target:

permissions: {}

jobs:
    my-job:
        runs-on: ubuntu-latest
        permissions:
            contents: read
            pull-requests: write
        steps:
        - name: Checks out untrusted code
          uses: actions/checkout@v4
          with:
              ref: ${{ github.event.pull_request.head.sha }}
        # checks anything that needs to add a comment on the PR

        - name: Comment on PR
        # add a comment to the PR where it is running

Can be divided into the following workflows:

A non-privileged (read only with no secrets) workflow that has access to the untrusted code

name: Safe Read Workflow
on:
    pull_request:

permissions: {}

jobs:
    my-job:
        runs-on: ubuntu-latest
        permissions:
            contents: read
        steps:
        - name: Checks out untrusted code
          uses: actions/checkout@v4
        # ... checks anything
        - name: Save the PR id for the next workflow
          run: echo ${{ github.event.number }} > ./pr_number
        - name: Send the PR id for the next workflow
          uses: actions/upload-artifact@v4
          with:
              name: pr_number
              path: pr_number

A workflow with write permissions that has no access to the untrusted code and that runs in the context of the default branch.

name: Safe Write Workflow
on:
    workflow_run:
        workflows: ["Safe Read Workflow"]
        types:
            - completed

permissions: {}

jobs:
    my-job:
        runs-on: ubuntu-latest
        permissions:
            contents: read
            actions: read
            pull-requests: write
        steps:
        - name: Download the artifact
        # downloads the artifact published by Safe Read Workflow
        - name: Comment on PR
          uses: actions/github-script@v7
          with:
              script: |
                  const fs = require('fs');
                  const pr_number = Number(fs.readFileSync('pr_number'));
                  // add a comment to the PR where it is running

See Preventing pwn requests for another example of the steps above. For further information about the risks of running untrusted code, see the “Running Attacker Code” section of Privilege Escalation Inside Your CI/CD Pipeline.

Code injection

Code injection on GitHub Actions is a cyberattack where malicious code is snuck into the workflow execution through a weak spot. This code tricks the workflow into running it, potentially allowing the attacker to steal secrets, compromise the code base, or abuse the permissions granted to the workflow.

There are three main ways code can be injected into a workflow: through an untrusted input, through an untrusted file, or through environment files.

Unsafe Interpolation

Untrusted inputs

There is a long list of event context data that might be attacker controlled and should be treated as potentially untrusted input. Some common examples are an issue or pull-request’s title or body.

Let’s see in practice what a code injection in a GitHub workflow looks like. Consider the following step of a workflow:

run: |
    echo "Comment created by ${{ github.event.comment.user.login }}"
    echo "${{ github.event.comment.body }}"

The ${{ }} expressions are replaced with their values before the shell runs the script. So if the body of the comment that triggered the workflow is equal to $(( 1 + 1 )), the output would be:

Comment created by SomeUser
2

Instead of an innocent sum operation, the injected code could be used to expose secrets, exploit the workflow’s permissions (by committing code to the repository’s main branch, for example), compromise a release artifact to affect its users, and so on.
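The difference between pasting untrusted text into a script and handing it over as data can be reproduced locally. The sketch below is a simulation under stated assumptions, not GitHub's actual implementation: run_interpolated and run_with_env are hypothetical names, and bash -c stands in for the runner executing a run step.

```shell
#!/usr/bin/env bash
# Hypothetical simulation of how a `run:` step is built; not GitHub's real code.

# Unsafe: the untrusted value is pasted into the script text, then the script runs.
run_interpolated() {
    local body="$1"
    local script="echo \"${body}\""
    bash -c "$script"
}

# Safer: the script text is fixed and the value arrives as an environment variable.
run_with_env() {
    BODY="$1" bash -c 'echo "$BODY"'
}

run_interpolated '$(( 1 + 1 ))'   # prints 2
run_with_env '$(( 1 + 1 ))'       # prints $(( 1 + 1 ))
```

In the first function the shell evaluates the attacker's text as code; in the second it only ever sees it as the value of a variable.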

Untrusted files

Another way of getting code injected on a workflow is when two workflows share information through artifacts. The information shared can be, for example, any pull request data (ID, title, author), the result of tests or linters, etc.

The following workflows exemplify this scenario:

# First Workflow
# It produces an artifact (with pull request ID) that the Second Workflow will consume
on: pull_request

jobs:
  my-first-job:
    steps:
      - run: echo ${{ github.event.number }} > artifact.txt
      - uses: actions/upload-artifact@v2
        with:
          name: artifact
          path: artifact.txt

# Second Workflow
# It consumes an artifact produced by the First Workflow
on: workflow_run
jobs:
  my-second-job:
    steps:
      - name: download pr artifact
        uses: dawidd6/action-download-artifact@v2
        with:
          workflow: ${{github.event.workflow_run.workflow_id}}
          run_id: ${{github.event.workflow_run.id}}
          name: artifact
      # Save PR id to output
      - name: Save artifact data
        id: artifact
        run: echo "::set-output name=id::$(cat artifact.txt)"
      - name: Use artifact
        run: echo ${{ steps.artifact.outputs.id }}

The risk of this practice is that, if the first workflow gets compromised, its artifact can no longer be trusted.

This pattern is commonly used to share information between an unprivileged pull_request workflow and a privileged workflow_run workflow, because it avoids running untrusted code in the privileged workflow.

An attacker can compromise the file shared between both workflows. For example, they could submit a PR changing the pull_request workflow artifact content:

# First Workflow
# It produces an artifact (with pull request ID) that the Second Workflow will consume
on: pull_request

jobs:
  my-first-job:
    steps:
      - run: echo "; echo 'attacker controlled code'; #" > artifact.txt
      - uses: actions/upload-artifact@v2
        with:
          name: artifact
          path: ./artifact.txt

Once the workflow_run workflow executes, its last step becomes echo ; echo 'attacker controlled code'; # and it runs the attacker’s code.

Environment file

It is also dangerous to define an environment variable through environment files when the value assigned to it is untrusted.

Both variables declared below are exposed to code injection through file.txt if that file is, in any way, untrusted.

An untrusted file is one that can be potentially compromised by any of the attack vectors described in this blog post. For instance, if the file originates from another workflow that lacks adequate security measures or resides in the repository while the workflow executes on a pull request, the file is considered untrusted.

run: echo "VAR=$(cat file.txt)" >> $GITHUB_ENV

# Or

run: echo "::set-env name=VAR::$(cat file.txt)"

The same can happen even if it is an environment variable with untrusted content:

- env:
      BODY: ${{ github.event.issue.body }}
  run: echo "VAR=${BODY}" >> $GITHUB_ENV

# Or, with the same env block

  run: echo "::set-env name=VAR::${BODY}"

In this case, an issue body like the one below would trick the command into defining as many environment variables as the attacker wants, allowing them to change the behavior of any code that runs afterwards.

FOO
OTHERVAR="Any Value"
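The effect of such a body can be checked locally. In this minimal sketch a temporary file stands in for $GITHUB_ENV, and write_env_unsafe is a hypothetical name for the vulnerable step.

```shell
#!/usr/bin/env bash
# Hypothetical local simulation: a temporary file stands in for $GITHUB_ENV.

# Mimics the vulnerable step: writes VAR=<untrusted body> to the environment file.
write_env_unsafe() {
    local body="$1" env_file="$2"
    echo "VAR=${body}" >> "$env_file"
}

env_file="$(mktemp)"
body=$'FOO\nOTHERVAR="Any Value"'
write_env_unsafe "$body" "$env_file"
cat "$env_file"
# VAR=FOO
# OTHERVAR="Any Value"
```

The single write produced two definitions, because the newline in the body ends the first variable and starts a second one chosen by the attacker.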

Remediation

Prefer Actions over inline scripts. An Action’s parameters are stringified, which prevents code injection by treating the input as text data, making it harmless even if it contains malicious code.
Restrict GitHub token permissions to the minimal required permissions. This way we would limit the damage the attacker could cause even if they were successful in compromising the workflow.

This can be done by either configuring the permissions needed for each workflow:

# Configuring with **no** permission by default for the jobs
# and each job should configure **any** permissions they need.
name: My workflow
permissions: {}
jobs:
  first-job:
    permissions:
      contents: read
  second-job:
    permissions:
      contents: read
      issues: write
  ...

# Configuring with **read** permissions by default for the jobs
# and each job that needs different permissions should configure it themselves.
name: My workflow
permissions:
  contents: read
jobs:
  first-job:
    # no permissions block
  second-job:
    permissions:
      contents: read
      issues: write
  ...

Or by changing the default GITHUB_TOKEN permissions in the repository settings, under Settings > Actions > General > Workflow permissions (any additional read or write permission should be granted as explained above).

Use environment variables when parsing inputs and files. Environment variables are also stringified before running.

# Instead of
run: echo "${{ github.event.issue.title }}"

# Do
env:
    TITLE: ${{ github.event.issue.title }}
run: echo "$TITLE"

Use the Action’s output instead of environment files to share information between steps or jobs. The Action’s outputs are also stringified.
Validate the file or input content if you need to write it to environment files. This way you can be more confident that the content is exactly what is expected. For example, if the file should contain a single number, you could reject any file that contains characters other than the digits 0-9.
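The validation described in the last item could be sketched as a small shell helper. The name read_pr_number and the artifact file name pr_number are assumptions for illustration; adapt the pattern to whatever content your workflow expects.

```shell
#!/usr/bin/env bash
# Hypothetical helper: accepts a file only if it holds a single number.

# Prints the number and returns 0 when the file content is digits only;
# prints a message to stderr and returns 1 otherwise.
read_pr_number() {
    local content
    content="$(cat "$1")"
    if [[ "$content" =~ ^[0-9]+$ ]]; then
        echo "$content"
    else
        echo "unexpected artifact content, refusing to use it" >&2
        return 1
    fi
}

# Possible usage inside a step (pr_number is the downloaded artifact):
#   PR_NUMBER="$(read_pr_number pr_number)" || exit 1
#   echo "PR_NUMBER=${PR_NUMBER}" >> "$GITHUB_ENV"
```

Because the whole content must match the pattern, both shell metacharacters and extra lines are rejected before anything reaches the environment file.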

For further information on Code injection, see Untrusted input, Google & Apache Found Vulnerable to GitHub Environment Injection and “Insecure Usage of Artifacts” section on Privilege Escalation Inside Your CI/CD Pipeline.

Vulnerable Actions

Another risk, and one that is harder to mitigate, is that the Actions you use may themselves be malicious or buggy.

A bug or malicious code in an Action can lead it to silently change unwanted files, compromise release artifacts, abuse the permissions granted to the workflow (changing files in the codebase, or changing the result of the Action to purposely hide bugs and vulnerabilities), or expose the workflow to any of the code injection risks described above.

Remediation

Check for known open vulnerabilities in the Action to be included.
Prioritize GitHub-owned Actions whenever possible.
Restrict GitHub token permissions to the minimal required permissions.

For further information about vulnerable Actions, see Actions That Open the Door to CI/CD Pipeline Attacks.

Vulnerable Releases

A project’s release process is also exposed to attacks, and a compromised release might affect the project’s dependents. Since the dependencies of a GitHub workflow are essentially the GitHub Actions and installable tools it uses, if their release process is compromised, the workflow is compromised too.

An Action, for example, can have its version defined in four main ways:

Branch pin: actions/checkout@main
Major version pin: actions/checkout@v4
Minor version pin: actions/checkout@v4.1.1
Hash pin: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

If an attacker gains full access to the Action’s repository, they can change the source code and publish a new release under the same major version (v4.1.2, for example) to affect all users that reference the Action through either the major version pin or the branch pin.

Opting for a minor version pin avoids this particular risk, but it does not make you safe against this attacker.

Instead of publishing a new release, they can change the commit that an existing version tag points to:

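# delete the existing tag on the remote
git push origin :refs/tags/v4.1.1
# recreate the tag locally, now pointing at the attacker's commit
git tag -fa v4.1.1
# publish the moved tag
git push origin main --tags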

This way they are able to affect both major version and minor version pin users. Users pinning to a specific hash wouldn’t be affected.

Remediation

Restrict GitHub token permissions to the minimal required permissions.
Pin the Action to a specific commit.

This prevents new vulnerabilities from being blindly included in the workflow and avoids tag-renaming attacks. However, if the pin is not proactively updated, the project won’t benefit from security patches, which is why this remediation should be applied together with the following one.

Enable security updates from a dependency update tool (e.g. Dependabot, Renovatebot) to receive security patches as soon as they are released.
Configure Dependabot to periodically look for new versions so that you stay up to date with new fixes and features. Configuring it to run monthly allows a good window for vulnerabilities to be found and fixed before they affect you.
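To find references that are not yet hash pinned, a rough check like the one below can help. It is only a sketch: find_unpinned_actions is a hypothetical name, the matching is line based rather than a real YAML parse, and it does not handle docker:// references.

```shell
#!/usr/bin/env bash
# Hypothetical checker: lists `uses:` references not pinned to a full commit hash.

# Prints every offending line of the given workflow file and returns 1 if any exist.
find_unpinned_actions() {
    local workflow="$1"
    # Keep lines with `uses: owner/repo@ref`, then drop those whose ref
    # is exactly 40 lowercase hexadecimal characters.
    if grep -E 'uses:[[:space:]]*[^[:space:]]+@' "$workflow" \
        | grep -Ev '@[0-9a-f]{40}([[:space:]]|$)'; then
        return 1
    fi
    return 0
}

# Possible usage:
#   find_unpinned_actions .github/workflows/ci.yml || echo "pin the lines above"
```

A line such as actions/checkout@v4 or actions/checkout@main is reported, while actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1 passes.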

For further information about vulnerable releases, see Why you should pin your GitHub Actions by commit-hash.

Imposter Commits

Imposter commits are commits that pretend to be in the original repository but actually belong to a fork. This can happen due to how GitHub handles forks: it shares commits between the fork and its parent. This provides many features that are very convenient to developers, but it is convenient for attackers too.

That’s because, if they fork a repository and publish a malicious commit on it, they are able to reference their commit pretending that it is from the parent repository.

For example, consider that an attacker forked the repository someone/my-project, creating the fork attacker/my-project. The commit attacker/my-project@ea14e30 can be successfully referenced as someone/my-project@ea14e30, pretending to be from the parent repository.

This attack vector was disclosed by Chainguard and, although GitHub has released a feature that shows a warning whenever you open a link to a commit that is not originally from the repository you are looking at, there is still no way to easily identify an imposter commit when evaluating pull requests that upgrade these hashes.

Remediation

Restrict GitHub token permissions to the minimal required permissions.
When upgrading the hash of an Action version on your workflow, ensure that the new hashes belong to the original repository. This can be done by accessing the link to the commit web page (https://github.com/<owner>/<repo>/commit/<hash>) and verifying that there is no warning.
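As a complement to the manual check on the commit page, the hash can also be tested against a fresh clone of the original repository. This is a sketch under the assumption that a regular clone only fetches the parent's own branches and tags, so a commit that exists only in a fork will not be present; commit_in_repo is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks a commit against a local clone of the original repository.

# Returns 0 if the commit is reachable from a branch or tag of the clone at $1,
# and 1 otherwise (including when the commit does not exist in the clone).
commit_in_repo() {
    local repo_dir="$1" sha="$2"
    git -C "$repo_dir" cat-file -e "${sha}^{commit}" 2>/dev/null || return 1
    [ -n "$(git -C "$repo_dir" for-each-ref --contains "$sha" refs/heads refs/remotes refs/tags)" ]
}

# Possible usage:
#   git clone https://github.com/<owner>/<repo> upstream
#   commit_in_repo upstream <hash> || echo "commit is not in the original repository"
```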

For further information about imposter commits, see What the fork? Imposter commits in GitHub Actions and CI/CD.

Caching

Caching can be a very effective way to speed up workflows, but it is important to use it with care. One of the biggest risks of caching is that it can break the temporality and isolation of jobs. Losing these characteristics also implies that, if one job is compromised, we can no longer consider the other jobs safe.

It is also important to keep in mind that anyone can open a pull request in the repository and see the cache content, so no sensitive information should be stored there.

Besides these two factors, there is another risk to be aware of when using the GitHub cache: if an attacker succeeds in compromising a workflow that runs in the context of the default branch (through code injection, for example), even if it is not a privileged workflow, they can use it to steal the cache tokens, which can then be used to move laterally to other workflows that use caching.

This would allow compromising privileged workflows, even if they are not running or dealing with any type of untrusted content.

Remediation

Do not cache sensitive information.
When caching, if one job runs untrusted code, consider all the other jobs also untrusted. It means following the remediation for Running untrusted code.
Never run untrusted code within the context of the default branch. Even if you are not using cache on any of your workflows, the third-party actions you use can be caching under the hood.

Thus, be sure that all of your workflows are safe from code injection and that you are not running unsafe code in the context of the default branch. Use pull_request triggers instead: even though they run untrusted code, they are considered safe because they run it in the context of the pull request branch with no privileges.

Since it is difficult to be 100% sure our workflows cannot be compromised, an additional security practice is to avoid caching privileged or sensitive workflows at all (e.g. release workflows).

For further reading on cache poisoning and its risks, see The monsters in your build cache – GitHub actions cache poisoning.

Main Takeaways

After going through all the attack vectors that can target GitHub Workflows, it is clear that there are many things to consider and be cautious about. Instead of having to remember them all, you can try the OpenSSF Scorecard CLI Tool (or its Action) to help you proactively identify many of these risks.

Here are some key takeaways:

Restrict GitHub token permissions to the minimum required ones. This was mentioned as a remediation for almost all the attack vectors and the reason is that this reduces the damage any of these attack vectors could cause by basically limiting the permissions the workflow has on the repository.

But it is also important to notice that it doesn’t make a compromised workflow completely harmless, so it is important to also follow the other remediations whenever possible.

Never run untrusted code in privileged workflows or in the context of the default branch. Instead, follow the recommendation of dividing the workflow in two: a privileged and a non-privileged workflow.

Be careful when handling untrusted data. Always pass potentially attacker-controlled data through an environment variable.

Hash pin and adopt a dependency update tool. Hash pin your Actions to ensure they are immutable and adopt a dependency update tool to get them updated regularly (once a week or month for example). Also, be cautious when receiving external contributions that update these hashes to avoid imposter commits.

About the Author

Joyce Brum is a Software Engineer at Google, full time member of the Google Open Source Security Team that works in improving the Open Source ecosystem security by making changes with a security-minded focus and implementing best practices.
