Monitoring Airflow Metrics
By Hiren Rupchandani & Mukesh Kumar
Table of Contents
1. StatsD
2. Prometheus
3. Grafana
- Monitoring the metrics of data pipelines can be tricky. In Airflow, you have to jump between the webserver UI, the Python code, the DAG logs, and other monitoring tools.
- In this article, we will explore a monitoring system consisting of StatsD, Prometheus, and Grafana; as we set them up, we will see what each of them specializes in.
- Airflow exposes metrics such as DAG bag size, number of currently running tasks, and task duration for as long as the cluster is running.
- You can find a list of all the exposed metrics, along with descriptions, in the official Airflow documentation.
Prerequisites
- You need to have Docker installed on your system before proceeding with the following steps.
- We have Docker Desktop installed with a WSL2 backend; the steps below assume a similar setup.
Let’s get started…
1. StatsD
- StatsD is a widely used service for collecting and aggregating metrics from various sources.
- Airflow has built-in support for sending metrics to a StatsD server.
- Once configured, Airflow will push its metrics to StatsD, and from there we will be able to collect and visualize them.

- You need to open the airflow.cfg file and search for statsd.
- You will see the following variables:
[metrics]
statsd_on = False
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
- Set statsd_on = True. If you wish to change the port on which the metrics are emitted, you can edit statsd_port here as well.
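- After the change, the block should read as follows; keep the host and port at their defaults if, as below, you run the StatsD listener (and later the exporter) on localhost:8125:
[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow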
- Now run a DAG in Airflow and, in a different terminal, type the following command to listen for the UDP packets on the StatsD port:
nc -l -u localhost 8125
- This command prints the raw Airflow metrics that the StatsD listener receives.
- The output is continuous, since metrics keep arriving at regular intervals for as long as Airflow is running.
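- The exact metrics and values depend on your DAGs, but the raw StatsD lines follow the name:value|type format and look something like this (my_dag and my_task are placeholder names):
airflow.scheduler_heartbeat:1|c
airflow.dagbag_size:5|g
airflow.dag.my_dag.my_task.duration:2731|ms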

- We will now use the StatsD exporter to forward these metrics to Prometheus.
2. Prometheus
- Prometheus is a popular solution for storing metrics and alerting on them.
- It is typically used to collect metrics from many kinds of sources, such as RDBMSs and web servers, and we will use it as the main storage for our metrics.
- Since Airflow doesn't have a direct integration with Prometheus, we'll use the Prometheus StatsD Exporter to collect the metrics and transform them into a Prometheus-readable format.
- The exporter bridges the gap between StatsD and Prometheus by translating StatsD metrics into Prometheus metrics via configured mapping rules.

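- As an illustration, here is a minimal sketch of such a mapping rule (it goes in the mapping.yml file described next). It collapses every top-level airflow.* StatsD metric into a single Prometheus metric named airflow, keeping the original metric name in a label, which is the shape assumed by the Grafana query later in this article. Metrics with more dot-separated components, such as per-task timers, would need additional rules, and the files in our repository may differ:
mappings:
  - match: "airflow.*"
    name: "airflow"
    labels:
      host: "airflowStatsD"
      metric: "$1"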
- To set up statsd-exporter, you need to write two files, prometheus.yml and mapping.yml; you can find both in our GitHub repository linked below.
- Store these files inside a new folder named ".prometheus" inside your airflow directory.
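- For reference, a minimal prometheus.yml just needs one scrape job pointed at the exporter's HTTP port (9123 on the host, as mapped in the next command); the scrape interval and target shown here are assumptions consistent with the query labels used later in this article:
global:
  scrape_interval: 30s
scrape_configs:
  - job_name: airflow
    static_configs:
      - targets: ["host.docker.internal:9123"]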
- After you have copied these two files into the folder, run the following command in an Ubuntu terminal from inside that folder, since the mapping file is mounted from the current directory (stop the nc listener first, as the exporter also binds UDP port 8125):
docker run --name=prom-statsd-exporter \
-p 9123:9102 \
-p 8125:8125/udp \
-v $PWD/mapping.yml:/tmp/mapping.yml \
prom/statsd-exporter \
--statsd.mapping-config=/tmp/mapping.yml \
--statsd.listen-udp=:8125 \
--web.listen-address=:9102
- If you see the following line in the output, you are good to go:
level=info ts=2021-10-04T04:38:30.408Z caller=main.go:358 msg="Accepting Prometheus Requests" addr=:9102
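- You can also sanity-check the exporter from the host: with the port mapping above, its web endpoint is exposed on port 9123, and once a DAG has run you should see airflow-prefixed series in the output of:
curl http://localhost:9123/metrics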
- Keep this terminal open and run the next command in a new Ubuntu terminal, again from inside the .prometheus folder so that prometheus.yml can be mounted:
docker run --name=prometheus \
-p 9090:9090 \
-v $PWD/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--log.level=debug \
--web.listen-address=:9090
- Prometheus has been set up successfully if you see the following line in your output:
level=info ts=2021-10-02T21:09:59.717Z caller=main.go:794 msg="Server is ready to receive web requests."
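- To confirm that Prometheus is actually scraping the exporter, open http://localhost:9090/targets in a browser, or query the HTTP API directly (the metric name airflow assumes the mapping sketched earlier):
curl 'http://localhost:9090/api/v1/query?query=airflow'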
3. Grafana

- Grafana is our preferred metrics visualization tool.
- It has native Prometheus support, and we will use it to set up our Airflow cluster monitoring dashboard.
- Here you can see all the vital metrics: scheduler heartbeat, DAG bag size, queued/running task counts, currently running DAGs aggregated by task, and so on.
- You can also see DAG-related metrics: successful DAG run duration, failed DAG run duration, DAG run dependency check time, and DAG run schedule delay.
- To set up Grafana on your system, type the following command in your Ubuntu terminal:
docker run -d --name=grafana -p 3000:3000 grafana/grafana
- After a successful setup, go to http://localhost:3000/ to access the Grafana dashboard; the default login credentials are admin/admin unless you have changed them.
- On the dashboard, click the icon below the + icon to create a new data source, select the Prometheus data source, and in the URL field type the following (host.docker.internal lets the Grafana container reach Prometheus on the host):
http://host.docker.internal:9090

- Now, create a new dashboard and give it any name. You will be directed to an Edit Panel.

- In the Metrics browser, you can type the metric you want to monitor. We will check the scheduler heartbeat, so we type the following PromQL query:
rate(airflow{host="airflowStatsD", instance="host.docker.internal:9123", job="airflow", metric="scheduler_heartbeat"}[1m])
- A plot (a time series in our case) is generated for the above query; rate() converts the ever-growing heartbeat counter into a per-second rate, averaged over the trailing one-minute window indicated by [1m].
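- Other panels follow the same pattern. For example, assuming the mapping sketched earlier, the DAG bag size is a gauge and can be plotted with a plain instant query:
airflow{metric="dagbag_size"}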

- In the panel options, you can also format the plot, for example by setting axis labels and a panel title.

- Congratulations! You have successfully set up a monitoring system for Airflow using StatsD, Prometheus, and Grafana.