Airflow tutorial 7: Airflow variables

In this tutorial, we explore how to use Airflow variables.

The GitHub links for this tutorial

What are Airflow variables?

  • Variables are a key-value store kept in Airflow’s metadata database.
  • They are used to store and retrieve arbitrary content or settings from the metadata database.

When to use Variables

  • Variables are mostly used to store static values like:
    • config variables
    • a configuration file
    • list of tables
    • list of IDs to dynamically generate tasks from
  • Separate the constants and variables from pipeline code:
    • It is useful to have some variables or configuration items accessible and modifiable through the UI.
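
As a rough sketch of the "list of IDs to dynamically generate tasks from" case, the snippet below turns the JSON text of a list-valued variable into one task id per entry. It runs without Airflow: the function name task_ids_from_variable, the process_ prefix and the table names are all hypothetical, and in a real DAG file the raw value would come from the variable itself and each id would be passed to an operator's task_id.

```python
import json


def task_ids_from_variable(raw_value, prefix="process_"):
    """Turn the JSON text of a list-valued variable into one task id per entry."""
    ids = json.loads(raw_value)
    return [f"{prefix}{item}" for item in ids]


# Hypothetical stored value of a variable that holds a list of table names.
raw_tables = '["users", "orders", "payments"]'
task_ids = task_ids_from_variable(raw_tables)
```

Editing the list in the UI would then change which tasks are generated, without touching the pipeline code.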

Working with Variables

  • Variables can be listed, created, updated and deleted from the UI (Admin -> Variables).
  • In addition, JSON settings files can be bulk uploaded through the UI. Please look at an example here for a variable JSON settings file.
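
To give an idea of what such a settings file contains, here is a minimal sketch that writes one with Python's standard library. A settings file is a single JSON object whose top-level keys become variable names. The key example_variables_config and the file name example_variables.json are the ones used elsewhere in this tutorial. The values are illustrative, and the helper write_settings and the temporary directory are assumptions made for the sake of a runnable example.

```python
import json
import os
import tempfile

# Illustrative values only; each top-level key becomes one Airflow variable.
settings = {
    "example_variables_config": {
        "var1": "value1",
        "var2": [1, 2, 3],
        "var3": {"k": "value3"},
    }
}


def write_settings(path, data):
    """Write the variables dict as a JSON file suitable for bulk upload."""
    with open(path, "w") as f:
        json.dump(data, f, indent=4)
    return path


# Written to a temporary directory here; in practice the file would
# typically live somewhere alongside the DAGs.
settings_path = write_settings(
    os.path.join(tempfile.mkdtemp(), "example_variables.json"), settings
)
```

The resulting file can then be uploaded from Admin -> Variables or imported through the command line.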

Restrict the number of Airflow variables in your DAG

  • Airflow Variables are stored in the metadata database, so every call to a variable means a connection to the metadata DB. The scheduler re-parses DAG files regularly, so calls made at the top level of a DAG file are repeated on every parse.
    • Fetching a large number of variables in your DAG may end up saturating the number of allowed connections to your database.
    • It is recommended that you store all your DAG configuration inside a single Airflow variable with a JSON value.

  • You can then access the variables as follows:
from airflow.models import Variable

# Config variables
## Common
var1 = "value1"
var2 = [1, 2, 3]
var3 = {'k': 'value3'}

## 3 DB connections called
## (each call is a separate lookup and returns the stored value as a string)
var1 = Variable.get("var1")
var2 = Variable.get("var2")
var3 = Variable.get("var3")

## Recommended way
## (one lookup; deserialize_json=True parses the stored JSON into a dict)
dag_config = Variable.get("example_variables_config", deserialize_json=True)
var1 = dag_config["var1"]
var2 = dag_config["var2"]
var3 = dag_config["var3"]

# You can directly use a variable from a jinja template
## {{ var.value.<variable_name> }}

t2 = BashOperator(
    task_id="get_variable_value",
    bash_command='echo {{ var.value.var3 }} ',
    dag=dag,
)

## {{ var.json.<variable_name> }}
t3 = BashOperator(
    task_id="get_variable_json",
    bash_command='echo {{ var.json.example_variables_config.var3 }} ',
    dag=dag,
)
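
The difference between the two access patterns above can be sketched without a running Airflow instance. The class FakeVariableStore below is a hypothetical stand-in for the metadata database, not part of Airflow. It only counts how many lookups each pattern makes, and it mimics the get(key, deserialize_json=...) call shape used in this tutorial.

```python
import json


class FakeVariableStore:
    """Hypothetical stand-in for the metadata database; counts lookups."""

    def __init__(self, rows):
        self.rows = rows      # variable name -> stored string value
        self.lookups = 0      # incremented once per simulated DB round trip

    def get(self, key, deserialize_json=False):
        self.lookups += 1
        raw = self.rows[key]
        return json.loads(raw) if deserialize_json else raw


store = FakeVariableStore({
    "var1": "value1",
    "var2": "[1, 2, 3]",
    "var3": '{"k": "value3"}',
    "example_variables_config": json.dumps(
        {"var1": "value1", "var2": [1, 2, 3], "var3": {"k": "value3"}}
    ),
})

# One lookup per variable: three round trips, each returning a string.
separate = [store.get(name) for name in ("var1", "var2", "var3")]
lookups_separate = store.lookups

# One JSON variable: a single round trip, then plain dict access.
store.lookups = 0
dag_config = store.get("example_variables_config", deserialize_json=True)
var2 = dag_config["var2"]
lookups_single = store.lookups
```

With three settings the saving is small, but it grows with every setting added to the single JSON variable, since the lookup count stays at one.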

Access variables through Airflow command line

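  • You can run some CRUD operations on variables through the Airflow CLI.
  • Some commands you can run in this example (these use the Airflow 1.x flag syntax; Airflow 2.x uses subcommands instead, e.g. airflow variables get var1):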
# get value of var1
docker-compose run --rm webserver airflow variables --get var1

# set value of var4
docker-compose run --rm webserver airflow variables --set var4 value4

# import variable json file
docker-compose run --rm webserver airflow variables --import /usr/local/airflow/dags/config/example_variables.json
