Airflow tutorial 2: Set up an Airflow environment with Docker
In this tutorial, we will learn how to set up an Airflow environment using Docker.
The GitHub links for this tutorial
Airflow problem
- Open source software evolves at an overwhelming pace.
- Airflow is built to integrate with all databases, systems, cloud environments, …
- Managing and maintaining all of the resulting dependency changes is really difficult.
- Setting up and configuring an Airflow environment takes a lot of time.
- It is not obvious how to share development and production environments among all developers.
Docker overview
- Docker is an open platform for developing, shipping, and running applications.
- Docker provides the ability to package and run an application in a loosely isolated environment called a container. The isolation and security allow you to run many containers simultaneously on a given host, regardless of where it runs: Mac, Windows, PC, cloud, data center, …
- Containers are lightweight because they don’t need the extra load of a hypervisor, but run directly within the host machine’s kernel. This means you can run more containers on a given hardware combination than if you were using virtual machines.
Benefits of using Docker
- Docker frees us from managing and maintaining all of the Airflow dependencies and the deployment.
- It is easy to share and deploy different versions and environments.
- Versions can be tracked through GitHub tags and releases.
- Deployments move easily from the testing environment to the production environment.
Airflow Docker image
Fork and clone the GitHub repo below and follow the instructions to set it up:
- airflow-tutorial: https://github.com/tuanavu/airflow-tutorial
Prerequisites
- Install Docker
- Install Docker Compose
- Follow the Airflow releases on the Python Package Index
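Before moving on, it can help to confirm that the prerequisites are actually installed. The following is a minimal sketch, not part of the tutorial repo: the helper name `find_missing_tools` is hypothetical, and the command names `docker` and `docker-compose` are assumptions (some Docker installations provide Compose as a `docker compose` plugin instead of a separate binary).

```python
import shutil

# Assumed command names; adjust them if your installation differs
# (for example, if Compose is only available as "docker compose").
REQUIRED_TOOLS = ("docker", "docker-compose")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Example usage:
#   missing = find_missing_tools()
#   if missing:
#       print("Missing prerequisites:", ", ".join(missing))
```

An empty list means every listed command was found on PATH. It does not verify that the Docker daemon is running.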
Getting Started
- Clone the repo
- Install the prerequisites
- Run the service
- Check http://localhost:8080
- Done!
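The exact commands for running the service are in the repo's README. The last step, checking http://localhost:8080, can be automated, because the webserver may take a while to come up after the containers start. Here is a minimal sketch using only the Python standard library. The helper name `wait_for_http` and the timeout values are assumptions for illustration, not something the tutorial defines.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout_s=120.0, interval_s=2.0):
    """Poll `url` until the server answers, or give up after `timeout_s` seconds.

    Returns True as soon as any HTTP response arrives (including error
    statuses, which still show the server is up), and False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server replied with a 4xx/5xx status, so it is reachable.
            return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused, reset, or similar: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage, once the service has been started:
#   if wait_for_http("http://localhost:8080"):
#       print("Airflow web UI is reachable")
```

A True result only shows that something is answering on that port. Opening the URL in a browser is still the way to confirm that the Airflow UI loads.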