Apache Airflow providers are add-on packages that allow Airflow to interface with external systems. Installing them extends the capabilities of Apache Airflow.
The full list of providers can be found here:
https://airflow.apache.org/docs/apache-airflow-providers/packages-ref.html
First, we need to install the Docker provider, which this example uses:
pip install 'apache-airflow-providers-docker'
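To confirm that the install worked, one option is to ask Python's packaging metadata for the provider's version. This is a minimal sketch using only the standard library; the helper name provider_version is hypothetical and is not part of Airflow.

```python
from importlib import metadata


def provider_version(package="apache-airflow-providers-docker"):
    """Return the installed version of a provider package, or None if it is absent.

    provider_version is a hypothetical helper written for this example.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        return None


version = provider_version()
if version is None:
    print("Docker provider is not installed in this environment")
else:
    print("Docker provider version:", version)
```

Run it with the same Python interpreter that Airflow uses, otherwise it may report on a different environment.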
Now let's look at how a task is defined, using a simple example:
from airflow import DAG
from datetime import datetime, timedelta
from airflow.providers.docker.operators.docker import DockerOperator

default_args = {
    "owner": "airflow",
    "email_on_failure": False,
    "email_on_retry": False,
    "email": "airflowadmin@airflow.com",
    "retries": 1,
    "retry_delay": timedelta(minutes=5)
}

with DAG("forex_data_pipeline", start_date=datetime(2023, 11, 14), schedule_interval="@daily",
         default_args=default_args, catchup=False) as dag:
    docker_task = DockerOperator(
        task_id='docker_task',
        image='python:3.7',
        api_version='auto',
        auto_remove=True,
        command='/bin/sleep 30'
    )
- First, we create an instance of DockerOperator and assign it to the docker_task variable.
- We always have to specify task_id. The task ID must be unique across all of the operators in the same DAG.
- image='python:3.7' specifies the Docker image that the operator will use to create a container.
- Next, api_version specifies which Docker API version to use. The value 'auto' lets the Docker client detect the server's API version automatically.
- Then we have auto_remove=True, which means the container will be removed once it has finished executing.
- Finally, command is the command to run inside the container. Here, /bin/sleep 30 makes the container sleep for 30 seconds.
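To make the parameters above more concrete, it can help to see roughly which docker run command line they correspond to. The sketch below is only an approximation for illustration: the operator talks to the Docker daemon through its API rather than shelling out, and the helper docker_run_argv is hypothetical.

```python
import shlex


def docker_run_argv(image, command, auto_remove=True):
    """Build an approximate `docker run` command line for the given settings.

    docker_run_argv is a hypothetical helper, not part of Airflow or Docker.
    """
    argv = ["docker", "run"]
    if auto_remove:
        argv.append("--rm")  # remove the container after it exits
    argv.append(image)
    argv.extend(shlex.split(command))  # split the command string into arguments
    return argv


argv = docker_run_argv("python:3.7", "/bin/sleep 30")
print(shlex.join(argv))  # docker run --rm python:3.7 /bin/sleep 30
```

Running the printed command by hand is a quick way to check that the image and command behave as expected before wiring them into a DAG.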
Now we can test whether this task runs successfully. To do that, run the command below:
airflow tasks test forex_data_pipeline docker_task 2023-11-01
Here, forex_data_pipeline is the DAG ID, docker_task is the task ID, and 2023-11-01 is an execution date in the past.
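If you test several tasks, a small script can assemble the command from its three parts and catch a malformed date before Airflow is invoked. This is a sketch under assumptions: the helper airflow_test_command is hypothetical, and it only accepts plain ISO dates, which is a simplification made for this example.

```python
import shlex
from datetime import date


def airflow_test_command(dag_id, task_id, execution_date):
    """Return the argument list for `airflow tasks test`.

    airflow_test_command is a hypothetical helper written for this example.
    """
    # Raises ValueError if execution_date is not a valid ISO date string.
    date.fromisoformat(execution_date)
    return ["airflow", "tasks", "test", dag_id, task_id, execution_date]


cmd = airflow_test_command("forex_data_pipeline", "docker_task", "2023-11-01")
print(shlex.join(cmd))  # airflow tasks test forex_data_pipeline docker_task 2023-11-01
```

Keeping the DAG ID and task ID as separate arguments also avoids accidentally running the two together into a single word.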