Apache Airflow 3.x

Haiyue

Chapter 1: Introduction to Apache Airflow 3.x

Learning Objectives:

  1. Understand what Apache Airflow is and its use cases
  2. Learn the key differences between Airflow 3.x and Airflow 2.x
  3. Install and configure an Airflow 3.x development environment
  4. Navigate the Airflow Web UI and CLI

Brief Description: Get started with Apache Airflow 3.x from scratch, understand workflow orchestration concepts, and set up your first development environment.

Chapter 2: Core Concepts and Architecture

Learning Objectives:

  1. Understand DAGs, Tasks, and Operators
  2. Learn the Airflow 3.x architecture (Scheduler, Executor, API Server, DAG Processor)
  3. Understand the new Task Execution Interface in Airflow 3.x
  4. Learn how the metadata database and, when using CeleryExecutor, the message broker work together

Brief Description: Deep dive into Airflow’s core architecture and fundamental concepts that form the foundation for building workflows.
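
As a rough illustration of the DAG concept, the sketch below models tasks and their upstream dependencies as a plain dictionary and derives a valid run order with Python's standard-library graphlib module. The task names are hypothetical, and this is not how Airflow's scheduler is implemented; it only shows why a workflow graph has to be acyclic before any ordering exists.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def run_order(deps):
    """Return one valid execution order, upstream tasks first."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """Return True if the dependency graph is not a valid DAG."""
    try:
        run_order(deps)
    except CycleError:
        return True
    return False


order = run_order(dependencies)
print(order)
```

A graph such as {"a": {"b"}, "b": {"a"}} makes has_cycle return True, which is the conceptual reason a cyclic workflow cannot be scheduled.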

Chapter 3: Writing Your First DAG

Learning Objectives:

  1. Create a basic DAG with the TaskFlow API
  2. Understand DAG parameters and configuration
  3. Use the @dag and @task decorators
  4. Run and monitor your DAG in the Web UI

Brief Description: Write your first complete DAG using the modern TaskFlow API, learning the preferred Airflow 3.x programming style.

Chapter 4: Built-in Operators and Sensors

Learning Objectives:

  1. Master common operators (BashOperator, PythonOperator, EmailOperator)
  2. Understand Sensors and their poke/reschedule modes
  3. Use FileSensor, HttpSensor, and ExternalTaskSensor
  4. Choose between operators and TaskFlow-decorated functions

Brief Description: Explore the rich library of built-in operators and sensors, understanding when and how to use each type.

Chapter 5: Task Dependencies and Control Flow

Learning Objectives:

  1. Define task dependencies using >>, <<, and chain()
  2. Use BranchPythonOperator for conditional workflows
  3. Implement trigger rules (all_success, one_failed, none_skipped, etc.)
  4. Use task groups to organize complex DAGs

Brief Description: Master task dependency patterns and control flow mechanisms to build sophisticated workflow logic.
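
To make the trigger rules named above more concrete, here is a deliberately simplified sketch in plain Python. It assumes every upstream task has already finished in one of three states, and the should_run helper is hypothetical rather than part of Airflow. Airflow itself tracks more states and can evaluate some rules, such as one_failed, before all upstream tasks are done.

```python
# Simplified model: each upstream task has finished in one of these states.
SUCCESS, FAILED, SKIPPED = "success", "failed", "skipped"


def should_run(trigger_rule, upstream_states):
    """Decide whether a task may run, given its finished upstream states."""
    states = list(upstream_states)
    if trigger_rule == "all_success":
        # Every upstream task must have succeeded.
        return all(state == SUCCESS for state in states)
    if trigger_rule == "one_failed":
        # At least one upstream task must have failed.
        return any(state == FAILED for state in states)
    if trigger_rule == "none_skipped":
        # No upstream task may have been skipped; failures are allowed.
        return all(state != SKIPPED for state in states)
    raise ValueError(f"rule not covered by this sketch: {trigger_rule}")


# A branch that skipped one path blocks all_success but not one_failed.
print(should_run("all_success", [SUCCESS, SKIPPED]))
print(should_run("one_failed", [SUCCESS, FAILED]))
```

The contrast between all_success and none_skipped is the part that matters most after a branch: a skipped upstream task blocks both, while a failed one blocks only all_success.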

Chapter 6: XCom and Data Passing

Learning Objectives:

  1. Understand XCom for inter-task communication
  2. Push and pull XCom values manually and automatically
  3. Use the TaskFlow API for implicit XCom passing
  4. Handle large data with custom XCom backends

Brief Description: Learn how tasks communicate and share data using XCom, including the streamlined approach in Airflow 3.x.

Chapter 7: Scheduling and Timetables

Learning Objectives:

  1. Master cron expressions and preset schedules
  2. Use the Timetable API for custom scheduling logic
  3. Understand data intervals and logical dates
  4. Configure catchup, backfill, and max_active_runs

Brief Description: Control when and how your DAGs run with flexible scheduling options and the powerful Timetable API.
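
The relationship between a logical date and its data interval can be sketched with plain datetime arithmetic. The example assumes an interval-based daily schedule, where the logical date marks the start of the interval and the run can only begin once the interval has ended. The daily_data_interval helper is hypothetical, and whether a plain cron string produces intervals like this depends on the Airflow version and configuration, so treat it as a mental model rather than Airflow's behavior.

```python
from datetime import datetime, timedelta, timezone


def daily_data_interval(logical_date):
    """Return (start, end) of the one-day interval starting at logical_date."""
    start = logical_date.replace(hour=0, minute=0, second=0, microsecond=0)
    end = start + timedelta(days=1)
    return start, end


logical_date = datetime(2025, 1, 1, tzinfo=timezone.utc)
start, end = daily_data_interval(logical_date)

# Under this model the run covering 1 January starts no earlier than
# the end of the interval, i.e. midnight on 2 January.
earliest_run = end
print(start.isoformat(), end.isoformat())
```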

Chapter 8: Connections, Hooks, and Providers

Learning Objectives:

  1. Configure connections via UI, CLI, and environment variables
  2. Use hooks to interact with external systems (databases, cloud services)
  3. Install and use provider packages (AWS, GCP, Azure, etc.)
  4. Build a data pipeline with real external integrations

Brief Description: Connect Airflow to the outside world using connections, hooks, and the extensive provider ecosystem.

Chapter 9: Dynamic DAGs and Advanced Patterns

Learning Objectives:

  1. Generate DAGs dynamically from configuration files
  2. Use dynamic task mapping (expand/map) for parallel processing
  3. Understand why TaskGroups replace SubDAGs, which were removed in Airflow 3.x
  4. Apply asset-driven scheduling (formerly datasets) in Airflow 3.x

Brief Description: Learn advanced DAG patterns including dynamic generation, mapped tasks, and event-driven scheduling with assets (formerly datasets).

Chapter 10: Testing and Debugging DAGs

Learning Objectives:

  1. Write unit tests for DAGs, tasks, and custom operators
  2. Use airflow dags test and airflow tasks test commands
  3. Debug DAGs locally with IDE integration
  4. Validate DAG integrity and detect import errors

Brief Description: Ensure workflow reliability through comprehensive testing strategies and effective debugging techniques.

Chapter 11: Security, RBAC, and Multi-Tenancy

Learning Objectives:

  1. Configure Role-Based Access Control (RBAC) in Airflow 3.x
  2. Manage users, roles, and permissions
  3. Secure connections and variables with secrets backends
  4. Implement multi-tenancy patterns for team isolation

Brief Description: Secure your Airflow deployment with proper access control, secrets management, and multi-tenancy strategies.

Chapter 12: Production Deployment and Operations

Learning Objectives:

  1. Deploy Airflow on Kubernetes with the official Helm chart
  2. Configure CeleryExecutor and KubernetesExecutor for scalability
  3. Set up monitoring with Prometheus, Grafana, and StatsD
  4. Implement CI/CD pipelines for DAG deployment

Brief Description: Take Airflow to production with scalable deployment architectures, monitoring, and operational best practices.