Apache Airflow 3.x
Chapter 1: Introduction to Apache Airflow 3.x
Learning Objectives:
- Understand what Apache Airflow is and its use cases
- Learn the key differences between Airflow 3.x and Airflow 2.x
- Install and configure an Airflow 3.x development environment
- Navigate the Airflow Web UI and CLI
Brief Description: Get started with Apache Airflow 3.x from scratch, understand workflow orchestration concepts, and set up your first development environment.
Chapter 2: Core Concepts and Architecture
Learning Objectives:
- Understand DAGs, Tasks, and Operators
- Learn the Airflow 3.x architecture (Scheduler, Executor, API Server, DAG Processor), where the API Server replaces the 2.x Webserver
- Understand the new Task Execution Interface in Airflow 3.x
- Learn how the metadata database and message broker work together
Brief Description: Deep dive into Airflow’s core architecture and fundamental concepts that form the foundation for building workflows.
Chapter 3: Writing Your First DAG
Learning Objectives:
- Create a basic DAG with the TaskFlow API
- Understand DAG parameters and configuration
- Use the @dag and @task decorators
- Run and monitor your DAG in the Web UI
Brief Description: Write your first complete DAG using the modern TaskFlow API, learning the preferred Airflow 3.x programming style.
Chapter 4: Built-in Operators and Sensors
Learning Objectives:
- Master common operators (BashOperator, PythonOperator, EmailOperator)
- Understand Sensors and their poke/reschedule modes
- Use FileSensor, HttpSensor, and ExternalTaskSensor
- Choose between operators and TaskFlow-decorated functions
Brief Description: Explore the rich library of built-in operators and sensors, understanding when and how to use each type.
Chapter 5: Task Dependencies and Control Flow
Learning Objectives:
- Define task dependencies using >>, <<, and chain()
- Use BranchPythonOperator for conditional workflows
- Implement trigger rules (all_success, one_failed, none_skipped, etc.)
- Use task groups to organize complex DAGs
Brief Description: Master task dependency patterns and control flow mechanisms to build sophisticated workflow logic.
Chapter 6: XCom and Data Passing
Learning Objectives:
- Understand XCom for inter-task communication
- Push and pull XCom values manually and automatically
- Use the TaskFlow API for implicit XCom passing
- Handle large data with custom XCom backends
Brief Description: Learn how tasks communicate and share data using XCom, including the streamlined approach in Airflow 3.x.
Chapter 7: Scheduling and Timetables
Learning Objectives:
- Master cron expressions and preset schedules
- Use the Timetable API for custom scheduling logic
- Understand data intervals and logical dates
- Configure catchup, backfill, and max_active_runs
Brief Description: Control when and how your DAGs run with flexible scheduling options and the powerful Timetable API.
Chapter 8: Connections, Hooks, and Providers
Learning Objectives:
- Configure connections via UI, CLI, and environment variables
- Use hooks to interact with external systems (databases, cloud services)
- Install and use provider packages (AWS, GCP, Azure, etc.)
- Build a data pipeline with real external integrations
Brief Description: Connect Airflow to the outside world using connections, hooks, and the extensive provider ecosystem.
Chapter 9: Dynamic DAGs and Advanced Patterns
Learning Objectives:
- Generate DAGs dynamically from configuration files
- Use dynamic task mapping (expand/map) for parallel processing
- Compare TaskGroups with legacy SubDAGs (SubDAGs were removed in Airflow 3.x)
- Apply asset-driven scheduling in Airflow 3.x (assets were called datasets in Airflow 2.x)
Brief Description: Learn advanced DAG patterns including dynamic generation, mapped tasks, and event-driven scheduling with assets (formerly datasets).
Chapter 10: Testing and Debugging DAGs
Learning Objectives:
- Write unit tests for DAGs, tasks, and custom operators
- Use the airflow dags test and airflow tasks test commands
- Debug DAGs locally with IDE integration
- Validate DAG integrity and detect import errors
Brief Description: Ensure workflow reliability through comprehensive testing strategies and effective debugging techniques.
Chapter 11: Security, RBAC, and Multi-Tenancy
Learning Objectives:
- Configure Role-Based Access Control (RBAC) in Airflow 3.x
- Manage users, roles, and permissions
- Secure connections and variables with secrets backends
- Implement multi-tenancy patterns for team isolation
Brief Description: Secure your Airflow deployment with proper access control, secrets management, and multi-tenancy strategies.
Chapter 12: Production Deployment and Operations
Learning Objectives:
- Deploy Airflow on Kubernetes with the official Helm chart
- Configure CeleryExecutor and KubernetesExecutor for scalability
- Set up monitoring with Prometheus, Grafana, and StatsD
- Implement CI/CD pipelines for DAG deployment
Brief Description: Take Airflow to production with scalable deployment architectures, monitoring, and operational best practices.