XCom in Airflow (2024)

Mastering XComs in Apache Airflow: Cross‑Task Communication Without the Pain

This post is written for data engineers who want to move beyond basic tasks and build real DAGs.

One of the first surprises when learning Airflow is that tasks run in isolation from each other. You can’t just set task_2.data = task_1.data. So how do you pass a value from one task to another? With XComs (short for "cross-communications"): small values that one task pushes and a downstream task pulls.
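To make the idea concrete, here is a toy model in plain Python. It is not Airflow code: the table layout and the `xcom_push` / `xcom_pull` helpers are hypothetical and only loosely named after Airflow's task-instance methods. The one detail taken from Airflow is the default key, `return_value`. The sketch shows two functions that share no variables and still exchange a value through a small store.

```python
import json
import sqlite3

# Toy stand-in for the metadata database (NOT Airflow's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xcom ("
    "task_id TEXT, key TEXT, value TEXT, PRIMARY KEY (task_id, key))"
)


def xcom_push(task_id: str, value, key: str = "return_value") -> None:
    """Serialize the value to JSON and store it as one row."""
    conn.execute(
        "INSERT OR REPLACE INTO xcom VALUES (?, ?, ?)",
        (task_id, key, json.dumps(value)),
    )


def xcom_pull(task_id: str, key: str = "return_value"):
    """Look up the row written by another task and deserialize it."""
    row = conn.execute(
        "SELECT value FROM xcom WHERE task_id = ? AND key = ?",
        (task_id, key),
    ).fetchone()
    return None if row is None else json.loads(row[0])


def task_1() -> None:
    # The "upstream task": it only writes to the store.
    xcom_push("task_1", {"user_id": 123})


def task_2() -> int:
    # The "downstream task": it only reads from the store.
    data = xcom_pull("task_1")
    return data["user_id"]


task_1()
result = task_2()
```

The two functions never touch each other's state; the only link between them is the stored row, which is the same shape of contract XComs give real tasks.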

✅ or ensure upstream dependencies with >>.

❌ Using XComs for many small values across many tasks

Each XCom is a DB row. 10 000 tasks × 5 XComs = 50 000 rows, which is fine. But 100 000 tasks × 10 XComs = 1 million rows, which is slow.

Advanced: XCom Backends

Airflow 2.0+ lets you store XComs outside the metadata DB through a custom XCom backend. This is useful if you need slightly larger values or lower DB load.
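The usual pattern behind such a backend is to write large values somewhere else and keep only a small reference in the database. Below is a minimal sketch of that pattern in plain Python, under loud assumptions: a temporary directory stands in for external storage, the 1 KB threshold is arbitrary, and the two functions are hypothetical stand-ins. In Airflow 2.x a custom backend is typically a subclass of `BaseXCom` that overrides `serialize_value` and `deserialize_value`, but check the documentation for your version before relying on exact signatures.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical stand-in for external storage (e.g. a bucket).
STORE = Path(tempfile.mkdtemp())
# Hypothetical threshold: values above this size are offloaded.
MAX_INLINE_BYTES = 1024


def serialize_value(value) -> str:
    """Return the string that would be stored in the XCom row."""
    payload = json.dumps(value)
    if len(payload.encode("utf-8")) <= MAX_INLINE_BYTES:
        # Small value: keep it inline in the row.
        return json.dumps({"inline": value})
    # Large value: write it to the store and keep only a reference.
    path = STORE / f"{uuid.uuid4().hex}.json"
    path.write_text(payload, encoding="utf-8")
    return json.dumps({"ref": str(path)})


def deserialize_value(stored: str):
    """Resolve a stored row back into the original value."""
    record = json.loads(stored)
    if "ref" in record:
        return json.loads(Path(record["ref"]).read_text(encoding="utf-8"))
    return record["inline"]


small = {"user_id": 123}
big = {"ids": list(range(5000))}

small_row = serialize_value(small)
big_row = serialize_value(big)
```

Wrapping every value in an `inline` or `ref` record avoids guessing whether a stored string is a real value or a pointer, at the cost of a few extra bytes per row.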

    from airflow.decorators import task  # Airflow 2.x import path

    @task
    def extract() -> dict:
        return {"user_id": 123, "name": "Alice"}  # pushed automatically

    # `process` is another @task-decorated function (its definition is not shown here)
    process(extract())  # XCom passed implicitly

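    from airflow.decorators import task  # Airflow 2.x import path

    # Both names are @task-decorated functions (their definitions are not shown here)
    process_record(get_latest_record_id())

    # One producer whose return value can be read by several consumers
    @task
    def produce_data():
        return {"ids": [1, 2, 3], "source": "api"}

    @task
    def consume_one(data):
        return f"Got {data['ids'][0]}"

    @task
    def consume_two(data):
        return f"Got {data['source']}"

    # Dynamic task mapping: one download task instance per URL
    @task
    def fetch_urls() -> list[str]:
        return ["http://a.com", "http://b.com"]

    @task
    def download(url: str) -> str:
        # download content
        return f"content_of_{url}"

    # `aggregate` is another @task-decorated function (its definition is not shown here)
    aggregate(download.expand(url=fetch_urls()))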
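As a rough mental model, `download.expand(url=fetch_urls())` fans out into one `download` run per URL, and the downstream task then receives the collected results. The sketch below mimics that flow in plain Python, without Airflow. The `aggregate` body is a hypothetical reducer, since the original does not define it, and in Airflow the mapped results arrive as a list-like sequence, not necessarily a plain list.

```python
def fetch_urls() -> list[str]:
    return ["http://a.com", "http://b.com"]


def download(url: str) -> str:
    # Placeholder for a real download.
    return f"content_of_{url}"


def aggregate(contents: list[str]) -> int:
    # Hypothetical reducer: count how many results came back.
    return len(contents)


# Fan out: one call per element, like one mapped task instance per URL.
mapped_results = [download(url=u) for u in fetch_urls()]

# Fan in: the downstream step sees all mapped results together.
total = aggregate(mapped_results)
```

The difference in Airflow is that the number of mapped runs is decided at runtime from the upstream XCom, and each run is scheduled, retried, and logged as its own task instance.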