fix(scheduler): set end_date on tasks skipped by dagrun timeout#63250

Open
YoannAbriel wants to merge 1 commit into apache:main from YoannAbriel:fix/issue-58536

@YoannAbriel
Contributor

Problem

When a DAG run times out via dagrun_timeout, unfinished task instances are marked SKIPPED but their end_date is never set. The UI computes duration as now - start_date, so skipped tasks show a continuously increasing duration — confusing and incorrect.
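The fallback described above can be illustrated with a minimal sketch. The function name `displayed_duration` and the timestamps are hypothetical and are not taken from the Airflow UI code; the sketch only mirrors the behaviour stated in this description (duration measured against the current time when `end_date` is missing).

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def displayed_duration(
    start_date: datetime, end_date: datetime | None, now: datetime
) -> timedelta:
    """Hypothetical stand-in for the duration logic described in this PR."""
    # With an end_date the duration is fixed; without one it is measured
    # against "now" and therefore grows on every refresh.
    return (end_date or now) - start_date


start = datetime(2026, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
skipped_at = start + timedelta(seconds=25)

# Unpatched behaviour: end_date is None, so the value depends on when you look.
first_view = displayed_duration(start, None, now=start + timedelta(minutes=1))
later_view = displayed_duration(start, None, now=start + timedelta(minutes=10))

# Patched behaviour: end_date is set, so the value no longer changes.
fixed_view = displayed_duration(start, skipped_at, now=start + timedelta(minutes=10))
```

With `end_date` populated, the result is the same no matter when the page is loaded, which is the behaviour this PR aims for.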

Root Cause

In SchedulerJobRunner._schedule_dag_run, the timeout handler sets task_instance.state = TaskInstanceState.SKIPPED but doesn't touch end_date. Without an end_date, the UI falls back to computing live duration.

Fix

Set end_date = timezone.utcnow() alongside the state change for all tasks skipped by dagrun timeout. Added a test that creates a timed-out DAG run with running tasks and verifies end_date is populated after the scheduler processes it.
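As a rough illustration of the change, the sketch below uses stand-in classes rather than Airflow's models. `State`, `FakeTaskInstance`, and `skip_unfinished_on_timeout` are hypothetical names, and the standard library's `datetime.now(timezone.utc)` stands in for `timezone.utcnow()`. Leaving task instances with no state untouched is an assumption chosen to match the results reported later in this thread, not a statement about how the scheduler selects rows.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class State(str, Enum):  # stand-in for Airflow's TaskInstanceState
    RUNNING = "running"
    SUCCESS = "success"
    SKIPPED = "skipped"


@dataclass
class FakeTaskInstance:  # stand-in, not Airflow's TaskInstance model
    task_id: str
    state: State | None = None
    start_date: datetime | None = None
    end_date: datetime | None = None


def skip_unfinished_on_timeout(tis: list[FakeTaskInstance], now: datetime) -> list[str]:
    """Mark running tasks as SKIPPED and stamp end_date in the same step."""
    skipped = []
    for ti in tis:
        if ti.state is State.RUNNING:
            ti.state = State.SKIPPED
            ti.end_date = now  # the addition this PR describes
            skipped.append(ti.task_id)
    return skipped


now = datetime.now(timezone.utc)
quick = FakeTaskInstance("quick_task", State.SUCCESS,
                         now - timedelta(seconds=40), now - timedelta(seconds=35))
slow = FakeTaskInstance("slow_task", State.RUNNING, now - timedelta(seconds=35))
final = FakeTaskInstance("final_task")  # never started

skipped_ids = skip_unfinished_on_timeout([quick, slow, final], now)
```

Using a single `now` value for every task skipped in the same pass keeps their `end_date` values identical.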

Closes: #58536


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code

Generated-by: Claude Code following the guidelines



@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Mar 10, 2026
@potiuk
Member

potiuk commented Mar 10, 2026

Did you actually check if it works? Or is that change just PR-ed without actually running some real tests? Can you show screenshots and logs from the execution?

@potiuk
Member

potiuk commented Mar 10, 2026

Explanation: the scheduler is a vital part of Airflow. To modify it, you need to know exactly what you are doing and actually test it.

@YoannAbriel
Contributor Author

Fair — unit tests only so far. Will set up a local environment with a DAG that hits the timeout path and share logs/screenshots.

@YoannAbriel
Contributor Author

Reproduced and verified on Airflow 3.1.8 (standalone, SequentialExecutor). Test DAG with dagrun_timeout=timedelta(seconds=30) and 3 tasks chained: quick_task (5s) -> slow_task (120s) -> final_task (5s).

Before fix (unpatched 3.1.8):

| Task | State | end_date |
| --- | --- | --- |
| quick_task | success | 2026-03-11 21:05:05 |
| slow_task | skipped | None |
| final_task | None | None |

After fix:

| Task | State | end_date |
| --- | --- | --- |
| quick_task | success | 2026-03-11 21:09:21 |
| slow_task | skipped | 2026-03-11 21:09:45 |
| final_task | None | None (never started, expected) |

The change is two lines in _schedule_dag_run: now = timezone.utcnow() and task_instance.end_date = now, both applied before the skipped tasks are merged.

@YoannAbriel YoannAbriel force-pushed the fix/issue-58536 branch 4 times, most recently from 2b79864 to 64899cc on March 16, 2026 16:08
@YoannAbriel YoannAbriel force-pushed the fix/issue-58536 branch 2 times, most recently from 06edc79 to 6ebc21a on March 23, 2026 19:07
@YoannAbriel YoannAbriel force-pushed the fix/issue-58536 branch 6 times, most recently from 24902f4 to c497cda on April 8, 2026 18:05
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Apr 8, 2026
@YoannAbriel YoannAbriel force-pushed the fix/issue-58536 branch 4 times, most recently from 9e738f5 to d817d2a on April 10, 2026 09:04
@kaxil kaxil requested a review from Copilot April 10, 2026 19:55
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@YoannAbriel YoannAbriel force-pushed the fix/issue-58536 branch 3 times, most recently from 74c3ea4 to df2246a on April 13, 2026 06:06

When a DAG run times out via dagrun_timeout, unfinished tasks are
marked as SKIPPED but end_date was not set. This caused task duration
to keep increasing in the UI even though the task was already skipped.

Set end_date to the current time when marking tasks as SKIPPED during
DAG run timeout handling.

Closes: apache#58536

Development

Successfully merging this pull request may close these issues.

Task duration keeps increasing for skipped tasks due to DAG timeout

3 participants