Dump stacktraces from goroutines on timeouts for inspection #442

Open
driv3r wants to merge 4 commits into main from add-instrumentation-to-debug-stuck-tests

driv3r (Contributor) commented Jun 16, 2026

This should help debug the timeouts that we are having in tests.

driv3r requested a review from a team June 16, 2026 14:56
driv3r added 3 commits June 16, 2026 19:58
Add bounded Ruby test synchronization and watchdog diagnostics so CI hangs fail
with useful output instead of reaching the job timeout.

- Replace unbounded status-handler sleeps with a BlockingGate helper
- Track active Ghostferry status handlers for timeout diagnostics
- Add bounded subprocess joins with Ruby thread and Go goroutine dumps
- Add per-test Ruby wall-clock watchdog
- Keep expected interrupt/failure panic stacks buffered instead of live-printing
- Configure Go test timeouts and GOTRACEBACK for fuller goroutine dumps
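
For illustration, the bounded-wait idea behind the first bullet could look roughly like the sketch below. This is not the code from this PR: the class body, the `wait(timeout:)` signature, and the `TimeoutError` name are assumptions made for the example. Only the `BlockingGate` name comes from the commit message.

```ruby
# Hypothetical sketch of a bounded gate. The real BlockingGate helper in this
# PR may have a different interface.
class BlockingGate
  class TimeoutError < StandardError; end

  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @open = false
  end

  # Release every waiter, current and future.
  def open
    @mutex.synchronize do
      @open = true
      @cond.broadcast
    end
  end

  # Block until the gate opens, or raise once `timeout` seconds have passed.
  # The loop guards against spurious wakeups by re-checking the flag and the
  # remaining time on every pass.
  def wait(timeout:)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      until @open
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        raise TimeoutError, "gate not opened within #{timeout}s" if remaining <= 0
        @cond.wait(@mutex, remaining)
      end
    end
    true
  end
end
```

A status handler would call `gate.wait(timeout: ...)` where it previously slept indefinitely, and the test body would call `gate.open` when it is ready to let the handler continue. A handler that is never released then fails with an error instead of hanging until the CI job timeout.
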
Replace the ConditionVariable-based per-test watchdog with a Queue-backed
interruptible timer so normal teardown cancels the watchdog reliably.

The previous implementation could wake during teardown and incorrectly fire
while the main test thread was joining the watchdog, causing a false timeout
diagnostic and aborting otherwise healthy tests.

- Use Thread::Queue#pop(timeout:) to distinguish release vs timeout
- Capture watchdog state in locals to avoid cross-test ivar aliasing
- Keep timeout diagnostics unchanged for genuine stuck tests
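
A minimal sketch of the Queue-backed timer described above, assuming Ruby 3.2 or newer (where `Thread::Queue#pop` accepts `timeout:` and returns `nil` when the timeout elapses). The `TestWatchdog` class, its `limit:` parameter, and the `release` method are hypothetical names for this example, not the PR's actual code.

```ruby
# Hypothetical sketch of an interruptible per-test watchdog.
# Requires Ruby 3.2+ for Thread::Queue#pop(timeout:).
class TestWatchdog
  def initialize(limit:, &on_timeout)
    # The thread closes over locals only, so a later test that reassigns
    # instance variables cannot change what this watchdog observes.
    queue = Thread::Queue.new
    @queue = queue
    @thread = Thread.new do
      # pop returns the pushed token on release, or nil once `limit` elapses.
      signal = queue.pop(timeout: limit)
      on_timeout.call if signal.nil?
    end
  end

  # Called from teardown: wake the timer with a token, then wait for it to exit.
  def release
    @queue << :released
    @thread.join
  end
end
```

Because the release token is pushed before the join, a watchdog that wakes during teardown sees a non-nil value and exits quietly. The timeout callback runs only when nothing was pushed within `limit` seconds.
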

2 participants