Fix auth-aware request deduplication #314

Closed
3em0 wants to merge 1 commit into D4Vinci:main from 3em0:fix/scheduler-auth-context

3em0 commented May 31, 2026

Summary

  • include authentication-related request context in the default scheduler fingerprint
  • keep non-auth headers behind fp_include_headers to preserve normal deduplication behavior
  • add regression tests covering Authorization, Cookie, extra_headers, and explicit cookies in the request context
  • update scheduler deduplication docs

Closes #313
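
For readers who want the idea in concrete form, here is a minimal sketch of an auth-aware fingerprint, not the code from this PR. The function name `request_fingerprint`, the `AUTH_HEADERS` set, and the SHA-1/JSON encoding are all assumptions made for illustration; only the `fp_include_headers` name comes from the summary above, and its real semantics in Scrapling may differ.

```python
import hashlib
import json
from typing import Mapping, Optional

# Hypothetical: header names treated as authentication context.
AUTH_HEADERS = frozenset({"authorization", "cookie"})


def request_fingerprint(
    method: str,
    url: str,
    headers: Optional[Mapping[str, str]] = None,
    cookies: Optional[Mapping[str, str]] = None,
    fp_include_headers: bool = False,
) -> str:
    """Return a stable hash that changes when the auth context changes."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {k.lower(): v for k, v in (headers or {}).items()}

    # By default keep only auth-related headers so that unrelated headers
    # (User-Agent, Accept, ...) do not defeat normal deduplication.
    if not fp_include_headers:
        normalized = {k: v for k, v in normalized.items() if k in AUTH_HEADERS}

    payload = {
        "method": method.upper(),
        "url": url,
        "headers": sorted(normalized.items()),
        "cookies": sorted((cookies or {}).items()),
    }
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha1(blob).hexdigest()
```

Under these assumptions, two requests to the same URL with different `Authorization` values or different cookies get different fingerprints, while requests that differ only in a non-auth header still collapse into one unless `fp_include_headers=True` is passed.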

Tests

  • pytest tests/spiders/test_request.py tests/spiders/test_scheduler.py
  • ruff check scrapling/spiders/request.py tests/spiders/test_request.py tests/spiders/test_scheduler.py

D4Vinci added the "invalid" (This doesn't seem right) label Jun 7, 2026

D4Vinci (Owner) commented Jun 7, 2026

Closing as per the issue.

D4Vinci closed this Jun 7, 2026
Labels

invalid This doesn't seem right

Development

Successfully merging this pull request may close these issues.

Authenticated requests can be incorrectly deduplicated when headers/cookies are omitted from scheduler fingerprint