Skip to content

br: Cherry pick from feature restore from replication storage#68601

Open
Leavrth wants to merge 10 commits into
pingcap:release-8.5from
Leavrth:cherry-pick-from-feature-restore-from-replication-storage
Open

br: Cherry pick from feature restore from replication storage#68601
Leavrth wants to merge 10 commits into
pingcap:release-8.5from
Leavrth:cherry-pick-from-feature-restore-from-replication-storage

Conversation

@Leavrth

@Leavrth Leavrth commented May 22, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • CLI flags to retain newest MVCC per key during compacted-SST restore, control restore phase, and set replication-status sub-prefix.
    • Runtime probe to detect store support for batch-download of latest-MVCC SSTs; graceful guidance if unsupported.
    • Tagged backup-meta filenames and resume-state helpers to detect DDL metas and persist/restore resume state.
  • Refactor

    • Split compacted-SST restore into collect + restore phases with checkpoint-aware progress; improved metadata caching and concurrent meta-KV reads; importer parallelizes batch downloads per peer.
  • Tests

    • Added SST collection, concurrency-focused meta/KV read tests, batch-download parallelism, and compaction-coverage validation tests.

Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
@ti-chi-bot ti-chi-bot Bot added do-not-merge/cherry-pick-not-approved release-note-none Denotes a PR that doesn't merit a release note. labels May 22, 2026
@ti-chi-bot

ti-chi-bot Bot commented May 22, 2026

Copy link
Copy Markdown

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must be LGTMed and approved by the reviewers firstly.
  2. For pull requests to TiDB-x branches, it must have no failed tests.
  3. AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 22, 2026
@tiprow

tiprow Bot commented May 22, 2026

Copy link
Copy Markdown

Hi @Leavrth. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds BatchDownloadLatestMVCC and a support probe; implements retain-latest-MVCC SST download and routing in SnapFileImporter; refactors SST restore into collect+restore phases; adds two‑phase stream restore and replication resume-state; introduces tagged backupmeta filename flags and makes metadata reads concurrency-safe; updates tests and build pins.

Changes

Latest MVCC SST Download and Restoration

Layer / File(s) Summary
Import client latest-MVCC RPC methods
br/pkg/restore/internal/import_client/import_client.go
ImporterClient adds BatchDownloadLatestMVCC and CheckBatchDownloadLatestMVCCSupport; importClient implements the RPC delegation and store-level probe.
SnapFileImporter latest-MVCC infrastructure
br/pkg/restore/snap_client/import.go, br/pkg/restore/snap_client/client.go, br/pkg/restore/snap_client/export_test.go, br/pkg/restore/snap_client/import_test.go
Adds retainLatestMVCCVersion option, capability probe wiring, batchDownloadNewestVersionSST implementation, response handling/truncation, per-store throttling, and tests asserting per-peer parallelization for both LatestMVCC and SST paths.
SST download routing for latest-MVCC mode
br/pkg/restore/snap_client/import.go
Routes TiDBCompacted downloads through newest-version batch download when flag enabled; adjusts workerpool sizing and ApiVersion handling.
LogClient SST collection refactoring
br/pkg/restore/log_client/client.go, br/pkg/restore/log_client/client_test.go
Adds CollectSSTFileSets to pre-collect compacted SST groups and KV counts; RestoreSSTFileSets now restores from pre-collected sets; InitClients accepts retainLatestMVCCVersion and wires importer options.
Meta-KV filtering and concurrent reads
br/pkg/restore/log_client/*
Introduces shouldReadMetaKVFile/countReadableMetaKVFiles, CF-aware splitting, and concurrent meta-KV file reads with a bounded worker pool to aggregate and sort KV entries.

Two-Phase Stream Restore with Checkpoint Preservation

Layer / File(s) Summary
RestoreConfig flags and fields for two-phase restore
br/pkg/task/restore.go
Adds CLI flags and config fields: RetainLatestMVCCVersion, ReplicationStatusSubPrefix, RestorePhase, FromReplicationStorage, RestoreInPhase.
Flag parsing and validation
br/pkg/task/restore.go, br/pkg/task/config_test.go
Parses new flags, validates allowed phase values, and enforces checkpoint mode when phase-based restore is active; adds unit test for phase/checkpoint validation.
RunRestore phase 1 early exit & abort wiring
br/pkg/task/restore.go
Pauses the registry task and returns early for phase 1 to preserve checkpoints; RunRestoreAbort resolves log info using replication-storage parameters.
Stream restore flow changes
br/pkg/task/stream.go
Switches SST restore to CollectSSTFileSets + RestoreSSTFileSets, adds checkpoint-aware progress accounting, validates compaction coverage when retain-latest-MVCC is enabled, probes for remaining WriteCF logs and skips user DML restore when appropriate, and wires RetainLatestMVCCVersion into client init.
Replication status persistence
br/pkg/task/stream.go
Adds PersistentState, safe subdir normalization, status filename helpers, and reading LastCheckpoint from replication storage.
Helpers: startTS and log-file probe
br/pkg/task/stream.go, br/pkg/task/stream_test.go
Tightens restore start-TS selection and adds hasAnyWriteCFLogFile plus tests for detecting WriteCF log files.

Metadata filename flags and cache concurrency

Layer / File(s) Summary
Tagged backupmeta parser
br/pkg/stream/backupmetas/parser.go
Adds NameFlagsTag, ParsedName.Flags/HasFlags, CalculateShiftTS, HasDDLFiles, and TryParseTaggedBackupMetaFileName.
Shift-TS filename-based update
br/pkg/stream/stream_metas.go
Switches UpdateShiftTS to try filename parsing first and adds wrapper/fallback to metadata-based UpdateShiftTSFromMetadata; metadata keyed by filename.
Metadata cache & ReadFile API
br/pkg/stream/stream_mgr.go
Make ContentRef/MetadataHelper concurrency-safe (per-entry mutex + cache RW lock), add rawLength parameter to ReadFile/decode, and add skipCondition to FastUnmarshalMetaData.
Metadata read tests
br/pkg/stream/*_test.go, br/pkg/restore/log_client/*_test.go
Add gated read wrappers, concurrency assertions, DDL-only metadata filtering, and readable-file counting tests.

Migration validation

Layer / File(s) Summary
Retain-latest-MVCC compaction coverage check
br/pkg/restore/log_client/migration.go, br/pkg/restore/log_client/migration_test.go
Parse compaction comments into intervals/shard metadata and verify full shard coverage across the restore range; tests exercising multiple success/failure scenarios.

Utility refactors and build updates

Layer / File(s) Summary
iter chunkMapping refactor
br/pkg/utils/iter/combinator_types.go, br/pkg/utils/iter/combinator_test.go, br/pkg/utils/iter/BUILD.bazel
Replace errgroup-based buffer fill with producer/consumer mapping, update tests, and remove errgroup from build deps.
Dependency and Bazel patches
go.mod, DEPS.bzl, br/pkg/task/BUILD.bazel, br/pkg/restore/log_client/BUILD.bazel
Update kvproto pseudo-version and DEPS pin, increment test shard_count(s), add //br/pkg/stream/backupmetas to log_client deps, and adjust NewSnapFileImporterOptions call-sites.
SnapClient option call-sites
br/pkg/restore/snap_client/client.go, br/pkg/restore/snap_client/export_test.go
Update NewSnapFileImporterOptions invocations to pass new boolean argument for the retain flag.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • pingcap/tidb#67142: modifies SnapFileImporter options/argument wiring around NewSnapFileImporterOptions.
  • pingcap/tidb#67268: touches ReadFilteredEntriesFromFiles / meta-KV reading semantics.

Suggested labels

ok-to-test, type/cherry-pick-for-release-8.5

Suggested reviewers

  • YuJuncen
  • RidRisR

"I tunneled through branches to fetch the newest leaf—
The SSTs I carried, trimmed to just the briefest grief.
Phase one I curtsied, left checkpoints snug and warm,
Metadata flags I nibbled through every file and form.
Hop, hop, restore, the rabbit hums a tiny charm."

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description follows the required template structure with checklist items marked, but lacks critical content: no Issue Number/link, no Problem Summary, and no explanation of what changed or how it works. Complete the description by adding Issue Number (close #xxx), Problem Summary explaining the feature/fix, and 'What changed and how does it work' section detailing the implementation.
Docstring Coverage ⚠️ Warning Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'br: Cherry pick from feature restore from replication storage' is partially related to the changeset, referring to a real aspect of the changes (restore features and replication storage handling), but lacks sufficient specificity about the main implementations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
br/pkg/restore/log_client/client.go (1)

337-338: ⚡ Quick win

Add contextual wrapping when SST collection fails.

Returning raw iterator/rewrite errors here makes restore failures harder to triage (which table/subcompaction failed). Please annotate with table context before returning.

Suggested diff
@@
-       if r.Err != nil {
-           return nil, 0, r.Err
+       if r.Err != nil {
+           return nil, 0, errors.Annotate(r.Err, "failed while iterating compacted SSTs")
        }
@@
-       newRules, err := rc.rewriteRulesFor(i, rewriteRules)
+       newRules, err := rc.rewriteRulesFor(i, rewriteRules)
        if err != nil {
-           return nil, 0, err
+           return nil, 0, errors.Annotatef(err, "failed to prepare rewrite rules for table %d", i.TableID())
        }
As per coding guidelines: Go code: Keep error handling actionable and contextual; avoid silently swallowing errors.

Also applies to: 351-353

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/restore/log_client/client.go` around lines 337 - 338, Wrap the raw
iterator/rewrite error (r.Err) at the return sites so the error includes table
and subcompaction context before returning; replace returns like "return nil, 0,
r.Err" with a wrapped error using fmt.Errorf (or errors.Wrapf) that adds
identifiers present in scope (e.g. tableID/tableName, cfName, subCompactionID or
range/start/end) such as fmt.Errorf("collect SSTs failed for table %s cf %s
subcompaction %d: %w", tableName, cfName, subCompactionID, r.Err); do the same
for the other return site(s) that currently return r.Err (lines around the
second occurrence).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@br/pkg/task/restore.go`:
- Around line 535-553: The validation and normalization of RestorePhase,
ReplicationStatusSubPrefix and their derived booleans (RestoreInPhase,
FromReplicationStorage) are currently only performed in ParseFromFlags; move
that logic into the config's Adjust() method so non-CLI code paths also enforce
the invariant and normalize values. Specifically, perform: (1) compute
RestoreInPhase = flags-equivalent check (i.e. whether RestorePhase != 0 or
ReplicationStatusSubPrefix non-empty) inside Adjust(), (2) validate RestorePhase
is 1 or 2 when RestoreInPhase is true, (3) enforce the checkpoint requirement
(UseCheckpoint) when RestoreInPhase is true, and (4) set FromReplicationStorage
based on ReplicationStatusSubPrefix presence inside Adjust(); remove or keep
only minimal flag parsing in ParseFromFlags so it only reads raw values and
defers normalization/validation to Adjust().
- Around line 1084-1093: The code currently treats a failed
restoreRegistry.PauseTask as non-fatal: it logs a warning then unconditionally
logs that phase 1 finished and returns nil; change this so that inside the
IsStreamRestore && cfg.RestorePhase == 1 branch, if cfg.RestoreID != 0 and
restoreRegistry.PauseTask(c, cfg.RestoreID) returns an error you return that
error (or wrap it) instead of continuing; only emit the success
log.Info("restore phase 1 finished...") and return nil when PauseTask succeeded
(or when cfg.RestoreID == 0), and remove the misleading success log when
PauseTask failed. Ensure the change touches the PauseTask call, the surrounding
if block (cfg.RestoreID check), and the final return path.

In `@br/pkg/task/stream.go`:
- Around line 1738-1745: The async DML-check using
hasDMLFilesResultCh/hasAnyLogFiles must be made synchronous and performed before
calling RestoreSSTFileSets when cfg.RetainLatestMVCCVersion is true so we
fail-fast on DML-bearing inputs; replace the goroutine+buffered channel pattern
that defers waiting with a direct call to hasAnyLogFiles(ctx, logFilesIter),
handle its error immediately, and if hasFiles is true return the appropriate
error (or abort) before invoking RestoreSSTFileSets (references:
hasDMLFilesResultCh, hasAnyLogFiles, RestoreSSTFileSets,
cfg.RetainLatestMVCCVersion, logFilesIter).

In `@go.mod`:
- Around line 349-350: The go.mod currently replaces github.com/pingcap/kvproto
with a fork github.com/leavrth/kvproto at v0.0.0-20260505121738-8ce467678bf2
which is a fork-only commit; for a release either remove that replace and
restore usage of github.com/pingcap/kvproto (undo the replace line) or, if the
forked changes are absolutely required, keep the replace but add a clear
justification and remediation steps: add a short comment above the replace in
go.mod referencing a tracking issue/PR ID, document the reason and
compatibility/interop implications in RELEASE_NOTES (or
SECURITY/THIRD-PARTY.md), open an upstreaming PR to pingcap/kvproto and include
a TODO in go.mod to remove the replace once upstreamed (and state the target
release version). Ensure the unique symbol to change is the replace directive
for github.com/pingcap/kvproto => github.com/leavrth/kvproto
v0.0.0-20260505121738-8ce467678bf2.

---

Nitpick comments:
In `@br/pkg/restore/log_client/client.go`:
- Around line 337-338: Wrap the raw iterator/rewrite error (r.Err) at the return
sites so the error includes table and subcompaction context before returning;
replace returns like "return nil, 0, r.Err" with a wrapped error using
fmt.Errorf (or errors.Wrapf) that adds identifiers present in scope (e.g.
tableID/tableName, cfName, subCompactionID or range/start/end) such as
fmt.Errorf("collect SSTs failed for table %s cf %s subcompaction %d: %w",
tableName, cfName, subCompactionID, r.Err); do the same for the other return
site(s) that currently return r.Err (lines around the second occurrence).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ccfab626-c6f1-47fc-b0bd-f7640f6b0a59

📥 Commits

Reviewing files that changed from the base of the PR and between 15cb1cf and e525781.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • br/pkg/restore/internal/import_client/import_client.go
  • br/pkg/restore/log_client/client.go
  • br/pkg/restore/log_client/client_test.go
  • br/pkg/restore/snap_client/client.go
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/restore/snap_client/import.go
  • br/pkg/task/BUILD.bazel
  • br/pkg/task/config_test.go
  • br/pkg/task/restore.go
  • br/pkg/task/stream.go
  • go.mod

Comment thread br/pkg/task/restore.go Outdated
Comment thread br/pkg/task/restore.go
Comment thread br/pkg/task/stream.go Outdated
Comment thread go.mod Outdated
@codecov

codecov Bot commented May 22, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 48.36066% with 378 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release-8.5@bd0431a). Learn more about missing BASE report.

⚠️ Current head e21bff8 differs from pull request most recent head 4b181bb

Please upload reports for the commit 4b181bb to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             release-8.5     #68601   +/-   ##
================================================
  Coverage               ?   55.8649%           
================================================
  Files                  ?       1829           
  Lines                  ?     669099           
  Branches               ?          0           
================================================
  Hits                   ?     373792           
  Misses                 ?     267898           
  Partials               ?      27409           
Flag Coverage Δ
integration 39.2862% <33.7893%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 55.3026% <0.0000%> (?)
parser ∅ <0.0000%> (?)
br 62.3769% <0.0000%> (?)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Leavrth Leavrth mentioned this pull request May 22, 2026
13 tasks
Leavrth added 2 commits May 22, 2026 22:27
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
@Leavrth Leavrth force-pushed the cherry-pick-from-feature-restore-from-replication-storage branch from e525781 to d455458 Compare May 22, 2026 14:29

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
br/pkg/task/stream.go (1)

2083-2089: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a doc comment for GetStatusFileName.

This new exported helper is missing a Go doc comment.

As per coding guidelines, "Go code: Keep exported-symbol doc comments, and prefer semantic constraints over name restatement".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/stream.go` around lines 2083 - 2089, Add a Go doc comment for the
exported function GetStatusFileName that explains what the function returns and
under what conditions it can error: mention that it normalizes the provided
storage sub-directory using normalizeStorageSubDir and returns the path to the
status file "resume-state.json" within that normalized subdirectory, and that an
error is returned if normalization fails; prefer a semantic description rather
than repeating the function name.
♻️ Duplicate comments (3)
br/pkg/task/stream.go (1)

1738-1745: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Block on the DML probe before restoring SSTs.

This still applies SST data before waiting on hasDMLFilesResultCh, so --retain-latest-mvcc-version can fail only after a partial restore has already happened. The DML-file check needs to complete before RestoreSSTFileSets.

Also applies to: 1764-1785

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/stream.go` around lines 1738 - 1745, The DML probe launched into
hasDMLFilesResultCh must be awaited before applying SSTs: when
cfg.RetainLatestMVCCVersion is true, block on reading from hasDMLFilesResultCh
(the value produced by hasAnyLogFiles via hasLogFilesResult) and handle its
error/result before calling RestoreSSTFileSets; do the same for the other
occurrence that currently applies SST data (the second block using
hasDMLFilesResultCh around RestoreSSTFileSets) so SST restore is skipped or
aborted if the probe reports an error or indicates DML files exist.
br/pkg/task/restore.go (2)

1084-1093: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make phase-1 registry pause failure fatal.

If PauseTask fails here, the function still logs phase-1 success and returns nil, which leaves checkpoint data behind without a paused restore registration for the follow-up phase.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/restore.go` around lines 1084 - 1093, The current block treats
PauseTask failure as non-fatal: in IsStreamRestore + cfg.RestorePhase == 1 path,
if restoreRegistry.PauseTask(c, cfg.RestoreID) returns an error the code only
logs a warning and then logs phase-1 success and returns nil; change this so
that when restoreRegistry.PauseTask(...) fails you log the error (use log.Error
or similar) and return that error (or a wrapped error) immediately instead of
proceeding to log success and return nil; ensure the success log ("restore phase
1 finished...") only runs when PauseTask succeeds and reference the same
cfg.RestoreID and restoreRegistry.PauseTask symbols when modifying control flow.

535-552: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize phase/replication restore state in Adjust(), not only during flag parsing.

These derived fields are still set only in ParseFromFlags, so non-CLI paths that populate RestoreConfig and then call Adjust() can miss replication-storage mode or skip phase validation entirely.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/restore.go` around lines 535 - 552, Move the derived/validation
logic for phase and replication-storage out of ParseFromFlags into
RestoreConfig.Adjust(): in Adjust(), compute RestoreInPhase = (RestorePhase >
0), validate RestorePhase is either 1 or 2 when RestoreInPhase, enforce that
RestoreInPhase requires UseCheckpoint, and set FromReplicationStorage =
(len(ReplicationStatusSubPrefix) > 0); keep ParseFromFlags responsible only for
reading flags into fields (it can still call cfg.Adjust() at the end) so non-CLI
code paths that populate RestoreConfig and call Adjust() get consistent
normalization and validation.
🧹 Nitpick comments (1)
br/pkg/stream/stream_metas.go (1)

295-315: ⚡ Quick win

Add doc comments for the exported helpers in this hunk.

TryParseTaggedBackupMetaFileNameWrapper, UpdateShiftTS, and UpdateShiftTSFromMetadata are exported but undocumented, which makes the package surface harder to understand and breaks the Go guideline we follow.

As per coding guidelines, "Go code: Keep exported-symbol doc comments, and prefer semantic constraints over name restatement".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/stream/stream_metas.go` around lines 295 - 315, Add concise Go doc
comments for the three exported helpers:
TryParseTaggedBackupMetaFileNameWrapper, UpdateShiftTS, and
UpdateShiftTSFromMetadata. For each function, add a one- or two-sentence comment
immediately above the function declaration that explains what the function does,
its key inputs/outputs (e.g., filename and metadata for parsing; startTS and
restoreTS and the returned shifted timestamp and found flag), and any important
semantic behavior (e.g., that TryParse... strips metaSuffix and delegates to
backupmetas.TryParseTaggedBackupMetaFileName, that UpdateShiftTS prefers
filename parsing and falls back to metadata, and that UpdateShiftTSFromMetadata
computes shift from Metadata). Keep comments idiomatic (no restating the name)
and follow Go doc style (start with the function name).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@br/pkg/restore/log_client/log_file_manager.go`:
- Around line 179-199: The anonymous callback that calls
stream.TryParseTaggedBackupMetaFileNameWrapper currently returns false on parse
error, which prevents the raw-metadata fallback from running; instead, change
the error branch so that parse failures do not abort the callback (i.e., do not
return false on err) and allow the function to return true so the raw-metadata
handler runs (preserving shiftStartTS behavior); update the block around
stream.TryParseTaggedBackupMetaFileNameWrapper, the err branch, and the
surrounding logic that sets shiftTS (the closure referencing lm.startTS,
lm.restoreTS and shiftTS) to ensure untagged/older .meta files trigger the
fallback path.

In `@br/pkg/stream/stream_mgr.go`:
- Around line 438-440: The call to the callback skipCondition in stream_mgr.go
is not nil-checked and will panic if callers pass nil; update the code that
currently does if skipCondition(path) { return nil } to first guard the callback
(e.g., check skipCondition != nil) before invoking it, or initialize
skipCondition to a no-op default earlier where it's accepted, so that calls to
skipCondition(path) are safe; ensure this change references the skipCondition
variable in the function or method that contains the current check so callers
who pass nil no longer cause a panic.
- Around line 208-214: In decodeCompressedData, avoid passing rawLength directly
into make (which can panic if it's too large); validate rawLength before using
it to set slice capacity in the call to m.decoder.DecodeAll: ensure rawLength is
within a safe bound and convertible to int (e.g., check rawLength <= math.MaxInt
and <= a reasonable maximum like 1<<30), fallback to 0 or a safe default when
invalid, then convert to int and call m.decoder.DecodeAll(data, make([]byte, 0,
int(validatedRawLength))). This prevents a "cap out of range" panic while
keeping the use of m.decoder.DecodeAll consistent.

In `@br/pkg/stream/stream_misc_test.go`:
- Around line 26-39: The ReadFile implementation on gatedReadStorage currently
recurses into itself; replace the final recursive call with a call to the
underlying storage by delegating to the embedded ExternalStorage (e.g., call
s.ExternalStorage.ReadFile(ctx, name)), preserving the active counter defer and
the readGate wait behavior; ensure you reference gatedReadStorage.ReadFile, the
embedded ExternalStorage, and fields active, maxActive and readGate when making
the change.

In `@br/pkg/utils/iter/combinator_types.go`:
- Around line 48-52: The code is double-draining the m.outstanding channel
causing possible deadlock: remove or guard the second blocking receive so
outstanding is only drained once when an upstream error occurs. In TryNext (the
method calling m.inner.TryNext(ctx)), after receiving r := m.inner.TryNext(ctx)
and before sending the error result, drain outstanding exactly once; then when
you receive back from the mapper result avoid doing a second <-m.outstanding if
r.Err != nil (either skip the receive when r.Err != nil or replace it with a
non-blocking select to drain only if available). Update the logic around
m.inner.TryNext, m.cancel, m.finished and uses of <-m.outstanding so the channel
is drained in one place and the second drain is conditional/non-blocking to
prevent blocking forever.

---

Outside diff comments:
In `@br/pkg/task/stream.go`:
- Around line 2083-2089: Add a Go doc comment for the exported function
GetStatusFileName that explains what the function returns and under what
conditions it can error: mention that it normalizes the provided storage
sub-directory using normalizeStorageSubDir and returns the path to the status
file "resume-state.json" within that normalized subdirectory, and that an error
is returned if normalization fails; prefer a semantic description rather than
repeating the function name.

---

Duplicate comments:
In `@br/pkg/task/restore.go`:
- Around line 1084-1093: The current block treats PauseTask failure as
non-fatal: in IsStreamRestore + cfg.RestorePhase == 1 path, if
restoreRegistry.PauseTask(c, cfg.RestoreID) returns an error the code only logs
a warning and then logs phase-1 success and returns nil; change this so that
when restoreRegistry.PauseTask(...) fails you log the error (use log.Error or
similar) and return that error (or a wrapped error) immediately instead of
proceeding to log success and return nil; ensure the success log ("restore phase
1 finished...") only runs when PauseTask succeeds and reference the same
cfg.RestoreID and restoreRegistry.PauseTask symbols when modifying control flow.
- Around line 535-552: Move the derived/validation logic for phase and
replication-storage out of ParseFromFlags into RestoreConfig.Adjust(): in
Adjust(), compute RestoreInPhase = (RestorePhase > 0), validate RestorePhase is
either 1 or 2 when RestoreInPhase, enforce that RestoreInPhase requires
UseCheckpoint, and set FromReplicationStorage = (len(ReplicationStatusSubPrefix)
> 0); keep ParseFromFlags responsible only for reading flags into fields (it can
still call cfg.Adjust() at the end) so non-CLI code paths that populate
RestoreConfig and call Adjust() get consistent normalization and validation.

In `@br/pkg/task/stream.go`:
- Around line 1738-1745: The DML probe launched into hasDMLFilesResultCh must be
awaited before applying SSTs: when cfg.RetainLatestMVCCVersion is true, block on
reading from hasDMLFilesResultCh (the value produced by hasAnyLogFiles via
hasLogFilesResult) and handle its error/result before calling
RestoreSSTFileSets; do the same for the other occurrence that currently applies
SST data (the second block using hasDMLFilesResultCh around RestoreSSTFileSets)
so SST restore is skipped or aborted if the probe reports an error or indicates
DML files exist.

---

Nitpick comments:
In `@br/pkg/stream/stream_metas.go`:
- Around line 295-315: Add concise Go doc comments for the three exported
helpers: TryParseTaggedBackupMetaFileNameWrapper, UpdateShiftTS, and
UpdateShiftTSFromMetadata. For each function, add a one- or two-sentence comment
immediately above the function declaration that explains what the function does,
its key inputs/outputs (e.g., filename and metadata for parsing; startTS and
restoreTS and the returned shifted timestamp and found flag), and any important
semantic behavior (e.g., that TryParse... strips metaSuffix and delegates to
backupmetas.TryParseTaggedBackupMetaFileName, that UpdateShiftTS prefers
filename parsing and falls back to metadata, and that UpdateShiftTSFromMetadata
computes shift from Metadata). Keep comments idiomatic (no restating the name)
and follow Go doc style (start with the function name).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 13a391dd-c793-4c23-a55b-fb6b3f9ac059

📥 Commits

Reviewing files that changed from the base of the PR and between e525781 and d455458.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (23)
  • DEPS.bzl
  • br/pkg/restore/internal/import_client/import_client.go
  • br/pkg/restore/log_client/BUILD.bazel
  • br/pkg/restore/log_client/client.go
  • br/pkg/restore/log_client/client_test.go
  • br/pkg/restore/log_client/export_test.go
  • br/pkg/restore/log_client/log_file_manager.go
  • br/pkg/restore/log_client/log_file_manager_test.go
  • br/pkg/restore/snap_client/client.go
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/restore/snap_client/import.go
  • br/pkg/stream/backupmetas/parser.go
  • br/pkg/stream/stream_metas.go
  • br/pkg/stream/stream_metas_test.go
  • br/pkg/stream/stream_mgr.go
  • br/pkg/stream/stream_mgr_fuzz_test.go
  • br/pkg/stream/stream_misc_test.go
  • br/pkg/task/restore.go
  • br/pkg/task/stream.go
  • br/pkg/utils/iter/BUILD.bazel
  • br/pkg/utils/iter/combinator_test.go
  • br/pkg/utils/iter/combinator_types.go
  • go.mod
💤 Files with no reviewable changes (1)
  • br/pkg/utils/iter/BUILD.bazel
✅ Files skipped from review due to trivial changes (1)
  • br/pkg/restore/log_client/BUILD.bazel

Comment thread br/pkg/restore/log_client/log_file_manager.go
Comment thread br/pkg/restore/snap_client/import.go
Comment thread br/pkg/stream/stream_mgr.go Outdated
Comment thread br/pkg/stream/stream_mgr.go
Comment thread br/pkg/stream/stream_misc_test.go
Comment thread br/pkg/utils/iter/combinator_types.go
@Leavrth

Leavrth commented May 23, 2026

Copy link
Copy Markdown
Contributor Author

/retest

@tiprow

tiprow Bot commented May 23, 2026

Copy link
Copy Markdown

@Leavrth: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Leavrth added 2 commits May 23, 2026 14:02
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@br/pkg/restore/snap_client/import_test.go`:
- Line 320: The test closes importClient.unblock directly which can double-close
on failure; add a sync.Once field to blockingBatchDownloadImporterClient (e.g.,
unblockOnce) and expose a method (e.g., Unblock()) that calls
unblockOnce.Do(func(){ close(i.unblock) }), then replace all direct
close(importClient.unblock) calls (including in waitForConcurrentDownloads and
deferred cleanups) with importClient.Unblock(), ensuring the unblock path is
idempotent across tests like those calling waitForConcurrentDownloads and the
deferred cleanup.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 97760101-b430-4805-9bce-ae8b4be5f0db

📥 Commits

Reviewing files that changed from the base of the PR and between 2f6522f and 8e50ced.

📒 Files selected for processing (3)
  • br/pkg/restore/snap_client/BUILD.bazel
  • br/pkg/restore/snap_client/import.go
  • br/pkg/restore/snap_client/import_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • br/pkg/restore/snap_client/import.go

Comment thread br/pkg/restore/snap_client/import_test.go
Leavrth added 2 commits June 9, 2026 11:17
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
br/pkg/task/stream.go (1)

1742-1793: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the retain-latest check as a preflight, not a post-SST warning.

FlagRetainLatestMVCCVersion is still documented in br/pkg/task/restore.go:404-463 as requiring "no user DML log files", but this branch only waits for the write-CF probe after RestoreSSTFileSets has already applied SST data. If finding a write-CF file is still meant to invalidate the request, this remains a partial-restore path; if it is now allowed, the flag contract and tests need to be updated in the same change instead of silently changing behavior here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/stream.go` around lines 1742 - 1793, The retain-latest MVCC probe
must be performed as a preflight and block SST restore rather than being checked
after RestoreSSTFileSets; move the hasAnyWriteCFLogFile check (the
goroutine/hasWriteCFLogFileResultCh logic around hasAnyWriteCFLogFile) to run
and be awaited before calling client.RestoreSSTFileSets, and if it returns a
non-nil file or error, return/propagate an error (or short-circuit) instead of
proceeding; update handling of logFilesIter and the FlagRetainLatestMVCCVersion
logging to reflect the preflight failure path so the flag contract/tests remain
consistent (references: hasAnyWriteCFLogFile, hasWriteCFLogFileResultCh,
RestoreSSTFileSets, FlagRetainLatestMVCCVersion, logFilesIter).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@br/pkg/restore/log_client/client.go`:
- Around line 1717-1721: The partial-index predicate in
info.IndexInfo.ConditionExprString must have literal '%' characters escaped
before concatenating into sql.AddSQL because ExecuteInternal will later call
sqlescape.EscapeSQL (when addArgs is non-empty) which treats %n/%? as
placeholders; update the code that builds addSQL (the block that checks
len(info.IndexInfo.ConditionExprString) and calls addSQL.WriteString) to
pre-process ConditionExprString and escape percent signs (e.g. replace "%" with
"%%") so any occurrences like "%n%" or "%?%" become safe literals before
appending to sql.AddSQL used by ExecuteInternal.

---

Duplicate comments:
In `@br/pkg/task/stream.go`:
- Around line 1742-1793: The retain-latest MVCC probe must be performed as a
preflight and block SST restore rather than being checked after
RestoreSSTFileSets; move the hasAnyWriteCFLogFile check (the
goroutine/hasWriteCFLogFileResultCh logic around hasAnyWriteCFLogFile) to run
and be awaited before calling client.RestoreSSTFileSets, and if it returns a
non-nil file or error, return/propagate an error (or short-circuit) instead of
proceeding; update handling of logFilesIter and the FlagRetainLatestMVCCVersion
logging to reflect the preflight failure path so the flag contract/tests remain
consistent (references: hasAnyWriteCFLogFile, hasWriteCFLogFileResultCh,
RestoreSSTFileSets, FlagRetainLatestMVCCVersion, logFilesIter).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b5485598-72eb-416e-a2d1-fbf615bcd39d

📥 Commits

Reviewing files that changed from the base of the PR and between 8e50ced and 54723a4.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • DEPS.bzl
  • br/pkg/restore/log_client/client.go
  • br/pkg/restore/log_client/migration.go
  • br/pkg/restore/log_client/migration_test.go
  • br/pkg/restore/snap_client/BUILD.bazel
  • br/pkg/restore/snap_client/client.go
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/restore/snap_client/import.go
  • br/pkg/restore/snap_client/import_test.go
  • br/pkg/task/BUILD.bazel
  • br/pkg/task/config_test.go
  • br/pkg/task/restore.go
  • br/pkg/task/stream.go
  • br/pkg/task/stream_test.go
  • br/pkg/utils/iter/BUILD.bazel
  • go.mod
✅ Files skipped from review due to trivial changes (2)
  • br/pkg/restore/snap_client/BUILD.bazel
  • br/pkg/task/BUILD.bazel
🚧 Files skipped from review as they are similar to previous changes (8)
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/task/config_test.go
  • go.mod
  • br/pkg/restore/snap_client/client.go
  • DEPS.bzl
  • br/pkg/restore/snap_client/import_test.go
  • br/pkg/task/restore.go
  • br/pkg/restore/snap_client/import.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

♻️ Duplicate comments (1)
br/pkg/task/stream.go (1)

1742-1793: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the retain-latest check as a preflight, not a post-SST warning.

FlagRetainLatestMVCCVersion is still documented in br/pkg/task/restore.go:404-463 as requiring "no user DML log files", but this branch only waits for the write-CF probe after RestoreSSTFileSets has already applied SST data. If finding a write-CF file is still meant to invalidate the request, this remains a partial-restore path; if it is now allowed, the flag contract and tests need to be updated in the same change instead of silently changing behavior here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/task/stream.go` around lines 1742 - 1793, The retain-latest MVCC probe
must be performed as a preflight and block SST restore rather than being checked
after RestoreSSTFileSets; move the hasAnyWriteCFLogFile check (the
goroutine/hasWriteCFLogFileResultCh logic around hasAnyWriteCFLogFile) to run
and be awaited before calling client.RestoreSSTFileSets, and if it returns a
non-nil file or error, return/propagate an error (or short-circuit) instead of
proceeding; update handling of logFilesIter and the FlagRetainLatestMVCCVersion
logging to reflect the preflight failure path so the flag contract/tests remain
consistent (references: hasAnyWriteCFLogFile, hasWriteCFLogFileResultCh,
RestoreSSTFileSets, FlagRetainLatestMVCCVersion, logFilesIter).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@br/pkg/restore/log_client/client.go`:
- Around line 1717-1721: The partial-index predicate in
info.IndexInfo.ConditionExprString must have literal '%' characters escaped
before concatenating into sql.AddSQL because ExecuteInternal will later call
sqlescape.EscapeSQL (when addArgs is non-empty) which treats %n/%? as
placeholders; update the code that builds addSQL (the block that checks
len(info.IndexInfo.ConditionExprString) and calls addSQL.WriteString) to
pre-process ConditionExprString and escape percent signs (e.g. replace "%" with
"%%") so any occurrences like "%n%" or "%?%" become safe literals before
appending to sql.AddSQL used by ExecuteInternal.

---

Duplicate comments:
In `@br/pkg/task/stream.go`:
- Around line 1742-1793: The retain-latest MVCC probe must be performed as a
preflight and block SST restore rather than being checked after
RestoreSSTFileSets; move the hasAnyWriteCFLogFile check (the
goroutine/hasWriteCFLogFileResultCh logic around hasAnyWriteCFLogFile) to run
and be awaited before calling client.RestoreSSTFileSets, and if it returns a
non-nil file or error, return/propagate an error (or short-circuit) instead of
proceeding; update handling of logFilesIter and the FlagRetainLatestMVCCVersion
logging to reflect the preflight failure path so the flag contract/tests remain
consistent (references: hasAnyWriteCFLogFile, hasWriteCFLogFileResultCh,
RestoreSSTFileSets, FlagRetainLatestMVCCVersion, logFilesIter).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b5485598-72eb-416e-a2d1-fbf615bcd39d

📥 Commits

Reviewing files that changed from the base of the PR and between 8e50ced and 54723a4.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • DEPS.bzl
  • br/pkg/restore/log_client/client.go
  • br/pkg/restore/log_client/migration.go
  • br/pkg/restore/log_client/migration_test.go
  • br/pkg/restore/snap_client/BUILD.bazel
  • br/pkg/restore/snap_client/client.go
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/restore/snap_client/import.go
  • br/pkg/restore/snap_client/import_test.go
  • br/pkg/task/BUILD.bazel
  • br/pkg/task/config_test.go
  • br/pkg/task/restore.go
  • br/pkg/task/stream.go
  • br/pkg/task/stream_test.go
  • br/pkg/utils/iter/BUILD.bazel
  • go.mod
✅ Files skipped from review due to trivial changes (2)
  • br/pkg/restore/snap_client/BUILD.bazel
  • br/pkg/task/BUILD.bazel
🚧 Files skipped from review as they are similar to previous changes (8)
  • br/pkg/restore/snap_client/export_test.go
  • br/pkg/task/config_test.go
  • go.mod
  • br/pkg/restore/snap_client/client.go
  • DEPS.bzl
  • br/pkg/restore/snap_client/import_test.go
  • br/pkg/task/restore.go
  • br/pkg/restore/snap_client/import.go
🛑 Comments failed to post (1)
br/pkg/restore/log_client/client.go (1)

1717-1721: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# show surrounding code around requested lines
sed -n '1680,1780p' br/pkg/restore/log_client/client.go | nl -ba | sed -n '1,120p'

Repository: pingcap/tidb

Length of output: 99


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ExecuteInternal" br/pkg/restore/log_client/client.go br/pkg/restore/log_client -S

Repository: pingcap/tidb

Length of output: 2740


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# find where addSQL is consumed
rg -n "addSQL" br/pkg/restore/log_client/client.go -S

Repository: pingcap/tidb

Length of output: 1073


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# locate fmt.Sprintf or Sprintf usage around the relevant region
rg -n "Sprintf\\(|fmt\\.Sprintf|\\bfmt\\.\\w*Sprintf" br/pkg/restore/log_client/client.go -S

Repository: pingcap/tidb

Length of output: 809


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# check imports for strings/fmt usage
sed -n '1,80p' br/pkg/restore/log_client/client.go | nl -ba

Repository: pingcap/tidb

Length of output: 99


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find ExecuteInternal method definitions (not call sites)
rg -n "func \\(.*\\) ExecuteInternal\\b" -S --glob='*.go' | head -n 50

Repository: pingcap/tidb

Length of output: 1018


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 2) Find formatting helpers likely used with %? placeholders
rg -n "FormatSQL|formatSQL|%\\?|" -S --glob='*.go' | head -n 100

Repository: pingcap/tidb

Length of output: 5449


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 3) Inspect br/pkg/restore/log_client/client.go around the relevant builder + call
sed -n '1635,1760p' br/pkg/restore/log_client/client.go

Repository: pingcap/tidb

Length of output: 4866


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 4) Inspect where sql.AddSQL/args are passed into ExecuteInternal (index repair path)
sed -n '1860,1950p' br/pkg/restore/log_client/client.go

Repository: pingcap/tidb

Length of output: 3087


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect TiDB session ExecuteInternal implementation
sed -n '1480,1625p' pkg/session/session.go

# Inspect br-side ExecuteInternal (glue session)
sed -n '160,270p' br/pkg/gluetidb/glue.go

Repository: pingcap/tidb

Length of output: 7984


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search placeholder formatting logic around ExecuteInternal / formatSQL usage
rg -n "%\\?|%n|%%" pkg/session/session.go br/pkg/gluetidb/glue.go pkg/executor -S --glob='*.go'

Repository: pingcap/tidb

Length of output: 27088


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the helper that expands %?/ %n tokens (names vary)
rg -n "format.*sql|Format.*SQL|replace.*%\\?|%\\?\\s*;" pkg/session br/pkg/gluetidb -S --glob='*.go' | head -n 100

Repository: pingcap/tidb

Length of output: 1519


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate any function that formats placeholders for ExecuteInternal args
rg -n "ExecuteInternal\\(" pkg/session/session.go br/pkg/gluetidb/glue.go -S --glob='*.go'

Repository: pingcap/tidb

Length of output: 1013


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate sqlescape.EscapeSQL implementation
rg -n "func \\(.*\\)?EscapeSQL\\b|EscapeSQL\\(" pkg -S --glob='*.go' | head -n 50

Repository: pingcap/tidb

Length of output: 1929


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Narrow to the sqlescape package and show the EscapeSQL function body
rg -n "package sqlescape" pkg -S --glob='*.go' | head -n 50

Repository: pingcap/tidb

Length of output: 159


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Try common locations first
ls pkg/util/sqlescape 2>/dev/null || true
ls pkg/util/sqlescape* 2>/dev/null || true
ls pkg/sqlescape 2>/dev/null || true

rg -n "sqlescape\\.EscapeSQL" -S --glob='*.go' br pkg | head -n 20

Repository: pingcap/tidb

Length of output: 800


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# After we find the file, print surrounding lines
# (fallback: dump the first match we can find by file path)
MATCH_FILE="$(rg -n "func \\w*EscapeSQL\\b" pkg -S --glob='*.go' | head -n 1 | cut -d: -f1)"
echo "MATCH_FILE=$MATCH_FILE"
sed -n '1,220p' "$MATCH_FILE"

Repository: pingcap/tidb

Length of output: 5924


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# show rest of escapeSQL implementation after the shown portion, around where switch finishes
sed -n '220,420p' pkg/util/sqlescape/utils.go

Repository: pingcap/tidb

Length of output: 2451


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# look for tests around literal percent handling
rg -n "EscapeSQL\\(|MustEscapeSQL\\(|%\\?|'%|%%|like" pkg/util/sqlescape/utils_test.go | head -n 120

Repository: pingcap/tidb

Length of output: 1454


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# show the specific TestEscapeSQL table entries around where it mentions %?
sed -n '340,480p' pkg/util/sqlescape/utils_test.go | nl -ba | head -n 200

Repository: pingcap/tidb

Length of output: 99


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# show remaining part of EscapeSQL implementation after the %n/%? switch
sed -n '420,620p' pkg/util/sqlescape/utils.go

Repository: pingcap/tidb

Length of output: 38


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# scan tests for any explicit handling of literal % / %% / LIKE patterns
rg -n "%%|%'|%\\?\\b|LIKE|percent" pkg/util/sqlescape/utils_test.go | head -n 200

Repository: pingcap/tidb

Length of output: 108


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# show the latter part of TestEscapeSQL to see expected behavior for percent sequences
sed -n '90,220p' pkg/util/sqlescape/utils_test.go | nl -ba | head -n 160

Repository: pingcap/tidb

Length of output: 99


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ConditionExprString" -S br/pkg pkg | head -n 200

Repository: pingcap/tidb

Length of output: 2123


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# find struct definition for IndexInfo/ConditionExprString
rg -n "type .*IndexInfo|ConditionExprString" -S br/pkg | head -n 80

Repository: pingcap/tidb

Length of output: 461


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# inspect ingestrec package
find br -type f -path "*ingestrec*" -maxdepth 6 2>/dev/null | head -n 50

Repository: pingcap/tidb

Length of output: 311


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# locate ingestrec sources
fd ingestrec br pkg -t f 2>/dev/null || true

Repository: pingcap/tidb

Length of output: 38


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# list files with ingest recorder
rg -n "package .*ingestrec" -S br pkg | head -n 50

Repository: pingcap/tidb

Length of output: 389


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ConditionExprString" -S br/pkg pkg | head -n 200

Repository: pingcap/tidb

Length of output: 2123


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ConditionExprString" -S br/pkg/restore pkg | head -n 200

Repository: pingcap/tidb

Length of output: 2123


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "package .*ingestrec" -S br pkg | head -n 100

Repository: pingcap/tidb

Length of output: 389


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ConditionExprString" -S br/pkg/ingest -S || true

Repository: pingcap/tidb

Length of output: 110


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ConditionExprString" -S pkg | head -n 200

Repository: pingcap/tidb

Length of output: 1932


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Inspect how idxInfo.ConditionExprString is assigned/constructed
sed -n '380,470p' pkg/ddl/index.go
sed -n '1000,1065p' pkg/ddl/index.go
sed -n '3920,4025p' pkg/ddl/index.go
sed -n '560,650p' pkg/table/tables/index_test.go
sed -n '900,970p' pkg/table/tables/index.go

Repository: pingcap/tidb

Length of output: 14277


Escape % in ConditionExprString before building sql.AddSQL for ExecuteInternal.

ExecuteInternal runs sqlescape.EscapeSQL over the full sql.AddSQL (since addArgs is non-empty). EscapeSQL treats %n and %? as placeholders anywhere in the SQL string, so a partial-index predicate containing literal %n/%? (e.g. LIKE '%n%' or LIKE '%?%') would be mis-parsed and can break ingest-index repair.

Suggested fix
 		// WHERE CONDITION
 		if len(info.IndexInfo.ConditionExprString) > 0 {
 			addSQL.WriteString(" WHERE ")
-			addSQL.WriteString(info.IndexInfo.ConditionExprString)
+			addSQL.WriteString(strings.ReplaceAll(info.IndexInfo.ConditionExprString, "%", "%%"))
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		// WHERE CONDITION
		if len(info.IndexInfo.ConditionExprString) > 0 {
			addSQL.WriteString(" WHERE ")
			addSQL.WriteString(strings.ReplaceAll(info.IndexInfo.ConditionExprString, "%", "%%"))
		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@br/pkg/restore/log_client/client.go` around lines 1717 - 1721, The
partial-index predicate in info.IndexInfo.ConditionExprString must have literal
'%' characters escaped before concatenating into sql.AddSQL because
ExecuteInternal will later call sqlescape.EscapeSQL (when addArgs is non-empty)
which treats %n/%? as placeholders; update the code that builds addSQL (the
block that checks len(info.IndexInfo.ConditionExprString) and calls
addSQL.WriteString) to pre-process ConditionExprString and escape percent signs
(e.g. replace "%" with "%%") so any occurrences like "%n%" or "%?%" become safe
literals before appending to sql.AddSQL used by ExecuteInternal.

Leavrth added 3 commits June 9, 2026 13:33
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
@ti-chi-bot

ti-chi-bot Bot commented Jun 15, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign bornchanger for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot

ti-chi-bot Bot commented Jun 15, 2026

Copy link
Copy Markdown

@Leavrth: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
idc-jenkins-ci-tidb/check_dev 4b181bb link true /test check-dev
idc-jenkins-ci-tidb/mysql-test 4b181bb link true /test mysql-test
idc-jenkins-ci-tidb/unit-test 4b181bb link true /test unit-test
pull-br-integration-test 4b181bb link true /test pull-br-integration-test
idc-jenkins-ci-tidb/check_dev_2 4b181bb link true /test check-dev2
idc-jenkins-ci-tidb/build 4b181bb link true /test build

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/cherry-pick-not-approved release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant