columnar: add scan details stats#10905
Conversation
Signed-off-by: yongman <yming0221@gmail.com>
Signed-off-by: yongman <yming0221@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds end-to-end columnar scan statistics collection. A new ChangesColumnar Scan Statistics End-to-End
Sequence Diagram(s)sequenceDiagram
rect rgba(173, 216, 230, 0.5)
note over StorageDisaggColumnar,ExecutionSummaryHelper: Columnar scan stats collection and reporting pipeline
end
participant StorageDisaggColumnar as StorageDisaggregatedColumnar
participant RNColumnarInputStream
participant CloudStorageProxy as Cloud Storage Proxy
participant ColumnarScanContext
participant ExecutorStatisticsCollector
participant ExecutionSummaryHelper
StorageDisaggColumnar->>ColumnarScanContext: create and register in columnar_scan_context_map
RNColumnarInputStream->>RNColumnarInputStream: releaseReader()
RNColumnarInputStream->>RNColumnarInputStream: mergeReaderStats()
RNColumnarInputStream->>CloudStorageProxy: fn_columnar_scan_stats(reader)
CloudStorageProxy-->>RNColumnarInputStream: ColumnarScanStats
RNColumnarInputStream->>ColumnarScanContext: merge(ColumnarScanStats)
RNColumnarInputStream->>ColumnarScanContext: addUserReadBytes / addDeserializeBlockNs
ExecutorStatisticsCollector->>ColumnarScanContext: merge from columnar_scan_context_map
ExecutionSummaryHelper->>ColumnarScanContext: serialize() → protobuf
ExecutionSummaryHelper->>ExecutionSummaryHelper: emit columnar_scan_context to protobuf
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
dbms/src/Storages/StorageDisaggregatedColumnar.cpp (1)
1385-1391: ⚡ Quick winLog exceptions in the new broad catch path
Line 1385-1391 swallows all exceptions from
releaseReader()with an emptycatch (...). Please keep it non-throwing but log viatryLogCurrentException(log, "...")so cleanup/FFI failures remain diagnosable.Suggested patch
SCOPE_EXIT({ try { releaseReader(); } catch (...) - {} + { + tryLogCurrentException(log, "failed to release RNColumnarInputStream reader"); + } });As per coding guidelines, use
tryLogCurrentException(log, "context")in broadcatch (...)paths to avoid duplicated exception-formatting code.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 1385 - 1391, In the SCOPE_EXIT block containing the try-catch around the releaseReader() call, replace the empty catch (...) {} block with a logging statement. Instead of silently swallowing exceptions, call tryLogCurrentException(log, "context message") to log any exceptions thrown during cleanup, ensuring that cleanup and FFI failures remain diagnosable without duplicating exception-formatting code.Source: Coding guidelines
dbms/src/Flash/Coprocessor/ColumnarScanContext.h (1)
30-52: ⚡ Quick winUse project width aliases (
UInt64) for counters in dbms headers.Please align counter and API types with project conventions (
UInt64) instead of rawuint64_tin this header for consistency across DB code.As per coding guidelines:
**/*.{cpp,h,hpp}should use explicit width types fromdbms/src/Core/Types.hsuch asUInt8,UInt32,Int64,Float64,String.Also applies to: 196-197
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Flash/Coprocessor/ColumnarScanContext.h` around lines 30 - 52, Replace all instances of `uint64_t` with the project's standard width alias `UInt64` in the ColumnarScanContext class header file. This applies to all the atomic counter member variables (regions, read_tasks, physical_tables, columns, user_read_bytes, mvcc_input_rows, mvcc_input_bytes, mvcc_output_rows, total_read_block_ns, total_serialize_block_ns, total_init_reader_ns, total_prefetch_ns, total_deserialize_block_ns, rough_check_total_packs, rough_check_selected_packs, rough_check_skipped_packs, rough_check_unknown_packs, remote_segments, total_segments) and any other counter declarations at lines 196-197. Change `std::atomic<uint64_t>` to `std::atomic<UInt64>` to align with the project's coding conventions defined in dbms/src/Core/Types.h.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dbms/src/Flash/Coprocessor/ColumnarScanContext.h`:
- Around line 106-107: The merge() method performs non-atomic load-max-store
operations on physical_tables and columns atomic variables (at lines 106-107 and
129-130), creating a race condition where concurrent calls can overwrite larger
maximum values with smaller ones. Replace the direct assignment pattern with
atomic compare-and-swap loops to ensure the maximum value is always preserved
atomically. Additionally, replace all uint64_t type declarations with UInt64
from dbms/src/Core/Types.h to align with coding guidelines. Apply these changes
to both the physical_tables and columns member variables in both locations.
---
Nitpick comments:
In `@dbms/src/Flash/Coprocessor/ColumnarScanContext.h`:
- Around line 30-52: Replace all instances of `uint64_t` with the project's
standard width alias `UInt64` in the ColumnarScanContext class header file. This
applies to all the atomic counter member variables (regions, read_tasks,
physical_tables, columns, user_read_bytes, mvcc_input_rows, mvcc_input_bytes,
mvcc_output_rows, total_read_block_ns, total_serialize_block_ns,
total_init_reader_ns, total_prefetch_ns, total_deserialize_block_ns,
rough_check_total_packs, rough_check_selected_packs, rough_check_skipped_packs,
rough_check_unknown_packs, remote_segments, total_segments) and any other
counter declarations at lines 196-197. Change `std::atomic<uint64_t>` to
`std::atomic<UInt64>` to align with the project's coding conventions defined in
dbms/src/Core/Types.h.
In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp`:
- Around line 1385-1391: In the SCOPE_EXIT block containing the try-catch around
the releaseReader() call, replace the empty catch (...) {} block with a logging
statement. Instead of silently swallowing exceptions, call
tryLogCurrentException(log, "context message") to log any exceptions thrown
during cleanup, ensuring that cleanup and FFI failures remain diagnosable
without duplicating exception-formatting code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e3055fd7-3f3f-421a-9e4b-ad3524a4dd58
⛔ Files ignored due to path filters (1)
contrib/tiflash-columnar-hub/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (19)
contrib/cloud-storage-enginecontrib/tiflash-columnar-hub/Cargo.tomlcontrib/tiflash-columnar-hub/hub-runtime/Cargo.tomlcontrib/tiflash-columnar-hub/hub-runtime/ffi/src/RaftStoreProxyFFI/ProxyFFI.hcontrib/tiflash-columnar-hub/hub-runtime/src/cloud_helper.rscontrib/tiflash-columnar-hub/hub-runtime/src/columnar_impls.rscontrib/tiflash-columnar-hub/hub-runtime/src/interfaces.rscontrib/tiflash-columnar-hub/hub-runtime/src/run.rscontrib/tipbdbms/src/Flash/Coprocessor/ColumnarScanContext.hdbms/src/Flash/Coprocessor/ColumnarScanContext_fwd.hdbms/src/Flash/Coprocessor/DAGContext.hdbms/src/Flash/Coprocessor/ExecutionSummary.cppdbms/src/Flash/Coprocessor/ExecutionSummary.hdbms/src/Flash/Statistics/ExecutionSummaryHelper.cppdbms/src/Flash/Statistics/ExecutorStatisticsCollector.cppdbms/src/Flash/Statistics/TableScanImpl.cppdbms/src/Storages/StorageDisaggregatedColumnar.cppdbms/src/Storages/StorageDisaggregatedColumnar.h
Signed-off-by: yongman <yming0221@gmail.com>
Signed-off-by: yongman <yming0221@gmail.com>
|
@yongman: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #10906
Problem Summary:
There is no execute summary info in explain when query columnar.
What is changed and how it works?
Add columnar stats in explain and logs.
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Chores