Parent issue: #2058
## Summary
Implement full Spark Connect metadata extraction and reporting on top of the parser awareness added in earlier phases.
This is the Phase 3 work item from #2058.
## Problem
Once Connect events are accepted, the tools still need to interpret them to provide user/session attribution, operation lifecycle timing, and operation-level failure metadata. This is the feature-complete Spark Connect support layer.
## Scope
- Parse Connect session lifecycle events for `sessionId` and `userId`
- Parse Connect operation lifecycle events for `operationId`, `jobTag`, timing markers, row counts, and failure metadata
- Correlate operation events with SQL executions/jobs via `jobTag`
- Support per-session config tracking by combining session correlation with `modifiedConfigs`
- Surface relevant metadata in qualification/profiling outputs where appropriate
- Consider multi-user reporting or attribution breakdowns where the data model supports it
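The session/operation indexing described in the scope items above can be sketched roughly as follows. This is a minimal illustration, not the implementation: the event class names and field names (`sessionId`, `userId`, `operationId`, `jobTag`) are assumptions modeled on the event shapes discussed in the analysis docs, and the real parser would plug into the tools' existing event-log processing.

```python
import json
from collections import defaultdict

def index_connect_events(event_lines):
    """Build session and operation indexes from raw event-log JSON lines.

    Returns (sessions, ops_by_job_tag):
      sessions       -- sessionId -> userId, for user/session attribution
      ops_by_job_tag -- jobTag -> [operationId], for correlating Connect
                        operations to SQL executions/jobs that carry the tag
    """
    sessions = {}
    ops_by_job_tag = defaultdict(list)
    for line in event_lines:
        ev = json.loads(line)
        kind = ev.get("Event", "")
        # Event-type suffixes are assumptions; match on the concrete
        # Spark Connect listener event classes in the real parser.
        if kind.endswith("SessionStarted"):
            sessions[ev["sessionId"]] = ev.get("userId")
        elif kind.endswith("OperationStarted"):
            ops_by_job_tag[ev["jobTag"]].append(ev["operationId"])
    return sessions, ops_by_job_tag
```

Once `ops_by_job_tag` is built, any job or SQL execution whose properties carry a matching tag can be attributed back to the Connect operation, and through its session, to a user.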
## Acceptance Criteria
- Session lifecycle metadata is captured and queryable
- Operation lifecycle timing can be reconstructed from Connect events
- Operation failures/cancellations are attributable in tool output or internal models
- Connect operations can be correlated to SQL executions/jobs through `jobTag`
- Tests cover session attribution, timing reconstruction, and failure handling
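Timing reconstruction from the criteria above amounts to pairing each operation's start marker with its terminal marker (finished, failed, or canceled). A minimal sketch, assuming a simplified event shape with `operationId`, an `event` marker, and a millisecond timestamp (all hypothetical names for illustration):

```python
def reconstruct_op_timings(events):
    """Pair start/terminal markers per operationId.

    Returns (durations_ms, terminal_status) so that failures and
    cancellations remain attributable alongside the timing data.
    Operations with no terminal marker are left out of durations_ms.
    """
    started, finished, terminal_status = {}, {}, {}
    for ev in events:
        op = ev["operationId"]
        if ev["event"] == "started":
            started[op] = ev["timestampMs"]
        elif ev["event"] in ("finished", "failed", "canceled"):
            finished[op] = ev["timestampMs"]
            terminal_status[op] = ev["event"]
    durations_ms = {
        op: finished[op] - started[op]
        for op in started
        if op in finished
    }
    return durations_ms, terminal_status
```

A test along these lines would feed in a started/finished pair plus a started/failed pair and assert both the computed durations and the failed status, covering the timing-reconstruction and failure-handling criteria together.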
## Notes
Relevant analysis in repo:
- `core/docs/spark-connect-events-analysis.md`
- `core/docs/spark-connect-operation-started-examples.md`