Remove CoalesceBatchesExec from plans #386

@gabotechs

In DataFusion 53, the CoalesceBatchesExec node was deprecated. The alternative is to let each node batch its output as it sees fit.

We still rely on this node in the pub(crate) function batch_coalescing_below_network_boundaries, which adds some coalescing right below network boundaries so that we send bigger batches over the wire.

Removing it should improve performance, as batch coalescing copies data to build the concatenated batches.
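To make the copy cost concrete, here is a minimal sketch of what coalescing does in principle. It is not DataFusion's implementation: `Batch` is a hypothetical stand-in for an Arrow RecordBatch (a single column of integers), and `coalesce` is a made-up helper. The point is only that every input row gets copied into a fresh buffer before the bigger batch is emitted.

```rust
/// Hypothetical stand-in for an Arrow RecordBatch: one column of values.
type Batch = Vec<i64>;

/// Coalesce small batches into batches of at least `target_rows` rows
/// (the final batch may be smaller). Every row is copied once.
fn coalesce(input: Vec<Batch>, target_rows: usize) -> Vec<Batch> {
    let mut output = Vec::new();
    let mut buffer: Batch = Vec::with_capacity(target_rows);
    for batch in input {
        // This copy is the overhead referred to above.
        buffer.extend_from_slice(&batch);
        if buffer.len() >= target_rows {
            // Emit the filled buffer and start a new one.
            output.push(std::mem::replace(
                &mut buffer,
                Vec::with_capacity(target_rows),
            ));
        }
    }
    // Flush whatever is left once the input is exhausted.
    if !buffer.is_empty() {
        output.push(buffer);
    }
    output
}
```

With a target of 3 rows, inputs of 2, 1, 3 and 1 rows come out as batches of 3, 3 and 1 rows, and all 7 rows have been copied along the way.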

I'm not sure what the alternative to our batch_coalescing_below_network_boundaries should be, though. I don't think we can just rely on people increasing the vanilla datafusion.execution.batch_size setting, as queries that don't end up getting distributed might suffer some degradation if that is set globally.
