
Conversation

@original-brownbear
Contributor

@original-brownbear commented Feb 6, 2025

Shortest version I could think of for this from where we are now.

This change moves the query phase to a single roundtrip per node, just like can_match and field_caps already work.
As a result of executing multiple shard queries from a single request, we can also partially reduce each node's query results on the data node side before responding to the coordinating node.

This significantly reduces the impact of network latency on end-to-end query performance, reduces the amount of work (memory and CPU) done on the coordinating node, and cuts network traffic by a factor of up to the number of shards per data node!

Benchmarking shows improvements of up to orders of magnitude in heap usage and network traffic when querying across a larger number of shards.

Still WIP: I have to make some test adjustments and polish some rough edges,
but it shouldn't get longer than this.
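
For readers skimming the thread, here is a minimal sketch of the idea in Java, with hypothetical names (`NodeQueryRequest`, `DataNodeQueryHandler`, and so on) rather than this PR's actual classes: the coordinator sends one request per data node carrying all of that node's shard queries, and the node partially reduces its shard results before replying.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; the PR's real classes differ.
record ShardQuery(int shardId, String querySource) {}
record ShardResult(int shardId, List<Long> topDocIds) {}
record NodeQueryRequest(List<ShardQuery> shardQueries) {}
record NodeQueryResponse(List<Long> partiallyReducedTopDocs) {}

final class DataNodeQueryHandler {

    // One handler invocation per node instead of one transport call per shard.
    NodeQueryResponse handle(NodeQueryRequest request) {
        List<ShardResult> shardResults = new ArrayList<>();
        for (ShardQuery query : request.shardQueries()) {
            shardResults.add(queryShard(query)); // query phase runs locally, per shard
        }
        // Data-node side partial reduce: merge the node's shard results so the
        // coordinator receives one reduced result per node, not one per shard.
        List<Long> merged = new ArrayList<>();
        for (ShardResult result : shardResults) {
            merged.addAll(result.topDocIds()); // stand-in for a real top-docs merge
        }
        return new NodeQueryResponse(merged);
    }

    private ShardResult queryShard(ShardQuery query) {
        return new ShardResult(query.shardId(), List.of()); // placeholder shard search
    }
}
```

The partial reduce is the piece that shrinks both coordinator-side work and response size, since only one merged result per node crosses the wire.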
@original-brownbear
Contributor Author

trivial dependency: #121887

@original-brownbear
Contributor Author

Another trivial dependency #121922 to avoid some duplication here and remove existing dead code.

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Feb 10, 2025
An easy change we can split out of elastic#121885 to make that shorter.
@original-brownbear
Contributor Author

Alrighty, flag is in place, serverless setting reference is removed :)

Thanks Luca + Jim! I'll open a backport PR to 8.19 after some benchmarking! :)
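
As a rough illustration of what gating this behind a flag can look like (the setting key, default, and properties below are assumptions for illustration, not necessarily what this PR registers), an Elasticsearch-style boolean node setting would be declared along these lines:

```java
import org.elasticsearch.common.settings.Setting;

// Sketch only: key name, default, and properties are assumptions.
public final class BatchedQueryPhaseFlag {

    public static final Setting<Boolean> BATCHED_QUERY_PHASE = Setting.boolSetting(
        "search.batched_query_phase", // assumed key
        true,                         // assumed default: enabled
        Setting.Property.NodeScope,
        Setting.Property.Dynamic
    );

    private BatchedQueryPhaseFlag() {}
}
```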

@original-brownbear merged commit fd2cc97 into elastic:main Mar 29, 2025
16 of 17 checks passed
@original-brownbear deleted the batched-exec-short branch March 29, 2025 15:53
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 4, 2025
elasticsearchmachine pushed a commit that referenced this pull request Apr 4, 2025
…22188) (#126293)

An easy change we can split out of #121885 to make that shorter.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 9, 2025
…121885)

original-brownbear added a commit that referenced this pull request Apr 10, 2025
…#126563)

* Introduce batched query execution and data-node side reduce (#121885)

* Filter out empty top docs results before merging (#126385)

`Lucene.EMPTY_TOP_DOCS` is used to identify empty top docs results. These were previously
null results, but they did not need to be sent over transport, as incremental reduction
was performed only on the data node.

Now it can happen that the coordinating node receives a merge result with empty top docs,
which has nothing interesting for merging, but which can lead to an exception because
the type of the empty array does not match the type of the other shards' results, for
instance if the query was sorted by field. To resolve this, we filter out empty
top docs results before merging.
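
A minimal sketch of that filtering step, using plain Lucene types (the `scoreDocs` length check below stands in for the identity comparison against `Lucene.EMPTY_TOP_DOCS` described above):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.search.TopDocs;

// Sketch of the fix's shape, not the actual Elasticsearch code: drop empty
// shard-level top docs before merging, so an empty plain-TopDocs placeholder
// never mixes with TopFieldDocs from the shards of a field-sorted query.
final class TopDocsFilter {

    static List<TopDocs> nonEmpty(List<TopDocs> shardTopDocs) {
        List<TopDocs> filtered = new ArrayList<>();
        for (TopDocs topDocs : shardTopDocs) {
            if (topDocs.scoreDocs.length > 0) { // stands in for topDocs != Lucene.EMPTY_TOP_DOCS
                filtered.add(topDocs);
            }
        }
        return filtered;
    }

    private TopDocsFilter() {}
}
```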

Closes #126118

---------

Co-authored-by: Luca Cavanna <javanna@apache.org>
benchaplin added a commit that referenced this pull request Nov 5, 2025
#121885 attempted to shortcut a phase failure caused by a reduction
failure on the data node by failing the query phase in the batched
query action response listener. Before batching the query phase, we
did not fail the phase immediately upon a reduction failure. We held
on to the failure and continued querying all shards, only failing
during final reduction at the beginning of the fetch phase.

I can't think of anything inherently wrong with this approach,
besides the fact that the phase cannot be failed multiple times
(#134151). However, certain cleanup aspects of the code (specifically
releasing reader contexts and query search results, see: #130821,
#122707) rely on the assumption that all shards are queried before
failing the phase.

This commit reworks batched requests to fail in the same way: only
after all shards are queried. To do this, we must include results in
the transport response even when a reduction failure occurred.
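
A hedged sketch of the reworked flow, with invented names rather than the actual search-action classes: the batched response always carries the shard results plus an optional reduction failure, and the coordinator holds that failure until all shards have been queried.

```java
import java.util.List;

// Hypothetical types for illustration; the real code lives in the batched
// query action and the search phase coordination logic.
record ShardResult(int shardId) {}
record NodeQueryResponse(List<ShardResult> shardResults, Exception reductionFailure) {}

final class CoordinatorSide {
    private Exception deferredReductionFailure;

    void onNodeResponse(NodeQueryResponse response) {
        // Always consume shard results first, so reader contexts and query
        // search results can be released even when a data-node reduce failed.
        response.shardResults().forEach(this::consume);
        if (response.reductionFailure() != null && deferredReductionFailure == null) {
            deferredReductionFailure = response.reductionFailure(); // hold, don't fail yet
        }
    }

    // Called once all shards have been queried, at the start of the fetch phase.
    void onFinalReduce() {
        if (deferredReductionFailure != null) {
            failPhase(deferredReductionFailure); // the phase fails exactly once
        }
    }

    private void consume(ShardResult result) { /* buffer for final reduction */ }

    private void failPhase(Exception e) { /* abort the search with this failure */ }
}
```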
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Nov 6, 2025

Labels

>enhancement, release highlight, :Search Foundations/Search, Team:Search Foundations, v9.1.0
