Skip to content

Commit

Permalink
[introspection] Set the minimal query parallelism to 2 (#1586)
Browse files Browse the repository at this point in the history
This commit makes sure that the default parallelism for the query engine
(datafusion) is at least 2. This value translates to 'target partitions'
which is used by datafusion during query execution to control execution
concurrency. Setting this value to 1 however causes #1585, and setting
this to high is unadvised in our case, as datafusion is used for
low-prio introspection tasks.
However once issue #1585 is resolved, we will return to properly sizing
datafusion.
  • Loading branch information
igalshilman authored Jun 3, 2024
1 parent ce1edbd commit e64ebc7
Showing 1 changed file with 9 additions and 3 deletions.
12 changes: 9 additions & 3 deletions crates/storage-query-datafusion/src/context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0.

use std::cmp::max;
use std::fmt::Debug;
use std::sync::Arc;

Expand Down Expand Up @@ -168,9 +169,14 @@ impl QueryContext {
// build the session
//
let mut session_config = SessionConfig::new();
if let Some(p) = default_parallelism {
session_config = session_config.with_target_partitions(p)
}

// TODO: this value affects the number of partitions created by datafusion while
// executing a query. Setting this to 1 causes the SymmetricHashJoin to fail
// which we use for some of the queries.
// Resolving this issue is tracked at https:/restatedev/restate/issues/1585
let parallelism = default_parallelism.unwrap_or(2);
session_config = session_config.with_target_partitions(max(2, parallelism));

session_config = session_config
.with_allow_symmetric_joins_without_pruning(true)
.with_information_schema(true)
Expand Down

0 comments on commit e64ebc7

Please sign in to comment.