SpannerScanner: add option to disableDataboost
Allows Databoost to be disabled. It stays on by default,
since using Databoost is the point of this connector.
There is a compatibility argument for leaving it off by
default, so that users who haven't enabled Databoost can
still use the connector, but that is to be discussed later.

Fixes #68
odeke-em committed Sep 19, 2023
1 parent 983ea73 commit b9473c4
Showing 1 changed file with 8 additions and 1 deletion.
@@ -89,14 +89,21 @@ public InputPartition[] planInputPartitions() {
     if (filters.length > 0) {
       sqlStmt += " WHERE " + SparkFilterUtils.getCompiledFilter(true, filters);
     }
+
+    // Databoost is enabled by default, because taking advantage of
+    // Databoost is the primary purpose of this integration.
+    // Please see https://github.com/GoogleCloudDataproc/spark-spanner-connector/issues/68
+    Object disableDataboost = this.opts.get("disableDataboost");
+    boolean enableDataboost = disableDataboost == null || !Boolean.parseBoolean(disableDataboost.toString());
+
     try (BatchReadOnlyTransaction txn =
         batchClient.batchClient.batchReadOnlyTransaction(TimestampBound.strong())) {
       String mapAsJSON = SpannerUtils.serializeMap(this.opts);
       List<com.google.cloud.spanner.Partition> rawPartitions =
           txn.partitionQuery(
               PartitionOptions.getDefaultInstance(),
               Statement.of(sqlStmt),
-              Options.dataBoostEnabled(true));
+              Options.dataBoostEnabled(enableDataboost));

       List<Partition> parts =
           Streams.mapWithIndex(
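
The default-on behaviour described in the commit message can be illustrated in isolation. The following is a minimal sketch, not the connector's code: the class `DataboostOption` and the helper `isDataboostEnabled` are hypothetical names, and it assumes the options arrive as a map whose values may be strings or booleans. An absent option leaves Databoost enabled; only a value that parses as "true" turns it off.

```java
import java.util.HashMap;
import java.util.Map;

public class DataboostOption {
  // Hypothetical helper (not part of the connector): returns true unless the
  // "disableDataboost" option is present and its string form parses as "true".
  static boolean isDataboostEnabled(Map<String, ?> opts) {
    Object disable = opts.get("disableDataboost");
    // A missing key yields null, which must keep Databoost enabled.
    return disable == null || !Boolean.parseBoolean(disable.toString());
  }

  public static void main(String[] args) {
    Map<String, String> opts = new HashMap<>();
    System.out.println(isDataboostEnabled(opts)); // option absent: true

    opts.put("disableDataboost", "true");
    System.out.println(isDataboostEnabled(opts)); // explicitly disabled: false

    opts.put("disableDataboost", "false");
    System.out.println(isDataboostEnabled(opts)); // explicitly not disabled: true
  }
}
```

Reading the value as an `Object` and parsing its string form avoids comparing a primitive `boolean` against `null`, which does not compile in Java, and works whether the option was stored as a string or as a boxed `Boolean`.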