PHOENIX-5238 Provide an option to pass hints with PhoenixRDD and Datasource v2 #4
base: master
Conversation
@@ -87,6 +88,7 @@ public PhoenixDataSourceReader(DataSourceOptions options) {
         this.tableName = options.tableName().get();
         this.zkUrl = options.get(PhoenixDataSource.ZOOKEEPER_URL).get();
         this.dateAsTimestamp = options.getBoolean("dateAsTimestamp", false);
+        this.disableBlockCache = options.getBoolean("NO_CACHE", false);
Can you use Hint.NO_CACHE instead of the String for consistency?
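The suggestion could look roughly like the sketch below. This is illustrative only: the local Hint enum is a stand-in for Phoenix's HintNode.Hint, and the Map stands in for Spark's DataSourceOptions.

```java
import java.util.HashMap;
import java.util.Map;

public class HintKeyExample {
    // Stand-in for org.apache.phoenix.parse.HintNode.Hint (assumption for this sketch).
    enum Hint { NO_CACHE }

    public static void main(String[] args) {
        // Stand-in for the DataSourceOptions key/value map.
        Map<String, String> options = new HashMap<>();
        options.put(Hint.NO_CACHE.name(), "true");

        // Reading the option through the enum constant keeps the key tied to
        // the hint definition, rather than a hand-typed "NO_CACHE" literal
        // that can drift out of sync with the enum.
        boolean disableBlockCache =
            Boolean.parseBoolean(options.getOrDefault(Hint.NO_CACHE.name(), "false"));
        System.out.println(disableBlockCache);
    }
}
```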
@@ -148,6 +150,9 @@ public StructType readSchema() {
         // Optimize the query plan so that we potentially use secondary indexes
         final QueryPlan queryPlan = pstmt.optimizeQuery(selectStatement);
         final Scan scan = queryPlan.getContext().getScan();
+        if (this.disableBlockCache) {
+            scan.setCacheBlocks(false);
+        }
The scan variable is unused, so you can actually remove it. You should be setting this on each scan in the queryPlan; otherwise the Spark executor scans will not have this hint set. Instead of iterating over each scan here, it may be easier to set this in PhoenixDataSourceReadOptions. We create an instance of that when we call PhoenixDataSourceReader#planInputPartitions() from the driver, and it is embedded in each of our InputPartitions, so the read options are available to us on the Spark executors (see PhoenixInputPartitionReader#initialize()). There we iterate over the scans, and you can use the value carried in the read options to set setCacheBlocks to false.
Also, in case this hint is provided, you should make sure any other scan objects used on the driver also have this property set; for example, the scan that we use on the driver side to get the region locations.
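A rough sketch of what this comment describes. Everything here is a stand-in: Scan mimics org.apache.hadoop.hbase.client.Scan, and the disableBlockCache flag mimics the value that would be carried in PhoenixDataSourceReadOptions from the driver to the executors.

```java
import java.util.Arrays;
import java.util.List;

public class DisableBlockCacheSketch {
    // Minimal stand-in for org.apache.hadoop.hbase.client.Scan (assumption).
    static class Scan {
        private boolean cacheBlocks = true;  // HBase scans cache blocks by default
        void setCacheBlocks(boolean b) { cacheBlocks = b; }
        boolean getCacheBlocks() { return cacheBlocks; }
    }

    public static void main(String[] args) {
        // Flag that would come from the read options shipped to the executor.
        boolean disableBlockCache = true;
        // The scans that end up in the Spark InputPartitions; each one needs
        // the hint applied, not just a single driver-side Scan object.
        List<Scan> partitionScans = Arrays.asList(new Scan(), new Scan(), new Scan());
        for (Scan scan : partitionScans) {
            if (disableBlockCache) {
                scan.setCacheBlocks(false);
            }
        }
        for (Scan scan : partitionScans) {
            System.out.println(scan.getCacheBlocks());
        }
    }
}
```

The point of the design choice: setting the flag once in the serialized read options means every executor-side scan picks it up in initialize(), instead of mutating one Scan on the driver that never reaches the executors.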
@@ -38,11 +38,14 @@
         private static final String V1 = "v1";
         private static final String V2 = "v2";
         private static final String V3 = "v3";
+        private static final String NO_CACHE = "NO_CACHE";
Same comment as above: use the already defined enum value Hint.NO_CACHE instead of the String.
@@ -64,6 +67,8 @@ public void testPhoenixConfigsExtractedProperly() {
         assertEquals(V1, p.getProperty(P1));
         assertEquals(V2, p.getProperty(P2));
         assertEquals(V3, p.getProperty(P3));
+        assertEquals(true, Boolean.valueOf(p.getProperty(NO_CACHE)));
This test isn't testing your change at all. Here you are using the extraOptions to set the property and then just checking that the property is set. Ideally, we want to use extraOptions to set HBase/Phoenix properties that are valid configs we would otherwise set in, say, hbase-site.xml. In this case, NO_CACHE is not such a config, so we are using a different way to set it.