Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable querying cumulative metrics with agg_time_dimension #999

Merged
merged 13 commits into from
Jan 31, 2024

Conversation

courtneyholcomb
Copy link
Contributor

@courtneyholcomb courtneyholcomb commented Jan 26, 2024

Resolves #1000

Description

Enable querying cumulative metrics with their agg_time_dimension. Should have the same behavior as when queried with metric_time.

@cla-bot cla-bot bot added the cla:yes label Jan 26, 2024
Copy link

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@courtneyholcomb courtneyholcomb changed the title Add time_dimension_spec_for_join to JoinOverTimeRangeNode Enable querying cumulative metrics with agg_time_dimension Jan 26, 2024
@courtneyholcomb courtneyholcomb marked this pull request as ready for review January 26, 2024 06:03
@courtneyholcomb courtneyholcomb added the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 26, 2024
@github-actions github-actions bot removed the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 26, 2024
Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for splitting this up into such bite-sized chunks for review!

There are a few boundary cases I'd like to test here and read through before approving.

  1. The time range constraint SQL rendering - the query runs, but I worry this is because the optimizer collapses the metric_time references into direct column name references, and it's not clear to me what the raw rendering will look like. I see this is tested for result correctness in the integration test (which is great) but there is no SQL to look at right now.
  2. Do we render correctly for metrics that take multiple measures with the same agg_time_dimension?
  3. Do we throw the right error for metrics that take multiple measures with different agg_time_dimensions?

cumulative_metric_constrained_node: Optional[ConstrainTimeRangeNode] = None
if (
cumulative_metric_adjusted_time_constraint is not None
and time_range_constraint is not None
and queried_linkable_specs.contains_metric_time
and queried_agg_time_dimension_specs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful.... the ConstrainTimeRangeNode is hard-coded against metric time here:

https://github.com/dbt-labs/metricflow/blob/main/metricflow/plan_conversion/dataflow_to_sql.py#L991

If you follow that through the underlying property accessor it's just a set of TimeDimensionInstances manufactured out of thin air with the value of the METRIC_TIME_NAME enum entry.

It's possible this will still work due to magic, but we may need to change things up a bit to allow an override, otherwise time range constraints might not work properly for cases where people use the underlying agg_time_dimension name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I'm aware! But good point that I actually I should revert the change on this line so it only applies the time constraint with metric_time - forgot to do that.
I had actually put up a separate issue for this since it's not directly related to the feature: #1008
Since the time constraint param isn't exposed in any of our APIs, only in the MF CLI, it seems less important to fix that issue before releasing this feature.
Note: in the integration test you noted, I used a where filter instead of a time constraint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just put up a commit to raise a more appropriate error if metric_time is not used here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the time constraint param isn't exposed in any of our APIs, only in the MF CLI, it seems less important to fix that issue before releasing this feature.

Right, makes sense, although we still don't want result divergence for people who do local dev with the CLI and then issue where filters in prod cloud accounts.

If I'm honest, though, I'm mostly looking ahead to predicate pushdown implementation concerns. :)

I'll take a look at the update!

date_part=None,
)
self._metric_time_specs = tuple(
TimeDimensionSpec.generate_possible_specs_for_time_dimension(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@courtneyholcomb
Copy link
Contributor Author

Do we render correctly for metrics that take multiple measures with the same agg_time_dimension?
Do we throw the right error for metrics that take multiple measures with different agg_time_dimensions?

@tlento I realized this isn't relevant to this PR since cumulative metrics can't have multiple input measures. So I'll move dealing with this feedback to the time offset PR!

Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@courtneyholcomb courtneyholcomb enabled auto-merge (squash) January 30, 2024 23:36
@courtneyholcomb courtneyholcomb merged commit e869fb3 into main Jan 31, 2024
9 checks passed
@courtneyholcomb courtneyholcomb deleted the court/cumulative-agg-time-dim branch January 31, 2024 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SL-1604] [Feature] Enable querying cumulative metrics with their agg_time_dimension
2 participants