-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add datasetter for date and fourier #405
Conversation
WIP. Will add some tests for the different scenarios that were described |
Codecov Report
@@ Coverage Diff @@
## main #405 +/- ##
==========================================
- Coverage 88.67% 88.59% -0.08%
==========================================
Files 21 21
Lines 1880 1885 +5
==========================================
+ Hits 1667 1670 +3
- Misses 213 215 +2
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some questions about the intended behavior
Also, there is some heavy overlap between _generate_and_preprocess_model_data
and this method now and I think much could be combined
if isinstance(X, np.ndarray): | ||
return X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should support ndarray when we want to select different sets of columns. Unless we want to support the case where only costs are added. i.e
isinstance(X, np.ndarray) and self.control_columns is None and self.yearly_seasonality is None
I think that this can simplify the code heavily
try: | ||
new_channel_data = X[self.channel_columns].to_numpy() | ||
except KeyError as e: | ||
raise RuntimeError("New data must contain channel_data!", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataFrames already raise KeyError
s can that be used instead? What's more important: switching to RuntimeError or having an informative error message
@@ -67,7 +67,7 @@ def batched_convolution(x, w, axis: int = 0): | |||
|
|||
|
|||
def geometric_adstock( | |||
x, alpha: float = 0.0, l_max: int = 12, normalize: bool = False, axis: int = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getting a mypy issue
Also think that this is still missing the |
Still working through the tests as removing the mixins requires a lot of changes 😢 However, the new interface should be somewhat stable. I can keep the mixins and just use the transformers that are within to support new data if that is more desirable. I'd appreciate any feedback and requests @ricardoV94 @juanitorduz @cluhmann |
if self.validate_data: | ||
self.validate("X", X) | ||
self.validate("y", y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New order because the x_transformer.transform(X)
will not have date_col
Will continue this off of the branch that @cetagostini is working on |
Addresses #400
📚 Documentation preview 📚: https://pymc-marketing--405.org.readthedocs.build/en/405/