CLV Quickstart example fails #435

Closed
animenon opened this issue Nov 14, 2023 · 6 comments
Labels
CLV, docs (Improvements or additions to documentation)

Comments

@animenon

CLV Quickstart example fails at the function call:
beta_geo_model = clv.BetaGeoModel(data = data)

I'm not sure what I am missing here. I am on a Mac M1 and using conda to run the code from IPython.

On a side note, why doesn't the package have a pip-installable version? I am not a conda user, so I had to use conda just to try the package out.

@animenon
Author

Error I see:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/indexes/base.py:3653, in Index.get_loc(self, key)
   3652 try:
-> 3653     return self._engine.get_loc(casted_key)
   3654 except KeyError as err:

File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/_libs/index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()

File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/_libs/index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'customer_id'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
File ~/.pyenv/versions/anaconda3-2023.09-0/envs/marketing_env/lib/python3.11/site-packages/pymc_marketing/clv/models/beta_geo.py:116, in BetaGeoModel.__init__(self, data, model_config, sampler_config)
    115 try:
--> 116     self.customer_id = data["customer_id"]
    117 except KeyError:

File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/frame.py:3761, in DataFrame.__getitem__(self, key)
   3760     return self._getitem_multilevel(key)
-> 3761 indexer = self.columns.get_loc(key)
   3762 if is_integer(indexer):

File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/indexes/base.py:3655, in Index.get_loc(self, key)
   3654 except KeyError as err:
-> 3655     raise KeyError(key) from err
   3656 except TypeError:
   3657     # If we have a listlike key, _check_indexing_error will raise
   3658     #  InvalidIndexError. Otherwise we fall through and re-raise
   3659     #  the TypeError.

KeyError: 'customer_id'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
Cell In[6], line 1
----> 1 beta_geo_model = clv.BetaGeoModel(data = data)

File ~/.pyenv/versions/anaconda3-2023.09-0/envs/marketing_env/lib/python3.11/site-packages/pymc_marketing/clv/models/beta_geo.py:118, in BetaGeoModel.__init__(self, data, model_config, sampler_config)
    116     self.customer_id = data["customer_id"]
    117 except KeyError:
--> 118     raise KeyError("customer_id column is missing from data")
    119 try:
    120     self.frequency = data["frequency"]

KeyError: 'customer_id column is missing from data'

@animenon
Author

Error in short: KeyError: 'customer_id column is missing from data'
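
One way to see which columns are absent before constructing the model is a quick check like the sketch below. The `missing_columns` helper and the toy DataFrame are hypothetical. `customer_id` and `frequency` are the columns the traceback shows being looked up; `recency` and `T` are assumed here to be required as well.

```python
import pandas as pd

# "customer_id" and "frequency" appear in the traceback above;
# "recency" and "T" are an assumption about the remaining required columns.
REQUIRED_COLUMNS = ["customer_id", "frequency", "recency", "T"]


def missing_columns(data: pd.DataFrame, required=REQUIRED_COLUMNS) -> list[str]:
    """Return the required column names that are absent from `data`."""
    return [col for col in required if col not in data.columns]


# Toy stand-in for the quickstart dataset (values are made up).
data = pd.DataFrame(
    {"frequency": [2, 1, 0], "recency": [30.4, 1.7, 0.0], "T": [38.9, 38.9, 38.9]}
)

print(missing_columns(data))  # ['customer_id']
```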

animenon added a commit to animenon/pymc-marketing that referenced this issue Nov 14, 2023
Adding `customer_id` column to the sample dataset.

`customer_id` is a required column; without it the example fails (as described in pymc-labs#435).

@xhulianoThe1
Contributor

It seems this dataset doesn't have the `customer_id` column, which is required by the Beta Geo Model.

Setting `customer_id` to the index should fix the issue, since the model only needs a unique identifier:

data['customer_id'] = data.index
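
A slightly more defensive version of that one-liner might look like the sketch below. The `with_customer_id` helper and the sample values are hypothetical; the only assumption carried over from the suggestion above is that a unique index is good enough as an identifier.

```python
import pandas as pd


def with_customer_id(data: pd.DataFrame) -> pd.DataFrame:
    """Return `data` with a `customer_id` column, copying only when one must be added."""
    if "customer_id" in data.columns:
        return data  # column already present, nothing to do
    if not data.index.is_unique:
        raise ValueError("index is not unique, so it cannot serve as customer_id")
    out = data.copy()
    out["customer_id"] = out.index
    return out


# Made-up rows standing in for the quickstart dataset.
data = pd.DataFrame(
    {"frequency": [2, 1, 0], "recency": [30.4, 1.7, 0.0], "T": [38.9, 38.9, 38.9]}
)
data = with_customer_id(data)
print(data.columns.tolist())  # ['frequency', 'recency', 'T', 'customer_id']
```

The resulting frame can then be passed to `clv.BetaGeoModel(data=data)` as in the original report.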

@juanitorduz
Collaborator

juanitorduz commented Nov 16, 2023

Do you want to open a pull request? :)

@xhulianoThe1
Contributor

Will submit a PR.

This was referenced Nov 16, 2023
@ricardoV94
Contributor

Closed via #440

ricardoV94 added the docs (Improvements or additions to documentation) and CLV labels on Dec 7, 2023