Commit
* init rfm_segments func
* TODOs
* docstrings and for loop
* docstrings and for loop
* WIP dev notebook debugging
* checkpoint commit for remote pull
* code testing in dev notebook
* unit tests added
* dev notebook cleanup
* clean up type hints
* comments and code cleanup
* docstrings
* move formatting to rfm_summary and quickstart edits
* fix rfm_train_test_split bug
* added test for rfm_quartile_labels
* added rfm score warning
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,10 +67,10 @@ | |
"* `customer_id` represents a unique identifier for each customer.\n", | ||
"* `frequency` represents the number of _repeat_ purchases that a customer has made, i.e. one less than the total number of purchases.\n", | ||
"* `T` represents a customer's \"age\", i.e. the duration between a customer's first purchase and the end of the period of study. In this example notebook, the units of time are in weeks.\n", | ||
"* `recency` represents the timepoint when a customer made their most recent purchase. This is also equal to the duration between a customer’s first non-repeat purchase (usually time 0) and last purchase. If a customer has made only 1 purchase, their recency is 0;\n", | ||
"* `recency` represents the time period when a customer made their most recent purchase. This is equal to the duration between a customer’s first and last purchase. If a customer has made only 1 purchase, their recency is 0.\n", | ||
coweal
|
||
"* `monetary_value` represents the average value of a given customer’s repeat purchases. Customers who have only made a single purchase have monetary values of zero.\n", | ||
"\n", | ||
"If working with raw transaction data, the `rfm_summary` function can be used to preprocess data for modeling:" | ||
"The `rfm_summary` function can be used to preprocess raw transaction data for modeling:" | ||
] | ||
}, | ||
{ | ||
|
@@ -339,6 +339,8 @@ | |
"id": "514ee548", | ||
"metadata": {}, | ||
"source": [ | ||
"It is important to note these definitions differ from that used in RFM segmentation, where the first purchase is included, `T` is not used, and `recency` is the number of time periods since a customer's most recent purchase.\n", | ||
"\n", | ||
"To visualize data in RFM format, we can plot the recency and T of the customers with the `plot_customer_exposure` function. We see a large chunk (>60%) of customers haven't made another purchase in a while." | ||
] | ||
}, | ||
|
@@ -2579,7 +2581,7 @@ | |
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.18" | ||
"version": "3.10.14" | ||
}, | ||
"toc": { | ||
"base_numbering": 1, | ||
|
Hey! I am not sure this definition is correct. For example, if a customer made a single transaction one year ago and the period is in days, then as I understand it, recency for that customer should be one year, not zero as it is in the current implementation.
Another problem is with customers who have made many purchases over a long time. For example, if a customer has been with the company almost from the beginning and has made many purchases over that time, with the last purchase, say, two days ago, their recency will be huge and equal to their whole lifespan (not two days, as I would expect)! When one runs `expected_probability_alive`, the probability of being alive for this customer will be equal to zero, which is definitely not correct. It seems that recency should be equal to the difference between `observation_period_end` and the time of the customer's latest purchase.
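
To make the two readings of "recency" concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the helper names `model_recency`, `customer_age`, and `segmentation_recency`, and all dates, are hypothetical and chosen only to mirror the two customers described in the comment above. The first helper follows the notebook's wording (duration between first and last purchase), the third follows the definition proposed in the comment (end of the observation period minus the latest purchase), which matches what the notebook text in this diff attributes to RFM segmentation.

```python
from datetime import date


def model_recency(purchases):
    """Days between a customer's first and last purchase (notebook definition)."""
    return (max(purchases) - min(purchases)).days


def customer_age(purchases, period_end):
    """T: days between the first purchase and the end of the observation period."""
    return (period_end - min(purchases)).days


def segmentation_recency(purchases, period_end):
    """Days between the last purchase and the end of the observation period."""
    return (period_end - max(purchases)).days


# Hypothetical observation period end and purchase histories.
period_end = date(2024, 1, 1)
one_time = [date(2023, 1, 1)]  # single purchase one year ago
loyal = [date(2022, 1, 1), date(2023, 6, 15), date(2023, 12, 30)]  # last purchase two days ago

for name, purchases in [("one_time", one_time), ("loyal", loyal)]:
    print(
        name,
        model_recency(purchases),
        customer_age(purchases, period_end),
        segmentation_recency(purchases, period_end),
    )
# one_time 0 365 365
# loyal 728 730 2
```

Under these assumptions the two quantities are related by `segmentation_recency = T - model_recency`, so both carry the same information once `T` is known. With the notebook's definition, a recency close to `T` (728 of 730 days here) corresponds to a recent last purchase, not a distant one.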