Curious why some featured products are not present in all products. #639

Open
MustaphaU opened this issue Sep 22, 2024 · 2 comments

@MustaphaU

MustaphaU commented Sep 22, 2024

Hi,

I am curious why some of the featured products are not available in all products.

Specifically, 5 featured items appear to be missing from all products.

When I run this in Lab 4 of the Personalization workshop:

import pandas as pd
import requests

# products_service_instance is assumed to be defined earlier in the notebook
all_products_resp = requests.get('http://{}/products/all'.format(products_service_instance))
featured_products_resp = requests.get('http://{}/products/featured'.format(products_service_instance))

all_products = all_products_resp.json()
featured_products = featured_products_resp.json()

# IDs that are featured but absent from the "all products" response
print(set(pd.DataFrame(featured_products).id) - set(pd.DataFrame(all_products).id))

It outputs the following IDs, implying these items are featured but not in all products:

{'2ad09e8e-fd41-4d29-953e-546b924d7cb8',
 '4bb66b8a-cf13-4959-87ce-ca506fa568a2',
 '6bd74f2d-90c0-4ca6-9663-f3bbe9bf405b',
 '6f04daee-7387-442f-bc99-a9b0072b29ce',
 'b87da3f8-9a3e-417d-abd7-16329c5be1ba'}

@BastLeblanc
Contributor

Hi,

You are right: the "all products" API does not actually return every product, because of a limit in the DynamoDB Scan operation (a single Scan call returns at most 1 MB of data, so a larger table needs several calls).

all_products_resp = requests.get('http://{}/products/all'.format(products_service_instance))
featured_products_resp = requests.get('http://{}/products/featured'.format(products_service_instance))

all_products = all_products_resp.json()
featured_products = featured_products_resp.json()
print(len(all_products))

This prints 2028, but the DynamoDB table currently holds 2,466 items (that number may change).

The fix would require retrieving all the data when doing the Scan operation, by paginating the results as described in the documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html#Scan.Pagination
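For illustration only, here is a minimal Python sketch of what paginating a Scan could look like. It is not the project's actual fix (see the linked PR for that). It assumes `table` behaves like a boto3 DynamoDB `Table` resource, whose `scan()` returns a dict with `Items` and, while more pages remain, `LastEvaluatedKey`. The helper name `scan_all_items` and the table name in the usage comment are hypothetical.

```python
def scan_all_items(table, **scan_kwargs):
    """Collect every item from a table by following Scan pagination.

    `table` is assumed to behave like a boto3 DynamoDB Table resource:
    table.scan(**kwargs) returns a dict with "Items" and, when more
    pages remain, "LastEvaluatedKey".
    """
    items = []
    kwargs = dict(scan_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        page = table.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the scan reached the end of the table.
            break
        # Resume the next Scan call where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
    return items


# Hypothetical usage (the table name is a placeholder, not taken from the project):
# import boto3
# table = boto3.resource("dynamodb").Table("products")
# print(len(scan_all_items(table)))
```

With a loop like this, the item count returned by the API should match the count in the table, rather than stopping at the first page.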

Can you let us know the impact on the work you are doing?

You are welcome to contribute with a PR for this.

@MustaphaU
Author

MustaphaU commented Sep 27, 2024

Thank you. The impact was that side-by-side comparisons of the reranked and unranked lists could not be done effectively, since the length of the reranked list was less than the length of the unranked list.
PR #642

Projects
Status: Working on it