Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ecommerce retirement fails #36

Closed
MoisesGSalas opened this issue Jun 26, 2024 · 12 comments · Fixed by #37
Closed

Ecommerce retirement fails #36

MoisesGSalas opened this issue Jun 26, 2024 · 12 comments · Fixed by #37

Comments

@MoisesGSalas
Copy link

Summary

While running the retirement pipeline in an nutmeg installation that runs ecommerce and discovery I got the following error in the user retirement statuses:

Moved from RETIRING_ECOMMERCE (step 91) to ERRORED (step 191):
Client Error 404: https://{{ ECOMMERCE_HOST }}/api/v2/user/retire/

My environment

  • Tutor version: 14.2.3
  • tutor-contrib-retirement version: 1.0.0
  • Output of the retirement command/CronJob: No users to retire

Additional context

A similar (pretty much the same) problem was reported back in 2020 in this forum post: https://discuss.openedx.org/t/retiring-a-learner-from-ecommerce/2951/2

I tried to investigate a little bit about that /api/v2/user/retire/ endpoint but I wasn't able to find if it was ever implemented in ecommerce.

This endpoint is called from the EcommerceApi client in the tubular repository.

I tried to check our old retirement pipelines but we didn't have the ecommerce steps.

I would like to confirm if this is a misconfiguration on my side or this have never really worked and it should be removed from the pipeline.

@fghaas
Copy link
Contributor

fghaas commented Jun 26, 2024

Considering the Nutmeg release has been EOL since the release of Olive (that is, since 2022-12-12), I'd say all bets are off at this point. If you can reproduce this in a Quince or Redwood environment, please do let us know.

@MoisesGSalas
Copy link
Author

MoisesGSalas commented Jun 26, 2024

I spinned up a tutor local in quince and faced the same issue when running tutor local retire-users:

b'Learner Retirement: Starting learner retirement for john using config file ./pipeline_config/config.yml'
b'Learner Retirement: Starting state RETIRING_ECOMMERCE'
ERROR:tubular.edx_api:API Error: 404 Client Error: Not Found for url: http://ecommerce.local.edly.io/api/v2/user/retire/ with status code: 404
b'Learner Retirement: Error in retirement state RETIRING_ECOMMERCE: 404 Client Error: Not Found for url: http://ecommerce.local.edly.io/api/v2/user/retire/'
b'Learner Retirement: Error encountered in state "RETIRING_ECOMMERCE"\n404 Client Error: Not Found for url: http://ecommerce.local.edly.io/api/v2/user/retire/'
b'Learner Retirement: Traceback (most recent call last):\n  File "/tubular/tubular/edx_api.py", line 63, in _request\n    response.raise_for_status()\n  File "/retirement/venv/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status\n    raise HTTPError(http_error_msg, response=self)\nrequests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://ecommerce.local.edly.io/api/v2/user/retire/\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "scripts/retire_one_learner.py", line 193, in retire_learner\n    response = getattr(config[service], method)(learner)\n  File "/retirement/venv/lib/python3.8/site-packages/backoff/_sync.py", line 105, in retry\n    ret = target(*args, **kwargs)\n  File "/retirement/venv/lib/python3.8/site-packages/backoff/_sync.py", line 105, in retry\n    ret = target(*args, **kwargs)\n  File "/tubular/tubular/edx_api.py", line 408, in retire_learner\n    return self._request(\'POST\', api_url, json=data)\n  File "/tubular/tubular/edx_api.py", line 81, in _request\n    raise HttpDoesNotExistException(str(exc))\ntubular.exception.HttpDoesNotExistException: 404 Client Error: Not Found for url: http://ecommerce.local.edly.io/api/v2/user/retire/\n'

I'm pretty sure this code has not changed in a long time. The most recent change is the move of the scripts from tubular to edx-platform in redwood, but it still keeps the same API call to the same endpoint: https://github.com/openedx/edx-platform/blob/8c40057c1078806e43c8278750ede664309c958c/scripts/user_retirement/utils/edx_api.py#L407

Even if nutmeg is EOL what I really wanted to know is if the ecommerce step worked back then when developing the plugin, mostly to discard my suspicion that this was never correctly implemented when the scripts were added to the tubular repository.

@fghaas
Copy link
Contributor

fghaas commented Jun 26, 2024

Honestly, we wouldn't know, because we don't use ecommerce. In other words, we never had ECOMMERCE_HOST defined, thus our generated config.yml does not configure the retirement pipeline to even touch ecommerce.

But if the pipeline is broken because it calls an endpoint that's really a 404, then that would look very much like nothing that the plugin can do anything about.

@bmtcril
Copy link

bmtcril commented Jun 26, 2024

Hi @MoisesGSalas , this definitely worked when I wrote it 6 yrs ago until I left edX. The URL should work and point here: https://github.com/openedx/ecommerce/blob/714cdfb4b8b818ddc86f710b5f91ffd6efad6eba/ecommerce/extensions/api/v2/views/retirement.py#L16

It's worth noting that it throws 404s for users not existing in ecommerce in a few places, so you may want to see what the specific error response string is from the 404 and check if that's your problem.

@MoisesGSalas
Copy link
Author

Thanks a lot for the answer @bmtcril. I found that view, but the weird thing is that the endpoint for that view is /api/v2/retirement/tracking_id/(?P<username>[\w .@_+-]+)/. That's how it appears in the api-docs swagger page too. And if I do a request to this endpoint I receive a succesfull response:

$ curl -X GET -H "Authorization: JWT $JWT" "https://$ECOMMERCE_HOST/api/v2/retirement/tracking_id/my_username/"
{"id":2,"ecommerce_tracking_id":"ecommerce-2","lms_user_id":10}%

So I'm really curious why the retirement script uses that random /api/v2/user/retire/ endpoint.

@fghaas, if it's really not working I think maybe removing those steps from the pipeline configuration would be a good idea.

@bmtcril
Copy link

bmtcril commented Jun 26, 2024

Oh! I see. Yes, those steps won't work. It's likely that things were configured for ecommerce to get the Segment IDs, but I have some vague recollection that user data could not be deleted from ecommerce for financial compliance reasons. I think it makes sense to remove those retirement steps from here since it will never / has never worked, but in the unlikely case that someone is using both Ecommerce and Segment it may make sense to keep the rest of the ecomm configuration.

@fghaas
Copy link
Contributor

fghaas commented Jun 27, 2024

@bmtcril I'm afraid I don't quite follow. If some data can't be deleted from ecommerce for compliance reasons then that makes sense, but the retirement pipeline may still want to do other things such as obscure data that is not needed for those reasons (cf. GDPR principles of data minimisation and storage limitation). So, rather than this plugin just throwing up its hands saying "I don't care about ecommerce", shouldn't the retire_one_learner.py script be fixed instead?

https://github.com/openedx/edx-platform/blob/8c40057c1078806e43c8278750ede664309c958c/scripts/user_retirement/retire_one_learner.py#L157

@bmtcril
Copy link

bmtcril commented Jun 27, 2024

The problem isn't that retire_one_learner is running the wrong API, it's using a different API for a different purpose (getting the Segment ID for the Segment retirement step). There isn't an existing API for retiring learner data in Ecommerce (that I know of, it's been a few years but I didn't see anything in searching).

So someone would have to figure out a general purpose set of retirement steps for Ecommerce to take that wouldn't run afoul of laws, then implement it (given the deprecated, unmaintained state that may be a challenge), and possibly backport it if people want it old named releases.

Even then I think you would still want the step to be opt-in rather than based on the existence of Ecommerce in the environment just in case organizations have different needs.

@fghaas
Copy link
Contributor

fghaas commented Jun 27, 2024

There isn't an existing API for retiring learner data in Ecommerce (that I know of, it's been a few years but I didn't see anything in searching).

Okay, but then that this means ecommerce is generally a non-starter whenever and wherever GDPR is involved; is that what you're saying?

fghaas added a commit to fghaas/tutor-contrib-retirement that referenced this issue Jun 27, 2024
This never worked, and can't ever work, due to limitations in the
E-Commerce service itself.

Thus, rather than rely on whether ECOMMERCE_HOST is defined to decide
whether the retirement pipeline for that service should be enabled,
always disable it.

Also, update the README explaining that this plugin can't be used to
retire E-Commerce accounts.

Fixes hastexo#36.
@bmtcril
Copy link

bmtcril commented Jun 27, 2024

I'm not a lawyer and this is not legal advice. 🙂 Anyone looking to set up an e-commerce system should consult with a lawyer to make sure the system and configuration is in compliance with all laws apply to them. That said, my personal recollection is that there were exceptions carved out of things like GDPR and COPPA for financial transaction data. The retirement pipeline is designed to be extended if an organization has needs that the default configuration doesn't address, and is not the only way to delete data from a system.

IMO (as someone who maintained it) there are plenty of reasons why Ecommerce should be a non-starter for everyone. Most pressingly that it is in a weird state of deprecation and has an uncertain future around things like security patches and long-term maintenance. It's also extremely complicated to configure, expensive to run, and the upstream package it relies on has dropped MySQL support (which is how we expect it to be run), so that's a time bomb for any time we upgrade django-oscar.

@fghaas
Copy link
Contributor

fghaas commented Jun 27, 2024

@bmtcril Fair! If you have thoughts on #37, please share them.

@fghaas
Copy link
Contributor

fghaas commented Jul 3, 2024

@MoisesGSalas Thanks for reporting this!

MoisesGSalas pushed a commit to fccn/tutor-contrib-retirement that referenced this issue Jul 3, 2024
This never worked, and can't ever work, due to limitations in the
E-Commerce service itself.

Thus, rather than rely on whether ECOMMERCE_HOST is defined to decide
whether the retirement pipeline for that service should be enabled,
always disable it.

Also, update the README explaining that this plugin can't be used to
retire E-Commerce accounts.

Fixes hastexo#36.

(cherry picked from commit 64dcfa3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants