Improve GeoLite2 file downloads when using RoadRunner #2124

Open
acelaya opened this issue May 10, 2024 · 1 comment
@acelaya
Member

acelaya commented May 10, 2024

A recently reported issue (#2114) caused Shlink to attempt a GeoLite2 db file download for every visit. The root cause was never determined, and the problem eventually went away.

A similar issue (#2021) was fixed some time ago. It produced the same result, but was caused by a bug in Shlink.

These issues highlight that the current approach to automatically downloading/updating the GeoLite2 db file is a bit brittle, and it would be good to revisit it.

Current approach and context

When Shlink started to use GeoLite2, it initially provided a command line tool that checks if the database is up to date, and tries to download it otherwise. It was up to users to schedule the execution of this command as they saw fit.

This is still the recommended approach for those serving Shlink with a classic web server (nginx/apache + php-fpm, or similar).

For convenience, and because background jobs became available when Shlink started to support swoole/openswoole, and later RoadRunner, Shlink also tries to provide an automatic mechanism: every time a visit happens, it checks whether the GeoLite2 db needs to be updated, and updates it if the file's build date metadata says it is old enough.

This presents some problems though. If the download fails for whatever reason (a bug in Shlink, incorrect write permissions, download timeout, error while extracting the file, etc.) while the existing db is too old, Shlink will try to download a new file for every visit, which can lead to a lot of download attempts.
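To illustrate the failure mode, here is a minimal Python sketch (not Shlink's actual code, which is PHP). The names `needs_update` and `simulate_visits`, and the 35-day threshold, are assumptions made up for this example. It shows why a check based only on the file's build date ends up with one download attempt per visit when the download keeps failing.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the real maximum age used by Shlink may differ.
MAX_DB_AGE = timedelta(days=35)


def needs_update(build_date, now, max_age=MAX_DB_AGE):
    """Metadata-based check: the db is stale when its build date is too old."""
    return now - build_date > max_age


def simulate_visits(visits, build_date, now, download):
    """Run the check once per visit and count how many downloads are attempted.

    `download` is a stand-in callable returning the new build date on
    success, or None on failure.
    """
    attempts = 0
    for _ in range(visits):
        if needs_update(build_date, now):
            attempts += 1
            new_build_date = download()
            if new_build_date is not None:
                # Only a successful download refreshes the metadata.
                build_date = new_build_date
    return attempts
```

With a working download, the first visit refreshes the file and later visits do nothing. With a failing download, the build date never changes, so every visit triggers a new attempt.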

This is even worse with the recent MaxMind API limit changes, which only allow 30 daily downloads per API key, leading to email notifications and logs getting flooded with errors once Shlink has reached that limit.

Ideal scenario

In an ideal world, Shlink would try to update the GeoLite2 db only every N days, based not on the file metadata, but on a fixed time schedule relative to the last attempt. If an error occurs, Shlink should re-schedule another attempt a bit later, with a maximum number of attempts per day to try to avoid API limits.
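A rough sketch of that scheduling logic, in Python for illustration only. The class `UpdateSchedule`, its fields and all default values are hypothetical, and a real implementation would need to persist this state somewhere shared between workers.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class UpdateSchedule:
    # All defaults below are placeholder values, not Shlink settings.
    update_interval: timedelta = timedelta(days=7)  # the "every N days" part
    retry_delay: timedelta = timedelta(hours=2)     # wait after a failure
    max_attempts_per_day: int = 5                   # stay well under API limits
    next_attempt_at: Optional[datetime] = None
    recent_attempts: List[datetime] = field(default_factory=list)

    def should_attempt(self, now: datetime) -> bool:
        """Decide based on the last attempt, not on the db file metadata."""
        # Keep only the attempts made during the last 24 hours.
        self.recent_attempts = [
            t for t in self.recent_attempts if now - t < timedelta(days=1)
        ]
        if len(self.recent_attempts) >= self.max_attempts_per_day:
            return False
        return self.next_attempt_at is None or now >= self.next_attempt_at

    def record_attempt(self, now: datetime, succeeded: bool) -> None:
        """Schedule the next attempt: far away on success, sooner on failure."""
        self.recent_attempts.append(now)
        delay = self.update_interval if succeeded else self.retry_delay
        self.next_attempt_at = now + delay
```

The caller would check `should_attempt(now)` before downloading and call `record_attempt(now, succeeded)` afterwards, so a failing download is retried after `retry_delay` at most `max_attempts_per_day` times in any 24 hour window, regardless of how many visits happen.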

This is tricky though, as RoadRunner's jobs system doesn't provide this capability out of the box, so it would require some custom implementation.

RoadRunner's job queues docs: https://docs.roadrunner.dev/queues-and-jobs/overview-queues

@acelaya acelaya moved this to Todo in Shlink Jul 3, 2024
@acelaya acelaya moved this from Todo to In Progress in Shlink Jul 21, 2024
@acelaya acelaya removed the status in Shlink Jul 23, 2024
@acelaya acelaya removed this from the 4.2.0 milestone Jul 23, 2024
@acelaya acelaya added this to the 4.3.0 milestone Aug 11, 2024
@acelaya acelaya moved this to Todo in Shlink Oct 8, 2024