Geo forwarder for tfgrid validators #56
Comments
Reading this issue, I'm a bit confused as to what the goal is here, or what problem is being solved. If I understand correctly, the idea is to have multiple independent stacks, each served both under its own unique domain and under a globally shared domain. This shared domain would host this load balancer/proxy, and proxy to individual stacks, potentially based on geolocation.

There are some problems with the concept as laid out above. First, this setup is considered decentralized because each stack is hosted individually, and by someone else. However, if we put a centralized proxy in front of it, this removes the achieved decentralization (making the proxy the single point of failure in the process).

Secondly, it makes no sense to proxy based on geolocation. At this level, the user has already connected to the proxy, and from here on out all that matters is the latency between the proxy and the actual backend. Proxying to a backend stack that is physically close from the user's perspective would have adverse effects if the user is far away from the proxy.

Imo a better solution to this would be to solve this on the DNS level, by running geo aware authoritative DNS servers, where the DNS reply returns the IP(s) of the nearest backend stack if this is what we want. That way there is no additional (potentially high) latency cost to a centralized proxy first, and the SPOF of the proxy is removed. For this reason it can also be considered more decentralized.

Note that the TLS certificate will need to be present on every backend stack (for the shared domain), which will require some mechanism to either distribute a certificate from a central location, or some orchestration so these backends can all acquire and renew the certificate on their own. |
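To make the latency argument above concrete, here is a small worked example. All round-trip times are made-up illustrative values, not measurements, and the stack names are hypothetical. It compares reaching a backend through a central proxy with reaching it directly, as a geo-aware DNS reply would allow.

```javascript
// All round-trip times are made-up illustrative values in milliseconds.
const userToProxy = 150; // assumed: user in the US, central proxy in Europe

// Hypothetical stacks, with latency as seen from the proxy and from the user.
const backends = [
  { name: 'us-stack', proxyToBackend: 150, userToBackend: 20 },
  { name: 'eu-stack', proxyToBackend: 10, userToBackend: 150 },
];

// Through a proxy, every request pays both legs: user -> proxy -> backend.
function viaProxy(backend) {
  return userToProxy + backend.proxyToBackend;
}

// With geo-aware DNS, the client connects straight to the backend.
function direct(backend) {
  return backend.userToBackend;
}

for (const b of backends) {
  console.log(`${b.name}: via proxy ${viaProxy(b)} ms, direct ${direct(b)} ms`);
}
```

With these assumed numbers, the stack nearest the user is the slowest choice once a proxy sits in between (300 ms against 160 ms for the stack nearest the proxy), while a direct connection to it costs only 20 ms.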
Thanks for this great feedback. Always highly appreciated. Indeed I realized that the proxy was a SPOF. I thought worst case is that if it goes down, users will need to manually go to other URLs. It's not ideal for sure. The idea you bring here looks way better and you seem to say it's feasible. I think we could maybe close this issue and create a new one with what you suggest here. |
So, after trying again we have been instructed no DNS infra of any kind for now and also no POC setups. Thanks @LeeSmet for helping in trying to convince them. If we still have to continue with this geo load balancer, the next "best" approach (not) could be a 3 node geo forwarder. If:
In that case a user visiting dashboard.geo.grid.tf will get 3 IPs, and round-robin will be used to choose one. Once the request is sent, any of the 3 geo forwarders forwards you to the closest grid stack. From that point on, you lose the connection to dashboard.geo.grid.tf and continue on the forwarded grid backend stack. Having a hard time just getting a single instance working, both with Nginx and Caddy. Have some poc running here: http://geo.ninja.tf/ |
Looks good as a starter with round-robin. At least we do get the load balancing functionality! Thanks for the update. |
@coesensbert can you give a status of this? Thanks! |
Got further on this but stuck at nginx not mapping to a defined upstream. The following works, but doing this based on country name will make for a long list of pre-defined mappings, which would have to be manually maintained based on where you want regions to go. Doable but not a practical approach. Mapping based on geo seems to work from basic vpn tests. nginx.conf
default website
Working with continent names instead is much more in line with how our current grid backend stacks are distributed (and will be in the future). Here we encounter the issue that for Europe we have 3 stacks, while for the US we have one and for Asia also one. As this will grow organically, we need multiple options per continent, so we can introduce some basic load balancing once a given continent (or upstream) is chosen. The following config is currently exposed at http://geo.ninja.tf and can be tested, but is broken as explained below. nginx.conf
default website
The problem here is that the mapping happens, but is not interpreted as an upstream config. If you visit http://geo.ninja.tf you will be redirected to the name of the correct upstream, but not to what is configured inside the upstream config. |
Hi @coesensbert can you try replacing that redirect with a proxy_pass instead? As you were saying, it seems you can't resolve that in a redirect, but it may work in the proxy_pass
|
|
@xmonader @PeterNashaat thanks for the suggestions. We can't use proxy since that defeats the purpose of doing this. It would mean every request is first taken by the geo forwarder (well, proxy in that case) and proxied to the closest stack. So if you're in the US, every request would go to the EU, defeating what it's for and making the situation even worse than before. Using rewrite in the default website produces the same issue as with return 301
Even tried it like this, but this logic is not picked up at all while nginx approves the config
can show you about 4 different configs I also tried, same issue always comes up. So I'm missing something somewhere. |
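The redirect-versus-proxy distinction above can be sketched as a small function that only builds the one-off redirect response. This is not the nginx config from the thread, just an illustration of the intended behaviour: the forwarder answers once with a `Location` header and the browser then talks to the chosen stack directly. The continent codes, the grouping, and the choice of a 302 (so browsers do not cache the target permanently, as they may with a 301) are assumptions; the hostnames are the ones mentioned in this thread.

```javascript
// Assumed grouping of stacks per continent code (illustrative only).
const stacksByContinent = {
  EU: ['dashboard.grid.tf', 'dashboard.be.grid.tf', 'dashboard.fin.grid.tf'],
  NA: ['dashboard.us.grid.tf'],
  AS: ['dashboard.sg.grid.tf'],
};
const defaultStack = 'dashboard.grid.tf'; // assumed fallback

// Build a redirect response for a client on the given continent.
// `pick` returns a number in [0, 1) and is injectable so the choice can be tested.
function buildRedirect(continentCode, path = '/', pick = Math.random) {
  const hosts = stacksByContinent[continentCode] || [defaultStack];
  const host = hosts[Math.floor(pick() * hosts.length)];
  // Only this single response passes through the forwarder; all later
  // requests go straight to `host`.
  return { status: 302, headers: { Location: `https://${host}${path}` } };
}

console.log(buildRedirect('AS'));
```

When a continent has several stacks, the random pick gives the basic per-continent load balancing discussed earlier.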
|
You're right, I solved that during testing, pasted the wrong config (have so many by now). Quickly tested it again and the logic is accepted by nginx but not used when a request comes in. I'm always just forwarded to http://dashboard.grid.tf while I also tested that the geoip2 db works.
|
Got a cool AI suggestion from @Mik-TF to use this JavaScript function:
function balanceServers(r) {
    // Read the nginx variable $preferred_upstream for this request
    var upstream = r.variables.preferred_upstream;
    var servers = {
        'eu_upstream': ['dashboard.grid.tf', 'dashboard.be.grid.tf', 'dashboard.fin.grid.tf'],
        'us_upstream': ['dashboard.us.grid.tf', 'dashboard.grid.tf'],
        'as_upstream': ['dashboard.sg.grid.tf', 'dashboard.grid.tf'],
        'default_upstream': ['dashboard.grid.tf']
    };
    // Fall back to the default list for an unknown upstream name
    var selectedServers = servers[upstream] || servers['default_upstream'];
    // Pick one server from the list at random
    var index = Math.floor(Math.random() * selectedServers.length);
    return selectedServers[index];
}
export default {balanceServers};
Here I run into the issue that we need the nginx JS module, which only comes from the original nginx repos. Installed those, then a conflict occurs with the geoip2 module. Seems the nginx repos only support geoip, not geoip2.
at this point, I have no clue nor any ideas anymore. Every path we make up, ends up dead. |
Here's a solution based on a somewhat different approach. The idea is to serve the user some JavaScript that will take care of the redirect step inside the browser:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>ThreeFold Dashboard</title>
</head>
<body>
    <p>Redirecting...</p>
    <script>
    // Thanks to https://www.movable-type.co.uk/scripts/latlong.html (MIT license)
    function getDistanceFromLatLonInKm(lat1,lon1,lat2,lon2) {
        var R = 6371; // Radius of the earth in km
        var dLat = deg2rad(lat2-lat1); // deg2rad below
        var dLon = deg2rad(lon2-lon1);
        var a =
            Math.sin(dLat/2) * Math.sin(dLat/2) +
            Math.cos(deg2rad(lat1)) * Math.cos(deg2rad(lat2)) *
            Math.sin(dLon/2) * Math.sin(dLon/2)
        ;
        var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
        var d = R * c; // Distance in km
        return d;
    }
    function deg2rad(deg) {
        return deg * (Math.PI/180)
    }
    // Redirect to the backend with the smallest great-circle distance to the user
    function redirectClosest(geoip) {
        var closest = null;
        for (const backend of backends) {
            const distance = getDistanceFromLatLonInKm(geoip.lat, geoip.lon, backend.lat, backend.lon);
            if (!closest || distance < closest.distance) {
                closest = {url: backend.url, distance: distance};
            }
        }
        window.location.replace(closest.url);
    }
    const backends = [
        {url: 'https://dashboard.grid.tf', lat: 50.4777, lon: 12.3649},
        {url: 'https://dashboard.us.grid.tf', lat: 40.7862, lon: -74.0743},
        {url: 'https://dashboard.be.grid.tf', lat: 51.0923, lon: 3.82025},
        {url: 'https://dashboard.sg.grid.tf', lat: 1.35208, lon: 103.82},
        {url: 'https://dashboard.fin.grid.tf', lat: 60.1797, lon: 24.9344}
    ];
    window.onload = function() {
        fetch('http://ip-api.com/json')
            .then(response => response.json())
            .then(data => redirectClosest(data));
    };
    </script>
</body>
</html>
The backends are hardcoded with the lat/lon values to save some API requests. Distance is calculated using the Haversine formula, based on the lat/lon values for the user pulled from the ip-api service. We could also use I did a couple quick tests with VPN and it seems to work fine. If we want to use this, it should at least be improved to fall back to a default if the geoip API can't be reached. Probably best to have an experienced JS dev check it over too 🙂 This keeps things operationally very simple, as it's just a static file to serve and the browser takes care of the rest. Another approach would be to do something similar server side and serve up an actual redirect, via a simple custom application. |
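The fallback mentioned above (use a default when the geoip API can't be reached) could look roughly like the sketch below. It is only one possible shape, not a reviewed implementation: `DEFAULT_URL`, the 3 second timeout, and the helper names `hasCoordinates` and `locate` are assumptions, and `redirectClosest` refers to the function from the page above. The fetch function is passed in so the logic can be exercised without a network.

```javascript
const DEFAULT_URL = 'https://dashboard.grid.tf'; // assumed default target

// True only for a reply that carries usable coordinates.
function hasCoordinates(geoip) {
  return Boolean(geoip) && Number.isFinite(geoip.lat) && Number.isFinite(geoip.lon);
}

// Ask the geoip service for the user's location; resolve to null on a
// network error, a timeout, invalid JSON, or a reply without lat/lon.
async function locate(fetchImpl, timeoutMs = 3000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl('http://ip-api.com/json', { signal: controller.signal });
    const data = await response.json();
    return hasCoordinates(data) ? data : null;
  } catch (err) {
    return null;
  } finally {
    clearTimeout(timer);
  }
}

// Browser wiring; skipped when the file is run outside a browser.
if (typeof window !== 'undefined') {
  window.onload = async () => {
    const geoip = await locate(fetch);
    if (geoip) {
      redirectClosest(geoip); // defined in the page above
    } else {
      window.location.replace(DEFAULT_URL);
    }
  };
}
```

Returning null for every failure mode keeps the page logic to a single branch: either there are coordinates to compare, or the user lands on the default stack.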
@scottyeager it works perfectly on my end, testing with different locations via VPN. Clever suggestion! Here's a rewrite with some picocss to enhance the UI: EDIT: As suggested by Scott, I set the same colors as the dashboard loading page colors.
|
Here's an alternative version with a couple of changes. First, it just uses a plain background that's the same color as the Dashboard loading page. This is pretty seamless so it's a nice experience without needing extra stuff. The other change is that this one makes a request against each backend from the list and redirects to the one that responds first. This is simpler and more robust without any real downside.
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>ThreeFold Dashboard</title>
</head>
<body style="background-color:#212020;">
    <script>
    const backends = [
        'https://dashboard.grid.tf',
        'https://dashboard.us.grid.tf',
        'https://dashboard.be.grid.tf',
        'https://dashboard.sg.grid.tf',
        'https://dashboard.fin.grid.tf'
    ];
    // Resolves with the URL once the backend has answered the request
    async function wasFetched(url) {
        await fetch(url, {mode: 'no-cors'});
        return url;
    }
    window.onload = function() {
        // Promise.any ignores backends that fail and settles on the first
        // one that responds; Promise.race would stop at the first failure.
        Promise.any(backends.map(url => wasFetched(url)))
            .then(url => window.location.replace(url))
            .catch(error => console.error('Error:', error));
    };
    </script>
</body>
</html>
The only thing to add here would be some handling for the case where no backend responds before some timeout. Most likely it would be the user's connection to blame in that case, since the likelihood of simultaneous failure of all these sites should be next to zero. |
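The timeout handling mentioned above might be sketched as follows. The helper name `firstResponder`, the 5 second limit, and the choice of fallback URL are assumptions for illustration. The sketch uses `Promise.any` so that a backend which fails quickly does not end the wait, and adds a timer that resolves to the fallback when nothing answers in time.

```javascript
// Resolve with the first URL whose probe succeeds, or with fallbackUrl
// when no probe has succeeded within timeoutMs.
// `probe` is any function that takes a URL and returns a promise.
function firstResponder(urls, probe, timeoutMs, fallbackUrl) {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve(fallbackUrl), timeoutMs);
  });
  const probes = urls.map(url => probe(url).then(() => url));
  // Failed probes are ignored; if every probe fails, the timer still
  // resolves to the fallback, so this promise never rejects.
  return Promise.any([...probes, timeout]).finally(() => clearTimeout(timer));
}

// Browser wiring; skipped when the file is run outside a browser.
if (typeof window !== 'undefined') {
  const backends = [
    'https://dashboard.grid.tf',
    'https://dashboard.us.grid.tf',
    'https://dashboard.be.grid.tf',
    'https://dashboard.sg.grid.tf',
    'https://dashboard.fin.grid.tf'
  ];
  window.onload = () =>
    firstResponder(backends, url => fetch(url, { mode: 'no-cors' }), 5000, backends[0])
      .then(url => window.location.replace(url));
}
```

One trade-off in this shape: if every backend fails immediately, the user still waits for the full timeout before landing on the fallback. Showing a short message instead of redirecting would be another reasonable choice.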
I wanted to add a quick additional note on the second approach. The use of However, it might be appropriate to add health check path, such as described here but with a |
Thanks for the suggestions guys, everything is setup below. If it does not seem to work, make sure to clear cache or use a new incognito tab. http://dashboard.geo.ninja.tf/ -->
http://scott.geo.ninja.tf/ -->
|
Tested the 3 methods on vpn (toronto, france and india). All worked well!!
|
The plan for now is to set up a multi-node caddy poc that serves Scott's latency-based forwarder. If anyone wants to improve this, send in some suggestions. We ditch nginx at this point, since we only use caddy everywhere anyway. see: https://github.com/threefoldtech/tf_operations/issues/2803
|
Situation
We want to develop a load balancer that will redirect users from dashboard.grid.tf to the closest validator stack. Below is a recap of the project to make sure the load balancer part is clear.
For example, we will have 16 validators running their full grid stack at dashboard.01.grid.tf, dashboard.02.grid.tf, ... dashboard.16.grid.tf
Then a user can go to dashboard.grid.tf and the load balancer points to one of the working validator stack URLs (e.g. dashboard.04.grid.tf)
Users can also decide to simply go directly to a given validator URL (e.g. dashboard.04.grid.tf)
Full TFGrid Validator Stack Deployment
Phase 1:
Here is a recap of the first phase of the project:
The first phase of the project Full TFGrid Validator Stack Deployment is to make it possible for anyone to run the grid independently, this means the full grid stack with tfhub and tfbootstrap.
Phase 2:
Once we have phase 1 ready, validators will be able to deploy the full grid stack and this grid stack will be available at some given URLs, e.g. dashboard.03.grid.tf, dashboard.04.grid.tf, etc.
We will be able to share this list of URLs on our websites (e.g. github, threefold.io, etc.).
Users will be able to join these URLs to connect to specific grid instances, that are all independent from one another. This will make sure the grid is decentralized.
Once we have this, we will need to set a load balancer with all those URLs, so when a user goes to dashboard.grid.tf, the load balancer points the user to the closest grid instance (e.g. dashboard.grid.tf points to dashboard.03.grid.tf, if 03 goes down, it will point to 04, etc.)
Todo
For this issue, we want to develop a load balancer that will redirect from dashboard.grid.tf to one of the validator grid stacks (e.g. dashboard.05.grid.tf)
References and Suggestions
@coesensbert let me know if this is clear! I know you have some suggestions for this load balancer, please write your ideas on this issue.