
Load balance our geoserver #7

Open · goldpbear opened this issue Nov 28, 2017 · 3 comments

@goldpbear
Contributor

Hopefully #6 is helping with our geoserver memory issues (there have been no new AWS alarms, so that seems good at least).

I think an even more robust setup would have at least two production geoservers working together behind a load balancer. I'm imagining both geoservers would completely mirror each other, each syncing from the S3 geodata bucket.
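
Something like this is what I have in mind; just a round-robin sketch in TypeScript to illustrate the idea. In practice an AWS ELB (or nginx) would do the balancing for us, and the backend hostnames here are made-up placeholders:

```typescript
// Sketch only: round-robin proxying across two mirrored geoservers.
// The hostnames below are hypothetical placeholders.
import * as http from "http";

const backends = [
  { host: "geoserver-1.internal", port: 8080 },
  { host: "geoserver-2.internal", port: 8080 },
];
let next = 0;

http
  .createServer((clientReq, clientRes) => {
    // Rotate through the mirrored geoservers on each request.
    const backend = backends[next];
    next = (next + 1) % backends.length;

    const proxyReq = http.request(
      {
        host: backend.host,
        port: backend.port,
        path: clientReq.url,
        method: clientReq.method,
        headers: clientReq.headers,
      },
      (proxyRes) => {
        clientRes.writeHead(proxyRes.statusCode ?? 502, proxyRes.headers);
        proxyRes.pipe(clientRes);
      }
    );
    proxyReq.on("error", () => {
      clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(proxyReq);
  })
  .listen(80);
```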

I'm not sure how high on the priority list this should be. Maybe the memory upgrade via #6 is enough for the time being. On the other hand, we are hosting a lot of complex WMS data at this point.

@modulitos
Member

A load balancer could work, but I think we should also consider other solutions. Is there a particular reason why you think a load balancer would help? Another option would be to upgrade our server instance (right now we are on a t2.medium, I believe). We could also increase the size of the geoserver cache, as described here: http://docs.geoserver.org/stable/en/user/production/config.html#cache-your-data

I wonder if increasing the geoserver cache and upgrading the instance with enough memory to support a bigger cache would be more beneficial than running a cluster behind a load balancer.

On a side note, I'm skeptical that CloudFront has been of much help, because we send the bounding box of the current view in the query params. The lat/lon values in the bounding box have many decimal places and change constantly, so most requests are cache misses. And omitting the bounding box query params from CloudFront's cache key isn't possible, because geoserver needs the bounding box to determine which tiles to serve; that was the cause of the "single repeated tile" bug that you resolved on the last day in Madrid.
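
To make the cache-miss problem concrete (a hedged sketch; the URL shape is a typical WMS GetMap request, and the layer name and coordinates are invented):

```typescript
// Two nearly identical views: the coordinates differ only in the seventh
// decimal place, yet the query strings (and so the CloudFront cache keys)
// are distinct.
const getMapUrl = (bbox: number[]): string =>
  `/geoserver/wms?SERVICE=WMS&REQUEST=GetMap&LAYERS=plans&BBOX=${bbox.join(",")}`;

console.log(getMapUrl([-3.7038291, 40.4167754, -3.6891436, 40.4251198]));
console.log(getMapUrl([-3.7038292, 40.4167755, -3.6891437, 40.4251199]));
// Neither request hits the other's cached response.
```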

@goldpbear
Contributor Author

I think upgrading the instance size could be just as effective. I honestly don't know how to determine which approach is better without just testing both and comparing performance. But maybe a reasonable policy would be to continue upgrading our single instance until it can't keep up with demand, then consider load balancing options.

Mostly I've just been surprised by how quickly we maxed out geoserver's processing abilities with our new Madrid and NYC data. I wanted to think ahead a bit in case we get an influx of users and geoserver struggles.

re: the bounding boxes-- maybe we can round off decimal values in Leaflet somehow to increase cache hits? I'll look into that a bit...
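
Something like this, maybe (a rough sketch of the idea; `roundBbox` is a hypothetical helper, not an existing Leaflet API, and figuring out where it would hook into our request path is the part I need to research):

```typescript
// Snap bounding-box coordinates to a fixed precision before they go into
// the WMS query params, so nearly identical views share a CloudFront
// cache key. `roundBbox` is hypothetical, not part of Leaflet.
const roundBbox = (bbox: number[], places: number = 4): number[] =>
  bbox.map((coord) => Number(coord.toFixed(places)));

// Four decimal places is roughly 11 meters at the equator, so the
// rendered view barely shifts, but repeated pans collapse onto far
// fewer distinct cache keys.
console.log(roundBbox([-3.7038291, 40.4167754, -3.6891436, 40.4251198]));
// -> [ -3.7038, 40.4168, -3.6891, 40.4251 ]
```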

@modulitos
Member

I don't think that we have outgrown the limits of a single geoserver instance, but I still haven't researched it enough to know for sure.

If we need to improve our Geoserver performance, I would first look into increasing Geoserver's server-side cache, which would ease the burden of having to calculate and serve the tile data on every request. I'd then look into upgrading the EC2 instance and allocating more resources to Geoserver. I believe we are still at the low end of the hardware requirements for Geoserver.
