Split output file in chunks #71

Open
tombreit opened this issue Jul 22, 2018 · 1 comment

@tombreit

To generate a valid Google sitemap, I need to split the output file into chunks of a limited size (no more than 50,000 items each):

Break up large sitemaps into a smaller sitemaps to prevent your server from being overloaded if Google requests your sitemap frequently. A sitemap file can't contain more than 50,000 URLs and must be no larger than 50 MB uncompressed.
(Source: https://support.google.com/webmasters/answer/183668?hl=en#general-guidelines)

Currently I have a simple view that renders an "all-in" sitemap file, which holds more than 50,000 items:

path: /google_image_sitemap.xml
template: google_image_sitemap.xml.jinja2
context:
  dynamic:
    photos: session.query(Photo).all()

I'm not aware of an elegant, "statik-esque" way of generating these chunks and linking them from a (small) sitemap index file. Any ideas?

@thanethomson
Owner

How often do those items change? If, once created, they're permanent, you could simply create sitemap files that group your items according to date (i.e. sort all 50,000 by date created), but do so in chunks (like chunks of 1,000). Then generate your sitemap index file to point to all of these files.
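
A minimal sketch of the sort-then-chunk idea, in plain Python rather than any Statik API. The `Item` class, its `url` and `created` fields, and the `chunk_by_date` helper are hypothetical names used only for illustration; the chunk size of 1,000 is the example figure from above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Sequence


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a row such as Photo."""
    url: str
    created: date


def chunk_by_date(items: Sequence[Item], chunk_size: int = 1000) -> List[List[Item]]:
    """Sort items by creation date, then cut the sorted list into fixed-size chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # The URL is only a tie-breaker so the order is deterministic for equal dates.
    ordered = sorted(items, key=lambda item: (item.created, item.url))
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]
```

As long as a new item has a later creation date than every existing one, it lands at the end of the sorted list, so only the last chunk changes (or a new chunk is started) and the earlier sitemap files stay as they were.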

You could perhaps do this by creating a simple view for your sitemap index file, and then a complex view for your "chunk" index files. The key is making sure you sort in such a way that, when you create new items, they only get appended to the sorted list in your sitemap - not inserted somewhere in between.
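
For the index file itself, here is a rough sketch of what the output could look like, again as generic Python and not a Statik view definition. The base URL `https://example.com` and the `google_image_sitemap_<n>.xml` naming scheme are assumptions; only the `sitemapindex` / `sitemap` / `loc` structure comes from the sitemaps.org protocol.

```python
import math
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_count(total_items: int, chunk_size: int) -> int:
    """Number of chunk files needed to hold total_items."""
    return math.ceil(total_items / chunk_size)


def build_sitemap_index(base_url: str, total_items: int, chunk_size: int = 1000) -> str:
    """Return a sitemap index document pointing at one file per chunk."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, chunk_count(total_items, chunk_size) + 1):
        sitemap = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(sitemap, "loc")
        # Hypothetical file naming scheme; adjust to the paths the chunk view emits.
        loc.text = f"{base_url}/google_image_sitemap_{n}.xml"
    return ET.tostring(root, encoding="unicode")
```

Calling `build_sitemap_index("https://example.com", 50000)` would list 50 chunk files. How the chunk number is passed into a complex view's path depends on the project setup and is not shown here.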
