(Discussion) Static Site Generation and DSpace #3183

Open
kshepherd opened this issue Jul 11, 2024 · 5 comments


@kshepherd
Member

kshepherd commented Jul 11, 2024

Description

Generating and serving static HTML and JavaScript ('Static Site Generation') can make repository content easier to serve in low-resource environments and more portable (pages are on disk), and it checks off many other best practices.

DSpace Angular can't do everything as static content, of course, but there are many pages which don't change very much (e.g. unauthenticated context + item page) and are the most commonly visited content, by both human and robot users.

This ticket is for discussion about how and where we can use SSG (static site generation) with DSpace repositories, various techniques and how they work, and any experience or progress experimenting with this topic.

Approaches

There seem to be a few approaches to SSG in Angular. So far we have identified:

  1. Build-time generation of configured or discovered routes (user-specified). This could hammer the REST API quite hard at startup. It could be more useful for repositories that have large numbers of item pages that do not change for years.
    • As noted in the July 18 meeting, it can also severely increase build time. If one must prerender with every build, this becomes a blocker to fixing bugs, upgrading, etc.
  2. On-demand / 'regeneration' as requests come in?
    • As noted in the July 18 meeting, dynamic SSG comes with all the same performance problems as SSR (and maybe more), so it could end up negatively impacting DSpace performance.
  3. External tools like Scully (see Refs) that can crawl the Angular site and create static content. These could be configured to generate pages incrementally, on TTLs, or whenever admins see fit.
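
To make the incremental, TTL-based idea from the approaches above a bit more concrete, here is a minimal sketch of how a generator might pick which pages to (re)render on a given run. Everything in it is an assumption for illustration: the `PageRecord` shape, the `routesToRegenerate` name, the `/items/<uuid>` route pattern, and the `limit` parameter (a simple cap meant to avoid hammering the REST API in one go). It is not an existing DSpace or Angular API.

```typescript
// Hypothetical record kept by the generator for each public item page.
interface PageRecord {
  uuid: string;          // item identifier (assumed to map to /items/<uuid>)
  lastModified: number;  // epoch ms of the last change in the repository
  generatedAt?: number;  // epoch ms of the last static render; undefined if never rendered
}

// Return the routes that should be (re)rendered on this run.
function routesToRegenerate(
  pages: PageRecord[],
  now: number,    // current time, epoch ms
  ttlMs: number,  // maximum age of a static copy before it is refreshed
  limit: number   // cap on the batch size, to bound load on the REST API
): string[] {
  return pages
    .filter((p) =>
      p.generatedAt === undefined ||     // never rendered
      p.lastModified > p.generatedAt ||  // changed since the last render
      now - p.generatedAt > ttlMs        // static copy is older than the TTL
    )
    .slice(0, limit)
    .map((p) => `/items/${p.uuid}`);
}
```

The resulting list could be written out one route per line and handed to whichever prerendering or crawling tool is in use. Pages that have not changed and are still within the TTL are skipped, which is what keeps repeat runs cheap.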

Note re: performance / SSR

This discussion is not intended to be a band-aid "fix" or a competing proposal for solving high resource usage in SSR (server-side rendering). It is a discussion for those curious about supporting static sites in DSpace repositories, as a goal in and of itself.

References

@kshepherd kshepherd added new feature needs triage New issue needs triage and/or scheduling and removed new feature needs triage New issue needs triage and/or scheduling labels Jul 11, 2024
@kshepherd kshepherd self-assigned this Jul 11, 2024
@tdonohue tdonohue added new feature performance / caching Related to performance, caching or embedded objects needs discussion labels Jul 11, 2024
@kshepherd kshepherd removed the performance / caching Related to performance, caching or embedded objects label Jul 18, 2024
@abollini
Member

please please please :) only look for option 3.
IMHO we need to simplify the Angular code as much as possible to reach our primary goal, which should be a fast application in a basic scenario without the need for all these extra layers (cache, site pre-generation) that are of course needed for high-performance, large, and heavily accessed sites.
The other options could be attractive in the short term but would make performance worse and the installation process more complex and/or slow.

@kshepherd
Member Author

please please please :) only look for option 3

Good points made today, I've added notes about the downsides to options 1 and 2, and tried to make it clearer that this issue was not supposed to be related to SSR (or even Angular, necessarily) or compete for attention with SSR performance issues.

@kshepherd kshepherd changed the title (Discussion) Static Site Generation in DSpace (Discussion) Static Site Generation and DSpace Jul 24, 2024
@jameswsullivan

I don't know much about the inner workings of DSpace but I really wish that there could be something like the "Simply Static" WordPress plugin for it.

@kshepherd
Member Author

@jameswsullivan would a function like "generate website from publicly-accessible resources" fit your use case? (i.e. any items, bitstreams, collections that are not readable by unauthenticated users would be excluded)
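
A rough sketch of what "generate website from publicly-accessible resources" could mean at its simplest: filter the candidate pages down to those an unauthenticated user can read before anything is rendered. The `Resource` shape, the `anonymousRead` flag, and the `publicPaths` name are all hypothetical here. In practice the flag would have to be derived from the repository's real access policies, which this sketch does not attempt.

```typescript
// Hypothetical, simplified view of a repository object; not the DSpace REST model.
type ResourceKind = 'item' | 'bitstream' | 'collection';

interface Resource {
  kind: ResourceKind;
  path: string;            // route or download path of the resource
  anonymousRead: boolean;  // assumed flag: true if unauthenticated users may read it
}

// Keep only the paths that are safe to publish as static content.
function publicPaths(resources: Resource[]): string[] {
  return resources
    .filter((r) => r.anonymousRead)
    .map((r) => r.path);
}
```

Anything filtered out here would simply never exist in the static output, so restricted content could not leak through the static site by accident.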

@jameswsullivan

@kshepherd I think so, yes.
One of the performance issues we have is that the DSpace site is getting hit by large numbers of crawling or content-scraping bots that enumerate the links/resources, causing heavy load on the hosting. I think that under DSpace's current structure, each of these sessions would trigger calls/responses between the DSpace Angular UI and the DSpace API, plus the SSR work? (I have limited knowledge of how exactly DSpace works, but these are the topics I've come across during hosting and orchestration.)

So I'd think that if publicly-accessible resources could be generated and then hosted/served statically (especially for the bots and anonymous hits), that would alleviate a lot of this problem? I'm not sure how it would work, though: would the publicly-accessible content be served as static pages first, until a user logs in? My analogy to WordPress' Simply Static is probably not a good fit here, because with WordPress I simply spin up a local instance, make edits, then generate the static HTML and upload it to the hosting server.
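
One possible reading of "static pages first, until a user logs in" is a small routing decision in front of the site. The sketch below is only an illustration of that idea under stated assumptions: the `IncomingRequest` shape, the `chooseBackend` name, and the `authCookieName` parameter are hypothetical, and how a logged-in session is actually detected would depend on the deployment.

```typescript
// Hypothetical, minimal view of a request as seen by a proxy or front server.
interface IncomingRequest {
  path: string;
  cookies: { [name: string]: string };
  headers: { [name: string]: string };  // header names assumed to be lower-cased
}

type Backend = 'static' | 'dynamic';

function hasKey(obj: { [name: string]: string }, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Anonymous requests for a pre-generated page get the static copy;
// everything else falls through to the live application.
function chooseBackend(
  req: IncomingRequest,
  staticPaths: string[],   // paths for which a static page exists on disk
  authCookieName: string   // assumed name of the cookie that marks a logged-in session
): Backend {
  const loggedIn =
    hasKey(req.cookies, authCookieName) || hasKey(req.headers, 'authorization');
  if (loggedIn) {
    return 'dynamic';
  }
  return staticPaths.indexOf(req.path) !== -1 ? 'static' : 'dynamic';
}
```

Under this scheme, bots and anonymous visitors hitting item pages would never reach the Angular SSR process or the REST API, while logged-in users keep the full dynamic UI.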

4 participants