Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reconnecting websocket polyfill and error propagation during push #6200

Merged
merged 4 commits into from
Aug 13, 2024

Conversation

juliushaertl
Copy link
Member

@juliushaertl juliushaertl commented Aug 12, 2024

From reading console logs of recent reports and checking back the access logs it seemed we had couple of cases where the editor was left in a state where it was editable but changes were no longer pushed to the server.

I could reproduce this for multiple scenarios:

  • Set the browser network tab to offline and wait until the y-websocket connection got closed after not receiving further messages

  • Manually return a failure status code like 401, 503, 403 on the push endpoint

  • fix: Track push errors and do not reset error state if push fails but sync passes

    • This fixes flickering error messages as the push failed but the next sync request reset the failure state again (even though push was still failing)
  • fix: Do not reset read only state if sync errors occurred

    • Ensure that if we are in an error state, no further edits can happen in the frontend
  • fix: emit onerror for websocket and reopen connection on reconnect attempts of y-websocket

    • Make sure that a failure on the push is properly announced to y-websocket
  • fix: Emit error if push fails in other unhandled cases

    • In cases we didn't handle there no error status was propagated to the Editor vue component

@juliushaertl juliushaertl added bug Something isn't working 3. to review high labels Aug 12, 2024
@juliushaertl juliushaertl changed the title fix/sync bugs Fix reconnecting websocket polyfill and error propagation during push Aug 12, 2024
Copy link
Member

@mejo- mejo- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to follow the logic by stepping through with a debugger and review, but have more questions than answers 😬

Maybe we can schedule a call later today or tomorrow to go through the changes together?

src/services/SyncService.js Show resolved Hide resolved
src/services/WebSocketPolyfill.js Outdated Show resolved Hide resolved
@@ -175,6 +182,7 @@ class SyncService {
}).catch(err => {
const { response, code } = err
this.sending = false
this.pushError++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Treating push errors in hasConnectionIssues means the editor turns readonly as soon as there a connection issue, right? I can imagine that usability will suffer in flaky networks (e.g. in trains) with this change. We could think about displaying a document status that warns about unpushed changes instead. But I'm also fine with merging this as is and seeing how it will perform in reality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After going through the code again I still feel this is more sane as a quick fix then skipping the commit. Without it we currently have even worse behaviour on flaky networks where the editor randomly switches between editable and non-editable depending if a sync request passes or not.

My proposal would be to get this in and follow up with a PR to improve the overall behaviour with error handling.

if (this.$editor.isEditable !== editable) {
this.$editor.setEditable(editable)
}
},

onSync({ steps, document }) {
this.hasConnectionIssue = false
this.hasConnectionIssue = !this.$providers[0].wsconnected || this.$syncService.pushError > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something obvious, but I don't understand why we should evaluate this.$syncService.pushError > 0 here. onSync() is called when SyncService emits 'sync', which in my understanding happen either if we pushed steps successfully (which means pushError = 0), or if we received steps from the server (which is not related to pushError).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that push errors were always overwritten by any following sync request that passed. This change will make sure that a missing push is always still taken into account, so we display an error as long as no successful push happened.

@juliushaertl
Copy link
Member Author

juliushaertl commented Aug 13, 2024

@mejo- After testing the behaviour again with just the two commits as discussed yesterday i noticed that this is quite flaky that way:

  • There is no indication that the push failed
  • The editor "randomly" changes between editable and not editable depending if a sync succeeded or a push failed.

I would feel more confident in merging this as it is and follow up to improve:

  • Refactor the general sync state into a defined data strcuture by splitting push/sync/save state and also remembering last success/failure time, number of failures. That way we have more control to show errors only if the issue persists and not immediately on flaky connections
  • Indicate that local changes were not pushed to the server with a proper error message on the save indicator
    • Figure out if we can auto retry the push or if we want to let the user manually trigger it
  • Avoid immediately setting the editor to read only
  • Avoid saving the file as the user if there are unpushed steps
  • Warn users when navigating away with unpushed steps
  • File follow up issue for proper offline support, keeping changes in the browser storage and pushing later

@mejo-
Copy link
Member

mejo- commented Aug 13, 2024

Thanks for going through this again @juliushaertl. Your assessment sounds reasonable. And the list of follow-up todos is also a nice starting point! So let's get this merged as is. And probably we should also backport until stable28 at least, no?

@juliushaertl
Copy link
Member Author

/backport to stable29

@juliushaertl
Copy link
Member Author

/backport to stable28

@juliushaertl
Copy link
Member Author

Follow up started in #6202

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review bug Something isn't working high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants