Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[question] concurrent write to websocket connection with mutex.Lock() #828

Open
Jrenk opened this issue Nov 18, 2022 · 19 comments
Open

[question] concurrent write to websocket connection with mutex.Lock() #828

Jrenk opened this issue Nov 18, 2022 · 19 comments
Labels

Comments

@Jrenk
Copy link

Jrenk commented Nov 18, 2022

Describe the problem you're having

Using the gorilla/websocket package I sometimes get the following error, despite using mutex.Lock:
concurrent write to websocket connection

Versions

  • Go version: 1.18
  • package version: v1.5.0

"Show me the code!"

type WebsocketClient struct {
	websocket   *websocket.Conn
	mutex            sync.Mutex
}

func (websocketClient *WebsocketClient) sendMessage(msg interface{}) error {
	websocketClient.mutex.Lock()
	defer websocketClient.mutex.Unlock()

	return websocketClient.websocket.WriteJSON(msg)
}

In an http handler the websocket clients get created:

ws, err := upgrader.Upgrade(w, r, nil)
...
clients[id] = &WebsocketClient{
    websocket: ws,
}

The sendMessage function gets called from multiple go-routines and the WriteJSON never gets called directly anywhere else. From my understanding the mutex.Lock() should prevent concurrent websocket writes.

@Jrenk
Copy link
Author

Jrenk commented Dec 6, 2022

The stack trace points to the sendMessage function. As I already mentioned this is the only place where the WriteJSON or any other Write of the gorilla package gets called. So there is no possibility of the application calling WriteJSON without calling mutex.Lock() beforehand. I don't think this issue should be closed.

@garyburd garyburd reopened this Dec 6, 2022
@FZambia
Copy link
Contributor

FZambia commented Dec 8, 2022

@Jrenk hello, could you provide a reproducing example/test/benchmark then to make it clear that it's an issue of this package?

@NomNomCameron
Copy link

@Jrenk I'm running into a similar situation. I have a loop in a goroutine that writes to an incoming socket connection for the heartbeat response. Before each write, I lock a common mutex and unlock after the write. The issue seems like it may be tied to a race condition as the issue doesn't happen consistently. Did you figure out what was causing your issue?

@Jrenk
Copy link
Author

Jrenk commented Jul 18, 2023

@NomNomCameron No I didn't figure it out. The problem happened very rarely and it didn't occur for a very long time. I'm still not sure what causes this sadly.

@ayjayt
Copy link

ayjayt commented Jul 20, 2023

@Jrenk

Could it be someone pinged your websocket server and the default ping handler tried to write a pong while it was locked?

edit:I might test this later today, we'll see.

@ayjayt
Copy link

ayjayt commented Jul 20, 2023

I'm imagining a pull request where configuring the websockets server would ask if you want to stick a lock or mutex or something a couple layers deep. Any thoughts on that?

@Jrenk
Copy link
Author

Jrenk commented Jul 20, 2023

@ayjayt that could actually be the problem. Let me know if you tested this. A lock in the package itself could resolve this as you said.

@ayjayt
Copy link

ayjayt commented Jul 20, 2023

Yeah, I mean it has to be it. I might test it, but you should use

Conn.SetPingHandler(func(string) error) anyway to protect against race conditions.

Here is what the default handler looks like:

h = func(message string) error {
	err := c.WriteControl(PongMessage, []byte(message), time.Now().Add(writeWait))
	if err == ErrCloseSent {
		return nil
	} else if e, ok := err.(net.Error); ok && e.Temporary() {
		return nil
	}
	return err
}

from:

websocket/conn.go

Line 1158 in 9111bb8

func (c *Conn) SetPingHandler(h func(appData string) error) {

@ayjayt
Copy link

ayjayt commented Jul 20, 2023

@coreydaley is this not a bug? If I'm correct, anyone who uses any *Write*() function at anytime might collide with the WriteControl() called by the default handler and produce a concurrent write error. I am not confident in this!

edit: especially since most people are going to run their read loops and write loops in separate go routines, and the default handler is actually called automatically by NextReaders()'s advanceFrame() call

@FZambia
Copy link
Contributor

FZambia commented Jul 20, 2023

From Gorilla WebSocket docs:

Connections support one concurrent reader and one concurrent writer.

Applications are responsible for ensuring that no more than one goroutine calls the write methods (NextWriter, SetWriteDeadline, WriteMessage, WriteJSON, EnableWriteCompression, SetCompressionLevel) concurrently and that no more than one goroutine calls the read methods (NextReader, SetReadDeadline, ReadMessage, ReadJSON, SetPongHandler, SetPingHandler) concurrently.

The Close and WriteControl methods can be called concurrently with all other methods.

As I mentioned above - if you think it's a bug in a package you better provide a reproducer instead of guessing. You can use https://pkg.go.dev/golang.org/x/tools/cmd/stress tool to increase the chance of finding the reliable reproducer.

@ayjayt
Copy link

ayjayt commented Jul 20, 2023

if you think it's a bug in a package you better provide a reproducer instead of guessing.

If you want me to spend my time working on this project, and building something to reproduce this bug, you better be nice to me!

We're having a discussion before spending time writing code.

[contributors note: redaction was not needed]

@ghost
Copy link

ghost commented Jul 21, 2023

anyone who uses any Write() function at anytime might collide with the WriteControl() called by the default handler and produce a concurrent write error.

The websocket connection uses a mutex (implemented using a channel) to prevent WriteControl from colliding with the *Write* methods. The websocket connection write methods acquire the mutex before executing any write related methods on the network connection (SetWriteDeadline and Write).

(Why the websocket connection uses a channel instead of a sync.Mutex is a puzzle for another day.)

@coreydaley
Copy link
Contributor

Please keep all comments and conversations civil, please keep in mind that Github users are from all over the world and speak hundred of different languages, and their native languages are not english, but they are being gracious enough to use english here. So do not assume that anything is being said with a specific tone or meaning. Disrespectful language will not be tolerated in this community.

We are almost done with the project transition and then we will begin triaging and working on issues, but we are not quite to that point yet. If you think that you have found a bug and would like to submit a pull request, please do so and we will review it as part of triaging/addressing this issue.

@ayjayt
Copy link

ayjayt commented Jul 21, 2023

If I'm going to be chastised for asking someone to speak respectfully to me, I will simply not participate here further.

@coreydaley
Copy link
Contributor

If I'm going to be chastised for asking someone to speak respectfully to me, I will simply not participate here further.

That is your prerogative. This is not the first comment within your short time of contributing to this project that could be deemed disrespectful to someone who has disagreed with you or questioned you: #841 (comment)

As far as I am concerned that is two strikes. One more and we will have to discuss blocking you from interacting with the organziation.

@jaitaiwan
Copy link
Member

@e9x That sort of comment will not be tolerated. I sincerely hope that you are able to find help that allows you to be a more civil member of society.

@coreydaley
Copy link
Contributor

Not that it will do any good but I reported the comment from @e9x, not sure what I did to elicit such a comment.

@gorilla gorilla deleted a comment from e9x May 5, 2024
@gorilla gorilla deleted a comment from e9x May 13, 2024
@jaitaiwan
Copy link
Member

Alright, I'm very tired, but I think I've reviewed this ticket enough to come to the conclusion that this should technically be considered a bug. While the documentation does specify that it supports only one reader/writer doing so concurrently at a time, it doesn't make it clear that if you're doing your own concurrent writes you'd also have to overwrite the ping handler.

In this case the appropriate solution is likely to document the need for the ping handler being specified if performing your own concurrent writing and to have this resolved in what ever future version of this library.

@hulkingshtick
Copy link

hulkingshtick commented Sep 1, 2024

if you're doing your own concurrent writes you'd also have to overwrite the ping handler.

That's not true. WriteControl, the method used by the ping handler, can be called concurrently with the other write methods. The connection is implemented to support the concurrent call. The documentation states that the concurrent call is allowed.

The concurrent write to websocket connection panic cannot be triggered by the WriteControl method because the WriteControl method does not directly or indirectly call the flushFrame method where the concurrency detector is implemented. The concurrent write to websocket connection panic is caused by concurrent application calls to WriteMessage, WriteJSON and NextWriter, not the ping handler.

Edit: Previous comments (1, 2) also explain why the ping handler is not a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

8 participants