fix: handle incomplete UTF-8 sequences and add test for reproduction #1166

monochromegane

I'd like to start by thanking you for releasing such an amazing TUI framework 🧋.
This pull request changes input handling so that an incomplete UTF-8 sequence at the end of a read is detected and completed before it is decoded.

Background

Currently, when a read ends in the middle of a multibyte UTF-8 character, the partial bytes are reported as unknownInputByteMsg instead of being delivered as part of a tea.KeyMsg. As a result, the character is corrupted and cannot be input correctly. Fortunately, UTF-8 encoding lets us determine from the first byte of a sequence whether more bytes are needed. We believe this can be resolved by invoking an additional read to complete the sequence.
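
To illustrate the lead-byte check described above, here is a minimal sketch. The helper names utf8SeqLen and missingBytes are hypothetical and are not taken from this pull request or from Bubble Tea; the sketch only shows how the number of bytes still needed could be derived from the tail of a buffer.

```go
package main

import "fmt"

// utf8SeqLen returns the total sequence length announced by lead byte b,
// or 0 if b is a continuation byte or not a valid UTF-8 lead byte.
// Hypothetical helper, for illustration only.
func utf8SeqLen(b byte) int {
	switch {
	case b < 0x80:
		return 1
	case b >= 0xC2 && b <= 0xDF:
		return 2
	case b >= 0xE0 && b <= 0xEF:
		return 3
	case b >= 0xF0 && b <= 0xF4:
		return 4
	default:
		return 0
	}
}

// missingBytes reports how many more bytes are needed to complete the
// last UTF-8 sequence in buf. It returns 0 when the tail is complete
// or when no lead byte is found in the last four bytes.
func missingBytes(buf []byte) int {
	for i := len(buf) - 1; i >= 0 && i >= len(buf)-4; i-- {
		b := buf[i]
		if b&0xC0 == 0x80 {
			continue // continuation byte, keep walking back
		}
		have := len(buf) - i
		if n := utf8SeqLen(b); n > have {
			return n - have
		}
		return 0
	}
	return 0
}

func main() {
	// A read that ends with a lone 3-byte lead byte.
	fmt.Println(missingBytes([]byte{0xe4, 0xb8, 0x80, 0xe4}))
}
```

A reader loop could call such a function after each read and, when the result is non-zero, read again before decoding. Whether the actual change in this pull request is structured this way is not shown here.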

Reproduction

This issue can occasionally be reproduced by repeatedly inputting multiple multibyte characters using the code below. My environment is macOS 14.6.1, go version go1.23.1 darwin/arm64, tmux 3.4.

package main

import (
	"fmt"
	"log"
	"strings"

	tea "github.com/charmbracelet/bubbletea"
)

type model struct {
	msgs []string
}

func (m model) Init() tea.Cmd { return nil }

func (m model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	m.msgs = append(m.msgs, fmt.Sprintf("%T %#v", msg, msg))
	switch msg := msg.(type) {
	case tea.KeyMsg:
		switch msg.Type {
		case tea.KeyCtrlC:
			return m, tea.Quit
		}
	}
	return m, nil
}

func (m model) View() string {
	return strings.Join(m.msgs, "\n")
}

func main() {
	prog := tea.NewProgram(model{msgs: []string{}})
	if _, err := prog.Run(); err != nil { // Run replaces the deprecated Start
		log.Fatal(err)
	}
}

The log during reproduction is as follows.
I am repeatedly inputting 一二三 (one, two, three in Japanese). After several inputs, part of the input is reported as unknownInputByteMsg.

$ go run .
tea.WindowSizeMsg tea.WindowSizeMsg{Width:117, Height:25}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968}, Alt:false, Paste:false}
tea.unknownInputByteMsg 0xe4
tea.unknownInputByteMsg 0xba
tea.unknownInputByteMsg 0x8c
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19977}, Alt:false, Paste:false}

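This is because the expected 9 bytes (3 characters x 3 bytes) are read in two parts: 0xe4, 0xb8, 0x80, 0xe4 and then 0xba, 0x8c, 0xe4, 0xb8, 0x89. The first read ends with 0xe4, the first byte of 二, and the second read begins with its remaining two bytes, so none of the three bytes can be decoded on its own.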
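
The split can be checked with the standard library alone. The sketch below is not Bubble Tea's input parser; decodeChunk is a hypothetical helper that decodes a chunk rune by rune with utf8.DecodeRune and collects the bytes that fail to decode.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeChunk decodes b rune by rune. It returns the runes that decoded
// successfully and the individual bytes that did not.
// Hypothetical helper, for illustration only.
func decodeChunk(b []byte) (runes []rune, bad []byte) {
	for len(b) > 0 {
		r, size := utf8.DecodeRune(b)
		if r == utf8.RuneError && size == 1 {
			bad = append(bad, b[0]) // invalid or truncated sequence
		} else {
			runes = append(runes, r)
		}
		b = b[size:]
	}
	return runes, bad
}

func main() {
	first := []byte{0xe4, 0xb8, 0x80, 0xe4}
	second := []byte{0xba, 0x8c, 0xe4, 0xb8, 0x89}

	// Decoded separately, the bytes of the middle character are lost.
	r1, bad1 := decodeChunk(first)
	r2, bad2 := decodeChunk(second)
	fmt.Printf("%q bad=% x\n", r1, bad1)
	fmt.Printf("%q bad=% x\n", r2, bad2)

	// Decoded as one buffer, all three characters are recovered.
	joined, badJoined := decodeChunk(append(first, second...))
	fmt.Printf("%q bad=% x\n", joined, badJoined)
}
```

Decoded separately, each chunk yields one rune and the bytes 0xe4, 0xba, 0x8c are rejected, which matches the log above. Decoded as a single buffer, all three runes are recovered.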

Fix

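With this fix, characters are no longer lost. As the log below shows, a sequence that is split across reads may arrive in separate tea.KeyMsg values, but every rune is intact.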

$ go run .
tea.WindowSizeMsg tea.WindowSizeMsg{Width:117, Height:25}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{20108, 19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19977}, Alt:false, Paste:false}
tea.KeyMsg tea.KeyMsg{Type:-1, Runes:[]int32{19968, 20108, 19977}, Alt:false, Paste:false}
