Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sway/Wayland/wlroots instant crash #896

Open
djvs opened this issue Jan 8, 2024 · 15 comments
Open

Sway/Wayland/wlroots instant crash #896

djvs opened this issue Jan 8, 2024 · 15 comments

Comments

@djvs
Copy link

djvs commented Jan 8, 2024

Been observing crashes in sway for a while (wlroots-based Wayland compositor). This appears to be due to a rendering bug in snes9x's handling of Wayland surfaces. Attached output of WAYLAND_DEBUG=1 ./snes9x-gtk 2> output.txt from last night's nightly release of snes9x-gtk for Linux.

Prior to it totally crashing, I was observing a bug where resizing the snes9x-gtk window would manage to crash the entire "sway" compositor.

output.txt

@djvs
Copy link
Author

djvs commented Jan 8, 2024

I don't seem to have the same crashes with snes9x-x11. Although it doesn't let me resize the window.

@bearoso
Copy link
Collaborator

bearoso commented Jan 8, 2024

I've tried with various wayland compositors, and I'm not getting any crashes. Gnome, KDE, Wayfire, Sway, all work fine when resizing.

I've tried with various fractional scales as well, since your output indicates that the coordinates for the source buffer in the viewport extension are the cause of the crash. Still nothing.

Could you give more information about your system configuration? You say you're using a nightly release of Snes9x, but there aren't any official nightly releases. Did you compile it yourself or download an existing build?

@djvs
Copy link
Author

djvs commented Jan 8, 2024

I install it from the links on this repo:

https://github.com/snes9xgit/snes9x -> http://cirrus-ci.com/github/snes9xgit/snes9x -> http://cirrus-ci.com/build/6647216182394880 -> http://cirrus-ci.com/task/6158874105872384 -> (then the txz file under "Artifacts" on that page)

Sway is at bf2b79b2845a070d84aafaf95e6cfcf9af7eeb9b. Video card is AMD Radeon RX 6700 XT. Linux kernel 6.6.9 (artix sans systemd).

I went on my laptop just now, opened up Hyprland instead of Sway, same issue (source rectangle out of buffer bounds). That one has AMD Radeon 680M.

@djvs
Copy link
Author

djvs commented Jan 8, 2024

And it's the same issue with a stable release anyway. 1.62.3-1 in the Arch repos.

@bearoso
Copy link
Collaborator

bearoso commented Jan 9, 2024

I just tried with Sway and wlroots git and I got your crash. The stable releases are fine. It must be a regression they introduced.

@djvs
Copy link
Author

djvs commented Jan 9, 2024

Oh I forgot the last link:

hyprwm/Hyprland#4158

According to them, wlroots implemented stricter checks, but this is erroneous behavior from the app ("osu" in this case).

@djvs
Copy link
Author

djvs commented Jan 9, 2024

I am able to do WAYLAND_DISPLAY=0 snes9x-gtk, which probably forces it through x11-wayland or something. Just leaving a note for myself or others, as a potential workaround.

@bearoso
Copy link
Collaborator

bearoso commented Jan 9, 2024

I see what they're doing wrong. They're bounds-checking the subsurface's buffer with the size of the buffer attached to the parent surface. In this case, the subsurface is larger than the parent surface, but that's within Wayland spec.

@djvs
Copy link
Author

djvs commented Jan 9, 2024

Is there something I should do to follow up?

@vyivel
Copy link

vyivel commented Jan 9, 2024

[3073398.789]  -> [email protected]_viewport(new id wp_viewport@48, wl_surface@36)
...
[3073467.211]  -> [email protected]_source(0.00000000, 0.00000000, 1805.00000000, 993.00000000)
[3073467.214]  -> [email protected]_destination(1805, 993)
[3073467.217]  -> [email protected]()
...
[3073476.297]  -> [email protected]_immed(new id wl_buffer@56, 256, 224, 808669784, 0)
[3073476.300]  -> [email protected]()
[3073476.303]  -> [email protected](wl_buffer@56, 0, 0)
[3073476.306]  -> [email protected](0, 0, 2147483647, 2147483647)
[3073476.308]  -> [email protected]()

1805x993 viewport src is committed, then a 256x224 buffer is attached and committed.

@vyivel
Copy link

vyivel commented Jan 9, 2024

I can also reproduce the crash on Weston and Smithay's Anvil (though with the latter it's inconsistent for some reason).

@bearoso
Copy link
Collaborator

bearoso commented Jan 9, 2024

[3073398.789]  -> [email protected]_viewport(new id wp_viewport@48, wl_surface@36)
...
[3073467.211]  -> [email protected]_source(0.00000000, 0.00000000, 1805.00000000, 993.00000000)
[3073467.214]  -> [email protected]_destination(1805, 993)
[3073467.217]  -> [email protected]()
...
[3073476.297]  -> [email protected]_immed(new id wl_buffer@56, 256, 224, 808669784, 0)
[3073476.300]  -> [email protected]()
[3073476.303]  -> [email protected](wl_buffer@56, 0, 0)
[3073476.306]  -> [email protected](0, 0, 2147483647, 2147483647)
[3073476.308]  -> [email protected]()

1805x993 viewport src is committed, then a 256x224 buffer is attached and committed.

But the log is wrong. It indicates a viewport is requested for the parent surface (36), when in actuality the request was for the child, subsurface (31)

viewport = wp_viewporter_get_viewport(viewporter, child);
So something is changing the request for the child into a request for the parent.

@vyivel
Copy link

vyivel commented Jan 9, 2024

[3073398.726] -> [email protected]_subsurface(new id wl_subsurface@45, wl_surface@36, wl_surface@31)

wl_surface@31 is the parent. The signature of wl_subcompositor.get_subsurface is

<arg name="id" type="new_id" interface="wl_subsurface" summary="the new sub-surface object ID"/>
<arg name="surface" type="object" interface="wl_surface" summary="the surface to be turned into a sub-surface"/>
<arg name="parent" type="object" interface="wl_surface" summary="the parent surface"/>

@bearoso
Copy link
Collaborator

bearoso commented Jan 9, 2024

Whoops. You're correct.

Regardless, I've just committed a change to always use the whole buffer as a source, which should have been done in the first place. There's still some funkiness going on behind the API with Vulkan surfaces, but I'm going to try to reorder some of the resize stuff and see if I can make it cleaner.

@djvs
Copy link
Author

djvs commented Jan 9, 2024

Reinstalled from git, that appears to have fixed it. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants