[NVIDIA - STABLE] Getting kicked back to SDDM Greeter after login #1704

Open
NekroSomnia opened this issue Sep 26, 2024 · 20 comments
Labels
bug Something isn't working

Comments

@NekroSomnia

Describe the bug

After login, I get kicked back to the greeter/lock screen on TTY1.
If I try to log in again, the desktop won't load and the screen stays black.

What did you expect to happen?

I expected to have a desktop that doesn't kick me out after a minute.

Output of rpm-ostree status

State: idle
Deployments:
● ostree-unverified-registry:ghcr.io/ublue-os/bazzite-nvidia:stable
                   Digest: sha256:158bbced9de484d9e6a3acca9534be77d0becab8e7d4d828a75880024ef53340
                  Version: 40.20240922.0 (2024-09-23T05:05:17Z)
                Initramfs: regenerate

  ostree-unverified-registry:ghcr.io/ublue-os/bazzite-nvidia:stable
                   Digest: sha256:e447992949d4508d573ddce67fd2669aef87cc98efe2fe44312db54d052b5aeb
                  Version: 40.20240914.0 (2024-09-15T21:06:47Z)
                Initramfs: regenerate

Hardware

No response

Extra information or context

After getting kicked back to the greeter, I either have to reboot from a shell on TTY3 or higher, or kill the desktop session on TTY2 manually using loginctl kill-session {ID}.

I've noticed that the SDDM greeter session usually gets destroyed after a successful login (when this issue doesn't happen).
This is not the case when I get booted out of my session.
In that case, the greeter session persists with the state "online", while the desktop session on TTY2 has the state "closing".

After killing the "defective" session, another login attempt results in a working desktop without any interruptions or unexpected lock screens.
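
The difference between a working and a failed login described above can be checked from a TTY without reading a screenshot. The following is only a minimal sketch, assuming that `loginctl show-session` exposes the `Name`, `TTY` and `State` properties on the installed systemd version; `session_states` is a made-up helper name, not part of systemd.

```shell
# Print "ID USER TTY STATE" for every session known to logind.
# session_states is a hypothetical helper name used for illustration.
session_states() {
    loginctl list-sessions --no-legend | while read -r id rest; do
        [ -n "$id" ] || continue
        user=$(loginctl show-session "$id" -p Name --value)
        tty=$(loginctl show-session "$id" -p TTY --value)
        state=$(loginctl show-session "$id" -p State --value)
        # Sessions without a TTY are shown with "-" in that column.
        printf '%s %s %s %s\n' "$id" "$user" "${tty:--}" "$state"
    done
}
```

Going by the report above, a failed login would be expected to show the greeter session on tty1 as "online" and the desktop session on tty2 as "closing".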

Attached are two photos:
one (quite blurry but readable) image of the output from loginctl list-sessions after a failed login, and one of the output after a working login.
BotchedLogin_Censored
WorkingLogin_Censored

@dosubot dosubot bot added the bug Something isn't working label Sep 26, 2024
@Moshugan

I have the exact same issue. Same Bazzite version too (same rpm-ostree status output). The strange thing is that the desktop might not crash if Steam is running! If Steam is not running, it will certainly crash sooner rather than later. The Discover app might also contribute to the crashing.

Other weird behaviors include the inability to update certain "freedesktop platform" parts. Discover didn't want to update them, and running a system update did something to them but still prints weird warnings that I don't know what to do about.

error_01
error_02
error_03
error_04

@Moshugan

Okay, the Steam thing might be a coincidence. I did not do any of those things that you did, but I am successfully running Bazzite right now and my loginctl output is the same:

error_05

I'm not knowledgeable enough to understand this. I have no idea why it's working now but not at other times.

@mrdev023

Same problem

State: idle
Deployments:
● ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia-open:stable
                   Digest: sha256:7714620ce66e84806949720204c07da491b6b31bc6304a27ca620893bf1508b9
                  Version: 40.20240922.0 (2024-09-23T05:09:18Z)

  ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia-open:stable
                   Digest: sha256:0038d9bf78189ecf1505eeeaa22b7bf47fc39a8bdd2527c6b9a043ba4f99e14c
                  Version: 40.20240921.1 (2024-09-22T12:39:50Z)

@Moshugan

For some inexplicable reason my recent logins have been without issues.

@alec-petros

I am experiencing the same issue, though I've had mixed success in killing the tty2 session, returning to the sddm greeter and logging in again. Not sure if this is relevant, as this appears to happen even on a successful load, but I see this in plasmashell logs immediately following boot:

Sep 30 23:02:35 bazzite plasmashell[6040]: KPackageStructure of KPluginMetaData(pluginId:"dev.jhyub.supergfxctl", fileName: "/usr/share/plasma/plasmoids/dev.jhyub.supergfxctl/metadata.json") does not match requested format "Plasma/Applet"
Sep 30 23:02:35 bazzite plasmashell[6040]: kde.plasmashell: Aborting shell load: The activity manager daemon (kactivitymanagerd) is not running.
Sep 30 23:02:35 bazzite plasmashell[6040]: kde.plasmashell: If this Plasma has been installed into a custom prefix, verify that its D-Bus services dir is known to the system for the daemon to be activatable.
Sep 30 23:02:36 bazzite plasmashell[6040]: kde.plasmashell: Aborting shell load: The activity manager daemon (kactivitymanagerd) is not running.
Sep 30 23:02:36 bazzite plasmashell[6040]: kde.plasmashell: If this Plasma has been installed into a custom prefix, verify that its D-Bus services dir is known to the system for the daemon to be activatable.

Also possibly relevant, it seems that so far for me, a boot immediately following an update / new rpm-ostree deploy will tend to work correctly, and this bug occurs on following boots on the same deploy. I've noticed this offhand over the past couple of weeks of this issue popping up. I haven't tested this theory extensively yet, but last night I started rebooting from tty3, as mentioned in the initial report, to see if it would boot correctly. I got this crash-to-sddm bug about six or seven times in a row. Then, out of curiosity, I did an rpm-ostree update of an irrelevant package (the Spotify client) to trigger a new deploy, and the next reboot was successful.

@NekroSomnia
Author

Also possibly relevant, it seems that so far for me, a boot immediately following an update / new rpm-ostree deploy will tend to work correctly

I've noticed something similar, although it seems like a regular reboot will do the trick too.

I also noticed, from this and this Reddit post, that the issue might be related to the combination of Ryzen and Nvidia.

If everyone affected uses an AMD CPU with an Nvidia GPU, we might be on to something: maybe an incompatibility, maybe a red herring, but certainly something.

@Moshugan

Moshugan commented Oct 1, 2024

It happened to me again, totally randomly. Immediately after boot I checked loginctl list-sessions, and at first tty2 looked normal. Then I updated some flatpaks in Discover and watched a show on Netflix in Chrome for a little while. Then it just suddenly threw me to the greeter. I immediately opened tty4 and checked loginctl list-sessions, which gave me this:

0015 yoshi 1

I was able to do a new login that worked, following NekroSomnia's directions.

Also possibly relevant, it seems that so far for me, a boot immediately following an update / new rpm-ostree deploy will tend to work correctly

I've noticed something similar, although it seems like a regular reboot will do the trick too.

I also noticed, from this and this Reddit post, that the issue might be related to the combination of Ryzen and Nvidia.

If everyone affected uses an AMD CPU with an Nvidia GPU, we might be on to something: maybe an incompatibility, maybe a red herring, but certainly something.

I also do have a Ryzen 5 3600 CPU and a GeForce RTX 3070 GPU. Thanks for those posts! I hope this issue gets noticed by the devs.

@NekroSomnia
Author

I was able to do a new login that worked, following NekroSomnia's directions.

Glad that helped :D

I've exported my journal via journalctl --since today > ~/Desktop/journalctl-export.log and will go through it once I've got the time.
It's a long log file, so that is going to take some time, but it might shed some light on the issue.
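
A long export like that can be narrowed down before reading it line by line. This is a rough sketch only: the pattern list (kwin_wayland, sddm, setroubleshoot, AVC denials) is an assumption based on the messages that show up later in this thread, and `filter_session_log` is a hypothetical helper name.

```shell
# Print only the lines of an exported journal that mention the compositor,
# the greeter, or SELinux denials. The pattern list is an assumption and
# may need adjusting for other setups.
filter_session_log() {
    grep -E 'kwin_wayland|sddm|setroubleshoot|avc: +denied' -- "$1"
}
```

For example, `filter_session_log ~/Desktop/journalctl-export.log | less` would page through the matching lines of the export created above.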

I also do have a Ryzen 5 3600 CPU and a GeForce RTX 3070 GPU.

That's good to know, I hope we are onto something here.

@NekroSomnia
Author

Little update:
I had a quick look at the log this morning and found the following lines right before I get disconnected from the active session:

Oct 02 10:35:04 COMPUTER.DOMAIN.NAME setroubleshoot[8761]: SELinux is preventing kwin_wayland from 'read, write' accesses on the chr_file nvidia-modeset.

and

Oct 02 10:35:05 COMPUTER.DOMAIN.NAME sddm-helper-start-wayland[8342]: "kwin_wayland_drm: Presentation failed! Invalid argument\n"
Oct 02 10:35:05 COMPUTER.DOMAIN.NAME sddm-helper-start-wayland[8342]: "kwin_core: Applying output config failed!\n"
Oct 02 10:35:05 COMPUTER.DOMAIN.NAME sddm-helper-start-wayland[8342]: "kwin_wayland_drm: Presentation failed! Permission denied\n"

Note that there is a hint to run sealert -l fced9120-1a43-4615-b5c3-66eae81adbc2 for more information.
So I did.
This is the output:

SELinux is preventing kwin_wayland from 'read, write' accesses on the chr_file nvidia-modeset.

*****  Plugin device (91.4 confidence) suggests   ****************************

If you want to allow kwin_wayland to have read write access on the nvidia-modeset chr_file
Then you need to change the label on nvidia-modeset to a type of a similar device.
Do
# semanage fcontext -a -t SIMILAR_TYPE 'nvidia-modeset'
# restorecon -v 'nvidia-modeset'

*****  Plugin catchall (9.59 confidence) suggests   **************************

If you believe that kwin_wayland should be allowed read write access on the nvidia-modeset chr_file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'kwin_wayland' --raw | audit2allow -M my-kwinwayland
# semodule -X 300 -i my-kwinwayland.pp


Additional Information:
Source Context                system_u:system_r:xdm_t:s0-s0:c0.c1023
Target Context                system_u:object_r:device_t:s0
Target Objects                nvidia-modeset [ chr_file ]
Source                        kwin_wayland
Source Path                   kwin_wayland
Port                          <Unknown>
Host                          COMPUTER.DOMAIN.NAME
Source RPM Packages           
Target RPM Packages           
SELinux Policy RPM            <Unknown>
Local Policy RPM              <Unknown>
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     COMPUTER.DOMAIN.NAME
Platform                      Linux COMPUTER.DOMAIN.NAME
                              6.9.12-205.fsync.fc40.x86_64 #1 SMP
                              PREEMPT_DYNAMIC Thu Aug 22 20:33:26 UTC 2024
                              x86_64
Alert Count                   326
First Seen                    2024-08-05 23:46:34 CEST
Last Seen                     2024-10-02 10:43:14 CEST
Local ID                      fced9120-1a43-4615-b5c3-66eae81adbc2

Raw Audit Messages
type=AVC msg=audit(1727858594.192:9961): avc:  denied  { read write } for  pid=6984 comm="maliit-keyboard" name="nvidia-modeset" dev="devtmpfs" ino=1458 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:device_t:s0 tclass=chr_file permissive=0


Hash: kwin_wayland,xdm_t,device_t,chr_file,read,write

I have never had to troubleshoot anything to do with SELinux, but as far as I understand this, it seems like SELinux is blocking kwin_wayland from reading and writing the nvidia-modeset device it uses for DRM (DRM = Direct Rendering Manager, not Digital Rights Management).

I'll try to figure out how to allow that without setting SELinux to permissive mode, after I've pulled a backup of my drive.
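
To read the raw audit message from the sealert output more easily, the key=value pairs can be pulled out one at a time. This is a small sketch under the assumption that the AVC record is a single line of space-separated key=value tokens, as in the sample above; `avc_field` is a made-up helper name.

```shell
# Read AVC records on stdin and print the value of one key=value field,
# for example comm, name, scontext, tcontext or tclass.
# avc_field is a hypothetical helper name used for illustration.
avc_field() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")
            if (n > 0 && substr($i, 1, n - 1) == key) {
                val = substr($i, n + 1)
                gsub(/"/, "", val)   # drop the quotes around string values
                print val
            }
        }
    }'
}
```

Piping the raw message quoted above through `avc_field comm` gives maliit-keyboard, and `avc_field tcontext` gives system_u:object_r:device_t:s0. So in that particular record the denied process is maliit-keyboard rather than kwin_wayland, both under the xdm_t source context, which may mean more than one greeter process runs into the same denial.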

@NekroSomnia
Author

It seems like I accidentally fixed my issue.

I had to reset my CMOS yesterday after installing more RAM, since I got some weird POST issues (too many sticks at too high a frequency; the memory controller wasn't having it). Now the issue seems to be gone.
I replicated the BIOS settings I had before, just to see if the issue would pop up again. But no, it seems like my problems just vanished.

I should be happy about that, but the fact that I don't know what caused the problems annoys me to no end.

@Moshugan

Moshugan commented Oct 3, 2024

It seems like I accidentally fixed my issue.

I had to reset my CMOS yesterday after installing more RAM, since I got some weird POST issues (too many sticks at too high a frequency; the memory controller wasn't having it). Now the issue seems to be gone. I replicated the BIOS settings I had before, just to see if the issue would pop up again. But no, it seems like my problems just vanished.

I should be happy about that, but the fact that I don't know what caused the problems annoys me to no end.

Aha!! So it might be some kind of issue related to CMOS? You know what, I've recently had the problem that the clock has been wrong every time I've booted up Windows 10! The clock on Linux has been right from bootup, but I guess Linux just syncs it online immediately, unlike Windows, where I had to manually choose to sync the clock. I've suspected that there's something wrong with the CMOS battery but haven't got around to doing anything about it yet. So if a dying CMOS battery is causing the wrong time on Windows, then maybe it's causing this issue on Bazzite?

BTW, thank you very much for doing all this work on this issue!

@tarus13

tarus13 commented Oct 13, 2024

I’d like to add to this conversation. I have an NVIDIA 4060 and an i5 13400 and am experiencing this same issue. It’s completely random, and only shutting down and restarting the machine resolves it.

@mrdev023

I actually get this issue a lot 🥲 Any news about this?

@NekroSomnia
Author

I actually get this issue a lot 🥲 Any news about this?

Yes and no.
The discussion on Discord died down.
Hikari thinks it might be an issue on Nvidia's end, and they don't see a solution for it.

The issue was gone for almost two weeks on my end, but it came back a few days ago, even worse than before.
My original workaround (killing the dying session from a TTY) does not work for me anymore.
I have to log out before the session terminates itself. After that, I can log in again.
I'm seriously contemplating ripping the M.2 with Bazzite out of my system and installing a fresh copy on a spare drive I've got collecting dust,
just to see if this also happens on the most recent ISO.
I was kind of hoping to avoid that, since it's a hassle to get to that specific drive.

@Mattheish

I've been having this issue for quite some weeks; I'm not sure precisely when it started. When you get kicked back to SDDM, the issues start when logging in again.

My current solution is, when the login screen shows, to not log in but just switch to another TTY and then switch back immediately with Ctrl+Alt+F2. That drops me back to the desktop, and no further issues occur for me until the next reboot.

This works for me. Of course there's no guarantee for anyone else, but it may be worth a try.

@mrdev023

I've been having this issue for quite some weeks; I'm not sure precisely when it started. When you get kicked back to SDDM, the issues start when logging in again.

My current solution is, when the login screen shows, to not log in but just switch to another TTY and then switch back immediately with Ctrl+Alt+F2. That drops me back to the desktop, and no further issues occur for me until the next reboot.

This works for me. Of course there's no guarantee for anyone else, but it may be worth a try.

Yes, I did that, but after switching I can't connect a Bluetooth device or listen to music over Bluetooth.

@NekroSomnia
Author

My current solution is, when the login screen shows, to not log in but just switch to another TTY and then switch back immediately with Ctrl+Alt+F2. That drops me back to the desktop, and no further issues occur for me until the next reboot.

This works for me. Of course there's no guarantee for anyone else, but it may be worth a try.

I'd advise against doing that. Your session will be in a "closing" state, in which many essential services are either shut down or in the process of being shut down.
That includes, as far as I know, some PAM services. Those are required to do things like log on to Wi-Fi, connect to network shares, and more.
I don't know if security might be compromised in that state, but I wouldn't risk it, to be honest.
You can verify your session state by running loginctl list-sessions from a terminal.
If your session has the state "closing", just log out and log back in. This currently solves the issue for me.

If it doesn't solve the issue, you can always switch to TTY3, run loginctl list-sessions, note the session that has the "closing" state (usually session 2), and kill it using loginctl kill-session $number. After that, switch to TTY1 (this is where the greeter/login window is spawned) and log in normally again.
This was the first workaround I came up with, and it worked for me (and some others) for a while. For some reason, it refuses to work for me at this point.
If I try it now, I get no display output at all and the system seems to freeze completely. Ctrl+Alt+Del to force a reboot still works. Just keep that in mind if you try this yourself.
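
The TTY3 workaround described above could be scripted roughly as follows. Treat it as a sketch under assumptions: it relies on `loginctl show-session` reporting `State`, the helper name `kill_closing_sessions` and its `--yes` switch are invented here, and it only prints what it would do unless `--yes` is passed. As noted above, killing the session has also led to a frozen system, so the same caution applies.

```shell
# Find logind sessions in the "closing" state and kill them.
# Dry run by default; pass --yes to actually call loginctl kill-session.
# kill_closing_sessions and --yes are hypothetical names for illustration.
kill_closing_sessions() {
    mode=${1:-}
    loginctl list-sessions --no-legend | while read -r id rest; do
        [ -n "$id" ] || continue
        state=$(loginctl show-session "$id" -p State --value)
        [ "$state" = "closing" ] || continue
        if [ "$mode" = "--yes" ]; then
            # May need elevated privileges when run as a different user.
            loginctl kill-session "$id" </dev/null
        else
            printf 'would run: loginctl kill-session %s\n' "$id"
        fi
    done
}
```

Running `kill_closing_sessions` without arguments first shows which session would be affected before anything is terminated.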

The workaround from the Bazzite Discord is entirely different. They are fairly certain that the issue relates to Nvidia GPUs and FreeSync displays, and they recommend unplugging your FreeSync display before boot to circumvent the issue entirely.
I haven't tested that one myself, so I can't judge how well it works.

Yes, I did that, but after switching I can't connect a Bluetooth device or listen to music over Bluetooth.

This is expected, because the PAM service that handles authentication is usually shut down at that point. If you try to connect to a Wi-Fi network, you should get an error stating that.
I found that out the hard way too.

@Trezamere

Trezamere commented Oct 31, 2024

I have this too, and disabling SELinux is the only reliable (albeit undesirable) solution. I will try to post more detailed information later.

I haven't had a chance to dig into the labeling yet, but hopefully someone on the team is actually looking at this rather than saying "nvidia broke it". Not being able to log in or use the system is a pretty big issue, and most users probably won't realize or know how to disable SELinux...
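
For anyone who wants to look at the labeling first, one possible starting point is to compare the label a device node currently has with the label the loaded policy would assign to that path. The sketch below assumes GNU `stat` (for `%C`) and `matchpathcon` from the libselinux utilities are installed; `selinux_type` and `check_label` are hypothetical helper names, and the sketch makes no claim about which type is the correct one.

```shell
# Extract the type (third colon-separated field) from an SELinux context.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Compare the current label of a path with what the loaded policy expects.
# Returns 0 on a match, 1 on a mismatch, 2 if a lookup fails.
# check_label is a hypothetical helper name used for illustration.
check_label() {
    actual=$(stat -c %C "$1") || return 2
    expected=$(matchpathcon -n "$1") || return 2
    actual_type=$(selinux_type "$actual")
    expected_type=$(selinux_type "$expected")
    if [ "$actual_type" = "$expected_type" ]; then
        echo "ok: $1 is $actual_type"
    else
        echo "mismatch: $1 is $actual_type, policy expects $expected_type"
        return 1
    fi
}
```

For example, `check_label /dev/nvidia-modeset` would report whether the device_t target context from the sealert output earlier in this thread differs from what the policy specifies for that path.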

@Pixelguin

Pixelguin commented Oct 31, 2024

I had this issue with any build past bazzite-nvidia:stable-40.20240901.0, but rebasing to bazzite-nvidia-open:stable fixed it.

It might be worth trying the new branch if you're using a newer (GTX 16 / RTX) card.
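
A rebase like that would presumably look something like the command printed by this sketch. The image reference follows the ostree-image-signed refs shown in the rpm-ostree status output earlier in this thread; `rebase_command` is a made-up helper that only prints the command so it can be reviewed before running it.

```shell
# Print (not run) an rpm-ostree rebase command for a Bazzite image variant.
# The registry path mirrors the refs from the rpm-ostree status output
# posted in this thread; rebase_command is a hypothetical helper name.
rebase_command() {
    variant=${1:-bazzite-nvidia-open}
    tag=${2:-stable}
    printf 'rpm-ostree rebase ostree-image-signed:docker://ghcr.io/ublue-os/%s:%s\n' \
        "$variant" "$tag"
}
```

If the new deployment misbehaves after the reboot, `rpm-ostree rollback` switches back to the previous one.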

@Moshugan

Moshugan commented Nov 7, 2024

Hello again! I thought I should report that I haven't encountered this issue at all over the last couple of system updates. I don't have anything else useful to say about it, but I would like to know if anyone else has had the issue stop after updating to version 41.20241030 or version 41.20241104?

8 participants