View Issue Details

ID: 0001022
Project: channel: elrepo/el8
Category: kmod-nvidia
View Status: public
Last Update: 2020-09-09 20:56
Reporter: mroche
Assigned To: pperry
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Summary: 0001022: Upgrade to 450.57 from 440.100 breaks gnome-session
Description: After upgrading from 440.100 on RHEL 8.2 (latest), gdm no longer starts properly and fails completely. During one set of boots I was able to capture some SELinux denials, but a second upgrade attempt after a rollback produced no output at all. Below are the audit.log contents specific to the issue, along with the journal output from the first attempt.

The visible symptom is that GDM tries to load and then shows an all-white screen saying something has gone wrong, with only a button to log out.

Rolling back to 440.100 as it's still stable on my system.

audit.log
type=AVC msg=audit(1594585929.767:76): avc: denied { map } for pid=2248 comm="gnome-session-c" path=2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429 dev="tmpfs" ino=37368 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:xserver_tmpfs_t:s0 tclass=file permissive=0
type=AVC msg=audit(1594585929.855:77): avc: denied { map } for pid=2251 comm="gnome-session-c" path=2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429 dev="tmpfs" ino=37368 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:xserver_tmpfs_t:s0 tclass=file permissive=0
type=AVC msg=audit(1594585934.899:78): avc: denied { map } for pid=2263 comm="gnome-session-c" path=2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429 dev="tmpfs" ino=37368 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:xserver_tmpfs_t:s0 tclass=file permissive=0
type=AVC msg=audit(1594585934.923:79): avc: denied { map } for pid=2264 comm="gnome-session-c" path=2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429 dev="tmpfs" ino=37368 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:xserver_tmpfs_t:s0 tclass=file permissive=0
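
The path= field in the records above is hex-encoded, which the audit subsystem does when a path contains characters such as spaces. As a minimal sketch (the function name decode_audit_path is made up for illustration, and `ausearch -i` is the usual tool for interpreting these fields), the value can be decoded with plain POSIX shell:

```shell
#!/bin/sh
# Decode a hex-encoded audit "path=" value into readable text.
# decode_audit_path is a hypothetical helper name, not part of any audit tool.
decode_audit_path() {
    hex=$1
    # Hex strings come in two-character byte pairs; reject odd lengths.
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    while [ -n "$hex" ]; do
        rest=${hex#??}         # everything after the first byte pair
        pair=${hex%"$rest"}    # the first byte pair itself
        # Convert the pair to a three-digit octal escape and let printf emit that byte.
        printf "\\$(printf '%03o' "0x$pair")"
        hex=$rest
    done
    printf '\n'
}

decode_audit_path 2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429
# prints: /memfd:/.nvidia_drv.XXXXXX (deleted)
```

The decoded value matches the file name that setroubleshoot reports in the attached journal output.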
Tags: No tags attached.
Attached Files
gnome-session-failure.txt (14,082 bytes)   
Jul 12 16:32:09 workstation gnome-session[2234]: X Error of failed request:  BadValue (integer parameter out of range for operation)
Jul 12 16:32:09 workstation gnome-session[2234]:   Major opcode of failed request:  150 (GLX)
Jul 12 16:32:09 workstation gnome-session[2234]:   Minor opcode of failed request:  3 (X_GLXCreateContext)
Jul 12 16:32:09 workstation gnome-session[2234]:   Value in failed request:  0x0
Jul 12 16:32:09 workstation gnome-session[2234]:   Serial number of failed request:  34
Jul 12 16:32:09 workstation gnome-session[2234]:   Current serial number in output stream:  35
Jul 12 16:32:09 workstation gnome-session[2234]: gnome-session-check-accelerated: GL Helper exited with code 256
Jul 12 16:32:09 workstation gnome-session-c[2251]: Failed to create EGL surface
Jul 12 16:32:09 workstation gnome-session[2234]: gnome-session-check-accelerated: GLES Helper exited with code 256
Jul 12 16:32:12 workstation dbus-daemon[1170]: [system] Activating service name='org.fedoraproject.Setroubleshootd' requested by ':1.204' (uid=0 pid=1113 comm="/usr/sbin/sedispatch " label="system_u:system_r:auditd_t:s0") (using servicehelper)
Jul 12 16:32:12 workstation dbus-daemon[2254]: [system] Failed to reset fd limit before activating service: org.freedesktop.DBus.Error.AccessDenied: Failed to restore old fd limit: Operation not permitted
Jul 12 16:32:13 workstation dbus-daemon[1170]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jul 12 16:32:14 workstation setroubleshoot[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted). For complete SELinux messages run: sealert -l 91d94d03-1d8e-4460-a031-b6aed025614b
Jul 12 16:32:14 workstation platform-python[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted).
                                                 
                                                 *****  Plugin restorecon (92.2 confidence) suggests   ************************
                                                 
                                                 If you want to fix the label. 
                                                 /memfd:/.nvidia_drv.XXXXXX (deleted) default label should be default_t.
                                                 Then you can run restorecon. The access attempt may have been stopped due to insufficient permissions to access a parent directory in which case try to change the following command accordingly.
                                                 Do
                                                 # /sbin/restorecon -v /memfd:/.nvidia_drv.XXXXXX (deleted)
                                                 
                                                 *****  Plugin catchall_boolean (7.83 confidence) suggests   ******************
                                                 
                                                 If you want to allow domain to can mmap files
                                                 Then you must tell SELinux about this by enabling the 'domain_can_mmap_files' boolean.
                                                 
                                                 Do
                                                 setsebool -P domain_can_mmap_files 1
                                                 
                                                 *****  Plugin catchall (1.41 confidence) suggests   **************************
                                                 
                                                 If you believe that gnome-session-c should be allowed map access on the .nvidia_drv.XXXXXX (deleted) file by default.
                                                 Then you should report this as a bug.
                                                 You can generate a local policy module to allow this access.
                                                 Do
                                                 allow this access for now by executing:
                                                 # ausearch -c 'gnome-session-c' --raw | audit2allow -M my-gnomesessionc
                                                 # semodule -X 300 -i my-gnomesessionc.pp
                                                 
Jul 12 16:32:14 workstation setroubleshoot[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted). For complete SELinux messages run: sealert -l 91d94d03-1d8e-4460-a031-b6aed025614b
Jul 12 16:32:14 workstation platform-python[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted).
                                                 
                                                 *****  Plugin restorecon (92.2 confidence) suggests   ************************
                                                 
                                                 If you want to fix the label. 
                                                 /memfd:/.nvidia_drv.XXXXXX (deleted) default label should be default_t.
                                                 Then you can run restorecon. The access attempt may have been stopped due to insufficient permissions to access a parent directory in which case try to change the following command accordingly.
                                                 Do
                                                 # /sbin/restorecon -v /memfd:/.nvidia_drv.XXXXXX (deleted)
                                                 
                                                 *****  Plugin catchall_boolean (7.83 confidence) suggests   ******************
                                                 
                                                 If you want to allow domain to can mmap files
                                                 Then you must tell SELinux about this by enabling the 'domain_can_mmap_files' boolean.
                                                 
                                                 Do
                                                 setsebool -P domain_can_mmap_files 1
                                                 
                                                 *****  Plugin catchall (1.41 confidence) suggests   **************************
                                                 
                                                 If you believe that gnome-session-c should be allowed map access on the .nvidia_drv.XXXXXX (deleted) file by default.
                                                 Then you should report this as a bug.
                                                 You can generate a local policy module to allow this access.
                                                 Do
                                                 allow this access for now by executing:
                                                 # ausearch -c 'gnome-session-c' --raw | audit2allow -M my-gnomesessionc
                                                 # semodule -X 300 -i my-gnomesessionc.pp
                                                 
Jul 12 16:32:14 workstation gnome-session[2234]: X Error of failed request:  BadValue (integer parameter out of range for operation)
Jul 12 16:32:14 workstation gnome-session[2234]:   Major opcode of failed request:  150 (GLX)
Jul 12 16:32:14 workstation gnome-session[2234]:   Minor opcode of failed request:  3 (X_GLXCreateContext)
Jul 12 16:32:14 workstation gnome-session[2234]:   Value in failed request:  0x0
Jul 12 16:32:14 workstation gnome-session[2234]:   Serial number of failed request:  34
Jul 12 16:32:14 workstation gnome-session[2234]:   Current serial number in output stream:  35
Jul 12 16:32:14 workstation gnome-session[2234]: gnome-session-check-accelerated: GL Helper exited with code 256
Jul 12 16:32:14 workstation gnome-session-c[2264]: Failed to create EGL surface
Jul 12 16:32:14 workstation gnome-session[2234]: gnome-session-check-accelerated: GLES Helper exited with code 256
Jul 12 16:32:14 workstation gnome-session[2234]: gnome-session-binary[2234]: WARNING: software acceleration check failed: Child process exited with code 1
Jul 12 16:32:14 workstation gnome-session-binary[2234]: WARNING: software acceleration check failed: Child process exited with code 1
Jul 12 16:32:18 workstation setroubleshoot[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted). For complete SELinux messages run: sealert -l 91d94d03-1d8e-4460-a031-b6aed025614b
Jul 12 16:32:18 workstation platform-python[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted).
                                                 
                                                 *****  Plugin restorecon (92.2 confidence) suggests   ************************
                                                 
                                                 If you want to fix the label. 
                                                 /memfd:/.nvidia_drv.XXXXXX (deleted) default label should be default_t.
                                                 Then you can run restorecon. The access attempt may have been stopped due to insufficient permissions to access a parent directory in which case try to change the following command accordingly.
                                                 Do
                                                 # /sbin/restorecon -v /memfd:/.nvidia_drv.XXXXXX (deleted)
                                                 
                                                 *****  Plugin catchall_boolean (7.83 confidence) suggests   ******************
                                                 
                                                 If you want to allow domain to can mmap files
                                                 Then you must tell SELinux about this by enabling the 'domain_can_mmap_files' boolean.
                                                 
                                                 Do
                                                 setsebool -P domain_can_mmap_files 1
                                                 
                                                 *****  Plugin catchall (1.41 confidence) suggests   **************************
                                                 
                                                 If you believe that gnome-session-c should be allowed map access on the .nvidia_drv.XXXXXX (deleted) file by default.
                                                 Then you should report this as a bug.
                                                 You can generate a local policy module to allow this access.
                                                 Do
                                                 allow this access for now by executing:
                                                 # ausearch -c 'gnome-session-c' --raw | audit2allow -M my-gnomesessionc
                                                 # semodule -X 300 -i my-gnomesessionc.pp
                                                 
Jul 12 16:32:18 workstation setroubleshoot[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted). For complete SELinux messages run: sealert -l 91d94d03-1d8e-4460-a031-b6aed025614b
Jul 12 16:32:18 workstation platform-python[2254]: SELinux is preventing gnome-session-c from map access on the file /memfd:/.nvidia_drv.XXXXXX (deleted).
                                                 
                                                 *****  Plugin restorecon (92.2 confidence) suggests   ************************
                                                 
                                                 If you want to fix the label. 
                                                 /memfd:/.nvidia_drv.XXXXXX (deleted) default label should be default_t.
                                                 Then you can run restorecon. The access attempt may have been stopped due to insufficient permissions to access a parent directory in which case try to change the following command accordingly.
                                                 Do
                                                 # /sbin/restorecon -v /memfd:/.nvidia_drv.XXXXXX (deleted)
                                                 
                                                 *****  Plugin catchall_boolean (7.83 confidence) suggests   ******************
                                                 
                                                 If you want to allow domain to can mmap files
                                                 Then you must tell SELinux about this by enabling the 'domain_can_mmap_files' boolean.
                                                 
                                                 Do
                                                 setsebool -P domain_can_mmap_files 1
                                                 
                                                 *****  Plugin catchall (1.41 confidence) suggests   **************************
                                                 
                                                 If you believe that gnome-session-c should be allowed map access on the .nvidia_drv.XXXXXX (deleted) file by default.
                                                 Then you should report this as a bug.
                                                 You can generate a local policy module to allow this access.
                                                 Do
                                                 allow this access for now by executing:
                                                 # ausearch -c 'gnome-session-c' --raw | audit2allow -M my-gnomesessionc
                                                 # semodule -X 300 -i my-gnomesessionc.pp
nvupload.tar.xz (13,624 bytes)

Activities

pperry

2020-07-12 16:34

administrator   ~0007037

Are you able to test in permissive mode or with SELinux temporarily disabled to establish if it's purely an SELinux issue or if there are other issues at play?

I'm not sure where to start troubleshooting this.

mroche

2020-07-13 20:26

reporter   ~0007038

Didn't realize uploading a file would wipe the text area... whoops. Essentially boiled down to this:

- Rebooted into multi-user and updated driver.
- Set default to graphical and set selinux to permissive. Rebooted
- Everything, GDM, etc loads up fine and I can log in and things seem to work.
- Set selinux to enforcing and reboot
- Error as previously described
- Reboot into permissive and things work

nvupload.tar.xz contains the audit.log snippets, journal snippets, and Xorg logs if that helps.
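
For anyone repeating the steps above, here is one possible way to switch the SELinux mode so that it survives a reboot. This is a sketch under assumptions: the comment does not say how the mode was changed, and set_selinux_mode is a hypothetical helper that rewrites the SELINUX= line in /etc/selinux/config (or in another file passed as the second argument). The boot target can be switched with `systemctl set-default multi-user.target` and `systemctl set-default graphical.target`.

```shell
#!/bin/sh
# Rewrite the SELINUX= line of an SELinux config file.
# Usage: set_selinux_mode MODE [CONFIG_FILE]
# set_selinux_mode is a hypothetical helper; needs root for the real file.
set_selinux_mode() {
    mode=$1
    config=${2:-/etc/selinux/config}
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Write back with cat rather than mv so the original file keeps its
    # permissions and SELinux label. SELINUXTYPE= is left untouched.
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp" && cat "$tmp" > "$config"; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return $rc
}

# Demonstration against a scratch copy instead of the real config file.
demo=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo"
set_selinux_mode permissive "$demo"
cat "$demo"
rm -f "$demo"
```

Note that `setenforce 0` only changes the mode until the next reboot, which is why the sketch edits the config file instead.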

pperry

2020-07-14 03:30

administrator   ~0007039

Thanks for that. So it looks like it's an SELinux issue, with gnome-session-c trying to perform a memory map on a temporary file on tmpfs.

Have you tried the suggested SELinux mitigations? I doubt it's a relabeling issue, but worth trying setting the following boolean:

setsebool -P domain_can_mmap_files 1

to see if this fixes the issue.

I guess we need to establish if (a) this is a universal issue or something unique to your system, and (b) if it's a universal issue, identify the correct course of action to fix it.

pperry

2020-07-14 03:43

administrator   ~0007040

Last edited: 2020-07-14 03:44

A little bit more background info on this.

On RHEL7, domain_can_mmap_files is enabled by default meaning that memory map permissions are not checked. However, this option is not enabled by default on RHEL8 meaning memory map permissions are now enforced by SELinux:

RHEL7:
$ getsebool domain_can_mmap_files
domain_can_mmap_files --> on

RHEL8:
$ getsebool domain_can_mmap_files
domain_can_mmap_files --> off

Enabling domain_can_mmap_files should fix the issue (I hope). Creating a custom policy as suggested in your attached journal.txt may provide a more specific/targeted solution if you are concerned about enabling it globally.

lowen

2020-07-14 11:08

reporter   ~0007041

I can duplicate the 'Something went wrong' white screen issue with 450.57. Used rescue mode to downgrade to 440.100. I will try the change in the boolean listed above and report back.

lowen

2020-07-14 11:17

reporter   ~0007042

UPDATE: Setting the domain_can_mmap_files boolean resolved the issue for me; updated to 450.57. Here's some data:
[lowen@localhost ~]$ nvidia-detect -v
Probing for supported NVIDIA devices...
[10de:11be] NVIDIA Corporation GK104GLM [Quadro K3000M]
This device requires the current 440.64 NVIDIA driver kmod-nvidia
[lowen@localhost ~]$ lspci -v -d 10de:11be
01:00.0 VGA compatible controller: NVIDIA Corporation GK104GLM [Quadro K3000M] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Dell Device 053f
    Flags: bus master, fast devsel, latency 0, IRQ 36
    Memory at f5000000 (32-bit, non-prefetchable) [size=16M]
    Memory at e0000000 (64-bit, prefetchable) [size=256M]
    Memory at f0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at e000 [size=128]
    [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: nvidia
    Kernel modules: nouveau, nvidia_drm, nvidia

[lowen@localhost ~]$

pperry

2020-07-14 12:56

administrator   ~0007043

Brilliant, thank you. That would also explain why I see no issues on RHEL7.

So next question - is setting the domain_can_mmap_files boolean the best solution to this issue, or can we craft a more targeted response? I would be somewhat reluctant to have our nvidia package make global changes to the default setting of the domain_can_mmap_files boolean, but equally don't want to ship broken packages.

I will see if I can compile a custom SELinux policy module to allow mmap access that we could ship with our nvidia package.

pperry

2020-07-14 13:19

administrator   ~0007044

Generating a local SELinux policy from the originally attached audit entries:

# cat audit.txt | audit2allow -M nvidialocal

gives the following type enforcement policy file:

# cat nvidialocal.te

module nvidialocal 1.0;

require {
    type xserver_tmpfs_t;
    type xdm_t;
    class file map;
}

#============= xdm_t ==============

#!!!! This avc can be allowed using the boolean 'domain_can_mmap_files'
allow xdm_t xserver_tmpfs_t:file map;

Would anyone be able to test whether the above policy file fixes the issue once the domain_can_mmap_files boolean has been reset to its original state?

If so, I would propose we install this custom SELinux policy to allow mmap access from within the nvidia package.

Any thoughts?
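
To make the link between the audit entries and the generated rule explicit, here is a rough sketch of the mapping. It is not how audit2allow is implemented. avc_to_allow is a hypothetical helper that reads AVC records on stdin and only handles denials with a single permission inside the braces:

```shell
#!/bin/sh
# Print the allow rule implied by each AVC denial read from stdin.
# avc_to_allow is a hypothetical helper; single-permission denials only.
avc_to_allow() {
    awk '
    /avc: +denied/ {
        perm = ""; src = ""; tgt = ""; cls = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "{") perm = $(i + 1)                    # permission follows "{"
            else if ($i ~ /^scontext=/) { split($i, s, ":"); src = s[3] }  # source type
            else if ($i ~ /^tcontext=/) { split($i, t, ":"); tgt = t[3] }  # target type
            else if ($i ~ /^tclass=/)   { sub(/^tclass=/, "", $i); cls = $i }
        }
        if (perm != "" && src != "" && tgt != "" && cls != "")
            printf "allow %s %s:%s %s;\n", src, tgt, cls, perm
    }'
}

# One of the records from the original report.
avc_to_allow <<'EOF'
type=AVC msg=audit(1594585929.767:76): avc: denied { map } for pid=2248 comm="gnome-session-c" path=2F6D656D66643A2F2E6E76696469615F6472762E585858585858202864656C6574656429 dev="tmpfs" ino=37368 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:xserver_tmpfs_t:s0 tclass=file permissive=0
EOF
# prints: allow xdm_t xserver_tmpfs_t:file map;
```

The third field of scontext is the denied domain (xdm_t) and the third field of tcontext is the type of the object it tried to map (xserver_tmpfs_t), which is why all four entries collapse into the single rule in nvidialocal.te.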

mroche

2020-07-14 14:20

reporter   ~0007045

I can take a stab at that tonight after work. For the uninitiated and lazy, where do policies go? Or do they need to be loaded by command which will then save the rule?

Thanks for all the work, Perry!

This is also on the RH Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1851448

pperry

2020-07-14 15:26

administrator   ~0007046

From your audit.log, do:

# grep gnome-session-c /path/to/audit.log | audit2allow -M nvidialocal

This should give you nvidialocal.te and nvidialocal.pp in your current directory. You can check nvidialocal.te, which should look like:

# cat nvidialocal.te

module nvidialocal 1.0;

require {
    type xserver_tmpfs_t;
    type xdm_t;
    class file map;
}

#============= xdm_t ==============
allow xdm_t xserver_tmpfs_t:file map;


Now load the .pp file using:

semodule -i nvidialocal.pp

and reset the domain_can_mmap_files boolean fix we previously did:

setsebool -P domain_can_mmap_files 0

and reboot to test.

To uninstall the SELinux module, use:

semodule -r nvidialocal

Assuming that works, I will implement the above SELinux policy as part of the package.

mroche

2020-07-14 16:54

reporter   ~0007047

Booted into multi-user, upgraded to 450.57, applied selinux module, set-default to graphical, and rebooted. Everything works.

My module had the same output as you:
+++++++++++++++
module nvidialocal 1.0;

require {
    type xserver_tmpfs_t;
    type xdm_t;
    class file map;
}

#============= xdm_t ==============

#!!!! This avc can be allowed using the boolean 'domain_can_mmap_files'
allow xdm_t xserver_tmpfs_t:file map;
+++++++++++++

Once into the system, I verified that enforcing was still active and the module loaded successfully.

# nvidia-settings --version
nvidia-settings: version 450.57
# getenforce
Enforcing
# semodule -l | grep nvidia
nvidialocal

pperry

2020-07-15 01:56

administrator   ~0007048

Brilliant, thanks Michael.

Can you just confirm you reset the domain_can_mmap_files boolean first:

setsebool -P domain_can_mmap_files 0

$ getsebool domain_can_mmap_files
domain_can_mmap_files --> off

I will take that .te file and add it to the nvidia package source, and allow the package to compile and install the module on el8. I don't see any point separating it out into a separate selinux sub-package.
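
For reference, compiling a .te file by hand (rather than through audit2allow -M) normally takes two steps, checkmodule and semodule_package. The following is only a sketch of that build under a scratch directory, not the actual packaging used by the nvidia package. It skips the compile step when the SELinux build tools are not installed:

```shell
#!/bin/sh
# Write the type enforcement source and, when the SELinux build tools are
# present, compile it into a loadable policy package (.pp).
workdir=$(mktemp -d)

cat > "$workdir/nvidialocal.te" <<'EOF'
module nvidialocal 1.0;

require {
    type xserver_tmpfs_t;
    type xdm_t;
    class file map;
}

allow xdm_t xserver_tmpfs_t:file map;
EOF

if command -v checkmodule >/dev/null 2>&1 &&
   command -v semodule_package >/dev/null 2>&1; then
    # -M enables MLS support, -m builds a (non-base) policy module.
    checkmodule -M -m -o "$workdir/nvidialocal.mod" "$workdir/nvidialocal.te" &&
    semodule_package -o "$workdir/nvidialocal.pp" -m "$workdir/nvidialocal.mod" &&
    echo "built $workdir/nvidialocal.pp"
else
    echo "checkmodule/semodule_package not found; wrote $workdir/nvidialocal.te only"
fi

# Loading is a separate, privileged step:
#   semodule -i nvidialocal.pp
```

Installing and removing the resulting module works the same way as shown earlier in this thread (semodule -i and semodule -r).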

mroche

2020-07-15 07:10

reporter   ~0007049

Sorry, thought I had included that in the above report! I never turned it on, but I made sure that I had it off when doing the prior test.

# getsebool domain_can_mmap_files
domain_can_mmap_files --> off

pperry

2020-07-16 06:17

administrator   ~0007052

Thanks for the confirmation Michael.

From Aaron's (nvidia) response to the Red Hat bug report, it seems this is something that should be in selinux-policy and maybe not something we should be looking to fix with an ad-hoc policy.

Therefore, I'm tempted to take no action given that affected users can switch on domain_can_mmap_files as a workaround for now. Hopefully RH will fix this sooner rather than later as it's a patch that is already in fedora.

pperry

2020-09-08 10:25

administrator   ~0007194

Upstream have released an updated selinux-policy package today which hopefully fixes this bug:

https://access.redhat.com/errata/RHBA-2020:3655

selinux-policy-3.14.3-41.el8_2.6.noarch.rpm

If anyone can confirm this fixes the issue, I'll close the bug here.

mroche

2020-09-09 20:21

reporter   ~0007195

Hey Perry, about to test this, I'll let you know how it goes!

Mike

mroche

2020-09-09 20:36

reporter   ~0007196

Alrighty, everything seems to check out! I'm using 450.66:

* Rebooted into multi-user target
* Unloaded nvidialocal module :: semodule -r nvidialocal
* Rebooted system
* Double checked boolean was disabled :: getsebool domain_can_mmap_files -> off
* Ran upgrade
* Rebooted into graphical target

Everything seems to work with selinux-policy-3.14.3-41.el8_2.6

If there's anything else you need from me, just ask!

Cheers,
Mike

mroche

2020-09-09 20:37

reporter   ~0007197

Small addition:

* Double checked boolean was disabled :: getsebool domain_can_mmap_files -> off

AND

* Double checked selinux module didn't exist :: semodule --list | grep nvidia

Cheers,
Mike

pperry

2020-09-09 20:56

administrator   ~0007198

Fantastic, thanks for the feedback.

Closing as fixed.

Issue History

Date Modified Username Field Change
2020-07-12 15:36 mroche New Issue
2020-07-12 15:36 mroche Status new => assigned
2020-07-12 15:36 mroche Assigned To => pperry
2020-07-12 15:36 mroche File Added: gnome-session-failure.txt
2020-07-12 16:34 pperry Note Added: 0007037
2020-07-13 20:23 mroche File Added: nvupload.tar.xz
2020-07-13 20:26 mroche Note Added: 0007038
2020-07-14 03:30 pperry Note Added: 0007039
2020-07-14 03:43 pperry Note Added: 0007040
2020-07-14 03:44 pperry Note Edited: 0007040
2020-07-14 11:08 lowen Note Added: 0007041
2020-07-14 11:17 lowen Note Added: 0007042
2020-07-14 12:56 pperry Note Added: 0007043
2020-07-14 13:19 pperry Note Added: 0007044
2020-07-14 14:20 mroche Note Added: 0007045
2020-07-14 15:26 pperry Note Added: 0007046
2020-07-14 16:54 mroche Note Added: 0007047
2020-07-15 01:56 pperry Note Added: 0007048
2020-07-15 07:10 mroche Note Added: 0007049
2020-07-16 06:17 pperry Note Added: 0007052
2020-09-08 10:25 pperry Note Added: 0007194
2020-09-08 10:25 pperry Status assigned => feedback
2020-09-09 20:21 mroche Note Added: 0007195
2020-09-09 20:36 mroche Note Added: 0007196
2020-09-09 20:37 mroche Note Added: 0007197
2020-09-09 20:56 pperry Note Added: 0007198
2020-09-09 20:56 pperry Status feedback => resolved
2020-09-09 20:56 pperry Resolution open => fixed