View Issue Details

ID: 0000496
Project: channel: elrepo/el7
Category: kmod-nvidia
View Status: public
Last Update: 2014-08-01 05:57
Reporter: zmyrgel
Assigned To: pperry
Priority: normal
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Summary: 0000496: Can't boot encrypted LUKS partitions with kmod-nvidia installed
Description: I'm using an encrypted LVM partition alongside a normal /boot partition to protect my files. With the nouveau driver I get prompted for the encryption passphrase during boot to unlock my partitions, and boot works just fine.

After installing kmod-nvidia I only get a few lines of text during boot:
hid-generic: usage index exceeded
hid-generic: item 0 2 2 2 parsing failed

The cursor stays still after this and boot won't continue.
Additional Information: I've tried entering my passphrase while the boot is stuck and pressing Enter, but that didn't help.

I also got a hint from IRC to add:
omit_drivers+=" nouveau " to /etc/dracut.conf.d/nouveau.conf and run "dracut -fv"

That didn't help either.
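
For reference, the IRC hint above amounts to roughly the following. This is only a sketch: the helper name write_nouveau_omit and its directory argument are made up for illustration, so that it can be tried against a scratch directory first. On a real system the directory would be /etc/dracut.conf.d, and the initramfs would then be rebuilt with "dracut -fv" as root.

```shell
# Hypothetical helper: write the dracut drop-in that omits nouveau from the
# initramfs. $1 = target directory (assumed: /etc/dracut.conf.d on a real system).
write_nouveau_omit() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    # Keep the spaces around the module name inside the quotes, as in the hint.
    printf '%s\n' 'omit_drivers+=" nouveau "' > "$conf_dir/nouveau.conf"
}

# Real usage (as root), followed by an initramfs rebuild:
#   write_nouveau_omit /etc/dracut.conf.d && dracut -fv
```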

Curiously, I can boot my laptop to single-user mode (just add 'single' to the kernel line at the Grub prompt). Then I get asked for my passphrase and boot completes. Once I exit single-user mode I get to the Gnome desktop.
Tags: No tags attached.
Reported upstream

Activities

pperry

2014-07-29 13:02

administrator   ~0003819

Last edited: 2014-07-29 13:12

Hi,

Thanks for the report.

As I've never used encrypted partitions / LUKS, I'm afraid I'm at a bit of a loss as to where to start troubleshooting this.

Can we perhaps start at the beginning? Could you try uninstalling kmod-nvidia and installing the nvidia drivers via the NVIDIA installer, so we can determine whether the problem is with the nvidia drivers themselves or is something elrepo has introduced in our packaging of those drivers?

Any more clues would be gratefully received. Once we discover what the problem is, we can hopefully come up with a solution.

zmyrgel

2014-07-29 23:24

reporter   ~0003822

As the driver itself works when booting first to single-user mode and then switching to multi-user mode, I'd assume it is something in the boot process.

I'd assume that the boot process for normal multi-user mode still expects some nouveau dependency and, as it can't load the nouveau module, the boot gets stuck.
And I assume single-user mode doesn't depend on the nouveau parts, so it asks for the passphrase to unlock the encrypted partitions and can boot normally.

I still have zero experience with systemd, so I'm a bit clueless about how to debug what the normal multi-user mode startup actions are and why it seems to get stuck.

zmyrgel

2014-07-30 01:59

reporter   ~0003823

Found the cause: after removing the "rhgb" flag from the Grub kernel line, the boot used the text version and continued normally.

The 'rhgb' graphical boot probably requires a KMS-enabled driver and, as the nVidia driver doesn't provide that, it can't proceed.

Should the kmod-nvidia package check the default kernel flags, remove them and generate a new Grub config automatically, or just warn the user to do it themselves?
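
To make the "just warn" option concrete, here is a rough sketch of such a check. It is not taken from the kmod-nvidia package. The function names has_rhgb and warn_if_rhgb and the warning text are invented for illustration, and the command line is passed in as a string so the check can be tried without rebooting.

```shell
# Hypothetical check: succeed if the command line in $1 contains the
# standalone word "rhgb" (e.g. pass "$(cat /proc/cmdline)").
has_rhgb() {
    case " $1 " in
        *" rhgb "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a warning rather than touching the Grub config.
warn_if_rhgb() {
    if has_rhgb "$1"; then
        echo "warning: 'rhgb' is on the kernel line; graphical boot may hang with kmod-nvidia"
    fi
}

# Possible usage on a running system:
#   warn_if_rhgb "$(cat /proc/cmdline)"
```

The padding with spaces in the case pattern means a parameter that merely contains the letters "rhgb" does not trigger the warning.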

pperry

2014-07-30 07:48

administrator   ~0003825

Great.

To explain a little about what we do and why we do it...

In order for the nvidia kernel driver to be able to load, it must be able to bind the device and it can't do that if another driver has already done that - in our case that would be the nouveau driver.

So we must prevent the nouveau driver from loading.

There are two ways we can do that, and for redundancy we actually use both, although the first is probably the preferred method.

1. We can blacklist the nouveau driver in the initramfs image:

# lsinitrd -k 3.10.0-123.4.4.el7.x86_64 | grep nouveau
-rw-r--r-- 1 root root 208 Jul 9 14:35 usr/lib/modprobe.d/blacklist-nouveau.conf
drwxr-xr-x 2 root root 0 Jul 30 14:16 usr/lib/modules/3.10.0-123.4.4.el7.x86_64/kernel/drivers/gpu/drm/nouveau
-rw-r--r-- 1 root root 1526009 Jul 16 19:56 usr/lib/modules/3.10.0-123.4.4.el7.x86_64/kernel/drivers/gpu/drm/nouveau/nouveau.ko

Note that the nouveau driver is still in the initramfs but is blacklisted by the /usr/lib/modprobe.d/blacklist-nouveau.conf configuration file. This prevents the nouveau driver from loading and allows the nvidia driver to bind the device.

Note we also add the nvidia driver to the initramfs image:

# lsinitrd -k 3.10.0-123.4.4.el7.x86_64 | grep nvidia
-rw-r--r-- 1 root root 128 May 23 10:19 etc/ld.so.conf.d/nvidia.conf
drwxr-xr-x 2 root root 0 Jul 30 14:16 usr/lib64/nvidia
drwxr-xr-x 2 root root 0 Jul 30 14:16 usr/lib/modules/3.10.0-123.4.4.el7.x86_64/weak-updates/nvidia
lrwxrwxrwx 1 root root 53 Jul 30 14:16 usr/lib/modules/3.10.0-123.4.4.el7.x86_64/weak-updates/nvidia/nvidia.ko -> ../../../3.10.0-123.el7.x86_64/extra/nvidia/nvidia.ko
drwxr-xr-x 2 root root 0 Jul 30 14:16 usr/lib/modules/3.10.0-123.el7.x86_64/extra/nvidia
-rw-r--r-- 1 root root 18930795 Jul 9 14:35 usr/lib/modules/3.10.0-123.el7.x86_64/extra/nvidia/nvidia.ko


2. The second method is to blacklist the nouveau driver on the grub kernel boot line:

# cat /boot/grub2/grub.cfg | grep rd.driver.blacklist
        linux16 /boot/vmlinuz-3.10.0-123.4.4.el7.x86_64 root=UUID=e2e45082-2c6c-48d8-9158-64f30f846931 ro crashkernel=auto rd.md.uuid=579d4a98:7b099fc7:96f9094d:958e049a vconsole.font=latarcyrheb-sun16 vconsole.keymap=uk rhgb quiet LANG=en_GB.UTF-8 nouveau.modeset=0 rd.driver.blacklist=nouveau
        linux16 /boot/vmlinuz-3.10.0-123.4.2.el7.x86_64 root=UUID=e2e45082-2c6c-48d8-9158-64f30f846931 ro crashkernel=auto rd.md.uuid=579d4a98:7b099fc7:96f9094d:958e049a vconsole.font=latarcyrheb-sun16 vconsole.keymap=uk rhgb quiet LANG=en_GB.UTF-8 nouveau.modeset=0 rd.driver.blacklist=nouveau
        linux16 /boot/vmlinuz-3.10.0-123.el7.x86_64 root=UUID=e2e45082-2c6c-48d8-9158-64f30f846931 ro crashkernel=auto rd.md.uuid=579d4a98:7b099fc7:96f9094d:958e049a vconsole.font=latarcyrheb-sun16 vconsole.keymap=uk rhgb quiet nouveau.modeset=0 rd.driver.blacklist=nouveau
        linux16 /boot/vmlinuz-3.10.0-123.1.2.el7.x86_64 root=UUID=e2e45082-2c6c-48d8-9158-64f30f846931 ro crashkernel=auto rd.md.uuid=579d4a98:7b099fc7:96f9094d:958e049a vconsole.font=latarcyrheb-sun16 vconsole.keymap=uk rhgb quiet nouveau.modeset=0 rd.driver.blacklist=nouveau

This method seems less preferable because running 'grub2-mkconfig -o /boot/grub2/grub.cfg' can undo our custom settings.

So those are the two methods by which we stop the nouveau driver loading early in the boot process.
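
Both checks shown above can be scripted, which may help when asking a reporter to confirm their setup. The sketch below uses made-up function names and makes two assumptions: the blacklist file is named blacklist-nouveau.conf as in the listing above, and kernel entries use "linux16" as in the grub.cfg excerpt (other setups may use a different keyword). It only confirms that the file and the parameter are present, not what the file contains.

```shell
# Method 1 (hypothetical helper): read an lsinitrd listing on stdin and
# succeed if a nouveau blacklist file under modprobe.d is listed.
initramfs_blacklists_nouveau() {
    grep -q 'modprobe\.d/blacklist-nouveau\.conf'
}

# Method 2 (hypothetical helper): print every linux16 line of the grub.cfg
# given as $1 that lacks rd.driver.blacklist=nouveau. Prints nothing when
# every entry carries the parameter.
entries_missing_blacklist() {
    grep -E '^[[:space:]]*linux16[[:space:]]' "$1" |
        grep -v 'rd\.driver\.blacklist=nouveau'
}

# Possible usage (as root):
#   lsinitrd -k "$(uname -r)" | initramfs_blacklists_nouveau && echo "blacklist present"
#   entries_missing_blacklist /boot/grub2/grub.cfg
```

Running the second check after every grub2-mkconfig run is one way to notice when the custom settings have been dropped.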

As this issue looks like a grub configuration issue, that's probably where we need to look to refine things.

Looking in /etc/default/grub, I see RHGB is set as default there:

# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.md.uuid=579d4a98:7b099fc7:96f9094d:958e049a vconsole.font=latarcyrheb-sun16 vconsole.keymap=uk rhgb quiet"
GRUB_DISABLE_RECOVERY="true"

Also, I wonder if setting the correct graphics mode might help? So one could try adding something like the following to /etc/default/grub:

GRUB_GFXMODE=1280x1024

trying whatever your default resolution is, and running grub2-mkconfig to pick up the changes.
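
A minimal sketch of that edit, for anyone who wants to script it: the helper name set_grub_default is invented here, and it is meant to be tried on a copy of /etc/default/grub first. It assumes simple values that contain no "|" or "&" characters, since those would confuse the sed expression.

```shell
# Hypothetical helper: set KEY=VALUE in a grub defaults file.
# $1 = file, $2 = key, $3 = value. Replaces an existing KEY= line,
# otherwise appends a new one.
set_grub_default() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Possible usage (as root), then regenerate the config:
#   set_grub_default /etc/default/grub GRUB_GFXMODE 1280x1024
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```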

pperry

2014-07-30 08:03

administrator   ~0003826

OK, I just tried some of the above grub options.

Under default settings, after selecting the kernel to boot in the grub menu, my system boots to a blinking cursor in the top left corner for about 10 secs and then displays the graphical login screen.

Setting GRUB_GFXMODE=1280x1024 made no difference.

Removing 'rhgb' from the GRUB_CMDLINE_LINUX in /etc/default/grub allows the system to boot in non-graphical mode displaying the usual list of services and the green [OK] next to each.

I'm not inclined to have our packages alter these default settings on end user systems. I think in this case we should simply document the issue and the workaround of removing rhgb that you have discovered.
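
For the documentation, the workaround could be written up along these lines. This is a sketch rather than anything the package does: remove_rhgb is a made-up helper name, it only touches the GRUB_CMDLINE_LINUX line, and it removes a single standalone "rhgb" word. Trying it on a copy of /etc/default/grub first seems prudent.

```shell
# Hypothetical helper: drop the standalone word "rhgb" from the
# GRUB_CMDLINE_LINUX line of the file given as $1.
remove_rhgb() {
    file=$1
    sed -e '/^GRUB_CMDLINE_LINUX=/{' \
        -e 's/ rhgb / /' \
        -e 's/"rhgb /"/' \
        -e 's/ rhgb"/"/' \
        -e '}' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Possible usage (as root), then regenerate the config:
#   remove_rhgb /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

The three substitutions cover "rhgb" in the middle, at the start and at the end of the quoted value respectively.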

zmyrgel

2014-07-31 03:09

reporter   ~0003831

Yeah, probably the best option would be to document the necessary changes on the package page and have the package install warn that extra changes might be required.

Shouldn't the blacklist etc. grub options be added to the package page right next to the glamor removal/disable note?

pperry

2014-07-31 12:48

administrator   ~0003833

Or we could add a README file to document such issues and distribute it with the package.

That way any community member could easily document issues and workarounds at our github repository:

https://github.com/elrepo/packages/

Also a package-specific README file has the advantage that we can ship distro-specific documentation for each distro. For example, the glamor issue only affects rhel6 and rhel7, not rhel5.

pperry

2014-07-31 12:57

administrator   ~0003834

Whilst I'm here... I haven't been able to figure out how to get the plymouth boot screen stuff working yet on rhel7.

It works on rhel6 after I append 'vga=795' to my kernel boot line, so from that I assume the nvidia driver must support some aspect of kernel mode setting (KMS), but I haven't been able to get it to work on rhel7 yet (or at least I don't get a plymouth boot screen).

Mind you, I'm not even sure what "working" looks like. My development laptop (with Intel graphics) boots rhel7 in less than 10 secs so there's hardly time for any boot screen to show. There's not a lot of animation - looks more like a darkish screen with a "7" in the background, a little like one of the default desktop wallpapers in gnome.

zmyrgel

2014-07-31 22:46

reporter   ~0003837

Yes, the normal VGA boot works, but that graphical boot screen with the 7 AFAIK requires a KMS driver, which the nvidia driver is lacking.

pperry

2014-08-01 05:57

administrator   ~0003838

So presumably that is a change in plymouth between rhel6 and rhel7. Plymouth in rhel6 definitely works with the nvidia drivers, but using the same techniques I can't get it to work on rhel7. So that would support your assertion that plymouth on rhel7 requires a driver supporting KMS.

Further, the rhel7 docs state here that plymouth uses KMS:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Desktop_Migration_and_Administration_Guide/plymouth.html

Issue History

Date Modified Username Field Change
2014-07-29 00:00 zmyrgel New Issue
2014-07-29 00:00 zmyrgel Status new => assigned
2014-07-29 00:00 zmyrgel Assigned To => pperry
2014-07-29 13:02 pperry Note Added: 0003819
2014-07-29 13:12 pperry Note Edited: 0003819
2014-07-29 23:24 zmyrgel Note Added: 0003822
2014-07-30 01:59 zmyrgel Note Added: 0003823
2014-07-30 07:48 pperry Note Added: 0003825
2014-07-30 08:03 pperry Note Added: 0003826
2014-07-31 03:09 zmyrgel Note Added: 0003831
2014-07-31 12:48 pperry Note Added: 0003833
2014-07-31 12:57 pperry Note Added: 0003834
2014-07-31 22:46 zmyrgel Note Added: 0003837
2014-08-01 05:57 pperry Note Added: 0003838