View Issue Details

ID: 0001468
Project: channel: elrepo/el8
Category: kmod-mpt3sas
View Status: public
Last Update: 2024-08-26 11:05
Reporter: monsond
Assigned To: pperry
Priority: high
Severity: crash
Reproducibility: always
Status: acknowledged
Resolution: open
Platform: Dell C6320
OS: RHEL
OS Version: 8.10
Summary: 0001468: Fail to load SAS Driver During Provisioning
Description: Attempting to provision a C6320 from a Red Hat Satellite version 6.13, using RHEL 8.10. The server has an "LSI SAS2008" RAID controller, and thus requires the dd-mpt3sas-43.100.00.00-2.el8_10.elrepo.iso driver to be loaded with the kernel.

Provisioning fails, and troubleshooting the server shows the following:
journalctl:
'out of tree module taints kernel'
'module verification failed: signature and/or required key missing - tainting kernel'

/proc/sys/kernel/tainted : 77824
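
For reference, the taint value above is a bitmask and can be decoded bit by bit. The sketch below is a minimal illustration, not part of the original report: the function name `decode_taint` is hypothetical, and the descriptions for bits 12, 13 and 16 follow the upstream kernel's tainted-kernels documentation. What the distro uses bit 16 for is not stated in this report.

```shell
#!/bin/sh
# Decode a kernel taint bitmask into its set bits.
# decode_taint is an illustrative name, not an existing tool.
decode_taint() {
    value=$1
    bit=0
    while [ "$value" -gt 0 ]; do
        if [ $((value & 1)) -eq 1 ]; then
            case $bit in
                12) desc="out-of-tree module loaded (O)" ;;
                13) desc="unsigned module loaded (E)" ;;
                16) desc="auxiliary taint, distro-defined (X)" ;;
                *)  desc="see the kernel tainted-kernels documentation" ;;
            esac
            echo "bit $bit: $desc"
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
}

# 77824 = 4096 + 8192 + 65536, i.e. bits 12, 13 and 16
decode_taint 77824
```

Bits 12 and 13 correspond to the two journalctl messages quoted above, which is consistent with loading an unsigned out-of-tree kmod.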
Steps To Reproduce: Satellite provisioning templates were updated with the following:

APPEND initrd=<%= @initrd %> <%= pxe_kernel_options %> <%= snippet("kickstart_kernel_options").strip %> dd=nfs:<IP_ADDRESS>:/<VOLUME>/C6320_RAID_Driver_RHEL8/dd-mpt3sas-43.100.00.00-2.el8_10.elrepo.iso

Target server is PXE booted from satellite, loads the OS but then consistently fails.
Additional Information: This process succeeds with the 8_9 version of the driver on RHEL 8.9; however, we are blocked from upgrading the system to 8.10 until this is resolved.
Tags: No tags attached.

Activities

toracat

2024-07-08 12:04

administrator   ~0009953

> 'out of tree module taints kernel'
> 'module verification failed: signature and/or required key missing - tainting kernel'

This is just informational, not an error. However we have seen this kind of failure with 8.10 DUD install:

https://elrepo.org/bugs/view.php?id=1458

We are investigating the cause.

geturner

2024-07-09 08:39

reporter   ~0009958

I am the software lead on our program. My co-worker, Derek, entered this bug report. What can we do to assist with resolving this? Because RH support is "best" on even minor versions, 8.10 is our preferred production delivery, but we are stuck with 8.9 if we cannot get this driver working. We will do whatever we can to assist with this issue.

tqhoang

2024-07-12 16:46

manager   ~0009964

@monsond @geturner
Can you please elaborate on "Target server is PXE booted from satellite, loads the OS but then consistently fails."?
For example, does the RHEL install complete and upon reboot fails to find the OS?

geturner

2024-07-12 17:05

reporter   ~0009965

We have what we call a "provisioning" server. This server hosts Red Hat Satellite. When we provision a new machine with RHEL 8.10 we use a "kickstart", and the change to use the SAS driver is specified in that file. The "target server" is the one being provisioned. We initiate a PXE boot, which should load the driver through the kickstart. Once the driver is loaded, RH Satellite should see the drive and be able to install the RH OS onto it. But when the driver loads, no "visible drive" is available. This works fine on RH OS 8.9 with the 8.0 SAS driver.

monsond

2024-07-12 17:41

reporter   ~0009966

As Gene stated, the PXE boot loads the image and the SAS driver, and both get executed. Midway through the OS install it loads the SAS driver, which reports an error. This is prior to drive partitioning, which fails as a result, so the OS never gets installed. My earlier statement on "consistent failures" could be restated as: each time we PXE boot the target host, it dies in the same part of the installation, just before drive partitioning.

tqhoang

2024-07-15 09:24

manager   ~0009970

When you are at the drive partitioning stage, does your process stop and allow you to get to a terminal (e.g. CTRL+ALT+F2)?

Can you please run the following?
1. lsmod | grep mpt3sas
2. modinfo mpt3sas | grep filename
3. find /lib/modules -name mpt3sas.ko
4. lsinitrd -k $(uname -r) | grep -i mpt3sas
5. Attach a copy of "dmesg" output
6. Attach a copy of "lspci -v -nn" output
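
If it helps with getting data out of a restricted environment, items 1-4 can be gathered into a single text file. The following is only a sketch: the function name `collect_mpt3sas_info` and the default output path are assumptions made for illustration, and the commands are the ones listed above. It notes when `lsinitrd` is not available rather than failing.

```shell
#!/bin/sh
# Collect the module-related checks (items 1-4) into one file.
# Function name and default output path are illustrative.
collect_mpt3sas_info() {
    out=${1:-/tmp/mpt3sas-info.txt}
    kver=$(uname -r)
    {
        echo "== kernel: $kver =="
        echo "== 1. lsmod | grep mpt3sas =="
        lsmod | grep mpt3sas || echo "(no match)"
        echo "== 2. modinfo mpt3sas | grep filename =="
        modinfo mpt3sas | grep filename || echo "(no match)"
        echo "== 3. find /lib/modules -name mpt3sas.ko =="
        find /lib/modules -name mpt3sas.ko || echo "(find failed)"
        echo "== 4. lsinitrd -k $kver | grep -i mpt3sas =="
        if command -v lsinitrd >/dev/null 2>&1; then
            lsinitrd -k "$kver" | grep -i mpt3sas || echo "(no match)"
        else
            echo "(lsinitrd not available in this environment)"
        fi
    } >"$out" 2>&1
    # Print the path of the file that was written
    echo "$out"
}

collect_mpt3sas_info /tmp/mpt3sas-info.txt
```

Each section is labelled with the command that produced it, so the file can be read without the original numbering.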

monsond

2024-07-15 10:18

reporter   ~0009971

Yes we're able to open a terminal, with a few caveats. The target system is unfortunately in a closed lab, which makes the process of transferring out log files complicated, both procedurally and logistically. It also means that someone needs to be physically at the terminal, since this server type does not support remote management (these are some of our few infrastructure components with this limitation, and it's a huge pain!)

So all of that being said, we'll work on collecting items 1-4 for you, however 5 & 6 will be difficult. If there's something specific we could look for and provide you in the output of 5/6, that would be easier.

tqhoang

2024-07-15 10:33

manager   ~0009972

Last edited: 2024-07-15 10:37

1-4 should be sufficient.
For #5, can you grep the dmesg output for something like "Warning: Disabled Hardware is detected"? That would be helpful; I just want to see if it's printed.
For #6, can you run "lspci -v -nn | grep -A1 -i MPT" and send the PCI vendor and device ID?

By chance, have you tried to manually install RHEL 8.10 with just the driver disk ISO on a USB stick? (i.e. append "inst.dd" to the command-line after the "quiet" param)

monsond

2024-07-15 13:36

reporter   ~0009973

1.
mpt3sas 344064 0
raid_class 16384 1 mpt3sas
scsi_transport_sas 45056 1 mpt3sas

2.
filename: /lib/modules/4.18.0-553.el8_10.x86_64/updates/extra/mpt3sas/mpt3sas.ko

3.
/lib/modules/4.18.0-553.el8_10.x86_64/updates/extra/mpt3sas/mpt3sas.ko

4. lsinitrd: command not found; present on our other hosts, so it must not be installed yet.

5.
[ 8.743693] Warning: Disabled Hardware is detected: mpt3sas:1000:0072 @ 0000:02:00.0 is no longer enabled in this release.

6.
02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
        Subsystem: Inventec Corporation Device [1170:6019]
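
As a cross-check, the vendor:device pair in the dmesg warning (item 5) and in the lspci line (item 6) can be extracted and compared. This is a minimal sketch using the two lines quoted above as input; the function names are hypothetical and the patterns assume the line formats shown in this note.

```shell
#!/bin/sh
# Extract the PCI vendor:device pair from the two lines quoted above.
# Function names are illustrative.

# "... [Falcon] [1000:0072] (rev 03)" -> 1000:0072
pci_id_from_lspci() {
    printf '%s\n' "$1" \
        | grep -oE '\[[0-9a-fA-F]{4}:[0-9a-fA-F]{4}\]' \
        | head -n 1 \
        | tr -d '[]'
}

# "... detected: mpt3sas:1000:0072 @ 0000:02:00.0 ..." -> 1000:0072
pci_id_from_warning() {
    printf '%s\n' "$1" \
        | sed -nE 's/.*Disabled Hardware is detected: [^:]+:([0-9a-fA-F]{4}:[0-9a-fA-F]{4}) @.*/\1/p'
}

lspci_line='02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)'
warn_line='[ 8.743693] Warning: Disabled Hardware is detected: mpt3sas:1000:0072 @ 0000:02:00.0 is no longer enabled in this release.'

a=$(pci_id_from_lspci "$lspci_line")
b=$(pci_id_from_warning "$warn_line")
if [ "$a" = "$b" ]; then
    echo "same device: $a"
else
    echo "mismatch: lspci=$a warning=$b"
fi
```

Both lines yield 1000:0072, so the warning refers to the SAS2008 controller at 02:00.0.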

tqhoang

2024-07-15 15:13

manager   ~0009974

So it looks as if the driver disk is installed correctly, but it's not being used over the RHEL kernel's in-tree driver (which has your hardware blocked).

I've never used this Red Hat Satellite before, but by chance does it download and use errata kernels instead of the stock RHEL 8.10 kernel (i.e. 4.18.0-553.el8_10.x86_64)?
Can you run "uname -r" and let us know if the version number is different?
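
The comparison being asked for can be sketched as follows: take the kernel version directory from the module path reported in the earlier note and compare it with the running kernel. The function names are hypothetical, and the sketch assumes the module lives under /lib/modules/<kernel version>/.

```shell
#!/bin/sh
# Compare the kernel a kmod was installed for with the running kernel.
# Function names are illustrative.

# /lib/modules/<kver>/updates/... -> <kver>
kmod_kernel_from_path() {
    rest=${1#/lib/modules/}
    printf '%s\n' "${rest%%/*}"
}

check_kmod_kernel() {
    running=$1
    path=$2
    target=$(kmod_kernel_from_path "$path")
    if [ "$running" = "$target" ]; then
        echo "match: $running"
        return 0
    fi
    echo "mismatch: running=$running module=$target"
    return 1
}

# Values taken from this thread: the stock 8.10 kernel and the reported module path
check_kmod_kernel "4.18.0-553.el8_10.x86_64" \
    "/lib/modules/4.18.0-553.el8_10.x86_64/updates/extra/mpt3sas/mpt3sas.ko"
```

On a live system the two arguments could come from `uname -r` and `modinfo -n mpt3sas`.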

monsond

2024-07-15 15:48

reporter   ~0009975

It matches: 4.18.0-553.el8_10.x86_64

We don't modify the kernel directly or through Satellite; most of our proprietary system configuration is performed post OS install.

tqhoang

2024-08-26 11:05

manager   ~0010060

Any update on this issue?

Have you tried to install the system the old-fashioned way, with a Blu-ray disc and loading our kmod-mpt3sas ISO or CD? Please try specifying the driver with "inst.dd" instead of the older "dd".

From what I've read, the anaconda installer is supposed to remove any existing kmod with the same name (i.e. modprobe -r) and then install the driver disk. So that is why you still see the "Warning: Disabled Hardware is detected" message.

Issue History

Date Modified Username Field Change
2024-07-08 11:47 monsond New Issue
2024-07-08 11:47 monsond Status new => assigned
2024-07-08 11:47 monsond Assigned To => pperry
2024-07-08 11:49 toracat Status assigned => acknowledged
2024-07-08 11:52 toracat Relationship added related to 0001458
2024-07-08 12:04 toracat Note Added: 0009953
2024-07-09 08:39 geturner Note Added: 0009958
2024-07-12 16:46 tqhoang Note Added: 0009964
2024-07-12 17:05 geturner Note Added: 0009965
2024-07-12 17:41 monsond Note Added: 0009966
2024-07-12 18:45 toracat Relationship deleted related to 0001458
2024-07-15 09:24 tqhoang Note Added: 0009970
2024-07-15 10:18 monsond Note Added: 0009971
2024-07-15 10:33 tqhoang Note Added: 0009972
2024-07-15 10:37 tqhoang Note Edited: 0009972
2024-07-15 13:36 monsond Note Added: 0009973
2024-07-15 15:13 tqhoang Note Added: 0009974
2024-07-15 15:48 monsond Note Added: 0009975
2024-08-26 11:05 tqhoang Note Added: 0010060