View Issue Details

ID: 0001462
Project: channel: elrepo/el8
Category: kmod-ib_qib
View Status: public
Last Update: 2024-07-09 22:48
Reporter: heckman
Assigned To: tqhoang
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Summary: 0001462: driver ib_qib
Description:
Dear Sirs,

I would appreciate it very much if you could create the RPM kmod-ib_qib-1.11-X.el8_8.elrepo.x86_64.rpm for Rocky Linux 8.8 with kernel kernel-4.18.0-477.27.1.el8_8.x86_64.

I saw that you built it for Rocky Linux (Red Hat) up to the 8.5 release, but I can't bring up my InfiniBand QLogic HCA to mount Lustre.
Steps To Reproduce:
[root@nodo1 ~]# ibstat
[root@nodo1 ~]# ibv_devinfo
No IB devices found
[root@nodo1 ~]#
Additional Information:
lspci | grep QL
81:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
Tags: No tags attached.

Relationships

related to 0001459 closed tqhoang Request for kmod ib-qib for InfiniBand: QLogic Corp. IBA7220 InfiniBand HCA for RHEL 8.10

Activities

toracat

2024-06-14 03:59

administrator   ~0009850

Rocky Linux 8.10 is the current release. 8.8 is no longer supported. We have released kmod-ib_qib-1.11-3.el8_10.elrepo.x86_64.rpm. Do you really need the kmod for EL 8.8?

heckman

2024-06-14 14:17

reporter   ~0009852

Hello toracat,

Yes, I really need it, and I would appreciate it very much if you could create the kmod for EL 8.8 with kernel-4.18.0-477.27.1.el8_8.x86_64. Or, if you can share how to build it, I'll do my best.

Thanks in advance.

tqhoang

2024-06-17 09:40

manager   ~0009859

Last edited: 2024-06-17 11:00

We have rebuilt our EL8_10 kmod for EL8_8. Please download and keep a copy of the packages posted here:
https://elrepo.org/people/tqhoang/bug-1462/

On our build server, we only have the EL8_8 GA kernel release, but it should weak-link correctly with the errata kernel. But if you need to rebuild, you can do so with a command like this:
rpmbuild --rebuild --clean --define 'dist .el8_8.errata' --define 'kmod_kernel_version 4.18.0-477.27.1.el8_8' kmod-ib_qib-1.11-2.1.el8_8.elrepo.src.rpm

FWIW, here is a basic rpmbuild tutorial for reference, as you will need to do some setup first.
https://www.redhat.com/sysadmin/create-rpm-package

Finally, please submit feedback if the driver works for you. Thanks!
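
For reference, the rebuild command above can be parameterized by kernel version. This is a minimal sketch that only prints the command so it can be reviewed before running; the function names `kver_from_uname` and `qib_rebuild_cmd` are hypothetical, and the default dist tag is simply the one used in the command above.

```shell
# Hypothetical helper: strip the architecture suffix from `uname -r` output,
# e.g. 4.18.0-477.27.1.el8_8.x86_64 -> 4.18.0-477.27.1.el8_8
kver_from_uname() (
    printf '%s\n' "${1%.x86_64}"
)

# Hypothetical helper: print (do not run) the rpmbuild command from the note
# above. $1 = kernel version without arch, $2 = source RPM,
# $3 = dist tag (optional, defaults to the value used in this thread).
qib_rebuild_cmd() (
    kver="$1"
    srpm="$2"
    dist="${3:-.el8_8.errata}"
    printf "rpmbuild --rebuild --clean --define 'dist %s' --define 'kmod_kernel_version %s' %s\n" \
        "$dist" "$kver" "$srpm"
)
```

Typical use would be `qib_rebuild_cmd "$(kver_from_uname "$(uname -r)")" kmod-ib_qib-1.11-2.1.el8_8.elrepo.src.rpm`, then pasting the printed line once it looks right.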

heckman

2024-06-17 12:58

reporter   ~0009862

Thank you so much! I'll try the driver and post the results.

heckman

2024-06-17 16:06

reporter   ~0009863

Sorry, I get these errors in dmesg:
==========================
ib_qib: disagrees about version of symbol ib_create_send_mad
ib_qib: disagrees about version of symbol ib_dispatch_event
ib_qib: disagrees about version of symbol ib_free_send_mad
ib_qib: disagrees about version of symbol ib_post_send_mad
ib_qib: disagrees about version of symbol ib_set_device_ops
ib_qib: disagrees about version of symbol rdma_create_ah
ib_qib: disagrees about version of symbol rdma_destroy_ah_attr
ib_qib: disagrees about version of symbol rdma_destroy_ah_user
ib_qib: Unknown symbol ib_create_send_mad (err -22)
ib_qib: Unknown symbol ib_dispatch_event (err -22)
ib_qib: Unknown symbol ib_free_send_mad (err -22)
ib_qib: Unknown symbol ib_port_sysfs_get_ibdev_kobj (err 0)
ib_qib: Unknown symbol ib_post_send_mad (err -22)
ib_qib: Unknown symbol ib_rvt_state_ops (err 0)
ib_qib: Unknown symbol ib_set_device_ops (err -22)
ib_qib: Unknown symbol rdma_create_ah (err -22)
ib_qib: Unknown symbol rdma_destroy_ah_attr (err -22)
ib_qib: Unknown symbol rdma_destroy_ah_user (err -22)
ib_qib: Unknown symbol rvt_add_retry_timer_ext (err 0)
ib_qib: Unknown symbol rvt_add_rnr_timer (err 0)
ib_qib: Unknown symbol rvt_alloc_device (err 0)
ib_qib: Unknown symbol rvt_comm_est (err 0)
ib_qib: Unknown symbol rvt_compute_aeth (err 0)
ib_qib: Unknown symbol rvt_copy_sge (err 0)
ib_qib: Unknown symbol rvt_cq_enter (err 0)
ib_qib: Unknown symbol rvt_dealloc_device (err 0)
ib_qib: Unknown symbol rvt_error_qp (err 0)
ib_qib: Unknown symbol rvt_get_credit (err 0)
ib_qib: Unknown symbol rvt_get_rwqe (err 0)
ib_qib: Unknown symbol rvt_init_port (err 0)
ib_qib: Unknown symbol rvt_mcast_find (err 0)
ib_qib: Unknown symbol rvt_qp_iter_init (err 0)
ib_qib: Unknown symbol rvt_qp_iter_next (err 0)
ib_qib: Unknown symbol rvt_rc_error (err 0)
ib_qib: Unknown symbol rvt_register_device (err 0)
ib_qib: Unknown symbol rvt_restart_sge (err 0)
ib_qib: Unknown symbol rvt_rkey_ok (err 0)
ib_qib: Unknown symbol rvt_ruc_loopback (err 0)
ib_qib: Unknown symbol rvt_send_complete (err 0)
ib_qib: Unknown symbol rvt_stop_rc_timers (err 0)
ib_qib: Unknown symbol rvt_unregister_device (err 0)
==========================================

and ibstatus:

[root@nodo1 ~]# ibstatus
Fatal error: device '*': sys files not found (/sys/class/infiniband/*/ports)
[root@node193 ~]# ibstat
[root@node193 ~]#
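
The two message types in a log like this usually point to different problems. "disagrees about version of symbol" means the symbol exists but its modversions checksum differs from the one the module was built against, while "Unknown symbol ... (err 0)" on kernels of this generation generally means that nothing currently exports the symbol at all. The sketch below separates the two groups; the function name `qib_symbol_summary` is hypothetical, and it assumes the standard kernel wording with spaces between the words.

```shell
# Hypothetical helper: summarize module symbol errors read from stdin,
# for example:  dmesg | grep ib_qib | qib_symbol_summary
# Assumed line formats (standard kernel wording):
#   "<mod>: disagrees about version of symbol <sym>"
#   "<mod>: Unknown symbol <sym> (err <n>)"
qib_symbol_summary() {
    awk '
        /disagrees about version of symbol/ { mismatch[$NF] = 1 }
        /Unknown symbol/ {
            # The symbol name is the field right after "Unknown symbol".
            for (i = 1; i <= NF - 2; i++)
                if ($i == "Unknown" && $(i + 1) == "symbol")
                    unknown[$(i + 2)] = 1
        }
        END {
            for (s in mismatch) print "version-mismatch " s
            # Only report symbols that had no version complaint as not found.
            for (s in unknown)
                if (!(s in mismatch)) print "not-found " s
        }
    ' | sort
}
```

In this report the `rvt_*` symbols would land in the not-found group and most `ib_*`/`rdma_*` symbols in the version-mismatch group, which is a hint to look at which `rdmavt` and `ib_core` modules are actually installed.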

tqhoang

2024-06-17 18:42

manager   ~0009864

Last edited: 2024-06-17 18:51

Can you please send me the output for the following commands?
1. lspci -v -nn (just send the block relevant to the IB controller)
2. rpm -q kmod-ib_qib
3. modinfo ib_qib | grep filename
4. grep ib_qib /lib/modules/`uname -r`/modules.dep

Also is the module being loaded automatically or are you running modprobe manually?
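
Requests like this are easier to answer if every command's output is labelled. A minimal sketch, assuming a POSIX shell; `run_labeled` and `collect_qib_diag` are hypothetical names, and `modinfo -n` (print only the filename) is used here in place of the `modinfo ... | grep filename` pipeline above.

```shell
# Hypothetical helper: run one command and print its output as a labelled
# block, followed by its exit status.
run_labeled() {
    printf '### %s\n' "$*"
    "$@" 2>&1
    printf '(exit status %s)\n\n' "$?"
}

# Hypothetical wrapper: the package, module and dependency checks requested
# above (the lspci block still needs to be picked out by hand).
collect_qib_diag() {
    run_labeled rpm -q kmod-ib_qib
    run_labeled modinfo -n ib_qib
    run_labeled grep ib_qib "/lib/modules/$(uname -r)/modules.dep"
}
```

Running `collect_qib_diag > qib-diag.txt` as root would give a single file to attach to the ticket.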

heckman

2024-06-17 18:49

reporter   ~0009865

Hello!

Yes, this is the output for the IBA7322 QDR InfiniBand HCA:


[root@nodo1 ~]# lspci -v -nn | grep QL
81:00.0 InfiniBand [0c06]: QLogic Corp. IBA7322 QDR InfiniBand HCA [1077:7322] (rev 02)
    Subsystem: QLogic Corp. IBA7322 QDR InfiniBand HCA [1077:7322]
[root@nodo1 ~]#

[root@nodo1 ~]# rpm -q kmod-ib_qib
kmod-ib_qib-1.11-2.1.el8_8.elrepo.x86_64
[root@nodo1 ~]#

[root@nodo1 ~]# modinfo ib_qib | grep filename
modinfo: ERROR: Module ib_qib not found.
[root@nod01 ~]#

[root@nodo1 ~]# grep ib_qib /lib/modules/`uname -r`/modules.dep
[root@nodo1 ~]#


Thanks in advance!

heckman

2024-06-17 18:51

reporter   ~0009866

Sorry, the module is loaded automatically after installing the RPM.

Thanks in advance!

tqhoang

2024-06-17 18:55

manager   ~0009867

OK, the last two should have output. For some reason the RPM installation did not finish correctly. I assume you were manually running "insmod ib_qib.ko" or something like that, right?

Can you run these too?
1. uname -a
2. rpm -q kernel
3. ls -R /lib/modules/`uname -r`/weak-updates

heckman

2024-06-18 12:54

reporter   ~0009869

yes, this is the output:

[root@node193 ~]# uname -a
Linux nodo1 4.18.0-477.27.1.el8_8.x86_64 #1 SMP Wed Sep 20 15:55:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@nodo1 ~]# rpm -q kernel
kernel-4.18.0-477.27.1.el8_8.x86_64
[root@nodo1 ~]# ls -R /lib/modules/`uname -r`/weak-updates
/lib/modules/4.18.0-477.27.1.el8_8.x86_64/weak-updates:
kmod-kvdo

/lib/modules/4.18.0-477.27.1.el8_8.x86_64/weak-updates/kmod-kvdo:
uds vdo

/lib/modules/4.18.0-477.27.1.el8_8.x86_64/weak-updates/kmod-kvdo/uds:
uds.ko

/lib/modules/4.18.0-477.27.1.el8_8.x86_64/weak-updates/kmod-kvdo/vdo:
kvdo.ko
[root@nodo1 ~]#


Thanks in advance!

heckman

2024-06-18 12:55

reporter   ~0009870

nodo1 and node193 have the same configuration.

tqhoang

2024-06-18 16:46

manager   ~0009875

Everything you've posted so far indicates that the kmod built against the EL8_8 GA kernel is not compatible with the errata kernel you are using. I'm still unsure how it's even attempting to auto-load since it's not weak-linked (i.e. in the weak-updates folder) and there isn't an entry for it in the modules.dep file.

When you installed the x86_64 binary RPM, did it have any errors? If not, then I think you will need to recompile the src.rpm as I've indicated above.
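
The three places mentioned above (the weak-updates folder, the extra folder and modules.dep) can be checked in one pass. This is a sketch only; `kmod_locate` is a hypothetical name, and the optional third argument exists purely so the function can be pointed at a test directory instead of /lib/modules.

```shell
# Hypothetical helper: report where (if anywhere) a module is visible for a
# given kernel. $1 = module name, $2 = kernel release,
# $3 = modules root (optional, defaults to /lib/modules).
kmod_locate() (
    mod="$1"
    krel="$2"
    root="${3:-/lib/modules}"
    dir="$root/$krel"
    # Look for the .ko file (or a symlink to it, possibly compressed).
    for sub in weak-updates extra; do
        if [ -d "$dir/$sub" ] &&
           find "$dir/$sub" -name "$mod.ko*" | grep -q .; then
            echo "$sub: yes"
        else
            echo "$sub: no"
        fi
    done
    # modules.dep has one "path/to/module.ko: dependencies" line per module.
    if [ -f "$dir/modules.dep" ] &&
       grep -q "/$mod\.ko[^:]*:" "$dir/modules.dep"; then
        echo "modules.dep: yes"
    else
        echo "modules.dep: no"
    fi
)
```

For example, `kmod_locate ib_qib "$(uname -r)"` printing "no" three times would match the situation described in this note.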

heckman

2024-06-18 17:43

reporter   ~0009876

After rebuilding the RPM:



[root@nodo1 ~]# rpm -ivh /root/rpmbuild/RPMS/x86_64/kmod-ib_qib-1.11-2.1.el8_8.errata.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
   1:kmod-ib_qib-1.11-2.1.el8_8.errata################################# [100%]
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/kernel/drivers/infiniband/hw/irdma/irdma.ko.xz needs unknown symbol rdma_alloc_hw_stats_struct
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol ib_rvt_state_ops
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_ruc_loopback
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_error_qp
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_stop_rc_timers
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_unregister_device
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_send_complete
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_get_credit
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_rkey_ok
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol ib_port_sysfs_get_ibdev_kobj
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_add_retry_timer_ext
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_qp_iter_next
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_qp_iter_init
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_cq_enter
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_dealloc_device
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_mcast_find
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_get_rwqe
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_rc_error
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_register_device
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_init_port
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_copy_sge
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_restart_sge
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_comm_est
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_alloc_device
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_compute_aeth
depmod: WARNING: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko needs unknown symbol rvt_add_rnr_timer
[root@nodo1 ~]#

thanks in advance

tqhoang

2024-06-18 18:40

manager   ~0009877

Last edited: 2024-06-18 18:50

Can you please run this command?
rpm -qa kernel\* | sort

I have the suspicion that you do not have the kernel-modules package installed. Did you manually uninstall it?
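
The same check can be done mechanically on the package list. A minimal sketch, assuming the list comes from `rpm -qa kernel\*`; the function name `kernel_pkgs_check` and the choice of which subpackages to look for are assumptions made for illustration.

```shell
# Hypothetical helper: read `rpm -qa kernel\*` output on stdin and report
# whether each expected subpackage is present for kernel release $1
# (for example 4.18.0-477.27.1.el8_8.x86_64).
kernel_pkgs_check() (
    krel="$1"
    list=$(cat)
    for pkg in kernel kernel-core kernel-modules kernel-devel; do
        # -F: fixed string, -x: whole line must match, -q: quiet
        if printf '%s\n' "$list" | grep -Fxq "$pkg-$krel"; then
            echo "$pkg: installed"
        else
            echo "$pkg: MISSING"
        fi
    done
)
```

Usage would be along the lines of `rpm -qa kernel\* | kernel_pkgs_check "$(uname -r)"`.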

heckman

2024-06-18 20:47

reporter   ~0009878

Yes, this is the output:

[root@nodo1 ~]# rpm -qa kernel\* | sort
kernel-4.18.0-477.27.1.el8_8.x86_64
kernel-abi-stablelists-4.18.0-477.27.1.el8_8.noarch
kernel-core-4.18.0-477.27.1.el8_8.x86_64
kernel-devel-4.18.0-477.27.1.el8_8.x86_64
kernel-headers-4.18.0-477.27.1.el8_8.x86_64
kernel-mft-4.15.1-9.kver.4.18.0_477.27.1.el8_8.x86_64.x86_64
kernel-modules-4.18.0-477.27.1.el8_8.x86_64
kernel-rpm-macros-131-1.el8.noarch
kernel-tools-4.18.0-477.27.1.el8_8.x86_64
kernel-tools-libs-4.18.0-477.27.1.el8_8.x86_64
[root@nodo1 ~]#

Thanks in advance!

tqhoang

2024-06-18 21:52

manager   ~0009879

Last edited: 2024-06-19 09:59

1. What is this package? kernel-mft-4.15.1-9.kver.4.18.0_477.27.1.el8_8.x86_64.x86_64

2. Can you please send the output of this command? If it's really long, please put the output into an attachment.
rpm -qil kernel-mft-4.15.1-9.kver.4.18.0_477.27.1.el8_8.x86_64.x86_64

3. Do you have any IT security policies in place that would restrict loading or accessing kernel modules?

There just seems to be something non-standard with your installation. Can you please let us know if you have any kind of custom kernel image running? Or even a custom script finding and loading the ib_qib kernel module?

heckman

2024-06-19 12:36

reporter   ~0009884

Hello, this is the output of the command:

[root@nodo1~]# rpm -qil kernel-mft-4.15.1-9.kver.4.18.0_477.27.1.el8_8.x86_64.x86_64
Name : kernel-mft
Version : 4.15.1
Release : 9.kver.4.18.0_477.27.1.el8_8.x86_64
Architecture: x86_64
Install Date: Wed 27 Dec 2023 12:45:32 AM CST
Group : System Environment/Kernel
Size : 40662
License : Dual BSD/GPL
Signature : (none)
Source RPM : kernel-mft-4.15.1-9.kver.4.18.0_477.27.1.el8_8.x86_64.src.rpm
Build Date : Wed 27 Dec 2023 12:40:49 AM CST
Build Host : localhost
Relocations : (not relocatable)
Packager : Omer Dagan <omerd@mellanox.com>
Vendor : Mellanox Technologies Ltd.
Summary : kernel-mft Kernel Module for the 4.18.0-477.27.1.el8_8.x86_64 kernel
Description :
This package provides a kernel-mft kernel module for kernel.
/etc/depmod.d/kernel-mft-mst_pci.conf
/etc/depmod.d/kernel-mft-mst_pciconf.conf
/lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/kernel-mft
/lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/kernel-mft/mst_pci.ko
/lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/kernel-mft/mst_pciconf.ko
/usr/lib/.build-id
/usr/lib/.build-id/b6
/usr/lib/.build-id/b6/a734e777464bc86d26585a35638dcc191d051a
/usr/lib/.build-id/e8
/usr/lib/.build-id/e8/c49f024de65f3fd5a3a0539ebd49c86430b8ab
[root@nodo1 ~]#

No, there are no IT security policies restricting the loading of or access to kernel modules. There are only the Lustre RPMs and the .conf files in modprobe.d:

[root@nodo1~]# ls /etc/modprobe.d/
firewalld-sysctls.conf ib_ipoib.conf ko2iblnd.conf lnet.conf lockd.conf mlnx-bf.conf mlnx.conf nouveau.conf nvidia-installer-disable-nouveau.conf tuned.conf
[root@nodo1 ~]#

Thanks in advance!

tqhoang

2024-06-19 13:58

manager   ~0009885

Last edited: 2024-06-19 13:59

I have not used the Lustre RPMs, so I can't comment on those. Please check the contents of all the modprobe config files to see if anything looks suspect.

1. Please run the following commands as root...they should all come back clean.
rpm -V kernel
rpm -V kernel-core
rpm -V kernel-devel
rpm -V kernel-modules
rpm -V kmod-ib_qib

2. Please run "depmod -a" and post any errors.

3. You should be able to run these and have them display the kernel module info.
modinfo rdmavt
modinfo ib_core
modinfo ib_qib

If you can't run #3 successfully, then something is wrong with your system that needs to be resolved.
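
When reading the modinfo output, the "filename:" line is worth a close look, because it shows which copy of a module the system will actually load. The sketch below classifies that path; `modinfo_origin` is a hypothetical name, and the labels it prints are illustrative rather than any standard terminology.

```shell
# Hypothetical helper: classify a module by the path in modinfo's
# "filename:" line. Reads `modinfo <module>` output on stdin.
modinfo_origin() {
    awk '$1 == "filename:" {
        # Match on the first directory below /lib/modules/<release>/.
        if      ($2 ~ "/modules/[^/]+/extra/")        print "out-of-tree " $2
        else if ($2 ~ "/modules/[^/]+/weak-updates/") print "weak-link " $2
        else if ($2 ~ "/modules/[^/]+/kernel/")       print "in-tree " $2
        else                                          print "unknown " $2
    }'
}
```

For instance, `modinfo ib_core | modinfo_origin` reporting "out-of-tree" would mean that a third-party package, not the distribution kernel, is providing `ib_core`.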

heckman

2024-06-19 14:59

reporter   ~0009887

Hello,

The output of command groups 1 and 2 is clean. The output of group 3 is:

[root@nodo1~]# modinfo rdmavt
filename: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/sw/rdmavt/rdmavt.ko
version: 2.0.0
license: Dual BSD/GPL
description: rdmavt dummy kernel module
author: Alaa Hleihel
rhelversion: 8.8
srcversion: 18B6A83ED2A3DB41379D950
depends: mlx_compat
name: rdmavt
vermagic: 4.18.0-477.27.1.el8_8.x86_64 SMP mod_unload modversions

[root@nodo1~]# modinfo ib_core
filename: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/core/ib_core.ko
alias: rdma-netlink-subsys-4
license: Dual BSD/GPL
description: core kernel InfiniBand API
author: Roland Dreier
alias: net-pf-16-proto-20
alias: rdma-netlink-subsys-5
rhelversion: 8.8
srcversion: 4F2C61F1ECDCD5D252D7335
depends: mlx_compat
name: ib_core
vermagic: 4.18.0-477.27.1.el8_8.x86_64 SMP mod_unload modversions
parm: send_queue_size:Size of send queue in number of work requests (int)
parm: recv_queue_size:Size of receive queue in number of work requests (int)
parm: roce_v1_noncompat_gid:Default GID auto configuration (Default: yes) (bool)
parm: netns_mode:Share device among net namespaces; default=1 (shared) (bool)
parm: force_mr:Force usage of MRs for RDMA READ/WRITE operations (bool)

[root@nodo1~]# modinfo ib_qib
filename: /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/ib_qib/ib_qib.ko
description: Cornelis IB driver
author: Cornelis <support@cornelisnetworks.com>
license: Dual BSD/GPL
alias: fs-ipathfs
firmware: qlogic/sd7220.fw
rhelversion: 8.8
srcversion: DB00CB738FE611CACCE1BEF
alias: pci:v00001077d00007322sv*sd*bc*sc*i*
alias: pci:v00001077d00007220sv*sd*bc*sc*i*
alias: pci:v00001FC1d00000010sv*sd*bc*sc*i*
depends: rdmavt,ib_core
name: ib_qib
vermagic: 4.18.0-477.27.1.el8_8.x86_64 SMP mod_unload modversions
parm: qp_table_size:QP table size (uint)
parm: lkey_table_size:LKEY table size in bits (2^n, 1 <= n <= 23) (uint)
parm: max_pds:Maximum number of protection domains to support (uint)
parm: max_ahs:Maximum number of address handles to support (uint)
parm: max_cqes:Maximum number of completion queue entries to support (uint)
parm: max_cqs:Maximum number of completion queues to support (uint)
parm: max_qp_wrs:Maximum number of QP WRs to support (uint)
parm: max_qps:Maximum number of QPs to support (uint)
parm: max_sges:Maximum number of SGEs to support (uint)
parm: max_mcast_grps:Maximum number of multicast groups to support (uint)
parm: max_mcast_qp_attached:Maximum number of attached QPs to support (uint)
parm: max_srqs:Maximum number of SRQs to support (uint)
parm: max_srq_sges:Maximum number of SRQ SGEs to support (uint)
parm: max_srq_wrs:Maximum number of SRQ WRs support (uint)
parm: disable_sma:Disable the SMA (uint)
parm: num_vls:Set number of Virtual Lanes to use (1-8) (ushort)
parm: chase:Enable state chase handling (ushort)
parm: long_attenuation:attenuation cutoff (dB) for long copper cable setup (ushort)
parm: singleport:Use only IB port 1; more per-port buffer space (ushort)
parm: krcvq01_no_msi:No MSI for kctx < 2 (ushort)
parm: rcvhdrcnt:receive header count (uint)
parm: rcvhdrsize:receive header size in 32-bit words (uint)
parm: rcvhdrentsize:receive header entry size in 32-bit words (uint)
parm: txselect:Tx serdes indices (for no QSFP or invalid QSFP data)
parm: sdma_fetch_prio:SDMA descriptor fetch priority (ushort)
parm: rxeq_default_set:Which set [0..3] of Rx Equalization values is default (uint)
parm: relock_by_timer:Allow relock attempt if link not up (uint)
parm: special_trigger:Enable SpecialTrigger arm/launch (int)
parm: hol_timeout_ms:duration of user app suspension after link failure (uint)
parm: fetch_arb:IBA7220: change SDMA descriptor arbitration (uint)
parm: sdma_descq_cnt:Number of SDMA descq entries (ushort)
parm: pcie_coalesce:tune PCIe coalescing on some Intel chipsets (int)
parm: pcie_caps:Max PCIe tuning: Payload (0..3), ReadReq (4..7) (int)
parm: cfgctxts:Set max number of contexts to use (ushort)
parm: numa_aware:0 -> PSM allocation close to HCA, 1 -> PSM allocation local to process (uint)
parm: mini_init:If set, do minimal diag init (ushort)
parm: krcvqs:number of kernel receive queues per IB port (uint)
parm: cc_table_size:Congestion control table entries 0 (CCA disabled - default), min = 128, max = 1984 (uint)
parm: ibmtu:Set max IB MTU (0=2KB, 1=256, 2=512, ... 5=4096 (uint)
parm: compat_ddr_negotiate:Attempt pre-IBTA 1.2 DDR speed negotiation (uint)
[root@nodo1~]#

Will I need to reinstall the kernel from the Rocky Linux 8.8 vault?

Thanks in advance!

tqhoang

2024-06-19 15:49

manager   ~0009888

Last edited: 2024-06-19 22:30

OK, this reveals the problem! Whatever package owns the "mlnx-ofa_kernel" directory has overridden the RHEL 8.8 kernel's modules (rdmavt and ib_core). Reinstalling the RHEL 8.8 kernel won't help.

Find the culprit: rpm -qf /lib/modules/4.18.0-477.27.1.el8_8.x86_64/extra/mlnx-ofa_kernel

You will either need to patch our ib_qib SRPM to build against these updated modules, or ask the owner of the "mlnx-ofa_kernel" RPMs to build you an ib_qib RPM using our source.
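
To see how far such an override goes, one can list every module name that exists both in the distribution tree (kernel/) and under extra/. This is a rough sketch: `kmod_overrides` is a hypothetical name, and which copy actually wins depends on the depmod search order configured on the system, which this function does not inspect.

```shell
# Hypothetical helper: list module names present both under kernel/ and
# under extra/ for one kernel directory, i.e. candidates for an override.
# $1 = kernel module directory, e.g. /lib/modules/$(uname -r)
kmod_overrides() (
    dir="$1"
    tmp=$(mktemp) || exit 1
    names() {
        # Print bare module names: drop directories and .ko/.ko.xz suffixes.
        find "$1" -name '*.ko*' 2>/dev/null |
            sed 's|.*/||; s|\.ko.*$||' | sort -u
    }
    names "$dir/kernel" > "$tmp"
    # comm -12 keeps only the lines common to both sorted lists.
    names "$dir/extra" | comm -12 - "$tmp"
    rm -f "$tmp"
)
```

On the system in this report, `kmod_overrides "/lib/modules/$(uname -r)"` would be expected to include `ib_core` and `rdmavt`, and each hit can then be traced to its package with `rpm -qf`.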

heckman

2024-06-19 19:52

reporter   ~0009890

Hello!

Thank you. I have removed all the OFED RPMs and reinstalled the kmod-ib_qib...rpm that you provided, and now I get:

[root@nodo1~]# ibstatus
Infiniband device 'qib0' port 1 status:
    default gid: fe80:0000:0000:0000:0011:7500:006f:7a10
    base lid: 0xff
    sm lid: 0x7c
    state: 4: ACTIVE
    phys state: 5: LinkUp
    rate: 40 Gb/sec (4X QDR)
    link_layer: InfiniBand

[root@nodo1~]#


Thanks in advance!

tqhoang

2024-06-19 22:34

manager   ~0009891

So I take it that the ib_qib module is 100% working against a clean RHEL 8.8 kernel?

heckman

2024-06-21 12:15

reporter   ~0009898

Thank you so much for your support throughout the whole process.

Best Regards to all your team!

tqhoang

2024-07-08 16:14

manager   ~0009956

Before I close this ticket, do you require IB Verbs support for your RHEL 8.8 systems?

I noticed you used "ibv_devinfo" in the original bug report. The "ibv_*" utilities won't work with the Intel QLogic InfiniPath cards because Red Hat removed the plugin driver for it. We restored it for RHEL 8.10's rdma-core v48.0, but RHEL 8.8 needs the older rdma-core v44.0.

heckman

2024-07-09 17:35

reporter   ~0009959

Hi,

Yes, if you can help me with this, I would appreciate it very much!

Thanks in advance!

tqhoang

2024-07-09 22:48

manager   ~0009960

I added the "ib_qib-ibverbs" package to the web folder: https://elrepo.org/people/tqhoang/bug-1462/
You should be able to use the libibverbs utilities like "ibv_devinfo" now.

I'm going to close this ticket as fixed. If you ever plan to upgrade to RHEL 8.10, we have the packages ready to go.

Issue History

Date Modified Username Field Change
2024-06-13 19:26 heckman New Issue
2024-06-13 19:26 heckman Status new => assigned
2024-06-13 19:26 heckman Assigned To => toracat
2024-06-14 03:59 toracat Note Added: 0009850
2024-06-14 04:24 toracat Project channel: kernel/el8 => channel: elrepo/el8
2024-06-14 04:24 toracat Category --kernel--OTHER-- => General
2024-06-14 04:24 toracat Category General => kmod-ib_qib
2024-06-14 14:17 heckman Note Added: 0009852
2024-06-17 09:26 tqhoang Assigned To toracat => tqhoang
2024-06-17 09:40 tqhoang Note Added: 0009859
2024-06-17 09:40 tqhoang Status assigned => feedback
2024-06-17 09:40 tqhoang Note Edited: 0009859
2024-06-17 09:41 tqhoang Note Edited: 0009859
2024-06-17 11:00 tqhoang Note Edited: 0009859
2024-06-17 12:58 heckman Note Added: 0009862
2024-06-17 12:58 heckman Status feedback => assigned
2024-06-17 16:06 heckman Note Added: 0009863
2024-06-17 18:42 tqhoang Note Added: 0009864
2024-06-17 18:49 heckman Note Added: 0009865
2024-06-17 18:51 tqhoang Note Edited: 0009864
2024-06-17 18:51 heckman Note Added: 0009866
2024-06-17 18:55 tqhoang Note Added: 0009867
2024-06-18 12:54 heckman Note Added: 0009869
2024-06-18 12:55 heckman Note Added: 0009870
2024-06-18 16:46 tqhoang Note Added: 0009875
2024-06-18 17:43 heckman Note Added: 0009876
2024-06-18 18:40 tqhoang Note Added: 0009877
2024-06-18 18:50 tqhoang Note Edited: 0009877
2024-06-18 20:47 heckman Note Added: 0009878
2024-06-18 21:52 tqhoang Note Added: 0009879
2024-06-18 21:53 tqhoang Note Edited: 0009879
2024-06-18 21:55 tqhoang Note Edited: 0009879
2024-06-19 09:59 tqhoang Note Edited: 0009879
2024-06-19 12:36 heckman Note Added: 0009884
2024-06-19 13:58 tqhoang Note Added: 0009885
2024-06-19 13:58 tqhoang Note Edited: 0009885
2024-06-19 13:59 tqhoang Note Edited: 0009885
2024-06-19 14:59 heckman Note Added: 0009887
2024-06-19 15:49 tqhoang Note Added: 0009888
2024-06-19 15:52 tqhoang Note Edited: 0009888
2024-06-19 15:56 tqhoang Note Edited: 0009888
2024-06-19 19:52 heckman Note Added: 0009890
2024-06-19 22:30 tqhoang Note Edited: 0009888
2024-06-19 22:34 tqhoang Note Added: 0009891
2024-06-19 22:36 tqhoang Relationship added related to 0001459
2024-06-21 12:15 heckman Note Added: 0009898
2024-07-08 16:14 tqhoang Note Added: 0009956
2024-07-09 17:35 heckman Note Added: 0009959
2024-07-09 22:48 tqhoang Status assigned => closed
2024-07-09 22:48 tqhoang Resolution open => fixed
2024-07-09 22:48 tqhoang Note Added: 0009960