View Issue Details

ID: 0001142
Project: channel: kernel/el7
Category: kernel-ml
View Status: public
Last Update: 2021-09-27 17:32
Reporter: ishank005
Assigned To: burakkucat
Priority: normal
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Summary: 0001142: AWS EC2 non-nitro instances fail to boot via kernel-ml-5.14.X
Description: AWS EC2 non-Nitro instances, which are Xen-based, fail to boot with the kernel-ml-5.14.X series of kernels. The console shows the following failure messages:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 189.913219] dracut-initqueue[271]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 190.436137] dracut-initqueue[271]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 190.447390] dracut-initqueue[271]: Warning: Could not boot.
         Starting Setup Virtual Console...
[ OK ] Started Setup Virtual Console.
[ 190.462595] dracut-initqueue[271]: Warning: /dev/disk/by-label/root does not exist
         Starting Dracut Emergency Shell...
Warning: /dev/disk/by-label/root does not exist
.
.
.
Entering emergency mode. Exit the shell to continue.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once the instance is converted to a Nitro instance type, it boots fine. Analysis indicates that non-Nitro, Xen-based instances require the "xen-blkfront" block device driver, which is missing from the initramfs built for kernel-ml-5.14.X. That initramfs contains only the "nvme" drivers, which are needed only on Nitro instances.

The workaround is to force-add "xen-blkfront" to the initramfs for kernel-ml-5.14.X and rebuild it, after which the instances boot successfully.
Steps To Reproduce:
- Launch a non-Nitro EC2 instance (such as t2.micro)
- Install any kernel from the kernel-ml-5.14.X series
- Reboot the instance
- The instance fails to boot and shows the error messages above
Additional Information:
# uname -r
5.13.10-1.el7.elrepo.x86_64

# lsmod | grep -i "nvme\|xen"
xen_blkfront 45056 2

# uname -r
5.14.6-1.el7.elrepo.x86_64

# lsmod | grep -i "nvme\|xen"
nvme 49152 1
nvme_core 131072 2 nvme
t10_pi 16384 1 nvme_core
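
The lsmod output above only shows which block driver ended up loaded. To tell a module that is absent from the installed kernel package apart from one that was merely left out of the initramfs, one could look for the module file under /lib/modules. A minimal sketch; the function name and the optional root-prefix argument are hypothetical and exist only for illustration:

```shell
# Hypothetical helper: succeed if a module file for the given kernel version
# exists under <root>/lib/modules. <root> defaults to "" (the live system).
module_on_disk() {
    kver="$1"      # e.g. 5.14.6-1.el7.elrepo.x86_64
    module="$2"    # file name stem, e.g. xen-blkfront
    root="${3:-}"
    # Match both plain and compressed module files (.ko, .ko.xz, ...).
    found=$(find "$root/lib/modules/$kver" -name "$module.ko*" 2>/dev/null | head -n 1)
    [ -n "$found" ]
}

# Example, on the affected instance:
#   module_on_disk "$(uname -r)" xen-blkfront && echo "module is installed"
```

If the module is on disk but the instance still drops to the dracut shell, the initramfs contents are the next thing to check.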

To resolve the issue:

- Temporarily change the instance type to a Nitro type (e.g. t3.micro)
- Once it boots, edit "/etc/dracut.conf" and add "xen_blkfront" to it:

    add_drivers+=" xen_blkfront "

- Alternatively, put the same line in a separate file under "/etc/dracut.conf.d"
- Rebuild the initramfs
- Reboot and change the instance type back to non-Nitro
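
The dracut part of the steps above could be scripted roughly as follows. This is a sketch, not a tested procedure: the function name and the drop-in file name "xen-blkfront.conf" are hypothetical, and the image path assumes the usual /boot/initramfs-<version>.img naming seen elsewhere in this report.

```shell
# Hypothetical helper: write a dracut drop-in that forces xen_blkfront into
# every initramfs built afterwards. The directory defaults to /etc/dracut.conf.d.
write_xen_blkfront_conf() {
    conf_dir="${1:-/etc/dracut.conf.d}"
    mkdir -p "$conf_dir" || return 1
    # Keep the spaces inside the quotes: dracut appends this value to the
    # existing add_drivers list.
    printf 'add_drivers+=" xen_blkfront "\n' > "$conf_dir/xen-blkfront.conf"
}

# As root, while booted on the Nitro instance type, one might then run:
#   write_xen_blkfront_conf
#   kver=5.14.6-1.el7.elrepo.x86_64
#   dracut -f "/boot/initramfs-$kver.img" "$kver"
```

Using a drop-in file instead of editing /etc/dracut.conf keeps the change separate from the distribution's own configuration.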
Tags: AWS, centos, EC2, kernel-ml

Relationships

related to 0001137 closed burakkucat channel: kernel/el8 kernel-ml 5.14.* fails to boot xen VMs - drops to dracut# prompt (missing driver xen-blkfront)

Activities

pperry

2021-09-25 08:02

administrator   ~0007878

Thank you for the report. Looks very similar to the following bug to my untrained eye:

https://elrepo.org/bugs/view.php?id=1137

I guess the key question is whether this is caused by something we have done in packaging the upstream kernel sources, and whether it is something we can or should fix. I think your solution of forcing the missing driver into the initramfs image is a good one for now.

It would be interesting to see how this is handled on other platforms and kernels. Are they having the same issue, and how are they dealing with it?

Alan - I note in the last config file I have for kernel-ml-5.14 we have:

CONFIG_XEN_BLKDEV_FRONTEND=m
CONFIG_XEN_BLKDEV_BACKEND=m

config XEN_BLKDEV_FRONTEND
    tristate "Xen virtual block device support"
    depends on XEN
    default y
    select XEN_XENBUS_FRONTEND
    help
      This driver implements the front-end of the Xen virtual
      block device driver. It communicates with a back-end driver
      in another domain which drives the actual block device.

config XEN_BLKDEV_BACKEND
    tristate "Xen block-device backend driver"
    depends on XEN_BACKEND
    help
      The block-device backend driver allows the kernel to export its
      block devices to other guests via a high-performance shared-memory
      interface.

      The corresponding Linux frontend driver is enabled by the
      CONFIG_XEN_BLKDEV_FRONTEND configuration option.

      The backend driver attaches itself to a any block device specified
      in the XenBus configuration. There are no limits to what the block
      device as long as it has a major and minor.

      If you are compiling a kernel to run in a Xen block backend driver
      domain (often this is domain 0) you should say Y here. To
      compile this driver as a module, chose M here: the module
      will be called xen-blkback.


I wonder whether building these (or at least CONFIG_XEN_BLKDEV_FRONTEND) into the kernel rather than as modules would help. I'd like to see some precedent for that, as both the RHEL7 and RHEL8 distro kernels build them as modules (CONFIG_XEN_BLKDEV_FRONTEND=m).
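
To compare how different installed kernels set this option, the config files can be inspected directly. A minimal sketch; the function name is hypothetical, and it assumes each kernel ships its config as /boot/config-<version>, which is the usual layout for RPM kernels:

```shell
# Hypothetical helper: report how a Kconfig option is set in a kernel config
# file. Prints "built-in", "module", or "not set".
kconfig_state() {
    option="$1"    # e.g. CONFIG_XEN_BLKDEV_FRONTEND
    config="$2"    # e.g. /boot/config-5.14.6-1.el7.elrepo.x86_64
    if grep -q "^$option=y\$" "$config"; then
        echo "built-in"
    elif grep -q "^$option=m\$" "$config"; then
        echo "module"
    else
        echo "not set"
    fi
}

# Example:
#   for cfg in /boot/config-*; do
#       echo "$cfg: $(kconfig_state CONFIG_XEN_BLKDEV_FRONTEND "$cfg")"
#   done
```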

toracat

2021-09-25 14:02

administrator   ~0007882

Did I understand correctly that the issue you are reporting happens with 5.14.X but _not_ with 5.13.10-1.el7.elrepo?

How about the distro kernel (RHEL7/CentOS7, etc.)?

ishank005

2021-09-27 04:07

reporter   ~0007883

@toracat, essentially yes. I can boot via the 5.13.X series but not via 5.14.X. The output above is from CentOS 7 instances; I tested on RHEL 7 as well and observed exactly the same behaviour. The default RHEL 7 kernel, from the 3.10.X series, does contain the "xen_blkfront" module:

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)

# uname -a
Linux ip-172-31-37-43.us-east-2.compute.internal 3.10.0-1160.42.2.el7.x86_64 #1 SMP Tue Aug 31 20:15:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# uname -r
3.10.0-1160.42.2.el7.x86_64

# lsmod | grep -i xen
xen_blkfront 26966 2
xen_netfront 27206 0


Now booting the same RHEL 7 instance via 5.13.10-1.el7.elrepo.x86_64. The instance is still non-Nitro (Xen-based). The following results show that it boots successfully:

# uname -a
Linux ip-172-31-37-43.us-east-2.compute.internal 5.13.10-1.el7.elrepo.x86_64 #1 SMP Wed Aug 11 10:46:03 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

# uname -r
5.13.10-1.el7.elrepo.x86_64

# lsmod | grep -i xen
xen_netfront 36864 1
xen_blkfront 45056 2


As soon as I install 5.14.0-1.el7.elrepo.x86_64 and boot via it, the instance fails to boot:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 190.024661] dracut-initqueue[268]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 190.551643] dracut-initqueue[268]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 191.076618] dracut-initqueue[268]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 191.600638] dracut-initqueue[268]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 191.610723] dracut-initqueue[268]: Warning: Could not boot.
         Starting Setup Virtual Console...
[ 191.624353] dracut-initqueue[268]: Warning: /dev/disk/by-uuid/6bd5eb4d-fadd-4063-9d42-d5063758d65d does not exist
[ OK ] Started Setup Virtual Console.
         Starting Dracut Emergency Shell...
Warning: /dev/disk/by-uuid/6bd5eb4d-fadd-4063-9d42-d5063758d65d does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From rescue mode, checking the existing initramfs files for "xen-blkfront.ko" shows that it is present only in the 3.10.X and 5.13.X images. It is missing from the 5.14.X image:

# lsinitrd initramfs-3.10.0-1160.42.2.el7.x86_64.img | grep -i "xen-blkfront"
-rw-r--r-- 1 root root 15464 Aug 31 21:13 usr/lib/modules/3.10.0-1160.42.2.el7.x86_64/kernel/drivers/block/xen-blkfront.ko.xz

# lsinitrd initramfs-5.13.10-1.el7.elrepo.x86_64.img | grep -i "xen-blkfront"
-rwxr--r-- 1 root root 74888 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/block/xen-blkfront.ko

# lsinitrd initramfs-5.14.0-1.el7.elrepo.x86_64.img | grep -i "xen-blkfront"
<NO_OUTPUT>
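
The three checks above can be folded into one loop over every initramfs image. A minimal sketch; the function name is hypothetical, and it reads an lsinitrd listing on standard input so that it can also be tried against saved output:

```shell
# Hypothetical helper: read an lsinitrd listing on stdin and succeed if it
# contains the named module file, plain or compressed (.ko, .ko.xz, ...).
listing_has_module() {
    module="$1"    # e.g. xen-blkfront
    grep -Eq "/$module\.ko(\.[a-z0-9]+)?\$"
}

# Example, from rescue mode:
#   for img in /boot/initramfs-*.img; do
#       if lsinitrd "$img" | listing_has_module xen-blkfront; then
#           echo "$img: xen-blkfront present"
#       else
#           echo "$img: xen-blkfront MISSING"
#       fi
#   done
```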


The 5.13.X initramfs contains 7 entries matching "xen" (the netxen entries belong to a QLogic NIC driver, not to Xen):

# lsinitrd initramfs-5.13.10-1.el7.elrepo.x86_64.img | grep -i "xen"
-rwxr--r-- 1 root root 74888 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/block/xen-blkfront.ko
drwxr-xr-x 2 root root 0 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/net/ethernet/qlogic/netxen
-rwxr--r-- 1 root root 176656 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/net/ethernet/qlogic/netxen/netxen_nic.ko
drwxr-xr-x 2 root root 0 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netback
-rwxr--r-- 1 root root 108960 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netback/xen-netback.ko
-rwxr--r-- 1 root root 65944 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netfront.ko
-rwxr--r-- 1 root root 40032 Sep 27 06:24 usr/lib/modules/5.13.10-1.el7.elrepo.x86_64/kernel/drivers/scsi/xen-scsifront.ko


For comparison, the following 6 entries matching "xen" were found in the 5.14.X initramfs. The only one missing is "xen-blkfront.ko":

# lsinitrd initramfs-5.14.0-1.el7.elrepo.x86_64.img | grep -i "xen"
drwxr-xr-x 2 root root 0 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/net/ethernet/qlogic/netxen
-rwxr--r-- 1 root root 176664 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/net/ethernet/qlogic/netxen/netxen_nic.ko
drwxr-xr-x 2 root root 0 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netback
-rwxr--r-- 1 root root 109224 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netback/xen-netback.ko
-rwxr--r-- 1 root root 65944 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/net/xen-netfront.ko
-rwxr--r-- 1 root root 40312 Sep 27 06:32 usr/lib/modules/5.14.0-1.el7.elrepo.x86_64/kernel/drivers/scsi/xen-scsifront.ko
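
The same comparison can be done mechanically, which helps when the listings are longer than a handful of lines. A minimal sketch that works on lsinitrd output saved to files; both function names are hypothetical:

```shell
# Hypothetical helper: print the module file names found in an lsinitrd
# listing read from stdin, one per line, sorted, without compression suffix.
listing_modules() {
    # Keep the last path component of entries ending in .ko or .ko.<suffix>.
    sed -n 's|.*/\([^/ ]*\.ko\)\(\.[a-z0-9]*\)\{0,1\}$|\1|p' | sort -u
}

# Hypothetical helper: print modules present in the first listing file
# but absent from the second.
modules_missing_in() {
    old_listing="$1"
    new_listing="$2"
    old_tmp=$(mktemp) || return 1
    new_tmp=$(mktemp) || return 1
    listing_modules < "$old_listing" > "$old_tmp"
    listing_modules < "$new_listing" > "$new_tmp"
    # comm -23 keeps lines that appear only in the first sorted file.
    comm -23 "$old_tmp" "$new_tmp"
    rm -f "$old_tmp" "$new_tmp"
}

# Example:
#   lsinitrd initramfs-5.13.10-1.el7.elrepo.x86_64.img > old.txt
#   lsinitrd initramfs-5.14.0-1.el7.elrepo.x86_64.img > new.txt
#   modules_missing_in old.txt new.txt
```

Comparing by file name rather than full path is deliberate, since the path embeds the kernel version and would otherwise make every entry look different.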

Issue History

Date Modified Username Field Change
2021-09-25 00:15 ishank005 New Issue
2021-09-25 00:15 ishank005 Status new => assigned
2021-09-25 00:15 ishank005 Assigned To => burakkucat
2021-09-25 00:38 ishank005 Tag Attached: centos
2021-09-25 00:38 ishank005 Tag Attached: kernel-ml
2021-09-25 00:38 ishank005 Tag Attached: AWS
2021-09-25 00:38 ishank005 Tag Attached: EC2
2021-09-25 07:44 pperry Relationship added related to 0001137
2021-09-25 08:02 pperry Note Added: 0007878
2021-09-25 14:02 toracat Note Added: 0007882
2021-09-27 04:07 ishank005 Note Added: 0007883