View Issue Details

ID: 0001432
Project: channel: kernel/el8
Category: kernel-ml
View Status: public
Last Update: 2024-03-11 15:07
Reporter: toracat
Assigned To: toracat
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Summary: 0001432: kernel-ml-6.8.0-rc7.el8 fails to build
Description: linux-6.8-rc7 was released on 2024-03-03. kernel-ml-6.8.0-rc7.el8 fails to build.
el7 also fails but there was no build issue with el9.

tl;dr ==> gcc version too old.
Additional Information: Build error from the first build attempt:

In file included from ./arch/x86/include/generated/asm/rwonce.h:1,
                 from ./include/linux/compiler.h:251,
                 from ./include/linux/instrumented.h:10,
                 from ./include/linux/uaccess.h:6,
                 from net/core/dev.c:71:
net/core/dev.c: In function 'netdev_dpll_pin_assign':
./include/linux/rcupdate.h:462:36: error: dereferencing pointer to incomplete type 'struct dpll_pin'
  462 | #define RCU_INITIALIZER(v) (typeof(*(v)) __force __rcu *)(v)
      | ^~~~
./include/asm-generic/rwonce.h:55:33: note: in definition of macro '__WRITE_ONCE'
   55 | *(volatile typeof(x) *)&(x) = (val); \
      | ^~~
./arch/x86/include/asm/barrier.h:67:2: note: in expansion of macro 'WRITE_ONCE'
   67 | WRITE_ONCE(*p, v); \
      | ^~~~~~~~~~
./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release'
  172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
      | ^~~~~~~~~~~~~~~~~~~
./include/linux/rcupdate.h:503:3: note: in expansion of macro 'smp_store_release'
  503 | smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
      | ^~~~~~~~~~~~~~~~~
./include/linux/rcupdate.h:503:25: note: in expansion of macro 'RCU_INITIALIZER'
  503 | smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
      | ^~~~~~~~~~~~~~~
net/core/dev.c:9081:2: note: in expansion of macro 'rcu_assign_pointer'
 9081 | rcu_assign_pointer(dev->dpll_pin, dpll_pin);
      | ^~~~~~~~~~~~~~~~~~
make[4]: *** [net/core/dev.o] Error 1
Tags: No tags attached.

Activities

toracat

2024-03-04 17:50

administrator   ~0009589

(1) The first build error that referred to "net/core/dev.o" looked similar to what was reported in:

https://lore.kernel.org/lkml/875xy8103a.fsf@mail.lhotse/T/

which described the cause as: "Build failure is seen while using gcc-8.5.x but not with gcc-11.4.x"

Therefore we installed gcc-toolset-11-11.1-1.el8.x86_64

$ scl enable gcc-toolset-11 'bash'

'gcc -v' now shows gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC)
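
A quick way to check whether the active compiler is new enough is to compare version strings. This is only a sketch: version_at_least is a hypothetical helper, and the 11.0.0 threshold is an assumption (the lore.kernel.org thread only says that gcc-8.5.x fails and gcc-11.4.x works, not where the exact cut-off lies). It relies on 'sort -V' from GNU coreutils.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeed when HAVE >= WANT,
# using the version ordering of GNU 'sort -V'.
version_at_least() {
    have=$1
    want=$2
    # The lower of the two versions sorts first; HAVE is new enough
    # when WANT is that lowest entry (this also covers HAVE == WANT).
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Illustrative values from this report: the el8 distro gcc and gcc-toolset-11.
# The 11.0.0 minimum is an assumption, not a documented requirement.
for v in 8.5.0 11.2.1; do
    if version_at_least "$v" 11.0.0; then
        echo "gcc $v: new enough"
    else
        echo "gcc $v: too old"
    fi
done
```

On a real build host the first argument could come from 'gcc -dumpfullversion' instead of a literal.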

toracat

2024-03-04 17:56

administrator   ~0009590

(2) The config file was adjusted to take care of the gcc version change.

'rpmbuild -bb' ended with an error:

+ /usr/bin/chmod +x tools/perf/check-headers.sh
+ /usr/bin/make -s -C tools/perf prefix=/usr 'EXTRA_CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' PYTHON=/usr/libexec/platform-python WERROR=0 HAVE_CPLUS_DEMANGLE=1 NO_BIONIC=1 NO_GTK2=1 NO_LIBBABELTRACE=1 NO_LIBUNWIND=1 NO_LIBZSTD=1 NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_STRLCPY=1 DESTDIR=/home/akemi/rpmbuild/BUILDROOT/kernel-ml-6.8.0-0.rc7.el8.elrepo.x86_64 all
  BUILD: Doing 'make -j8' parallel build
  HOSTCC fixdep.o
  HOSTLD fixdep-in.o
  LINK fixdep
Warning: Kernel ABI header differences:
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h
Makefile.config:458: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
make[1]: *** [Makefile.perf:261: sub-make] Error 2

What was odd was the fact that gnu/libc-version.h does exist (glibc-devel was installed).

This led to the following post:
https://lore.kernel.org/lkml/1591071304-19338-1-git-send-email-yangtiezhu@loongson.cn/T/#mda9d4640af6d652de9228aae48cbcb5aaaa15407

> When build perf with ASan or UBSan, if libasan or libubsan can not find,
> the feature-glibc is 0 and there exists the following error log which is
> wrong, because we can find gnu/libc-version.h in /usr/include, glibc-devel
> is also installed.
> After install libasan and libubsan, the feature-glibc is 1 and the build
> process is success

toracat

2024-03-04 18:02

administrator   ~0009591

(3) Installed libasan and libubsan which were not on the build system.

$ sudo dnf install libasan libubsan
Installed:
  libasan-8.5.0-20.el8.x86_64 libubsan-8.5.0-20.el8.x86_64
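
Since the perf error message pointed at glibc-devel rather than at the missing sanitizer libraries, it can help to list which of the suspected packages are absent before starting a build. A minimal sketch follows; list_missing is a hypothetical helper, and passing "rpm -q" as the query command is an assumption that fits EL systems.

```shell
#!/bin/sh
# list_missing QUERY PKG...: print every PKG for which "QUERY PKG" fails.
# QUERY may hold several space-separated words (for example "rpm -q");
# it is left unquoted below so that the shell splits it into words.
list_missing() {
    query=$1
    shift
    for pkg in "$@"; do
        $query "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example call, commented out because it needs an RPM-based system:
#   list_missing "rpm -q" glibc-devel libasan libubsan
```

An empty result means every listed package answered the query successfully.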

(4) In the meantime, rpmbuild was tried with perf disabled. This resulted in an error in the %build phase:

+ pushd tools/power/cpupower
+ /usr/bin/make -s 'CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' CPUFREQ_BENCH=false DEBUG=false
  CC lib/cpufreq.o
cc1: fatal error: inaccessible plugin file /opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/plugin/annobin.so expanded from short plugin name annobin: No such file or directory
compilation terminated.
make: *** [Makefile:201: lib/cpufreq.o] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.w37Ny5 (%build)

Found something that seemed to be related:

https://github.com/openzfs/zfs/issues/14386

It turned out that gcc-toolset-11-annobin-plugin-gcc was needed.
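
The cc1 error above names the exact file it could not open, so the presence of the plugin can be checked directly. The sketch below rebuilds that path from the toolset name. annobin_plugin_path is a hypothetical helper, and the path layout is taken from the error message for gcc-toolset-11; that other toolset versions use the same layout is an assumption.

```shell
#!/bin/sh
# annobin_plugin_path TOOLSET: print where cc1 looks for annobin.so,
# following the layout seen in the error message for gcc-toolset-11.
annobin_plugin_path() {
    toolset=$1
    # The gcc major version is the trailing number of the toolset name.
    major=${toolset##*-}
    printf '/opt/rh/%s/root/usr/lib/gcc/x86_64-redhat-linux/%s/plugin/annobin.so\n' \
        "$toolset" "$major"
}

plugin=$(annobin_plugin_path gcc-toolset-11)
if [ -e "$plugin" ]; then
    echo "annobin plugin present: $plugin"
else
    echo "annobin plugin missing: $plugin"
fi
```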

toracat

2024-03-04 18:04

administrator   ~0009592

Finally building kernel-ml-rc7 was successful.

Quoting Steve:

< whn> Getting correct list of BuildRequires may require trial-and-error.

toracat

2024-03-04 23:08

administrator   ~0009593

The first try worked.

# gcc11
%define with_gcc11 %{?_without_gcc11: 0} %{?!_without_gcc11: 1}

%if %{with_perf}
BuildRequires: libasan, libubsan
%endif
%if %{with_gcc11}
BuildRequires: gcc-toolset-11-annobin-plugin-gcc
%endif

toracat

2024-03-05 13:49

administrator   ~0009597

All the test builds above were done by running the rpmbuild command. When the same source was built in mock, it failed with:

+ /usr/bin/cp config-6.8.0-x86_64 .config
+ /usr/bin/make -s ARCH=x86_64 listnewconfig
+ /usr/bin/grep -E '^CONFIG_'
+ '[' -s newoptions-el8-x86_64.txt ']'
+ /usr/bin/cat newoptions-el8-x86_64.txt
CONFIG_SLS=n
+ exit 1
error: Bad exit status from /var/tmp/rpm-tmp.UCKZqy (%prep)

In 6.7.7, config has CONFIG_SLS=y. When this was added to the 6.7.8 config, it was then removed by 'make oldconfig'.

** linux-6.8.0-0.rc7.el8.x86_64/arch/x86/Kconfig:

config CC_HAS_SLS
        def_bool $(cc-option,-mharden-sls=all)

config SLS
        bool "Mitigate Straight-Line-Speculation"
        depends on CC_HAS_SLS && X86_64
        select OBJTOOL if HAVE_OBJTOOL
        default n
        help
          Compile the kernel with straight-line-speculation options to guard
          against straight line speculation. The kernel image might be slightly
          larger.
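
The Kconfig excerpt shows that SLS is only offered when CC_HAS_SLS is true, and CC_HAS_SLS is decided by asking the compiler whether it accepts -mharden-sls=all. Build environments that pick up different compilers can therefore see different sets of config options. The sketch below probes a compiler in roughly the same spirit. cc_accepts_flag is a hypothetical helper, and the kernel's own cc-option macro may differ in detail.

```shell
#!/bin/sh
# cc_accepts_flag CC FLAG: succeed when CC compiles an empty C source
# with FLAG and -Werror, so that an unknown flag counts as a failure.
cc_accepts_flag() {
    cc=$1
    flag=$2
    # Write the object into a scratch directory rather than /dev/null.
    tmpdir=$(mktemp -d) || return 1
    "$cc" -Werror "$flag" -c -x c /dev/null -o "$tmpdir/probe.o" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Probe whichever compiler is first in PATH (or the one named in $CC).
CC=${CC:-gcc}
if cc_accepts_flag "$CC" -mharden-sls=all; then
    echo "$CC accepts -mharden-sls=all: CC_HAS_SLS would be y"
else
    echo "$CC rejects -mharden-sls=all (or is not installed): CC_HAS_SLS would be n"
fi
```

Running it once in a plain shell and once after enabling a gcc toolset shows whether the two environments agree.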

toracat

2024-03-06 20:02

administrator   ~0009603

For a successful build, the following lines had to be added to the spec file in the appropriate places:

# gcc12
%define with_gcc12 %{?_without_gcc12: 0} %{?!_without_gcc12: 1}

%if %{with_perf}
BuildRequires: libasan, libubsan
%endif
%if %{with_gcc12}
BuildRequires: gcc-toolset-12-annobin-plugin-gcc
%endif

%prep
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif

%build
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif

%install
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif

toracat

2024-03-06 20:32

administrator   ~0009604

For the record:

A candidate commit that caused the gcc version issue: 0d60d8df6f493bb46bf5db40d39dd60a1bafdd4e
"dpll: rely on rcu for netdev_dpll_pin()"

diff --git a/net/core/dev.c b/net/core/dev.c
index 73a0219730075e..0230391c78f71e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9078,7 +9078,7 @@ static void netdev_dpll_pin_assign(struct net_device *dev, struct dpll_pin *dpll
 {
 #if IS_ENABLED(CONFIG_DPLL)
     rtnl_lock();
- dev->dpll_pin = dpll_pin;
+ rcu_assign_pointer(dev->dpll_pin, dpll_pin);
     rtnl_unlock();
 #endif
 }

$ git describe --contain 0d60d8df6f493bb46bf5db40d39dd60a1bafdd4e
v6.8-rc7~26^2~22

toracat

2024-03-11 13:45

administrator   ~0009612

For the record - 2

The source code quoted in note #9604 disappeared in linux-6.8. And incidentally (?), kernel-ml-6.8 now builds with the distro gcc - no need to use a newer version.

toracat

2024-03-11 13:51

administrator   ~0009613

Last edited: 2024-03-11 13:59

In note #9597, I wrote:

"In 6.7.7, config has CONFIG_SLS=y. When this was added to the 6.7.8 config, it was then removed by 'make oldconfig'."

However, when the config files were checked later, CONFIG_SLS=y was actually there in the 6.7.8 config file. In fact there was no change in the config file going from 6.7.7 to 6.7.8 and to 6.7.9.

$ grep SLS con*
config-6.7.7-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.7-x86_64:CONFIG_SLS=y
config-6.7.8-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.8-x86_64:CONFIG_SLS=y
config-6.7.9-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.9-x86_64:CONFIG_SLS=y

toracat

2024-03-11 14:17

administrator   ~0009614

Last edited: 2024-03-11 14:20

I think this commit fixed the gcc issue. It was added to 6.8.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.8&id=289e922582af5b4721ba02e86bde4d9ba918158a

dpll: move all dpll<>netdev helpers to dpll code

Older versions of GCC really want to know the full definition
of the type involved in rcu_assign_pointer().

struct dpll_pin is defined in a local header, net/core can't
reach it. Move all the netdev <> dpll code into dpll, where
the type is known. Otherwise we'd need multiple function calls
to jump between the compilation units.

This is the same problem the commit under fixes was trying to address,
but with rcu_assign_pointer() not rcu_dereference().

Some of the exports are not needed, networking core can't
be a module, we only need exports for the helpers used by
drivers.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/
Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

$ git describe --contain 289e922582af5b4721ba02e86bde4d9ba918158a
v6.8~19^2~8

toracat

2024-03-11 14:33

administrator   ~0009615

Last edited: 2024-03-11 15:07

In conclusion, the "problematic" code that triggered the requirement for a newer version of gcc was introduced in 6.8-rc and then fixed in the 6.8 GA release. kernel-ml builds fine for both el7 and el8 without changing the currently used gcc version (gcc 9 for el7 and gcc 8.5 for el8).

After much ado .......

Issue History

Date Modified Username Field Change
2024-03-04 14:10 toracat New Issue
2024-03-04 14:10 toracat Status new => assigned
2024-03-04 14:10 toracat Assigned To => toracat
2024-03-04 14:10 toracat Description Updated
2024-03-04 14:11 toracat Description Updated
2024-03-04 17:50 toracat Note Added: 0009589
2024-03-04 17:56 toracat Note Added: 0009590
2024-03-04 18:02 toracat Note Added: 0009591
2024-03-04 18:04 toracat Note Added: 0009592
2024-03-04 23:08 toracat Note Added: 0009593
2024-03-05 13:21 toracat Additional Information Updated
2024-03-05 13:49 toracat Note Added: 0009597
2024-03-06 20:02 toracat Note Added: 0009603
2024-03-06 20:32 toracat Note Added: 0009604
2024-03-11 13:45 toracat Note Added: 0009612
2024-03-11 13:51 toracat Note Added: 0009613
2024-03-11 13:59 toracat Note Edited: 0009613
2024-03-11 14:17 toracat Note Added: 0009614
2024-03-11 14:20 toracat Note Edited: 0009614
2024-03-11 14:33 toracat Note Added: 0009615
2024-03-11 14:33 toracat Status assigned => resolved
2024-03-11 14:33 toracat Resolution open => fixed
2024-03-11 15:07 toracat Note Edited: 0009615