View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001432 | channel: kernel/el8 | kernel-ml | public | 2024-03-04 14:10 | 2024-03-11 15:07 |
Reporter | toracat | Assigned To | toracat | ||
Priority | normal | Severity | major | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Summary | 0001432: kernel-ml-6.8.0-rc7.el8 fails to build | ||||
Description | linux-6.8-rc7 was released on 2024-03-03. kernel-ml-6.8.0-rc7.el8 fails to build. el7 also fails but there was no build issue with el9. tl;dr ==> gcc version too old. | ||||
Additional Information | Build error from the first build attempt:
In file included from ./arch/x86/include/generated/asm/rwonce.h:1,
from ./include/linux/compiler.h:251,
from ./include/linux/instrumented.h:10,
from ./include/linux/uaccess.h:6,
from net/core/dev.c:71:
net/core/dev.c: In function 'netdev_dpll_pin_assign':
./include/linux/rcupdate.h:462:36: error: dereferencing pointer to incomplete type 'struct dpll_pin'
462 | #define RCU_INITIALIZER(v) (typeof(*(v)) __force __rcu *)(v)
| ^~~~
./include/asm-generic/rwonce.h:55:33: note: in definition of macro '__WRITE_ONCE'
55 | *(volatile typeof(x) *)&(x) = (val); \
| ^~~
./arch/x86/include/asm/barrier.h:67:2: note: in expansion of macro 'WRITE_ONCE'
67 | WRITE_ONCE(*p, v); \
| ^~~~~~~~~~
./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release'
172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
| ^~~~~~~~~~~~~~~~~~~
./include/linux/rcupdate.h:503:3: note: in expansion of macro 'smp_store_release'
503 | smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
| ^~~~~~~~~~~~~~~~~
./include/linux/rcupdate.h:503:25: note: in expansion of macro 'RCU_INITIALIZER'
503 | smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
| ^~~~~~~~~~~~~~~
net/core/dev.c:9081:2: note: in expansion of macro 'rcu_assign_pointer'
9081 | rcu_assign_pointer(dev->dpll_pin, dpll_pin);
| ^~~~~~~~~~~~~~~~~~
make[4]: *** [net/core/dev.o] Error 1 | ||||
Tags | No tags attached. | ||||
|
(1) The first build error, which referred to "net/core/dev.o", looked similar to what was reported in:
https://lore.kernel.org/lkml/875xy8103a.fsf@mail.lhotse/T/
which described the cause as: "Build failure is seen while using gcc-8.5.x but not with gcc-11.4.x"
Therefore we installed gcc-toolset-11-11.1-1.el8.x86_64
$ scl enable gcc-toolset-11 'bash'
'gcc -v' shows gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC) |
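Before starting a long build it can help to confirm which compiler the current shell will actually pick up. The following is a minimal sketch, not part of the original notes: the helper names `version_major` and `require_gcc_major` are hypothetical, and it assumes `gcc -dumpfullversion` is available (with a fallback to `-dumpversion` for older compilers).

```shell
#!/bin/sh
# Hypothetical helper: print the major component of a dotted version string,
# e.g. "11.2.1" -> "11".
version_major() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: succeed if the gcc found on PATH has at least the
# requested major version. Returns 2 if no usable gcc is found.
require_gcc_major() {
    want=$1
    full=$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion 2>/dev/null) || return 2
    [ -n "$full" ] || return 2
    have=$(version_major "$full")
    [ "$have" -ge "$want" ]
}
```

Inside the shell started by `scl enable gcc-toolset-11 'bash'`, `require_gcc_major 11` would be expected to succeed; with the distro gcc 8.5 on PATH it would be expected to fail.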
|
(2) The config file was adjusted to take care of the gcc version change. 'rpmbuild -bb' ended with an error:
+ /usr/bin/chmod +x tools/perf/check-headers.sh
+ /usr/bin/make -s -C tools/perf prefix=/usr 'EXTRA_CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' PYTHON=/usr/libexec/platform-python WERROR=0 HAVE_CPLUS_DEMANGLE=1 NO_BIONIC=1 NO_GTK2=1 NO_LIBBABELTRACE=1 NO_LIBUNWIND=1 NO_LIBZSTD=1 NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_STRLCPY=1 DESTDIR=/home/akemi/rpmbuild/BUILDROOT/kernel-ml-6.8.0-0.rc7.el8.elrepo.x86_64 all
BUILD: Doing 'make -j8' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
Warning: Kernel ABI header differences:
diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h
Makefile.config:458: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
make[1]: *** [Makefile.perf:261: sub-make] Error 2
What was odd was the fact that gnu/libc-version.h does exist (glibc-devel was installed). This led to the following post:
https://lore.kernel.org/lkml/1591071304-19338-1-git-send-email-yangtiezhu@loongson.cn/T/#mda9d4640af6d652de9228aae48cbcb5aaaa15407
> When build perf with ASan or UBSan, if libasan or libubsan can not find,
> the feature-glibc is 0 and there exists the following error log which is
> wrong, because we can find gnu/libc-version.h in /usr/include, glibc-devel
> is also installed.
> After install libasan and libubsan, the feature-glibc is 1 and the build
> process is success |
|
(3) Installed libasan and libubsan, which were not on the build system.
$ sudo dnf install libasan libubsan
Installed:
libasan-8.5.0-20.el8.x86_64
libubsan-8.5.0-20.el8.x86_64
(4) In the meantime, rpmbuild was tried with perf disabled. This resulted in an error in the %build phase:
+ pushd tools/power/cpupower
+ /usr/bin/make -s 'CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' CPUFREQ_BENCH=false DEBUG=false
CC lib/cpufreq.o
cc1: fatal error: inaccessible plugin file /opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/plugin/annobin.so expanded from short plugin name annobin: No such file or directory
compilation terminated.
make: *** [Makefile:201: lib/cpufreq.o] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.w37Ny5 (%build)
Found something that seemed to be related:
https://github.com/openzfs/zfs/issues/14386
Turned out we need gcc-toolset-11-annobin-plugin-gcc. |
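The missing-plugin failure in (4) can be checked for up front instead of being discovered partway through %build. The sketch below is an illustration only: the helper names `plugin_present`, `gcc_plugin_dir` and `report_annobin` are hypothetical, and it relies on `gcc -print-file-name=plugin` printing the plugin directory of whichever gcc is first on PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeed if plugin "$2" (name without ".so")
# exists as a regular file in directory "$1".
plugin_present() {
    [ -f "$1/$2.so" ]
}

# Ask the active gcc where it looks for plugins.
gcc_plugin_dir() {
    gcc -print-file-name=plugin 2>/dev/null
}

# Hypothetical helper: report whether annobin.so is present. An explicit
# directory may be passed as $1; otherwise the active gcc is queried.
report_annobin() {
    dir=${1:-$(gcc_plugin_dir)}
    if plugin_present "$dir" annobin; then
        echo "annobin found in $dir"
    else
        echo "annobin missing from $dir"
        return 1
    fi
}
```

Running `report_annobin` inside the gcc-toolset shell before invoking rpmbuild would show whether the matching annobin plugin package still needs to be installed.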
|
Finally building kernel-ml-rc7 was successful. Quoting Steve: < whn> Getting correct list of BuildReqires may require trial-and-error. |
|
The first try worked.
# gcc11
%define with_gcc11 %{?_without_gcc11: 0} %{?!_without_gcc11: 1}
%if %{with_perf}
BuildRequires: libasan, libubsan
%endif
%if %{with_gcc11}
BuildRequires: gcc-toolset-11-annobin-plugin-gcc
%endif |
|
All the test builds above were done by running the rpmbuild command. When the same source was built in mock, it failed with:
+ /usr/bin/cp config-6.8.0-x86_64 .config
+ /usr/bin/make -s ARCH=x86_64 listnewconfig
+ /usr/bin/grep -E '^CONFIG_'
+ '[' -s newoptions-el8-x86_64.txt ']'
+ /usr/bin/cat newoptions-el8-x86_64.txt
CONFIG_SLS=n
+ exit 1
error: Bad exit status from /var/tmp/rpm-tmp.UCKZqy (%prep)
In 6.7.7, config has CONFIG_SLS=y. When this was added to the 6.7.8 config, it was then removed by 'make oldconfig'.
** linux-6.8.0-0.rc7.el8.x86_64/arch/x86/Kconfig:
config CC_HAS_SLS
    def_bool $(cc-option,-mharden-sls=all)
config SLS
    bool "Mitigate Straight-Line-Speculation"
    depends on CC_HAS_SLS && X86_64
    select OBJTOOL if HAVE_OBJTOOL
    default n
    help
      Compile the kernel with straight-line-speculation options to guard against straight line speculation. The kernel image might be slightly larger. |
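The Kconfig excerpt shows that CC_HAS_SLS, and therefore SLS, depends on whether the compiler in use accepts `-mharden-sls=all`. One way to probe this by hand, loosely modeled on what kconfig's `cc-option` does, is sketched below. The helper name `cc_accepts_flag` is hypothetical, and whether a compiler difference was the cause of the mock failure is an assumption rather than something the notes state.

```shell
#!/bin/sh
# Hypothetical helper: succeed if compiler "$1" accepts flag "$2".
# It compiles an empty C input with -Werror so that an unknown flag
# is reported as a failure.
cc_accepts_flag() {
    cc=$1
    flag=$2
    tmpdir=$(mktemp -d) || return 2
    "$cc" -Werror "$flag" -c -x c /dev/null -o "$tmpdir/probe.o" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Report what CC_HAS_SLS would likely evaluate to for the active compiler.
if cc_accepts_flag "${CC:-gcc}" -mharden-sls=all; then
    echo "CC_HAS_SLS would be y"
else
    echo "CC_HAS_SLS would be n"
fi
```

Running this once in the plain rpmbuild shell and once inside the mock chroot (or the toolset shell) would show whether the two environments disagree about the flag.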
|
For a successful build, the following lines had to be added to the spec file in the appropriate places:
# gcc12
%define with_gcc12 %{?_without_gcc12: 0} %{?!_without_gcc12: 1}
%if %{with_perf}
BuildRequires: libasan, libubsan
%endif
%if %{with_gcc12}
BuildRequires: gcc-toolset-12-annobin-plugin-gcc
%endif
%prep
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif
%build
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif
%install
%if %{with_gcc12}
. /opt/rh/gcc-toolset-12/enable
%endif |
|
For the record:
A candidate commit that caused the gcc version issue:
0d60d8df6f493bb46bf5db40d39dd60a1bafdd4e
"dpll: rely on rcu for netdev_dpll_pin()"
diff --git a/net/core/dev.c b/net/core/dev.c
index 73a0219730075e..0230391c78f71e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9078,7 +9078,7 @@ static void netdev_dpll_pin_assign(struct net_device *dev, struct dpll_pin *dpll
 {
 #if IS_ENABLED(CONFIG_DPLL)
 	rtnl_lock();
-	dev->dpll_pin = dpll_pin;
+	rcu_assign_pointer(dev->dpll_pin, dpll_pin);
 	rtnl_unlock();
 #endif
 }
$ git describe --contain 0d60d8df6f493bb46bf5db40d39dd60a1bafdd4e
v6.8-rc7~26^2~22 |
|
For the record - 2 The source code quoted in note #9604 disappeared in linux-6.8. And incidentally (?), kernel-ml-6.8 now builds with the distro gcc - no need to use a newer version. |
|
In note #9597, I wrote:
"In 6.7.7, config has CONFIG_SLS=y. When this was added to the 6.7.8 config, it was then removed by 'make oldconfig'."
However, when the config files were checked later, CONFIG_SLS=y was actually there in the 6.7.8 config file. In fact, there was no change in the config file going from 6.7.7 to 6.7.8 and to 6.7.9.
$ grep SLS con*
config-6.7.7-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.7-x86_64:CONFIG_SLS=y
config-6.7.8-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.8-x86_64:CONFIG_SLS=y
config-6.7.9-x86_64:CONFIG_CC_HAS_SLS=y
config-6.7.9-x86_64:CONFIG_SLS=y |
|
I think this commit fixed the gcc issue. It was added to 6.8.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.8&id=289e922582af5b4721ba02e86bde4d9ba918158a

dpll: move all dpll<>netdev helpers to dpll code

Older versions of GCC really want to know the full definition of the type involved in rcu_assign_pointer().

struct dpll_pin is defined in a local header, net/core can't reach it. Move all the netdev <> dpll code into dpll, where the type is known. Otherwise we'd need multiple function calls to jump between the compilation units.

This is the same problem the commit under fixes was trying to address, but with rcu_assign_pointer() not rcu_dereference().

Some of the exports are not needed, networking core can't be a module, we only need exports for the helpers used by drivers.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/
Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

$ git describe --contain 289e922582af5b4721ba02e86bde4d9ba918158a
v6.8~19^2~8 |
|
In conclusion, the "problematic" code that triggered the requirement for a newer version of gcc was introduced in 6.8-rc and then fixed in the 6.8 GA release. kernel-ml builds fine for both el7 and el8 without changing the currently used gcc version (gcc 9 for el7 and gcc 8.5 for el8). After much ado ....... |
Date Modified | Username | Field | Change |
---|---|---|---|
2024-03-04 14:10 | toracat | New Issue | |
2024-03-04 14:10 | toracat | Status | new => assigned |
2024-03-04 14:10 | toracat | Assigned To | => toracat |
2024-03-04 14:10 | toracat | Description Updated | |
2024-03-04 14:11 | toracat | Description Updated | |
2024-03-04 17:50 | toracat | Note Added: 0009589 | |
2024-03-04 17:56 | toracat | Note Added: 0009590 | |
2024-03-04 18:02 | toracat | Note Added: 0009591 | |
2024-03-04 18:04 | toracat | Note Added: 0009592 | |
2024-03-04 23:08 | toracat | Note Added: 0009593 | |
2024-03-05 13:21 | toracat | Additional Information Updated | |
2024-03-05 13:49 | toracat | Note Added: 0009597 | |
2024-03-06 20:02 | toracat | Note Added: 0009603 | |
2024-03-06 20:32 | toracat | Note Added: 0009604 | |
2024-03-11 13:45 | toracat | Note Added: 0009612 | |
2024-03-11 13:51 | toracat | Note Added: 0009613 | |
2024-03-11 13:59 | toracat | Note Edited: 0009613 | |
2024-03-11 14:17 | toracat | Note Added: 0009614 | |
2024-03-11 14:20 | toracat | Note Edited: 0009614 | |
2024-03-11 14:33 | toracat | Note Added: 0009615 | |
2024-03-11 14:33 | toracat | Status | assigned => resolved |
2024-03-11 14:33 | toracat | Resolution | open => fixed |
2024-03-11 15:07 | toracat | Note Edited: 0009615 |