View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001365 | channel: kernel/el7 | kernel-ml | public | 2023-07-12 08:46 | 2023-07-12 14:24 |
Reporter | OnnoZweers | Assigned To | toracat | ||
Priority | normal | Severity | major | Reproducibility | random |
Status | assigned | Resolution | open | ||
Platform | Dell PowerEdge | OS | CentOS | OS Version | 7 & 8 |
Summary | 0001365: Kernel 6.4.0-1.el7.elrepo.x86_64 unstable, freezes & IPv6 issues | ||||
Description | We have been running 6.4.0-1.el7.elrepo.x86_64 for two weeks on a cluster of 0000054:0000250 nodes with dual stack (IPv4 & IPv6). We've had so many network issues since then that our service was basically out of production. At random, nodes can no longer ping6 some other nodes, while IPv4 ping to those nodes still works and ping6 to other nodes also works. Sometimes the network would stop working altogether until a reboot. When we select the previous kernel we had, 6.2.6, the issues disappear. We see this on both Intel and Mellanox network cards. All our servers are Dell PowerEdges. Additionally, two nodes froze during this period. There was no logging and no message on the console besides a frozen login prompt. They just stopped responding. After a reboot they worked again. All this made us scream and run back to the previous kernel we used, 6.2.6. I'm afraid I'm too busy to help troubleshoot this issue. I just wanted to let people know. Stay away from 6.4.0-1.el7.elrepo.x86_64. | ||||
Steps To Reproduce | Run 6.4.0-1.el7.elrepo.x86_64 on a large cluster with IPv6 | ||||
Tags | No tags attached. | ||||
|
Oh, by the way, the reason we started using the ML kernels was performance. We have 10Gbit/s and 25Gbit/s interfaces and we need to get the maximum network performance out of them. The stock CentOS kernels did not provide that. |
|
@OnnoZweers As noted in our announcement mail as well as on our website, "If a bug is found when using these kernels, the end user is encouraged to report it upstream to the Linux Kernel Bug Tracker ( http://bugzilla.kernel.org/ )." We do not modify the source code. We can only handle issues associated with the packaging process. |
|
@toracat Thanks, I will probably do that. |
Date Modified | Username | Field | Change |
---|---|---|---|
2023-07-12 08:46 | OnnoZweers | New Issue | |
2023-07-12 08:46 | OnnoZweers | Status | new => assigned |
2023-07-12 08:46 | OnnoZweers | Assigned To | => toracat |
2023-07-12 08:59 | OnnoZweers | Note Added: 0009270 | |
2023-07-12 13:24 | toracat | Note Added: 0009271 | |
2023-07-12 14:24 | OnnoZweers | Note Added: 0009272 |