View Issue Details

ID: 0001016
Project: channel: elrepo/el8
Category: elrepo-release
View Status: public
Last Update: 2020-06-26 17:11
Reporter: cat5dm
Assigned To: toracat
Priority: normal
Severity: crash
Reproducibility: always
Status: resolved
Resolution: fixed
Summary: 0001016: drbd is broken after upgrade to centos 8.2
Description: After upgrading to CentOS 8.2 drbd doesn't work anymore

version is :

[root@vm-ne-prd-pgs3 pgsadmin]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by akemi@Build64R8, 2020-05-01 21:12:19
Transports (api:16):


worked fine on CentOS 8.1
Additional Information: Steps to reproduce:

should do :

[root@vm-ne-prd-pgs3 pgsadmin]# drbdadm -d up pgs-prd
drbdsetup new-resource pgs-prd 2
drbdsetup new-minor pgs-prd 0 0
drbdsetup new-peer pgs-prd 0 --_name=vm-we-prd-pgs1 --protocol=C
drbdsetup new-peer pgs-prd 1 --_name=vm-we-prd-pgs2 --protocol=C
drbdsetup new-peer pgs-prd 3 --_name=vm-we-quorum --protocol=C
drbdsetup new-path pgs-prd 0 ipv4:10.209.11.20:7789 ipv4:10.209.10.107:7789
drbdsetup new-path pgs-prd 1 ipv4:10.209.11.20:7789 ipv4:10.209.10.108:7789
drbdsetup new-path pgs-prd 3 ipv4:10.209.11.20:7789 ipv4:10.209.10.4:7785
drbdsetup peer-device-options pgs-prd 3 0 --bitmap=no
drbdmeta 0 v09 /dev/drbdpool/r0 internal apply-al
drbdsetup attach 0 /dev/drbdpool/r0 /dev/drbdpool/r0 internal
drbdsetup connect pgs-prd 0
drbdsetup connect pgs-prd 1
drbdsetup connect pgs-prd 3

returns :

[root@vm-ne-prd-pgs3 pgsadmin]# drbdadm -v up pgs-prd
drbdsetup new-resource pgs-prd 2
pgs-prd: Invalid argument
Command 'drbdsetup new-resource pgs-prd 2' terminated with exit code 20


Tags: No tags attached.

Activities

toracat

2020-06-23 10:45

administrator   ~0006988

Is your kernel version 4.18.0-193.6.3.el8_2 ?

kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64.rpm is the one built for el8.2. Version 9.0.21-1 is for el8.0.
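
The question above (was the kmod built for the same point release as the running kernel?) can be sketched as a small shell check. This is only an illustration: the helper names `el_tag` and `same_el_tag` are made up here, and the sketch assumes release strings carry an `elN_M` tag the way the package names in this report do. A matching tag does not prove the module works; it only answers this first question.

```shell
#!/bin/sh
# Hypothetical helper: print the "elN_M" tag found in a kernel release or
# package name (for example "el8_2"), or nothing if there is none.
el_tag() {
    printf '%s\n' "$1" | sed -n 's/.*\.\(el[0-9][0-9]*_[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed only when both strings carry the same tag.
same_el_tag() {
    a=$(el_tag "$1")
    b=$(el_tag "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example with strings taken from this report; on a live system the inputs
# would come from `uname -r` and `rpm -q kmod-drbd90`.
if same_el_tag "4.18.0-193.6.3.el8_2.x86_64" "kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64"; then
    echo "kmod matches kernel point release"
else
    echo "kmod was built for a different point release"
fi
```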

chansen

2020-06-23 13:33

reporter   ~0006990

I have also encountered this bug after upgrading to kernel-4.18.0-193.6.3.el8_2.x86_64 and kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64

Before:
-------
[root@kvm2 ~]# rpm -qa | grep kernel-4
kernel-4.18.0-147.8.1.el8_1.x86_64

[root@kvm2 ~]# rpm -qa | grep drbd
drbd90-utils-9.10.0-2.el8.elrepo.x86_64
kmod-drbd90-9.0.21-2.el8_1.elrepo.x86_64

[root@kvm2 ~]# uname -a
Linux kvm2 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

[root@kvm2 ~]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by mockbuild@, 2019-11-24 13:18:12
Transports (api:16): tcp (9.0.21-1)

[root@kvm2 ~]# drbdadm status
# No currently configured DRBD found.

[root@kvm2 ~]# drbdadm -v up gw3
drbdsetup new-resource gw3 1
drbdsetup new-minor gw3 1 0
drbdsetup new-peer gw3 0 --_name=kvm1 --verify-alg=sha1 --protocol=C --sndbuf-size=10M --rcvbuf-size=10M --max-buffers=128k --max-epoch-size=16k --csums-alg=crc32c
drbdsetup new-path gw3 0 ipv4:192.168.100.12:7789 ipv4:192.168.100.11:7789
drbdsetup peer-device-options gw3 0 0 --c-fill-target=64M --c-max-rate=800M --c-min-rate=80M --c-plan-ahead=20 --resync-rate=800M
drbdmeta 1 v09 /dev/system/gw3 internal apply-al
drbdsetup attach 1 /dev/system/gw3 /dev/system/gw3 internal --al-extents=6433 --on-io-error=detach --disk-flushes=no --disk-barrier=no
drbdsetup connect gw3 0

[root@kvm2 ~]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by mockbuild@, 2019-11-24 13:18:12
Transports (api:16): tcp (9.0.21-1)

[root@kvm2 ~]# drbdadm status
gw3 role:Secondary
  disk:UpToDate
  kvm1 role:Primary
    peer-disk:UpToDate

[root@kvm2 ~]# lsmod | grep -i drbd
drbd_transport_tcp 28672 1
drbd 643072 2 drbd_transport_tcp
libcrc32c 16384 4 nf_conntrack,nf_nat,xfs,drbd


Apply updates:
--------------
[root@kvm2 ~]# yum clean all && yum makecache


[root@kvm2 ~]# yum check-update | grep -i "kernel.x86_64\|drbd"
kernel.x86_64 4.18.0-193.6.3.el8_2 BaseOS
kmod-drbd90.x86_64 9.0.21-3.el8_2.elrepo elrepo


[root@kvm2 ~]# yum -y update
[root@kvm2 ~]# reboot


After:
------
[root@kvm2 ~]# rpm -qa | grep kernel-4
kernel-4.18.0-193.6.3.el8_2.x86_64
kernel-4.18.0-147.8.1.el8_1.x86_64

[root@kvm2 ~]# rpm -qa | grep drbd
drbd90-utils-9.10.0-2.el8.elrepo.x86_64
kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64

[root@kvm2 ~]# uname -a
Linux kvm2 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Wed Jun 10 11:09:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

[root@kvm2 ~]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by akemi@Build64R8, 2020-05-01 21:12:19
Transports (api:16):

[root@kvm2 ~]# drbdadm status
# No currently configured DRBD found.

[root@kvm2 ~]# drbdadm -v up gw3
drbdsetup new-resource gw3 1
gw3: Invalid argument
Command 'drbdsetup new-resource gw3 1' terminated with exit code 20

[root@kvm2 ~]# lsmod | grep drbd
drbd 643072 0
libcrc32c 16384 4 nf_conntrack,nf_nat,xfs,drbd


I noticed that the drbd_transport_tcp module was not loaded automatically (I suspect because the command errored out before attempting to load it). I tried loading it manually with modprobe, but still got the same error with exit code 20.

At any rate, I was able to roll back the kernel and kmod-drbd90 to get the DRBD volume working again.

Rollback:
--------
[root@kvm2 ~]# yum -y downgrade kmod-drbd90-9.0.21-2.el8_1.elrepo.x86_64

[root@kvm2 ~]# grubby --default-index
0

[root@kvm2 ~]# grubby --set-default-index=1
The default is /boot/loader/entries/71086645ae0d4cd9adad30ecc8b437c0-4.18.0-147.8.1.el8_1.x86_64.conf with index 1 and kernel /boot/vmlinuz-4.18.0-147.8.1.el8_1.x86_64

[root@kvm2 ~]# grubby --default-index
1

[root@kvm2 ~]# reboot
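
One visible difference in the output above is the `Transports` line of /proc/drbd: it lists `tcp (9.0.21-1)` once the resource is up on the working system and stays empty on the failing one. A minimal sketch of a check for that symptom follows. The function name `drbd_transports` is hypothetical, and the parsing assumes the /proc/drbd layout shown in this report. An empty list is treated here as a symptom only, not as the cause.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/drbd-style text on stdin and print whatever
# follows the colon on the "Transports" line (empty if none are registered).
drbd_transports() {
    sed -n 's/^Transports ([^)]*):[[:space:]]*//p'
}

# Only touch the live file when it exists, so the sketch is harmless on
# hosts without the drbd module loaded.
if [ -r /proc/drbd ]; then
    t=$(drbd_transports < /proc/drbd)
    if [ -n "$t" ]; then
        echo "transports registered: $t"
    else
        echo "no transport registered (check: lsmod | grep drbd_transport)"
    fi
fi
```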

chansen

2020-06-23 13:43

reporter   ~0006991

Some additional info from /var/log/messages

Working with
  kernel: kernel-4.18.0-147.8.1.el8_1.x86_64
  kmod-drbd: kmod-drbd90-9.0.21-2.el8_1.elrepo.x86_64
-----------------------------------------------------
Jun 23 14:31:11 kvm2 kernel: drbd: loading out-of-tree module taints kernel.
Jun 23 14:31:11 kvm2 kernel: drbd: module verification failed: signature and/or required key missing - tainting kernel
Jun 23 14:31:11 kvm2 kernel: drbd: initialized. Version: 9.0.21-1 (api:2/proto:86-116)
Jun 23 14:31:11 kvm2 kernel: drbd: GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by mockbuild@, 2019-11-24 13:18:12
Jun 23 14:31:11 kvm2 kernel: drbd: registered as block device major 147
Jun 23 14:31:15 kvm2 systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 23 14:31:15 kvm2 drbd[3962]: Starting DRBD resources: [
Jun 23 14:31:15 kvm2 drbd[3962]: create res: gw3
Jun 23 14:31:15 kvm2 kernel: drbd gw3: Starting worker thread (from drbdsetup [3967])
Jun 23 14:31:15 kvm2 drbd[3962]: prepare disk: gw3
Jun 23 14:31:15 kvm2 drbd[3962]: prepare net: gw3
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Starting sender thread (from drbdsetup [3971])
Jun 23 14:31:15 kvm2 drbd[3962]: prepare net: gw3
Jun 23 14:31:15 kvm2 drbd[3962]: adjust peer_devices: gw3
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: meta-data IO uses: blk-bio
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: disk( Diskless -> Attaching )
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: Maximum number of peer devices = 1
Jun 23 14:31:15 kvm2 kernel: drbd gw3: Method to ensure write ordering: drain
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: drbd_bm_resize called with capacity == 104854328
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: resync bitmap: bits=13106791 words=204794 pages=400
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: recounting of set bits took additional 0ms
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: disk( Attaching -> Outdated )
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: attached to current UUID: 5CC3B789B5B565F0
Jun 23 14:31:15 kvm2 drbd[3962]: adjust disk: gw3
Jun 23 14:31:15 kvm2 drbd[3962]: attempt to connect: gw3
Jun 23 14:31:15 kvm2 drbd[3962]: ]
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: conn( StandAlone -> Unconnected )
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Starting receiver thread (from drbd_w_gw3 [3968])
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: conn( Unconnected -> Connecting )
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Handshake to peer 0 successful: Agreed network protocol version 116
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Starting ack_recv thread (from drbd_r_gw3 [3994])
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Preparing remote state change 3244873934
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: Committing remote state change 3244873934 (primary_nodes=1)
Jun 23 14:31:15 kvm2 kernel: drbd gw3 kvm1: conn( Connecting -> Connected ) peer( Unknown -> Primary )
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1 kvm1: drbd_sync_handshake:
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1 kvm1: self 5CC3B789B5B565F0:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:20
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1 kvm1: peer 5CC3B789B5B565F0:0000000000000000:E479BCF4F135458E:0000000000000000 bits:0 flags:120
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1 kvm1: uuid_compare()=no-sync by rule 38
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1: disk( Outdated -> UpToDate )
Jun 23 14:31:15 kvm2 kernel: drbd gw3/0 drbd1 kvm1: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
Jun 23 14:31:15 kvm2 drbd[3962]: WARN: stdin/stdout is not a TTY; using /dev/console.
Jun 23 14:31:15 kvm2 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..


Failing with
  kernel: kernel-4.18.0-193.6.3.el8_2.x86_64
  kmod-drbd: kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64
-----------------------------------------------------
Jun 23 14:39:06 kvm2 systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 23 14:39:06 kvm2 kernel: drbd: loading out-of-tree module taints kernel.
Jun 23 14:39:06 kvm2 kernel: drbd: module verification failed: signature and/or required key missing - tainting kernel
Jun 23 14:39:06 kvm2 kernel: drbd: initialized. Version: 9.0.21-1 (api:2/proto:86-116)
Jun 23 14:39:06 kvm2 kernel: drbd: GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by akemi@Build64R8, 2020-05-01 21:12:19
Jun 23 14:39:06 kvm2 kernel: drbd: registered as block device major 147
Jun 23 14:39:06 kvm2 drbd[3864]: Starting DRBD resources: [
Jun 23 14:39:06 kvm2 drbd[3864]: create res: gw3:failed(new-resource:20)
Jun 23 14:39:06 kvm2 drbd[3864]: prepare disk: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: prepare net: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: adjust peer_devices: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: adjust disk: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: attempt to connect: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: ]
Jun 23 14:39:16 kvm2 drbd[3864]: WARN: stdin/stdout is not a TTY; using /dev/consolereceived netlink error reply: Invalid argument
Jun 23 14:39:16 kvm2 drbd[3864]: .
Jun 23 14:39:16 kvm2 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..

toracat

2020-06-23 13:43

administrator   ~0006992

Let us rebuild the kmod package against the current kernel (4.18.0-193.6.3.el8_2) and see what happens.

toracat

2020-06-23 14:15

administrator   ~0006993

@cat5dm @chansen

kmod-drbd90-9.0.21-4.el8_2.elrepo.x86_64.rpm has been built against kernel-4.18.0-193.6.3.el8_2. It will start sync'ing to our mirrors shortly.

Please give it a try and let us know if this version works.

cat5dm

2020-06-23 14:17

reporter   ~0006994

to reply to your question :

[root@vm-ne-prd-pgs3 pgsadmin]# uname -a
Linux vm-ne-prd-pgs3 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Wed Jun 10 11:09:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@vm-ne-prd-pgs3 pgsadmin]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by akemi@Build64R8, 2020-05-01 21:12:19
Transports (api:16):
[root@vm-ne-prd-pgs3 pgsadmin]# rpm -qa | grep drbd
drbd90-utils-9.10.0-2.el8.elrepo.x86_64
kmod-drbd90-9.0.21-3.el8_2.elrepo.x86_64
[root@vm-ne-prd-pgs3 pgsadmin]#

so something must have gone wrong here.

behaviour is exactly as chansen describes

cat5dm

2020-06-23 14:21

reporter   ~0006995

waiting for the new build now

toracat

2020-06-23 14:27

administrator   ~0006996

It's here now:

https://elrepo.org/linux/elrepo/el8/x86_64/RPMS/kmod-drbd90-9.0.21-4.el8_2.elrepo.x86_64.rpm

cat5dm

2020-06-23 14:45

reporter   ~0006997

unfortunately, got the same result :

[root@vm-ne-prd-pgs3 pgsadmin]# modprobe drbd

[root@vm-ne-prd-pgs3 pgsadmin]# cat /proc/drbd
version: 9.0.21-1 (api:2/proto:86-116)
GIT-hash: 449d6bf22b01af7d14a297a4ed3e281aa84c94a5 build by akemi@Build64R8, 2020-06-23 16:02:37
Transports (api:16):

[root@vm-ne-prd-pgs3 pgsadmin]# rpm -qa | grep 'drbd'
drbd90-utils-9.10.0-2.el8.elrepo.x86_64
kmod-drbd90-9.0.21-4.el8_2.elrepo.x86_64

[root@vm-ne-prd-pgs3 pgsadmin]# drbdadm -v up pgs-prd
drbdsetup new-resource pgs-prd 2
pgs-prd: Invalid argument
Command 'drbdsetup new-resource pgs-prd 2' terminated with exit code 20

[root@vm-ne-prd-pgs3 pgsadmin]# uname -a
Linux vm-ne-prd-pgs3 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Wed Jun 10 11:09:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

chansen

2020-06-23 14:46

reporter   ~0006998

Same for me as well...

chansen

2020-06-23 14:57

reporter   ~0006999

I don't know whether this is of any significance but I did notice quite a few differences in the respective versions' greylist.txt files:

[root@kvm1 ~]# rpm -q kmod-drbd90-9.0.21
kmod-drbd90-9.0.21-2.el8_1.elrepo.x86_64

[root@kvm1 ~]# ls -l /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt
-rw-r--r-- 1 root root 1900 Nov 24 2019 /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt

[root@kvm1 ~]# wc -l /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt
116 /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt



[root@kvm2 ~]# rpm -q kmod-drbd90-9.0.21
kmod-drbd90-9.0.21-4.el8_2.elrepo.x86_64

[root@kvm2 ~]# ls -l /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt
-rw-r--r-- 1 root root 1315 Jun 23 15:02 /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt

[root@kvm2 ~]# wc -l /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt
84 /usr/share/doc/kmod-drbd90-9.0.21/greylist.txt



[root@kvm1 ~]# diff /tmp/kmod-drbd90-9.0.21-2.el8_1-greylist.txt /tmp/kmod-drbd90-9.0.21-4.el8_2-greylist.txt
2,4d1
< __alloc_disk_node
< __default_kernel_pte_mask
< __dynamic_pr_debug
6c3
< __ipv6_addr_type
---
> __nla_parse
12c9
< arch_wb_cache_pmem
---
> alloc_workqueue
17,28d13
< bio_clone_fast
< bioset_exit
< bioset_init
< blk_check_plugged
< blk_queue_flag_set
< blk_queue_max_write_same_sectors
< blk_queue_split
< blk_queue_stack_limits
< blk_queue_write_cache
< blk_set_stacking_limits
< blkdev_issue_write_same
< blkdev_issue_zeroout
44d28
< device_add_disk
57d40
< errno_to_blk_status
59,64d41
< generic_end_io_acct
< generic_make_request
< generic_start_io_acct
< genl_register_family
< genl_unregister_family
< genlmsg_put
70d46
< init_net
81a58
> mutex_is_locked
83,84d59
< netlink_broadcast
< netlink_unicast
87d61
< nla_parse
96d69
< pv_lock_ops
102d74
< sched_setscheduler
110,111d81
< skb_trim
< sock_release
115,116d84
< vscnprintf
< zalloc_cpumask_var

toracat

2020-06-23 15:04

administrator   ~0007000

We can try updating to the current version of drbd (9.0.23).

chansen

2020-06-23 15:08

reporter   ~0007001

worth a try; standing by to test...

toracat

2020-06-23 15:19

administrator   ~0007002

Here:

https://elrepo.org/linux/elrepo/el8/x86_64/RPMS/kmod-drbd90-9.0.23-1.el8_2.elrepo.x86_64.rpm

chansen

2020-06-23 15:44

reporter   ~0007003

Unfortunately I get the same error:
[root@kvm2 ~]# rpm -qa | grep drbd
drbd90-utils-9.10.0-2.el8.elrepo.x86_64
kmod-drbd90-9.0.23-1.el8_2.elrepo.x86_64

[root@kvm2 ~]# drbdadm -v up gw3
drbdsetup new-resource gw3 1
gw3: Invalid argument
Command 'drbdsetup new-resource gw3 1' terminated with exit code 20


Jun 23 16:37:42 kvm2 systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 23 16:37:42 kvm2 kernel: drbd: loading out-of-tree module taints kernel.
Jun 23 16:37:42 kvm2 kernel: drbd: module verification failed: signature and/or required key missing - tainting kernel
Jun 23 16:37:43 kvm2 kernel: drbd: initialized. Version: 9.0.23-1 (api:2/proto:86-116)
Jun 23 16:37:43 kvm2 kernel: drbd: GIT-hash: d16bfab7a4033024fed2d99d3b179aa6bb6eb300 build by akemi@Build64R8, 2020-06-23 17:13:21
Jun 23 16:37:43 kvm2 kernel: drbd: registered as block device major 147
Jun 23 16:37:43 kvm2 drbd[3352]: Starting DRBD resources: [
Jun 23 16:37:43 kvm2 drbd[3352]: create res: gw3:failed(new-resource:20)
Jun 23 16:37:43 kvm2 drbd[3352]: prepare disk: [skipped:gw3]
Jun 23 16:37:43 kvm2 drbd[3352]: prepare net: [skipped:gw3]
Jun 23 16:37:43 kvm2 drbd[3352]: adjust peer_devices: [skipped:gw3]
Jun 23 16:37:43 kvm2 drbd[3352]: adjust disk: [skipped:gw3]
Jun 23 16:37:43 kvm2 drbd[3352]: attempt to connect: [skipped:gw3]
Jun 23 16:37:43 kvm2 drbd[3352]: ]
Jun 23 16:37:53 kvm2 drbd[3352]: WARN: stdin/stdout is not a TTY; using /dev/consolereceived netlink error reply: Invalid argument
Jun 23 16:37:53 kvm2 drbd[3352]: .
Jun 23 16:37:53 kvm2 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..

chansen

2020-06-23 16:00

reporter   ~0007004

Perhaps the drbd90-utils package needs an update too? I see the EPEL repo has version 9.10.0, last built in Nov 2019, but LINBIT has the utils up to version 9.13.1 as of last month: https://lists.linbit.com/pipermail/drbd-announce/2020-May/000364.html

toracat

2020-06-23 16:08

administrator   ~0007005

Last edited: 2020-06-24 01:40

Actually, I tried to build -utils 9.13.1. However, the build has been failing with an error:

error: Empty %files file drbd-utils-9.13.1/debugsourcefiles.list

I need to find the cause of this failure.

[EDIT] A workaround is to add a line:

%define debug_package %{nil}

to the spec file.

toracat

2020-06-24 01:42

administrator   ~0007006

@cat5dm @chansen

The following files are syncing to the mirrors. Please give them a try.

drbd90-utils-9.13.1-1.el8.elrepo.x86_64.rpm
drbd90-utils-sysvinit-9.13.1-1.el8.elrepo.x86_64.rpm

cat5dm

2020-06-24 02:43

reporter   ~0007007

I think we have a success with drbd90-utils-9.13.1-1 :

  --== Thank you for participating in the global usage survey ==--
The server's response is:

you are the 32014th user to install this version
drbdsetup new-resource pgs-prd 0
drbdsetup new-minor pgs-prd 0 0
drbdsetup new-peer pgs-prd 1 --_name=vm-we-prd-pgs2 --protocol=C
drbdsetup new-peer pgs-prd 2 --_name=vm-ce-prd-pgs3 --protocol=C
drbdsetup new-peer pgs-prd 3 --_name=vm-we-quorum --protocol=C
drbdsetup new-path pgs-prd 1 ipv4:10.209.10.107:7789 ipv4:10.209.10.108:7789
drbdsetup new-path pgs-prd 2 ipv4:10.209.10.107:7789 ipv4:10.209.11.20:7789
drbdsetup new-path pgs-prd 3 ipv4:10.209.10.107:7789 ipv4:10.209.10.4:7785
drbdsetup peer-device-options pgs-prd 3 0 --bitmap=no
drbdmeta 0 v09 /dev/drbdpool/r0 internal apply-al
drbdsetup attach 0 /dev/drbdpool/r0 /dev/drbdpool/r0 internal
drbdsetup connect pgs-prd 1
drbdsetup connect pgs-prd 2
drbdsetup connect pgs-prd 3
[root@vm-we-prd-pgs1 pgsadmin]# drbdadm status
pgs-prd role:Secondary
  disk:Inconsistent
  vm-ce-prd-pgs3 connection:Connecting
  vm-we-prd-pgs2 role:Secondary
    replication:SyncTarget peer-disk:Outdated done:1.23
  vm-we-quorum role:Secondary
    peer-disk:Diskless

I will do further testing once I can fetch from the mirrors; I did this install manually.

chansen

2020-06-24 09:18

reporter   ~0007008

Confirmed that it is now working for me:

[root@kvm2 ~]# uname -a
Linux kvm2 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Wed Jun 10 11:09:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

[root@kvm2 ~]# rpm -qa | sort -V | grep "drbd\|kernel-4.18.0-193"
drbd90-utils-9.13.1-1.el8.elrepo.x86_64
kernel-4.18.0-193.6.3.el8_2.x86_64
kmod-drbd90-9.0.23-1.el8_2.elrepo.x86_64

[root@kvm2 ~]# systemctl start drbd

[root@kvm2 ~]# cat /proc/drbd
version: 9.0.23-1 (api:2/proto:86-116)
GIT-hash: d16bfab7a4033024fed2d99d3b179aa6bb6eb300 build by akemi@Build64R8, 2020-06-23 17:13:21
Transports (api:16): tcp (9.0.23-1)

[root@kvm2 ~]# drbdadm status
gw3 role:Secondary
  disk:UpToDate
  kvm1 role:Primary
    peer-disk:UpToDate

Jun 24 10:10:36 kvm2 systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 24 10:10:36 kvm2 kernel: drbd: loading out-of-tree module taints kernel.
Jun 24 10:10:36 kvm2 kernel: drbd: module verification failed: signature and/or required key missing - tainting kernel
Jun 24 10:10:36 kvm2 kernel: drbd: initialized. Version: 9.0.23-1 (api:2/proto:86-116)
Jun 24 10:10:36 kvm2 kernel: drbd: GIT-hash: d16bfab7a4033024fed2d99d3b179aa6bb6eb300 build by akemi@Build64R8, 2020-06-23 17:13:21
Jun 24 10:10:36 kvm2 kernel: drbd: registered as block device major 147
Jun 24 10:10:36 kvm2 drbd[3932]: Starting DRBD resources: /lib/drbd/drbd: line 148: /var/lib/linstor/loop_device_mapping: No such file or directory
Jun 24 10:10:36 kvm2 drbd[3932]: [
Jun 24 10:10:36 kvm2 drbd[3932]: create res: gw3
Jun 24 10:10:36 kvm2 kernel: drbd gw3: Starting worker thread (from drbdsetup [3940])
Jun 24 10:10:36 kvm2 drbd[3932]: prepare disk: gw3
Jun 24 10:10:36 kvm2 drbd[3932]: prepare net: gw3
Jun 24 10:10:36 kvm2 drbd[3932]: prepare net: gw3
Jun 24 10:10:36 kvm2 kernel: drbd gw3 kvm1: Starting sender thread (from drbdsetup [3944])
Jun 24 10:10:36 kvm2 drbd[3932]: adjust peer_devices: gw3
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: meta-data IO uses: blk-bio
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: disk( Diskless -> Attaching )
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: Maximum number of peer devices = 1
Jun 24 10:10:36 kvm2 kernel: drbd gw3: Method to ensure write ordering: drain
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: drbd_bm_resize called with capacity == 104854328
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: resync bitmap: bits=13106791 words=204794 pages=400
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: recounting of set bits took additional 1ms
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: disk( Attaching -> Outdated )
Jun 24 10:10:36 kvm2 kernel: drbd gw3/0 drbd1: attached to current UUID: 5CC3B789B5B565F0
Jun 24 10:10:37 kvm2 drbd[3932]: adjust disk: gw3
Jun 24 10:10:37 kvm2 drbd[3932]: attempt to connect: gw3
Jun 24 10:10:37 kvm2 drbd[3932]: ]
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: conn( StandAlone -> Unconnected )
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Starting receiver thread (from drbd_w_gw3 [3941])
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: conn( Unconnected -> Connecting )
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Handshake to peer 0 successful: Agreed network protocol version 116
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Starting ack_recv thread (from drbd_r_gw3 [3967])
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Preparing remote state change 788806086
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: Committing remote state change 788806086 (primary_nodes=1)
Jun 24 10:10:37 kvm2 kernel: drbd gw3 kvm1: conn( Connecting -> Connected ) peer( Unknown -> Primary )
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1 kvm1: drbd_sync_handshake:
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1 kvm1: self 5CC3B789B5B565F0:0000000000000000:92F795AFEBD6DE18:0000000000000000 bits:0 flags:20
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1 kvm1: peer 5CC3B789B5B565F0:0000000000000000:E479BCF4F135458E:0000000000000000 bits:0 flags:120
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1 kvm1: uuid_compare()=no-sync by rule 38
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1: disk( Outdated -> UpToDate )
Jun 24 10:10:37 kvm2 kernel: drbd gw3/0 drbd1 kvm1: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
Jun 24 10:10:37 kvm2 drbd[3932]: WARN: stdin/stdout is not a TTY; using /dev/console.
Jun 24 10:10:37 kvm2 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..

Thank you for your prompt response on this matter. Cheers.

toracat

2020-06-24 10:31

administrator   ~0007010

Glad to hear success stories. :)

I am now curious. Was it the -utils package that fixed the issue rather than the kmod? Or both?

chansen

2020-06-24 12:21

reporter   ~0007012

I suspect it was a bit of both. We saw errors after updating both the kernel and kmod packages, but could roll back to a working DRBD setup by booting the previous kernel and downgrading kmod. That implies the kmod package was the problem, as drbd90-utils was unchanged.

On the other hand, looking at the error message I saw:
[root@kvm2 ~]# drbdadm -v up gw3
drbdsetup new-resource gw3 1
gw3: Invalid argument
Command 'drbdsetup new-resource gw3 1' terminated with exit code 20

[root@kvm2 ~]# grep -i drbd /var/log/messages # relevant snippet below
Jun 23 14:39:06 kvm2 kernel: drbd: registered as block device major 147
Jun 23 14:39:06 kvm2 drbd[3864]: Starting DRBD resources: [
Jun 23 14:39:06 kvm2 drbd[3864]: create res: gw3:failed(new-resource:20)
Jun 23 14:39:06 kvm2 drbd[3864]: prepare disk: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: prepare net: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: adjust peer_devices: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: adjust disk: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: attempt to connect: [skipped:gw3]
Jun 23 14:39:06 kvm2 drbd[3864]: ]


I'd assumed the "Invalid argument" was the node_id=1 argument at the end of the command, but when I ran that command manually I got:
[root@kvm2 ~]# drbdsetup new-resource gw3
Missing argument 'node_id'
drbdsetup new-resource - Create a new resource.

USAGE: drbdsetup new-resource {resource} {node_id}
    [--cpu-mask=<str>] [--on-no-data-accessible={io-error|suspend-io}] [--auto-promote={yes|no}] [--peer-ack-window=(2048 ... 204800)] [--peer-ack-delay=(1 ... 10000)] [--twopc-timeout=(50 ... 600)] [--twopc-retry-timeout=(1 ... 50)]
    [--auto-promote-timeout=(0 ... 600)] [--max-io-depth=(4 ... 4294967295)] [--quorum={off|majority|all}|(1 ... 32)] [--on-no-quorum={io-error|suspend-io}] [--quorum-minimum-redundancy={off|majority|all}|(1 ... 32)]

[root@kvm2 ~]# drbdsetup new-resource gw3 1
gw3: Invalid argument

So drbdsetup rejected the command when node_id was left out and also failed when node_id was specified, which is clearly a problem given that the syntax was correct. After booting the latest kernel and kmod package and then upgrading to drbd90-utils-9.13.1-1 (the drbdadm and drbdsetup utilities reside in that package), this was no longer a problem:
[root@kvm2 ~]# drbdadm down gw3

[root@kvm2 ~]# drbdsetup new-resource gw3 1
[root@kvm2 ~]# echo $?
0

Once I'd verified that the drbdsetup sub-command was working without error, I then tried its calling command 'drbdadm up', and it too worked without issues:
[root@kvm2 ~]# drbdadm down gw3

[root@kvm2 ~]# drbdadm up gw3

[root@kvm2 ~]# grep -i drbd /var/log/messages # relevant snippet below
Jun 24 10:08:06 kvm2 kernel: drbd: registered as block device major 147
Jun 24 10:08:06 kvm2 drbd[588458]: Starting DRBD resources: /lib/drbd/drbd: line 148: /var/lib/linstor/loop_device_mapping: No such file or directory
Jun 24 10:08:06 kvm2 drbd[588458]: [
Jun 24 10:08:06 kvm2 drbd[588458]: create res: gw3
Jun 24 10:08:06 kvm2 kernel: drbd gw3: Starting worker thread (from drbdsetup [588466])
Jun 24 10:08:06 kvm2 drbd[588458]: prepare disk: gw3
Jun 24 10:08:06 kvm2 kernel: drbd gw3 kvm1: Starting sender thread (from drbdsetup [588470])
Jun 24 10:08:06 kvm2 drbd[588458]: prepare net: gw3
Jun 24 10:08:06 kvm2 drbd[588458]: prepare net: gw3
Jun 24 10:08:06 kvm2 drbd[588458]: adjust peer_devices: gw3
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: meta-data IO uses: blk-bio
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: disk( Diskless -> Attaching )
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: Maximum number of peer devices = 1
Jun 24 10:08:06 kvm2 kernel: drbd gw3: Method to ensure write ordering: drain
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: drbd_bm_resize called with capacity == 104854328
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: resync bitmap: bits=13106791 words=204794 pages=400
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: size = 50 GB (52427164 KB)
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: recounting of set bits took additional 1ms
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: disk( Attaching -> Outdated )
Jun 24 10:08:06 kvm2 kernel: drbd gw3/0 drbd1: attached to current UUID: 5CC3B789B5B565F0
Jun 24 10:08:06 kvm2 drbd[588458]: adjust disk: gw3
Jun 24 10:08:06 kvm2 drbd[588458]: attempt to connect: gw3
Jun 24 10:08:06 kvm2 drbd[588458]: ]

That warning "Starting DRBD resources: /lib/drbd/drbd: line 148: /var/lib/linstor/loop_device_mapping: No such file or directory" is new following the drbd90-utils upgrade, but it did not affect my particular configuration at all (verified on multiple DRBD setups). It could possibly be a problem for someone else using that functionality, though.
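
Since the working combination in this ticket includes drbd90-utils 9.13.1, a quick version check may help when auditing other hosts. The sketch below is an assumption-laden illustration: `ver_ge` is a made-up helper, it depends on GNU `sort -V`, and treating 9.13.1 as a minimum is inferred from this thread rather than taken from any documented requirement.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is >= version $2.
# Relies on GNU sort -V (the same option used with rpm -qa earlier in this report).
ver_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 9.13.1 is the drbd90-utils version reported as working in this ticket;
# using it as a threshold is an assumption, not a documented requirement.
min_utils=9.13.1

if command -v rpm >/dev/null 2>&1; then
    have=$(rpm -q --qf '%{VERSION}\n' drbd90-utils 2>/dev/null) || have=""
    if [ -z "$have" ]; then
        echo "drbd90-utils not installed"
    elif ver_ge "$have" "$min_utils"; then
        echo "drbd90-utils $have is at or above $min_utils"
    else
        echo "drbd90-utils $have is older than $min_utils"
    fi
fi
```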

toracat

2020-06-24 18:17

administrator   ~0007013

@chansen

Thanks for your detailed report and the note about a possible/potential issue with the -utils package.

If there are no further problems, I will close the ticket as 'resolved'.

cat5dm

2020-06-26 16:27

reporter   ~0007022

Thanks for solving this issue so quickly.
Tested this in a full HA environment including pacemaker and corosync.
Works fine.
The warning @chansen mentions is generated only when running in daemon mode. I haven't seen it in the HA setup I was testing.

Thanks again

toracat

2020-06-26 17:10

administrator   ~0007023

Excellent!

Thanks, both, for the thorough testing. I am now closing this ticket.

Issue History

Date Modified Username Field Change
2020-06-23 10:25 cat5dm New Issue
2020-06-23 10:25 cat5dm Status new => assigned
2020-06-23 10:25 cat5dm Assigned To => pperry
2020-06-23 10:45 toracat Note Added: 0006988
2020-06-23 13:33 chansen Note Added: 0006990
2020-06-23 13:43 chansen Note Added: 0006991
2020-06-23 13:43 toracat Note Added: 0006992
2020-06-23 14:15 toracat Note Added: 0006993
2020-06-23 14:17 cat5dm Note Added: 0006994
2020-06-23 14:21 cat5dm Note Added: 0006995
2020-06-23 14:27 toracat Note Added: 0006996
2020-06-23 14:45 cat5dm Note Added: 0006997
2020-06-23 14:46 chansen Note Added: 0006998
2020-06-23 14:57 chansen Note Added: 0006999
2020-06-23 15:04 toracat Note Added: 0007000
2020-06-23 15:08 chansen Note Added: 0007001
2020-06-23 15:19 toracat Note Added: 0007002
2020-06-23 15:44 chansen Note Added: 0007003
2020-06-23 16:00 chansen Note Added: 0007004
2020-06-23 16:08 toracat Note Added: 0007005
2020-06-24 01:40 toracat Note Edited: 0007005
2020-06-24 01:42 toracat Note Added: 0007006
2020-06-24 01:43 toracat Assigned To pperry => toracat
2020-06-24 02:43 cat5dm Note Added: 0007007
2020-06-24 09:18 chansen Note Added: 0007008
2020-06-24 10:31 toracat Note Added: 0007010
2020-06-24 12:21 chansen Note Added: 0007012
2020-06-24 18:17 toracat Note Added: 0007013
2020-06-26 16:27 cat5dm Note Added: 0007022
2020-06-26 17:10 toracat Note Added: 0007023
2020-06-26 17:11 toracat Status assigned => resolved
2020-06-26 17:11 toracat Resolution open => fixed