DevHeads.net

encrypted swap with urandom key problem

Hi,

I'm having a heck of a time trying to troubleshoot an indefinite hang
on startup due to setting up crypto swap on one of my laptops. At this
point it seems computer specific. I can't reproduce it on two other
(dissimilar) computers or any qemu-kvm VM.

Without any debugging enabled, the problem happens perhaps 1 in 10
boots. With debugging (systemd.log_level=debug rd.udev.debug
systemd.debug-shell=1) I can reverse that, about 9 in 10 boots hang.
But there still isn't enough information why.

The first hint of a problem is this huge delay in the journal for cryptsetup:

[ 11.464519] flap.local systemd[577]: Operating on architecture: x86-64
[ 21.710149] flap.local systemd-cryptsetup[606]: Set cipher aes,
mode xts-plain64, key size 256 bits for device
/dev/disk/by-partuuid/688b193f-3b38-4ca5-8b65-2ef61f27ec83.
...
[ 21.777721] flap.local systemd[606]:
systemd- ... at cryptswap dot service: Executing:
/usr/lib/systemd/systemd-cryptsetup attach cryptswap
/dev/disk/by-partuuid/688b193f-3b38-4ca5-8b65-2ef61f27ec83
/dev/urandom swap,cipher=aes-xts-plain64,size=256
...
[ 22.732131] flap.local systemd[1]: Child 606 (systemd-cryptse) died
(code=exited, status=0/SUCCESS)

So cryptsetup does succeed. In the early debug shell, I can see it
exist with dmsetup and blkid also shows the /dev/mapper/cryptoswap
device is already formatted swap, so the mkswap command likewise
succeeded.

[ 22.732726] flap.local systemd[725]:
systemd- ... at cryptswap dot service: Executing: /sbin/mkswap
/dev/mapper/cryptswap
...
[ 22.742728] flap.local systemd[1]: Child 725 (mkswap) died
(code=exited, status=0/SUCCESS)

For whatever reason, swapon never happens.

At this point, while in early debug shell, I issue 'systemctl
list-jobs' I get many services 'waiting' with one device stuck
running.

126 dev-mapper-cryptswap.device start running

If at this time I manually 'swapon /dev/mapper/cryptswap' the command
works, and startup resumes.

Anyway I'm stuck, any ideas how to get more information? If it's
hardware specific, could it be a wrong dependency on hwrng and maybe
there just isn't enough entropy? Hence the cryptsetup delay? And then
that delay results in some other race that hangs swapon? How would I
go about showing there is or isn't enough entropy at the time
cryptsetup executes?

I've updated this bug with the latest logs and findings which should
all match the pid and timestamps in this email.

<a href="https://bugzilla.redhat.com/show_bug.cgi?id=1691589" title="https://bugzilla.redhat.com/show_bug.cgi?id=1691589">https://bugzilla.redhat.com/show_bug.cgi?id=1691589</a>

Comments

Re: encrypted swap with urandom key problem

By Chris Murphy at 06/08/2019 - 17:34

Best as I can tell, for some reason the failure case, systemd doesn't
see the formatted swap device appear.

In a working case it looks like this:
[ 23.310555] flap.local systemd[1]:
dev-disk-by\x2duuid-dcae3053\x2d1cc2\x2d4890\x2da33b\x2d6d71b3dc97df.device:
Changed dead -> plugged
[ 23.310639] flap.local systemd[1]:
dev-disk-by\x2did-dm\x2dname\x2dcryptswap.device: Changed dead ->
plugged
[ 23.310681] flap.local systemd[1]: dev-mapper-cryptswap.device:
Changed dead -> plugged
[ 23.310724] flap.local systemd[1]: dev-mapper-cryptswap.device: Job
165 dev-mapper-cryptswap.device/start finished, result=done
[ 23.310765] flap.local systemd[1]: Found device /dev/mapper/cryptswap.
[ 23.312699] flap.local systemd[1]:
dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dPLAIN\x2dcryptswap.device:
Changed dead -> plugged
[ 23.312769] flap.local systemd[1]: dev-dm\x2d0.device: Changed dead
-> plugged
[ 23.312814] flap.local systemd[1]:
sys-devices-virtual-block-dm\x2d0.device: Changed dead -> plugged
[ 23.313594] flap.local systemd[1]: dev-mapper-cryptswap.swap: About
to execute: /sbin/swapon -o nofail /dev/mapper/cryptswap
[ 23.314123] flap.local systemd[1]: dev-mapper-cryptswap.swap:
Forked /sbin/swapon as 710
[ 23.314457] flap.local systemd[1]: dev-mapper-cryptswap.swap:
Changed dead -> activating
[ 23.314565] flap.local systemd[1]: Activating swap /dev/mapper/cryptswap...
[ 23.314831] flap.local systemd[710]: dev-mapper-cryptswap.swap:
Executing: /sbin/swapon -o nofail /dev/mapper/cryptswap
[ 23.322581] flap.local kernel: Adding 8388604k swap on
/dev/mapper/cryptswap. Priority:-2 extents:1 across:8388604k SSFS

However, in the failing case, that doesn't happen, and systemd hangs
indefinitely waiting for it to appear. And in the early debug shell,
'blkid' sees it. That means the kernel and libblkid see it. I've got
no idea why systemd doesn't see it.

Chris Murphy

Re: encrypted swap with urandom key problem

By Lennart Poettering at 06/17/2019 - 05:58

Wat does udevadm info on the block device node say? i.e. does the
device carry the right tag, does it have SYSTEMD_READY=0 set?

Lennart

Re: encrypted swap with urandom key problem

By Chris Murphy at 06/18/2019 - 10:44

On Mon, Jun 17, 2019 at 3:58 AM Lennart Poettering < ... at 0pointer dot de> wrote:
Yes.

# udevadm info /dev/mapper/cryptswap
P: /devices/virtual/block/dm-0
N: dm-0
L: 0
S: disk/by-id/dm-name-cryptswap
S: mapper/cryptswap
S: disk/by-id/dm-uuid-CRYPT-PLAIN-cryptswap
E: DEVPATH=/devices/virtual/block/dm-0
E: DEVNAME=/dev/dm-0
E: DEVTYPE=disk
E: MAJOR=253
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=23080972
E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_ACTIVATION=1
E: DM_NAME=cryptswap
E: DM_UUID=CRYPT-PLAIN-cryptswap
E: DM_SUSPENDED=0
E: DM_UDEV_RULES_VSN=2
E: SYSTEMD_READY=0
E: DEVLINKS=/dev/disk/by-id/dm-name-cryptswap /dev/mapper/cryptswap
/dev/disk/by-id/dm-uuid-CRYPT-PLAIN-cryptswap
E: TAGS=:systemd:

systemctl list-jobs says dev-mapper-cryptswap.device is running,
everything else is waiting, but when I get a systemctl status on
dev-mapper-cryptswap.device I get

● dev-mapper-cryptswap.device - /dev/mapper/cryptswap
Loaded: loaded
Drop-In: /run/systemd/generator/dev-mapper-cryptswap.device.d
└─90-device-timeout.conf
Active: inactive (dead)

Jun 18 08:19:41 flap.local systemd[1]: dev-mapper-cryptswap.device:
Merged dev-mapper-cryptswap.device/start into installed job
dev-mapper-cryptswap.device/start as 158
Jun 18 08:19:41 flap.local systemd[1]: dev-mapper-cryptswap.device:
Merged dev-mapper-cryptswap.device/start into installed job
dev-mapper-cryptswap.device/start as 158