
Sometimes shitty things happen and you need to dig into the insides of ZFS. To this end there is the zdb command; some of its options are presented below, and its use is then detailed in a couple of practical examples.

Configuration

Lists the pool configuration with the attached devices (vdevs):

Pool configuration
zdb -C tank
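
As a side note, zdb -C without a pool name should display the cached configuration of every imported pool (taken from the zpool.cache file), and the -e option should let zdb read the configuration of an exported pool directly from the labels on disk:

zdb -C            # cached configuration of all the imported pools
zdb -e -C tank    # configuration of an exported pool, read from the labels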

Dataset

Lists information about the various datasets (clones, snapshots, …); this information includes the name, id number, last transaction number, used space, and object count:

Information about datasets
zdb -d tank
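
The -d option can also be repeated to increase verbosity, and a dataset can be given instead of the whole pool; something like the following (reusing the tank/data dataset that appears later in the examples) should list the objects it contains:

zdb -dd tank/data    # more detail, limited to a single dataset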

Labels

Labels correspond to the configuration description of the pool, but they are stored on each disk and include disk-specific information. There are 4 labels per disk: two at the beginning and two at the end. They should normally be identical.

List labels
zdb -l /dev/ad10
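
To quickly check that the two members of a mirror agree, their labels can be dumped and compared, for example as below (the temporary file names are arbitrary, the disks are those of the next section):

zdb -l /dev/ad10 > /tmp/ad10.labels
zdb -l /dev/ad12 > /tmp/ad12.labels
diff /tmp/ad10.labels /tmp/ad12.labels    # guid/path fields differ, the rest should match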

Corrupted mirror

A system crash prevented ZFS from completing its write operations on a mirror, which led to a difference in the metadata stored on the two disks. Consequently, the pool is no longer available because it is corrupted, even though each disk is online.

The goal now is to identify on which disk the metadata was only partially written, so that a healthy pool can be recovered by removing the corrupted disk. This somewhat abrupt approach only works well in the mirror case. To do this, the ZFS labels (metadata) are read to compare the transaction numbers (txg field) and determine the order in which the records were written.

zdb -l /dev/ad12
zdb -l /dev/ad10
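
Since only the txg values matter for this comparison, they can be filtered out of the label dump, for example:

zdb -l /dev/ad12 | grep txg    # transaction number of each label
zdb -l /dev/ad10 | grep txg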

In fact, while displaying the labels, the ad10 hard drive is immediately identified as the corrupted one: only two of its 4 labels are legible (two labels are written at the beginning of the disk and two others at the end).

The chosen rebuild process is to break the mirror and recreate it:

zpool detach tank ad10         # remove the corrupted disk from the mirror
zpool attach tank ad12 ad10    # re-attach it to the remaining disk
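
In the normal case, once the new disk is attached, the resilvering progress can be followed with:

zpool status tank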

This is where a mistake was made. Indeed, at the time (ZFS v15) the pool’s policy was to automatically increase its capacity if the drive capacity allowed it. Unfortunately the two disks, although of the same reference, did not have the same capacity; the transition to a single disk therefore increased the pool’s capacity, which later prevented the mirror from being rebuilt because the disk to be re-attached had become too small. It was therefore necessary to make a backup and rebuild the pool from scratch. Since then, the autoexpand attribute has been added and is set to off by default.
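
For the record, this behaviour can now be checked and changed per pool with the usual zpool commands:

zpool get autoexpand tank
zpool set autoexpand=on tank    # opt back in to the automatic capacity increase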

Unable to destroy snapshot

# zfs destroy tank/data@foobar
cannot destroy 'tank/data@foobar': dataset already exists

There was a bug (CR-6860996) leading to this type of problem: when receiving an incremental stream (zfs recv), a temporary clone, with the “%” character in its name, is created but not deleted automatically.

So we will look for this clone and destroy it explicitly:

Destroying the leftover clone
zdb -d tank | grep %                    # Looking for the clone
zfs destroy clone-with-%-in-the-name    # Destroying the clone
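
Once the temporary clone is gone, the original command should go through:

zfs destroy tank/data@foobar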