Shrinking XFS partition on LVM

Why

I learned that XFS checksums only its metadata, not the stored data, but dm-integrity can help with that. It can be used as a separate layer like dm-crypt, but if you happen to use LVM it can be added to an existing LV on the fly with lvconvert --raidintegrity y LV.

Here's the problem:

raspberrypi3:~ # lvconvert --raidintegrity y icy_box/data
  Volume group "icy_box" has insufficient free space (0 extents): 3832 required.

It needs some space. And if your XFS has already been given all available space, you're stuck. You can't shrink XFS, right?

You can shrink XFS!

Yes, it has been possible since Linux 5.12 (although only in a limited way, details below).

Of course it says it's experimental and will scream at you that it will eat your data. You have been warned. Let's continue anyway.

XFS structure in a nutshell

These are the stats of my partition (after some shrinking, if you're curious). I created it with mkfs.xfs without any extra options, just using the defaults.

raspberrypi3:~ # xfs_info /mnt/data/
meta-data=/dev/mapper/icy_box-data isize=512    agcount=32, agsize=45785360 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=1460908544, imaxpct=5
         =                       sunit=16     swidth=48 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

XFS keeps data in blocks. The block size in my case is 4096 bytes, as seen next to bsize=. The size of the whole XFS partition is the data block count (1460908544) * block size (4096). The log is internal, so it resides in the data section.
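To avoid doing that multiplication by hand, here is a minimal sketch of a helper that reads xfs_info output and prints the data section size in bytes. The function name xfs_data_bytes is made up for this post, and the parsing assumes the "data = bsize=... blocks=..." line layout shown above.

```shell
# Hypothetical helper: read `xfs_info` output on stdin and print
# data block count * block size, i.e. the data section size in bytes.
xfs_data_bytes() {
    awk '
        # Match only the "data     =   bsize=... blocks=..." line.
        $1 == "data" && $2 == "=" {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^bsize=/)  { bsize  = substr($i, 7) + 0 }
                if ($i ~ /^blocks=/) { blocks = substr($i, 8) + 0 }
            }
            # %.0f keeps the full integer instead of scientific notation.
            printf "%.0f\n", bsize * blocks
        }
    '
}
```

Used as xfs_info /mnt/data | xfs_data_bytes, this prints 5983881396224 for the numbers above, which is roughly 5.44 TiB.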

An XFS partition is internally split into allocation groups ("AGs" for short). As you may guess, agcount= is the number of AGs and agsize= is the size of a full AG in blocks.

This is what df says about it:

raspberrypi3:~ # df -B 4K /mnt/data
Filesystem                4K-blocks      Used  Available Use% Mounted on
/dev/mapper/icy_box-data 1460386816 286650528 1173736288  20% /mnt/data

Number of data blocks (1460908544) - number of log blocks (521728) == number of blocks visible in df output (1460386816).

OK, so how do I shrink my XFS filesystem?

xfs_growfs -D ${new_number_of_data_blocks} /mnt/data

Currently, shrinking can only remove unused space from the last AG. Even if you reduce the last AG to 0, you won't get any further.
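Since only the last AG can give up space, its current size is an upper bound on what shrinking can free. A minimal sketch of that arithmetic follows. The function name last_ag_blocks is made up here, and the real limit will be lower whenever part of the last AG is in use.

```shell
# Hypothetical helper: print how many blocks are currently in the last AG.
# Arguments: total data blocks, agsize (both in filesystem blocks),
# as reported by xfs_info in blocks= and agsize=.
last_ag_blocks() {
    lab_blocks=$1
    lab_agsize=$2
    lab_rem=$(( lab_blocks % lab_agsize ))
    # A remainder of 0 means the last AG is a full-sized one.
    if [ "$lab_rem" -eq 0 ]; then
        lab_rem=$lab_agsize
    fi
    echo "$lab_rem"
}
```

For my filesystem, last_ag_blocks 1460908544 45785360 prints 41562384, so no value passed to -D below 1460908544 - 41562384 = 1419346160 has any chance of working.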

According to Jan Engelhardt, doing it in small steps helps a bit:

it cannot be shrunk all at once, but if you loop it:

# xfs_growfs -D 50335744 /mnt
[EXPERIMENTAL] try to shrink unused space 50335744, old size is 67108864
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: No space left on device

# for ((i=67108864;i>=0;i-=4096)); do xfs_growfs -D $i /mnt || break; done

it will run for a while until 50331648 or something.
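The quoted loop counts down towards 0 and relies on the first failure to stop. Here is a sketch of a variant with an explicit floor that also reports the last size that was accepted, so you don't have to dig it out of the scrollback. The function name shrink_in_steps and its arguments are my own invention; only xfs_growfs -D comes from the commands above.

```shell
# Hypothetical helper: shrink step by step until xfs_growfs refuses
# or the floor would be crossed.
# Arguments: mount point, current data blocks, floor (smallest size to try),
#            step (all sizes in filesystem blocks).
# Prints the last size that xfs_growfs accepted.
shrink_in_steps() {
    sis_mnt=$1
    sis_size=$2
    sis_floor=$3
    sis_step=$4
    while [ $(( sis_size - sis_step )) -ge "$sis_floor" ]; do
        # Send xfs_growfs chatter to stderr so stdout carries only the result.
        xfs_growfs -D $(( sis_size - sis_step )) "$sis_mnt" >&2 || break
        sis_size=$(( sis_size - sis_step ))
    done
    echo "$sis_size"
}
```

A call could look like shrink_in_steps /mnt 67108864 0 4096. A floor of (agcount - 1) * agsize is a reasonable choice, because shrinking cannot go below that anyway.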

How do I shrink LV to correct size?

lvreduce --size -${number_of_freed_bytes}b icy_box/data

Number of freed bytes is obviously number of freed blocks * block size.

Let's say you ran xfs_growfs in a loop as shown above and forgot how big your partition was at the start. How do you get the number of freed blocks?

Try to grow the partition to some arbitrary big number of blocks (1466000000 in my case) with the -n (no change) flag.

raspberrypi3:~ # xfs_growfs -n -D 1466000000 /mnt/data/
meta-data=/dev/mapper/icy_box-data isize=512    agcount=32, agsize=45785360 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=1460908544, imaxpct=5
         =                       sunit=16     swidth=48 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data size 1466000000 too large, maximum is 1465132032

In my case the number of data blocks is 1460908544 and the maximum is 1465132032, so I freed 1465132032 - 1460908544 = 4223488 blocks.
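The same subtraction can be pulled straight out of that output. This is a minimal sketch, assuming the "maximum is N" wording and the "data = bsize=... blocks=..." line stay as shown above. The function name freed_space is made up for this post.

```shell
# Hypothetical helper: read the output of `xfs_growfs -n -D <too big>` on
# stdin and print "<freed blocks> <freed bytes>".
freed_space() {
    awk '
        # Current size: the "data     =   bsize=... blocks=..." line.
        $1 == "data" && $2 == "=" {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^bsize=/)  { bsize  = substr($i, 7) + 0 }
                if ($i ~ /^blocks=/) { blocks = substr($i, 8) + 0 }
            }
        }
        # Size the LV could hold: last field of the "too large" message.
        /maximum is/ { max = $NF + 0 }
        END { printf "%.0f %.0f\n", max - blocks, (max - blocks) * bsize }
    '
}
```

Run as xfs_growfs -n -D 1466000000 /mnt/data 2>&1 | freed_space (the 2>&1 is there in case the "too large" message goes to stderr), it prints 4223488 17299406848 for the output above. The second number is the byte count to hand to lvreduce.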

Here's how it went for me:

raspberrypi3:~ # lvreduce --size -17299406848b icy_box/data
  Rounding size to boundary between physical extents: 16.11 GiB.
  Rounding size 16.11 GiB (4124 extents) down to stripe boundary size 16.10 GiB (4122 extents).
  WARNING: Reducing active and open logical volume to 5.44 TiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce icy_box/data? [y/n]: y
  Size of logical volume icy_box/data changed from 5.46 TiB (1430793 extents) to 5.44 TiB (1426671 extents).
  Logical volume icy_box/data successfully resized.

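There was some mismatch between the XFS block size and LVM extents: lvreduce rounded my request down to whole extents and then down to a stripe boundary, so it removed 4122 extents, slightly less than I asked for.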

raspberrypi3:~ # xfs_growfs -n -D 1466000000 /mnt/data/
[...]
data     =                       bsize=4096   blocks=1460908544, imaxpct=5
[...]
data size 1466000000 too large, maximum is 1460911104

Now the LV has room for only 2560 more XFS blocks.
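Those 2560 blocks are what the rounding left behind. Below is a sketch of the arithmetic that reproduces the numbers, under two assumptions of mine: the extent size is 4 MiB (inferred from "16.11 GiB (4124 extents)") and the LV has 3 data stripes (inferred from swidth/sunit = 48/16). I have not checked how lvreduce computes this internally, and the function name lv_reduce_rounding is made up.

```shell
# Hypothetical helper that mimics the rounding seen in the lvreduce output.
# Arguments: requested reduction in bytes, extent size in bytes,
#            number of data stripes, filesystem block size in bytes.
# Prints "<extents removed> <filesystem blocks left over>".
lv_reduce_rounding() {
    lrr_bytes=$1
    lrr_extent=$2
    lrr_stripes=$3
    lrr_bsize=$4
    # Whole extents contained in the request.
    lrr_extents=$(( lrr_bytes / lrr_extent ))
    # Round down to a multiple of the stripe count.
    lrr_extents=$(( lrr_extents - lrr_extents % lrr_stripes ))
    # Whatever was requested but not removed, in filesystem blocks.
    lrr_left=$(( (lrr_bytes - lrr_extents * lrr_extent) / lrr_bsize ))
    echo "$lrr_extents $lrr_left"
}
```

lv_reduce_rounding 17299406848 4194304 3 4096 prints 4122 2560, which matches both the extent count reported by lvreduce and the leftover reported by xfs_growfs.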

Can I now enable raid integrity in LVM?

raspberrypi3:~ # lvconvert --raidintegrity y icy_box/data
  Insufficient free space: 3821 extents needed, but only 1374 available
  Failed to create integrity metadata LV
  Failed to add integrity.

I have no idea why after freeing 4122 extents only 1374 are available, but that's something for another time.