I have been running a 3-disk RAID1 btrfs array for a few weeks now and
have simulated more than a dozen power failures without issue. Yesterday I
decided to add my 4th disk, which was the original media drive.
I was able to add it to the array easily enough and then started a full
balance. However, while I was trying to squeeze the case panels back on
(SATA power cables are stiff!), two of the drives (sdc, sdd) lost power. I
didn't notice the problem for a few minutes, until I looked at the storage
section of Cockpit.
I immediately initiated a shutdown and corrected the issue by rerouting the
power cables a bit.
After booting back up I tried to restart the balance, but it would run
briefly and then stop. dmesg shows:
[ 155.497940] BTRFS info (device sdc): balance: start -d -m -s
[ 155.499804] BTRFS info (device sdc): relocating block group 10451967541248 flags data|raid1
[ 162.517172] BTRFS info (device sdc): found 15 extents, stage: move data extents
[ 162.776081] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[ 162.776087] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[ 162.776090] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[ 162.855784] BTRFS info (device sdc): balance: ended with status: -117
I ran a scrub, and a little over 5 hours later it reported 4
corrections:
# btrfs scrub start -B /mnt/media
scrub done for d9a2a011-77a2-43be-acd1-c9093d32125b
Scrub started:    Tue Apr 13 17:09:51 2021
Status:           finished
Duration:         5:20:52
Total to scrub:   7.34TiB
Rate:             399.47MiB/s
Error summary:    verify=4
  Corrected:      4
  Uncorrectable:  0
  Unverified:     0
WARNING: errors detected during scrubbing, corrected
However, when I try to balance again I still get the same errors:
[19480.281381] BTRFS info (device sdc): balance: start -d -m -s
[19480.298196] BTRFS info (device sdc): relocating block group 10451967541248 flags data|raid1
[19485.587630] BTRFS info (device sdc): found 15 extents, stage: move data extents
[19485.801278] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[19485.801287] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[19485.801288] BTRFS error (device sdc): tree level mismatch detected, bytenr=10359313989632 level expected=0 has=2
[19485.865739] BTRFS info (device sdc): balance: ended with status: -117
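Both balance attempts complain about the same tree block. As a quick check, here is a minimal Python sketch (not part of any btrfs tooling) that pulls the distinct mismatch records out of dmesg text like the above. The function name mismatch_blocks and the regular expression are illustrative assumptions based only on the log lines shown here, not a documented log format.

```python
import re

# Pattern for the "tree level mismatch" lines as they appear in the dmesg
# output above. This is an assumption derived only from those lines; \s+
# lets it match whether or not the mail client wrapped the line.
MISMATCH_RE = re.compile(
    r"tree level mismatch detected,\s+"
    r"bytenr=(\d+)\s+level expected=(\d+)\s+has=(\d+)"
)


def mismatch_blocks(dmesg_text):
    """Return distinct (bytenr, expected_level, actual_level) tuples,
    in order of first appearance."""
    seen = []
    for match in MISMATCH_RE.finditer(dmesg_text):
        record = tuple(int(group) for group in match.groups())
        if record not in seen:
            seen.append(record)
    return seen


# Sample text copied from the second balance attempt above.
sample = """\
[19485.801278] BTRFS error (device sdc): tree level mismatch detected,
bytenr=10359313989632 level expected=0 has=2
[19485.801287] BTRFS error (device sdc): tree level mismatch detected,
bytenr=10359313989632 level expected=0 has=2
[19485.801288] BTRFS error (device sdc): tree level mismatch detected,
bytenr=10359313989632 level expected=0 has=2
"""

print(mismatch_blocks(sample))  # [(10359313989632, 0, 2)]
```

Run against the full dmesg output, this should show whether more than one tree block is affected; in the excerpts above every error refers to bytenr 10359313989632.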
Where do I go from here? Googling "tree level mismatch detected" shows
mostly source code results.
Thanks,
Richard