Btrfs文件系统修复日记

记录作死之后的力挽狂澜的失败

Posted by wszqkzqk on May 7, 2023
本文字数:4387

前言

笔者最近在自己备用盘的Btrfs分区上疯狂作死,先是往同一个里面塞了5个Linux发行版,然后又把整个硬盘设备直接作为虚拟机磁盘使用,启动其中的Linux系统,又尝试在虚拟机的Windows中使用winbtrfs读取其中的内容,又在NTFS分区中安装Linux并将boot分区放在Btrfs中,还尝试同时在宿主机和虚拟机中挂载同一个Btrfs分区,最后在对外置的Btrfs分区有读写时尝试休眠宿主机,这时休眠失败,被迫强制关机,再次启动时,备用盘的Btrfs分区终于无法挂载了。(这一段混乱邪恶的东西塞得有点满)

P.S. 休眠失败,被迫强制关机可能是损坏的真正原因,笔者之前也有这样的经历。

错误日志

通过mount命令、udisk或者Dolphin等GUI文件管理器挂载时显示错误:

mount: /mnt: 文件系统类型错误、选项错误、/dev/sda3 上有坏超级块、缺少代码页或帮助程序或其他错误.
       dmesg(1) may have more information after failed mount system call.

可以用dmesg查看具体内容:

[29168.901277] BTRFS: device label MultiLinux devid 1 transid 2812 /dev/sda3 scanned by (udev-worker) (22938)
[29177.831978] BTRFS info (device sda3): using crc32c (crc32c-intel) checksum algorithm
[29177.831988] BTRFS info (device sda3): using free space tree
[29177.834719] BTRFS info (device sda3): bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 24151150, gen 0
[29177.835597] BTRFS error (device sda3): level verify failed on logical 18941853696 mirror 1 wanted 1 found 0
[29177.835811] BTRFS error (device sda3): level verify failed on logical 18941853696 mirror 2 wanted 1 found 0
[29177.835820] BTRFS error (device sda3): failed to read block groups: -5
[29177.836119] BTRFS error (device sda3): open_ctree failed

排查与修复

超级块

笔者首先加了recoveryrousebackuproot等恢复性挂载参数,均没有作用,又尝试检查修复超级块,执行:

sudo btrfs rescue super-recover -v /dev/sda3

发现超级块并没有错误:

All Devices:
        Device: id = 1, name = /dev/sda3

Before Recovering:
        [All good supers]:
                device name = /dev/sda3
                superblock bytenr = 65536

                device name = /dev/sda3
                superblock bytenr = 67108864

                device name = /dev/sda3
                superblock bytenr = 274877906944

        [All bad supers]:

All supers are valid, no need to recover

修复chunk

笔者尝试修复chunk

sudo btrfs rescue chunk-recover -v /dev/sda3

在输出了冗长的日志后,显示:

Total Chunks:           72
  Recoverable:          72
  Unrecoverable:        0

Orphan Block Groups:

Orphan Device Extents:

Check chunks successfully with no orphans
Chunk tree recovered successfully

笔者以为分区修好了,尝试去挂载,结果又:

mount: /mnt: 文件系统类型错误、选项错误、/dev/sda3 上有坏超级块、缺少代码页或帮助程序或其他错误.
       dmesg(1) may have more information after failed mount system call.

dmesg输出也与之前一样:

[29168.901277] BTRFS: device label MultiLinux devid 1 transid 2812 /dev/sda3 scanned by (udev-worker) (22938)
[29177.831978] BTRFS info (device sda3): using crc32c (crc32c-intel) checksum algorithm
[29177.831988] BTRFS info (device sda3): using free space tree
[29177.834719] BTRFS info (device sda3): bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 24151150, gen 0
[29177.835597] BTRFS error (device sda3): level verify failed on logical 18941853696 mirror 1 wanted 1 found 0
[29177.835811] BTRFS error (device sda3): level verify failed on logical 18941853696 mirror 2 wanted 1 found 0
[29177.835820] BTRFS error (device sda3): failed to read block groups: -5
[29177.836119] BTRFS error (device sda3): open_ctree failed

使用btrfs restore导出数据

在以上修复失败后,笔者尝试使用btrfs restore导出数据,执行:

sudo btrfs restore -v /dev/sda3 /var/tmp

然而,这样导出的数据非常不完整,很多子卷直接没有,几乎没有什么恢复作用。

使用btrfs check --repair检查与恢复

笔者尝试使用btrfs check --repair检查与恢复,执行:

sudo btrfs check --repair /dev/sda3

然而,修复并不可行:

enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. E.g.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda3
UUID: bfaeb419-b426-4d5d-ad54-9fbcfdf3e6b5
[1/7] checking root items
Error: could not find btree root extent for root 257
ERROR: failed to repair root items: No such file or directory

使用btrfs check --init-extent-tree初始化extent tree

笔者尝试使用btrfs check --init-extent-tree初始化extent tree,执行:

sudo btrfs check --init-extent-tree /dev/sda3

命令执行了很长时间,笔者因为中途有事,用Ctrl+C终止了命令,此时发现文件系统已经可以挂载了,执行btrfs scrub也没有报错,但是文件系统中的数据有大量缺失,此时使用btrfs check --repair修复时仍然提示:

enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. E.g.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda3
UUID: bfaeb419-b426-4d5d-ad54-9fbcfdf3e6b5
[1/7] checking root items
Error: could not find btree root extent for root 257
ERROR: failed to repair root items: No such file or directory

笔者再次尝试使用btrfs check --init-extent-tree修复文件系统。

在经过漫长的等待后,最终输出:

failed to repair damaged filesystem, aborting

之后运行btrfs check,输出:

ERROR: errors found in fs roots
found 65950724096 bytes used, error(s) found
total csum bytes: 61618052
total tree bytes: 2531147776
total fs tree bytes: 2419032064
total extent tree bytes: 32620544
btree space waste bytes: 408141112
file data blocks allocated: 125725290496
 referenced 103724642304

到目前为止,笔者已经尝试的所有修复方法均失败。