I ran compsize on my Debian box. Most files on my btrfs drive are around 20 GB. Almost all are uncompressed. I have 6000 files and 221000 regular extents.

Is that too much fragmentation? The ideal case is 1 extent per file.

I am reading around 100 MiBps from the drive out of a theoretical max of ~119 MiBps on a 1 Gbps line.
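
For reference, the ~119 MiBps ceiling follows directly from the line rate. A minimal sketch of the arithmetic, ignoring protocol overhead (the function name is just illustrative):

```python
# Convert a nominal link rate in bits per second to MiB/s (1 MiB = 2**20 bytes).
# Protocol overhead (Ethernet, IP, TCP, SMB/NFS) is ignored, so real transfers
# will land somewhat below this figure.

def link_rate_mib_per_s(bits_per_second: float) -> float:
    bytes_per_second = bits_per_second / 8
    return bytes_per_second / 2**20

gigabit = link_rate_mib_per_s(1_000_000_000)
print(f"1 Gbps raw line rate: {gigabit:.1f} MiB/s")   # 119.2 MiB/s
print(f"100 MiB/s is {100 / gigabit:.0%} of that")    # 84%
```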

edit: On a local read (pv /path/to/file >/dev/null) I am getting 130-150 MiBps, which exceeds what the 1 Gbps network can carry.

edit 2: For reference, this is a WD Red 6TB drive from around 2018-2020. Max speed should be in the 200 - 250 MBps range.

I defragged a ~300 GB folder and deleted some unneeded files. Extents per file actually went up, but I think that’s because the remaining files are heavily fragmented (many 70+ extents per file). Somewhat surprisingly, most/all of the defragged files still had 3-10 extents. Each file is under 2 GB.

Before: ~35 extents per file. After: 55 extents per file.

compsize /path/to/folder
Processed 2648 files, 145287 regular extents (145287 refs), 1 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       99%      1.5T         1.5T         1.5T
none       100%      1.5T         1.5T         1.5T
zstd        19%      236M         1.1G         1.1G
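
The "55 extents per file" figure can be recomputed from the first line of that output. A small sketch, assuming the summary line keeps the format shown above (the helper name is made up for illustration):

```python
import re

def extents_per_file(summary_line: str) -> float:
    """Average regular extents per file from compsize's 'Processed ...' line."""
    match = re.search(r"Processed (\d+) files?, (\d+) regular extents", summary_line)
    if match is None:
        raise ValueError("not a compsize summary line")
    files, extents = int(match.group(1)), int(match.group(2))
    return extents / files

line = "Processed 2648 files, 145287 regular extents (145287 refs), 1 inline."
print(f"{extents_per_file(line):.1f} extents per file")  # 54.9
```

The same division on the whole-drive numbers (221000 extents over 6000 files) gives about 37 extents per file.
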
  • just_another_person@lemmy.world
    8 hours ago

    There is no “normal” amount of fragmentation on modern filesystems that do things like CoW. That’s kind of the point.

    If you’re reading and writing large files with a consistent amount of I/O, you’re going to have a higher amount of fragmentation because of the nature of CoW. This is by design. This doesn’t mean anything is wrong with the filesystem, just that peak performance soon after writing is not achieved. Btrfs and ZFS do online defrag and deferred scheduling of tasks for it to allow for EVENTUAL consistency as far as contiguous block forms go. The more free space you have, the sooner it will become cleaner.

    • BigHeadMode@lemmy.frozeninferno.xyzOP
      4 hours ago

      Do you know the implications for wear on the drive (related to TBW [total bytes written] and write amplification) on a CoW / highly fragmented drive? I figure if they are planning on writing once, then moving the whole file to an empty space, that means ~2x the writes.

      • just_another_person@lemmy.world
        2 hours ago

        I don’t think it could possibly be measured because it’s something like: (file size ÷ block size) * num_writes

        So it entirely depends on the types of files, how often you’re writing to disk, etc. I just wouldn’t worry about it. If you REALLY want to estimate the tax: use iostat to check the number of writes on the drive in the last 24 hours, THEN enable online defrag and check it again in 24 hours. See what the difference is.

        It really doesn’t matter for HDDs though. It probably barely matters for SSDs.
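
The before/after comparison suggested in this comment could be sketched as follows. Instead of calling iostat, this reads the kernel counters from /proc/diskstats, which is where iostat gets its numbers. The device name `sda` and the two snapshot lines are hypothetical sample data, not measurements from the drive discussed here.

```python
# /proc/diskstats counts 512-byte sectors regardless of the physical sector size.
SECTOR_BYTES = 512

def bytes_written(diskstats_text: str, device: str) -> int:
    """Total bytes written to `device` since boot, per a /proc/diskstats dump."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then counters; index 9 is sectors written.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9]) * SECTOR_BYTES
    raise KeyError(device)

def written_between(before: str, after: str, device: str) -> int:
    """Bytes written between two snapshots taken without a reboot in between."""
    return bytes_written(after, device) - bytes_written(before, device)

# Hypothetical snapshots, e.g. taken 24 hours apart.
before = "   8       0 sda 1000 0 80000 500 2000 0 400000 900 0 700 1400"
after = "   8       0 sda 1500 0 90000 600 2600 0 600000 1200 0 900 1800"

delta = written_between(before, after, "sda")
print(f"{delta / 2**20:.1f} MiB written")  # 97.7 MiB written
```

On a real system each snapshot would come from `open("/proc/diskstats").read()`. The counters reset at boot, so both snapshots of a pair need to be taken within the same uptime.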

    • Static_Rocket@lemmy.world
      6 hours ago

      Note: BTRFS defrag will result in a different copy at the end of the day. If you’re using snapshots this will lead to increased utilization.

    • non_burglar@lemmy.world
      8 hours ago

      Btrfs and ZFS do online defrag

      News to me for ZFS. Are you talking about the recently implemented rewrite? Because “defrag” isn’t really what that does; it simply consolidates metaslab data to (possibly) free up low-use blocks.

      Using ZFS fragmentation profile import/export and/or enabling dynamic gang headers can certainly help with high fragmentation.

    • BigHeadMode@lemmy.frozeninferno.xyzOP
      9 hours ago

      Btrfs … do online defrag and deferred scheduling of tasks for it to allow for EVENTUAL consistency as far as contiguous block forms go. The more free space you have, the sooner it will become cleaner.

      Based on my research, this has to be requested by the user/OS. I don’t think Openmediavault (Debian derivative) enables auto defrag out of the box. If I set this up manually, I definitely didn’t enable it. You can run a defrag command online (while the system is running) or set the mount to do it automatically.
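
One way to check whether a mount has automatic defragmentation turned on is to look for the `autodefrag` option in /proc/mounts. A rough sketch; the device names and mount points in the sample text are invented for illustration:

```python
def btrfs_autodefrag_status(mounts_text: str) -> dict[str, bool]:
    """Map each btrfs mount point to whether 'autodefrag' is among its options."""
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts layout: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[2] == "btrfs":
            options = fields[3].split(",")
            status[fields[1]] = "autodefrag" in options
    return status

# Hypothetical /proc/mounts content.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /srv/data btrfs rw,relatime,space_cache=v2,subvolid=5,subvol=/ 0 0
/dev/sdc1 /srv/scratch btrfs rw,relatime,autodefrag,space_cache=v2 0 0
"""

print(btrfs_autodefrag_status(sample))
# {'/srv/data': False, '/srv/scratch': True}
```

On a live system the input would be `open("/proc/mounts").read()`. A mount point reporting False only means the option is absent; a manual `btrfs filesystem defragment` run is still possible there.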

      • just_another_person@lemmy.world
        8 hours ago

        It should be a default, but I can see why it would be disabled for SSDs to prevent using cycles unnecessarily. If you’re using HDDs, check and see if it’s enabled.

        Either way, unless you’re REALLY needing some minor performance improvements out of your disks, it shouldn’t make a huge difference.