Bug 489971

Summary: Filelight wrongly computes space occupied by backintime snapshots
Product: [Applications] filelight Reporter: Xwang <xwaang1976>
Component: general Assignee: Unassigned bugs mailing-list <unassigned-bugs>
Status: RESOLVED NOT A BUG    
Severity: normal CC: martin.sandsmark, sitter
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Filelight output
The hdd is smaller than the space claimed as occupied by filelight !!!
Space correctly computed by dolphin

Description Xwang 2024-07-09 11:20:33 UTC
Created attachment 171498 [details]
Filelight output

SUMMARY

Filelight wrongly computes space occupied by backintime snapshots 

STEPS TO REPRODUCE
1.  Execute filelight on a folder containing multiple backintime snapshots
2.  Wait for filelight computation
3.  Compare the space reported by Filelight with the maximum capacity of the HDD or SSD

OBSERVED RESULT

Filelight reports more occupied space than the disk can hold (probably because it sums the size of every file in every snapshot, even though files that have not changed between snapshots all point to the same data)
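
The guess above can be illustrated with a minimal sketch. It assumes the snapshots share unchanged files through hard links (an assumption, the report does not say how the snapshots were created) and is not Filelight's code. `tree_size` and the `snap0`/`snap1`/`snap2` layout are hypothetical names for this example. Each inode is identified by its `(st_dev, st_ino)` pair and counted once; reflinked copies on btrfs have distinct inodes and would not be detected this way.

```python
import os
import tempfile


def tree_size(root, dedupe_hardlinks, size_of=lambda st: st.st_blocks * 512):
    """Sum file sizes under root; optionally count each inode only once.

    Directories themselves are not counted, only the files inside them.
    """
    seen = set()  # (device, inode) pairs that were already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if dedupe_hardlinks:
                if key in seen:
                    continue  # same inode reached through another hard link
                seen.add(key)
            total += size_of(st)
    return total


# Hypothetical layout: three "snapshots" that all hard-link one unchanged file.
with tempfile.TemporaryDirectory() as base:
    first = os.path.join(base, "snap0", "data.bin")
    os.makedirs(os.path.dirname(first))
    with open(first, "wb") as f:
        f.write(b"x" * 1000)
    for snap in ("snap1", "snap2"):
        os.makedirs(os.path.join(base, snap))
        os.link(first, os.path.join(base, snap, "data.bin"))

    def apparent(st):
        return st.st_size  # byte length, deterministic for this demo

    naive = tree_size(base, dedupe_hardlinks=False, size_of=apparent)
    deduped = tree_size(base, dedupe_hardlinks=True, size_of=apparent)

print(naive, deduped)  # 3000 1000
```

The demo sums `st_size` only so that the numbers are predictable; the default `size_of` uses allocated blocks instead, whose value depends on the filesystem.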

EXPECTED RESULT

Filelight should report only the space actually occupied on the disk

SOFTWARE/OS VERSIONS
Windows: 
macOS: 
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
Comment 1 Xwang 2024-07-09 11:22:12 UTC
Operating System: Arch Linux 
KDE Plasma Version: 6.1.2
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.2
Kernel Version: 6.9.8-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 12 × 12th Gen Intel® Core™ i5-1240U
Memory: 15.3 GiB of RAM
Graphics Processor: Mesa Intel® Graphics
Manufacturer: Dell Inc.
Product Name: Latitude 9330
Comment 2 Xwang 2024-07-09 11:23:59 UTC
Created attachment 171499 [details]
The hdd is smaller than the space claimed as occupied by filelight !!!
Comment 3 Harald Sitter 2024-07-09 12:12:54 UTC
It's my understanding that this information isn't available outside btrfs-specific ioctl commands (those would only work as root). So that is expected behavior on btrfs. I guess you could complain to the kernel folks about not providing a suitable API for getting the real/compressed/occupied size of a file.

For example, on Windows we can easily tell whether a file is sparse or unpinned (not present on disk, e.g. only available in the cloud), and then obtain the compressed size (the size on disk).

https://invent.kde.org/utilities/filelight/-/blob/cbc765ec2f23766f8f76a06d5f7753853a9d5127/src/windowsWalker.cpp
Comment 4 Xwang 2024-07-09 12:59:19 UTC
(In reply to Harald Sitter from comment #3)
> It's my understanding that this information isn't available outside btrfs
> specific ioctl commands (those would only work as root). So that is expected
> behavior on btrfs. I guess you could complain to the kernel folks about not
> providing suitable API for getting the real/compressed/occupied size of a
> file.
> 
> e.g. on windows we can easily tell if a file is sparse, unpinned (not
> present on disk -- e.g. only available in the cloud), and then obtain the
> compressed size (the size on disk). 
> 
> https://invent.kde.org/utilities/filelight/-/blob/
> cbc765ec2f23766f8f76a06d5f7753853a9d5127/src/windowsWalker.cpp

Sorry, I do not understand: how is it possible that the space occupied by the same folder is correctly computed in the folder's Properties dialog in Dolphin (as clearly shown in the second attached image)?
Comment 5 Xwang 2024-07-09 13:01:27 UTC
Created attachment 171502 [details]
Space correctly computed by dolphin
Comment 6 Harald Sitter 2024-07-09 13:20:59 UTC
Because it isn't the correct size. Dolphin shows you the theoretical size (e.g. a text file with 4 ASCII characters is 4 bytes in size in Dolphin), while Filelight shows the actual size on disk (e.g. a text file with 4 ASCII characters actually occupies multiple KiB, because files are allocated in blocks, so even a file smaller than a block still occupies at least that one block). The advantage of the former is that the filesystem can freely lie about the size. The trouble is that Filelight doesn't need a lie: it needs to know the actual properties of a file (is it real, does it exist on disk, is it compressed) to then determine which kind of size to report. And Linux has no API for that as far as I know.
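
The two notions of size described above can be read from a single `stat()` call. The following is a minimal sketch, not code from Dolphin or Filelight; `sizes` and `four.txt` are hypothetical names. `st_size` is the byte length of the file, and `st_blocks` counts allocated 512-byte units on Linux.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    # st_blocks is expressed in 512-byte units regardless of the
    # filesystem's own block size.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "four.txt")
    with open(path, "w") as f:
        f.write("abcd")        # 4 ASCII characters
        f.flush()
        os.fsync(f.fileno())   # ask the kernel to write the data out
    apparent, allocated = sizes(path)

# apparent is 4; allocated depends on the filesystem (often one full block).
print(apparent, allocated)
```

As far as I understand, `st_blocks` is reported per file, so it does not reveal whether those blocks are shared with another reflinked file, which is the gap described in the comment above.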
Comment 7 Harald Sitter 2024-07-09 13:30:14 UTC
Oh BTW, there is also a visual design problem here.

Say you have the directory @0 that looks like this:

@0
├── bar
├── foo
└── meow

Let's further assume there is a reflinked copy @1 and a reflinked copy @2 of @1.

@1 has bar modified.
@2 has bar modified from @1. @2 also has foo modified.

i.e. bar is unique in all 3. foo is unique in @2 but shared in @0 and @1. meow is shared in all 3.

How do you visualize that?
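
To make the scenario concrete, here is a toy model of it. Everything in it is hypothetical: the extent names, the uniform size of 100 bytes per file, and the simplification that a file is either fully shared or fully unique (real reflinked files can share only some of their extents). It only shows how the per-directory numbers diverge from the physical total, not how to draw them.

```python
from collections import Counter

EXTENT_SIZE = 100  # hypothetical: every file occupies one 100-byte extent

# Each snapshot maps a file name to the (made-up) extent that backs it.
snapshots = {
    "@0": {"bar": "bar-v0", "foo": "foo-v0", "meow": "meow-v0"},
    "@1": {"bar": "bar-v1", "foo": "foo-v0", "meow": "meow-v0"},
    "@2": {"bar": "bar-v2", "foo": "foo-v1", "meow": "meow-v0"},
}

# How many snapshots reference each extent.
refcount = Counter(ext for files in snapshots.values() for ext in files.values())

# Summing every file in every snapshot counts shared extents repeatedly.
naive_total = sum(EXTENT_SIZE for files in snapshots.values() for _ in files)
# Counting each distinct extent once gives the space actually occupied.
physical_total = EXTENT_SIZE * len(refcount)

report = {}
for name, files in snapshots.items():
    exclusive = sum(EXTENT_SIZE for ext in files.values() if refcount[ext] == 1)
    shared = sum(EXTENT_SIZE for ext in files.values() if refcount[ext] > 1)
    report[name] = (exclusive, shared)

for name, (exclusive, shared) in report.items():
    print(f"{name} exclusive={exclusive} shared={shared}")
print(f"naive={naive_total} physical={physical_total}")
# @0 exclusive=100 shared=200
# @1 exclusive=100 shared=200
# @2 exclusive=200 shared=100
# naive=900 physical=600
```

The shared bytes cannot be attributed to a single directory without double counting, which is the visualization problem in a nutshell: the per-directory segments either add up to more than the disk holds or leave shared data without an owner.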
Comment 8 Xwang 2024-07-09 13:37:06 UTC
(In reply to Harald Sitter from comment #7)
> Oh BTW, there is also a visual design problem here.
> 
> Say you have the directory @0 that looks like this:
> 
> @0
> ├── bar
> ├── foo
> └── meow
> 
> Let's further assume there is a reflinked copy @1 and a reflinked copy @2 of
> @1.
> 
> @1 has bar modified.
> @2 has bar modified from @1. @2 also has foo modified.
> 
> i.e. bar is unique in all 3. foo is unique in @2 but shared in @0 and @1.
> meow is shared in all 3.
> 
> How do you visualize that?

OK, I understand that this is not considered a bug in Filelight; however, as a user I found Filelight's output misleading.

For example, du and GNOME's baobab both report figures in line with Dolphin's, which IMHO is closer to what the user is looking for (namely a quick and maybe crude visualization of the space occupied by folders).