Bug 144948 - [wish] Correctly count hardlinks in Filelight
Status: RESOLVED FIXED
Alias: None
Product: filelight
Classification: Applications
Component: general
Version: 1.0
Platform: Gentoo Packages Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Martin Sandsmark
URL:
Keywords:
Duplicates: 313402
Depends on:
Blocks:
 
Reported: 2007-05-02 11:59 UTC by kaplun
Modified: 2023-09-27 14:04 UTC
CC List: 8 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Proposed visual representation for hardlinked files (21.42 KB, image/png)
2009-02-08 20:35 UTC, Stephan Sokolow
Screenshot (187.17 KB, image/png)
2019-07-23 21:29 UTC, Martin Sandsmark

Description kaplun 2007-05-02 11:59:32 UTC
Version:           1.0 (using KDE 3.5.6)
Installed from:    Gentoo Packages
Compiler:          gcc-4.1.2 
OS:                Linux

I recently noticed that Filelight doesn't handle hardlinks. In fact, I have a 700 MB file hardlinked in many places on the same filesystem, and Filelight counts its size as many times as there are links.
Would it be possible to count such files only the first time they are found?
Best regards
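
To illustrate the request above, here is a minimal sketch of counting each hardlinked file only the first time it is found. It is not Filelight's code; the function name is made up for illustration, and it assumes a file's identity can be taken as its (st_dev, st_ino) pair from lstat().

```python
import os

def usage_counting_each_inode_once(root):
    """Sum sizes under root, counting every (device, inode) pair once.

    Hypothetical helper for illustration, not part of Filelight.
    """
    seen = set()   # identities of multiply-linked files already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))  # do not follow symlinks
            if st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # another name for a file we already counted
                seen.add(key)
            total += st.st_size
    return total
```

Only files with st_nlink > 1 are remembered, so the extra memory grows with the number of hardlinked files rather than with every file scanned.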
Comment 1 Stephan Sokolow 2009-02-08 20:35:16 UTC
Created attachment 31133
Proposed visual representation for hardlinked files

Since accounting hardlinked files just the first time they're found is misleading (Delete the folder that they're listed under and you don't free up the space you thought you would), here's a proposed mock-up for displaying what portion of each folder's reported space is due to files with multiple hardlinks.

This would be simple to implement (For each stat() call, grab st_nlink as well as st_size and then draw two sub-segments if any of the files have st_nlink > 1) and would be much more useful for determining how to most effectively free up space.
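
A rough sketch of the per-folder bookkeeping described above, assuming the two sub-segments correspond to bytes in files with st_nlink == 1 and bytes in files with st_nlink > 1. The function name and return shape are hypothetical, and this is not taken from Filelight.

```python
import os

def folder_breakdown(folder):
    """Return (total_bytes, shared_bytes) for the files under folder.

    shared_bytes is the part of total_bytes held by files that have more
    than one hardlink, i.e. space that deleting the folder may not free.
    Hypothetical helper for illustration only.
    """
    total = 0
    shared = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))  # one stat per entry
            total += st.st_size
            if st.st_nlink > 1:
                shared += st.st_size
    return total, shared
```

Like the proposal, this keeps no per-file state, so a file linked twice inside the same folder is added twice to both numbers.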
Comment 2 Stephan Sokolow 2009-02-08 20:46:38 UTC
Oh, I should probably explain my choice of pattern for the mockup since it IS deliberate.

First, by using a pattern rather than a color, I ensure that it cannot be mistaken for a separate folder. Second, it produces two implicit meanings which preserve Filelight's intuitive and easy-to-remember design:

1. By using an equal mix of the original color and something close to the background color in a repeating pattern, there's the implication that the space is shared with files elsewhere. (The white lines in the pattern are drawn at 50% opacity because the exact background color feels sharp and ugly.)
2. By reducing the perceived saturation of the sub-segment, there's also an implicit connection to the intended meaning of "this space may not be freed if the folder is deleted".

I just chose diagonal lines for the repeating pattern because I felt it more aesthetically pleasing than the alternatives I could think of which would be recognizable in potentially tiny radial segments. (eg. vertical lines, horizontal lines, checkerboard pattern, etc.)
Comment 3 aw81 2009-03-09 08:48:15 UTC
I like the mockup, although I'm not sure whether it's the best solution in every case or whether there should also be an option to show hardlinks only in one directory. I tried different tools and I still can't say what _should_ be the right way to handle hardlinks.
My backup is based on snapshots created with rsync+hardlinks, so I have many directories that share a great number of files, and I would like to see only where something has changed. Baobab does a good job at this and even seems to count the oldest entry (or the entry in the oldest directory - do hardlinks know which one was created first?), but it's also only half of the story. An alternative solution would be to have an option to show only one file like Baobab does, but mark these entries with multiple hardlinks like in Stephan's mockup.
Comment 4 Martin Sandsmark 2010-05-27 01:14:23 UTC
This isn't trivial, and we would have to store more information about each file in memory, which would increase memory usage too. I myself don't have any need for it either, so I do not plan on implementing it in the near future. Patches accepted, though.
Comment 5 Martin Sandsmark 2013-01-17 21:02:53 UTC
*** Bug 313402 has been marked as a duplicate of this bug. ***
Comment 6 Martin Sandsmark 2019-07-23 21:25:20 UTC
Better late than never, but I started in this branch (basically just implemented your mockup): 
https://cgit.kde.org/filelight.git/log/?h=martin/hardlinks

Because it's also possible to get the number of hardlinks pointing to the file, I thought we maybe could shade it darker or something depending on how many hardlinks there are. Could display it in the tooltip, anyways, maybe some fancy percentages.

But unless I'm wrong, I don't think we can find the other hardlinks unless they're in the paths we're scanning.
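
One way to at least detect that limitation, sketched under the assumption that st_nlink can be compared against how many names for the same (st_dev, st_ino) pair the scan itself has seen. The function name is hypothetical and this is not what the branch above does.

```python
import os
from collections import Counter

def links_outside_scan(root):
    """Map (st_dev, st_ino) -> number of hardlinks not found under root.

    Hypothetical helper: it can tell that other links exist outside the
    scanned paths, but not where they are.
    """
    seen = Counter()  # names found in the scan, per file identity
    nlink = {}        # st_nlink reported by the filesystem, per file identity
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                seen[key] += 1
                nlink[key] = st.st_nlink
    return {key: nlink[key] - seen[key] for key in nlink if nlink[key] > seen[key]}
```

A file that appears in the result keeps its space even if every copy inside the scanned tree is deleted, which is the kind of detail a tooltip could report.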
Comment 7 Martin Sandsmark 2019-07-23 21:29:44 UTC
Created attachment 121698
Screenshot

FWIW, here is a screenshot of the implementation
Comment 8 Jens 2022-03-11 17:22:30 UTC
I was just about to report the same bug.
I want to see the space saved by hardlinking a lot of files, e.g. for an "rsync --link-dest="-style backup scheme.
Currently, Filelight does not do this.

Can we please have an implementation like the one proposed by Martin in the mainline branch?

Thank you!
Comment 9 lrdarknesss 2023-03-04 14:56:59 UTC
It would be very nice if this issue would get some attention after all these years.
I was very confused why Filelight was telling me that the /timeshift folder contains several hundred GB when it shouldn't back up the user directory.
Then I realized that Timeshift works with hard links, and Filelight just doesn't count these correctly.
Comment 10 Jeremy Whiting 2023-09-27 14:04:50 UTC
Fixed in January here: https://invent.kde.org/utilities/filelight/commit/7356cc3bc8164073e161820c32321ec8bf5b52c4 so it has been in Gear 23.04 and newer.