Bug 437888

Summary: Opens 19+k files and causes 'Too many open files' error
Product: [Frameworks and Libraries] frameworks-baloo Reporter: Yuri <yuri>
Component: general Assignee: groot
Status: RESOLVED DOWNSTREAM    
Severity: normal CC: baloo-bugs-null, grahamperrin, nate, stefan.bruens, sylvain.saboua, tagwerk19
Priority: NOR    
Version: 5.82.0   
Target Milestone: ---   
Platform: Other   
OS: FreeBSD   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: baloofilerc

Description Yuri 2021-05-31 06:15:35 UTC
Downstream bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256269

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.21.5
KDE Frameworks Version: 5.82.0
Qt Version: 5.15.2
Comment 1 tagwerk19 2021-05-31 15:40:49 UTC
Can you pin down the problem in any way?

Do you have that many files?

    What happens with a newly created user with nothing much
    in $HOME?

    Are you indexing hidden files? (Think that you might be
    indexing files in your wastebasket and all your thumbnails)

Are you indexing file content?

    What happens if you turn that off?

    Do you see files gradually being indexed if you run
    "balooctl monitor"

Sorry, not au fait with FreeBSD so this will be a bit of guesswork. What filesystem are you using?
Comment 2 Yuri 2021-05-31 16:16:00 UTC
(In reply to tagwerk19 from comment #1)

> Do you have that many files?
There are 1+M files in the home directory.

>     What happens with a newly created user with nothing much
>     in $HOME?
I didn't try but likely kde would start fine.

>     Are you indexing hidden files? (Think that you might be
>     indexing files in your wastebasket and all your thumbnails)

> Are you indexing file content?

The only thing that I did - I installed kde5. It does whatever kde5 does by default.



There is no need to keep files open in order to index them.

It looks like baloo either leaks file descriptors, or keeps them referenced for no good reason.
Comment 3 tagwerk19 2021-05-31 19:38:30 UTC
(In reply to Yuri from comment #2)
> The only thing that I did - I installed kde5. It does whatever kde5 does by
> default.
If you could attach the .config/baloofilerc file, that should say what's happening. I do see that different distributions set up baloo indexing in different ways.

> There is no need to keep files open in order to index them.
Clear.

What I see in Linux is that the "baloo_file" process makes a list of all the files that need indexing and, if content indexing is enabled, feeds batches of 40 files to the indexer process "baloo_file_extractor" to be read and indexed.

"baloo_file" does however attempt to work out the Mime Type of the files - and this can require a 'peek' into the file if it is not clear from the file extension.

> It looks like baloo either leaks file descriptors, or keeps them referenced
> for no good reason.
I'm pretty sure I'd have hit the same issue on Linux if there was a general problem. Googling says that "ulimit -n" shows the limit; this gives me 1024 and I have no trouble...
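
For reference, the limit that "ulimit -n" reports can also be read from inside a process. A minimal Python sketch, assuming a Unix-like system where the standard `resource` module is available; the function name `fd_limits` is made up for illustration and has nothing to do with baloo itself:

```python
import resource


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


if __name__ == "__main__":
    soft, hard = fd_limits()
    # The soft limit is the value "ulimit -n" prints in a shell
    # that runs with the same limits as this process.
    print(f"soft={soft} hard={hard}")
```

A process that tries to open more files than the soft limit gets the "Too many open files" error (EMFILE) from open().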
Comment 4 Yuri 2021-05-31 21:40:07 UTC
Created attachment 138911 [details]
baloofilerc

Attaching the .config/baloofilerc file.
Comment 5 tagwerk19 2021-06-01 07:07:05 UTC
If you append the line:

    only basic indexing=true

to the file, that should avoid the "content indexing".

What is not clear to me is why the rest of KDE is having trouble. I would have thought that if baloo leaked/didn't release resources, that would only affect baloo.

What happens if you kill baloo_file and restart (with a "balooctl enable", maybe running that twice)? Does it fail immediately or index a similar number of files before crashing?

I was also wondering about your 19+k files. Is this from "balooctl status", as the number indexed? Did it show counts of files "waiting for content indexing" and "failed to index"?
Comment 6 Yuri 2021-06-01 07:19:59 UTC
(In reply to tagwerk19 from comment #5)

> I would have thought that if baloo leaked/didn't release resources, that would only affect baloo.

No, the file limit that is exceeded is global. It's either a per-user limit or a global limit.

This lsof log shows that baloo is a culprit https://people.freebsd.org/~yuri/lsof-2021-05-30_11%3A13%3A00.txt

There's no doubt that it leaks file descriptors.

Just close files, and this would solve the problem.
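
A rough way to see which process dominates such an lsof log is to count lines per command and PID. The Python sketch below assumes the usual lsof layout, with the command name in the first column and the PID in the second; `count_open_by_command` is a made-up helper name. Note that lsof also lists entries such as cwd and txt, so the counts are an upper bound on real descriptors.

```python
import sys
from collections import Counter


def count_open_by_command(lsof_text):
    """Count lsof lines per (command, pid) pair."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything without a numeric PID.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        counts[(fields[0], int(fields[1]))] += 1
    return counts


if __name__ == "__main__":
    # Reads lsof output from standard input and prints the five largest holders.
    for (command, pid), total in count_open_by_command(sys.stdin.read()).most_common(5):
        print(f"{total:8d}  {command}  (pid {pid})")
```

It can be fed from a pipe, for example "lsof | python3 count_fds.py", where count_fds.py is just a hypothetical file name for the script above.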
Comment 7 tagwerk19 2021-06-01 07:43:58 UTC
(In reply to Yuri from comment #6)
> This lsof log shows that baloo is a culprit
> https://people.freebsd.org/~yuri/lsof-2021-05-30_11%3A13%3A00.txt

> baloo_fil 58271       yuri 6331r    VREG                 0,146             263954 13886769 / (/dev/gpt/ssdrootfs)
Sorry, that's new to me. I think I'm not able to help. I know baloo requires stable devno and inode values otherwise it can treat whatever it's met as a "new file". No idea whether it's comfortable with VREG...
Comment 8 Yuri 2021-06-01 07:47:23 UTC
(In reply to tagwerk19 from comment #7)

> Sorry, that's new to me. I think I'm not able to help. I know baloo requires
> stable devno and inode values otherwise it can treat whatever it's met as a
> "new file". No idea whether it's comfortable with VREG...

baloo can keep file paths, context hash, etc. It just can't keep file descriptors open.
Comment 9 Stefan Brüns 2021-06-01 16:06:14 UTC
It's working fine on Linux. It probably is a problem in one of the libraries used by baloo.

Please find someone who is willing to fix this on FreeBSD. FreeBSD is not supported, as mentioned in its Readme.
Comment 10 tagwerk19 2021-06-01 17:07:31 UTC
I set up a:

    FreeBSD 13.0

    Plasma: 5.20.5
    Frameworks: 5.80.0
    Qt: 5.15.2
    Filesystem: UFS

and I see the same when looking at the lsof output, filtering on "baloo_fil". It does seem to be baloo_file (rather than baloo_file_extractor) that is responsible.

Baloo seems to work until you create more test files than "ulimit -n" says there are file descriptors, then it goes wobbly.

Think yes, flag this Confirmed.
Comment 11 groot 2021-06-03 22:39:26 UTC
tagwerk19@innerjoin.org thank you for looking into this closely and confirming it. Right now downstream there's just the one problem report (PR) open for baloo, but there have been a few in the past. I think the regular KDE packagers / maintainers downstream have baloo switched off (I know I do) for various reasons, so it's not something that shows up on the "dev radar" there.

[adridg@beastie ~]$ lsof | grep baloo_fil | wc -l
  644440

That's a lot of files, but my ulimit is higher:

open files                          (-n) 1884879

I have vague recollections that there are issues in file watches (e.g. inotify or fam wasn't doing the things needed, or there's a Qt class that is wonky) which is why a Linux box doesn't need to watch / open **nearly** as many files as the BSDs do. But, vague recollection, no more.
Comment 12 tagwerk19 2021-06-04 07:00:04 UTC
(In reply to groot from comment #11)
> That's a lot of files, but my ulimit is higher...
I notice that "ulimit -n" varies roughly with the amount of memory. I set up a 4Gbyte VM and got a ulimit of 116991. A quick check with an 8GB VM and I get 234943...

> I have vague recollections that there are issues in file watches (e.g.
> inotify or fam wasn't doing the things needed, or there's a Qt class that is
> wonky) which is why a Linux box doesn't need to watch / open **nearly** as
> many files as the BSDs do. But, vague recollection, no more.
Don't know on that one. On Linux, you can see these things happen with strace (if you are brave) and it watches for changes in folders rather than files.
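
To get a feel for the difference, here is a rough Python sketch that counts how many watches a tree needs if only folders are watched, versus folders plus every file. kqueue's vnode events need an open descriptor for each watched path; whether baloo really ends up watching every file on the BSDs is an assumption here, not something confirmed in this report. `watch_counts` is a made-up helper name.

```python
import os
import tempfile


def watch_counts(root):
    """Return (folders_only, folders_and_files) watch counts for the tree at root."""
    folders = 0
    files = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        folders += 1           # one watch per directory, root included
        files += len(filenames)
    return folders, folders + files


if __name__ == "__main__":
    # Build a small throwaway tree: 3 subfolders with 100 empty files each.
    with tempfile.TemporaryDirectory() as root:
        for d in range(3):
            sub = os.path.join(root, f"dir{d}")
            os.mkdir(sub)
            for f in range(100):
                open(os.path.join(sub, f"file{f}.txt"), "w").close()
        print(watch_counts(root))
```

With one descriptor per watched path, the second number is the one that has to stay below "ulimit -n".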
Comment 13 Yuri 2021-06-04 07:05:48 UTC
(In reply to tagwerk19 from comment #12)
> (In reply to groot from comment #11)
> > That's a lot of files, but my ulimit is higher...
> I notice that "ulimit -n" varies roughly with the amount of memory. I set up
> a 4Gbyte VM and got a ulimit of 116991. A quick check with an 8GB VM and I
> get 234943...

ulimit shouldn't even matter for baloo more than for other programs because individual processes shouldn't keep too many files open. There's no need to have more open files than number of CPUs for the purpose of their indexing.
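
To illustrate the point, a minimal Python sketch of indexing with a bounded number of open files: each worker opens a file, peeks at it, and closes it before moving on, so at most one descriptor per worker is open at a time. This is not how baloo is implemented; `fingerprint` and `index_paths` are hypothetical names, and hashing the first few KB merely stands in for real indexing work.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fingerprint(path, peek_bytes=4096):
    """Hash the first peek_bytes of a file; the descriptor is closed on return."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read(peek_bytes)).hexdigest()


def index_paths(paths, workers=None):
    """Map each path to its fingerprint using at most `workers` open files."""
    paths = list(paths)
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(fingerprint, paths)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        names = []
        for i in range(1000):
            name = os.path.join(root, f"f{i}")
            with open(name, "wb") as out:
                out.write(str(i).encode())
            names.append(name)
        print(len(index_paths(names)), "files indexed")
```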
Comment 14 Stefan Brüns 2023-07-06 19:37:08 UTC
Not reproducible on Linux, no response from reporter.
Comment 15 sylvain.saboua 2024-01-18 15:53:06 UTC
Seemingly same bug reported on openbsd:
https://marc.info/?l=openbsd-misc&m=170559205405527&w=2
QKqueueFileSystemWatcherEngine::addPaths: open: Too many open files

KDE/Plasma's elisa (music library player/manager) and Dolphin (file manager) both crash when trying to open/manage my music library.

When launched Elisa will show up to "Imported 396 tracks" and then do nothing.
The "Genres" section is full of genres but each contain zero (0) tracks. Opening another section (Files, Tracks, Artists, Albums) crashes the app displaying this email's subject as an error multiple times (caught using a terminal).

Dolphin also crashes when trying to open "/home/media/B - Musithèque"

$ls /home/media/B - Musithèque | wc -l
      624
$lla -R /home/media/B - Musithèque | wc -l
    11643
$

Thank you