Bug 359119 - baloo_file_extractor out-of-memory handling (resource limits and more)
Summary: baloo_file_extractor out-of-memory handling (resource limits and more)
Status: RESOLVED DUPLICATE of bug 400704
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon
Version: 5.18.0
Platform: Kubuntu Linux
Importance: NOR grave
Target Milestone: ---
Assignee: Pinak Ahuja
URL: https://paste.kde.org/pgxrvfaep
Keywords:
Duplicates: 356176
Depends on:
Blocks:
 
Reported: 2016-02-08 00:49 UTC by Alexander Zhigalin
Modified: 2018-11-26 21:32 UTC
CC List: 10 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Description Alexander Zhigalin 2016-02-08 00:49:41 UTC
I don't really know what that piece is doing, though it surely isn't worth 3GB of memory.
So I just shoot it down.

Reproducible: Always

Steps to Reproduce:
1. Install Kubuntu 15.10
2. Update your KDE to 5.13 from kubuntu-backports ppa
3. Enjoy

Actual Results:  
The link I gave is self-explanatory, isn't it?

Expected Results:  
Baloo mustn't eat all my RAM and swap even before I start to use my computer.
Comment 1 Alexander Zhigalin 2016-02-08 00:56:55 UTC
Sorry, wrong version.
The KDE version is 5.5.3, while Frameworks is 5.18.0.
Comment 2 J'informatique 2016-07-04 00:06:29 UTC
Hello, I have been running KDE neon User Edition since May 2016 and I have this exact same problem. Every morning when I boot up my PC at work, I have to kill the baloo_file_extractor process because it keeps one CPU core at 100%.

How to change the status to Confirmed?
Comment 3 J'informatique 2016-07-05 13:56:17 UTC
Hi, to give you more information, I am using:
KDE Frameworks 5.23.0
Qt 5.6.0 (built against 5.6.0)

Distributor ID: neon
Description:    KDE neon User Edition 5.6
Release:        16.04
Codename:       xenial
Comment 4 Christoph Cullmann 2016-09-11 11:35:56 UTC
e.g. see conversation on frameworks-devel:

Hi,

> Hi Christoph,
> 
> On 10 September 2016 at 23:46, Christoph Cullmann <cullmann@absint.com> wrote:
>>>> Would it be a good idea to restrict the file extractor process to some
>>>> fixed amount of memory to use via setrlimit? (or more fancy stuff?)
>>>
>>> That would probably just make Baloo crash, so fixing the bug is probably
>>> the better option.
>> Actually, that we don't limit the resources + sandbox baloo_file_indexer is
>> the bug, not that some meta info extractor is buggy (which should be fixed,
>> too).
>>
>> ATM, the state is:
>>
>> 1) baloo is on per default
>> 2) it will index at least your home
>> 3) if it encounters any "bad" file, it will OOM you, in my case in a way
>> that a normal user is doomed, as 1-2 seconds after login the machine is already
>> halted.
>>
>> Given that e.g. your "Downloads" might even contain "evil" files from the net,
>> at least some resource limit would be good and even better some sandbox, to
>> avoid that the indexer which is easily pwn'd pwn's your session.
> 
> You've got a point there. In that case, what I'd do is:
> 
> 1) Limit resources on baloo_file_extractor.
> 2) Try to detect if it crashes because it exceeded limits, not sure if
> this is easily possible.
> 3) Mark files causing such crashes as files that should be skipped,
> and the user notified somehow (?).
I think we should just mark any file for which the indexer crashes as "skipped for the future".
(Whether it crashed by hitting a resource limit or through some other indexing failure doesn't
really matter, IMHO, besides that it should perhaps be logged somewhere.)
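
The approach sketched in this exchange (cap the extractor's memory, notice when it dies, and skip the offending file afterwards) could look roughly like the following. This is a minimal Python sketch, not Baloo's actual code: `run_extractor`, `MEMORY_LIMIT_BYTES`, the 1 GiB value, and the stand-in command are all assumptions made for illustration; only the use of setrlimit comes from the discussion.

```python
import resource
import subprocess
import sys

# Hypothetical cap on the child's virtual address space (1 GiB); not a Baloo value.
MEMORY_LIMIT_BYTES = 1024 ** 3


def _limit_memory():
    # Runs in the child between fork() and exec(). Never ask for more than the
    # hard limit we inherited, since raising it would fail without privileges.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    limit = MEMORY_LIMIT_BYTES
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))


def run_extractor(command, path, skipped):
    """Run `command` on `path` under the limit; remember `path` if the run fails."""
    if path in skipped:
        return False  # failed before, do not try again
    result = subprocess.run(
        command + [path],
        preexec_fn=_limit_memory,  # note: preexec_fn is not safe in threaded parents
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        # Non-zero exit status or death by signal (negative code): skip in future.
        skipped.add(path)
        return False
    return True


if __name__ == "__main__":
    # Stand-in "extractor" (an assumption): tries to allocate 2 GiB for "bad".
    stand_in = [
        sys.executable,
        "-c",
        "import sys; bytearray(2 * 1024**3) if sys.argv[1] == 'bad' else None",
    ]
    skipped = set()
    for name in ("good", "bad", "bad"):
        print(name, run_extractor(stand_in, name, skipped))
```

Treating every failed run the same way follows the suggestion above that the cause of the crash matters less than not retrying the file. A real implementation would also have to persist the skip list and log the failure.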

Other problem: after indexer crash, the DB is corrupted or locked.

It seems there isn't really any proper lmdb locking; baloo_file even just kills the lockfile
on startup.

> 
>> Besides that, another real problem is that baloo has close to zero error
>> handling for its database: once one error happens, everything afterwards
>> goes down and never recovers.
>>
>> e.g. use balooctl wrongly one time => goodbye
>> https://bugs.kde.org/show_bug.cgi?id=368557
>>
>> Interesting too: We use lmdb, which means, we memory map always, aka 32-bit
>> machines will
>> be out of memory if you have large indices like > 2GB :/
> 
> Nah, 32bit machines should have PAE. If they don't... I'm not willing
> to make fundamental changes to how indexes are kept to support edge
> cases like this. Disabling Baloo automatically if you detect machines
> with a 32-bit address space is the way to go.
PAE doesn't help there at all. (It only lets your system as a whole use more than 3/4GB,
not a single application.)

If you use lmdb and, let's say, the file is 2GB, your application's 4GB of virtual
address space is halved (and even more, as some parts are used otherwise anyway).
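
To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. The function name is made up for illustration, and the result is only an upper bound, since the kernel and the process's own code, heap, and stacks take part of the address space as well.

```python
GIB = 1024 ** 3


def remaining_address_space(pointer_bits, mapped_bytes):
    """Upper bound on the virtual address space left after one memory mapping."""
    total = 2 ** pointer_bits  # 4 GiB for 32-bit pointers, with or without PAE
    return max(total - mapped_bytes, 0)


# A 2 GiB memory-mapped index in a 32-bit process:
print(remaining_address_space(32, 2 * GIB) / GIB)  # prints 2.0
```

PAE does not change `pointer_bits` for a single process, which is why it does not help here.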

Besides, another issue: ATM the index is fixed to max 5GB; after that, all things will fail,
see bug https://bugs.kde.org/show_bug.cgi?id=364475. As I have seen index sizes > 2 GB,
that will hit people, too.

We should increase that limit IMHO, and out-of-space should be handled at all, I guess.

> 
> I'd still wait for Pinak to comment on all of the above though.
Sure.

Greetings
Christoph
Comment 5 Christoph Cullmann 2016-09-11 21:16:36 UTC
*** Bug 356176 has been marked as a duplicate of this bug. ***
Comment 6 downwa 2017-02-12 18:30:04 UTC
I just installed KDE neon 5.9 and can confirm that baloo_file_extractor still consumes 1.3 GB of RAM and causes excessive IO, to the point of taking 30 seconds or more to get a login prompt at a console.  This is on a system with 4 GB of RAM, 2 cores, 2.16 GHz.

After leaving it running all night, it completed the initial indexing of my 2 terabyte hard drive.  But even though the system was now responsive, baloo_file_extractor soon took off again when I tried to do anything on the system, and I ended up having to kill and disable it (as I have had to do ever since it was introduced, on every new install, on every system).

Something appears fundamentally wrong in its design...
Comment 7 Alexander Zhigalin 2017-06-10 09:46:57 UTC
I'm no longer experiencing this issue.
However, when I copy a _large_ number of files, my system becomes somewhat slow for a while, and I think this is due to baloo indexing them all.
Is there some kind of limit to make it work less intensively, but for longer, in the background?
Comment 8 Nate Graham 2018-11-26 21:32:09 UTC

*** This bug has been marked as a duplicate of bug 400704 ***