Bug 333433 - Indexer should auto exclude external-media in /home
Status: REOPENED
Alias: None
Product: Baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon
Version: unspecified
Platform: FreeBSD Ports FreeBSD
Importance: NOR normal
Target Milestone: ---
Assignee: Vishesh Handa
URL: http://forum.kde.org/viewtopic.php?f=...
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-14 23:51 UTC by Bernard Gray
Modified: 2021-08-20 08:11 UTC

See Also:
Latest Commit:
Version Fixed In: 5.1
Sentry Crash Report:


Description Bernard Gray 2014-04-14 23:51:13 UTC
Currently baloo excludes some local media from its index by default, but it does not exclude any external filesystems, particularly network filesystems.

Indexing the networked filesystems causes a high amount of IO, so it would be more sensible to exclude these by default.

Reproducible: Always

Steps to Reproduce:
1. Mount network filesystems (e.g. cifs)
2. Watch baloo_file_extractor sit at the top of the iotop list
3. Exclude the network filesystem mount points; the baloo_file_extractor IO then drops off the iotop list


Expected Results:  
All external media should be excluded from the indexer by default
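
To find candidate mount points for step 3, something like the following could be used. It is only a sketch: it assumes a Linux-style mount table (the text of /proc/mounts), and both the function name and the set of filesystem types treated as "network" are illustrative choices, not anything baloo itself uses. Mount points containing escaped spaces are not handled.

```python
# Illustrative list only; extend it for the network filesystems in use.
NETWORK_FS_TYPES = {"cifs", "smb3", "nfs", "nfs4", "fuse.sshfs"}


def network_mounts_under(home, mounts_text):
    """Return mount points of network filesystems located below `home`.

    `mounts_text` is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    prefix = home.rstrip("/") + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if fs_type in NETWORK_FS_TYPES and mount_point.startswith(prefix):
            found.append(mount_point)
    return found
```

On Linux the input could come from `open("/proc/mounts").read()`; the returned paths are the ones to add to the excluded folders.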
Comment 1 Vishesh Handa 2014-04-15 09:57:53 UTC
Just to confirm, this only happens if the network storage is mounted under $HOME, right?
Comment 2 Bernard Gray 2014-04-15 22:29:49 UTC
Correct, sorry I should have been more specific -
Comment 3 Vishesh Handa 2014-08-19 16:53:18 UTC
Git commit 69411aadf470e624fbd4fa78696378098d7e7dca by Vishesh Handa.
Committed on 19/08/2014 at 17:04.
Pushed by vhanda into branch 'master'.

FileIndexerConfig: Take RemovableMedia into account

With this we will never index removable media unless they have been
explicitly added in the include folders. No matter where they are
mounted.

M  +2    -0    src/file/cleaner/CMakeLists.txt
M  +2    -0    src/file/extractor/CMakeLists.txt
M  +13   -0    src/file/fileindexerconfig.cpp
M  +5    -0    src/file/fileindexerconfig.h
M  +1    -0    src/file/lib/CMakeLists.txt
M  +4    -0    src/file/storagedevices.cpp
M  +3    -5    src/file/storagedevices.h
M  +2    -2    src/file/tests/CMakeLists.txt

http://commits.kde.org/baloo/69411aadf470e624fbd4fa78696378098d7e7dca
Comment 4 Torsten Eichstädt 2020-07-01 10:35:40 UTC
kf5-baloo-5.68.0, FreeBSD 12.1-RELEASE-p6, installed from package (not self-compiled)
Hi,

I ran into this when I null-mounted (unionfs) three big source trees into my $HOME.  Although otherwise the handling of these mounts is absolutely stable, I think it's a bug in FreeBSD: how can an indexer running at lowest priority (nice 19 and ordinary non-root UID) freeze the whole system?  But maybe it's rooted in some Qt lib, KDE frameworks or baloo, or that one of these uses the VFS in an incorrect manner.

Symptoms: 
1. When an index already exists and baloo is enabled, the whole system becomes so unresponsive that it's impossible to do anything.  I had to force a shutdown by holding the power button for more than 4 seconds, reboot, and manually disable baloo in ~/.config/baloofilerc.
2. When no index exists and baloo gets enabled, the system becomes very sluggish, as described in the initial bug report, and some windows do not refresh correctly; apart from that, the system and GUI are more or less usable.
3. Stopping a running baloo indexer with 'balooctl purge' fails, and 'balooctl disable' does not help either.  It has to be killed manually.

Workaround (does not hurt in my case since source code is not indexed anyway):
1. kill baloo and delete an existing index: 'balooctl purge'
2. Add the union-mount directories to excluded dirs in systemconfig
   -> ~/.config/baloofilerc:exclude folders[$e]=$HOME/Projects/FreeBSD/src
3. Enable baloo in systemconfig
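
For anyone who prefers to script step 2, a rough sketch follows. The key name is the one quoted above; the [General] section name and the comma-separated list format are assumptions about the baloofilerc layout, and the function name is made up for illustration. Note that configparser drops comments when rewriting, so this works on a copy of the text rather than on the file in place.

```python
import configparser
import io

EXCLUDE_KEY = "exclude folders[$e]"  # key name as quoted in the workaround
SECTION = "General"                  # assumption: section that holds the key


def add_excluded_folder(rc_text, folder):
    """Return `rc_text` with `folder` added to the exclude list (idempotent)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(rc_text)
    if not parser.has_section(SECTION):
        parser.add_section(SECTION)
    current = parser.get(SECTION, EXCLUDE_KEY, fallback="")
    # Assumption: entries are separated by commas.
    folders = [f for f in current.split(",") if f]
    if folder not in folders:
        folders.append(folder)
    parser.set(SECTION, EXCLUDE_KEY, ",".join(folders))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

It is safest to stop baloo first and to compare the result against the original file before replacing it.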

I am reopening this bug; it would be great if someone with deeper knowledge of VFS handling could comment on this.
My /etc/fstab:
/src/13-CUR     /home/paul/Projects/FreeBSD/src/13-CUR unionfs  rw,late,below,noatime 0 0
/src/12-STABLE  /home/paul/Projects/FreeBSD/src/12-STABLE unionfs rw,late,below,noatime 0 0
/src/12.1-REL   /home/paul/Projects/FreeBSD/src/12.1-REL unionfs rw,late,below,noatime 0 0
Comment 5 Torsten Eichstädt 2020-07-01 10:50:43 UTC
I guess Linux and the other BSDs have such a unionfs, too.  It would help to track down the root cause of the issue if someone could fire up baloo on a setup like mine.
Comment 6 tagwerk19 2021-08-20 08:11:04 UTC
(In reply to Torsten Eichstädt from comment #5)
> I guess Linux and other BSD have such unionfs, too.  Would be helpful to
> track down the root cause of the issue, if s/o can fire up baloo on a setup
> like mine.
I've encountered something similar with mergerfs, see Bug 420939, comments from 49 onwards.

The issue was that the inode values "handed up" by mergerfs to baloo kept changing. Baloo keeps thinking it's got new files.
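
To illustrate why that matters, here is a toy model in Python. It is not Baloo's code; it only assumes an index that identifies files by their (device, inode) pair, and the class and method names are hypothetical.

```python
class ToyIndex:
    """Toy index that identifies files purely by (device, inode)."""

    def __init__(self):
        self.by_id = {}

    def scan(self, entries):
        """Record (path, dev, ino) entries; return how many IDs were new."""
        new = 0
        for path, dev, ino in entries:
            key = (dev, ino)
            if key not in self.by_id:
                new += 1  # an unseen ID looks like a brand-new file
            self.by_id[key] = path
        return new
```

With stable inode numbers a second scan of the same files reports nothing new. If the filesystem hands out different inode numbers on the next scan, every file looks new again and the index keeps growing.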

The test is to do a

    stat filename

You'll get something like:

    $ stat 1.ts
      File: 1.ts
      Size: 41416704        Blocks: 80896      IO Block: 4096   regular file
    Device: fc01h/64513d    Inode: 794964      Links: 1
    Access: (0664/-rw-rw-r--)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
    Access: 2021-07-24 22:50:57.838161084 +0200
    Modify: 2021-07-24 22:50:57.838161084 +0200
    Change: 2021-07-24 22:51:42.686181710 +0200
    Birth: -

It's the "Device" and "Inode" numbers that you need to keep your eye on. The:

    Device: fc01h/64513d    Inode: 794964

If these values keep changing then you've pinned down the issue.
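
The same check can be scripted. This is a minimal sketch using os.stat; the function names and the sampling parameters are arbitrary. Whether the numbers change within a short window or only after a remount is not something this thread establishes, so it may be worth running it again after remounting.

```python
import os
import time


def file_identity(path):
    """Return the (device, inode) pair that stat reports for `path`."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def identity_is_stable(path, samples=3, delay=0.0):
    """Stat `path` several times; True if (device, inode) never changes."""
    first = file_identity(path)
    for _ in range(samples - 1):
        time.sleep(delay)
        if file_identity(path) != first:
            return False
    return True
```

Printing `file_identity(path)` before and after a remount and comparing the two gives the longer-term version of the test.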

The solution with mergerfs was that it needed a "use_ino" option in the /etc/fstab file

    https://github.com/trapexit/mergerfs#inodecalc

I don't know if unionfs has or needs the equivalent but it would be something to look at.