Bug 489919 - Index file paths and xattr (but not their content) of files which are excluded by file type
Status: REPORTED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: baloo-bugs-null
Reported: 2024-07-08 12:02 UTC by postix
Modified: 2024-07-08 13:38 UTC

Description postix 2024-07-08 12:02:38 UTC
Right now, Baloo indexes file names (and file contents) only for files that are not excluded by the file type list in

```
# ~/.config/baloofilerc
[General]
dbVersion=2
exclude filters=*~,*.part,*.o,*.la,*.lo,*.loT,*.moc,moc_*.cpp,qrc_*.cpp,ui_*.h,cmake_install.cmake,CMakeCache.txt,CTestTestfile.cmake,libtool,config.status,confdefs.h,autom4te,conftest,confstat,Makefile.am,*.gcode,.ninja_deps,.ninja_log,build.ninja,*.csproj,*.m4,*.rej,*.gmo,*.pc,*.omf,*.aux,*.tmp,*.po,*.vm*,*.nvram,*.rcore,*.swp,*.swap,lzo,litmain.sh,*.orig,.histfile.*,.xsession-errors*,*.map,*.so,*.a,*.db,*.qrc,*.ini,*.init,*.img,*.vdi,*.vbox*,vbox.log,*.qcow2,*.vmdk,*.vhd,*.vhdx,*.sql,*.sql.gz,*.ytdl,*.class,*.pyc,*.pyo,*.elc,*.qmlc,*.jsc,*.fastq,*.fq,*.gb,*.fasta,*.fna,*.gbff,*.faa,po,CVS,.svn,.git,_darcs,.bzr,.hg,CMakeFiles,CMakeTmp,CMakeTmpQmake,.moc,.obj,.pch,.uic,.npm,.yarn,.yarn-cache,__pycache__,node_modules,node_packages,nbproject,core-dumps,lost+found
exclude filters version=8
```

This means that if you search for an existing file named, e.g., `abc.vdi`, Baloo reports that it cannot find it:
```
baloosearch -i filename:abc.vdi
```

From a UX POV it would be much better if Baloo indexed file names of any files, but just excluded the content of file types from the list above.
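
A minimal sketch of the requested behaviour, not Baloo's actual implementation: every file gets its name and xattrs indexed, and the exclude filters only decide whether the content is indexed as well. `EXCLUDE_FILTERS` is a short excerpt of the list above, `plan_indexing` is a hypothetical helper, and Python's `fnmatch` globbing is only assumed to approximate how Baloo matches its filters.

```python
from fnmatch import fnmatchcase

# Short excerpt of the exclude filters quoted above (not the full list).
EXCLUDE_FILTERS = ["*.vdi", "*.qcow2", "*.so", "*.pyc", "*~"]


def is_excluded(filename: str) -> bool:
    """Return True if the file name matches any exclude filter."""
    return any(fnmatchcase(filename, pattern) for pattern in EXCLUDE_FILTERS)


def plan_indexing(filename: str) -> dict:
    """Hypothetical policy: always index name and xattrs, skip content if excluded."""
    return {
        "index_name": True,
        "index_xattr": True,
        "index_content": not is_excluded(filename),
    }


if __name__ == "__main__":
    for name in ("abc.vdi", "notes.txt"):
        print(name, plan_indexing(name))
```

With such a policy, `filename:abc.vdi` would be findable while the disk image itself is never read.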
Comment 1 tagwerk19 2024-07-08 13:38:35 UTC
(In reply to postix from comment #0)
> ... From a UX POV it would be much better if Baloo indexed file names of any
> files, but just excluded the content of file types from the list above ...
Fully agree...

From the experience in:
    https://bugs.kde.org/show_bug.cgi?id=488533#c1
you can stop excluding by file extension (although ".obj" is troublesome here) and exclude by MIME type instead.

This could have a load impact; Baloo might need to read the first part of the file if it needs to check the "magic".
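
To illustrate where that load would come from, here is a rough sketch of a "magic" check, assuming a tiny hand-written signature table rather than the real MIME database Baloo would consult. `MAGIC_SIGNATURES` and `sniff_type` are hypothetical names, and the labels are plain descriptions, not official MIME type names. The point is that each file costs one open and one short read, even if its content is never indexed.

```python
# Well-known leading byte signatures; a tiny illustrative subset only.
MAGIC_SIGNATURES = {
    b"\x7fELF": "elf",
    b"\x1f\x8b": "gzip",
    b"SQLite format 3\x00": "sqlite",
    b"QFI\xfb": "qcow2",
}


def sniff_type(path, max_bytes=16):
    """Read only the first few bytes of the file and compare them to known signatures."""
    with open(path, "rb") as f:
        head = f.read(max_bytes)
    for magic, label in MAGIC_SIGNATURES.items():
        if head.startswith(magic):
            return label
    return None  # unknown type: fall back to other checks, e.g. the file extension
```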