It would be nice if Baloo supported "robots.txt"-like functionality. That is, a file placed in a folder (e.g. mypath/robots.txt) would specify the indexing behavior for that path. For example:
- do not index at all
- do not index certain MIME types
- type of indexing (basic or full)
- etc.

I don't care about the exact file name. It probably shouldn't be robots.txt, because that could conflict with actual web content. Ideally it would have a leading dot so that it is hidden by default, e.g. ".indexrc" or ".crawlerrc".
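To make the request concrete, here is a minimal Python sketch of how such a per-folder file might be read. Everything in it is an assumption for illustration: the file name ".indexrc", the INI layout, the "mode" and "exclude-mimetypes" keys, and the rule that the nearest file up the tree wins. None of this is existing Baloo behavior.

```python
import configparser
from dataclasses import dataclass, field
from pathlib import Path

RULES_FILENAME = ".indexrc"  # hypothetical name, not an existing Baloo file
VALID_MODES = {"none", "basic", "full"}


@dataclass
class IndexRules:
    mode: str = "full"  # "none", "basic", or "full"
    excluded_mimetypes: set = field(default_factory=set)

    def should_index(self, mimetype: str) -> bool:
        """True if a file of this MIME type should be indexed at all."""
        return self.mode != "none" and mimetype not in self.excluded_mimetypes


def parse_rules(text: str) -> IndexRules:
    """Parse the (assumed) INI-style content of a rules file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["index"] if parser.has_section("index") else {}
    mode = section.get("mode", "full").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"unknown index mode: {mode!r}")
    raw = section.get("exclude-mimetypes", "")
    excluded = {m.strip() for m in raw.split(",") if m.strip()}
    return IndexRules(mode, excluded)


def rules_for(path: Path) -> IndexRules:
    """Walk from the folder up to the root; the nearest rules file wins."""
    folder = path if path.is_dir() else path.parent
    for candidate in [folder, *folder.parents]:
        rules_file = candidate / RULES_FILENAME
        if rules_file.is_file():
            return parse_rules(rules_file.read_text(encoding="utf-8"))
    return IndexRules()  # no rules file found: default behavior
```

A file containing "[index]", "mode = basic" and "exclude-mimetypes = video/mp4" would then mean: basic indexing only, and skip MP4 videos, for that folder and everything below it.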
GNOME's Tracker uses the file names ".nomedia" (same as Android) and ".trackerignore" for this feature. https://gnome.pages.gitlab.gnome.org/tracker/faq/#how-can-i-control-what-tracker-indexes
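A marker-file check of that kind is simple to sketch. The Python below is only an illustration, not Tracker's or Baloo's code: it assumes a folder is skipped when it, or any ancestor up to the indexed root, contains one of the marker files, and the function name is_excluded is made up.

```python
from pathlib import Path

# Marker file names mentioned above; their mere presence excludes a folder.
MARKER_NAMES = (".nomedia", ".trackerignore")


def is_excluded(folder: Path, root: Path) -> bool:
    """Return True if folder or an ancestor (up to root) holds a marker file."""
    folder = folder.resolve()
    root = root.resolve()
    for candidate in [folder, *folder.parents]:
        if any((candidate / name).exists() for name in MARKER_NAMES):
            return True
        if candidate == root:
            break  # do not look above the indexed root
    return False
```

Unlike a rules file, a marker file needs no parsing, but it can only express "do not index at all".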