Bug 507569

Summary: Boost Baloo's potency by adding AI
Product: [Frameworks and Libraries] frameworks-baloo Reporter: Mª Jesús M. G. <kde.ego>
Component: Engine Assignee: baloo-bugs-null
Status: RESOLVED NOT A BUG    
Severity: wishlist CC: nate, nicolas.fella, tagwerk19
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Mª Jesús M. G. 2025-07-27 22:54:54 UTC
Add some local AI engine. There are plenty of open source ones that can be used locally: Qwen, Deepseek, Llama... It would be a great improvement to be able to query Baloo instead of just getting links to files (Baloo is currently not even able to give us snippets of text with the search terms, unlike Recoll). For students it would be an incredible time saver, but not only for them; surely many more incredibly useful applications would emerge.
Comment 1 Nicolas Fella 2025-07-27 23:00:50 UTC
Please report actual, specific problems instead of vague and inactionable suggestions like "add AI"
Comment 2 Mª Jesús M. G. 2025-07-27 23:26:48 UTC
Of course. And thank you for responding so quickly.

Look, I would love that, for example, Baloo were able to respond effectively to requests such as: ‘among all my papers, especially those written by me, list all those that mention Ibn Al Arabi and also show me a 40 or 50 line summary of those that relate him to Aristotle's metaphysics, highlighting the similarities in their thinking and their consequences for the whole of medieval Mysticism’.

I have hundreds of documents from my studies: class notes, coursework, papers that I have downloaded from academic sites, and several dozen doctoral theses, which, understandably, a human cannot read in a reasonable period of time given our brief existence. I would like Baloo to do more than find me the files containing the words "Ibn Al Arabi", "metaphysics", "Aristotle", etc.; with only that, I would still have to spend hours and hours, if not days, reading them. It would be a real productivity boost to be able to use my desktop as a sort of Perplexity for my local documents.
Comment 3 Nate Graham 2025-07-28 02:52:02 UTC
If you're a student, aren't you supposed to be reading the source material yourself?

Assuming this is not in fact required for some reason, what you describe does indeed sound like it could be an interesting approach to that particular situation, providing it worked properly, and gave you a good result, didn't take 10 minutes to generate, and didn't require network access to achieve.

What you're requesting is essentially to write an entire new project that would do this. It's pretty far outside the realm of what can be requested in a Bugzilla ticket.

Of course, you should feel free to work on such a thing yourself! It might be pretty cool.
Comment 4 Mª Jesús M. G. 2025-07-29 00:00:44 UTC
(In reply to Nate Graham from comment #3)
> If you're a student, aren't you supposed to be reading the source material
> yourself?

I suppose so; it can be said that I am and will be a student all my life. However, in any case, are you familiar with postdoctoral work? We often work with source materials read more years ago than I'd care to admit, but typically we're just skimming chapters, sometimes just checking a single citation or reference. We frequently revisit materials we've read before, recalling an interesting idea from a particular author that might be useful now. But naturally, we're not going to re-read entire books from cover to cover. I mean, would you go back and read all your university notes, doctoral research, and past academic documents every time you're working on a new project? Do you see yourself recovering your curricular and bibliographic material from two decades ago and starting to read it page by page? No, right?

Anyway, seeing in forums, social media, etc., that a large part (maybe most?) of the users of Linux desktop systems are students, it is only logical that they come to mind as the greatest beneficiaries of my proposal; but of course researchers, professors, teachers, etc., would benefit too.


> does indeed sound like it could be an interesting approach to that
> particular situation, providing it worked properly, and gave you a good
> result, didn't take 10 minutes to generate, and didn't require network
> access to achieve.

I think so too. It would be a colossal leap forward for the Linux desktop, and I’m not convinced that the rather black scenario you outline would be as dire as you suggest (assuming, of course, it all works properly, as you said). Ten minutes (perhaps fewer, given the relatively modest number of documents individual users keep, even in the academic world) strikes me as far more acceptable to the vast majority than you imagine. Ten minutes to cross-reference data from twenty or so books and other sources, something that would take us a week or two by hand: honestly, I reckon even the most die-hard traditional academic would sign up for that in a heartbeat. Moreover, those ten minutes today will be five in a couple of years, and mere seconds in a decade, thanks to ongoing improvements in hardware and software. Think of it as a long-term investment, the start of a journey brimming with possibilities.
Ideally, it shouldn’t require an Internet connection; if the documents reside locally, I see no reason to reach out to the web. If I had to upload the material to a remote server, I could already do that via a premium account with any of the leading AI platforms, so I would not need an AI agent on my desktop. The guiding principle, in true KDE fashion, should be: whatever can be done locally, do it locally.

But you are the IT experts, so that aspect should really be discussed among yourselves—you’re the ones familiar with the technical ins and outs.


> What you're requesting is essentially to write an entire new project that
> would do this. It's pretty far outside the realm of what can be requested in
> a Bugzilla ticket.

How much can one ask for in a Bugzilla ticket? You all always say in the forums and blogs that suggestions and ideas should be posted here; that's why I opened this account. If it's not the right place, I'd be grateful if you could tell me which one is. I'd also ask you all to please stop telling users to use this site to suggest improvements, and to tell them instead to limit themselves to reporting bugs.
In the KDE ecosystem new projects are constantly being started, and almost as quickly abandoned or left in limbo *sine die*, so it didn't seem so far-fetched to suggest this, really. Another project that is suggested and may or may not come to fruition: what would be so odd about it? It didn't seem to me at any time that I was out of line.
Nor did I think it was a job that had to be started from scratch. I thought that integrating existing projects like Ollama and open source AI models like Deepseek, Mistral and others would be the way to go.
Digikam implemented something similar using facial detection models based on trained neural networks, or something like that, for their face recognition and similar image matching system, no? I thought this might be a step further but not totally different.


> Of course, you should feel free to work on such a thing yourself! It might
> be pretty cool.

You think so? xD. I, on the other hand, believe it will be rather dismal, disastrous and neverending. However, I propose a deal: when you learn how to form reinforced concrete for bridge construction, perform a root canal, repair a heating boiler, or, more fittingly, solve the issues currently hindering the production of circuits with 2 nm wires, do let me know, and I shall abandon my profession to delve into the study of computer programming.
Just a jest, of course ;). Sometimes, you computer stuff professionals sound funny when you tell professionals from other fields: “Hey, why don’t you just do it yourself?” ;P
Comment 5 Nate Graham 2025-07-29 00:23:11 UTC
A difference between free open source development and other fields is that lots of people ask us to do work on their behalf for free; that's essentially what feature requests are.

If you asked a construction worker to build you a bridge for free, you should expect them to just laugh, right?

But we're much nicer than that, and do in fact entertain many requests anyway. However, there's an implicit social contract behind it: it's considered reasonable to request a new feature as long as the request is small and well-scoped, exceptionally well described and actionable, or a precursor to the reporter doing any of the work themselves. When those aren't true, we customarily close out the request because it doesn't help anyone to end up with a bug tracker full of "wouldn't it be nice if…" feature requests that will most likely never be implemented.