Bug 202123 - Nepomuksearch in Dolphin/Konqueror shows incomplete results and does not accept wildcards
Status: RESOLVED FIXED
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: general
Version: unspecified
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Sebastian Trueg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-31 19:40 UTC by Ralph Moenchmeyer
Modified: 2009-08-15 10:43 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Description Ralph Moenchmeyer 2009-07-31 19:40:08 UTC
Version:            (using KDE 4.2.98)
OS:                Linux
Installed from:    SuSE RPMs

I run Nepomuk together with Strigi in an openSUSE 11.1 environment with KDE 4.2.98. I never used it much before, so, frankly, I do not know whether the following is a bug or whether it should be changed to a feature request. For the time being I regard it as a bug.

Running a search as "nepomuksearch:kon" gives me a result list of files and folders that is reasonable but incomplete. E.g. the folders "kon1" and "kon2" are displayed in the result list, but not the folder "konsulat", which resides in the same directory as "kon1". However, a search like "nepomuksearch:konsulat" gives me a result list containing the folder "konsulat".

In addition, my guesses for wildcards, as in "nepomuksearch:kon*" or "nepomuksearch:kon%", are not accepted; the special characters are probably treated as normal characters. Thus, the search returns an empty result list. Does a Nepomuk/Strigi search accept wildcards at all? Or is there a special syntax to be used for wildcard searches?

Personally, I regard a working basic search with complete result lists as equally important as searches based on metadata. Therefore, I consider at least the incomplete result lists a bug ...
Comment 1 Sebastian Trueg 2009-07-31 21:11:05 UTC
Wildcards * and ? are accepted and they work here. I suspect it is a scoring issue: the results get a score too low for them to be shown. This is indeed a bug, only not with the wildcard handling. ;)
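
For readers unsure what the two wildcards are meant to do, here is a minimal sketch of the usual shell-style semantics, using Python's standard `fnmatch` module. It only illustrates the pattern syntax and is not how Nepomuk evaluates queries; the folder names are taken from the examples in this report and the `match` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Folder names borrowed from the examples in this report (illustration only).
names = ["kon1", "kon2", "konsulat", "kontakt", "alpha"]

def match(pattern, candidates):
    """Return the candidates that match a shell-style wildcard pattern."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

print(match("kon*", names))   # '*' stands for any run of characters, including none
print(match("kon?", names))   # '?' stands for exactly one character
print(match("alph?", names))
```

Under these semantics "kon*" should cover "konsulat" as well as "kon1" and "kon2", which is the behaviour the reporter expected.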
Comment 2 Ralph Moenchmeyer 2009-08-01 16:19:11 UTC
For completeness I should add that I use the soprano-sesame backend. 

Regarding wildcards: 
I cannot get nepomuksearch within Dolphin to accept the wildcards "*" and "?". I have set a maximum "score" for one of my "kontakt" folders (I assume that this is possible by assigning 5 stars in Dolphin?), but this did not change the search results. A "nepomuksearch:kon*" does not give me any hits. The same is true for any other search string. E.g., "alph*" or "alph?" do not give me any folder or file with some "alpha" string in its name, although such folders and files do exist.
The interpretation that dolphin/nepomuk regard "*, ?" as normal characters is by the way consistent with observations of some other users (I found some comments via Google regarding problems with wildcards). 
 
Could it be that something goes wrong when Dolphin initiates the nepomuk search? E.g. a wrong way of passing parameters?  

Regarding the completeness of search results: 
Funnily enough, the following happens when I add a new folder named "kontakt" and below it a new folder "alpha" within my home directory, which is marked to be indexed by Strigi:

A "nepomuksearch:alpha" in Dolphin then gives me the new "alpha" directory in the result list. However, the search "nepomuksearch:kontakt" does not display the new "kontakt" folder. As soon as I add a file "alpha.txt" to the "alpha" folder, the search "nepomuksearch:alpha" shows the file "alpha.txt" in the result list. However, "nepomuksearch:alpha.txt" returns no result.

All this appears to be an inconsistent search behaviour to me. But maybe I expect something unreasonable regarding the search of files/folders by their names as search strings ....          

Any ideas what I could or should do? Would it be worthwhile to uninstall soprano/sesame, delete all Nepomuk and Strigi configuration folders, and reinstall everything?
Comment 3 Sebastian Trueg 2009-08-03 10:18:10 UTC
The rating has nothing to do with the "score" which I was referring to. The score is an internal rating of the search result quality, i.e. how likely it is that the result actually fits the query. You cannot influence this as a user. The scoring itself is considered buggy.

As for filename searching: this is another problem which I hoped I had corrected. If you want the technical details: the full text indexer splits at word boundaries like spaces or punctuation. Thus, a filename "foobar.txt" is split into "foobar" and "txt". Queries for "foobar" and "txt" match, but a query for the full filename does not. This should be fixed in KDE trunk. I will try to backport it to 4.3.
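
The effect of splitting at word boundaries can be reproduced with a toy index. This is a minimal sketch under stated assumptions, not the Strigi or Nepomuk implementation: `tokenize`, `build_index` and `lookup` are hypothetical helpers, and the splitting rule (any run of non-alphanumeric characters) is an assumption made for illustration.

```python
import re

def tokenize(text):
    """Split text at runs of non-alphanumeric characters (spaces, punctuation)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t]

def build_index(filenames):
    """Map each token to the set of filenames that contain it."""
    index = {}
    for name in filenames:
        for token in tokenize(name):
            index.setdefault(token, set()).add(name)
    return index

# Filenames taken from the examples in this report.
index = build_index(["foobar.txt", "alpha.txt"])

def lookup(term):
    """Look the query term up as ONE token, without splitting it first."""
    return sorted(index.get(term.lower(), set()))

print(lookup("foobar"))      # the token "foobar" is in the index
print(lookup("txt"))         # so is "txt", for both files
print(lookup("foobar.txt"))  # the full filename was never stored as a token
```

In this toy model, only the pieces of a filename are ever stored, so a query for the complete name finds nothing, which matches the "nepomuksearch:alpha.txt" observation in Comment 2.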

Deleting everything will not fix the issues. These are actual bugs you encounter. :(
Comment 4 Sebastian Trueg 2009-08-03 16:08:42 UTC
I backported the fix that makes it possible to query filenames to 4.3.
Comment 5 Ralph Moenchmeyer 2009-08-09 13:57:46 UTC
Thanks for the development efforts! 

I hope that your fix will soon be made available through the Suse build service for the Suse Factory repository of KDE 4.3. Unfortunately, for the time being I have no time to deal with yet another separate KDE environment reflecting the KDE svn or trunk status on my machines.
 
It stresses me enough to handle in parallel the latest KDE 3.5 and a conservative 4.2 for production plus a 4.3 test environment corresponding to the "factory"  repository of SuSE.    
    
With the present status of SuSE's KDE 4.3 x86_64 RPMs of the "Factory"  repository, e.g.   
 
dolphin-4.3.0-103.6
strigi-0.7.0-27.1 and libstrigi0-0.7.0-27.1
soprano-2.3.0-45.1 and libsoprano4-2.3.0-45.1 and 
soprano-backend-sesame-2.3.0-51.1

the bug behaviour has not changed yet. 

I shall add a new comment as soon as I notice improvements.
Comment 6 Oliver Traeger 2009-08-14 11:32:29 UTC
I wonder if this score described in the second post by Sebastian is dynamically calculated based on the whole set of possible search results? Here is what I've encountered using Nepomuk and wildcards:

nepomuksearch:/alina* results in a file alina2.jpg (but doesn't show the file 01_alina.jpg, which is probably due to the "full text indexer doesn't recognize full filename" bug described above)

nepomuksearch:/alin* gives me the same file alina2.jpg

nepomuksearch:/ali* results in a file plasma-emailnotify.cpp but alina2.jpg doesn't show up anymore

is this because plasma-emailnotify.cpp has a higher internal rating compared to alina2.jpg (and maybe different files containing ali*) or is this a different problem?

I guess there is no way, e.g using some hidden config option, to force nepomuksearch to show all results with no regard to the score (rating) ;) ?
Comment 7 Sebastian Trueg 2009-08-14 11:49:51 UTC
sadly, no. It has to be patched in the code. Maybe simply showing all results in the kio slave makes more sense after all... opinions?
Comment 8 Oliver Traeger 2009-08-14 12:39:07 UTC
showing all results in the kio slave? I vote for YES ;)

My reasoning:

1. Since 4.3, given the tight integration into Dolphin and KRunner, nepomuksearch surely aims to become the default search engine on the KDE desktop, with more and more users employing it. Those users surely expect to find everything related to a given keyword, especially files whose names match that keyword. I understand that too many search results are of no big help, but missing out on a file that would match a given keyword is even worse.

2. I guess the internal rating could still be used to organize the search results.

3. nepomuksearch allows for pretty extensive ways to narrow down your search results (e.g. AND, OR).

4. I don't know if Nepomuk/Strigi treat filenames any differently from the file content, but since many users are still used to searching for filenames, maybe it is possible to have something like nepomuksearch:/hasFilename:keyword to avoid finding files that only contain the given keyword.

P.S. Apart from the "basic search functionality" issues, I love the metadata stuff you can do with Nepomuk. Great work!
Comment 9 Sebastian Trueg 2009-08-14 13:09:07 UTC
I will change the scoring thing for 4.4 then. Maybe even backport to 4.3 as it can be seen as a bug that results are not shown.

You can always use "filename:foobar". The searchable fields are defined by the ontologies.

Here we would need a way to let the user know what they can search for.
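
To make the "field:value" form concrete, here is a minimal sketch of how such a term could be separated into a field name and a search value. `parse_term` is a hypothetical helper written for illustration; it is not the Nepomuk query parser, and which field names are valid depends on the ontologies, as noted above.

```python
def parse_term(term):
    """Split 'field:value' into (field, value); plain terms get field None."""
    field, sep, value = term.partition(":")
    if sep and field and value:
        return field, value
    return None, term

print(parse_term("filename:foobar"))  # restricted to one field
print(parse_term("foobar"))           # plain term, no field restriction
```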
Comment 10 Sebastian Trueg 2009-08-14 13:19:39 UTC
SVN commit 1011339 by trueg:

Do not cut off results due to their low score by default.
At the moment our scoring is too bad anyway.
In the future this should become configurable through the service API.

BUG: 202123


 M  +1 -1      searchcore.cpp  
 M  +2 -2      searchcore.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1011339
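
The behaviour change described in this commit message can be pictured with a minimal sketch. It is an assumption-laden illustration, not the code in searchcore.cpp: `visible_results`, its `min_score` parameter and the score values are all hypothetical, and the file names are borrowed from Comment 6.

```python
def visible_results(scored, min_score=0.0):
    """Return result names whose score is at least min_score, best first."""
    kept = [(name, score) for name, score in scored if score >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]

# Hypothetical scores; the real values are internal and not shown in this report.
hits = [("alina2.jpg", 0.2), ("plasma-emailnotify.cpp", 0.9)]

print(visible_results(hits, min_score=0.5))  # a cutoff hides the low-scored hit
print(visible_results(hits))                 # no cutoff: every hit is shown, ranked
```

With no cutoff by default, the score is still useful for ordering the list, as suggested in point 2 of Comment 8.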
Comment 11 Oliver Traeger 2009-08-14 13:53:36 UTC
wow, that was quick ;)
I'd love to see a backport for 4.3.

Talking about "letting the user know": I'm in the middle of writing a little wiki article on ubuntuusers.de (in German) (http://wiki.ubuntuusers.de/Baustelle/Nepomuk) about Nepomuk (though I haven't written anything about using Nepomuk yet). If you like it, or if you don't because it's all wrong, just email me; maybe I could provide some "user perspective" documentation about Nepomuk (I'd also translate it into English, of course).

Thanks for your efforts
Comment 12 Sebastian Trueg 2009-08-15 10:43:10 UTC
it already is backported to 4.3. :)