Bug 259318

Summary: Dolphin's Nepomuk search doesn't handle accents properly
Product: nepomuk
Component: general
Status: RESOLVED UPSTREAM
Severity: normal
Priority: NOR
Version: 4.1
Target Milestone: ---
Platform: Ubuntu
OS: Linux
Reporter: Alvaro Manuel Recio Perez <amrecio>
Assignee: Sebastian Trueg <sebastian>
CC: alexvpetrov, jens, kde, trueg
Latest Commit:
Version Fixed In:

Description Alvaro Manuel Recio Perez 2010-12-09 13:01:29 UTC
Version:           4.1 (using KDE 4.5.85) 
OS:                Linux

First of all, I don't really know if this bug belongs to Nepomuk, Strigi or Dolphin.

My name is Álvaro (with an accented A) and I have a lot of documents indexed by Nepomuk (actually I guess Strigi indexed them) with my name in their contents. If I try to search for "Álvaro" (with an accent), I get no results. If I search for "Alvaro", I get the documents that contain either "Alvaro" or "Álvaro".

I've tried to do the same with other accented words and the effect is the same.

Reproducible: Always

Steps to Reproduce:
1. Use Nepomuk to index a document containing an accented word.
2. Open Dolphin and try to search for that word using the search bar.

Actual Results:  
No results are shown.

Expected Results:  
Documents containing the word should be shown to the user.

OS: Linux (x86_64) release 2.6.35-23-generic
Compiler: cc
Comment 1 Jens Bergqvist 2011-01-10 17:38:28 UTC
I have a similar situation in 4.5.95 (4.6 RC2) on Kubuntu 10.10 (32 and 64 bit), only I get no search results at all for accented characters. In my case Nepomuk/Strigi does not associate e.g. á with a or ö with o, which seems to be what happens for the original reporter.
Comment 2 Sebastian Trueg 2011-01-19 20:15:57 UTC
The next version of Virtuoso will contain a new configuration parameter that normalizes accents for full text queries.
I already added support for that configuration to Nepomuk. Thus, it will be used as soon as the new Virtuoso is installed.
However, only newly added text is affected. I will experiment with updating though.
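
For illustration only, here is a minimal Python sketch of what accent normalization for full-text matching could look like. It is not Virtuoso's or Nepomuk's implementation; the names `fold_accents`, `build_index` and `search`, the tiny in-memory index and the sample documents are all hypothetical. The idea is to strip combining marks from both the indexed words and the query so that "Álvaro" and "Alvaro" compare equal.

```python
import unicodedata


def fold_accents(text):
    """Return text with combining marks removed, e.g. 'Álvaro' -> 'Alvaro'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(docs):
    """Map each folded, lowercased word to the ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(fold_accents(word).lower(), set()).add(doc_id)
    return index


def search(index, query):
    """Look up a single word, folding the query the same way as the index."""
    return index.get(fold_accents(query).lower(), set())


# Hypothetical sample documents, one with the accent and one without.
docs = {"a.txt": "Hola Álvaro", "b.txt": "Hello Alvaro"}
index = build_index(docs)

print(sorted(search(index, "Álvaro")))
print(sorted(search(index, "Alvaro")))
```

In a scheme like this the folding is applied when text is indexed, which would be consistent with the remark that only newly added text is affected: words stored before the option was enabled would remain unfolded until they are indexed again.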
Comment 3 Sebastian Trueg 2011-02-14 14:25:31 UTC
*** Bug 266294 has been marked as a duplicate of this bug. ***
Comment 4 Ignacio Serantes 2011-02-14 14:34:53 UTC
When will the next version of Virtuoso be available?

On the other hand, queries in KDE 4.5 work well with Unicode characters. In my case I use many Korean and Japanese characters and the results were accurate, so I wonder whether this could be considered a Virtuoso problem.
Comment 5 Sebastian Trueg 2011-09-28 07:19:07 UTC
*** Bug 282950 has been marked as a duplicate of this bug. ***