Bug 464853

Summary: Adding new Keyword results in error (similar keyword exists) - Add recommendation in online doc about strings encoding support in Mysql/Mariadb
Product: [Applications] digikam Reporter: RalfPeter <ralf.walhoefer>
Component: DocumentationAssignee: Digikam Developers <digikam-bugs-null>
Status: REPORTED ---    
Severity: normal CC: axel.krebs, b4yf823ap56y457945ty4w7o, caulier.gilles, metzpinguin
Priority: NOR    
Version: 7.9.0   
Target Milestone: ---   
Platform: Microsoft Windows   
OS: Microsoft Windows   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Trying to Add "Ächt" while "Acht" already exists in Keywords

Description RalfPeter 2023-01-26 14:30:21 UTC
Created attachment 155668 [details]
Trying to Add "Ächt" while "Acht" already exists in Keywords

SUMMARY
***
If you have already keywords and try to add one wich is similar to an existing one, you get an error from the database: keyword cannot created.
Examples for "similar" keywords (i added some city names): Hohn <-> Höhn; Tuttlingen <-> Tüttlingen; aso...
***

STEPS TO REPRODUCE
1. Add a Keyword with the Keyword Manager (Burgenstock as example)
2. Add a similar Keyword with the Keyword Manager (one letter ist changed from base letter to german umlaut: Bürgenstock)
3. Error from Database

2nd try:
1. Add a Keyword with Keyword Manager (Burgenstock as example)
2. Add another Keyword (Brgenstock [with "u"] as example)
3. You get a new Keyword Brgenstock
4. Rename Brgenstock -> Bürgenstock 
5. Seems ok. Restart Digikam
6. Second keyword still is Brgenstock

OBSERVED RESULT
Even other not ascii sign produce the same problem (such as é, ú, ó ...). Can it be a problem with the database engine and the default encoding? I use internal MariaDB on a SSD.

EXPECTED RESULT
I want to create similar Keywords, becaus many city names in europe are only different in one letter (base letter -> umlaut -> accent). I need them both.

SOFTWARE/OS VERSIONS
Windows: 10

ADDITIONAL INFORMATION
Internal MariaDB on a SSD
Comment 1 Maik Qualmann 2023-01-26 18:02:27 UTC
The behavior is unfortunately normal, you are using MySQL. In the text encoding (we use one with UTF8), MySQL has no distinction between upper and lower case and between the German umlauts U-Ü, A-Ä or O-Ö. It's all the same for MySQL. There would be only one way we would use for filenames and the path, a binary storage of the text. When accessing the binary information, it would always have to be converted back, this effort is out of the question for tags. Otherwise, searches would be case-sensitive, etc.

What can you do?
Use SQLite, SQLite can easily be used for more than 100000 items and on an SSD it is even faster than MySQL With digiKam-8.0.0 and WAL mode support, transactions of the SQLite DB are even faster and more stable.

Use the typical German notation for umlauts, i.e. UE, AE, or OE, you know.

Maik
Comment 2 caulier.gilles 2023-05-03 03:35:41 UTC
*** Bug 321242 has been marked as a duplicate of this bug. ***
Comment 3 caulier.gilles 2023-05-03 03:48:31 UTC
Maik,

Do we needs to add a tip in the online documentation about mysql/mariadb text encoding which do not support the case sensitive ?
Gilles
Comment 4 caulier.gilles 2023-10-15 08:11:03 UTC
Maik,

This Mysql encoding of string is always right with recent version of Mariadb/Mysql ?

If yes what the right annotation to put on the online documentation to warn end-users about this particularity ?

Best

Gilles
Comment 5 Maik Qualmann 2024-01-21 08:59:02 UTC
*** Bug 480117 has been marked as a duplicate of this bug. ***
Comment 6 Maik Qualmann 2024-01-21 16:31:47 UTC
Git commit c4fa9a3b631e7684722d70780c9c2df27a9e026c by Maik Qualmann.
Committed on 21/01/2024 at 17:30.
Pushed by mqualmann into branch 'master'.

handle case sensitive tags better under MySQL
Related: bug 480117
FIXED-IN: 8.3.0

M  +1    -1    NEWS
M  +1    -1    core/data/database/dbconfig.xml.cmake.in
M  +18   -0    core/libs/database/coredb/coredb.cpp

https://invent.kde.org/graphics/digikam/-/commit/c4fa9a3b631e7684722d70780c9c2df27a9e026c