Bug 268204

Summary: MYSQL : file-names are case-INsensitive
Product: [Applications] digikam Reporter: Daniel Bauer <linux>
Component: Database-MysqlAssignee: Digikam Developers <digikam-bugs-null>
Status: RESOLVED FIXED    
Severity: normal CC: caulier.gilles, Hamsi2k, linux, swatilodha27
Priority: NOR    
Version: 1.8.0   
Target Milestone: ---   
Platform: openSUSE   
OS: Linux   
Latest Commit: Version Fixed In: 7.0.0
Sentry Crash Report:

Description Daniel Bauer 2011-03-11 11:35:52 UTC
Version:           1.8.0 (using KDE 4.6.0) 
OS:                Linux

When using MySQL database, digikam cannot distinguish between filenames in upper or lower case, i.e. of tho images "image.jpg" and "image.JPG" only one is saved in the database.

This only occurs if using MySQL. 

Using sqlite this bug doesn't apply.

Reproducible: Didn't try

Steps to Reproduce:
While digikam not running, create a folder in the digikam album tree, put some files giving them same names, but change some characters from lower to upper case ("image.jpg", "image.JPG" etc.). Open digikam, search for new pictures, and see.

When refreshing the album (F5) sometimes version "lower case", sometimes version "upper case" is shown and the database gets updated accordingly. But it contains always only one version...

Actual Results:  
Always on one file is shown and saved in the database ("image.jpg" OR "image.JPG"). Upper/lower case is ignored.

Expected Results:  
As we are not on MS-DOS I want digikam to distinguish upper/lower-case as any other program on linux or Mac does that...

The database "images" is collated case-insensitive (utf8-unicode-ci). I guess, digikam searches the database to see if an image already exists, and the search is done case-insensitive (without a COLLATE option).
I could solve the problem for me changing the collation of the field "name" in "images" to latin1_general_cs. 
So ether the collations of the tables or fields should be changed to a case-insensitive collation, or a COLLATE must be added to the appropriate SELECT commands.
Comment 1 Daniel Bauer 2011-03-11 11:38:08 UTC
Sorry, "Reproducible: " should be: ALWAYS
Comment 2 Marcel Wiesweg 2011-03-11 18:08:57 UTC
Holger, we need to set this option at some point when creating the database?
Comment 3 Quallenauge 2011-03-17 10:20:51 UTC
Hi Daniel,

thanks for your detailed report!
I would like to add the COLLATE keyword to the table definition.
As collation type I would sugest to use utf8_bin since we use UTF-8 as default charset. Is this okay for you (and can you try to use this and give us a feedback - in my tests it seems to work)?

Marcel, do we need to update the database schema to version 7 or can I extend the current (version 6) schema definition (since the gsoc is not official released).
Comment 4 Daniel Bauer 2011-03-17 12:38:05 UTC
Hi, I just changed to collation of the whole table to utf8_bin, and it shows all images

BUT

digikam shows problems when importing files containing "Umlaute" (ä, ö, ü) in the file-name, eventually with other special characters, too (don't know). I exportet my original "images" table, renamed it, and created a new images table as follows:

CREATE TABLE IF NOT EXISTS `Images` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `album` int(11) DEFAULT NULL,
  `name` longtext NOT NULL,
  `status` int(11) NOT NULL,
  `category` int(11) NOT NULL,
  `modificationDate` datetime DEFAULT NULL,
  `fileSize` int(11) DEFAULT NULL,
  `uniqueHash` varchar(32) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `album` (`album`,`name`(332)),
  KEY `dir_index` (`album`),
  KEY `hash_index` (`uniqueHash`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=63295 ;


I have no problems when using latin1 collation and charset on the filed "name":

CREATE TABLE IF NOT EXISTS `Images` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `album` int(11) DEFAULT NULL,
  `name` longtext CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL,
  `status` int(11) NOT NULL,
  `category` int(11) NOT NULL,
  `modificationDate` datetime DEFAULT NULL,
  `fileSize` int(11) DEFAULT NULL,
  `uniqueHash` varchar(32) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `album` (`album`,`name`(332)),
  KEY `dir_index` (`album`),
  KEY `hash_index` (`uniqueHash`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=63295 ;

--------------
slightly off-topic:
why does field "album" allow NULL?
I have lots of entries in images with album name NULL.
Where do they come from?
I guess I can delete them???
--------------
Comment 5 Marcel Wiesweg 2011-03-17 19:49:46 UTC
> Marcel, do we need to update the database schema to version 7 or can I extend
> the current (version 6) schema definition (since the gsoc is not official
> released).

I would really like to avoid incrementing the schema for this, unless really necessary. Is it possible to solve via ALTER TABLE statements? Then we'd have a workaround for existing MySQL users.

> I have no problems when using latin1 collation and charset on the filed "name":

We use UTF8 everywhere internally and never latin1. Maybe something went wrong during import/export?

> why does field "album" allow NULL?

Information about deleted images is kept for some time, then the album is null. If you moved photos from A to B, then scan first A then B, information would be lost.

> As collation type I would sugest to use utf8_bin 

Weird that MySQL does not provide any utf8 _cs collations...
Comment 6 Daniel Bauer 2011-03-17 20:30:37 UTC
Hi Marcel, thanks for the info about the album = NULL, so I just ignore that...

> We use UTF8 everywhere internally and never latin1. Maybe something went wrong
> during import/export?

I've done several tests with several database collation settings. I always added a new folder in digikam and imported the images from a folder on my desktop. 

Whenever the table "images" (or just the field "name") had COLLATE=utf8_bin digikam didn't show the correct file-name for a file called "img_123_oberkörper.jpg" allthough the file was correctly saved in the database. In digikam album the image was replaced by the placeholder-image, clicking on it gave "could not load images" (or alike..) and renaming the file within digikam also did not work...

A "SELECT FROM images WHERE name LIKE '%oberk%'" gave the correct results. Don't know what happens within digikam...

Please try to import a file "nö.jpg" while collation for "name" is set to utf8_bin and see if you can reproduce...
Comment 7 Daniel Bauer 2011-03-17 20:38:21 UTC
> Is it possible to solve via ALTER TABLE statements?

When the database has collation utf8_unicode_ci, as far as I know you can always use "WHERE BINARY" or "COLLATE utf8_bin" in the select statements where case sensitive behaviour is necessary....
Comment 8 Marcel Wiesweg 2011-03-17 20:49:15 UTC
> When the database has collation utf8_unicode_ci, as far as I know you can
> always use "WHERE BINARY" or "COLLATE utf8_bin" in the select statements where
> case sensitive behaviour is necessary....

That could be the easier solution... Images.name, Albums.relativePath, AlbumRoots.specificPath, and in some cases also AlbumRoots.identifier are critical. Need to check if there are more fields involved. At least until now, no other obscure bugs have been discovered.
Comment 9 swatilodha27 2016-06-26 21:39:47 UTC

*** This bug has been marked as a duplicate of bug 364131 ***
Comment 10 caulier.gilles 2020-01-01 21:39:14 UTC
Not reproducible with 7.0.0 beta1.