Version: 0.9.2 (using KDE KDE 3.5.4)
Installed from: SuSE RPMs
OS: Linux
Characters in file or directory names are converted to Latin-1 by digikam. The resulting names are then used to form the Thumb::URI tag for thumbnails. According to the Thumbnail Managing Standard, the URI should be formed using the octet sequence that is actually used by the filesystem (in my case UTF-8) so that it can be used to access the file. According to RFC2396 (which is referenced by the TMS), a URI may only contain US-ASCII characters; octets that are not part of the US-ASCII set have to be escaped using the "%<hex>" syntax. digikam uses the unescaped Latin-1 octets instead. As other programs (e.g. konqueror) follow the standard, two thumbnail files may be generated for the same file using different URIs. Additionally, the re-encoding may fail for characters that cannot be represented in Latin-1, leading to wrong or incomplete URIs.
Accidentally selected the wrong KDE version: it's 3.5.7 (using openSUSE 10.3).
Hi Heiko, thanks for the report, I set the priority to high. I think that this is an important issue. The places I found where the actual path to the thumbnails is constructed are:
digikam/pixmapmanager.cpp: uri = md5.hexDigest();
kioslave/digikamthumbnail.cpp: thumbPath += QFile::encodeName( md5.hexDigest() ) + ".png";
libs/thumbbar/thumbbar.cpp: uri = md5.hexDigest();
utilities/batch/batchthumbsgenerator.cpp: uri = md5.hexDigest();
Are these all? The code usually looks like:
QString uri = "file://" + QDir::cleanDirPath(url.path());
KMD5 md5(QFile::encodeName(uri));
uri = md5.hexDigest();
QString smallThumbPath = d->thumbCacheDir + "normal/" + uri + ".png";
So the encodeName should be replaced, but I am not familiar with the hex encoding. One way might be something like
QString hex; hex.sprintf("%%%02X", uri);
(I am not sure about the first argument of sprintf ...) Heiko, do you maybe have a pointer to the source code which konqueror uses?
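To illustrate the format string discussed above: "%%" produces a literal percent sign and "%02X" prints one value as two upper-case hex digits, so the format has to be applied to a single octet at a time, not to the whole URI string. The following is only a sketch in plain C++ without Qt; the helper name escapeOctet is hypothetical and not part of digiKam.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not digiKam code): format one octet as "%XX".
// "%%" emits a literal '%', "%02X" emits two upper-case hex digits.
std::string escapeOctet(unsigned char octet)
{
    char buf[4];  // '%', two hex digits, terminating NUL
    std::snprintf(buf, sizeof(buf), "%%%02X", static_cast<unsigned>(octet));
    return std::string(buf);
}
```

For example, escapeOctet(0xC3) followed by escapeOctet(0xB6) gives the escaped form of the two UTF-8 octets of "ö".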
Hi, I must admit, I didn't even look at the code... I wrote a Perl script to clean up my .thumbnails directory and noticed that some URIs contained Latin-1 characters while my filesystems are in UTF-8. For the same file there usually was another entry using the hex-encoded URI. Checking the Software tag revealed the producers (digikam vs. konqueror). For the encoding: I'm afraid I can't help you much with the real code (I am not familiar with Qt), but generally: you have to replace all characters outside the US-ASCII range with the hex code (using capital letters) prefixed by "%". So your format string looks right, but of course it has to be applied only to those characters that are outside the US-ASCII set (and before converting the path to Latin-1). And it's not just the MD5 code for the thumbnail that is affected. The Thumb::URI tag written to the thumbnail file has to be encoded correctly, too.
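The escaping rule described above could be sketched roughly as follows, in plain C++ without Qt. This is an assumption-laden illustration, not the digiKam or konqueror implementation: the function name fileUriFromPath is hypothetical, the input is assumed to be an absolute path given as the raw octets used by the filesystem, and the set of characters left unescaped (ASCII letters, digits, the RFC 2396 "mark" characters and the "/" separator) is one reading of RFC 2396.

```cpp
#include <string>

// Hypothetical helper (not digiKam code): build a "file://" URI from the raw
// filesystem octets of an absolute path. Every octet that is not an ASCII
// letter, digit, RFC 2396 mark character or '/' is written as "%XX" using
// upper-case hex digits. No re-encoding to Latin-1 takes place.
std::string fileUriFromPath(const std::string& path)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    static const std::string keep = "-_.!~*'()/";  // marks plus path separator

    std::string uri = "file://";
    for (unsigned char c : path) {
        const bool alnum = (c >= 'A' && c <= 'Z') ||
                           (c >= 'a' && c <= 'z') ||
                           (c >= '0' && c <= '9');
        if (alnum || keep.find(static_cast<char>(c)) != std::string::npos) {
            uri += static_cast<char>(c);   // safe ASCII character, copy as-is
        } else {
            uri += '%';                    // escape everything else
            uri += hexDigits[c >> 4];
            uri += hexDigits[c & 0x0F];
        }
    }
    return uri;
}
```

Since both the MD5 file name and the Thumb::URI tag are derived from this string, computing it once and using it for both would keep the two consistent.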
Heiko, any news about this report? Is it still valid with digiKam 0.9.4? Gilles Caulier
I'm afraid the problem's still there. For example, look at the following thumbnails (Software is the "Software" tag, URI is "Thumb::URI"):
/home/heiko/.thumbnails/normal/0f2371678317882e56dcc233226617de.png: Software=Digikam Thumbnail Generator, URI=file:///home/media/photos/2008_K�lner_Zoo/dsc01908.jpg, mtime=1216725031
/home/heiko/.thumbnails/normal/190601fd9a42077eb8b170db6caf3704.png: Software=KDE Thumbnail Generator, URI=file:///home/media/photos/2008_K%C3%B6lner_Zoo/dsc01908.jpg, mtime=1216725031
Digikam uses Latin-1 to encode the German "ö" while KDE (correctly) uses "%xx-encoded" UTF-8.
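A Thumb::URI value like the first one above can be recognised mechanically, because a URI that follows RFC2396 contains only US-ASCII octets. The check below is a minimal sketch in plain C++; the function name hasRawNonAscii is hypothetical, and reading the tag out of the PNG file is left out.

```cpp
#include <string>

// Hypothetical helper: returns true if the URI contains any octet outside
// the US-ASCII range (0x80 or above), i.e. an octet that should have been
// written as "%XX" instead.
bool hasRawNonAscii(const std::string& uri)
{
    for (unsigned char c : uri) {
        if (c >= 0x80) {
            return true;
        }
    }
    return false;
}
```

With the two entries above, the URI holding the raw Latin-1 octet 0xF6 for "ö" would be flagged, while the one containing "%C3%B6" would pass.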
I found the problem in the digiKam thumbnail creator. Look at this code: http://websvn.kde.org/branches/extragear/kde3/graphics/digikam/kioslave/digikamthumbnail.cpp?view=markup At line 132, you can see: QString uri = "file://" + QDir::cleanDirPath(url.path(-1)); uri is then set as embedded text in the PNG file at line 197: img.setText(QString("Thumb::URI").latin1(), 0, uri); It sounds like the problem is at line 132 with the QDir::cleanDirPath() method. Now compare with the KDE KIO thumbnail creator: http://websvn.kde.org/branches/KDE/3.5/kdelibs/kio/kio/previewjob.cpp?revision=496090&view=markup At line 505, the URL is recorded in the PNG file as in digiKam: thumb.setText("Thumb::URI", 0, d->origName); and at line 384, the method statResultThumbnail() does not use QDir::cleanDirPath(). Andi, Marcel, what are your viewpoints? Note: this report is very important because if we fix it, it will speed up thumbnail rendering for non-Latin file paths. Gilles
I am updating this report with the KDE4 code to hack on: the Gwenview thumbnail generator also uses a QString directly to record the file URI in the PNG text chunk: http://lxr.kde.org/source/KDE/kdegraphics/gwenview/lib/thumbnailloadjob.cpp#225 I'm afraid the KDE thumbnail loader does not set the URI like this: http://lxr.kde.org/source/KDE/kdebase/runtime/kioslave/thumbnail/thumbnail.cpp Gilles Caulier
Heiko, for me the code from digiKam 0.10.0 (KDE4) does not save the text in the PNG with a Latin-1 conversion: http://lxr.kde.org/source/extragear/graphics/digikam/libs/threadimageio/thumbnailcreator.cpp#247 Can you try again? Gilles Caulier
OK, I can reproduce the problem here with KDE4, comparing Gwenview and digiKam. Good news: I have a fix, and these are the results:
[gilles@localhost large]$ exiftool b11cb1d3e54783446c86d995683882c0.png
ExifTool Version Number : 7.67
File Name : b11cb1d3e54783446c86d995683882c0.png
Directory : .
File Size : 51 kB
File Modification Date/Time : 2009:05:24 15:58:05+02:00
File Type : PNG
MIME Type : image/png
Image Width : 256
Image Height : 170
Bit Depth : 8
Color Type : RGB with Alpha
Compression : Deflate/Inflate
Filter : Adaptive
Interlace : Noninterlaced
Pixels Per Unit X : 3780
Pixels Per Unit Y : 3780
Pixel Units : Meters
Software : Digikam Thumbnail Generator
Thumb M Time : 1241199026
Thumb URI : file:///mnt/data/photo/test/batch%20queue%20manager/test%20with%20utf8%20char%20as%20'%C3%B6'/PICT2079.png
Image Size : 256x170
[gilles@localhost normal]$ exiftool b11cb1d3e54783446c86d995683882c0.png
ExifTool Version Number : 7.67
File Name : b11cb1d3e54783446c86d995683882c0.png
Directory : .
File Size : 14 kB
File Modification Date/Time : 2009:05:24 15:59:00+02:00
File Type : PNG
MIME Type : image/png
Image Width : 128
Image Height : 85
Bit Depth : 8
Color Type : RGB
Compression : Deflate/Inflate
Filter : Adaptive
Interlace : Noninterlaced
Pixels Per Unit X : 3780
Pixels Per Unit Y : 3780
Pixel Units : Meters
Software : Gwenview
Thumb Image Height : 2428
Thumb Image Width : 3646
Thumb M Time : 1241199026
Thumb Mimetype : image/png
Thumb Size : 24544311
Thumb Uri : file:///mnt/data/photo/test/batch%20queue%20manager/test%20with%20utf8%20char%20as%20'%C3%B6'/PICT2079.png
Image Size : 128x85
Gilles Caulier
SVN commit 972293 by cgilles: fix uri encryption path. Use KUrl::url() now, as Gwenview. BUG: 152877 M +2 -2 thumbnailbasic.cpp WebSVN link: http://websvn.kde.org/?view=rev&revision=972293