Bug 452458 - Broken URLs generated for non-ascii character filenames
Status: RESOLVED FIXED
Alias: None
Product: kphotoalbum
Classification: Applications
Component: HTML generator
Version: GIT master
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KPhotoAlbum Bugs
Reported: 2022-04-10 09:53 UTC by Pierre Etchemaïté
Modified: 2022-04-10 15:44 UTC
Description Pierre Etchemaïté 2022-04-10 09:53:23 UTC
SUMMARY
Filenames containing accented characters are latin1(?) percent-encoded in URLs (e.g. é -> %E9) in the generated index.html, leading to broken links both locally and when browsed through an Apache server
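
To illustrate why such a link breaks, here is a minimal Python sketch (not KPhotoAlbum code). It assumes the page contains the Latin-1 escape %E9 for é, as the summary suggests, and that the server maps the decoded URL bytes directly to the on-disk filename; the variable names are made up for illustration.

```python
from urllib.parse import quote, unquote_to_bytes

name = "200416 a (Carré).jpg"  # filename from this report

# Bytes stored on disk under a UTF-8 locale (é is the two bytes C3 A9).
on_disk = name.encode("utf-8")

# Percent-encode the same name with two different byte encodings.
latin1_url = quote(name, encoding="latin-1")  # what the broken page seems to contain
utf8_url = quote(name, encoding="utf-8")      # what matches the file on disk

print(latin1_url)  # 200416%20a%20%28Carr%E9%29.jpg
print(utf8_url)    # 200416%20a%20%28Carr%C3%A9%29.jpg

# Decoding the escapes gives the bytes the server would look up.
print(unquote_to_bytes(latin1_url) == on_disk)  # False: single byte E9, no such file
print(unquote_to_bytes(utf8_url) == on_disk)    # True
```

The Latin-1 form decodes to the single byte E9, which does not match the UTF-8 bytes of the real filename, so the lookup fails.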

STEPS TO REPRODUCE
1. Create an image with non-ASCII characters in its name (é, è, ê, ...):
    $ ls *Carré\).* 
    '200416 a (Carré).jpg'
    $ ls *Carré\).*|od -c
    0000000   2   0   0   4   1   6       a       (   C   a   r   r 303 251
    0000020   )   .   j   p   g  \n
    0000026
2. Generate an HTML page containing that image
3. (publish result on a web server)
4. Browse the page (tested with Konqueror, Firefox, Chromium)
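
For reference, the od output in step 1 shows the accented character as the octal bytes 303 251. A small Python check (illustrative only) confirms that these are the UTF-8 encoding of é, i.e. the filename on disk is UTF-8 and not Latin-1:

```python
# Octal byte values printed by `od -c` for the accented character.
raw = bytes([0o303, 0o251])
print(raw.hex())  # c3a9

# Interpreted as UTF-8, the two bytes form the single character é (U+00E9).
char = raw.decode("utf-8")
print(char == "\u00e9")  # True

# Interpreted as Latin-1, the same bytes would be two separate characters.
print(len(raw.decode("latin-1")))  # 2
```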

OBSERVED RESULT
Thumbnail for the image is okay, but mouse-over preview and full image are broken links

EXPECTED RESULT
All images to appear in generated page

SOFTWARE/OS VERSIONS
Linux/KDE Plasma:  Ubuntu 21.10 with KDE libs
KDE Plasma Version: 5.22.5
KDE Frameworks Version: 5.86.0
Qt Version: 5.15.2
Comment 1 Pierre Etchemaïté 2022-04-10 10:09:11 UTC
Extra information:
filesystem: ext4
locale:
$ locale   
LANG=fr_FR.UTF-8
LANGUAGE=
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=
Comment 2 Tobias Leupold 2022-04-10 10:37:19 UTC
Git commit 99ca48526ec6b6609af674be0401e43f7e39bb19 by Tobias Leupold.
Committed on 10/04/2022 at 10:34.
Pushed by tleupold into branch 'master'.

Use UTF-8 characters without masking them when generating HTML

M  +2    -14   HTMLGenerator/Generator.cpp
M  +0    -1    HTMLGenerator/Generator.h

https://invent.kde.org/graphics/kphotoalbum/commit/99ca48526ec6b6609af674be0401e43f7e39bb19
Comment 3 Tobias Leupold 2022-04-10 10:39:02 UTC
Thanks for your report! I see this as well for non-ascii characters. The locale doesn't matter I think.

As we use UTF-8 for the HTML page anyway, I think we can simply leave out the masking of special characters and leave them as-is. This fixes the gallery for me for non-ascii characters.
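
The idea can be sketched in Python as follows. This is only an illustration of "leave the characters as-is in a UTF-8 page", not the actual change in Generator.cpp; the helper link_tag and the page layout are hypothetical, and whether the real generator treats spaces or other characters specially is not shown here.

```python
import html
from urllib.parse import quote, unquote_to_bytes

def link_tag(filename: str) -> str:
    """Hypothetical helper: build a link without masking non-ASCII characters."""
    # Only HTML-significant characters (&, <, >, quotes) are escaped.
    escaped = html.escape(filename, quote=True)
    return f'<a href="{escaped}">{escaped}</a>'

filename = "200416 a (Carré).jpg"
page = (
    '<!DOCTYPE html>\n<html><head><meta charset="utf-8"></head>\n'
    f"<body>{link_tag(filename)}</body></html>\n"
)

# The page is written as UTF-8, so é is stored as the bytes C3 A9.
page_bytes = page.encode("utf-8")
print(b"Carr\xc3\xa9" in page_bytes)  # True

# Browsers percent-encode non-ASCII path characters as UTF-8 when requesting
# the link, which round-trips to the filename bytes on disk.
requested = quote(filename, encoding="utf-8")
print(unquote_to_bytes(requested) == filename.encode("utf-8"))  # True
```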
Comment 4 Pierre Etchemaïté 2022-04-10 15:44:40 UTC
Problem fixed indeed, thanks for this quick reply!