Bug 447297

Summary: UTF-8 characters decoded incorrectly on reply
Product: [Applications] kmail2
Reporter: kzi <kde>
Component: composer
Assignee: kdepim bugs <kdepim-bugs>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: openSUSE   
OS: Linux   
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description kzi 2021-12-20 17:02:52 UTC
SUMMARY
If Configure > Composer > Charset > Keep original charset ... is checked, quoted UTF-8 encoded messages are incorrectly decoded as Latin-1 or similar on reply, leading to character sequences such as Ã¼ for ü or Ã© for é and so on.

If the checkbox is unchecked, non-ASCII UTF-8 characters are decoded correctly.

See ADDITIONAL INFORMATION below for some settings of mine that might be related. Happy to provide additional information.


STEPS TO REPRODUCE
1. Choose an eMail with Content-Type: text/html; charset=UTF-8 (or "utf-8") and Content-Transfer-Encoding: base64 (not sure whether the transfer encoding matters)
2. Check "Keep original charset ..." (see above)
3. Reply to chosen eMail
4. Uncheck "Keep original charset ..." (see above)
5. Reply to chosen eMail

OBSERVED RESULT
UTF-8 characters decoded as e.g. Ã¼ instead of ü when "Keep original charset ..." is checked.
UTF-8 characters correctly decoded when "Keep original charset ..." is unchecked.
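
For illustration, the observed symptom matches what happens when UTF-8 bytes are read as Latin-1 (ISO-8859-1). The following is a minimal Python sketch of that effect only; it is not KMail code, and the variable names are made up for this example.

```python
# Illustration only: UTF-8 bytes misread as Latin-1 (ISO-8859-1).
original = "ü é"

# In UTF-8, each of these characters is encoded as two bytes.
utf8_bytes = original.encode("utf-8")

# Latin-1 maps every byte to exactly one character, so decoding never
# fails; each two-byte sequence simply turns into two characters.
misdecoded = utf8_bytes.decode("latin-1")
print(misdecoded)  # Ã¼ Ã©

# The damage is reversible as long as no bytes were dropped on the way.
repaired = misdecoded.encode("latin-1").decode("utf-8")
print(repaired == original)  # True
```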

EXPECTED RESULT
UTF-8 characters correctly decoded even when "Keep original charset ..." is checked.


SOFTWARE/OS VERSIONS
Linux/KDE Plasma: openSUSE Tumbleweed 20211212
KDE Plasma Version: 5.23.4
KDE Frameworks Version: 5.88.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
Some other settings of mine:
Configure > Appearance > General > Override character encoding: Auto
Configure > Composer > General > Reply or forward ... (plain text or HTML): unchecked
Configure > Composer > Charset: {utf-8, iso-8859-1, us-ascii, utf-8 (locale)} in this order

Comment 1 Bug Janitor Service 2023-03-07 20:38:24 UTC
A possibly relevant merge request was started @ https://invent.kde.org/pim/messagelib/-/merge_requests/107

Comment 2 Laurent Montel 2023-03-09 12:01:54 UTC
Git commit 29a5a05e2078b75f0a994e29e92707e3ec81e2d1 by Laurent Montel, on behalf of Fabian Vogt.
Committed on 09/03/2023 at 07:27.
Pushed by fvogt into branch 'master'.

Fix fallback path in MessageFactoryNG::applyCharset

In the case that the codec of the original message could not encode the reply,
it was still set as charset but the body encoded with the fallback codec.
This resulted in replies having messed up encoding.
It can be triggered by replying to multipart mails which define the charset
in parts only or if the reply template ends up with other special characters.
Related: bug 443009, bug 298349

M  +64   -0    messagecomposer/autotests/messagefactoryngtest.cpp
M  +3    -0    messagecomposer/autotests/messagefactoryngtest.h
M  +5    -4    messagecomposer/src/helper/messagefactoryng.cpp

https://invent.kde.org/pim/messagelib/commit/29a5a05e2078b75f0a994e29e92707e3ec81e2d1
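
The fallback-path problem described in the commit message above can be sketched roughly as follows. This is a hypothetical Python model, not the actual C++ code in messagefactoryng.cpp: the function names, the "utf-8" fallback, and the sample text are assumptions chosen for illustration.

```python
# Hypothetical model of the behaviour described in the commit message.
# None of these names exist in messagelib; the fallback codec is assumed.

def apply_charset_buggy(body, original_charset, fallback="utf-8"):
    """Declare the original charset even when the fallback codec is used."""
    declared = original_charset
    try:
        payload = body.encode(original_charset)
    except UnicodeEncodeError:
        # Body is encoded with the fallback, but `declared` is not updated.
        payload = body.encode(fallback)
    return declared, payload


def apply_charset_fixed(body, original_charset, fallback="utf-8"):
    """Declare whichever charset was actually used to encode the body."""
    try:
        return original_charset, body.encode(original_charset)
    except UnicodeEncodeError:
        return fallback, body.encode(fallback)


# The euro sign cannot be encoded in ISO-8859-1, so the fallback is taken.
reply = "Grüße, 5 €"

declared, payload = apply_charset_buggy(reply, "iso-8859-1")
buggy_view = payload.decode(declared)   # what a receiving client would show

declared, payload = apply_charset_fixed(reply, "iso-8859-1")
fixed_view = payload.decode(declared)

print(buggy_view == reply)  # False
print(fixed_view == reply)  # True
```

In this model the receiving side trusts the declared charset, so a mismatch between the declared and the actually used codec garbles the text, while keeping the two in sync does not.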