Bug 449460 - Under certain locales, attempt to paste Unicode text ends up as mojibake
Summary: Under certain locales, attempt to paste Unicode text ends up as mojibake
Status: RESOLVED UPSTREAM
Alias: None
Product: kde
Classification: I don't know
Component: general
Version: unspecified
Platform: Manjaro Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Unassigned bugs mailing-list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-01 16:01 UTC by spamless.9v5xj
Modified: 2022-02-02 19:42 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Demo video (2.33 MB, video/mp4)
2022-02-01 16:06 UTC, spamless.9v5xj

Description spamless.9v5xj 2022-02-01 16:01:53 UTC
SUMMARY
***
When the region in Formats - System Settings is set to certain values, copying formatted Unicode text (e.g. bold, italics, a hyperlink) and pasting it into LibreOffice as unformatted text produces garbled text.

***


STEPS TO REPRODUCE
1. Under Formats - System Settings, set the region to "Belgium - English (en_BE)". (Other regions also cause this issue, but for demonstration purposes I am using en_BE.)
2. Log out and log back in so the changes take effect.
3. Copy some formatted Unicode text (I went to ja.wikipedia.org and copied the header).
4. Open LibreOffice Writer and open the context menu. Go to Paste Special -> Unformatted Text and select that option.

OBSERVED RESULT
Text ends up as a nonsensical string of letters

EXPECTED RESULT
Text is rendered properly

ADDITIONAL INFORMATION
It appears the text is being interpreted as ISO-8859 rather than UTF-8: the identical garbled string can be produced by pasting the same text into Kate and changing the encoding to ISO-8859.
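
The effect described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not a trace of what LibreOffice does: it assumes "ISO-8859" means ISO-8859-1 (Latin-1), the sample string is hypothetical, and `misdecode` and `repair` are made-up helper names.

```python
# Sketch: produce mojibake by decoding UTF-8 bytes with the wrong codec.
# Assumption: "ISO-8859" is taken to mean ISO-8859-1 (Latin-1).


def misdecode(text: str, wrong_codec: str = "iso-8859-1") -> str:
    """Encode text as UTF-8, then decode those bytes as if they were wrong_codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "iso-8859-1") -> str:
    """Reverse misdecode(): re-encode with the wrong codec, decode as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


sample = "日本語"  # hypothetical sample text, 3 characters, 9 bytes in UTF-8
garbled = misdecode(sample)

print(repr(garbled))               # one garbled character per UTF-8 byte
print(len(sample), len(garbled))   # 3 9
print(repair(garbled) == sample)   # True
```

If the garbled text in LibreOffice round-trips back to the original through `repair`, that would support the theory that the bytes are intact and only the decoding step is wrong.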

This does not happen with locales such as en_US and en_GB, which suggests the behavior is unintended or at least fixable, hence this bug report.

The issue seems to affect only LibreOffice; however, since the bug is triggered by changing the region in Plasma settings, the evidence points to this being on KDE's end.
Comment 1 spamless.9v5xj 2022-02-01 16:06:02 UTC
Created attachment 146128 [details]
Demo video
Comment 2 spamless.9v5xj 2022-02-01 16:19:33 UTC
Quick clarification: This actually happens when attempting to paste any string of Unicode text.
Comment 3 spamless.9v5xj 2022-02-01 16:21:24 UTC
Quick clarification: This happens when attempting to copy any string of Unicode text. It is simply that one would only want to "Paste unformatted" when the text contains formatting. Alas, it seems the encoding info gets discarded along with the formatting.
Comment 4 Nate Graham 2022-02-01 23:17:24 UTC
What languages do you have set in the Languages page, and what order are they in?
Comment 5 spamless.9v5xj 2022-02-01 23:21:25 UTC
(In reply to Nate Graham from comment #4)
> What languages do you have set in the Languages page, and what order are
> they in?

Only one - American English.
Comment 6 Nate Graham 2022-02-02 19:11:55 UTC
Can you paste the contents of ~/.config/plasma-localerc?
Comment 7 spamless.9v5xj 2022-02-02 19:15:10 UTC
(In reply to Nate Graham from comment #6)
> Can you paste the contents of ~/.config/plasma-localerc?

[Formats]
LANG=en_BE.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_MONETARY=en_GB.UTF-8
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
useDetailed=true
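
For anyone checking their own file, the codeset of each entry can be read off with a short Python sketch. It assumes the file is plain INI as pasted above; `codesets` is a hypothetical helper name, and the text is embedded as a string rather than read from ~/.config/plasma-localerc so the example is self-contained.

```python
import configparser

# Contents as pasted in this comment.
LOCALERC = """\
[Formats]
LANG=en_BE.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_MONETARY=en_GB.UTF-8
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
useDetailed=true
"""


def codesets(ini_text: str) -> dict:
    """Map each LANG/LC_* entry in [Formats] to its codeset (text after the '.')."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case; the default lowercases option names
    parser.read_string(ini_text)
    result = {}
    for key, value in parser["Formats"].items():
        if key == "LANG" or key.startswith("LC_"):
            # "en_BE.UTF-8" -> "UTF-8"; drop any "@modifier" suffix
            _, _, codeset = value.partition(".")
            result[key] = codeset.split("@")[0] or "(none)"
    return result


found = codesets(LOCALERC)
for name, codeset in found.items():
    print(f"{name}: {codeset}")
print(all(c == "UTF-8" for c in found.values()))  # True
```

Every locale entry here names UTF-8 as its codeset, so the file itself does not request ISO-8859.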
Comment 8 Nate Graham 2022-02-02 19:37:16 UTC
This has to be a problem in LibreOffice or deeper in the stack, then. All we do is set those variables; we don't play with encodings of anything.

Can you report this to the LibreOffice folks at https://bugs.documentfoundation.org? Thanks!
Comment 9 spamless.9v5xj 2022-02-02 19:42:31 UTC
(In reply to Nate Graham from comment #8)
> This has to be a problem in LibreOffice or deeper in the stack, then. All we
> do is set those variables; we don't play with encodings of anything.
> 
> Can you report this to the LibreOffice folks at
> https://bugs.documentfoundation.org? Thanks!

Alright, will do. Thanks for the help!