Bug 500203 - Pasting Unicode text in clipboard history that was previously copied from Firefox would output incorrect characters
Status: RESOLVED FIXED
Alias: None
Product: plasmashell
Classification: Plasma
Component: Clipboard widget & pop-up
Version: 6.3.0
Platform: Arch Linux Linux
Importance: VHI normal
Target Milestone: 1.0
Assignee: Plasma Bugs List
Keywords: regression
Duplicates: 501130
Reported: 2025-02-16 17:54 UTC by Timothy B
Modified: 2025-03-26 15:03 UTC
CC List: 15 users

Version Fixed In: 6.3.4
Description Timothy B 2025-02-16 17:54:18 UTC
SUMMARY
When pasting text containing Unicode (non-ASCII) characters from the clipboard history, those characters are replaced in the pasted output with either � (a black diamond with a question mark) or a "\uXXXX" escape sequence.

STEPS TO REPRODUCE
1. Copy a snippet of non-Latin text or symbols from a webpage in Firefox (e.g., https://gist.github.com/StevenACoffman/a5f6f682d94e38ed804182dc2693ed4b)
2. Do one of the following: (a) copy another piece of text from any app, or (b) open the clipboard history popup and select an item other than the top one on the list
3. In the clipboard history popup, select the second item from the top of the list
4. Paste into a text box in any app

OBSERVED RESULT
The output has Unicode characters replaced with either � or something like "\u0105". However, those characters appear correctly in both the clipboard history popup and the database file containing the clipboard items.

Using the contents of the above link as an example, "а ạ ą ä à á ą" from the clipboard history would be pasted as "\u0430 \u1ea1 \u0105 � � � \u0105"
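
One hypothesis that reproduces this output exactly (an assumption on my part, not something confirmed in this report): the legacy plain-text variant of the clip is encoded as Latin-1 with "\uXXXX" escapes for characters outside Latin-1, and those bytes are later decoded as UTF-8. A minimal Python sketch of that round trip follows; the function name `simulate_garbling` is made up for illustration.

```python
def simulate_garbling(text: str) -> str:
    """Sketch of one possible cause of the garbled paste (assumption, not confirmed)."""
    # Assumed step 1: characters outside Latin-1 become \uXXXX escapes,
    # characters inside Latin-1 (such as U+00E4) become single raw bytes.
    latin1_bytes = text.encode("latin-1", errors="backslashreplace")
    # Assumed step 2: the bytes are read back as UTF-8; lone Latin-1 bytes
    # are invalid UTF-8 and turn into U+FFFD (the black diamond).
    return latin1_bytes.decode("utf-8", errors="replace")


# The sample string from this report, written with escapes so the exact
# code points are unambiguous: "а ạ ą ä à á ą"
sample = "\u0430 \u1ea1 \u0105 \u00e4 \u00e0 \u00e1 \u0105"
print(simulate_garbling(sample))
```

Under these assumptions the sample comes out as "\u0430 \u1ea1 \u0105 � � � \u0105", which matches the pasted text above.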

EXPECTED RESULT
The pasted text would have characters exactly as displayed in the clipboard history popup, with or without formatting depending on the app.

SOFTWARE/OS VERSIONS
Operating System: EndeavourOS 
KDE Plasma Version: 6.3.0
KDE Frameworks Version: 6.10.0
Qt Version: 6.8.2
Graphics Platform: Wayland

ADDITIONAL INFORMATION
This behavior started in Plasma 6.3.0; I never encountered this in prior versions.

Sometimes, if you paste the history item to a textbox in Firefox, it would output nothing.

I found that in the new SQLite-based database containing the clipboard items (mine was located at ~/.local/share/klipper/history3.sqlite), the affected items would have something like this in the mimetype column:

```
text/html,text/_moz_htmlcontext,text/_moz_htmlinfo,text/plain;charset=utf-8,UTF8_STRING,COMPOUND_TEXT,TEXT,text/plain,STRING,text/plain;charset=ANSI_X3.4-1968,text/x-moz-url-priv
```

A workaround is to edit the affected item by adding/removing characters from within the clipboard history popup. This changes the MIME type for that specific item in the database to "text/plain,text/plain;charset=utf-8", so the Unicode characters in that snippet paste correctly, at the expense of losing any formatting.
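
To spot affected items quickly, the value of the mimetype column can be scanned for plain-text entries that declare a charset other than UTF-8 (ANSI_X3.4-1968 is the registered name for US-ASCII). This is a rough Python sketch; the function name is hypothetical, and it assumes the column is a comma-separated list as shown above.

```python
def non_utf8_text_types(mimetype_column: str) -> list[str]:
    """Return text/plain entries whose charset parameter is not UTF-8."""
    flagged = []
    for entry in mimetype_column.split(","):
        base, _, params = entry.partition(";")
        if base.strip().lower() != "text/plain":
            continue  # only plain-text variants are of interest here
        for param in params.split(";"):
            key, _, value = param.partition("=")
            is_charset = key.strip().lower() == "charset"
            if is_charset and value.strip().lower() not in ("utf-8", "utf8"):
                flagged.append(entry.strip())
    return flagged
```

For the mimetype value quoted above, this returns only the "text/plain;charset=ANSI_X3.4-1968" entry.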
Comment 1 Jeff Huang 2025-02-17 03:44:50 UTC
I can confirm this issue on Arch Linux.
Comment 2 David Edmundson 2025-02-17 15:58:40 UTC
I cannot reproduce, can you share the output of 'env' in a command line
Comment 3 Timothy B 2025-02-17 23:58:45 UTC
(In reply to David Edmundson from comment #2)
> I cannot reproduce, can you share the output of 'env' in a command line

BROWSER=firefox
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
DISPLAY=:0
EDITOR=nvim
ELECTRON_OZONE_PLATFORM_HINT=wayland
ELECTRON_TRASH=kioclient
GTK2_RC_FILES=/home/pibeng/.config/gtk-2.0/gtkrc
GTK_RC_FILES=/etc/gtk/gtkrc:/home/pibeng/.gtkrc:/home/pibeng/.config/gtkrc
GTK_USE_PORTAL=1
KDE_APPLICATIONS_AS_SCOPE=1
KDE_FULL_SESSION=true
KDE_SESSION_UID=1000
KDE_SESSION_VERSION=6
KGLOBALACCELD_PLATFORM=org.kde.kwin
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_ADDRESS=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_TIME=en_US.UTF-8
PATH=/home/pibeng/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/var/lib/flatpak/exports/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
QT_AUTO_SCREEN_SCALE_FACTOR=0
QT_FONT_DPI=96
QT_WAYLAND_RECONNECT=1
SESSION_MANAGER=local/pbngtower1:@/tmp/.ICE-unix/1266,unix/pbngtower1:/tmp/.ICE-unix/1266
USER=pibeng
WAYLAND_DISPLAY=wayland-0
XDG_CURRENT_DESKTOP=KDE
XDG_MENU_PREFIX=plasma-
XDG_RUNTIME_DIR=/run/user/1000
XDG_SEAT=seat0
XDG_SESSION_TYPE=wayland
XDG_VTNR=1
XKB_DEFAULT_LAYOUT=us
Comment 4 Timothy B 2025-02-27 15:39:36 UTC
Changing the status back to REPORTED because I already posted my `env` output and it's been 10 days since the last comment.
Comment 5 Fushan Wen 2025-02-27 15:42:13 UTC
Cannot reproduce on X11
Comment 6 Fushan Wen 2025-02-27 15:44:48 UTC
> text/plain;charset=ANSI_X3.4-1968

This might be the culprit, but my clip doesn't have it.
Comment 7 Bug Janitor Service 2025-02-27 16:12:26 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/5266
Comment 8 Fushan Wen 2025-02-28 00:38:32 UTC
Git commit 709f0738216fee7523f09f1169965c3a30b32095 by Fushan Wen.
Committed on 28/02/2025 at 00:12.
Pushed by fusionfuture into branch 'master'.

appiumtests: test copying UTF-8 string

M  +6    -4    appiumtests/applets/clipboardtest.py

https://invent.kde.org/plasma/plasma-workspace/-/commit/709f0738216fee7523f09f1169965c3a30b32095
Comment 9 Bug Janitor Service 2025-02-28 03:16:55 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/5268
Comment 10 Fushan Wen 2025-02-28 11:54:47 UTC
Git commit 183764fdc34b1438d632255d5b655e12ab38570e by Fushan Wen.
Committed on 28/02/2025 at 11:23.
Pushed by fusionfuture into branch 'master'.

klipper: ignore non-UTF-8 encoded plain text

Qt clipboard doesn't support other encodings.
FIXED-IN: 6.3.3

M  +7    -4    appiumtests/applets/clipboardtest.py
M  +7    -0    klipper/updateclipboardjob.cpp
M  +1    -0    klipper/updateclipboardjob.h

https://invent.kde.org/plasma/plasma-workspace/-/commit/183764fdc34b1438d632255d5b655e12ab38570e
Comment 11 Fushan Wen 2025-02-28 12:43:37 UTC
Git commit d79fbbb31decb97e985e0a89718fc05605f9e4de by Fushan Wen.
Committed on 28/02/2025 at 12:22.
Pushed by fusionfuture into branch 'Plasma/6.3'.

klipper: ignore non-UTF-8 encoded plain text

Qt clipboard doesn't support other encodings.
FIXED-IN: 6.3.3


(cherry picked from commit 183764fdc34b1438d632255d5b655e12ab38570e)

Co-authored-by: Fushan Wen <qydwhotmail@gmail.com>

M  +7    -4    appiumtests/applets/clipboardtest.py
M  +7    -0    klipper/updateclipboardjob.cpp
M  +1    -0    klipper/updateclipboardjob.h

https://invent.kde.org/plasma/plasma-workspace/-/commit/d79fbbb31decb97e985e0a89718fc05605f9e4de
Comment 12 Anthony Wang 2025-03-11 21:07:04 UTC
I updated to Plasma 6.3.3 on Arch Linux and sadly this bug can still be reproduced in the same way.

For instance, if I copy the text 狐狸 in Firefox, there's a new row in the main table with the correct mimetype:

INSERT INTO main VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2',1741727038.727999926,1741727038.727999926,'text/plain;charset=utf-8,UTF8_STRING,COMPOUND_TEXT,TEXT,text/plain,STRING','狐狸',NULL);

However, its corresponding rows in the aux table look like this:

INSERT INTO aux VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2','text/plain;charset=utf-8','7f1efc3b619b12d38164b3610171e17bf13680a2');
INSERT INTO aux VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2','text/plain','ce2b5da05d513aed6cf70611ecc367cd333f69ef');

Indeed, if I run `cat .local/share/klipper/data/7f1efc3b619b12d38164b3610171e17bf13680a2/7f1efc3b619b12d38164b3610171e17bf13680a2` I get 狐狸, while `cat .local/share/klipper/data/7f1efc3b619b12d38164b3610171e17bf13680a2/ce2b5da05d513aed6cf70611ecc367cd333f69ef` prints \u72d0\u72f8, which I'm guessing is the source of this bug.

However, if I delete 狐狸 from my clipboard history and repeat this experiment inside KWrite, here are the rows that are added:

INSERT INTO main VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2',1741727174.533999919,1741727174.533999919,'text/plain,text/plain;charset=utf-8','狐狸',NULL);
INSERT INTO aux VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2','text/plain','7f1efc3b619b12d38164b3610171e17bf13680a2');
INSERT INTO aux VALUES('7f1efc3b619b12d38164b3610171e17bf13680a2','text/plain;charset=utf-8','7f1efc3b619b12d38164b3610171e17bf13680a2');

In this case, both aux rows point to the same file which contains 狐狸, not \u72d0\u72f8.
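
This comparison can be automated. Below is a minimal Python sketch that lists history items whose aux rows point to more than one data file. It assumes the aux table has exactly the three columns implied by the INSERT statements above (item ID, MIME type, data file name, in that order); the real column names are not shown here, so rows are read by position, and `find_divergent_items` is an illustrative name.

```python
import sqlite3


def find_divergent_items(conn: sqlite3.Connection) -> list[str]:
    """Return IDs of items whose MIME types are backed by different data files."""
    files_by_item: dict[str, set[str]] = {}
    # Positional unpacking: (item ID, MIME type, data file name) is an
    # assumption based on the INSERT statements, not on the actual schema.
    for item_id, _mimetype, data_file in conn.execute("SELECT * FROM aux"):
        files_by_item.setdefault(item_id, set()).add(data_file)
    return sorted(item for item, files in files_by_item.items() if len(files) > 1)
```

Run against a connection opened with `sqlite3.connect()` on the history3.sqlite file, the Firefox copy above would be reported, while the KWrite copy would not.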
Comment 13 Timothy B 2025-03-13 01:22:21 UTC
Can confirm that the patch mentioned in comment 11 did not fix the issue. Copying "а ạ ą ä à á ą" above and following the steps to reproduce still pastes as "\u0430 \u1ea1 \u0105 � � � \u0105" in Kate, and the text doesn't paste at all in Firefox.
Comment 14 Anthony Wang 2025-03-13 02:50:00 UTC
I took a glance at https://invent.kde.org/plasma/plasma-workspace/-/blob/master/klipper/updateclipboardjob.cpp, which was edited in the patch, and I'm stumped as to why copying Unicode text from Firefox would lead to different results than copying from KWrite or other apps. I'm guessing maybe Firefox is the one that's converting Unicode text into \uXXXX form for the text/plain mimetype.

Two possible solutions could be making Klipper only save the text/plain;charset=utf-8 version of the text if both that and text/plain are given to Klipper by an app, or making Klipper prioritize text/plain;charset=utf-8 over text/plain when selecting an item from clipboard history (but I couldn't find the code that handles this). However, I don't know much about the Klipper codebase or what consequences either of these fixes would have.
Comment 15 Timothy B 2025-03-13 03:55:58 UTC
(In reply to Anthony Wang from comment #14)
> I'm guessing maybe Firefox is the one that's converting Unicode text into \uXXXX form for the text/plain mimetype.

The text appears with the expected characters on the Plasma clipboard history popup and in the corresponding database file if you were to inspect it with a SQLite browser or CLI utility, so I believe the problem is on Klipper's end.
Comment 16 John Kizer 2025-03-16 13:17:53 UTC
*** Bug 501130 has been marked as a duplicate of this bug. ***
Comment 17 Bug Janitor Service 2025-03-16 14:56:52 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/5325
Comment 18 skierpage 2025-03-19 20:10:11 UTC
(In reply to Timothy B from comment #0)
> Sometimes, if you paste the history item to a textbox in Firefox, it would
> output nothing.

I noticed this too on Fedora 41 (Plasma 6.3.3, Frameworks 6.12.0, Qt 6.8.2 on Wayland). Nothing happens when I paste the second item in Klipper into either Firefox's Location/URL field or an HTML textarea; but I see the garbled text if I paste it into Firefox's "Save As..." dialog. I've never seen Firefox decline to paste text clipboard contents.

Strangely, the Klipper text pastes fine into a Thunderbird e-mail composition window, but not Thunderbird's message search input field or "Search Messages" dialog.
Comment 19 Fushan Wen 2025-03-22 10:48:24 UTC
Git commit 9c014bad595b743de4ca7b236c4d467356505990 by Fushan Wen.
Committed on 22/03/2025 at 10:14.
Pushed by fusionfuture into branch 'master'.

klipper: let Qt handle plain text to deal with non-UTF-8 encodings

Data from text/plain might use a weird encoding on some systems, so
don't manually deal with plain text from raw bytes.
FIXED-IN: 6.3.4

M  +4    -4    klipper/autotests/v3migrationtest.py
M  +41   -36   klipper/updateclipboardjob.cpp
M  +2    -1    klipper/updateclipboardjob.h

https://invent.kde.org/plasma/plasma-workspace/-/commit/9c014bad595b743de4ca7b236c4d467356505990
Comment 20 Fushan Wen 2025-03-22 11:38:30 UTC
Git commit cd7d0215a7a0bc3503640f0f8c4e44bde08f7b20 by Fushan Wen.
Committed on 22/03/2025 at 10:49.
Pushed by fusionfuture into branch 'Plasma/6.3'.

klipper: let Qt handle plain text to deal with non-UTF-8 encodings

Data from text/plain might use a weird encoding on some systems, so
don't manually deal with plain text from raw bytes.
FIXED-IN: 6.3.4


(cherry picked from commit 9c014bad595b743de4ca7b236c4d467356505990)

Co-authored-by: Fushan Wen <qydwhotmail@gmail.com>

M  +4    -4    klipper/autotests/v3migrationtest.py
M  +41   -36   klipper/updateclipboardjob.cpp
M  +2    -1    klipper/updateclipboardjob.h

https://invent.kde.org/plasma/plasma-workspace/-/commit/cd7d0215a7a0bc3503640f0f8c4e44bde08f7b20
Comment 21 Victor Ryzhykh 2025-03-25 23:09:20 UTC
This does not fix the case where the selection is synchronized with the clipboard and the text is Cyrillic.
If the "Keep the selection and clipboard the same" option is enabled and the value "Text selection: Always save in history" is selected,
then after selecting Cyrillic text in a web browser, the text subsequently pasted from the clipboard looks like \u0441.
For example, I just selected the Cyrillic text "Вырабатывающая и передающая" in Firefox, and the following text was then pasted from the clipboard:
"\u0412\u044b\u0440\u0430\u0431\u0430\u0442\u044b\u0432\u0430\u044e\u0449\u0430\u044f \u0438 \u043f\u0435\u0440\u0435\u0434\u0430\u044e\u0449\u0430\u044f"
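
The escaped string can be decoded back to confirm that it carries the same text as the original selection. Here is a small Python sketch; `decode_escapes` is an illustrative name, and it assumes the pasted text contains only ASCII characters and \uXXXX escapes.

```python
def decode_escapes(pasted: str) -> str:
    """Turn literal \\uXXXX escape sequences back into the characters they name."""
    # encode("ascii") raises if the input contains anything but ASCII,
    # which keeps the unicode_escape step from misreading other bytes.
    return pasted.encode("ascii").decode("unicode_escape")


pasted = (
    r"\u0412\u044b\u0440\u0430\u0431\u0430\u0442\u044b\u0432\u0430\u044e\u0449\u0430\u044f"
    r" \u0438 \u043f\u0435\u0440\u0435\u0434\u0430\u044e\u0449\u0430\u044f"
)
print(decode_escapes(pasted))
```

Decoding the pasted string yields the selected Cyrillic text again.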
Comment 22 Fushan Wen 2025-03-26 04:11:45 UTC
(In reply to Victor Ryzhykh from comment #21)
> This does not correct the situation in case of synchronization of selection
> with the clipboard and display of Cyrillic.
> If the "Keep the selection and clipboard the same" option is enabled, and
> the value "Text selection: Always save in history" is selected,
> then in this case, selecting text in Cyrillic in a web browser, when you
> subsequently paste text from the clipboard, it looks like \u0441.
> Now I have highlighted the text in Cyrillic in Firefox: "Вырабатывающая и
> передающая", as a result, I then pasted the following text from the
> clipboard:
> "\u0412\u044b\u0440\u0430\u0431\u0430\u0442\u044b\u0432\u0430\u044e\u0449\u0430\u044f \u0438 \u043f\u0435\u0440\u0435\u0434\u0430\u044e\u0449\u0430\u044f"

It should be the same symptom, but the code is in another place.
Comment 23 Bug Janitor Service 2025-03-26 04:16:48 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/5340
Comment 24 Victor Ryzhykh 2025-03-26 09:42:34 UTC
(In reply to Bug Janitor Service from comment #23)
> A possibly relevant merge request was started @
> https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/5340

This patch fixed the bug. Thank you.
Comment 25 Fushan Wen 2025-03-26 12:56:43 UTC
Git commit a4b5f1d4e9878ab425a1273d6bfb48252ecedf0b by Fushan Wen.
Committed on 26/03/2025 at 04:14.
Pushed by fusionfuture into branch 'master'.

klipper: let Qt handle plain text to deal with non-UTF-8 encodings when syncing selections
FIXED-IN: 6.3.4

M  +11   -6    klipper/systemclipboard.cpp

https://invent.kde.org/plasma/plasma-workspace/-/commit/a4b5f1d4e9878ab425a1273d6bfb48252ecedf0b
Comment 26 Fushan Wen 2025-03-26 12:57:45 UTC
Git commit 8d81c6816e64a9b9df7a093cc50f093dc258ff7a by Fushan Wen.
Committed on 26/03/2025 at 12:57.
Pushed by fusionfuture into branch 'Plasma/6.3'.

klipper: let Qt handle plain text to deal with non-UTF-8 encodings when syncing selections
FIXED-IN: 6.3.4


(cherry picked from commit a4b5f1d4e9878ab425a1273d6bfb48252ecedf0b)

Co-authored-by: Fushan Wen <qydwhotmail@gmail.com>

M  +11   -6    klipper/systemclipboard.cpp

https://invent.kde.org/plasma/plasma-workspace/-/commit/8d81c6816e64a9b9df7a093cc50f093dc258ff7a
Comment 27 Anthony Wang 2025-03-26 15:03:32 UTC
I compiled Klipper from the Plasma/6.3 branch and can confirm this bug has finally been fixed, unlike the first PR attempt a few weeks ago. Thanks Fushan Wen!