Bug 410951 - Pasting text from Firefox adds bogus whitespace characters at the end of every line / sanitize pasted text
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: general
Version: Git
Platform: Other Linux
Importance: NOR major
Target Milestone: ---
Assignee: KWrite Developers
Duplicates: 412613
Reported: 2019-08-15 18:08 UTC by Nate Graham
Modified: 2019-12-12 15:07 UTC

Version Fixed In: 5.62


Attachments
Screen recording (2.66 MB, video/webm)
2019-08-15 18:44 UTC, Nate Graham
Screenshot (187.11 KB, image/png)
2019-12-08 19:30 UTC, Marcus Seyfarth
Screenshot GMail (66.96 KB, image/png)
2019-12-08 19:40 UTC, Marcus Seyfarth

Description Nate Graham 2019-08-15 18:08:46 UTC
Kate and KTextEditor built from current git master on Manjaro

I'm unsure whether this is a Firefox bug, a Kate bug, or some combination of the two. But regardless, this has started happening recently (in the last week or two).

1. Open Firefox (68.0.1)
2. Copy any text on https://kde.org, or any other webpage
3. Open Kate (git master)
4. Paste the text into a new document in Kate

The text is pasted with bogus whitespace characters after every line. This is extremely irritating for my typical development workflow, which involves copying and pasting a lot of text from my web browser. After saving and re-opening the affected file, the bogus whitespace characters are gone (so I cannot attach an example file, sorry).

This does not happen when pasting text from Chromium into Kate, or when pasting text from Firefox into Gedit; only Firefox -> Kate.

Because the problem is not seen when pasting text into Gedit, I am assuming Gedit automatically trims off bogus whitespace characters in pasted text during paste operations the way Kate seems to during save operations. Perhaps Kate should perform this sanitization during paste operations as well.
Comment 1 Christoph Cullmann 2019-08-15 18:21:58 UTC
Hmm, I pasted stuff into Kate and got no strange whitespace.

Using Firefox 68.0.1
Comment 2 Christoph Cullmann 2019-08-15 18:22:37 UTC
Which text do I need to copy and paste exactly?
Comment 3 Nate Graham 2019-08-15 18:37:04 UTC
For me, it's any text on any web page in Firefox. Any of the text on kde.org should suffice for testing purposes.
Comment 4 Christoph Cullmann 2019-08-15 18:39:34 UTC
I copied the "News" column and I just get

News

    KDE Applications 19.08 Brings New Features to Konsole, Dolphin, Kdenlive, Okular and Dozens of Other Apps
    Trusted IT Consulting Firms Directory Provides Businesses with KDE Support
    The Linux Application Summit is coming to Barcelona in November
    enioka Haute Couture Becomes a KDE Patron
    Powered by Plasma: ALBA Synchrotron in Barcelona, Spain
    Akademy 2019: Talk Schedule is out!
    Plasma + Usability & Productivity Sprint in Valencia, Spain
    Plasma 5.16 by KDE is Now Available
    Announcing Our Google Summer of Code 2019 Students
    Akademy 2019 registration now open

(exactly as visible here in Bugzilla, with no stray spaces at the end)
Comment 5 Nate Graham 2019-08-15 18:44:21 UTC
Created attachment 122148
Screen recording

Here's a screen recording.
Comment 6 Christoph Cullmann 2019-08-15 18:47:22 UTC
Hmm, I did the same, didn't have issues.

My versions:

KDE Frameworks 5.62.0
Qt 5.13.0 (built against 5.13.0)

Frameworks is from master via kdesrc-build
Kate, too.
Qt is from archlinux
Comment 7 Nate Graham 2019-08-15 18:48:22 UTC
Yeah my Qt is 5.13 as well.
Comment 8 Christoph Cullmann 2019-08-15 18:50:08 UTC
Hmm, then I am confused :(
We haven't altered the copy & paste code in ages and I am not sure what could have broken there.
Sanitizing text on paste is not a good idea; it can silently remove spaces you really "need", e.g. if you copy complex shell commands or code fragments.
Comment 9 Eike Hein 2019-08-15 19:44:15 UTC
I had a similar problem the other day. I was copying some text from KDE's Etherpad (notes.kde.org), specifically bullet point lists, to Kate and adding `",` at the end and using block insert mode to add `"` and some indentation in the front. Then I copied it into KDevelop. I was turning Etherpad bullets into a QML string list, basically.

Somewhere in this chain of events, some of the lines ended up with whitespace/newlines in front of the closing `"` that would get written out to disk, but not shown in the UI. I had to reload with F5 to see it and be able to remove it.
Comment 10 Christoph Cullmann 2019-08-24 15:01:34 UTC
Hmm, I would like to help you there, but I can't see how we would introduce such characters, as (if I don't misread our code) we just take the text 1:1 as we get it from the Qt clipboard/paste buffer.
Comment 11 Nate Graham 2019-08-24 15:05:10 UTC
I have no doubt that Firefox introduced the problem. I was just thinking that maybe Kate could correct for it because apparently Gedit does.
Comment 12 Christoph Cullmann 2019-08-25 10:03:13 UTC
I just checked our code once more, we do:

void KTextEditor::ViewPrivate::paste(const QString *textToPaste)
{
    m_temporaryAutomaticInvocationDisabled = true;
    doc()->paste(this, textToPaste ? *textToPaste : QApplication::clipboard()->text(QClipboard::Clipboard));
    m_temporaryAutomaticInvocationDisabled = false;
}

If Firefox messes up the text we get there, I am not aware of any fix that will not anger other people who want the text to be pasted 1:1.

I'm inclined to close this as not a bug.
Comment 13 Christoph Cullmann 2019-08-25 10:04:40 UTC
One question I have: what is the character code of the added space? You get it by e.g. saving and opening the file in okteta.
Comment 14 Nate Graham 2019-08-25 15:17:00 UTC
Okteta says the character is 0D (sorry if this is wrong or totally ignorant; I have no idea what I'm doing in a hex editor).
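
For reference, byte 0D is the carriage return (CR, '\r'), which together with 0A (LF, '\n') forms a Windows-style line ending. The check done above in a hex editor can also be sketched in code. This is only an illustration using std::string; the names LineEndingCounts and countLineEndings are hypothetical and are not part of Kate, KTextEditor, or Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: how many line endings of each kind were seen.
struct LineEndingCounts {
    std::size_t crlf = 0;   // "\r\n" pairs (bytes 0D 0A)
    std::size_t loneCr = 0; // '\r' not followed by '\n' (byte 0D alone)
    std::size_t loneLf = 0; // '\n' not preceded by '\r' (byte 0A alone)
};

// Walks the raw bytes once and classifies every line terminator.
LineEndingCounts countLineEndings(const std::string &bytes)
{
    LineEndingCounts counts;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == '\r') {
            if (i + 1 < bytes.size() && bytes[i + 1] == '\n') {
                ++counts.crlf;
                ++i; // skip the '\n' that belongs to this pair
            } else {
                ++counts.loneCr;
            }
        } else if (bytes[i] == '\n') {
            ++counts.loneLf;
        }
    }
    return counts;
}

// Usage check: text with nothing but Windows line endings.
void exampleCount()
{
    const LineEndingCounts counts = countLineEndings("News\r\nAkademy\r\n");
    assert(counts.crlf == 2);
    assert(counts.loneCr == 0);
    assert(counts.loneLf == 0);
}
```

A non-zero crlf or loneCr count for pasted text would point at carriage returns as the "bogus whitespace" rather than real trailing spaces.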
Comment 15 Christoph Cullmann 2019-08-25 15:18:22 UTC
Ok, then this can be fixed ;)
Comment 16 Christoph Cullmann 2019-08-25 15:27:02 UTC
Git commit e487a184bc6f31f3f0e6ab538cb3406ec282a6d5 by Christoph Cullmann.
Committed on 25/08/2019 at 15:26.
Pushed by cullmann into branch 'master'.

try to sanitize line endings on paste

M  +10   -7    src/document/katedocument.cpp

https://commits.kde.org/ktexteditor/e487a184bc6f31f3f0e6ab538cb3406ec282a6d5
Comment 17 Christoph Cullmann 2019-08-25 15:27:41 UTC
Please try my patch.
No idea why Firefox should start to emit Windows \r\n line endings, but that seems fixable.
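
The general idea of sanitizing only the line endings, as opposed to trimming whitespace, can be sketched as follows. This is a minimal illustration written against std::string, not the code from commit e487a184bc6f31f3f0e6ab538cb3406ec282a6d5, and the function name normalizeLineEndings is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from KTextEditor: converts "\r\n" and
// lone '\r' to '\n' and leaves every other character untouched.
std::string normalizeLineEndings(const std::string &text)
{
    std::string result;
    result.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            // Swallow the '\n' of a "\r\n" pair so it yields one newline.
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            result.push_back('\n');
        } else {
            result.push_back(text[i]);
        }
    }
    return result;
}

// Usage checks: trailing spaces survive, only the terminators change.
void exampleNormalize()
{
    assert(normalizeLineEndings("ls -l \r\npwd\r\n") == "ls -l \npwd\n");
    assert(normalizeLineEndings("old\rstyle") == "old\nstyle");
}
```

Because only carriage returns are rewritten, spaces that are really "needed" at the end of a line (the concern raised in comment 8) are pasted unchanged.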
Comment 18 Nate Graham 2019-08-25 15:31:06 UTC
You fixed it! Yay!!!!!
Comment 19 Christoph Cullmann 2019-08-25 15:32:39 UTC
No problem, should have thought of Windows line endings a long time ago ;=)
Comment 20 Dominik Haumann 2019-10-04 21:37:11 UTC
*** Bug 412613 has been marked as a duplicate of this bug. ***
Comment 21 Marcus Seyfarth 2019-12-08 19:30:52 UTC
Created attachment 124390
Screenshot

Screenshot with the bug
Comment 22 Marcus Seyfarth 2019-12-08 19:32:06 UTC
Unfortunately I can still reproduce this bug with KDE Frameworks 5.64 and Chromium on openSUSE Tumbleweed from 4th December 2019.
Comment 23 Marcus Seyfarth 2019-12-08 19:40:43 UTC
Created attachment 124391
Screenshot GMail

Here is a screenshot of the GMail source I copied the text from.
Comment 24 Marcus Seyfarth 2019-12-08 19:51:01 UTC
Just tried out Firefox (70.0) with the same result; the bug is also reproducible with that browser (on the same openSUSE Tumbleweed build as above).
Comment 25 Nate Graham 2019-12-09 16:54:29 UTC
That's a different issue. Please file a new bug report to track it. Thanks!
Comment 26 Marcus Seyfarth 2019-12-12 15:07:13 UTC
Just FYI, I've filed my bug here: https://bugs.kde.org/show_bug.cgi?id=414991