SUMMARY
When a selection of text spanning multiple lines is copied, the newlines are included. This puts newlines in the middle of sentences, which is undesirable when copying text from a PDF into a new document.

STEPS TO REPRODUCE
1. Obtain a PDF containing a paragraph of text, such as this one: https://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf.
2. Copy an entire paragraph of text, or a selection within the paragraph spanning multiple lines.
3. Paste the selection into a new document or text editor.

OBSERVED RESULT
Newlines used to break the content are preserved:

Adobe® Portable Document Format (PDF) is a universal file format that preserves all
of the fonts, formatting, colours and graphics of any source document, regardless of
the application and platform used to create it.

EXPECTED RESULT
Newlines used to break the content are not preserved:

Adobe® Portable Document Format (PDF) is a universal file format that preserves all of the fonts, formatting, colours and graphics of any source document, regardless of the application and platform used to create it.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Arch Linux
KDE Plasma Version: 5.23.3
KDE Frameworks Version: 5.88.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
The pdf.js PDF viewer elides the newlines as I want, but butchers the spacing in seemingly unrelated ways:

Adobe® Portable Document Format (PDF) is a universal file format that preserves allof the fonts, formatting, colours and graphics of any source document, regardless ofthe application and platform used to create it.

Bug #359242 also discusses unwanted newlines in the clipboard, but this bug is about excluding newlines that *are* within the text selection.
> Newlines used to break the content are preserved:

This is not exactly what happens. The Okular user interface does not know about newlines or paragraphs; it only knows the positions of individual letters. If a letter is below the previous one, it inserts a newline into the selection.

Besides that, I think the behaviour should not change, at least not for PDF. If newlines are not copied, the selection will still contain the hyphens from words that were split across lines. Like this:

Because every-thing is on one line, it will be diffi-cult to remove the hyphens manual-ly afterwards. ;)
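To make the point in the reply above concrete, here is a minimal Python sketch of a position-only heuristic. It is not Okular's actual code: `Glyph`, `layout`, `selection_to_text` and the `line_sep` parameter are hypothetical names made up for this illustration, and the grid layout is a simplifying assumption.

```python
from typing import Iterable, List, NamedTuple


class Glyph(NamedTuple):
    """Hypothetical stand-in for one positioned letter on the page."""
    char: str
    x: float  # horizontal position
    y: float  # vertical position, growing downwards


def layout(lines: Iterable[str]) -> List[Glyph]:
    """Place each character on a simple grid: one row per visual line."""
    return [
        Glyph(ch, float(col), float(row))
        for row, line in enumerate(lines)
        for col, ch in enumerate(line)
    ]


def selection_to_text(glyphs: Iterable[Glyph], line_sep: str = "\n") -> str:
    """Build clipboard text from glyph positions alone."""
    out: List[str] = []
    prev = None
    for g in glyphs:
        if prev is not None and g.y > prev.y:
            # This glyph sits below the previous one, so a new visual line
            # starts here. Nothing says whether the sentence also ends.
            out.append(line_sep)
        out.append(g.char)
        prev = g
    return "".join(out)


page = layout(["it will be diffi-", "cult to remove all", "of the hyphens"])
print(selection_to_text(page))                # line breaks kept
print(selection_to_text(page, line_sep=" "))  # "diffi- cult" remains
print(selection_to_text(page, line_sep=""))   # "diffi-cult", but also "allof"
```

With only positions to go on, no single separator gives clean output: a space leaves "diffi- cult", and an empty string produces "diffi-cult" while gluing "all" and "of" together. Removing the hyphen as well would need a rule for telling a line-break hyphen from a genuine one, which positions alone do not provide.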
Thanks for the response! :) I did have a feeling that there is more going on here than meets the eye. This would still be convenient to have, but I understand if it's not worth the time to fix those edge cases.