Bug 450650 - URL encoded chars in feed-entry-link-href become invalid - replaced by question marks
Status: RESOLVED FIXED
Alias: None
Product: akregator
Classification: Applications
Component: feed parser
Version: 5.15.3
Platform: Debian stable Linux
Importance: NOR major
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-21 13:13 UTC by Delian Krustev
Modified: 2022-02-22 07:11 UTC

See Also:
Latest Commit:
Version Fixed In: 5.19.3


Description Delian Krustev 2022-02-21 13:13:58 UTC
SUMMARY

The Atom feed can be seen here:

  https://blog.krustev.net/atom.xml

Here's a feed entry:

<entry>
<id>https://blog.krustev.net/blog/%D0%97%D0%B0+%D0%B7%D0%B0%D0%BF%D0%B5%D1%82%D0%B0%D0%B9%D0%BA%D0%B8%D1%82%D0%B5</id>
<title>За запетайките</title>
<published>2022-02-15T16:44:29+02:00</published>
<updated>2022-02-15T16:44:29+02:00</updated>
<link rel="alternate" type="text/html" href="https://blog.krustev.net/blog/%D0%97%D0%B0+%D0%B7%D0%B0%D0%BF%D0%B5%D1%82%D0%B0%D0%B9%D0%BA%D0%B8%D1%82%D0%B5"/>
<summary>Нужно ли е да има толкова много правила за поставяне на запетайки ?</summary>
</entry>


The URLs of the blog are user controlled and can contain all sorts of characters:
non-ASCII letters, question marks, double quotes, ">", "<", etc.
Thus the links have to be URL encoded (percent-encoded) in order to be valid.
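
For illustration, a minimal Python sketch of that encoding step. How the blog actually builds its links is not stated in this report; `encode_slug` and `BASE` are hypothetical names, and `urllib.parse.quote_plus` is simply a standard-library call that happens to produce the same href as the feed entry above.

```python
from urllib.parse import quote_plus, unquote

# Hypothetical base URL, taken from the feed entry quoted above.
BASE = "https://blog.krustev.net/blog/"


def encode_slug(title: str) -> str:
    """Percent-encode a title as UTF-8 bytes, writing spaces as '+'."""
    return quote_plus(title, encoding="utf-8")


slug = encode_slug("За запетайките")
print(BASE + slug)    # the encoded form, safe to put in an href
print(unquote(slug))  # the human-readable form, suitable for a status bar
```

Characters such as double quotes, "<", ">" and "?" are escaped the same way, so the resulting link stays valid whatever the title contains.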

STEPS TO REPRODUCE
1. Create an article with a URL which needs URL encoding.
2. Open the article in Akregator.
3. Right-click on "Complete story" and choose "Copy the link address".

OBSERVED RESULT
When copied the following text gets into the clipboard:

https://blog.krustev.net/blog/??+???????????

The link is displayed the same way in the status bar on hover, with every percent-encoded character shown as a question mark.

EXPECTED RESULT
The link should be displayed in URL-decoded form in the status bar:

  https://blog.krustev.net/blog/За+запетайките

And the encoded link should be "copied" or passed to the browser when clicked.

Various other feed readers handle these links as expected, e.g. gnome-feeds and liferea.
Web-based ones do too: feedly.com, feedburner.com, feeder.co


SOFTWARE/OS VERSIONS
Up-to-date versions from Debian Bullseye:

KDE Plasma Version: kde-plasma-desktop version 5:111
KDE Frameworks Version: 5.78.0
Qt Version: 5.15.2
Comment 1 Laurent Montel 2022-02-22 06:49:35 UTC
I confirm it.
The URL is not converted correctly when we extract the info from the feed.
Comment 2 Laurent Montel 2022-02-22 07:09:43 UTC
Git commit 4c061653e32af8944a78fdd1c3f3edda0b106b44 by Laurent Montel.
Committed on 22/02/2022 at 07:08.
Pushed by mlaurent into branch 'release/21.12'.

Fix bug 450650:  URL encoded chars in feed-entry-link-href become invalid - replaced by question marks

Fix store encoding url

FIXED-IN: 5.19.3

M  +2    -2    plugins/mk4storage/feedstoragemk4impl.cpp

https://invent.kde.org/pim/akregator/commit/4c061653e32af8944a78fdd1c3f3edda0b106b44
Comment 3 Laurent Montel 2022-02-22 07:11:28 UTC
Hi,
The bug was in how we stored the URL: we stored it as Latin-1, but it really needs to be stored as UTF-8.
You need to remove the feed and reimport it; otherwise all the existing info will still be stored as Latin-1.

Regards
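
The failure mode described in comment 3 can be reproduced outside Akregator. The following is a rough Python sketch, not the Akregator code: `store_and_load` is a hypothetical helper, and it assumes the link was held in decoded form at the point where it was written to storage. Characters that Latin-1 cannot represent are replaced by "?", which matches the text seen in the clipboard.

```python
from urllib.parse import unquote

# The href from the feed entry in the description.
ENCODED = (
    "https://blog.krustev.net/blog/"
    "%D0%97%D0%B0+%D0%B7%D0%B0%D0%BF%D0%B5%D1%82%D0%B0%D0%B9%D0%BA%D0%B8%D1%82%D0%B5"
)


def store_and_load(url: str, codec: str) -> str:
    """Hypothetical round trip: decode the URL, store it as bytes, read it back."""
    decoded = unquote(url)  # assumption: the URL is decoded before storage
    stored = decoded.encode(codec, errors="replace")  # unrepresentable -> b"?"
    return stored.decode(codec)


print(store_and_load(ENCODED, "latin-1"))  # Cyrillic letters are lost
print(store_and_load(ENCODED, "utf-8"))    # Cyrillic letters survive
```

Because the replacement happens at write time, the original characters cannot be recovered from the stored bytes, which is why the feed has to be removed and reimported after upgrading.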