Bug 212409

Summary: URL parsing in text adds extraneous characters, giving invalid URL
Product: [Applications] kmail
Reporter: Andrey Borzenkov <arvidjaar>
Component: general
Assignee: kdepim bugs <kdepim-bugs>
Status: RESOLVED INTENTIONAL
Severity: normal
CC: kde, kollix
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: Mandriva RPMs
OS: Linux
Latest Commit:
Version Fixed In:

Description Andrey Borzenkov 2009-10-30 16:08:13 UTC
Version:            (using KDE 4.3.2)
OS:                Linux
Installed from:    Mandriva RPMs

In the following example (copied from real message):

tracking system at https://bugs.freedesktop.org, "poppler" product.

When I click on the URL, an attempt is made to open "https://bugs.freedesktop.org," (including the trailing comma), which of course fails. When I paste the same line into Konsole, it correctly excludes the trailing comma from the URL.

Hmm ... I was sure KDE was using a single backend to parse URLs everywhere :)
Comment 1 Nicolas L. 2009-10-31 07:15:19 UTC
Confirming this issue.
Comment 2 Martin Koller 2009-11-14 14:02:59 UTC
The problem is that a plain text mail has no defined delimiter for a URL.
The parsing was changed in August to solve other bugs (e.g. bug 202445, bug 201900).
Also, "," (comma) is a valid character in a URL, so stripping it would simply be wrong.

see also http://tools.ietf.org/html/rfc3986#appendix-A and appendix-C
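
To make the point about the comma concrete, here is a small sketch of the character classes listed in RFC 3986, Appendix A. It is only an illustration, not KMail's parser; the helper name allowed_in_uri is made up for this example.

```python
import string

# Character classes as listed in RFC 3986, Appendix A.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")  # note: the comma is a sub-delim


def allowed_in_uri(ch):
    """Return True if ch may appear literally somewhere in a URI.

    Hypothetical helper for illustration; '%' is accepted because it
    introduces a percent-encoded octet.
    """
    return ch in UNRESERVED or ch in GEN_DELIMS or ch in SUB_DELIMS or ch == "%"


print(allowed_in_uri(","))  # comma: allowed
print(allowed_in_uri(" "))  # space: not allowed
```

So a parser that follows the grammar alone cannot tell whether a trailing comma belongs to the URL or to the surrounding sentence.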
Comment 3 Andrey Borzenkov 2009-11-15 10:00:52 UTC
Well ... there is also such a thing as common sense.

How likely is it to get a URL that ends with a comma? I'd say that if you see a URL that ends with a comma followed by a space in message text, in 99% of cases this comma is not part of the URL. In the bugs you quoted, the punctuation characters were followed by non-space characters; that is slightly different.

Having some heuristic that covers the common cases, plus a check box to turn it off for purists, would be nice.
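
A rough sketch of what such a heuristic could look like, assuming a deliberately naive URL matcher (scheme, "://", then everything up to the next whitespace). The names find_urls, URL_RE, TRAILING_PUNCT and the strip_trailing_punct switch are invented for this example and do not come from KMail.

```python
import re

# Naive matcher (an assumption for this sketch): the match always ends at
# whitespace or at the end of the text.
URL_RE = re.compile(r"https?://\S+")

# Punctuation that is far more likely to end a sentence than a URL.
TRAILING_PUNCT = ".,;:!?"


def find_urls(text, strip_trailing_punct=True):
    """Return the URLs found in text.

    With strip_trailing_punct=True, punctuation at the very end of a match
    (i.e. directly before whitespace or end of text) is dropped.
    Punctuation followed by a non-space character is always kept.
    """
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group(0)
        if strip_trailing_punct:
            url = url.rstrip(TRAILING_PUNCT)
        urls.append(url)
    return urls


line = 'tracking system at https://bugs.freedesktop.org, "poppler" product.'
print(find_urls(line))                              # heuristic on
print(find_urls(line, strip_trailing_punct=False))  # "purist" mode
print(find_urls("see http://example.com/a,b for details"))
```

With the heuristic on, the line from the report yields https://bugs.freedesktop.org without the comma, while a comma in the middle of a URL (followed by a non-space character) is left alone. Turning the switch off gives the strict behaviour.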