SUMMARY
The URL parsing has recently been re-implemented due to bug 452978. However, parsing is still broken when URLs are put in single quotes. Highlighting and copying the first two URLs works as expected, but the third one includes the trailing ' character, which is very annoying.

1. echo http://localhost
2. echo "http://localhost"
3. echo 'http://localhost'

It's very common to quote URLs when passing them as CLI arguments, especially with single quotes to avoid potential string substitutions. Quoting is also required because of shell-specific syntax: in fish, for example, the question mark is interpreted as a wildcard, so URLs with query strings always have to be quoted with either double or single quotes.

SOFTWARE/OS VERSIONS
$ lsb_release -d
Description: Arch Linux
$ pacman -Q konsole
konsole 22.08.1-1
Would it make sense to add a word boundary to the URL regex? The email regex already does exactly that:
https://invent.kde.org/utilities/konsole/-/blob/b733bd03fd8ec49257f0564552a0565d189b8ec6/src/filterHotSpots/UrlFilter.cpp#L82

If that doesn't make sense for URLs because of the "arbitrary" path/query string/hash contents, would it instead make sense to check the character before the URL and add a backreference to that character as a suffix? For ' and " (and maybe ` ?) this would be simple. If you want to support parentheses and brackets (angled ones don't seem to be supported at all), then the regex would be a bit more complex, with if-conditions for the backreferences.
https://invent.kde.org/utilities/konsole/-/blob/b733bd03fd8ec49257f0564552a0565d189b8ec6/src/filterHotSpots/UrlFilter.cpp#L46

Or could the regex be simplified by matching the character before the URL in one capture group and the URL itself in another, and then checking the last character of the URL capture group in the application logic? That way you could deal with the surrounding characters without bloating the regex, and it would allow handling all kinds of surrounding characters for URL matches.

Either way, always having to remove the quotation mark from a URL copied from Konsole has become really tedious and annoying, so I'd really appreciate it if this could be fixed soon. Thanks.
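To make the last idea concrete, here is a rough sketch in Python rather than Konsole's C++. It is not the actual UrlFilter code: the pattern, the `CLOSERS` table and the `find_urls` helper are all made up for illustration, and the URL pattern is far simpler than the real one. It captures the character before the URL, then drops the URL's last character only if it is the matching closing delimiter.

```python
import re

# Hypothetical, simplified pattern; Konsole's real URL regex is more involved.
# Group 1: the character before the URL (empty at the start of the line).
# Group 2: the URL candidate, greedy up to the next whitespace.
URL_WITH_PREFIX = re.compile(r"(^|[\s'\"`(\[<])((?:https?|ftp)://\S+)")

# Opening delimiter -> closing delimiter expected at the end of the URL.
CLOSERS = {"'": "'", '"': '"', "`": "`", "(": ")", "[": "]", "<": ">"}


def find_urls(line):
    """Return the URLs found in line, minus a closing delimiter that
    pairs with the character right before the URL."""
    urls = []
    for match in URL_WITH_PREFIX.finditer(line):
        opener, url = match.group(1), match.group(2)
        closer = CLOSERS.get(opener)
        # Only strip when the URL was actually opened by a delimiter.
        if closer and url.endswith(closer):
            url = url[:-1]
        urls.append(url)
    return urls


for command in ("echo http://localhost",
                'echo "http://localhost"',
                "echo 'http://localhost'"):
    print(find_urls(command))
```

With this approach an apostrophe inside the URL is left alone, since only the final character is checked. The sketch does not handle a closing delimiter followed directly by punctuation (for example a quoted URL at the end of a sentence); that would need extra logic.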
A possibly relevant merge request was started @ https://invent.kde.org/utilities/konsole/-/merge_requests/765
Git commit f063ade55b491ef7d6fe6cb87d81adaed00ca041 by Kurt Hindenburg, on behalf of Luis Javier Merino Morán.
Committed on 08/11/2022 at 18:52.
Pushed by hindenburg into branch 'master'.

url filter: remove ending apostrophe

When URLs were inside single quotes, we would include the ending quote in the parsed URL. To avoid that, remove a final apostrophe in a URL when creating the hotspot.

Test: 'https://en.wikipedia.org/wiki/Earth's_rotation'

M +17 -1 src/filterHotSpots/UrlFilter.cpp

https://invent.kde.org/utilities/konsole/commit/f063ade55b491ef7d6fe6cb87d81adaed00ca041
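For readers who don't want to open the diff, the behaviour described in the commit message can be illustrated roughly as follows. This is a Python sketch of the described behaviour only, not the committed C++ code; `strip_final_apostrophe` is a hypothetical name.

```python
def strip_final_apostrophe(url):
    """Drop one trailing apostrophe; apostrophes inside the URL are kept."""
    return url[:-1] if url.endswith("'") else url


# The matched text for the commit's test string, i.e. the URL plus the
# closing quote that used to be included in the hotspot.
matched = "https://en.wikipedia.org/wiki/Earth's_rotation'"
print(strip_final_apostrophe(matched))
```

One consequence of a rule like this is that a URL which really does end in an apostrophe would lose it.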