It seems like the URL regex has been changed recently, e. g. markdown URLs are broken again, so "[foo](https://bar/)" now includes the ending ), and "`https://foo.bar`" completely fail to match. Here's how markdown URLs were fixed last time fwiw: https://invent.kde.org/utilities/konsole/-/commit/1ee8c64b5189d9e32ae9660308b2edc0f35ea6f0
A possibly relevant merge request was started @ https://invent.kde.org/utilities/konsole/-/merge_requests/682
`https://foo.bar` should be fixed with that MR. The issue with [foo](https://bar/) is that ")" is a valid character than can exist in a URL, I don't see a simple way to match markup/down urls of that form, without breaking other stuff.
(In reply to Ahmad Samir from comment #2) > `https://foo.bar` should be fixed with that MR. > > The issue with [foo](https://bar/) is that ")" is a valid character than can > exist in a URL, I don't see a simple way to match markup/down urls of that > form, without breaking other stuff. What kind of other URLs would break? URLs in markdown are very common, so unless it breaks a lot of URLs to handle them I'd say it's worth it (I don't think there used to be any issues). Another URL that don't work anymore are foo=https://bar.com fwiw.
Created attachment 149643 [details] dumb fix Simplest fix I could think of. It passes the tests, but it feels too simple so I'm assuming it breaks in some weird ways.
(In reply to Martin Sandsmark from comment #3) > (In reply to Ahmad Samir from comment #2) > > `https://foo.bar` should be fixed with that MR. > > > > The issue with [foo](https://bar/) is that ")" is a valid character than can > > exist in a URL, I don't see a simple way to match markup/down urls of that > > form, without breaking other stuff. > > What kind of other URLs would break? Bug 452978 (and it's duplicates).
A possibly relevant merge request was started @ https://invent.kde.org/utilities/konsole/-/merge_requests/683
Git commit 651bcfb733d1cb8a5b05d095c7cbbfebbf889282 by Kurt Hindenburg, on behalf of Ahmad Samir. Committed on 16/06/2022 at 16:23. Pushed by hindenburg into branch 'master'. Match urls between grave ` FIXED-IN: 22.08 M +3 -0 src/autotests/HotSpotFilterTest.cpp M +1 -1 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/651bcfb733d1cb8a5b05d095c7cbbfebbf889282
Git commit 48060e31e23e974eba134bea86c976285085a73b by Kurt Hindenburg, on behalf of Luis Javier Merino Morán. Committed on 16/06/2022 at 16:36. Pushed by hindenburg into branch 'master'. Only recognize URIs with balanced parentheses To prevent URIs inside parentheses from getting extended to the closing parenthesis, only recognize URIs with balanced parentheses in regname, path, query and/or fragment. We still allow unbalanced parenthesis in userInfo, since the postfix @ should prevent most ambiguous situations, and the parenthesis can be part of a password. M +13 -0 src/autotests/HotSpotFilterTest.cpp M +13 -8 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/48060e31e23e974eba134bea86c976285085a73b
Another annoying thing which recently changed is that URLs http://127.0.0.1:4000/ used to be properly parsed but now are interpreted as http://127.0.0.1, i.e. the port is missing. These kind of URLs typically appear in static site generators, kubernetes, etc. See also ticket https://bugs.kde.org/show_bug.cgi?id=430938 which proposes regexpes to fix this on a personal level.
(In reply to SvenK from comment #9) > Another annoying thing which recently changed is that URLs > http://127.0.0.1:4000/ used to be properly parsed but now are interpreted > as http://127.0.0.1, i.e. the port is missing. These kind of URLs typically > appear in static site generators, kubernetes, etc. > That is already fixed in master.
Git commit e925acc1ca60ef1116fa84cc35328d5accad23c5 by Ahmad Samir, on behalf of Luis Javier Merino Morán. Committed on 01/07/2022 at 15:33. Pushed by ahmadsamir into branch 'release/22.04'. Only recognize URIs with balanced parentheses To prevent URIs inside parentheses from getting extended to the closing parenthesis, only recognize URIs with balanced parentheses in regname, path, query and/or fragment. We still allow unbalanced parenthesis in userInfo, since the postfix @ should prevent most ambiguous situations, and the parenthesis can be part of a password. M +13 -0 src/autotests/HotSpotFilterTest.cpp M +13 -8 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/e925acc1ca60ef1116fa84cc35328d5accad23c5
Git commit 6e7cf530a8a9ef2676a1b4faf9a06af9f955c5be by Ahmad Samir. Committed on 01/07/2022 at 15:33. Pushed by ahmadsamir into branch 'release/22.04'. Match urls between grave ` FIXED-IN: 22.08 M +3 -0 src/autotests/HotSpotFilterTest.cpp M +1 -1 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/6e7cf530a8a9ef2676a1b4faf9a06af9f955c5be
A possibly relevant merge request was started @ https://invent.kde.org/utilities/konsole/-/merge_requests/780
Git commit 2e84af9400409021fe1ad64e7e56f2c5a0e6f82f by Kurt Hindenburg, on behalf of Luis Javier Merino Morán. Committed on 10/12/2022 at 08:11. Pushed by hindenburg into branch 'master'. url filter: start regex with word boundary This allows recognizing more kinds of enclosed URIs, e.g: - angle brackets: <https://example.com> - lenticular brackets: 【https://example.com】 - braces: {https://example.com} - guillemets: «https://example.com» - single guillemets ‹https://example.com› - ... and also URIs right after an equals sign, e.g: foo=https://example.com. Related: bug 462511 M +6 -0 src/autotests/HotSpotFilterTest.cpp M +1 -1 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/2e84af9400409021fe1ad64e7e56f2c5a0e6f82f
Git commit a48e4deab23de1283f89aea49bf8037412daede1 by Kurt Hindenburg, on behalf of Luis Javier Merino Morán. Committed on 30/12/2022 at 16:49. Pushed by hindenburg into branch 'release/22.12'. url filter: start regex with word boundary This allows recognizing more kinds of enclosed URIs, e.g: - angle brackets: <https://example.com> - lenticular brackets: 【https://example.com】 - braces: {https://example.com} - guillemets: «https://example.com» - single guillemets ‹https://example.com› - ... and also URIs right after an equals sign, e.g: foo=https://example.com. Related: bug 462511 M +6 -0 src/autotests/HotSpotFilterTest.cpp M +1 -1 src/filterHotSpots/UrlFilter.cpp https://invent.kde.org/utilities/konsole/commit/a48e4deab23de1283f89aea49bf8037412daede1