Summary: | Parse URLs with Konversation parser | ||
---|---|---|---|
Product: | [Unmaintained] telepathy | Reporter: | Jonathan Thomas <echidnaman> |
Component: | text-ui-message-filters | Assignee: | Telepathy Bugs <kde-telepathy-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | task | CC: | kde, kde, mklapetek, rohan |
Priority: | NOR | ||
Version: | git-latest | ||
Target Milestone: | Future | ||
Platform: | Ubuntu | ||
OS: | Linux | ||
Latest Commit: | | Version Fixed In: | |
Sentry Crash Report: | | ||
Description
Jonathan Thomas
2012-08-24 15:37:02 UTC
Can you please test with 0.5? We changed how the links are parsed, though I guess the code was just moved around. Ideally we should be using the Konversation parsing stuff, which is in common-internals, so turning into a task.

Tested it on 0.5 right now, still shows up as text and not as a link. So still needs fixing.

If someone could explain this a bit to me, I could try and fix it.

There's a lib/url-filter.cpp in text-ui, which has its own parsing stuff. Some time ago we adopted the Konversation parsing code, which was then moved to common-internals/KTp/text-parser.cpp. Using it is something like:

    KTp::TextUrlData urls = KTp::TextParser::instance()->extractUrlData(text);

Then look at the TextUrlData structure and just take the corresponding links.

From a bit of investigating, this still won't work, since Konversation's regex can't parse URLs like reddit.com or xkcd.com. I've tested this on Konversation and after porting text-ui to use the Konversation regex matcher.

They're not URLs. If we made our parser match them, we'll get a bug the next day from someone saying things rendered as links which are not.such as those last two words there, where I deliberately missed a space after the full stop - or every dbus path we type, or any number with a decimal point. €0.10 should not be a link, but 127.0.0.1 maybe should be. Maybe.

Before anyone can start fixing this we need a big list of test cases of what does and doesn't render as a link on GTalk, and to decide what our intended behaviour should actually be, otherwise you'll just go coding round and round in circles. It could be that Google actually looks up what is and isn't a domain, maybe it knows top-level domains, or maybe it just matches simple words with dots in them.

Also, if the URL catching is good enough for Konversation (which is a mature project), is it not good enough for us?

Git commit 0ffe76e12bb7cc6473346d7f0acce69ebd74be2a by Rohan Garg.
Committed on 29/08/2012 at 12:40.
Pushed by garg into branch 'kde-telepathy-0.5'.
Use KTp::TextUrlData to parse URL's instead of custom parsing

REVIEW: 106261

M +2 -0 lib/CMakeLists.txt
M +12 -47 lib/url-filter.cpp

http://commits.kde.org/telepathy-text-ui/0ffe76e12bb7cc6473346d7f0acce69ebd74be2a

*** Bug 299329 has been marked as a duplicate of this bug. ***
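
For anyone picking up the "big list of test cases" idea from the discussion above, here is a rough sketch of what such a table could look like, paired with one of the candidate rules floated in the thread ("maybe it knows top level domains"). Everything in it is an assumption for illustration: the names `LinkCase`, `linkCases`, `looksLikeBareDomain` and `countMismatches` are hypothetical, the three-entry TLD set is a stand-in, and the expected values are guesses rather than observed GTalk or Konversation behaviour (the thread only says "maybe" for 127.0.0.1). It uses the C++ standard library only and does not touch KTp::TextParser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical expectation table. The thread never settles the intended
// behaviour, so every expectLink value below is an assumption.
struct LinkCase {
    std::string token;
    bool expectLink;
};

std::vector<LinkCase> linkCases() {
    return {
        {"reddit.com", true},
        {"xkcd.com", true},
        {"not.such", false},              // missing space after a full stop
        {"org.freedesktop.DBus", false},  // D-Bus style name
        {"0.10", false},                  // decimal number, as in a price
        {"127.0.0.1", false},             // "maybe" in the thread; assumed false here
    };
}

// Splits "a.b.c" into {"a", "b", "c"}. Empty labels are kept so that inputs
// such as "a..com" or ".com" can be rejected by the caller.
std::vector<std::string> splitLabels(const std::string& token) {
    std::vector<std::string> labels;
    std::size_t start = 0;
    while (true) {
        const std::size_t dot = token.find('.', start);
        if (dot == std::string::npos) {
            labels.push_back(token.substr(start));
            return labels;
        }
        labels.push_back(token.substr(start, dot - start));
        start = dot + 1;
    }
}

// True if the label is non-empty, consists of letters, digits or '-'
// (as classified by std::isalnum in the current C locale), and neither
// starts nor ends with '-'.
bool isHostLabel(const std::string& label) {
    if (label.empty() || label.front() == '-' || label.back() == '-') {
        return false;
    }
    for (unsigned char c : label) {
        if (!std::isalnum(c) && c != '-') {
            return false;
        }
    }
    return true;
}

// Candidate rule: at least two valid labels, and the last one (lowercased)
// must be in a small known-TLD set. The set is a placeholder, not a real list.
bool looksLikeBareDomain(const std::string& token) {
    static const std::unordered_set<std::string> knownTlds = {"com", "org", "net"};
    const std::vector<std::string> labels = splitLabels(token);
    if (labels.size() < 2) {
        return false;
    }
    for (const std::string& label : labels) {
        if (!isHostLabel(label)) {
            return false;
        }
    }
    std::string tld = labels.back();
    for (char& c : tld) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return knownTlds.count(tld) > 0;
}

// Runs the rule over the table and counts rows where it disagrees.
std::size_t countMismatches() {
    std::size_t mismatches = 0;
    for (const LinkCase& c : linkCases()) {
        if (looksLikeBareDomain(c.token) != c.expectLink) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling `countMismatches()` from a unit test gives a quick check of any candidate rule against the agreed table. The point is less the rule itself than having the expectations written down first: once the list is settled, the TLD check can be swapped for the Konversation matcher or anything else, and the same table tells you whether behaviour regressed.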