Bug 305725

Summary:	Parse URLs with Konversation parser
Product:	[Unmaintained] telepathy	Reporter:	Jonathan Thomas <echidnaman>
Component:	text-ui-message-filters	Assignee:	Telepathy Bugs <kde-telepathy-bugs-null>
Status:	RESOLVED FIXED
Severity:	task	CC:	kde, kde, mklapetek, rohan
Priority:	NOR
Version First Reported In:	git-latest
Target Milestone:	Future
Platform:	Ubuntu
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:

Description Jonathan Thomas 2012-08-24 15:37:02 UTC

I typed "xkcd.com/1077" into the chat. With the GTalk web client in GMail, that got turned into a link to the relevant xkcd. But in the telepathy chat it didn't turn into a link.

Reproducible: Always

Comment 1 Martin Klapetek 2012-08-27 08:51:13 UTC

Can you please test with 0.5? We changed how the links are parsed, though I guess the code was just moved around.

Ideally we should be using the Konversation parsing stuff, which is in common-internals, so turning into a task.

Comment 2 Rohan Garg 2012-08-27 08:57:33 UTC

Tested it on 0.5 right now, still shows up as text and not as a link. So still needs fixing. If someone could explain this a bit to me, I could try and fix it.

Comment 3 Martin Klapetek 2012-08-27 09:01:43 UTC

There's a lib/url-filter.cpp in text-ui, which has its own parsing stuff. Some time ago we adopted Konversation parsing code, which was then moved to common-internals/KTp/text-parser.cpp.

Using it is something like:

KTp::TextUrlData urls = KTp::TextParser::instance()->extractUrlData(text);

Then look at the TextUrlData structure and just take the corresponding links.
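
To make that last step concrete, here is a minimal sketch of taking the links out of such a structure and wrapping them in anchors. It doesn't use the real KTp types: UrlDataSketch and linkify are hypothetical names, and the layout (offset/length ranges plus one normalized URL per range) is an assumption about what TextUrlData carries, so check the header next to text-parser.cpp before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for KTp::TextUrlData. This layout is an assumption,
// not the real structure.
struct UrlDataSketch {
    // (offset, length) of each URL inside the message text,
    // assumed sorted by offset and non-overlapping.
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    // Normalized URL for each range (e.g. with a scheme prepended).
    std::vector<std::string> fixedUrls;
};

// Wrap every reported range in an anchor tag. The ranges are walked back to
// front so that earlier offsets stay valid while the string grows.
std::string linkify(std::string text, const UrlDataSketch &urls)
{
    assert(urls.ranges.size() == urls.fixedUrls.size());
    for (std::size_t i = urls.ranges.size(); i-- > 0;) {
        const std::size_t offset = urls.ranges[i].first;
        const std::size_t length = urls.ranges[i].second;
        assert(offset + length <= text.size());
        const std::string shown = text.substr(offset, length);
        const std::string link =
            "<a href=\"" + urls.fixedUrls[i] + "\">" + shown + "</a>";
        text.replace(offset, length, link);
    }
    return text;
}
```

HTML escaping of the surrounding text is left out here; a real filter would need it.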

Comment 4 Rohan Garg 2012-08-28 23:41:29 UTC

From a bit of investigating, this still won't work, since Konversation's regex can't parse URLs like reddit.com or xkcd.com.

I've tested this both in Konversation itself and after porting text-ui to use the Konversation regex matcher.

Comment 5 David Edmundson 2012-08-28 23:52:49 UTC

They're not URLs.

If we make our parser match them, we'll get a bug the next day from someone saying things rendered as links which are not.such as those last two words there, where I deliberately missed a space after the full stop - or every D-Bus path we type, or any number with a decimal point. €0.10 should not be a link, but 127.0.0.1 maybe should be. Maybe.

Before anyone can start fixing this we need a big list of test cases of what does and doesn't render as a link on GTalk, and decide what our intended behaviour should actually be, otherwise you'll just go coding round and round in circles.

It could be that Google actually looks up what is and isn't a domain; maybe it knows the top-level domains, or maybe it just matches simple words with dots in them.
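
As a starting point for that list, here is a rough sketch of a table of cases plus one of the guesses above (a word with dots whose last label is a known top-level domain). Everything in it is hypothetical: the three-entry TLD list, the names, and the expected results, which are only one possible answer to what the intended behaviour should be. It is not Konversation's regex and not necessarily what GTalk does.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, deliberately tiny whitelist; a real one would need the full TLD list.
static const std::vector<std::string> kKnownTlds = {"com", "org", "net"};

// Guess whether a whitespace-delimited token without a scheme looks like a
// host name: strip any path, then require a dot followed by a whitelisted label.
bool looksLikeBareDomain(const std::string &token)
{
    const std::string host = token.substr(0, token.find('/'));
    const std::size_t lastDot = host.rfind('.');
    if (lastDot == std::string::npos || lastDot == 0 || lastDot + 1 == host.size()) {
        return false;
    }
    std::string tld = host.substr(lastDot + 1);
    std::transform(tld.begin(), tld.end(), tld.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return std::find(kKnownTlds.begin(), kKnownTlds.end(), tld) != kKnownTlds.end();
}

// One possible table of intended behaviour, using the examples from this report.
struct Case {
    const char *input;
    bool expectLink;
};

static const Case kCases[] = {
    {"xkcd.com/1077", true},
    {"reddit.com", true},
    {"not.such", false},                    // missed space after a full stop
    {"0.10", false},                        // number with a decimal point
    {"org.freedesktop.Telepathy", false},   // D-Bus style name
    {"/org/freedesktop/Telepathy", false},  // D-Bus style path
    {"127.0.0.1", false},                   // the "maybe" case; undecided
};

// Check the heuristic against every row of the table.
void checkCases()
{
    for (const Case &c : kCases) {
        assert(looksLikeBareDomain(c.input) == c.expectLink);
    }
}
```

With this guess 127.0.0.1 comes out as plain text, which is exactly the kind of decision that needs settling before anyone touches the parser.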

Also, if the URL catching is good enough for Konversation (which is a mature project), is it not good enough for us?

Comment 6 Rohan Garg 2012-08-29 10:42:49 UTC

Git commit 0ffe76e12bb7cc6473346d7f0acce69ebd74be2a by Rohan Garg.
Committed on 29/08/2012 at 12:40.
Pushed by garg into branch 'kde-telepathy-0.5'.

Use KTp::TextUrlData to parse URL's instead of custom parsing

REVIEW: 106261

M  +2    -0    lib/CMakeLists.txt
M  +12   -47   lib/url-filter.cpp

http://commits.kde.org/telepathy-text-ui/0ffe76e12bb7cc6473346d7f0acce69ebd74be2a

Comment 7 Lasath Fernando 2013-03-17 00:11:34 UTC

*** Bug 299329 has been marked as a duplicate of this bug. ***