Summary: | HTML messages have link tags added around URLs inside HTML tags | ||
---|---|---|---|
Product: | [Unmaintained] kopete | Reporter: | Sean Lynch <techniq35> |
Component: | libkopete | Assignee: | Kopete Developers <kopete-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | minor | ||
Priority: | NOR | ||
Version: | 0.8.0 | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
Notice double link
Patch: don't parse links on HTML messages. |
Description
Sean Lynch
2003-10-12 08:54:31 UTC
Created attachment 2754 [details]
Notice double link
Confirmed. I see the same thing. looks like the message is parsed two times. we have to do some work at this level. Please post the debug output when you run kopete from a terminal, it's probably a bug with parsing the ugly and broken AIM html code. I'd also appreciate testing with different clients sending the same url so differences will be easier to spot. aim client: kopete (oscar): [void OscarSocket::parseMessage(const UserInfo&, const QString&, unsigned char, unsigned char)] Got a normal message: <HTML><BODY BGCOLOR="#ffffff"><FONT LANG="0"><A HREF="http://www.google.com">www.google.com</A></FONT></BODY></HTML> trillian: kopete (oscar): [void OscarSocket::parseMessage(const UserInfo&, const QString&, unsigned char, unsigned char)] Got a normal me ssage: <HTML><BODY BGCOLOR="#ffffff"><font face="Arial">www.google.com</BODY></HTML> It seems with the aim client that it makes the link the right way (wrapping text with anchor tags), but also seems that kopete parses any www.something address as a link as well. BTW, the trillian link of www.google.com looked and worked right, but the aim client with www.google.com came as... http://www.google.com">www.google.com (with the http://www.google.com and www.google.com both blue as links, but not the ">. Anymore help I can provide, let me know. Kopete does indeed turn URLs clickable. But it only does this for messages that came across the wire as plainText. It seems to me from the debug output the problem is that the OSCAR plugin is sending the <a href=""></a> to KopeteMessage and yet still marking the message type as plainText, when it should either be stripping it out, or marking the type as HTML. This would be a libkopete issue now. I hacked a little bit on this tonight and found out that even if I pass valid HTML, the double link still gets displayed. :/ Problem is in KopeteMessage. parseLinks is replacing links regardless of if type is HTML or PlainText. If the type is HTML, parseLinks needs to add another check to its regexp to ensure that it is not replacing a link that was already "linked" so to speak. Either that or not run parseLinks on HTML type messages at all. A work around could also be to set the message with the type ParsedHTML but then you wouldn't get emotiocons. Created attachment 4662 [details]
Patch: don't parse links on HTML messages.
I can't reproduce this. Does the attached patch help?
Apply in libkopete/ with:
patch < kopetemessage-nolinksonhtml.diff
According to Stefan the patch is wrong. I tried to create another one based on his input two days ago, but didn't really succeed. Adding 0.8 to the version since AFAICS the fix will be backportable. I have a plan. I'll be back with a result soon... CVS commit by lilachaze: Don't parse links inside HTML tags. CCMAIL: 65909@bugs.kde.org A tests/kopetelinktest.cpp 1.1 [GPL (v2+)] M +34 -12 kopetemessage.cpp 1.140 M +1 -1 kopetemessage.h 1.55 M +1 -0 tests/.cvsignore 1.2.4.2 M +5 -1 tests/Makefile.am 1.1.4.3 --- kdenetwork/kopete/libkopete/kopetemessage.cpp #1.139:1.140 @@ -476,22 +476,44 @@ QString KopeteMessage::parsedBody() cons #endif } - /* - // FIXME: Should be uncommented as soon as ICQ and other plugins are free of stuff like: - // text=i18n("<b>[Email Message:]</b> %1").arg(text); - // where text is not yet parsed - else if( d->format & RichText ) - { - // Richtext should already have <a href ...> around URLs, so no link parsing takes place - return KopeteEmoticons::parseEmoticons(escapedBody()); - } - */ else { - return KopeteEmoticons::parseEmoticons(parseLinks(escapedBody())); + return KopeteEmoticons::parseEmoticons(parseLinks(escapedBody(), d->format)); } } -QString KopeteMessage::parseLinks( const QString &message ) const +QString KopeteMessage::parseLinks( const QString &message, MessageFormat format ) { + if ( format == ParsedHTML ) + return message; + + if ( format & RichText ) + { + // < in HTML *always* means start-of-tag + QStringList entries = QStringList::split( QChar('<'), message, true ); + + QStringList::Iterator it = entries.begin(); + + // first one is different: it doesn't start with an HTML tag. + if ( it != entries.end() ) + { + *it = parseLinks( *it, PlainText ); + ++it; + } + + for ( ; it != entries.end(); ++it ) + { + QString curr = *it; + // > in HTML means start-of-tag if and only if it's the first one after a < + int tagclose = curr.find( QChar('>') ); + // no >: the HTML is broken, but we can cope + if ( tagclose == -1 ) + continue; + QString tag = curr.left( tagclose + 1 ); + QString body = curr.mid( tagclose + 1 ); + *it = tag + parseLinks( body, PlainText ); + } + return entries.join(QString::fromLatin1("<")); + } + QString result = message; --- kdenetwork/kopete/libkopete/kopetemessage.h #1.54:1.55 @@ -342,5 +342,5 @@ private: KopeteMessagePrivate *d; - QString parseLinks( const QString &message ) const; + static QString parseLinks( const QString &message, MessageFormat format ); }; --- kdenetwork/kopete/libkopete/tests/.cvsignore #1.2.4.1:1.2.4.2 @@ -4,2 +4,3 @@ kopetepasswordtest kopetewallettest +kopetelinktest --- kdenetwork/kopete/libkopete/tests/Makefile.am #1.1.4.2:1.1.4.3 @@ -7,5 +7,5 @@ METASOURCES = AUTO -check_PROGRAMS = kopeteemoticontest kopetewallettest kopetepasswordtest +check_PROGRAMS = kopeteemoticontest kopetewallettest kopetepasswordtest kopetelinktest kopeteemoticontest_SOURCES = kopeteemoticontest.cpp @@ -20,2 +20,6 @@ kopetepasswordtest_LDFLAGS = -no-undefined $(all_libraries) $(KDE_RPATH) kopetepasswordtest_LDADD = ../libkopete.la + +kopetelinktest_SOURCES = kopetelinktest.cpp +kopetelinktest_LDFLAGS = -no-undefined $(all_libraries) $(KDE_RPATH) +kopetelinktest_LDADD = ../libkopete.la CVS commit by lilachaze: Don't parse links inside HTML tags. CCMAIL: 65909-done@bugs.kde.org M +34 -2 kopetemessage.cpp 1.134.2.1 M +1 -1 kopetemessage.h 1.54.2.1 --- kdenetwork/kopete/libkopete/kopetemessage.cpp #1.134:1.134.2.1 @@ -476,10 +476,42 @@ QString KopeteMessage::parsedBody() cons else { - return KopeteEmoticons::parseEmoticons(parseLinks(escapedBody())); + return KopeteEmoticons::parseEmoticons(parseLinks(escapedBody(), d->format)); } } -QString KopeteMessage::parseLinks( const QString &message ) const +QString KopeteMessage::parseLinks( const QString &message, MessageFormat format ) { + if ( format == ParsedHTML ) + return message; + + if ( format & RichText ) + { + // < in HTML *always* means start-of-tag + QStringList entries = QStringList::split( QChar('<'), message, true ); + + QStringList::Iterator it = entries.begin(); + + // first one is different: it doesn't start with an HTML tag. + if ( it != entries.end() ) + { + *it = parseLinks( *it, PlainText ); + ++it; + } + + for ( ; it != entries.end(); ++it ) + { + QString curr = *it; + // > in HTML means start-of-tag if and only if it's the first one after a < + int tagclose = curr.find( QChar('>') ); + // no >: the HTML is broken, but we can cope + if ( tagclose == -1 ) + continue; + QString tag = curr.left( tagclose + 1 ); + QString body = curr.mid( tagclose + 1 ); + *it = tag + parseLinks( body, PlainText ); + } + return entries.join(QString::fromLatin1("<")); + } + QString result = message; --- kdenetwork/kopete/libkopete/kopetemessage.h #1.54:1.54.2.1 @@ -342,5 +342,5 @@ private: KopeteMessagePrivate *d; - QString parseLinks( const QString &message ) const; + static QString parseLinks( const QString &message, MessageFormat format ); }; |