Bug 133995 - [patch] greedy matching for emoticons
Summary: [patch] greedy matching for emoticons
Status: RESOLVED FIXED
Alias: None
Product: kopete
Classification: Unmaintained
Component: general (show other bugs)
Version: unspecified
Platform: Compiled Sources Linux
: NOR wishlist
Target Milestone: ---
Assignee: Kopete Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-13 09:01 UTC by Martin Matusiak
Modified: 2006-09-15 22:14 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
greedy_emoticon_parsing.patch (1.81 KB, patch)
2006-09-13 09:02 UTC, Martin Matusiak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Matusiak 2006-09-13 09:01:11 UTC
Version:           0.12.2 (using KDE Devel)
Installed from:    Compiled sources
Compiler:          gcc 4.1.1 
OS:                Linux

Kopete's emoticon parsing is not greedy. This means that :s will be parsed over :strong:, which is not the behavior that users expect. I discovered this problem when testing this emoticon theme: http://www.kde-look.org/content/show.php?content=43094 (which incidentally is good for testing, because of the number of emoticons and emoticon strings it contains).


Now with regards to implementation, this is how Kopete parses emoticons.. In kopete/libkopete/private/kopeteemoticons.cpp, in the function tokenize(), it reads the message letter by letter and tries to match on emoticons. If it matches the first letter of any emoticon, it then traverses a list of emoticons (which matched the first letter) and simply picks the first one that matches. So if :s is listed before :strong:, :s will be the emoticon displayed and that's it.

There are two fairly obvious ways to fix this. One is to hack tokenize() to find more than one match, and then to use the longest one. And this would complicate tokenize() somewhat. So another solution, possibly simpler, is to streamline the data (the list of emoticons) so that they are sorted from longest to shortest. This will ensure that matches are always greedy. My patch does exactly this.
Comment 1 Martin Matusiak 2006-09-13 09:02:01 UTC
Created attachment 17748 [details]
greedy_emoticon_parsing.patch
Comment 2 Olivier Goffart 2006-09-14 07:23:26 UTC
thanks for your patch, it looks fine.
I hope I will have the time to test it this week.
Comment 3 Martin Matusiak 2006-09-14 15:05:08 UTC
I've described the problem in a more user centric way on my blog (includes screenshots): http://www.matusiak.eu/numerodix/blog/index.php/2006/09/13/fixing-lack-of-greedy-emoticon-matching-in-kopete
Comment 4 Olivier Goffart 2006-09-15 22:14:05 UTC
SVN commit 584918 by ogoffart:

Fix emoticon parsing if there are sub-emoticons.
BUG: 133995
Patch by Martin Matusiak, thanks.



 M  +18 -0     kopeteemoticons.cpp  
 M  +6 -0      kopeteemoticons.h  


--- branches/KDE/3.5/kdenetwork/kopete/libkopete/private/kopeteemoticons.cpp #584917:584918
@@ -48,6 +48,8 @@
 struct Emoticons::Emoticon
 {
 	Emoticon(){}
+	/* sort by longest to shortest matchText */
+	bool operator< (const Emoticon &e){ return matchText.length() > e.matchText.length(); }
 	QString matchText;
 	QString matchTextEscaped;
 	QString	picPath;
@@ -424,6 +426,7 @@
 		node = node.nextSibling();
 	}
 	mapFile.close();
+	sortEmoticons();
 }
 
 
@@ -492,12 +495,27 @@
 		node = node.nextSibling();
 	}
 	mapFile.close();
+	sortEmoticons();
 }
 
 
+void Emoticons::sortEmoticons()
+{
+	/* sort strings in order of longest to shortest to provide convenient input for
+		greedy matching in the tokenizer */
+	QValueList<QChar> keys = d->emoticonMap.keys();
+	for ( QValueList<QChar>::const_iterator it = keys.begin(); it != keys.end(); ++it )
+	{
+		QChar key = (*it);
+		QValueList<Emoticon> keyValues = d->emoticonMap[key];
+ 		qHeapSort(keyValues.begin(), keyValues.end());
+ 		d->emoticonMap[key] = keyValues;
+	}
+}
 
 
 
+
 QMap<QString, QString> Emoticons::emoticonAndPicList()
 {
 	return d->emoticonAndPicList;
--- branches/KDE/3.5/kdenetwork/kopete/libkopete/private/kopeteemoticons.h #584917:584918
@@ -156,6 +156,12 @@
 	 * @see initEmoticons
 	 */
 	void initEmoticon_JEP0038( const QString & filename);
+	
+	/**
+	 * sorts emoticons for convenient parsing, which yields greedy matching on
+	 * matchText
+	 */
+	void sortEmoticons();
 
 
 	struct Emoticon;