Bug 318751 - Groupwise logs imported from Kopete are displayed with HTML tags in chat window.
Status: RESOLVED FIXED
Alias: None
Product: telepathy
Classification: Frameworks and Libraries
Component: text-ui
Version: 0.6.1
Platform: Other Linux
Importance: NOR normal
Target Milestone: 0.6.2
Assignee: Telepathy Bugs
Reported: 2013-04-23 08:52 UTC by Vit Pelcak
Modified: 2013-05-13 13:11 UTC

Version Fixed In: 0.6.2


Description Vit Pelcak 2013-04-23 08:52:47 UTC
I have imported Groupwise conversations from Kopete, and all of them are displayed in the chat window like this:
 
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span style=" font-family:'Unknown'; font-size:12pt; color:#000000;">to je divny ... ale v  Gentoo je to mozna mozne</span></p>

This makes the conversations harder to read.

Is this an issue with the import tool, or does the chat window display them wrongly?

Reproducible: Always
Comment 1 Daniel Vrátil 2013-04-23 10:36:21 UTC
It appears to be an issue with the import tool. Kopete escapes the HTML by replacing the opening "<" with "&lt;" to prevent the tag from being interpreted as part of the XML document.

The proper solution would be to drop the HTML tags completely when importing the message, but there's always the risk of false positives. I'm afraid that unescaping "&lt;" back to "<" could break Adium themes.

However, since these corrupted logs have already been imported, I guess a workaround will be needed in LogViewer as well.
Comment 2 David Edmundson 2013-04-23 18:02:31 UTC
Dan, have you got any ideas on how we can safely fix this?

I don't.
Comment 3 Daniel Vrátil 2013-04-24 14:12:49 UTC
No. I think we will have to choose the least evil approach of the two I mentioned in comment #1. I think that converting "&lt;" to "<" is safer, since it can only break visual representation, but not the data itself.

But I'm open to any suggestions :-)
Comment 4 David Edmundson 2013-04-24 20:37:30 UTC
We can fix new imports, but we can't fix what's already there that way.

Otherwise any HTML I've sent you in the past would be rendered instead of being shown as HTML tags in the UI.
Comment 5 Daniel Vrátil 2013-04-29 15:30:26 UTC
OK, it appears that in Kopete logs, any HTML written by the user is completely escaped (i.e. both < and > are escaped as &lt; and &gt;), while generated HTML has only the opening "<" of each tag escaped:

&lt;p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">Paragraph in HTML is enclosed in &lt;p&gt; and &lt;/p&gt; tags.&lt;/p>

I suck at regular expressions, so could someone please help me with this? ;-)
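
The distinction described in this comment can be expressed as a single pattern. The following is only a minimal sketch in Python, not the code used by the importer: the names HALF_ESCAPED_TAG and find_generated_tags are hypothetical, and the sketch assumes that attribute values of generated tags never contain "&", "<" or ">".

```python
import re

# A generated tag has its opening "<" escaped as "&lt;" but keeps a literal ">".
# User-written HTML is escaped on both sides ("&lt;p&gt;"), so it never
# contains a literal ">" and cannot match this pattern.
# Assumption: attribute values contain no "&", "<" or ">" characters.
HALF_ESCAPED_TAG = re.compile(r"&lt;/?[A-Za-z][^<>&]*>")


def find_generated_tags(body: str) -> list[str]:
    """Return every half-escaped (generated) tag found in a message body."""
    return HALF_ESCAPED_TAG.findall(body)


sample = (
    '&lt;p style=" margin-top:0px;">Paragraph in HTML is enclosed in '
    "&lt;p&gt; and &lt;/p&gt; tags.&lt;/p>"
)
print(find_generated_tags(sample))
```

On this sample only the wrapping paragraph tags are reported; the fully escaped &lt;p&gt; and &lt;/p&gt; that the user typed are left alone.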
Comment 6 David Edmundson 2013-04-29 15:32:55 UTC
I'm confused, what's your plan?
Comment 7 Daniel Vrátil 2013-04-29 16:34:50 UTC
I thought that during import we would either remove or unescape the &lt;p....> tags, while keeping the user's HTML escaped.
Comment 8 David Edmundson 2013-04-29 16:43:59 UTC
So fixing the import bug, not fixing imported data.  That should be fine.

And we want to match
&lt;p  *junk* >  *important text*    &lt;/p>
and replace it with *important text*?

Are the <p> tags always at the start and end of the string, or can we have multiple per message?
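
A possible shape for that replacement, as a hedged sketch in Python rather than the actual importer code: it assumes the wrapper occurs at most once per nesting level, spans the whole message, and consists only of half-escaped <p> and <span> tags like the ones in the description. The function name strip_wrapping_tags is hypothetical.

```python
import re

# Opening wrapper at the very start of the message, e.g. '&lt;p style="...">'.
OPENING = re.compile(r"^\s*&lt;(p|span)\b[^<>&]*>")
# Matching closing wrapper at the very end, e.g. '&lt;/p>'.
CLOSING = re.compile(r"&lt;/(p|span)>\s*$")


def strip_wrapping_tags(body: str) -> str:
    """Peel generated <p>/<span> wrappers off a message, outermost first."""
    while True:
        start = OPENING.match(body)
        end = CLOSING.search(body)
        # Stop as soon as the message is not wrapped by a matching pair.
        if not start or not end or start.group(1) != end.group(1):
            return body
        body = body[start.end():end.start()]


wrapped = (
    '&lt;p style=" margin-top:0px;">&lt;span style=" font-size:12pt;">'
    "to je divny&lt;/span>&lt;/p>"
)
print(strip_wrapping_tags(wrapped))
```

Because the patterns require a literal ">", fully escaped HTML typed by the user stays untouched, and a message without a wrapper is returned unchanged.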
Comment 9 Daniele E. Domenichelli 2013-04-29 16:58:17 UTC
Can we store the version of the importer somewhere after importing, so that if we know the logs were imported with a buggy version we can try to recover them?
Comment 10 David Edmundson 2013-04-29 17:04:24 UTC
No, as logger isn't controlled by us.
Comment 11 Daniele E. Domenichelli 2013-04-29 17:08:48 UTC
I mean that when our log importer runs, we could store a setting, and when the log-viewer is executed we could read that setting and possibly try to fix the import.
Oh but perhaps you mean that we cannot modify logger data...
Comment 12 Daniel Vrátil 2013-04-29 17:26:59 UTC
@David:
Yup, exactly. I'm pretty sure that it only appears once and wraps the entire message. However, as you can see in the first comment, there's also a wrapping <span> tag that might appear in some messages.


@Daniele:
We don't need control over TpLogger. We could simply iterate through all existing log files and "fix" them. The importer does not use TpLogger at all, it creates the .xml files and stores them into ~/.local/share/TpLogger/...
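
Iterating over the existing files could look roughly like the sketch below. It is written in Python purely for illustration and is not what the importer does: the function rewrite_logs, the fix callback and the glob pattern are hypothetical, and the sketch treats every file as UTF-8 text without parsing its structure.

```python
from pathlib import Path
from typing import Callable


def rewrite_logs(root: Path, fix: Callable[[str], str], pattern: str) -> list[Path]:
    """Apply `fix` to every file under `root` matching `pattern`.

    Only files whose content actually changes are written back.
    Returns the list of rewritten files.
    """
    changed = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        fixed = fix(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
            changed.append(path)
    return changed
```

A caller would pass the log directory, a function that repairs one file's text, and a pattern such as "*.xml". Taking a backup of the directory first would be prudent, since the rewrite happens in place.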
Comment 13 Daniel Vrátil 2013-05-13 12:08:28 UTC
Git commit e469dfe3d6bcd02b7c0ffd8e513fa2a9d0face0e by Dan Vrátil.
Committed on 13/05/2013 at 14:04.
Pushed by dvratil into branch 'kde-telepathy-0.6'.

Strip wrapping HTML tags when importing logs from Kopete

Also disable check whether the logfile already exists in TpLogger and
just overwrite it. This allows users to reimport their Kopete logs
FIXED-IN: 0.6.2
Reviewed-By: David Edmundson

M  +11   -7    KTp/logs-importer-private.cpp

http://commits.kde.org/telepathy-common-internals/e469dfe3d6bcd02b7c0ffd8e513fa2a9d0face0e
Comment 14 Vit Pelcak 2013-05-13 12:48:50 UTC
Does that reimport mean that, since I imported logs with HTML tags, these will be replaced by newly imported logs without HTML tags? That's awesome.
Comment 15 Daniel Vrátil 2013-05-13 13:11:41 UTC
(In reply to comment #14)
> Does that reimport mean that as I imported logs with html tags, these will
> be replaced by newly imported logs without html tags? That's awesome.

Yes, all imported logs will be overwritten with new ones without the HTML tags. You can run the importer manually from Log Viewer -> Logs -> Import Kopete Logs