Bug 314272 - Kopete disconnects from Jabber when sending text outside of the BMP
Status: RESOLVED DUPLICATE of bug 225747
Alias: None
Product: kopete
Classification: Unmaintained
Component: Jabber Plugin
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Kopete Developers
Reported: 2013-02-01 22:42 UTC by Charles Samuels
Modified: 2016-08-23 21:46 UTC
Version Fixed In: 16.08.1
Description Charles Samuels 2013-02-01 22:42:40 UTC
I've connected my Kopete (1.2.3) to the XMPP server at neko.im.

When I try to send an IM with a character not representable by UCS-2, such as 
Comment 1 Charles Samuels 2013-02-01 22:44:01 UTC
Apparently Bugzilla can't handle it either.

Here's my message with the bad character removed

I've connected my Kopete (1.2.3) to the XMPP server at neko.im.

When I try to send an IM with a character not representable by UCS-2, such as U+1F431, Kopete disconnects me from the server. If a person sends me text with that character, everything works fine.
Comment 2 Christoph Feck 2013-02-08 00:17:54 UTC
FYI: the bugzilla issue is bug 308047.
Comment 3 Pali Rohár 2013-10-23 08:27:18 UTC
Can you check if this problem is still present in Kopete from KDE 4.11?
Comment 4 Charles Samuels 2013-10-26 20:22:36 UTC
I can't tell you about KDE 4.11, but I compiled Kopete from current master against my KDE 4.8.4 and it still fails.

If I send U+1F431, Kopete writes the following to its log:

kopete(17681)/kopete (jabber - raw protocol) JabberAccount::slotClientDebugMessage: "XML OUT: <message type="chat" to="[[Sensored IM address]]" id="19">
<body>&#xdc31;</body>
<x xmlns="jabber:x:event">
<offline/>
<composing/>
<delivered/>
<displayed/>
</x>
<active xmlns="http://jabber.org/protocol/chatstates"/>
<request xmlns="urn:xmpp:receipts"/>
</message>
"
Dropping invalid XML char U+d83d


It's unclear what the relationship between U+D83D and U+1F431 is supposed to be.

Easy way to reproduce this: Copy a character from here: https://en.wikipedia.org/wiki/Emoji and send it to one of your XMPP contacts.
Comment 5 Charles Samuels 2013-10-26 20:24:04 UTC
Also, what's the relationship between 0xD83D and 0xDC31?
Comment 6 Pali Rohár 2013-10-26 20:33:16 UTC
(In reply to comment #4)
> I can't tell you about KDE 4.11, but I compiled Kopete in master as of right
> now against my KDE 4.8.4 and it still fails.
> 

OK, that is enough (any version of Kopete from git >= KDE/4.11 will do).

> If I send U1F431, Kopete appears to say in its log: 
> 
> kopete(17681)/kopete (jabber - raw protocol)
> JabberAccount::slotClientDebugMessage: "XML OUT: <message type="chat"
> to="[[Sensored IM address]]" id="19">
> <body>&#xdc31;</body>
> <x xmlns="jabber:x:event">
> <offline/>
> <composing/>
> <delivered/>
> <displayed/>
> </x>
> <active xmlns="http://jabber.org/protocol/chatstates"/>
> <request xmlns="urn:xmpp:receipts"/>
> </message>
> "
> Dropping invalid XML char U+d83d
> 

The error message "Dropping invalid XML char" comes from protocols/jabber/libiris/src/xmpp/xmpp-core/xmlprotocol.cpp, so it originates in the external libiris XMPP library.

This therefore needs to be fixed in libiris, not in Kopete. Can you report the bug to upstream libiris? https://github.com/psi-im/iris

The authors of that library should know more.
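
For context on the "Dropping invalid XML char" message: the XML 1.0 specification defines which code points may appear in a document (the Char production), and the surrogate range U+D800 to U+DFFF is excluded. The following is a minimal sketch of that check in Python, not the libiris code; the function name is_xml_char is hypothetical.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # BMP below the surrogate range
        or 0xE000 <= cp <= 0xFFFD      # BMP above the surrogate range
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


# The code points that appear in this report.
for cp in (0x1F431, 0xD83D, 0xDC31):
    status = "valid" if is_xml_char(cp) else "invalid"
    print(f"U+{cp:04X}: {status}")
```

U+1F431 itself is a legal XML character, but neither U+D83D nor U+DC31 is legal on its own. A character reference to a code point outside the Char production, such as the `&#xdc31;` in the log, is not well-formed XML, which may be why the server closes the stream.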

> 
> It's unclear what the relationship between UD83D and U1F431 are supposed to
> be.
> 

I do not know.

> Easy way to reproduce this: Copy a character from here:
> https://en.wikipedia.org/wiki/Emoji and send it to one of your XMPP contacts.
Comment 7 Charles Samuels 2013-10-26 22:01:12 UTC
Filed: https://github.com/psi-im/iris/issues/13
Comment 8 Sergey 2013-10-28 16:56:13 UTC
I can't reproduce this with Psi.
Comment 9 Christoph Feck 2013-11-18 00:39:12 UTC
> relationship between UD83D and U1F431

Looks like a Unicode surrogate code point. QString internally stores 16-bit UTF-16 code units, so maybe the QChars are sent individually, without joining surrogate pairs.
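
The relationship asked about in comments 4 and 5 follows from the standard UTF-16 encoding of code points above U+FFFF. A short sketch of the arithmetic in Python (the helper names to_surrogates and from_surrogates are made up for illustration):

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    v = cp - 0x10000                # 20-bit value
    high = 0xD800 + (v >> 10)       # top 10 bits
    low = 0xDC00 + (v & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Join a UTF-16 surrogate pair back into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F431)
print(f"U+1F431 -> U+{high:X} U+{low:X}")
print(f"U+{high:X} U+{low:X} -> U+{from_surrogates(high, low):X}")
```

So U+D83D and U+DC31 are the high and low halves of U+1F431. The log is consistent with the two halves being handled as separate characters: one was dropped and the other was sent alone.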
Comment 10 Pali Rohár 2016-08-14 22:26:13 UTC
This is certainly the same bug as #225747. I can reproduce it.

*** This bug has been marked as a duplicate of bug 225747 ***
Comment 11 Pali Rohár 2016-08-15 22:18:27 UTC
Git commit f23d6ccc7a7f542059c3956d64d912a34584723e by Pali Rohár.
Committed on 15/08/2016 at 15:58.
Pushed by pali into branch 'Applications/16.08'.

jabber: Workaround bug in QtXML: Fix xmlToString when QDomElement contains Unicode characters above 0xFFFF

Upstream:
https://github.com/psi-im/iris/commit/8612bc340421087cf0ebfd426661ff22f7351270

See also discussion:
https://github.com/psi-im/iris/pull/44
https://github.com/psi-im/iris/pull/43
https://github.com/psi-im/iris/issues/42
https://github.com/psi-im/iris/issues/13
https://bugreports.qt.io/browse/QTBUG-25291
Related: bug 225747
FIXED-IN: 16.08.1

A  +19   -0    protocols/jabber/libiris/patches/01_qtxml_unicode.patch
M  +8    -0    protocols/jabber/libiris/src/xmpp/xmpp-core/xmlprotocol.cpp

http://commits.kde.org/kopete/f23d6ccc7a7f542059c3956d64d912a34584723e
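
To illustrate the general idea of serializing characters above 0xFFFF correctly, here is a minimal Python sketch that joins surrogate pairs before writing XML text. It is an assumption-laden illustration only, not the patch from the commit above: the function escape_utf16_units is hypothetical, it drops unpaired surrogates, and it does not filter other characters that are invalid in XML.

```python
def escape_utf16_units(units: list[int]) -> str:
    """Serialize UTF-16 code units as XML text, joining surrogate pairs."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            # Valid pair: emit one reference for the combined code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00)
            out.append(f"&#x{cp:x};")
            i += 2
            continue
        if 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: cannot be represented in XML, so skip it.
            i += 1
            continue
        ch = chr(u)
        out.append({"&": "&amp;", "<": "&lt;", ">": "&gt;"}.get(ch, ch))
        i += 1
    return "".join(out)


# "Hi " followed by U+1F431 as it would be stored in UTF-16.
units = [ord("H"), ord("i"), ord(" "), 0xD83D, 0xDC31]
print(escape_utf16_units(units))
```

With the pair joined, the body carries a single reference to U+1F431 instead of a lone `&#xdc31;`.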
Comment 12 Pali Rohár 2016-08-23 21:46:20 UTC
*** This bug has been marked as a duplicate of bug 225747 ***