Bug 225747 - kopete does not encode utf xml properly when sending to jabber (xmpp) server
Summary: kopete does not encode utf xml properly when sending to jabber (xmpp) server
Status: RESOLVED FIXED
Alias: None
Product: kopete
Classification: Unmaintained
Component: Jabber Plugin (other bugs)
Version First Reported In: unspecified
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Kopete Developers
URL:
Keywords:
Duplicates: 314272
Depends on:
Blocks:
 
Reported: 2010-02-06 18:08 UTC by Nikoli
Modified: 2016-08-23 21:46 UTC

See Also:
Latest Commit:
Version Fixed In: 16.08.1
Sentry Crash Report:


Description Nikoli 2010-02-06 18:08:05 UTC
Version:           qt 4.5.3 kde 4.3.3 kopete 0.80.2 (using KDE 4.3.3)
OS:                Linux
Installed from:    Gentoo Packages

Try to send '
Comment 1 Nikoli 2010-02-06 18:11:17 UTC
Try to send 'GUTISK' (215, Gothic, 'GUTISK'; bugs.kde.org filters out the original string) to any JID over XMPP (Jabber). The server will disconnect you because the text is not encoded properly.

For the test I copied and pasted from http://meta.wikimedia.org/wiki/List_of_Wikipedias.

Psi, qutIM and QIP do not have this problem, but Vacuum does.
Comment 2 Roman Jarosz 2010-02-06 18:33:22 UTC
I cannot reproduce this on KDE SC 4.4. I tried copying the whole "Languages:" box. What exactly did you send, and which Jabber server do you use?
Comment 3 Nikoli 2010-02-06 18:46:12 UTC
I opened http://meta.wikimedia.org/wiki/List_of_Wikipedias in Firefox, pressed Ctrl+A and Ctrl+Insert, opened Kopete, and pressed Shift+Insert. I think the problem is only with gutisk: http://dpaste.com/155471/ http://paste2.org/p/653016

I used my own ejabberd server, jabber.org, and jabber.ru. All of them reject a message containing this text when it is sent from Kopete.
Comment 4 Nikoli 2010-03-16 22:52:35 UTC
Tested with the latest Qt and KDE: same problem. Versions: Qt 4.6.2, KDE 4.4.1, kopete 1.0.0

jabber.ru disconnected the client. XML log:

<message type="chat" to="nikoli@nikoli.msk.ru" id="122">
<body>&#xdf32;&#xdf3f;&#xdf44;&#xdf39;&#xdf43;&#xdf3a;</body>
<x xmlns="jabber:x:event">
<offline/>
<composing/>
<delivered/>
<displayed/>
</x>
<active xmlns="http://jabber.org/protocol/chatstates"/>
</message>

<stream:error>
<xml-not-well-formed xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>
</stream:error>
Comment 5 Pali Rohár 2016-08-14 09:37:37 UTC
Confirmed. Unicode characters in XML must be encoded as full code points, not as UTF-16 surrogate pairs. Surrogate code points are invalid in XML, so the server is right to disconnect you.

See: http://www.w3.org/TR/REC-xml/#charsets

Character Range

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
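
To make the rule concrete, here is a minimal Python sketch of production [2] applied to the values from the <body> of the XML log in comment 4. The function name is_xml_char is made up for illustration and is not part of Kopete, iris or Qt.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches production [2] Char of XML 1.0."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Numeric character references copied from the <body> in comment 4.
logged = [0xDF32, 0xDF3F, 0xDF44, 0xDF39, 0xDF43, 0xDF3A]

# Every logged value lies in the surrogate block, so none is a valid Char.
print([is_xml_char(cp) for cp in logged])

# A full code point above 0xFFFF is allowed by the last range.
print(is_xml_char(0x10332))
```

All six logged values fall in the surrogate block #xD800-#xDFFF, which the production excludes, so a reference such as &#xdf32; makes the stanza not well-formed.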
Comment 6 Pali Rohár 2016-08-14 22:26:13 UTC
*** Bug 314272 has been marked as a duplicate of this bug. ***
Comment 7 Pali Rohár 2016-08-14 22:27:31 UTC
This is a bug in QtXml. My workaround for libiris: https://github.com/psi-im/iris/pull/44
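
For readers who want to see the general idea, the following Python sketch shows how a UTF-16 surrogate pair can be combined into one code point and written as a single numeric character reference. It is not the code from the pull request. The helper names combine_surrogates and escape_utf16_units are hypothetical, and the high surrogate 0xD800 in the usage example is an assumption (it is the value that pairs with 0xDF32 for Gothic U+10332; the log in comment 4 shows only the low halves).

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid UTF-16 surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def escape_utf16_units(units):
    """Hypothetical helper: render UTF-16 code units as XML character data.

    Characters above 0xFFFF become one numeric reference for the full
    code point instead of one reference per surrogate.
    """
    markup = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        next_is_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if 0xD800 <= unit <= 0xDBFF and next_is_low:
            out.append("&#x%x;" % combine_surrogates(unit, units[i + 1]))
            i += 2
        elif 0xD800 <= unit <= 0xDFFF:
            # A lone surrogate cannot be represented in XML at all.
            raise ValueError("unpaired surrogate at index %d" % i)
        else:
            ch = chr(unit)
            out.append(markup.get(ch, ch))
            i += 1
    return "".join(out)


# Assumed pair for the first logged character: 0xD800 followed by 0xDF32.
print(escape_utf16_units([0xD800, 0xDF32]))
```

With this approach the first character of the test string would be sent as &#x10332;, which falls in the [#x10000-#x10FFFF] range that XML allows.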
Comment 8 Pali Rohár 2016-08-15 22:18:27 UTC
Git commit f23d6ccc7a7f542059c3956d64d912a34584723e by Pali Rohár.
Committed on 15/08/2016 at 15:58.
Pushed by pali into branch 'Applications/16.08'.

jabber: Workaround bug in QtXML: Fix xmlToString when QDomElement contains Unicode characters above 0xFFFF

Upstream:
https://github.com/psi-im/iris/commit/8612bc340421087cf0ebfd426661ff22f7351270

See also discussion:
https://github.com/psi-im/iris/pull/44
https://github.com/psi-im/iris/pull/43
https://github.com/psi-im/iris/issues/42
https://github.com/psi-im/iris/issues/13
https://bugreports.qt.io/browse/QTBUG-25291
Related: bug 314272
FIXED-IN: 16.08.1

A  +19   -0    protocols/jabber/libiris/patches/01_qtxml_unicode.patch
M  +8    -0    protocols/jabber/libiris/src/xmpp/xmpp-core/xmlprotocol.cpp

http://commits.kde.org/kopete/f23d6ccc7a7f542059c3956d64d912a34584723e
Comment 9 Pali Rohár 2016-08-23 21:46:20 UTC
*** Bug 314272 has been marked as a duplicate of this bug. ***