Bug 75497 - An internal Kopete error occurred while parsing a message: XML document could not be parsed! (caused by incorrect charset in message)
Status: RESOLVED FIXED
Alias: None
Product: kopete
Classification: Applications
Component: Chatwindow Styles
Version: 0.8.0
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Kopete Developers
URL:
Keywords:
Duplicates: 74059 75453 83565
Depends on:
Blocks:
 
Reported: 2004-02-18 09:34 UTC by jstuart
Modified: 2004-11-08 11:15 UTC
CC List: 7 users

See Also:
Latest Commit:
Version Fixed In:


Description jstuart 2004-02-18 09:34:26 UTC
Version:           0.8.0 (using KDE KDE 3.2.0)
Installed from:    Unspecified

This is a generic bug for all of the reports that we get for the above error message. As I understand it, the error occurs because Qt is unable to figure out which character set the message should be displayed in and instead converts it to UTF-8.
Comment 1 jstuart 2004-02-18 09:35:03 UTC
*** Bug 75453 has been marked as a duplicate of this bug. ***
Comment 2 jstuart 2004-02-18 09:35:53 UTC
*** Bug 74059 has been marked as a duplicate of this bug. ***
Comment 3 Jason Keirstead 2004-02-18 13:14:57 UTC
I don't think it was a good idea marking these all as duplicate since they have *very* different causes in the different protocols.

In Yahoo it is caused because the extended ASCII color information isn't all being stripped out when it should be.

In Oscar it is related to the wrong codecs being chosen to decode messages.

In IRC it happens for the same reason, but the wrong codecs are chosen based on different characteristics, since the server has no notion of codec.

In the Crypto plugin it's related to messages going through encryption and decryption.

... etc.... all very different.
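
For the Yahoo case described above, here is a minimal sketch of why leftover formatting bytes break the parser, assuming they are control characters such as ESC (0x1B). XML 1.0 only permits a fixed set of code points, and most control characters are outside it. The helper names isXmlChar and stripInvalidXmlChars are hypothetical and are not part of Kopete.

```cpp
#include <string>

// True if the code point is allowed by the XML 1.0 "Char" production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXmlChar(char32_t c)
{
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD)
        || (c >= 0x10000 && c <= 0x10FFFF);
}

// Hypothetical helper: drop every code point an XML 1.0 parser would reject.
std::u32string stripInvalidXmlChars(const std::u32string &in)
{
    std::u32string out;
    out.reserve(in.size());
    for (char32_t c : in) {
        if (isXmlChar(c))
            out.push_back(c);
    }
    return out;
}
```

This only makes the text parseable. It removes the ESC byte but leaves the rest of a colour sequence (for example "[31m") visible, so a protocol plugin would still need to strip the complete sequence itself.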
Comment 4 Richard Smith 2004-02-19 21:27:58 UTC
*** Bug 75646 has been marked as a duplicate of this bug. ***
Comment 5 Richard Smith 2004-02-19 21:29:43 UTC
Retitling so people will spot this more easily and not file duplicates.
Comment 6 Heinrich Wendel 2004-04-05 12:23:50 UTC
any update on this?
Comment 7 Tim Weber 2004-04-05 12:43:18 UTC
I'd really appreciate if somebody could fix this. I can't, because unless Kopete is written in PHP I won't be able to write any source code for it. :)

But this problem is really annoying and embarrassing. Ever said "could you please say that again without special characters, my client can't display them"? Makes you look like a real loser.
Comment 8 Will Stephenson 2004-04-05 13:02:56 UTC
If you are experiencing this with ICQ, set an appropriate encoding for the contact in its ICQ User Info and you will not get this error any more.
Comment 9 Jason Keirstead 2004-04-05 13:29:53 UTC
On April 5, 2004 08:03 am, Will Stephenson wrote:
> ------- Additional Comments From lists stevello free-online co uk  2004-04-05 13:02 -------
> If you are experiencing this with ICQ, set an appropriate encoding for the contact in its ICQ User Info and you will not get this error any more.

Can we change this error to something nicer like what Will just said? Saying something like...

"Kopete was unable to decode the last message it received. Please check the character encoding settings for this contact and/or protocol"

.. would probably go a long way towards helping users.

Comment 10 Tom Simnett 2004-04-20 03:18:39 UTC
And how might one change the encoding settings for a Yahoo contact? Accessing User Info just brings up their web profile page in Konqueror. I would rather see this fixed properly than have to use a workaround. Like Tim Weber though, I can only code in PHP!
Comment 11 Matt Rogers 2004-04-20 03:26:05 UTC
> And how might one change the encoding settings for a Yahoo contact?

You can't change the encoding settings for a yahoo contact. Generally the 
problems with the yahoo plugin and this error message have to do with invalid 
control codes for different colors and font styles.

Comment 12 Tom Simnett 2004-04-28 00:41:40 UTC
Can it be fixed then? It's somewhat embarrassing asking friends who use the extended ASCII colour codes to change their colour just so I can talk to them, when everyone else using Yahoo! Messenger can chat to them just fine!
Comment 13 Till Gerken 2004-04-28 01:16:27 UTC
I've looked a little bit at the code and I think that it's really bad design that the XSL transformation code takes a string and returns another string, without letting the caller know if the transformation was successful or not.

After all, the original message appears more or less mangled in the debug log, at least for ICQ. Kopete should issue a warning and append this mangled version of the original message, saying that it's either due to wrong encoding settings or a broken client on the other end.

The method as it is now should return an error code and the class should provide means to transform this into a more meaningful string upon request, not simply overwrite the input with an error message and have the caller think everything went well.
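
The interface change suggested above might look roughly like the following sketch. Everything in it is hypothetical: TransformResult, transformMessage and renderForChatWindow are illustrative names, and the "transformation" is a stand-in that fails on control characters only so that the example has a failure path. It is not Kopete's XSLT code.

```cpp
#include <string>

// Hypothetical result type: the caller can tell success from failure.
struct TransformResult {
    bool ok;             // true if the transformation succeeded
    std::string output;  // transformed text, meaningful only when ok is true
    std::string error;   // human-readable reason, set only when ok is false
};

// Stand-in for the real XSL transformation (assumption, not Kopete's logic):
// it rejects input containing control characters that XML does not allow.
TransformResult transformMessage(const std::string &xml)
{
    for (unsigned char c : xml) {
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r')
            return {false, "", "message contains a character that is not valid XML"};
    }
    return {true, "<div>" + xml + "</div>", ""};
}

// The caller decides how to present a failure, instead of receiving an
// error string that looks like a successfully transformed message.
std::string renderForChatWindow(const std::string &xml)
{
    const TransformResult r = transformMessage(xml);
    if (r.ok)
        return r.output;
    return "[Could not display message: " + r.error + "]";
}
```

With a result type like this, the chat window could append the mangled original text or point the user at the encoding settings, because it knows the transformation failed.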
Comment 14 Tom Simnett 2004-04-28 01:29:48 UTC
I do wonder (specifically for Yahoo IM as it's the only one I'm currently having problems with) why the message is displayed (albeit with ASCII colour information included) in the history, but cannot display in the chat window. Surely this should be exactly the same display?
Comment 15 Nicholas Pilon 2004-04-28 05:32:50 UTC
The AIM/ICQ portion is definitely related to charset. A friend of mine was pasting some info from Word into both AIM and ICQ, and it kept generating this error until I forced ICQ to use ISO-8859-1. Perhaps Kopete should simply default to whatever the "standard" KDE encoding is on the system if it can't work out what encoding a message should be?

The sender was using Trillian, if that matters.
Comment 16 Matt Rogers 2004-04-28 05:36:11 UTC
> The AIM/ICQ portion is definitely related to charset. A friend of mine was
> pasting some info from Word into both AIM and ICQ, and it kept generating
> this error until I forced ICQ to use ISO-8859-1. Perhaps Kopete should
> simply default to whatever the "standard" KDE encoding is on the system if
> it can't work out what encoding a message should be?
>

That's what it tries to do, but it's not that easy. If it was, we'd not be 
having this bug. :-)

Comment 17 Till Gerken 2004-04-28 10:04:18 UTC
That the message appears differently in the history is due to Kopete being overengineered into too many plugins. So, there are different paths in the program where the same message is handled differently, sometimes running into the parsing problem and sometimes not.
Comment 18 Jason Keirstead 2004-04-28 14:05:26 UTC
On April 28, 2004 05:04 am, Till Gerken wrote:
> ------- Additional Comments From till tantalo net  2004-04-28 10:04 -------
> That the message appears differently in the history is due to Kopete being
> overengineered into too many plugins. So, there are different paths in the
> program where the same message is handled differently, sometimes running
> into the parsing problem and sometimes not.

The reason is because the history plugin does not follow your chat window 
scheme, which is actually wrong. Stefan has code to make the History follow 
your XSL scheme, but I am unsure if he committed it yet.


Comment 19 Riku Voipio 2004-04-28 15:33:29 UTC
Forwarded from:

http://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no&bug=246310

> This occurs in long ICQ and IRC messages (>256 chars?), probably in conjunction with umlauts
Comment 20 Nicholas Pilon 2004-04-28 15:44:36 UTC
Actually, the issue pointed out by the Debian bug is an interesting one. All the messages I had trouble with were "long" ICQ/AIM messages, though none contained umlauts or any other "strange" characters. But setting the charset type on ICQ still fixed it. Does ICQ still differentiate between "server" and "direct" messages?
Comment 21 Christopher Martin 2004-04-28 16:19:36 UTC
FYI, in case anyone is interested, an older report which has much interesting discussion of the string/encoding issue is bug 67727.
Comment 22 Jason Keirstead 2004-05-13 02:53:44 UTC
CVS commit by brunes: 

Use new decoder helper.

This will ensure that the message will always be parseable to the XML
engine.

Note this could in theory be backported, if the method was backported
as well

CCMAIL:75497-done@bugs.kde.org


  M +1 -48     oscarsocket.cpp   1.182


--- kdenetwork/kopete/protocols/oscar/oscarsocket/oscarsocket.cpp  #1.181:1.182
@@ -4272,52 +4272,5 @@ const QString OscarSocket::ServerToQStri
         }
 
-        if(!codec) // no per-contact codec, guessing starts here :)
-        {
-                codec = QTextCodec::codecForMib(3); // US-ASCII
-
-                if(codec)
-                {
-                        cresult=codec->heuristicContentMatch(string, length);
-#ifdef CHARSET_DEBUG
-                        kdDebug(14150) << k_funcinfo <<
-                                "result for US-ASCII=" << cresult <<
-                                ", message length=" << length << endl;
-#endif
-                        if(cresult < length-1)
-                                codec=0L; // codec not appropriate
-                }
-
-                if(!codec)
-                {
-                        codec = QTextCodec::codecForMib(106); //UTF-8
-                        if(codec)
-                        {
-                                cresult = codec->heuristicContentMatch(string, length);
-#ifdef CHARSET_DEBUG
-                                kdDebug(14150) << k_funcinfo <<
-                                        "result for UTF-8=" << cresult <<
-                                        ", message length=" << length << endl;
-#endif
-                                if(cresult < (length/2)-1)
-                                        codec = 0L;
-                        }
-                }
-
-                if(!codec)
-                {
-                        kdDebug(14150) << k_funcinfo <<
-                                "Couldn't find suitable encoding for incoming message, " <<
-                                "encoding using local system-encoding, TODO: sane fallback?" << endl;
-                        codec = QTextCodec::codecForLocale();
-                        // TODO: optionally have a per-account encoding as fallback!
-                }
-        }
-
-#ifdef CHARSET_DEBUG
-        kdDebug(14150) << k_funcinfo <<
-                "Decoding using codec '" << codec->name() << "'" << endl;
-#endif
-
-        return codec->toUnicode(string);
+        return KopeteMessage::decodeString( string, codec );
 }
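
For readers without Qt 3 at hand, the fallback chain that the removed block implemented (US-ASCII, then UTF-8, then a last-resort encoding) can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not the code behind KopeteMessage::decodeString: it uses strict UTF-8 validation where the removed code used heuristicContentMatch thresholds, and it assumes ISO-8859-1 as the last resort where the removed code used the locale codec. All function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// True if every byte is 7-bit US-ASCII.
bool isAscii(const std::string &s)
{
    for (unsigned char c : s) {
        if (c > 0x7F)
            return false;
    }
    return true;
}

// True if the bytes are well-formed UTF-8 (rejects truncated sequences,
// overlong forms, surrogates and code points above U+10FFFF).
bool isValidUtf8(const std::string &s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;   // number of continuation bytes expected
        unsigned long cp;    // code point being assembled
        unsigned long min;   // smallest code point legal for this length
        if (c <= 0x7F)               { extra = 0; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;           // stray continuation or invalid lead byte
        if (extra > n - i - 1)
            return false;            // sequence runs past the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

// Re-encode ISO-8859-1 bytes as UTF-8; every byte maps to U+0000..U+00FF.
std::string latin1ToUtf8(const std::string &s)
{
    std::string out;
    for (unsigned char c : s) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Fallback chain: ASCII, then UTF-8, then the assumed last-resort encoding.
// The ASCII check is redundant for correctness (ASCII is valid UTF-8) and is
// kept only to mirror the order of the removed code.
std::string decodeToUtf8(const std::string &raw)
{
    if (isAscii(raw) || isValidUtf8(raw))
        return raw;
    return latin1ToUtf8(raw);
}
```

Because the last step accepts any byte sequence, the result is always well-formed UTF-8. It can still contain control characters that XML rejects, so this sketch alone does not guarantee a parseable message.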
 


Comment 23 Jason Keirstead 2004-06-18 00:49:24 UTC
*** Bug 83565 has been marked as a duplicate of this bug. ***
Comment 24 lucien.thomassin@tele2.fr 2004-07-12 21:00:56 UTC
Marcel Meyer wrote:

>------- You are receiving this mail because: -------
>You are on the CC list for the bug, or are watching someone who is.
>      
>http://bugs.kde.org/show_bug.cgi?id=75497      
>meyerm fs tum de changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |meyerm fs tum de
>
>  
>
What is the solution? Maybe choose UTF-8 instead of ISO-8859-15?
Good bye.

Comment 25 Matt Rogers 2004-07-12 21:20:53 UTC
The solution is to upgrade to the version included in the KDE 3.3 betas.
Comment 26 meyerm 2004-09-09 12:32:48 UTC
> the solution is to upgrade to the version included in the KDE 3.3 Betas

That's not 100% true. I've just upgraded to KDE 3.3. Now the message is rendered, but special characters such as the German umlauts are rendered as black dots/carets.
Comment 27 spunti 2004-11-08 11:15:37 UTC
I'd like to add here that I have no problems with German umlauts in plain-text messages. But using the GPG module produces these "cannot parse" errors for all messages with umlauts.
The man on the other side is using Miranda, a Windows ICQ client.

I set my LAND to de@euro, and in Kopete I chose the Windows encoding (I also tried ISO-8859-15), but it doesn't help here.
I'm using SUSE Linux 9.2 with KDE 3.3.0 and Kopete 0.8.0 now. The problem was also present in the Kopete that comes with KDE 3.2.

I should mention that I am able to encrypt the message successfully when I copy the text from the history to the clipboard and encrypt it there.

I found out that Kopete 0.7.4 does NOT have this problem, so I'm forced to downgrade to that or use another ICQ client :-(

thanks
spunti