Summary: | Problem receiving ICQ messages containing non-ASCII letters | ||
---|---|---|---|
Product: | [Unmaintained] telepathy | Reporter: | Alex Richardson <arichardson.kde> |
Component: | text-ui | Assignee: | Telepathy Bugs <kde-telepathy-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | asturm, mklapetek, roland.leissa |
Priority: | NOR | ||
Version: | git-latest | ||
Target Milestone: | Future | ||
Platform: | unspecified | ||
OS: | Linux | ||
Latest Commit: | http://commits.kde.org/telepathy-accounts-kcm/c18f4523fb7fdbdabc6229132362e10ac22f2140 | Version Fixed In: | 0.7.0 |
Sentry Crash Report: |
Description
Alex Richardson
2012-09-07 14:48:33 UTC
I looked into the source of libpurple and found this:

    else if (charset == AIM_CHARSET_LATIN_1) {
        if ((sourcebn != NULL) && oscar_util_valid_name_icq(sourcebn))
            charsetstr1 = purple_account_get_string(account, "encoding", OSCAR_DEFAULT_CUSTOM_ENCODING);
        else
            charsetstr1 = "ISO-8859-1";
        charsetstr2 = "UTF-8";
    }

I.e. the message is a Latin-1 message (set in the protocol header), but libpurple never tries ISO-8859-1 since it uses our custom encoding instead (which is UTF-8). This seems like a bug in libpurple to me: it uses UTF-8 as the fallback encoding for a Latin-1 message and otherwise just the user encoding. I am not sure what sourcebn is, I didn't dig that deep into their code, but at least on my system the first branch (with the user encoding) is always taken. I think it should always try ISO-8859-1, at least as a fallback instead of UTF-8, since this is the expected encoding (and is what the official ICQ client sends).

This can be worked around by setting "ISO-8859-1" as the encoding in the accounts KCM, but I will also report a bug to libpurple.

Thanks for digging into that! Good job :)

Can the official client have a different encoding set? Like UTF-8? Because as much as I'd like to change our default encoding to Latin1, I'm afraid this might break things for other users with different clients who use characters not present in Latin1 (which is Western European, so it does not cover, for example, Central European characters, not to mention other Unicode characters). UTF-8 should always be the safest bet.

Also it's almost unbelievable that in 2012 the official ICQ client (is there still such a thing?) is still not using UTF-8, but rather a limited Latin1 by default (that's why I personally dislike that protocol very much).

Can you link the libpurple bug report so we can track it? Thanks!

The ICQ messages have a field in the header which specifies the encoding.
The official client uses Latin1 if it can (probably to use fewer bytes), but once you send characters which are not part of Latin1 it chooses UTF-16BE and sets the flag in the header (verified with Wireshark). You are right, it may be different in e.g. Russia; I will have to see whether I can check this.

Trying UTF-8 first should mostly not be a problem, since at least for the ASCII characters it is the same as ISO-8859-1, and once the high bit is set UTF-8 decoding will probably fail. Then it could fall back to ISO-8859-1. However, as you can see in the source snippet above, the two last lines should be swapped for that to work.

Therefore I think we should add ISO-8859-1 to the encoding combobox in the accounts KCM and have that as the default to work around this issue in libpurple. If someone wants or needs to, they can still set it back to UTF-8.

Yeah, same here, while Pidgin is fine. It would be nice to at least get that annoying error message out of the message - maybe hidden in some kind of warning/error icon or status bar.

*** Bug 318448 has been marked as a duplicate of this bug. ***

Git commit c18f4523fb7fdbdabc6229132362e10ac22f2140 by Dan Vrátil.
Committed on 17/04/2013 at 12:47.
Pushed by dvratil into branch 'master'.

Add configuration for additional charsets in ICQ

FIXED-IN: 0.7.0
REVIEW: 110060

M +82 -4 plugins/haze/icq-server-settings-widget.cpp
M +3 -0 plugins/haze/icq-server-settings-widget.h
M +0 -5 plugins/haze/icq-server-settings-widget.ui

http://commits.kde.org/telepathy-accounts-kcm/c18f4523fb7fdbdabc6229132362e10ac22f2140
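
For reference, the fallback order proposed in this thread (try UTF-8 first, fall back to ISO-8859-1 when the bytes are not valid UTF-8) could look roughly like the plain C sketch below. It is only an illustration and not libpurple's code: the names is_valid_utf8, latin1_to_utf8, decode_with_fallback and the used_charset enum are made up for this example, and the UTF-16BE case is left out. The check is a heuristic, since a few Latin-1 byte sequences also happen to be valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: which charset the decoder ended up using. */
enum used_charset { USED_UTF8, USED_LATIN1, DECODE_FAILED };

/* Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        unsigned long cp;
        size_t extra, k;

        if (c < 0x80) { i++; continue; }            /* plain ASCII byte */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return 0;                              /* invalid lead byte */

        if (len - i <= extra) return 0;             /* truncated sequence */
        for (k = 1; k <= extra; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) return 0; /* not a continuation */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        /* Reject overlong forms, surrogates and values past U+10FFFF. */
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
        i += extra + 1;
    }
    return 1;
}

/* Converts ISO-8859-1 bytes to a NUL-terminated UTF-8 string.
 * Returns the number of bytes written (without the NUL),
 * or (size_t)-1 if the output buffer is too small. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, char *out, size_t cap)
{
    size_t i, o = 0;
    for (i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            if (o + 1 >= cap) return (size_t)-1;
            out[o++] = (char)in[i];
        } else {
            /* U+0080..U+00FF always becomes a two-byte sequence. */
            if (o + 2 >= cap) return (size_t)-1;
            out[o++] = (char)(0xC0 | (in[i] >> 6));
            out[o++] = (char)(0x80 | (in[i] & 0x3F));
        }
    }
    if (o >= cap) return (size_t)-1;
    out[o] = '\0';
    return o;
}

/* Try UTF-8 first; if the bytes are not valid UTF-8, treat them as Latin-1. */
enum used_charset decode_with_fallback(const char *in, size_t len,
                                       char *out, size_t cap)
{
    const unsigned char *u = (const unsigned char *)in;
    if (is_valid_utf8(u, len)) {
        if (len + 1 > cap) return DECODE_FAILED;
        memcpy(out, in, len);
        out[len] = '\0';
        return USED_UTF8;
    }
    if (latin1_to_utf8(u, len, out, cap) == (size_t)-1) return DECODE_FAILED;
    return USED_LATIN1;
}
```

Calling decode_with_fallback on the Latin-1 bytes of a word with umlauts takes the second path and yields the UTF-8 form, while pure ASCII or already valid UTF-8 input is passed through unchanged.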