| Summary: | long nicknames are cut off | | |
|---|---|---|---|
| Product: | [Unmaintained] kopete | Reporter: | Favonia <h3226699> |
| Component: | MSN Plugin | Assignee: | Kopete Developers <kopete-bugs-null> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | CC: | acelan, assemhassan, mrnan216, ogoffart, sromero |
| Priority: | NOR | | |
| Version: | 0.7.3 | | |
| Target Milestone: | --- | | |
| Platform: | Debian testing | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed In: | |
| Sentry Crash Report: | | | |
Description
Favonia
2004-01-31 17:52:42 UTC
Subject: Re: [Kopete-devel] New: long nicknames are cut off

On Saturday 31 January 2004 17:52, Peter Lan wrote:
> Entity: line 6: parser error : Input is not proper UTF-8, indicate encoding
> ! 暱稱這是一個超長的暱稱這是一個超長的暱稱這是一個超長
and
> IMHO, it seems to be a buffer overflow problem...

No, it's an encoding problem. What I don't understand is that it's in traditional Chinese encoding. I'm pretty sure the MSN protocol has used UTF-8 (Unicode) for quite some time, so why would your name not be in UTF-8? Does the name look OK in the contact list?

Well, it's cut off too (in the contact list and in the title of the chat box). Maybe you could take a look at this screenshot:
http://www.csie.ntu.edu.tw/~k92201008/temp/screenshot/longnickname1.png
You can see that the end of the nickname looks strange.

The MSN protocol does indeed use UTF-8; I just wanted to explain which language I use in the nickname. (sorry)

I made another sample
ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWX
and got something similar:
----START----
Entity: line 6: parser error : Input is not proper UTF-8, indicate encoding !
QRSTUVWXYZABCDEFGHIJKLMNOP
...............................................................................^
Entity: line 6: error: Bytes: 0xEF 0xBC 0x25 0x42
QRSTUVWXYZABCDEFGHIJKLMNOP
...............................................................................^
---- END ----
'.' stands for a space (\x20). I found that the space char was missing last time and the position of '^' was not correct. Here's a screenshot for it:
http://www.csie.ntu.edu.tw/~k92201008/temp/screenshot/longnickname2.png

Thanks for your help :)

Subject: Re: [Kopete-devel] long nicknames are cut off
On Saturday 31 January 2004 19:01, Favonia wrote:
> I make another sample
> ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLM
What charset is this? KMail displays it perfectly fine as western letters, but
in fact they are *NOT* the same ASCII codes, even though I see 'A B C D
E ...'. Are there two different UTF-8 sequences to represent the western
alphabet???
When I do 'view source' your text reads as
----START----
Entity: line 6: parser error : Input is not proper UTF-8, indicate encoding !
QRSTUVWXYZABCDEFGHIJKLï¼ï¼®ï¼¯ï¼°
...............................................................................^
Entity: line 6: error: Bytes: 0xEF 0xBC 0x25 0x42
QRSTUVWXYZABCDEFGHIJKLï¼ï¼®ï¼¯ï¼°
...............................................................................^
---- END ----
Which makes it extra strange that it works on the debug output, since the
Konsole is usually not utf-8.
Weird stuff :)
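The byte dump above (0xEF 0xBC 0x25 0x42) is consistent with a percent-encoded nickname being cut in the middle of an escape and then decoded. Fullwidth Latin capitals (U+FF21 to U+FF3A) encode in UTF-8 as 0xEF 0xBC followed by one more byte, so each letter becomes nine characters once every byte is escaped. The C++ sketch below illustrates that reading only; `percentEncodeAll` and `percentDecodeLenient` are hypothetical helpers, not Kopete or KDE functions, and copying a dangling `%` through unchanged is an assumption about how the decoder behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: percent-encode every byte of a UTF-8 string.
std::string percentEncodeAll(const std::string &utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char byte : utf8) {
        out += '%';
        out += hex[byte >> 4];
        out += hex[byte & 0x0F];
    }
    // Every input byte expands to exactly three output characters.
    assert(out.size() == 3 * utf8.size());
    return out;
}

// Returns the value of a hex digit, or -1 if c is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical lenient decoder (assumption): a '%' that is not followed
// by two hex digits is copied through as a literal character.
std::string percentDecodeLenient(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hexValue(in[i + 1]);
            const int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

With these helpers, fullwidth P (U+FF30) encodes to `%EF%BC%B0`. Cutting that escape after eight characters and decoding the remainder leaves the bytes EF BC 25 42, which is not valid UTF-8 and matches the four bytes in the parser error.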
They're called "fullwidth latin letters" in Unicode, and they are somewhat wider than the original ones. (for decoration?) I made a redirection, something like "kopete > output 2>&1", and forced the encoding to be UTF-8, so there's no problem. :)

I found another symptom about this bug, the screenshot is right here:
http://www.csie.ntu.edu.tw/~k92201008/temp/screenshot/longnickname3.png

The chatview window doesn't show anything! But if I make the nickname much shorter, it works fine.
http://www.csie.ntu.edu.tw/~k92201008/temp/screenshot/longnickname4.png
(new nickname: ABCDEFGHIJKLMNOPQRSTUVWXYZ)

It seems that the UTF-8 encoding is right as rain. Thanks for helping. :)

Subject: Re: [Kopete-devel] long nicknames are cut off

On Sunday 01 February 2004 05:02, Favonia wrote:
> I found another symptom about this bug, the screenshot is right here:
> http://www.csie.ntu.edu.tw/~k92201008/temp/screenshot/longnickname3.png
>
> The chatview window doesn't show anything!
> But if I make the nickname much shorter, it works fine.

Hmm, it's almost like it somehow chokes on long names. Very weird. Does this only happen with MSN? Or is it a problem in Kopete's core, used by all protocols?

I've tried ICQ Pro and Yahoo! Messenger. ICQ has some constraints, so I can't reach the limit, and it has some encoding problems if I use non-ASCII characters. Yahoo! Messenger's default nickname appears to be (always) the ID, and the user can only change his/her friends' names in the contact list locally. So there's no answer to it. :( Sorry, but I don't know how to test other messengers... (well, how about a `fake_messenger` plugin for testing :P)

*** Bug 74920 has been marked as a duplicate of this bug. ***

*** Bug 75050 has been marked as a duplicate of this bug. ***

I think I know what's happening. To transmit the message to the server, we need to URL-encode it.
We use a KDE method that encodes all non-ASCII and special characters (like space or %), but the official MSN client does not escape as many characters as Kopete does. Anyway, if we send too long a command to the server, we are disconnected. So if someone has "éééééééééééééééééééé" as a nickname, Kopete will send back to the server %e7%e7%e7%e7.............., which is longer. To avoid being disconnected, the command is snipped. So that explains everything.

The solution: code our own URL encoder that only encodes the characters that need to be encoded.

It's a big problem for me, so I modified the code to cut the long nick, and it works for me. I hope there will be a good solution for it; until then, I'll use this method to keep Kopete working.

#code from cvs KDE_3_2_BRANCH 2004/03/02
#497 protocols/msn/msnsocket.cpp
QString MSNSocket::escape( const QString &str )
{
    return ( KURL::encode_string( str.left( 50), 106 ) );
}

*** Bug 76671 has been marked as a duplicate of this bug. ***

Hello, I made a patch for this bug and put it at
http://kde.linux.org.tw/~acelan/kopete/msnsocket.cpp.patch

Olivier Goffart said, "The solution: code our own URL encoder that only encode char that need to be encoded". So I copied the encode function from Qt and cut it down to fit our needs. I've tested the patch for a few days and it works fine for me. The patch is made against the CVS source I just updated. The patch is a little ugly: I'm not sure where to put the two new functions, so I declared them as static. I need somebody to help me move them to the correct place. I really hope this patch can be accepted by the Kopete team, because this bug is a very big problem for Chinese users.

Hello, I had made a patch for this bug, and I have now modified it to make it clearer. Would you please help me commit this patch? This bug is a big issue for Chinese users, and I think this patch should work well in other language environments. Thank you very much.
http://kde.linux.org.tw/~acelan/kopete/msnsocket.cpp.patch

Thanks for the patch, I'll test it and commit it if it works. Anyway, you still escape many characters. I think escaping only the space and the % is enough; I'll try that.

Hi, I have tested the escaped characters one by one; let me explain them to you.
'<' '>' : If you don't escape them, you can't log in to the MSN server.
' ' : If you don't escape the space, you'll be disconnected if someone's nick contains a space.
'\\' : The codes of some Chinese characters end with '\'; if you don't escape it, some Chinese nicks will turn into strange symbols.
'^' '&' '*' : These three characters need to be escaped; if the nickname contains them, they will disappear.
That's why I escape these characters, and you can test them to see whether my test is correct. Thank you very much. ^^

I'm currently using your patch, but I only escape ' ' and %. I'm logged in and all seems to be fine. The official client only escapes % and space. Oh, I think \t should be escaped too. Can you confirm that you are forced to escape all these symbols? In fact, if you can't connect, it's certainly because your password contains one of these symbols. I also use MSNSocket::escape() to escape the password, which is sent in a URL.
So there we will have to use KURL::encode_string. And I will encode all chars with a value <= 32.

CVS commit by ogoffart:

Better escaping of nickname
CCMAIL: 73901-done@bugs.kde.org
thanks to Chia-Lin Kao

M +2 -2 msnnotifysocket.cpp 1.139
M +40 -6 msnsocket.cpp 1.87

--- kdenetwork/kopete/protocols/msn/msnnotifysocket.cpp #1.138:1.139
@@ -591,5 +591,5 @@ void MSNNotifySocket::slotAuthJobDone (
 QString authURL = "https://" + m_sid + "/ppsecure/post.srf?lc=" + rx.cap( 1 ) + "&id=" + rx.cap( 2 ) + "&tw=" + rx.cap( 3 ) + "&cbid=" + rx.cap( 2 ) + "&da=passport.com&login=" +
-    m_account->accountId() + "&domain=passport.com&passwd=";
+    KURL::encode_string( m_account->accountId()) + "&domain=passport.com&passwd=";
 kdDebug( 14140 ) << "MSNNotifySocket::slotAuthJobDone: " << authURL << "(*******)" << endl;
@@ -599,5 +599,5 @@ void MSNNotifySocket::slotAuthJobDone (
 if(m_kv.isNull()) m_kv="";
-authURL += escape( m_password );
+authURL += KURL::encode_string( m_password ) ;
 job = KIO::get( KURL( authURL ), false, false );
 job->addMetaData("cookies", "manual");

--- kdenetwork/kopete/protocols/msn/msnsocket.cpp #1.86:1.87
@@ -494,5 +493,40 @@ void MSNSocket::slotReadyWrite()
 QString MSNSocket::escape( const QString &str )
 {
-    return ( KURL::encode_string( str, 106 ) );
+    //return ( KURL::encode_string( str, 106 ) );
+    //It's not needed to encode everything. The official msn client only encode spaces and %
+    //If we encode more, the size can be longer than excepted.
+
+    int old_length= str.length();
+    QChar *new_segment = new QChar[ old_length * 3 + 1 ];
+    int new_length = 0;
+
+    for ( int i = 0; i < old_length; i++ )
+    {
+        unsigned char character = str[i];
+
+        /*character == ' ' || character == '%' || character == '\t'
+        || characters == '\n' || character == '\r'*/
+        /* || character == '<' || character == '>' || character == '\\'
+        || character == '^' || character == '&' || character == '*'*/
+
+        if( character <= 32 || character == '%' )
+        {
+            new_segment[ new_length++ ] = '%';
+
+            unsigned int c = character / 16;
+            c += (c > 9) ? ('A' - 10) : '0';
+            new_segment[ new_length++ ] = c;
+
+            c = character % 16;
+            c += (c > 9) ? ('A' - 10) : '0';
+            new_segment[ new_length++ ] = c;
+        }
+        else
+            new_segment[ new_length++ ] = str[i];
+    }
+
+    QString result = QString(new_segment, new_length);
+    delete [] new_segment;
+    return result;
 }

*** Bug 81461 has been marked as a duplicate of this bug. ***

I know this bug has been marked as resolved, but there is still one thing that needs correcting. A Chinese character is two bytes, and each byte's value may be below or above 32. If the first byte of the character is below 32 and the second byte is above 32, this escape function encodes only the first byte of the character, and that garbles it. Please accept this patch to correct this bug. Thank you very much.

--- kopete/protocols/msn/msnsocket.cpp.org 2004-08-16 21:39:33.000000000 +0800
+++ kopete/protocols/msn/msnsocket.cpp 2004-08-16 21:39:38.000000000 +0800
@@ -497,7 +497,7 @@
         /* || character == '<' || character == '>' || character == '\\'
         || character == '^' || character == '&' || character =='*'*/
-        if( character <= 32 || character == '%' )
+        if( character == 32 || character == '%' )
         {
             new_segment[ new_length++ ] = '%';

But the patch is not correct. What about every char other than 32 that should be escaped? I don't remember the code exactly, but I think I used char*, and probably QChar should be used.
Anyway, I have no time to have a look now because of my exams.

We really need to correct this, or Chinese characters will be garbled. And you know that the next release, 0.9.0, is due out 2004.08.18; there is no time for more checks. I can only tell you about my experience: I have used the patch for 2 or 3 weeks and it works fine for me. If 0.9 doesn't contain that patch, I'll be blamed to death. And I believe escaping only the space and '%' is enough. Please help us.

Can you simply try to replace the char with a QChar? (QChar is 16-bit Unicode, unlike char.) The problem is that chars like \n, \r, or MSN Plus color codes also need to be escaped.

CVS commit by ogoffart:

fix bug 73901
i'll backport
CCMAIL: 73901-done@bugs.kde.org

M +1 -6 msnsocket.cpp 1.97

--- kdenetwork/kopete/protocols/msn/msnsocket.cpp #1.96:1.97
@@ -491,10 +491,5 @@ QString MSNSocket::escape( const QString
     for ( int i = 0; i < old_length; i++ )
     {
-        unsigned char character = str[i];
-
-        /*character == ' ' || character == '%' || character == '\t'
-        || characters == '\n' || character == '\r'*/
-        /* || character == '<' || character == '>' || character == '\\'
-        || character == '^' || character == '&' || character == '*'*/
+        unsigned short character = str[i].unicode();
         if( character <= 32 || character == '%' )

this should be in KDE 3.3.1

On Friday 27 August 2004 21:47, Matt Rogers wrote:
> ------- this should be in KDE 3.3.1
I already backported it :-)
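The difference between the two commits comes down to how wide the value compared against 32 is. The C++ sketch below replays the same `character <= 32 || character == '%'` test on a plain `std::u16string` instead of a `QString`. It is an illustration under assumptions: `escapeNarrow` and `escapeWide` are hypothetical names, and `escapeNarrow` models the 8-bit comparison by keeping only the low byte of each 16-bit unit, which follows the description in the thread and is not a statement about what Qt's own `QChar` to `char` conversion does.

```cpp
#include <cassert>
#include <string>

// Append "%XY" for an 8-bit value, using uppercase hex digits.
static void appendEscaped(std::u16string &out, unsigned value)
{
    static const char16_t hex[] = u"0123456789ABCDEF";
    out += u'%';
    out += hex[(value >> 4) & 0x0F];
    out += hex[value & 0x0F];
}

// Hypothetical model of the first commit: the test sees only 8 bits.
// Assumption: the narrowing keeps the low byte of the 16-bit unit.
std::u16string escapeNarrow(const std::u16string &str)
{
    std::u16string out;
    for (char16_t unit : str) {
        const unsigned char character = static_cast<unsigned char>(unit);
        if (character <= 32 || character == '%')
            appendEscaped(out, character);
        else
            out += unit;
    }
    // An escape is three units long, so the result is at most 3x the input.
    assert(out.size() <= 3 * str.size());
    return out;
}

// Hypothetical model of the final commit: the test sees the whole
// 16-bit code unit, so only real control characters, space and '%' match.
std::u16string escapeWide(const std::u16string &str)
{
    std::u16string out;
    for (char16_t unit : str) {
        const unsigned short character = unit;
        if (character <= 32 || character == '%')
            appendEscaped(out, character);
        else
            out += unit;
    }
    assert(out.size() <= 3 * str.size());
    return out;
}
```

For a nickname such as U+4E20 followed by ` 50%`, `escapeWide` leaves the Chinese character alone and escapes only the space and the percent sign, while `escapeNarrow` sees the low byte 0x20 and replaces the character itself with `%20`.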