Bug 105084

Summary: Python DCOP query issues with non-ASCII characters
Product: [Unmaintained] bindings Reporter: Hugo Haas <hugo>
Component: general Assignee: kde-bindings
Status: RESOLVED UNMAINTAINED    
Severity: normal CC: andresbajotierra
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Debian testing   
OS: Linux   
Attachments: Simple test pydcop program
Same script, but converting the query string to Unicode
patch for QString conversion of dcop python binding

Description Hugo Haas 2005-05-04 12:25:15 UTC
Version:           3.3.2 (using KDE 3.3.2)
Installed from:    Debian testing/unstable Packages
OS:                Linux

I seem to be having issues with non-ASCII characters with pydcop. I am using UTF-8 encoding.

With test.py attached, I get:

hugo@jibboom /tmp% dcop amarok collection query 'SELECT count(*) FROM
tags WHERE url LIKE "%a%"'
5901

hugo@jibboom /tmp% python test.py a
SELECT count(*) FROM tags WHERE url LIKE "%a%"
['5901']

Everything looks normal here.

However, with "é" in the query:

hugo@jibboom /tmp% dcop amarok collection query 'SELECT count(*) FROM
tags WHERE url LIKE "%é%"'
506

hugo@jibboom /tmp% python test.py é
SELECT count(*) FROM tags WHERE url LIKE "%é%"
['0']

As you can see, the queries return very different results.
Comment 1 Hugo Haas 2005-05-04 12:26:03 UTC
Created attachment 10894 [details]
Simple test pydcop program
Comment 2 Julian Rockey 2005-05-04 21:49:08 UTC
Quite possibly a problem; I never tested the bindings with non-ASCII characters. Does it work if you use a Python Unicode string?
I'll try to look into this over the next week.
Comment 3 Hugo Haas 2005-05-05 00:23:17 UTC
If I add:

query = query.decode('utf-8')

I get an unknown method error:

hugo@jibboom /tmp% python test.py é
SELECT count(*) FROM tags WHERE url LIKE "%é%"
<type 'unicode'>
Traceback (most recent call last):
  File "test.py", line 9, in ?
    print pydcop.anyAppCalled("amarok").collection.query(query )
  File "/usr/lib/python2.3/site-packages/pydcop.py", line 90, in __call__
    return pcop.dcop_call( self.appname, self.objname, self.name, args )
RuntimeError: DCOP: Unknown method.

I'll attach the updated test.py.
Comment 4 Hugo Haas 2005-05-05 00:25:11 UTC
Created attachment 10904 [details]
Same script, but converting the query string to Unicode
Comment 5 Miguel Angel 2006-05-01 13:42:40 UTC
Confirmed on KDE 3.5.2
This happens on any pydcop call: if I try to use unicode strings, the "DCOP: Unknown method" runtime error is raised. If I use non-unicode strings, each byte of a national character is interpreted as a separate 8-bit character.
This makes pydcop unusable for internationalized projects.
Example:
pydcop.anyAppCalled('amarok').scripts.addCustomMenuItem(u'á','b') ->Exception
pydcop.anyAppCalled('amarok').scripts.addCustomMenuItem('á','b') ->wrong characters

As a side note, PyKDE's dcop extension doesn't have this problem (it has many others, though).
Comment 6 David Ammouial 2006-09-28 09:25:34 UTC
The problem still shows up now.
In fact, it displays UTF-8 strings as if they were ISO-8859-15.

Example:
>>> import pcop
>>> pcop.dcop_call('knotify', 'default', 'notify', ('Event Foo', 'App Bar', 'é', '', '', 16, 0))

This gives a popup containing "Ã©" instead of "é", which means that the 2 bytes of the UTF-8 character are interpreted as 2 separate 8-bit characters.

Please note that my system charset is UTF-8.
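
The misinterpretation described above can be reproduced outside DCOP. The following is a minimal sketch in Python 3 (not the Python 2.4 used in this report) and does not touch pcop at all; it only shows what happens when the UTF-8 bytes of "é" are decoded with a single-byte charset. The variable names are illustrative.

```python
# Python 3 sketch; the original report used Python 2.4 and pcop.
text = "é"

# UTF-8 encodes "é" as two bytes: 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# Decoding those bytes with a single-byte charset turns each byte
# into its own character, which is what the popup displayed.
misread = utf8_bytes.decode("iso-8859-15")

print(len(utf8_bytes), repr(misread))  # 2 'Ã©'
```

Decoding with "latin-1" instead of "iso-8859-15" gives the same result for these two bytes, since the two charsets agree at 0xC3 and 0xA9.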

KDE 3.5.4
Python 2.4.4c0 (Debian sid package 2.4.3-11)
python-dcop (Debian sid package 4:3.5.3-1+b1)

-- 
David Ammouial
Comment 7 Christian Folkers 2006-11-05 17:33:53 UTC
*** This bug has been confirmed by popular vote. ***
Comment 8 Christian Folkers 2006-11-06 15:17:16 UTC
hmm ok i think i have found the problem (and solved it ;) )

my system:
fedora core 6(x86_64)
kdebindings-3.5.5-0.1.fc6

it is not related to pydcop.py (which is only a small wrapper around pcop.so);
it is a bug in the marshal module of pcop.so

kdebindings-3.5.5/dcoppython/shell/marshal_funcs.data
around line 250

there is the conversion spec for QString which looks like:

type:QString
%doc as str s
%% marshal
  {
    if (!PyString_Check(obj)) return false;
    if (str) {
      QString s( PyString_AsString(obj) );          //<<<---bug---<<<<
      (*str) << s;
    }
    return true;
  }
%% demarshal
  {
    QString s;
    (*str) >> s;
    return PyString_FromString( s.utf8().data() );
  }
%%

at line 250 the function uses the QString(const char *) constructor (not the default constructor),
which expects a Latin-1 char* string

"""The encoding is assumed to be Latin-1, unless you change it using QTextCodec::setCodecForCStrings().""" <from the Qt docs>

i think simply changing that line to
   QString s = QString::fromUtf8( PyString_AsString(obj) );
should be ok

here is my suggestion:

type:QString
%doc as str s
%% marshal
  {
    if (!PyString_Check(obj)) return false;
    if (str) {
      QString s = QString::fromUtf8( PyString_AsString(obj) );
      (*str) << s;
    }
    return true;
  }
%% demarshal
  {
    QString s;
    (*str) >> s;
    return PyString_FromString( s.utf8().data() );
  }
%%
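
A rough model of the two marshal variants above, written in Python 3 rather than in the C++ of the binding. It assumes that a Python 2 str corresponds to Python 3 bytes and a Python 2 unicode object to Python 3 str. The function names marshal_latin1 and marshal_utf8 are hypothetical and not part of pcop; this is a sketch of the decoding difference only, not the real marshalling code.

```python
# Python 3 model of the QString marshal step (hypothetical names, not pcop API).
# Assumption: Python 2 `str` ~ Python 3 `bytes`, Python 2 `unicode` ~ Python 3 `str`.

def marshal_latin1(obj):
    """Models the original code: QString(const char *) assumes Latin-1."""
    if not isinstance(obj, bytes):
        return None  # mirrors `return false` when PyString_Check fails
    return obj.decode("latin-1")

def marshal_utf8(obj):
    """Models the suggested fix: QString::fromUtf8()."""
    if not isinstance(obj, bytes):
        return None  # the type check is unchanged by the fix
    # errors="replace" is an assumption made so that the sketch never raises.
    return obj.decode("utf-8", errors="replace")

query_text = 'SELECT count(*) FROM tags WHERE url LIKE "%é%"'
query = query_text.encode("utf-8")  # what a UTF-8 terminal passes in

print(marshal_latin1(query))      # the "é" arrives as two characters
print(marshal_utf8(query))        # the "é" arrives intact
print(marshal_utf8(query_text))   # None: Unicode objects are still rejected
```

In this model the fix repairs byte strings only. Unicode objects still fail the type check, which may be related to the "DCOP: Unknown method" error reported earlier, although the thread does not confirm that.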

Comment 9 Miguel Angel 2006-11-06 19:04:22 UTC
Christian, could you upload a patch with your fix?
Comment 10 Christian Folkers 2006-11-08 14:46:52 UTC
Created attachment 18469 [details]
patch for QString conversion of dcop python binding
Comment 11 Christian Folkers 2006-11-08 14:59:31 UTC
ok

i'm sure that this patch solves the problem, but
there may be other conversions (e.g. type:KURL and uchar)
also suffering from such problems; i couldn't trigger a bug there
(not enough time to test),
so i decided not to change anything there ;)

i think that someone should review the whole marshal code regarding these problems

but i'm happy if the patch could be applied to the next version ;)
"cause i'm writing a new program using it" ;)

thx
Comment 12 Miguel Angel 2007-09-01 22:22:41 UTC
Could anybody apply this patch to the KDE source code?
Comment 13 Dario Andres 2010-01-20 21:15:13 UTC
Closing as DCOP is unmaintained.