Bug 41558

Summary: KMail put rfc2231-encoded parameter values into DQUOTES.
Product: [Unmaintained] kmail Reporter: Andrey Cherepanov <sibskull>
Component: general Assignee: kdepim bugs <kdepim-bugs>
Status: RESOLVED FIXED    
Severity: grave CC: Hermann.Zheboldov
Priority: NOR    
Version: 1.4   
Target Milestone: ---   
Platform: RedHat Enterprise Linux   
OS: Linux   
Attachments: KMail patch for Outlook compatible attachment names encoding

Description Andrey Cherepanov 2002-04-24 07:28:32 UTC
(*** This bug was imported into bugs.kde.org ***)

Package:           kmail
Version:           1.4 (using KDE 3.0.0 )
Severity:          normal
Installed from:    Red Hat Linux 7.2.92
Compiler:          gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-109)
OS:                Linux (i686) release 2.4.7-10
OS/Compiler notes: 

When I compose messages whose subject and attachment names use non-Latin characters, the message I get has an encoded subject and an attachment name like koi8-r''%F0%D2%C9%CC%CF%D6%C5%CE%C9%D1%20%CB%20%D0%C9%D3%D8%CD%D5%20%F2%E1%EF%2Ezip
But users who read my e-mail in Outlook Express see a wrong subject (not in koi8-r encoding) and attachments named like ATT00*.

Please add an option such as "Don't encode subject and attachment names" to the KMail configuration dialog.

(Submitted via bugs.kde.org)
(Called from KBugReport dialog)
Comment 1 Marc Mutz 2002-04-24 08:58:47 UTC
On Wednesday 24 April 2002 09:28 sibskull@mail.ru wrote:
<snip>
> When I compose messages whose subject and attachment names use non-Latin
> characters, the message I get has an encoded subject and an attachment
> name like
> koi8-r''%F0%D2%C9%CC%CF%D6%C5%CE%C9%D1%20%CB%20%D0%C9%D3%D8%CD%D5%20%F2%E1%EF%2Ezip

Actually the subject should look like
Subject: =?koi8-r?q?=F0=D2=C9....?=
and the attachment's filename is encoded with the %-thingy? Right?

That's valid rfc2047/rfc2231 and the only way to encode these things.
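
For reference, a minimal Python sketch of the two encodings mentioned here. It uses only the standard library and the attachment name quoted in the report. The helper name decode_ext_value is made up for illustration, and building a Subject from the filename text is only a demonstration; none of this is KMail code.

```python
from email.header import Header
from urllib.parse import unquote_to_bytes

def decode_ext_value(ext_value):
    """Decode an RFC 2231 extended value: charset'language'percent-encoded octets."""
    charset, language, encoded = ext_value.split("'", 2)
    return unquote_to_bytes(encoded).decode(charset), charset, language

# Attachment name exactly as quoted in the report (RFC 2231 form).
reported = ("koi8-r''%F0%D2%C9%CC%CF%D6%C5%CE%C9%D1%20%CB%20"
            "%D0%C9%D3%D8%CD%D5%20%F2%E1%EF%2Ezip")
filename, charset, language = decode_ext_value(reported)
print(filename)

# The same text as an RFC 2047 encoded-word, the form that belongs in Subject.
# (Illustration only: the real subject of the reporter's mail is not known.)
subject = Header(filename, charset="koi8-r").encode()
print("Subject:", subject)
```

The point of the sketch is only that the %-form is for MIME parameter values and the =?charset?...?= form is for header text such as Subject.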

> But users who read my e-mail in Outlook Express see a wrong subject (not in
> koi8-r encoding) and attachments named like ATT00*.
<snip>

Is the specific charset (koi8-r) the problem? You can change that in Configure 
KMail->Composer->charsets.
If it's not the charset it's Outlook's fault. Complain to Microsoft to add 
support for the (already 4 1/2 years old) rfc2231 standard.
Actually it's also Microsoft's problem if they don't recognize the koi8-r 
charset but I guess they do if you use the right locale there...

Marc

- -- 
Marc Mutz <mutz@kde.org>
Comment 2 Marc Mutz 2002-04-24 15:38:48 UTC
reopen 41558
retitle 41558 KMail put rfc2231-encoded parameter values into DQUOTES.
severity 41558 grave
quit

On Wednesday 24 April 2002 12:58 Andrey S. Cherepanov wrote:
<snip>
> Compare two attachment headers with the same name and look at the difference!
> Outlook Express:
> Content-Type: application/msword;
> name="=?koi8-r?B?Q8/XxdQuZG9j?="

That's completely invalid stuff. Encoded-words are not allowed in 
quoted-strings. I can cite the RFCs if you'd like me to.
Oh well, Microsoft is always right, yes? Please believe me when I say that 
it's Outlook's fault :-(
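
As a side note, the two headers being compared carry the same bytes and differ only in how they are spelled on the wire. A small Python sketch (standard library only, with the values copied from the two headers in this comment) shows this. It says nothing about validity: decode_header will decode an encoded-word wherever it is handed one.

```python
from email.header import decode_header
from urllib.parse import unquote_to_bytes

# Value Outlook Express put inside the quoted string (RFC 2047 encoded-word).
outlook_value = "=?koi8-r?B?Q8/XxdQuZG9j?="
# Value KMail 1.4 sent (RFC 2231 extended value, shown here without the quotes).
kmail_value = "koi8-r''C%CF%D7%C5%D4%2Edoc"

outlook_bytes, outlook_charset = decode_header(outlook_value)[0]
kmail_charset, _language, kmail_encoded = kmail_value.split("'", 2)
kmail_bytes = unquote_to_bytes(kmail_encoded)

print(outlook_bytes == kmail_bytes)            # True
print(outlook_bytes.decode(outlook_charset))   # the file name both clients meant
```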

> ----------------------
> KMail [version 1.4] :(
> Content-Type: application/msword;
>   name*="koi8-r''C%CF%D7%C5%D4%2Edoc"

Eeek. Are the double quotes really there?

Except for the double quotes that's completely correct encoding according to 
rfc2231.
Please type "rfc:2231" into Konqueror's location field, read the document that 
comes up, and then report the bug to Microsoft, where it belongs.

I'll see if I can fix the double quotes problem in KMail.
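
To make the double-quote problem concrete, here is a hedged Python sketch of a strict check for an RFC 2231 extended value. The regular expression is my own simplified reading of the grammar (the charset and language parts are looser than the RFC, and lowercase hex digits are accepted); it is not taken from KMail.

```python
import re

# attribute-char per RFC 2231: printable US-ASCII except SPACE, "*", "'", "%"
# and the MIME tspecials ( ) < > @ , ; : \ " / [ ] ? =
ATTRIBUTE_CHAR = r"[!#$&+\-.0-9A-Z^_`a-z{|}~]"

# Simplified: charset'language'(percent-encoded octet or attribute-char)*
EXT_VALUE = re.compile(
    r"(?P<charset>[A-Za-z0-9._\-]*)'"
    r"(?P<language>[A-Za-z0-9\-]*)'"
    r"(?P<value>(?:%[0-9A-Fa-f]{2}|" + ATTRIBUTE_CHAR + r")*)"
)

def is_valid_ext_value(raw):
    """Return True if raw looks like an unquoted RFC 2231 extended value."""
    return EXT_VALUE.fullmatch(raw) is not None

print(is_valid_ext_value("koi8-r''C%CF%D7%C5%D4%2Edoc"))      # True
print(is_valid_ext_value('"koi8-r\'\'C%CF%D7%C5%D4%2Edoc"'))  # False: wrapped in DQUOTEs
```

A receiver that parses this strictly will reject the quoted form, which is why the quotes have to go.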

> Subject was encoded well but attachments - wrong. And what's parameter
> 'name*' ??
>
> Outlook Express show Subject correct but attachment name wrong :(

Marc

- -- 
Marc Mutz <mutz@kde.org>
Comment 3 Andrey Cherepanov 2004-04-16 12:19:30 UTC
Created attachment 5658 [details]
KMail patch for Outlook compatible attachment names encoding

I am reopening this bug because I still need to send files with Russian names to my
partners who use Outlook Express. Tested on KMail 1.6.1 from KDE 3.2.1.
Additionally, I added an option to the configuration tab for Russian users who have a
similar problem. Now the attachment name is encoded per RFC2047 only if the option
"Outlook compatible attachment encoding" is enabled.
Possibly this option name is ugly. Please correct me.
Comment 4 Andrey Cherepanov 2004-04-16 12:20:55 UTC
I have created the patch. It fixes the problem described in this bug. I guess this bug should now be reopened.
Comment 5 Don Sanders 2004-04-19 06:40:56 UTC
The problem is that AFAIK using RFC2047 to encode attachment names is 
not valid, it's RFC non-compliant, and it's the general consensus of 
the KMail team that having KMail send RFC non-compliant mails is not 
acceptable even as an option.

Personally I think your patch is fine, but I'm pretty sure I'm 
outvoted here. Hence I'm working on an alternative patch that renames 
attachments. Unfortunately this is limited to working on extended 
ASCII characters only, hence Cyrillic will still be a problem.

Don.

Comment 6 Andrey Cherepanov 2004-04-19 07:35:23 UTC
Don, which has the higher priority: a non-official standard (BTW, RFCs, unlike ECMA standards, are NOT official standards!) or real user work? Outlook/Mozilla/Opera encode the 'inline' block (note: only KMail uses the 'attachment' definition) per RFC2047. That is RFC compatible, because the MIME block in these products is defined as 'inline', not 'attachment'. Why must KMail be an incompatible e-mail client, even when its users are ready to help the developers make a really BEST e-mail client!?
I don't understand your religious fanaticism!
Comment 7 Don Sanders 2004-04-20 02:58:06 UTC
> Don, which has the higher priority: a non-official standard
> (BTW, RFCs, unlike ECMA standards, are NOT official standards!)

(About RFC vs ECMA is that relevant? Is there some kind of ECMA/RFC 
conflict? This is a bit offtopic but what makes you say ECMA is 
official over RFC?)

> or 
> real user work? 

In my opinion real user work always comes before standards compliance.

However in this case Ingo, and Marc have said that Content-Disposition 
filenames shall be restricted to US-ASCII or encoding as specified in 
RFC2231. So it's 2-1 and I'm outvoted.
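
A rough Python sketch of that rule: plain US-ASCII filenames go out as an ordinary quoted-string, everything else as an RFC 2231 extended parameter. The function name format_filename_param is hypothetical, and the sketch deliberately leaves out the RFC 2231 continuations needed for very long values.

```python
from urllib.parse import quote

def format_filename_param(filename, charset="utf-8"):
    """Return a Content-Disposition filename parameter (hypothetical helper)."""
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII: RFC 2231 extended form, with no surrounding double quotes.
        encoded = quote(filename.encode(charset), safe="")
        return f"filename*={charset}''{encoded}"
    # Plain US-ASCII: an ordinary quoted-string is enough.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'filename="{escaped}"'

print(format_filename_param("report.doc"))
print(format_filename_param("C\u043e\u0432\u0435\u0442.doc", charset="koi8-r"))
```

Note that quote() leaves "." unescaped, so the second line ends in ".doc" where KMail wrote "%2Edoc"; both spellings decode to the same bytes.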

> Outlook/Mozilla/Opera encode the 'inline' block (note:
> only KMail uses the 'attachment' definition) per RFC2047. That is RFC
> compatible, because the MIME block in these products is defined as
> 'inline', not 'attachment'.

After scanning RFC 2045, 2047, 2231, 2183, and 2184 again I get the 
impression that the only valid way to encode non US-ASCII 
content-disposition filenames is by using the method described in 
RFC2231.

What makes you say that if message parts are defined to have an 
'inline' content-disposition then it's ok to encode the filename 
using RFC2047? I can't see anything in the RFCs that supports this 
position.

> Why must KMail be an incompatible e-mail
> client, even when its users are ready to help the developers make a
> really BEST e-mail client!? I don't understand your religious fanaticism!

As I said, personally I agree with you, and I find the current 
situation to be unsatisfactory for cyrillic language users. But I'm 
outvoted.

BTW there's no simple algorithm to create a latin string to 
approximate a cyrillic string is there? If so that could be used to 
rename cyrillic filenames to US-ASCII filenames.

Don.

Comment 8 Andrey Cherepanov 2004-04-20 08:58:21 UTC
> > Don, which has the higher priority: a non-official standard
> > (BTW, RFCs, unlike ECMA standards, are NOT official standards!)
>
> (About RFC vs ECMA is that relevant? Is there some kind of ECMA/RFC
> conflict? This is a bit offtopic but what makes you say ECMA is
> official over RFC?)
Because an RFC is not the product of an official standardization organization.

> After scanning RFC 2045, 2047, 2231, 2183, and 2184 again I get the
> impression that the only valid way to encode non US-ASCII
> content-disposition filenames is by using the method described in
> RFC2231.
> What makes you say that if message parts are defined to have an
> 'inline' content-disposition then it's ok to encode the filename
> using RFC2047? I can't see anything in the RFCs that supports this
> position.
OK! Let's look at the RFCs together:
RFC2183: "It specifies the "Content-Disposition" header field, which is 
optional and valid for any MIME entity ("message" or "body part")."
In section 2: "Parameter values longer than 78 characters, or which contain 
non-ASCII characters, MUST be encoded as specified in [RFC 2184]."
Go to RFC2184...
"This memo also defines an extension to the encoded words defined in RFC 2047 
to allow the specification of the language to be used for display as well as 
the character set."

I think software corporations use these two RFCs. You said that KMail MUST 
support the standard. Why doesn't KMail support them NOW?


> > Why must KMail be an incompatible e-mail
> > client, even when its users are ready to help the developers make a
> > really BEST e-mail client!? I don't understand your religious fanaticism!
>
> As I said, personally I agree with you, and I find the current
> situation to be unsatisfactory for cyrillic language users. But I'm
> outvoted.
>
> BTW there's no simple algorithm to create a latin string to
> approximate a cyrillic string is there? If so that could be used to
> rename cyrillic filenames to US-ASCII filenames.
It isn't acceptable! Don, please, read RFC2183/2184. 

Comment 9 Don Sanders 2004-04-20 10:48:34 UTC
> OK! Let's look at the RFCs together:
> RFC2183: "It specifies the "Content-Disposition" header field,
> which is optional and valid for any MIME entity ("message" or "body
> part")." In section 2: "Parameter values longer than 78 characters,
> or which contain non-ASCII characters, MUST be encoded as specified
> in [RFC 2184]." Go to RFC2184...
> "This memo also defines an extension to the encoded words defined
> in RFC 2047 to allow the specification of the language to be used
> for display as well as the character set."

Going to RFC2184 and reading it to see what the defined extension is 
leads me to section 4.

"4.  Parameter Value Character Set and Language Information

   Some parameter values may need to be qualified with character set
   or language information....

   Specifically, an asterisk at the end of a parameter name acts as an
   indicator that character set and language information may appear at
   the beginning of the parameter value. A single quote is used to
   separate the character set, language, and actual value information in
   the parameter value string, and an percent sign is used to flag
   octets encoded in hexadecimal.  For example:

     Content-Type: application/x-stuff;
      title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A
"
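
For anyone who wants to check the RFC example mechanically, here is a short Python sketch using the standard email package. The header is the one from the RFC text quoted above; the surrounding message is a bare stub made up for the demonstration.

```python
from email import message_from_string
from email.utils import collapse_rfc2231_value

raw = (
    "Content-Type: application/x-stuff;\n"
    " title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A\n"
    "\n"
)
msg = message_from_string(raw)

# For an RFC 2231 encoded parameter, get_param() returns a
# (charset, language, value) tuple instead of a plain string.
param = msg.get_param("title")
charset, language, _value = param
title = collapse_rfc2231_value(param)

print(charset, language, title)  # us-ascii en-us This is ***fun***
```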

As can be seen in bug:18062, the format used in the example in RFC 2184 
is the one used by KMail. This makes sense, since RFC2231 "Obsoletes: 
2184"; that is, RFC2231 replaces but does not contradict RFC2184.

> I think software corporations use these two RFCs. You said that
> KMail MUST support the standard. Why doesn't KMail support them NOW?

Personally I interpret RFC2047 and RFC2184 as showing that KMail is 
strictly compliant, while Outlook is not. I believe this is 
consistent with Marc and Ingo's interpretations.

> > > Why must KMail be an incompatible e-mail
> > > client, even when its users are ready to help the developers
> > > make a really BEST e-mail client!? I don't understand your
> > > religious fanaticism!
> >
> > As I said, personally I agree with you, and I find the current
> > situation to be unsatisfactory for cyrillic language users. But
> > I'm outvoted.
> >
> > BTW there's no simple algorithm to create a latin string to
> > approximate a cyrillic string is there? If so that could be used
> > to rename cyrillic filenames to US-ASCII filenames.
>
> It isn't acceptable!

This is a shame.

> Don, please, read RFC2183/2184. 

I doubt discussing the RFCs further is going to be productive. 

I agree the current situation is unsatisfactory for cyrillic language 
users, and feel bad about this. If anyone (Ingo?) has a suggestion 
for how to improve the situation I'm interested.

Don.

Comment 10 Andrey Cherepanov 2004-04-20 14:47:47 UTC
> > > BTW there's no simple algorithm to create a latin string to
> > > approximate a cyrillic string is there? If so that could be used
> > > to rename cyrillic filenames to US-ASCII filenames.
> >
> > It isn't acceptable!
>
> This is a shame.
No, it's a real requirement from my partners.

> > Don, please, read RFC2183/2184.
>
> I doubt discussing the RFCs further is going to be productive.
>
> I agree the current situation is unsatisfactory for cyrillic language
> users, and feel bad about this. If anyone (Ingo?) has a suggestion
> for how to improve the situation I'm interested.
I am also glad to discuss this situation. In my opinion, there should be two options in 
KMail settings: encode per the RFC2231 standard (the default, the right way :) and 
encode Content-Disposition per RFC2047 (only for compatibility with other, proprietary 
clients!). But the second variant MUST be implemented, possibly through my 
patch or in another way.

Comment 11 Andrey Cherepanov 2004-04-22 14:03:52 UTC
*** Bug 80123 has been marked as a duplicate of this bug. ***
Comment 12 Ingo Klöcker 2004-05-01 16:18:34 UTC
RFCs are no standards? So you don't recognize the Internet Society as a standardization organisation? I advise you to read http://www.ietf.org/rfc/rfc2026.txt because you obviously don't know what you are talking about. Basically _all_ Internet related protocols and standards are made by the Internet Society and are published in form of RFCs. So your claim that RFCs are no standards and therefore don't need to be followed (that's what you wanted to imply with your claim, isn't it?) is outrageous.

And your talk about "inline" vs. "attachment" is also complete nonsense because this is in no way related to the way the MIME parameter values are encoded. Did you actually read (and understand) http://www.ietf.org/rfc/rfc2183.txt?
Comment 13 Andrey Cherepanov 2004-05-05 08:34:21 UTC
> RFCs are no standards? So you don't recognize the Internet Society as a
> standardization organisation? I advise you to read
> http://www.ietf.org/rfc/rfc2026.txt because you obviously don't know what
> you are talking about. Basically _all_ Internet related protocols and
> standards are made by the Internet Society and are published in form of
> RFCs. So your claim that RFCs are no standards and therefore don't need to
> be followed (that's what you wanted to imply with your claim, isn't it?) is
> outrageous.
>
> And your talk about "inline" vs. "attachment" is also complete nonsense
> because this is in no way related to the way the MIME parameter values are
> encoded. Did you actually read (and understand)
> http://www.ietf.org/rfc/rfc2183.txt?
I'm very sorry for my irascibility. I agree with you that RFCs are 
Internet standards.

I read RFC 2183, and I understand what the Microsoft/Mozilla teams think about 
preferring to show attachments inline (heh! although their products don't support 
this feature in most cases) :)

Ingo, I don't want to argue with you about proper RFC support: you are a better 
developer than I am. I only want to solve my very small problem. Will you 
help me?

Comment 14 Till Adam 2004-05-29 21:15:11 UTC
The "Outlook-compatible attachment naming" option has been added to the development version.