Summary: | [PATCH] Replying to address with umlaut and comma creates two addressees | ||
---|---|---|---|
Product: | [Unmaintained] kmail | Reporter: | tropikhajma <tropikhajma> |
Component: | mime | Assignee: | Ingo Klöcker <kloecker> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | adam, andreaswuest, bernhard, clcevboxvjeo, coolo, lure, mh+kde-bugs, mueller, ojo, ovit.debian, timo, torsten.irlaender |
Priority: | NOR | ||
Version: | SVN trunk (KDE 4) | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | patch against proko2 |
Description
tropikhajma
2006-04-21 16:03:56 UTC
Commas have to be quoted, yes. It looks like a bug in the mailer software, not in KMail.

I played with it a bit and found out that an email sent by KMail through SMTP with the "From" header

From: "aaa, bbb" <name@domain.com>

will arrive with the "From" header

From: =?us-ascii?Q?aaa=2C=20bbb?= <name@domain.com>

so the missing quoting may be a bug in the SMTP program (sendmail)? Furthermore, when trying to set up the "From" field (from within the GUI) without the double quotes but containing a comma, KMail does not use this value at all and the "From" header contains the default value instead. I would expect at least an error message here. Should I file a bug for this?

Check what the email saved in your sent-mail folder has. That's what KMail sent to your SMTP server. If that's different from what was received, then it was changed somewhere along the line.

I've checked it with ethereal and it really seems to be a fault of the SMTP server or something behind it.

I am reopening the bug, because I think the =?us-ascii?Q?Surname=2C=20Name?= encoding already is a phrase, thus there cannot be further encoding around it. This makes the behaviour a failure with major severity, as it affects interoperability with other email clients that send this correct encoding (and have no other choice with non-ASCII names). Here is the reference for this interpretation:

rfc2822 (proposed standard)
> 3.2.6. Miscellaneous tokens
> [..]
> word = atom / quoted-string
> phrase = 1*word / obs-phrase
> [..]
> 3.4. Address Specification
> [..]
> mailbox = name-addr / addr-spec
> name-addr = [display-name] angle-addr
> [..]
> display-name = phrase

rfc2047 (draft standard)
> 5. Use of encoded-words in message headers
> [..]
> (3) As a replacement for a 'word' entity within a 'phrase', for example, one that precedes an address in a From, To, or Cc header.
> The ABNF definition for 'phrase' from RFC 822 thus becomes:
> phrase = 1*( encoded-word / word )

As you can see from the syntax definition above, a phrase can contain several words and encoded-words. But only words can be quoted-strings with DQUOTE (") characters around them. Encoded-words MUST NOT be enclosed in DQUOTE characters.

But that section (3) in RFC 2047 goes on:

> In this case the set of characters that may be used in a "Q"-encoded 'encoded-word' is restricted to: <upper and lower case ASCII letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_" (underscore, ASCII 95.)>. An 'encoded-word' that appears within a 'phrase' MUST be separated from any adjacent 'word', 'text' or 'special' by 'linear-white-space'.

That excludes ",", does it not?

Till,

| the set of characters that may be used in a "Q"-encoded
| 'encoded-word' is restricted to [..]
| That excludes ",", does it not?

It does exclude "," in the encoded-word, but not in the word to be encoded (= the word you get when you decode). Exclusion of "," means that you must encode this character if it is in the string to be encoded.

This is definitely a bug in KMail, even though it's also IMO an abuse of RFC2047-encoding, which was invented for encoding non-ASCII characters and not for encoding commas. The latter is what quoted strings were invented for in RFC822. The problem here is that KMail treats RFC2047-encoding more or less as a transport encoding. All (or at least most) operations are done with the decoded header values. This is obviously based on a wrong assumption, namely that the RFC2047 decoder transforms a valid email address into a valid email address (except for the fact that it may contain non-ASCII characters). The problem is that the RFC2047 decoder doesn't know anything about email address syntax.
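To make the failure mode concrete, here is a minimal sketch in Python (not KMail code) of what happens when a header is RFC2047-decoded first and parsed as an address list afterwards. The helper `decode_encoded_words()` is a hypothetical name introduced for this illustration; `decode_header()` and `getaddresses()` are from Python's standard `email` package and merely stand in for KMail's decoder and address splitter.

```python
from email.header import decode_header
from email.utils import getaddresses


def decode_encoded_words(value: str) -> str:
    """Decode RFC 2047 encoded-words without any knowledge of address syntax."""
    pieces = []
    for part, charset in decode_header(value):
        # decode_header() yields bytes for headers containing encoded-words
        # and plain str for headers that contain none.
        if isinstance(part, bytes):
            part = part.decode(charset or "ascii")
        pieces.append(part)
    return "".join(pieces)


# Raw header as received: a single mailbox, with the comma hidden as =2C.
raw = "=?us-ascii?Q?Surname=2C=20Name?= <name@domain.com>"

# Problematic order: decode the whole header value, then parse addresses.
decoded = decode_encoded_words(raw)
addresses = getaddresses([decoded])

print(decoded)    # Surname, Name <name@domain.com>
print(addresses)  # [('', 'Surname'), ('Name', 'name@domain.com')]
```

The decoded string contains a bare comma, so the address parser reports two addressees, which is the behaviour described in this bug.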
A possible solution would be to extend normalizeAddressesAndDecodeIDNs() (or write a similar function) so that it accepts RFC2047-encoded addresses and, during normalization, makes sure the display name is correctly quoted whenever necessary. In fact, normalizedAddress() already adds quotes whenever necessary. So, in normalizeAddressesAndDecodeIDNs(), after

KPIM::splitAddress( (*it).utf8(), displayName, addrSpec, comment )

the RFC2047 decoder has to be applied to displayName and comment. Of course, this means that we also have to pass the raw, possibly RFC2047-encoded, header value to normalizeAddressesAndDecodeIDNs().

(Yes, the correct solution would be to use KMime and its email address class-tree for this, but KMime's email address parser only accepts ASCII text (IIRC) and throws away any comments it encounters, so we can't use it atm.)

Thanks for the comment Ingo, it matches what I had come up with myself, pretty much. Does the attached patch (against proko2, thus containing a few imports from libemailfunctions we didn't have before) look alright to you?

Created attachment 16010
patch against proko2
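
For readers who want to see the proposed order of operations in isolation, the following is a rough Python sketch of it, not the attached patch. It splits the still-encoded header first, decodes only the display name, and then re-quotes. `normalize_address()` and `decode_encoded_words()` are hypothetical names; `parseaddr()` and `formataddr()` from Python's standard library stand in for KPIM::splitAddress() and normalizedAddress(), and unlike splitAddress() they do not return the comment separately.

```python
from email.header import decode_header
from email.utils import formataddr, parseaddr


def decode_encoded_words(value: str) -> str:
    """Decode RFC 2047 encoded-words in a single token (no address parsing)."""
    pieces = []
    for part, charset in decode_header(value):
        if isinstance(part, bytes):
            part = part.decode(charset or "ascii")
        pieces.append(part)
    return "".join(pieces)


def normalize_address(raw: str) -> str:
    """Normalize one raw, possibly RFC 2047-encoded, mailbox."""
    # 1. Split while the header is still encoded, so a comma hidden in an
    #    encoded-word cannot be mistaken for an address separator.
    display_name, addr_spec = parseaddr(raw)
    # 2. Apply the RFC 2047 decoder to the display name only.
    display_name = decode_encoded_words(display_name)
    # 3. Rebuild the address; formataddr() adds double quotes when the
    #    display name contains specials such as ",".
    return formataddr((display_name, addr_spec))


print(normalize_address("=?us-ascii?Q?Surname=2C=20Name?= <name@domain.com>"))
# "Surname, Name" <name@domain.com>
```

With this order the decoded result is a quoted-string display name, so a later parse of the normalized value still yields a single addressee.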
Ingo, BTW I am pretty sure that this is no abuse: if there is a comma and a non-ASCII char in the realname string, you can only use an encoded-word from RFC2047, otherwise it would no longer fit the grammar in rfc822 or rfc2822.

Ping Ingo, did this patch go in?

It's not that easy to apply to KDE 3.5 because KMMsgBase::decodeRFC2047String() can't simply be moved to libemailfunctions.

*** Bug 142810 has been marked as a duplicate of this bug. ***

The note in RFC 2047, 6.2 makes this more clear:
>NOTE: Decoding and display of encoded-words occurs *after* a
> structured field body is parsed into tokens. It is therefore
> possible to hide 'special' characters in encoded-words which, when
> displayed, will be indistinguishable from 'special' characters in the
> surrounding text. For this and other reasons, it is NOT generally
> possible to translate a message header containing 'encoded-word's to
> an unencoded form which can be parsed by an RFC 822 mail reader.
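
As an illustration of how a 'special' ends up hidden inside an encoded-word, here is a small Python sketch of Q-encoding a display name under the character restriction RFC 2047 section 5 (3) places on phrases. The function `q_encode_phrase_word()`, the name "Müller, Anna" and the address are hypothetical, and the sketch ignores the 75-character limit on encoded-words.

```python
import string
from email.utils import parseaddr

# Bytes that may appear literally in a "Q"-encoded word inside a phrase.
# "=" and "_" are left out on purpose: "=" introduces a hex escape and
# "_" stands for a space, so literal occurrences must be escaped too.
SAFE = set((string.ascii_letters + string.digits + "!*+-/").encode("ascii"))


def q_encode_phrase_word(text: str, charset: str = "utf-8") -> str:
    """Q-encode text as a single encoded-word usable inside a 'phrase'."""
    payload = "".join(
        chr(byte) if byte in SAFE else f"={byte:02X}"
        for byte in text.encode(charset)
    )
    return f"=?{charset}?Q?{payload}?="


word = q_encode_phrase_word("Müller, Anna")
header = f"{word} <anna@example.org>"

print(header)
# =?utf-8?Q?M=C3=BCller=2C=20Anna?= <anna@example.org>

# The raw header contains no comma, so an address parser sees one mailbox
# whose display name is a single (still encoded) token.
print(parseaddr(header))
```

A name containing both an umlaut and a comma cannot be written as a plain quoted-string in an ASCII-only header, which is why a sender ends up with a form like the one printed above.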
*** Bug 137984 has been marked as a duplicate of this bug. ***

*** Bug 145508 has been marked as a duplicate of this bug. ***

*** Bug 160357 has been marked as a duplicate of this bug. ***

SVN commit 800168 by ervin:

Apply the RFC2047 decoding inside of normalizeAddressesAndDecodeIDNs() as advised by Ingo. Use the RFC2047 implementation of kmime for that matter (yes, we have at least three implementations of this rfc in kdepim). That fixes 126025 in the enterprise branch (forward port on trunk to follow).

CCBUG: 126025

M +16 -6 kmail/kmmessage.cpp
M +1 -1 libemailfunctions/Makefile.am
M +5 -1 libemailfunctions/email.cpp
M +1 -1 libemailfunctions/tests/Makefile.am
M +11 -0 libemailfunctions/tests/testemail.cpp
M +1 -0 libkcal/Makefile.am
M +5 -0 libkmime/kmime_util.cpp
M +7 -0 libkmime/kmime_util.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=800168

SVN commit 800178 by ervin:

Apply the RFC2047 decoding inside of normalizeAddressesAndDecodeIdn() as advised by Ingo. Use the RFC2047 implementation of kmime for that matter (yes, we have at least three implementations of this rfc in kdepim). Forwardport of r800168 for kdepimlibs.

CCBUG: 126025

M +6 -0 kmime/kmime_util.cpp
M +8 -0 kmime/kmime_util.h
M +2 -1 kpimutils/CMakeLists.txt
M +5 -1 kpimutils/email.cpp
M +11 -0 kpimutils/tests/testemail.cpp

WebSVN link: http://websvn.kde.org/?view=rev&revision=800178

SVN commit 800182 by ervin:

Since the RFC2047 decoding is now done in normalizeAddressesAndDecodeIdn(), use the raw headers for the relevant addresses related fields in KMMessage. Forwardport of 800168 for kdepim. That fixes 126025 in trunk.

BUG: 126025

M +16 -6 kmmessage.cpp

WebSVN link: http://websvn.kde.org/?view=rev&revision=800182

*** Bug 166550 has been marked as a duplicate of this bug. ***