Bug 119641 - use message encoding for decoding 8bit data in message headers
Summary: use message encoding for decoding 8bit data in message headers
Status: RESOLVED UNMAINTAINED
Alias: None
Product: kmail
Classification: Applications
Component: mime
Version: 1.8.3
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-06 18:40 UTC by Spiro A.
Modified: 2015-04-12 09:56 UTC
0 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Picture describing the Encoding issue (140.77 KB, image/jpeg)
2006-01-06 18:41 UTC, Spiro A.
Saved email (1.03 KB, application/octet-stream)
2006-01-06 19:08 UTC, Spiro A.
Complete Email (1.16 KB, application/octet-stream)
2006-01-06 19:47 UTC, Spiro A.
Another email issue (173.81 KB, image/jpeg)
2006-01-06 19:59 UTC, Spiro A.
Saved email for second email (2.92 KB, application/octet-stream)
2006-01-06 20:00 UTC, Spiro A.
Email (68.35 KB, image/jpeg)
2006-07-12 06:40 UTC, Spiro A.

Description Spiro A. 2006-01-06 18:40:47 UTC
Version:           1.8.3 (using KDE 3.4.3)
Installed from:    Gentoo Packages
OS:                Linux

I run Gentoo amd64 and have my locale set this way:
~ # locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=it_IT.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=en_US.UTF-8
LC_MONETARY=it_IT.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=it_IT.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=it_IT.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

/etc/locales.build
en_US/ISO-8859-1
en_US.UTF-8/UTF-8
it_IT.UTF-8/UTF-8
it_IT/ISO-8859-1
it_IT@euro/ISO-8859-15

I wish to change my /etc/locales.build in my next Gentoo install to this:
en_US.UTF-8/UTF-8
it_IT.UTF-8/UTF-8

My problem so far is that I receive mail containing accented characters such as à è ì ò ù, but they are not always displayed properly.

I attach a picture showing the exact issue, and I have also noted my KMail settings.

I would like to learn what is wrong, as this has been quite annoying so far, and I do not know whether this is a wrong setting or a bug.

Thank you,
Spiro
Comment 1 Spiro A. 2006-01-06 18:41:53 UTC
Created attachment 14153 [details]
Picture describing the Encoding issue
Comment 2 Thiago Macieira 2006-01-06 18:55:33 UTC
Please save, zip and attach such an email here.
Comment 3 Spiro A. 2006-01-06 19:08:21 UTC
Created attachment 14154 [details]
Saved email

I removed sensitive data and replaced it with -----. I also removed the From
header and the rest of the email headers. If they are needed, please let me know.

Moreover, could one of you send me a test email with some
à è ì ò ù characters so I can test with that mail too?

thank you,
Spiro
Comment 4 Thiago Macieira 2006-01-06 19:33:15 UTC
I especially need the headers.
Comment 5 Spiro A. 2006-01-06 19:47:11 UTC
Created attachment 14155 [details]
Complete Email

hope this helps.
Spiro
Comment 6 Spiro A. 2006-01-06 19:57:53 UTC
Hi,

I just received a test message, and this is its body:
This email contains non-ASCII characters.

ǽ€
-- 
  Thiago Macieira  -  thiago (AT) macieira.info - thiago (AT) kde.org
    PGP/GPG: 0x6EF45358; fingerprint:
    E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

1. On frumscafte, hwonne time_t wæs náht, se scieppend þone circolwyrde 
wundorcræftlíge cennede and seo eorðe wæs idel and hit wæs gód.

It seems I can read all the characters properly.
Hmm, I am attaching another email.

Spiro
Comment 7 Spiro A. 2006-01-06 19:59:04 UTC
Created attachment 14156 [details]
Another email issue

I will also attach the email
Comment 8 Spiro A. 2006-01-06 20:00:41 UTC
Created attachment 14157 [details]
Saved email for second email
Comment 9 Thiago Macieira 2006-01-06 23:03:33 UTC
Both emails you attached are invalid. They contain non-ASCII characters in sections where those characters are not allowed.

The first one (attachment 14155 [details]) contains no Content-Type header. As such, the email is mandated to stay within US-ASCII only. Anything else is forbidden.

The second one (attachment 14157 [details]) properly encodes its body, but not the Subject header. If only there were standards to do this... oh, wait! there are. They are only 9 years old...

Your emails are invalid.
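
For readers who want to see what a valid non-ASCII Subject looks like on the wire, here is a minimal sketch using Python's standard email.header module. The standard alluded to above is presumably RFC 2047 (encoded-words). The subject text is a made-up example, not taken from the attachments.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject with accented characters (not from the attachments).
subject = "Perché è così?"

# Encode the subject as an RFC 2047 encoded-word so the header is pure ASCII.
encoded = Header(subject, charset="utf-8").encode()
print(encoded)

# A conforming client can recover the original text from the encoded form.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```

Because the charset travels inside the encoded-word itself, the receiving client does not have to guess it.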

*** This bug has been marked as a duplicate of 95157 ***
Comment 10 Spiro A. 2006-01-07 14:31:33 UTC
I see, then: this is not a bug in KMail but a problem with how the email was sent.
But regarding the attachment from "Additional Comment #1", how come I can see the characters properly if I only switch the encoding (from utf-8 to 8859-1)?

Please advise which encoding I should set for Kontact and KMail, both for Fallback character encoding and for Override character encoding. Consider that I wish to use UTF-8, as I have been advised.

Thank you,
Spiro
Comment 11 Spiro A. 2006-01-17 18:07:59 UTC
Hi,

I feel this case is not clearly resolved, because of my question
about "Additional Comment #1": how come I can see the characters properly if I only switch the encoding (from utf-8 to 8859-1)?

If an email is invalid, it is invalid for all encodings, right?

Moreover, when I am able to view an email properly in the main KMail window, but it is not displayed properly when I double-click it and open it in a new window, then something must be wrong.

So, please let me know, if you have a clue.

Thank you,
Spiro
Comment 12 Thiago Macieira 2006-01-19 18:34:47 UTC
The emails are invalid. Do not reopen.

There MUST NOT be 8-bit characters in the headers and your emails do that. There's no way of guessing which encoding those use, that's why you see blocks in the default encoding. If you change the encoding to the proper charset, you'll see the characters.
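
A small illustration of why unlabelled 8-bit bytes cannot be displayed reliably. This sketch assumes the sender used ISO-8859-1 and the viewer defaults to UTF-8. Both are assumptions for the example, since the attachments are not reproduced here.

```python
# Bytes for "à è ì ò ù" as an ISO-8859-1 sender would put them on the wire.
raw = "à è ì ò ù".encode("iso-8859-1")

# Decoded with the charset the sender actually used, the text is intact.
as_latin1 = raw.decode("iso-8859-1")

# Decoded as UTF-8, each accented byte is invalid and becomes U+FFFD,
# the replacement character that viewers typically draw as a block.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin1)
print(as_utf8)
```

The bytes are the same in both cases. Only the charset chosen by the viewer differs, which is why switching the encoding by hand makes the characters appear.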

Also note that this problem does not happen if you're using a Cyrus IMAP server. All those characters would simply be "X" in all email clients.

*** This bug has been marked as a duplicate of 95157 ***
Comment 13 Spiro A. 2006-07-12 06:38:59 UTC
Hi there,

I am back with the same issue. I am not at all convinced by what was said to mark this case [resolved].
I have yet another email with the same problem, but what bothers me is that the problem is inconsistent, meaning that KMail is able to display what it supposedly cannot.

Here is the issue:
1) In the KMail settings, under Message Window, I have set:
- Fallback character encoding: Western European ( iso-8859-15)
- Override character encoding: Auto
NOTE: If I set Override to UTF-8 or anything else, it does not solve the problem.
2) In my picture "photo.jpg" you can see this discrepancy:
- In the main KMail application window, the pane on the right (Subject, Sender, Date) lists all emails, and among them it displays the wrong Subject field, with the ' missing before the L character.
- If I open this email, I get a new window with this discrepancy:
--> the Subject is displayed both correctly and wrongly.
HOW CAN THIS BE POSSIBLE?

To me it is an issue with KMail. Why? Because if it can display the Subject line correctly "at least once", it means that KMail's handling of the subject line, and perhaps of the body, has some kind of encoding issue.

I will also attach the header of this mail.

Please look into this because, even if the email was sent with the wrong encoding, what is crucial to the end user is that the email can be read. Otherwise, KMail's strict rule of being perfect and not accommodating encoding errors becomes useless, along with KMail itself.

Thank you for your assistance,
Spiro
Comment 14 Spiro A. 2006-07-12 06:39:45 UTC
.
Comment 15 Spiro A. 2006-07-12 06:40:15 UTC
Created attachment 16962 [details]
Email
Comment 16 Spiro A. 2006-07-12 06:49:28 UTC
Hi,

I have some additional info:
I tried changing the Fallback character encoding to Western European (windows-1252), and the subject is now displayed correctly.
So, if this encoding works fine, why does the Override encoding set to Auto not take care of this issue?

To my understanding, this part of KMail should first look at the headers of each email and adapt to the encoding that email declares, overriding the general KMail setting where needed.
Is KMail already doing so? Perhaps this could be a solution, because having Override character encoding set to Auto does not currently resolve the issue.
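
The behaviour requested here (and in the bug title) could look roughly like the following sketch. It is not KMail code. The function name decode_raw_header and its parameters are hypothetical, and the fallback default simply mirrors the setting mentioned earlier in this thread.

```python
def decode_raw_header(raw, declared_charset, fallback="iso-8859-15"):
    """Best-effort decoding of an unencoded 8-bit header value."""
    # Pure ASCII headers need no guessing at all.
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        pass
    # Try the charset from the message's Content-Type, then the fallback.
    for charset in (declared_charset, fallback):
        if not charset:
            continue
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            continue
    # Last resort: keep the ASCII part and mark the undecodable bytes.
    return raw.decode("ascii", errors="replace")


# Raw Subject bytes as reported by the mail filter, with the declared charset.
subject = decode_raw_header(b"L\x92adolescente fa ...", "Windows-1251")
print(subject)
```

This is only a heuristic. The header is still invalid, and the body charset is merely the most plausible guess for what the sending program used in the headers.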
ùì
Thank you for your support,
Spiro 

Here is the header part:
Received-SPF: none (mx02.csee.siteprotect.com: 201.240.114.243 is neither permitted nor denied by domain of ifg.com) client-ip=201.240.114.243; envelope-from=699portie@ifg.com; helo=HUASCARAN.n8ouu6e.net;
 Received: from HUASCARAN.n8ouu6e.net (unknown [201.240.114.243])
        by mx02.csee.siteprotect.com (Postfix) with ESMTP id 5BDC1D8115
        for <mail@siriush.com>; Tue, 11 Jul 2006 17:23:46 -0500 (CDT)
 From: "Milo Bird" <222theodor@gmx.net>
 To: <mail@siriush.com>
 Subject: L’adolescente fa sesso qui
 Date: Tue, 11 Jul 2006 17:23:51 -0500
 MIME-Version: 1.0
 X-Mailer: Microsoft Office Outlook, Build 11.0.5510
 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
 Thread-Index: RJtmLZ2g9TyeBMGMgtNct4CNylGxu5PE3ARF
 Content-Type: text/plain;
  charset="Windows-1251"
 Content-Transfer-Encoding: 8bit
 Message-Id: <20060711222346.5BDC1D8115@mx02.csee.siteprotect.com>
 X-Virus-Scanned: CleanMail 2.5 at mf00
 X-Amavis-Alert: BAD HEADER Non-encoded 8-bit data (char 92 hex) in message header 'Subject': Subject: L\222adolescente fa ...
 X-Spam-Status: Yes, hits=6.953 required=6 tests=[DNS_FROM_RFC_POST=1.376,
 FORGED_RCVD_HELO=0.05, LONGWORDS=2.259, MSGID_FROM_MTA_ID=1.704,
 PORN_URL_SEX=1.427, RCVD_IN_SORBS_DUL=0.137]
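
The X-Amavis-Alert line above can be reproduced with a few lines of Python. This is only a sketch of the kind of check such a filter might perform, not the filter's actual code. The helper name find_8bit_bytes is made up for the example.

```python
def find_8bit_bytes(header_line):
    """Return (offset, byte) pairs for bytes outside US-ASCII."""
    return [(i, b) for i, b in enumerate(header_line) if b >= 0x80]


# The Subject header as raw bytes, with the unencoded byte 0x92.
line = b"Subject: L\x92adolescente fa sesso qui"
hits = find_8bit_bytes(line)

for offset, byte in hits:
    # 0x92 is 222 in octal, matching the "\222" shown in the alert.
    print(offset, hex(byte), oct(byte))
```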
Comment 17 Maksim Orlovich 2006-07-12 07:01:20 UTC
FYI, the message you pasted specifies Windows-1251, which is a Cyrillic encoding, not a Latin one, so it's basically screwed up. Read with the declared charset, you'd get Cyrillic characters instead of accented ones.
I'll leave it to the KMail folks to comment on all the encoding settings and stuff.
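
To make the Cyrillic-versus-Latin point concrete, here is a short Python sketch using the standard cp1251 and cp1252 codecs. The accented sample bytes are illustrative. The single byte 0x92 is the one from the pasted Subject header.

```python
# Bytes that spell "àèìòù" in Windows-1252 (and ISO-8859-1).
raw = b"\xe0\xe8\xec\xf2\xf9"

latin = raw.decode("cp1252")      # Latin reading of the bytes
cyrillic = raw.decode("cp1251")   # Cyrillic reading of the same bytes
print(latin, cyrillic)

# The 0x92 byte from the Subject header decodes to the same
# right single quotation mark (U+2019) in both code pages.
apostrophe_1252 = b"\x92".decode("cp1252")
apostrophe_1251 = b"\x92".decode("cp1251")
print(apostrophe_1252 == apostrophe_1251)
```

So for this particular Subject, the declared Windows-1251 charset would in fact yield the expected apostrophe, even though it would mangle accented letters.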

Comment 18 Spiro A. 2006-07-12 08:22:41 UTC
Yes Maksim, I see and agree with you.
But, being aware of all these encoding issues, would it be worth letting KMail automatically detect the proper encoding from the email headers, when possible, and display that specific mail with the adapted encoding?
All that I aim to see in kmail is being more dynamic toward encoding issues and not being so strict about errors because the bottom line is not perfection but being able to read what I receive without pulling my hairs any time something is not "perfect".

Hope this will provide some ideas to kmail folks,

Thanks,
Spiro
Comment 19 Juha Tuomala 2006-07-12 09:30:26 UTC
> All that I aim to see in kmail is being more dynamic toward encoding 
> issues and not being so strict about errors because the bottom line 
> is not perfection but being able to read what I receive without 
> pulling my hairs any time something is not "perfect". 

On the other hand, that's what the whole web is full of. And the end result is
a whole mess of content where non-IE users fail to use some pages.

IMO following standards is good. Being strict:
 - Makes the non-conformant content producer look like a fool when
   it is found out where the actual problem is. And hopefully act to fix it.
 - Makes programs smaller and faster (and a lot of related benefits)

I personally hate it when something does not open for me because
someone has cut corners, but after all, I'd rather take a strict program and
suffer than endless bloat that guesses what 'it should be'. If
others have a problem, I don't want to pay for it (hw for example).

Tuju :-)
Comment 20 Ingo Klöcker 2006-07-12 11:21:17 UTC
We are apparently using different encodings for the subject shown in the message list and the window caption than for the subject shown in the message viewer. Obviously, this is simply a bug and has to be fixed because it seems to be quite common for spam-tools (?) to create invalid Subject headers. *sigh*

Can you please forward the message (as attachment) to me so that I can check whether this problem still exists in KMail 1.9.3?

A note about the override encoding: The override encoding should usually never be used (or only for specific messages via View->Set Encoding). As the name suggests it overrides all encoding information specified in the message. Maybe "None" would be more appropriate instead of "Auto".
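
One way to read the settings described above is as a simple precedence rule. The sketch below is an interpretation of this comment, not KMail's implementation. The function name effective_charset is hypothetical, and the role of the fallback (used only when the message declares nothing) is an assumption based on the setting's name.

```python
from typing import Optional


def effective_charset(override: Optional[str],
                      declared: Optional[str],
                      fallback: str) -> str:
    """Pick the charset used to display a message."""
    if override:
        # The user forced an encoding: the message's own label is ignored.
        return override
    if declared:
        # Normal case: trust the charset declared in the message.
        return declared
    # Nothing declared: use the configured fallback.
    return fallback


# With no override, the message's own charset wins.
print(effective_charset(None, "windows-1251", "iso-8859-15"))
# With an override, every message is shown in the forced charset.
print(effective_charset("utf-8", "windows-1251", "iso-8859-15"))
```

Under this reading, an override of None corresponds to the "Auto" (or proposed "None") setting.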
Comment 21 Spiro A. 2006-07-12 14:35:09 UTC
Hi

"A note about the override encoding: The override encoding should usually never be used (or only for specific messages via View->Set Encoding). As the name suggests it overrides all encoding information specified in the message. Maybe "None" would be more appropriate instead of "Auto". "
Yes, I completely agree. NONE should also be an option. How can I request that such an option be implemented?

As for the message I will forward it as soon as possible.

Thanks Ingo for your support,
Spiro
Comment 22 Laurent Montel 2015-04-12 09:56:11 UTC
Thank you for taking the time to file a bug report.

KMail2 was released in 2011, and the entire code base went through significant changes. We are currently in the process of porting to Qt5 and KF5. It is unlikely that these bugs are still valid in KMail2.

We welcome you to try out KMail 2 with the KDE 4.14 release and give your feedback.