Bug 95733 - KMail should always encode as base64 any non text/* and non message/* MIME part
Status: RESOLVED FIXED
Alias: None
Product: kmail2
Classification: Applications
Component: general
Version: 5.3.0
Platform: Debian testing Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
Duplicates: 100552 114355 115267
Reported: 2004-12-23 19:44 UTC by Georg Baum
Modified: 2018-04-09 20:10 UTC
CC List: 11 users

Version Fixed In: 5.5.1


Attachments
pdf file that is attached correctly (12.64 KB, application/pdf)
2004-12-23 19:45 UTC, Georg Baum
message with working pdf file (18.20 KB, message/rfc822)
2004-12-23 19:46 UTC, Georg Baum
Not working pdf file (65.80 KB, application/pdf)
2004-12-23 19:48 UTC, Georg Baum
message with not working pdf file (62.69 KB, text/plain)
2004-12-23 19:50 UTC, Georg Baum
This is the PDF she's been struggling with all day :) (52.97 KB, application/pdf)
2005-04-10 23:24 UTC, Frederik Dannemare

Description Georg Baum 2004-12-23 19:44:10 UTC
Version:            (using KDE 3.3.2)
Installed from:    Debian testing/unstable Packages
OS:                Linux

When I attach some pdf files to a message, they are not encoded correctly with Content-Transfer-Encoding: base64, but are stored as plain text with Content-Transfer-Encoding: 8bit.
The line endings are converted from DOS style to UNIX, thereby making the pdf file unreadable. With other pdf files everything is ok. I used the package kmail_3.3.2-0pre1_i386.deb from http://pkg-kde.alioth.debian.org/kde-3.3.2/ on debian unstable, but kmail 3.3.1 from debian unstable has the same behaviour.
I am attaching two pdf files (one working and one nonworking) and two messages that kmail produced with each file attached.
When I run the 'file' command on both files it says in both cases "PDF document, version 1.4". Or maybe this is a library issue?
In any case, it would be nice if this could be fixed.

Thanks,


Georg
Comment 1 Georg Baum 2004-12-23 19:45:26 UTC
Created attachment 8786
pdf file that is attached correctly

This file can be attached and is correctly encoded with base64.
Comment 2 Georg Baum 2004-12-23 19:46:55 UTC
Created attachment 8787
message with working pdf file

This is the message created by kmail with the working pdf file attached.
Comment 3 Georg Baum 2004-12-23 19:48:27 UTC
Created attachment 8788
Not working pdf file

This is the pdf file that gets incorrectly encoded in 8bit
Comment 4 Georg Baum 2004-12-23 19:50:18 UTC
Created attachment 8790
message with not working pdf file

This is the message as produced by kmail with the incorrectly encoded pdf file
attached.
Comment 5 Thiago Macieira 2004-12-23 19:54:04 UTC
What file type does Konqueror/KMail show for the file that doesn't get correctly attached?

If you change the file type to application/pdf, does it get sent correctly?
Comment 6 Georg Baum 2004-12-23 20:17:00 UTC
In both cases kmail shows "PDF-Dokument" (German translation) and the icon is the one for pdf files. If I right click on the attachment, I can see that the file type is indeed application/pdf, but the encoding is 8 bit. If I change this manually to base64, the file gets sent correctly. But of course the automatic setting should work for known mime types such as application/pdf.
Comment 7 Thiago Macieira 2004-12-23 22:34:28 UTC
Allow me to change the Summary.

Encoding as 8bit would require that the file's own binary data be inserted into the MIME part. It could make sense for MIME-Related files such as MHTML, but it doesn't for emails (RFC 2822).

Therefore, the only solution is to encode as Base64 any file attachment that is not of type text/*.
Comment 8 Thiago Macieira 2004-12-23 22:50:48 UTC
I can also confirm this bug.

Try attaching a text file with CRLF line-endings, but called something like test.jpg and sending it to yourself.

test.jpg is the original, test2.jpg is the one that KMail saved after receiving:
-rw-rw-r--  1 thiago thiago 28 2004-12-23 19:47 /tmp/test.jpg
-rw-------  1 thiago thiago 26 2004-12-23 19:48 /tmp/test2.jpg

$ od -tx1 /tmp/test.jpg
0000000 48 65 6c 6c 6f 2c 20 57 6f 72 6c 64 0d 0a 48 65
0000020 6c 6c 6f 2c 20 57 6f 72 6c 64 0d 0a

$ od -tx1 /tmp/test2.jpg
0000000 48 65 6c 6c 6f 2c 20 57 6f 72 6c 64 0a 48 65 6c
0000020 6c 6f 2c 20 57 6f 72 6c 64 0a
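
The two dumps differ only in the line terminators: 28 bytes with CRLF before sending, 26 bytes with LF after. A minimal Python sketch of that effect, assuming some step in the send/receive path simply rewrites CRLF to LF (the normalize_line_endings helper is hypothetical and is not KMail code), and of why a base64 body would not be affected by it:

```python
import base64

# Same 28 bytes as /tmp/test.jpg in the od dump above.
original = b"Hello, World\r\nHello, World\r\n"


def normalize_line_endings(body: bytes) -> bytes:
    """Hypothetical stand-in for a text-mode step that rewrites CRLF to LF."""
    return body.replace(b"\r\n", b"\n")


# Sent unencoded (8bit): the payload itself gets rewritten.
received_8bit = normalize_line_endings(original)

# Sent as base64: only the encoded text passes through the rewrite,
# and it contains no CR bytes, so the decoded payload is unchanged.
encoded = base64.encodebytes(original)
received_b64 = base64.decodebytes(normalize_line_endings(encoded))

print(len(original), len(received_8bit), len(received_b64))  # 28 26 28
```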

Comment 9 Thiago Macieira 2005-03-31 03:26:36 UTC
*** Bug 100552 has been marked as a duplicate of this bug. ***
Comment 10 Frederik Dannemare 2005-04-10 23:19:40 UTC
I'm also bitten by this one (well, actually my girlfriend is, but I get yelled at when things are broken :) ).

People, we should vote this one up a bit. It's bound to bite many users and it is rather difficult for people to figure out why (well, at least it confused the hell out of me :) ).
Comment 11 Frederik Dannemare 2005-04-10 23:24:34 UTC
Created attachment 10572
This is the PDF she's been struggling with all day :)

Attached via kmail (kontact) this one gets incorrectly marked as
quoted-printable by default (Kubuntu 5.04, KDE 3.4.0, KMail 1.8).
Comment 12 Frederik Dannemare 2005-04-11 01:15:12 UTC
It seems that it doesn't even work when manually setting the attached PDF as base64. It will produce a corrupted PDF as well. I'm puzzled.

Could somebody please take the PDF I have attached to this report and try it out with the most current KMail?

===================================================================

Headers from KMail:

User-Agent: KMail/1.8
Mime-Version: 1.0
Content-Type: multipart/mixed;
  boundary="=_host.kl-teknik.com-32153-1113174595-0001-2"
Message-Id: <200504110110.47305.mail@margith.dk>
X-UID: 1786
X-Length: 70906

This is a MIME-formatted message.  If you see this text it means that your
E-mail software does not support MIME-formatted messages.

--=_host.kl-teknik.com-32153-1113174595-0001-2
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline


--=_host.kl-teknik.com-32153-1113174595-0001-2
Content-Type: application/pdf; name="regnearter.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="regnearter.pdf"

===================================================================


Sending via e.g. Squirrelmail works and produces this (for comparison):

User-Agent: SquirrelMail/1.4.4
MIME-Version: 1.0
Content-Type: multipart/mixed;
  boundary="----=_20050410224453_85097"
X-Priority: 3 (Normal)
Importance: Normal
X-UID: 36
X-Length: 75386

------=_20050410224453_85097
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit


------=_20050410224453_85097
Content-Type: application/pdf; name="regnearter.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="regnearter.pdf"

===================================================================


(btw, just as a side note: kontact crashes very often when I press V to see the email source - I'll have to investigate...).
Comment 13 Kevin Stefanik 2005-07-01 18:41:37 UTC
I think this also affects incoming mail from other packages where the type is 'quoted printable' but the file is a pdf or something similarly misclassified as text.

See: http://support.microsoft.com/default.aspx?scid=kb;en-us;178241
Might be related: http://two.pairlist.net/pipermail/reportlab-users/2004-April/002909.html
Comment 14 Thiago Macieira 2005-07-02 03:40:50 UTC
No, we treat incoming text/* MIME parts correctly. File a bug against the sender if the file was wrongly transmitted.
Comment 15 Allen Chan 2005-08-12 19:54:32 UTC
kmail needs to detect the properly encoded pdf files as binary files, not text files.  I can confirm that the "not working pdf file" attachment from #3 is encoded improperly in kmail even though it DOES have the proper binary data in a comment in line 2 of the file.

At the very least, kmail should be fixed so that manually overriding the encoding type to "Base 64" works.  I can confirm the result from #12 that the override doesn't seem to make a difference if kmail initially detects the file as text.

See http://two.pairlist.net/pipermail/reportlab-users/2004-April/002907.html
for a more detailed discussion of the pdf encoding.
Comment 16 Thiago Macieira 2005-08-13 05:11:38 UTC
Let me summarise what I think KMail should do. It should encode as base64 anything that:

1) contains NULs
2) contains stray CR and LF
3) contains very long lines (possibly 1)
4) MIME type is not text/* or message/* or multipart/*

Point #2 may be relaxed if it detects that a file is a text file and those stray characters are in fact the line terminators (such as Unix text files). In that case, the CRs and LFs must all be converted to CRLF before transmission.

Text attachments (text/*) can be base64-encoded if KMail decides quoted-printable would make the output actually larger.
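
For illustration only, the four conditions above could be sketched in Python roughly as follows. This is not KMail or KMime code: the function name needs_base64 is made up, the 998-byte threshold for "very long lines" is an assumption borrowed from the line length limit in RFC 2822, and the relaxation for Unix text files described for point #2 is deliberately left out, so LF-only text is treated strictly here.

```python
MAX_LINE_LENGTH = 998  # assumed threshold: RFC 2822 line limit, excluding CRLF


def needs_base64(mime_type: str, data: bytes) -> bool:
    """Return True if any of the four conditions from the list above holds."""
    main_type = mime_type.split("/", 1)[0].lower()
    # 4) MIME type is not text/*, message/* or multipart/*
    if main_type not in ("text", "message", "multipart"):
        return True
    # 1) contains NULs
    if b"\x00" in data:
        return True
    # 2) contains a stray CR or LF, i.e. one that is not part of a CRLF pair
    without_crlf = data.replace(b"\r\n", b"")
    if b"\r" in without_crlf or b"\n" in without_crlf:
        return True
    # 3) contains very long lines
    return any(len(line) > MAX_LINE_LENGTH for line in data.split(b"\r\n"))


print(needs_base64("application/pdf", b"%PDF-1.4\r\n"))   # True (rule 4)
print(needs_base64("text/plain", b"hello\r\nworld\r\n"))  # False
```

A real implementation would also need the CR/LF to CRLF conversion for text files and the size comparison against quoted-printable mentioned above; neither is shown here.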
Comment 17 Thiago Macieira 2005-10-13 19:33:15 UTC
*** Bug 114355 has been marked as a duplicate of this bug. ***
Comment 18 Lars Ivar Igesund 2005-11-04 14:14:39 UTC
I'm using KDE 3.5 Beta 2 for Kubuntu.

Ok, I've been bitten by this bug today, as I tried to send some job applications as PDF. I tried some different things, but mostly it seems that some information goes missing when sent from KMail.

Case 1: I sent two PDF docs (application and CV) to a company. They answered, saying that they got the CV correctly, but the other was missing. I sent the application again, and they said they just got a .dat file. After looking closer, they also found a .dat file in the first mail. I then tried sending the application as .doc and .odt, with the same result (they became .dat files). After looking some more, they were able to open the .pdf anyway, once they knew that it was a pdf.

Case 2: I sent it to myself. From KMail, to Kmail. It worked wonderfully.

Case 3: I sent it to my job address, Exchange server with Outlook client. The first PDF got a generic name + .att. Opening it with acroread worked great. The second file was named correctly.

Case 4: I sent it to my gmail account. Gmail agreed that both files were PDFs, although it wasn't able to get the filename (having Norwegian characters) right.

Case 5: I sent it to myself, from KMail, to Squirrel Webmail. Squirrel correctly detected both files as PDF, but wasn't able to get the file name right.

If companies can't get my job applications correctly, I feel this is somewhat critical for me, at least since I otherwise enjoy KMail.
Comment 19 Lars Ivar Igesund 2005-11-04 14:17:06 UTC
Oh, and KMail sets the encoding to base64 in all the cases above.
Comment 20 Thiago Macieira 2005-11-04 14:40:50 UTC
Your problem is completely unrelated. To work around it, just don't send files with non-ASCII characters in their names.
Comment 21 Thiago Macieira 2006-03-12 20:04:46 UTC
*** Bug 115267 has been marked as a duplicate of this bug. ***
Comment 22 Joachim Wagner 2009-02-12 19:22:36 UTC
Bug still exists on:

KMail 1.9.6 (OpenSuSE 10.3 installation)
KMail 1.9.9 (OpenSuSE 11.0 installation)
KMail 1.10.3 (OpenSuSE 11.1 Live CD)

Test: Compose new mail, attach kmailtest-full.pdf ("Not working pdf file" from Georg Baum), observe unsuitable encoding "quoted-printable" in composer window.

I checked my sent-mail folder with
$ ~/Mail/sent-mail/cur> find -type f -print0 | xargs --null fgrep -h "Content-Type:" -A 3 | fgrep application/pdf -A 3 | fgrep "Content-Transfer-Encoding:" | sort | uniq -c
    212 Content-Transfer-Encoding: base64
      1 Content-Transfer-Encoding: quoted-printable
but closer inspection showed that the PDF was inside a forwarded mail that was generated by MS Office Outlook. Note that this PDF is NOT damaged. Thinking about it, I cannot really see what should be wrong with encoding a binary file with quoted-printable (except for efficiency). Anything that is not printable is quoted in hexadecimal (for example =3D for equal signs in HTML). Reading comment #8, the problem seems to be that some line break conversion is going on. Is this maybe the real bug? Why is the MS Outlook PDF I found in my sent-mail folder not damaged?
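
One possible explanation, offered here only as a sketch and not as a statement about what KMail or Outlook actually do: quoted-printable is safe for binary data only if the encoder also escapes CR and LF (as =0D and =0A) instead of emitting them as literal line breaks. Python's binascii module exposes both behaviours through the istext flag of b2a_qp, which makes the difference easy to see. The to_unix_newlines helper is a hypothetical stand-in for a CRLF-to-LF conversion somewhere in transit.

```python
import binascii

data = b"Hello, World\r\nHello, World\r\n"

# Text mode (the default): CRLF stays a literal line break in the output.
qp_text = binascii.b2a_qp(data)
# Binary mode: CR and LF are escaped as =0D and =0A.
qp_binary = binascii.b2a_qp(data, istext=False)


def to_unix_newlines(body: bytes) -> bytes:
    """Hypothetical stand-in for a CRLF-to-LF conversion in transit."""
    return body.replace(b"\r\n", b"\n")


# Literal line breaks are exposed to the conversion, escaped ones are not.
print(binascii.a2b_qp(to_unix_newlines(qp_text)) == data)    # False
print(binascii.a2b_qp(to_unix_newlines(qp_binary)) == data)  # True
```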

Hope this helps you guys to pinpoint the bug. Best regards, JJ

Comment 23 Joachim Wagner 2013-02-14 11:59:53 UTC
Bug still present in Kontact 4.8.5 (OpenSUSE 12.2).

Georg's PDF is attached as quoted-printable and arrives 1 byte shorter than the original.

Good news: manually selecting a base64 encoding (right click on attachment line - Properties - Encoding) now fixes the problem ("cmp" shows no transmission errors).

JJ
Comment 24 Denis Kurz 2016-09-24 18:20:07 UTC
This bug has only been reported for versions before 4.14, which have been unsupported for at least two years now. Can anyone tell if this bug is still present?

If no one confirms this bug for a Framework-based version of kmail2 (version 5.0 or later, as part of KDE Applications 15.12 or later), it gets closed in about three months.
Comment 25 Thiago Macieira 2016-09-24 18:51:24 UTC
Changing back to Confirmed, as this can still be reproduced with KMail 5.3, Frameworks 5.26. I've just sent myself an email with the PDF from attachment 8788 and the sent email had:

--nextPart1803533.O5pIifQRu1
Content-Disposition: attachment; filename="kmailtest-full.pdf"
Content-Transfer-Encoding: quoted-printable
Content-Type: application/pdf; name="kmailtest-full.pdf"

So it did detect as application/pdf (not text/*) and still sent as quoted printable.
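
A check like the one above can be scripted for anyone who wants to test their own sent mail. The following is a rough Python sketch using the standard email package; the function name suspicious_parts is made up for this example, and treating a missing Content-Transfer-Encoding header as 7bit is the MIME default rather than anything KMail-specific.

```python
from email import message_from_bytes


def suspicious_parts(raw_message: bytes):
    """Yield (content type, transfer encoding) for every leaf part that is
    neither text/* nor message/* and is not base64-encoded."""
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        if part.get_content_maintype() in ("text", "message"):
            continue
        # A missing header means 7bit according to the MIME default.
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        if cte != "base64":
            yield part.get_content_type(), cte
```

Calling list(suspicious_parts(raw)) on the bytes of a saved message gives an empty list if every non-text attachment was sent as base64; a message containing the part quoted above would be reported as ("application/pdf", "quoted-printable").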
Comment 26 Daniel Vrátil 2017-05-08 07:22:14 UTC
Git commit a03f5d84e98801a75ea6ed3e74f6e6a6f8597c46 by Daniel Vrátil.
Committed on 08/05/2017 at 07:20.
Pushed by dvratil into branch 'Applications/17.04'.

Use base64 encoding for all non-text attachments

KMime::encodingForData() uses character occurrence frequency to determine
which encoding to use. This does not work for binary attachments because
any characters can appear there. If a wrong encoding is chosen, CRLF/LF
transformations can break the attachment.

Differential Revision: https://phabricator.kde.org/D5737
FIXED-IN: 5.5.1

M  +21   -4    messagecore/src/attachment/attachmentpart.cpp

https://commits.kde.org/messagelib/a03f5d84e98801a75ea6ed3e74f6e6a6f8597c46