Bug 84654 - Treat text files with mixed line endings as application/octet-stream
Summary: Treat text files with mixed line endings as application/octet-stream
Status: RESOLVED WAITINGFORINFO
Alias: None
Product: kmail
Classification: Applications
Component: mime
Version: 1.6.2
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-07-07 15:13 UTC by Rudolf Kollien
Modified: 2012-08-19 00:51 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments

Description Rudolf Kollien 2004-07-07 15:13:43 UTC
Version:           1.6.2 (using KDE KDE 3.2.3)
OS:                Linux

When an attachment with the MIME type "text/plain" is saved to disk, KMail changes all occurrences of CR/LF to LF. This may corrupt the content of the file. AFAIK all Linux text editors/word processors are able to correctly handle "DOS" text files. So this "feature" makes no sense to me.

Background: We exchange files with our customers which contain pure ASCII characters and CR/LF combinations. The file is (correctly?) recognized by the sending MUA as a text file with the MIME type "text/plain" (NB: the customers mostly use Windooz boxes to send mail). Now when the message is received via KMail and saved to disk, KMail strips off all occurrences of CR/LF from within the file. As the application these files are intended for relies on CR/LF, the file cannot be used anymore. When saving the attachment with "save coded" (German: "Speichern kodiert als...") and using mmencode to decode it, everything is OK. But this is very complicated for the user and not the way the attachment should be handled.
Comment 1 Andreas Gungl 2004-07-07 15:25:55 UTC
On Wednesday 07 July 2004 15:13, Rudolf Kollien wrote:
> When an attachment with the mimetype "text/plain" is saved to disk then
> kmail changes all occurrences of CR/LF to LF. This may corrupt the
> content of the file. AFAIK all linux text editors/wordprocessors are able
> to correctly handle "DOS"-Textfiles. So this "feature" makes no sense to
> me.

Messages are sent in canonical format with CR/LF line endings. This matches 
the DOS convention. If you save the message on a Unix system, you can 
expect the message to be saved with LF line endings only. text/plain means 
exactly that it's only text, so the line endings are not really relevant. 

> Background: We exchange files with our customers which include pure ASCII
> characters and CR/LF combinations. The file is (correctly?) recognized by
> the sending MUA as a text file with the mimetype "text/plain" (NB: the
> customers mostly use Windooz boxes to send mail). Now when the message is
> received via kmail and saved to disk, kmail strips off all occurrences
> of CR/LF from within the file. As the application these files are
> intended for relies on CR/LF, the file cannot be used anymore. When
> saving the attachment with "save coded" (German: "Speichern kodiert
> als...") and using mmencode to decode it, everything is OK. But this is
> very complicated for the user and not the way the attachment should be
> handled.

Why don't you rewrite the application to handle LF _and_ CR/LF correctly? 
Another solution could be to use a filter to pipe the files through which 
changes the line endings before you pass the files to your application.

Comment 2 Rudolf Kollien 2004-07-07 15:52:36 UTC
>Messages are sent in canonical format with CR/LF line endings. This complies 
>to the DOS platform. If you save the message on a Unix system, you can 
>expect the message to be saved with LF line endings only. text/plain means 
>exactly that it's only text, so the line endings are not really relevant. 

In this case CR/LF are no line terminators. They have special control meanings inside the file.

>Why don't you rewrite the application to handle LF _and_ CR/LF correctly?

Unfortunately this is neither our own software nor open source. The software handles files on DOS, VAX, Un*x and Linux the same way, so files can be ported without any thought about converting the content. To put it plainly: the files are byte code to be interpreted.


>Another solution could be to use a filter to pipe the files through which 
>changes the line endings before you pass the files to your application. 

The CR/LF are not line endings. A single LF, a single CR and a combination of them each have a different meaning. But there wouldn't be any chance to decide if CR or LF is correct. No way to go. A context option like "Save file as binary" could be a solution (I never understood why to save a file "coded"). Additionally, an "Open file as binary..." option would also be necessary.
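Editorial illustration of the point above that a filter cannot restore the original bytes once CR/LF has been converted to LF. This is a minimal Python sketch, not KMail code: the function name `to_native_lf` and the sample payloads are hypothetical, and the conversion is assumed to be a plain CR/LF to LF replacement as described in the report.

```python
def to_native_lf(data: bytes) -> bytes:
    """Mimic the text-mode save described in the report: CR/LF becomes LF."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical payloads in which CR, LF and CR/LF carry different meanings.
record_a = b"FIELD1\r\nFIELD2\nFIELD3\rEND"
record_b = b"FIELD1\nFIELD2\nFIELD3\rEND"

# Two different inputs collapse to the same output, so the conversion
# cannot be undone by any later filter.
assert record_a != record_b
assert to_native_lf(record_a) == to_native_lf(record_b)

# Blindly turning every LF back into CR/LF does not restore the original.
restored = to_native_lf(record_a).replace(b"\n", b"\r\n")
assert restored != record_a
```

Because the mapping is many-to-one, only a byte-exact ("binary") save can preserve such a file.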
 
Comment 3 Andreas Gungl 2004-07-07 16:04:32 UTC
On Wednesday 07 July 2004 15:52, Rudolf Kollien wrote:
> >Another solution could be to use a filter to pipe the files through
> > which changes the line endings before you pass the files to your
> > application.
>
> The CR/LF are not line endings. There is a different meaning between a
> single LF, a single CR or a combination of them. But there wouldn't be
> any chance to decide if CR or LF is correct. No way to go. A context
> option like "Save file as binary" could be a solution (I never understood
> why to save a file "coded"). Additionally, an "Open file as binary..."
> option would also be necessary.

Having CR/LF somewhere in the file (not at the end of lines) means the file 
doesn't qualify as text/plain. I'm not quite sure, but I feel your sending 
MUA should use another encoding when sending such files. For example, 
base64 encoding would keep the CR/LF combinations.

However, I'd better leave the decision on whether to confirm the report or 
mark it as invalid to the guys who know the RFCs by heart.
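
Editorial illustration of the base64 remark in the comment above. This is a sketch using Python's standard `email` package rather than any particular MUA; the payload and the filename `data.bin` are hypothetical. It attaches raw bytes as application/octet-stream, which the package encodes as base64 by default, and checks that the bytes survive a round trip unchanged.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical file content mixing CR/LF, bare LF and bare CR.
payload = b"REC1\r\nREC2\nREC3\rEND"

msg = EmailMessage()
msg["Subject"] = "data file"
msg.set_content("See attachment.")
# Bytes attachments need an explicit type; base64 is the default encoding.
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

# Serialize to wire format and parse it again, as a receiving MUA would.
wire = msg.as_bytes()
parsed = message_from_bytes(wire, policy=policy.default)
attachment = next(parsed.iter_attachments())

assert attachment.get_content_type() == "application/octet-stream"
assert attachment["Content-Transfer-Encoding"] == "base64"
assert attachment.get_content() == payload  # byte-for-byte identical
```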

Comment 4 Rudolf Kollien 2004-07-07 16:20:53 UTC
>Having CR/LF somewhere in the file (not at the end of lines) doesn't qualify 
>the file for text/plain encoding. I'm not quite sure but I feel like your 
>sending MUA should better use another encoding when sending such files. For 
>example base64 encoding would keep the CR/LF combinations. 

Mozilla, Thunderbird, T-Online Email (a special German MUA) and all webmailers (web.de, Yahoo, Google etc.) handle these files as text. Even KMail!!!!! Only Outlook from M$ does the right job. Sending the "defective" mail to a Windooze box and then forwarding the mail again corrects the problem (Outlook changes the MIME header to "application/octet-stream").
Comment 5 Rudolf Kollien 2004-07-07 16:22:03 UTC
I meant Outlook AND Outlook Express >= 5.x 
Versions above not tested.
Comment 6 Ingo Klöcker 2004-07-08 17:03:49 UTC
Sorry, but we can't change this because some interpreters (like bash) choke on wrong line endings. (Don't ask me why.) So we have to convert the line endings of _text_ files to the native line endings.

But we could probably detect text files with non-native line endings and attach them as application/octet-stream. And we could add a "Save as binary" option. But those are wishes.

BTW, you can manually change the content type to application/octet-stream by changing the properties of the attachment.
Comment 7 Rudolf Kollien 2004-07-08 17:26:44 UTC
>BTW, you can manually change the content type to application/octet-stream by changing the properties of the attachment. 

This is the current workaround. But for that I have to forward the email to myself in order to save the attachments correctly. I will post a wish for "save as binary".

>But we could probably detect text files with non-native line endings and attach them as application/octet-stream.

The problem is that these files are NOT text files but are detected as such. Only the Outlook family from M$ detects the files correctly as binary.
Comment 8 Ingo Klöcker 2005-05-16 16:30:43 UTC
I'm reviewing this wish again and have a few questions:

Those files should obviously be treated not as text but as binary. So if the sending MUA declares the attachment to be of type text/plain then the receiving MUA can't know that this is wrong. The only reasonable solution to counter this bug in the sending MUA is to offer a "Save as binary" option in the receiving MUA. You said you'd file another wish for this, so I consider the "receiving MUA" part of this wish to be covered by this other wish.

So what remains is the "sending MUA" part. If I understood correctly, KMail doesn't handle this correctly, i.e. it thinks the files are of type text/plain although they are not. Now it's very difficult to fix this problem, since files which use only one type of line ending (i.e. either CR/LF or LF) should be treated as text files (because they usually are text files). Only files with mixed "line endings" could be treated as binary (i.e. application/octet-stream) by default. For files which look like text files although they are not, you'll always have to change the content type manually, which is possible with KMail.

So, summing up, what remains to be done is treating to-be-attached files with mixed line endings as application/octet-stream.
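
Editorial illustration of the heuristic proposed in the comment above. This is a Python sketch, not KMail's implementation: the function names `line_ending_kinds` and `guess_attachment_type` are hypothetical, and it looks only at line endings (a real detector would presumably also check for other signs of binary content).

```python
import re


def line_ending_kinds(data: bytes) -> set:
    """Return which line-ending styles occur: 'crlf', bare 'lf', bare 'cr'."""
    kinds = set()
    if b"\r\n" in data:
        kinds.add("crlf")
    if re.search(rb"(?<!\r)\n", data):  # LF not preceded by CR
        kinds.add("lf")
    if re.search(rb"\r(?!\n)", data):   # CR not followed by LF
        kinds.add("cr")
    return kinds


def guess_attachment_type(data: bytes) -> str:
    """Mixed line endings -> binary; a single style (or none) -> text."""
    if len(line_ending_kinds(data)) > 1:
        return "application/octet-stream"
    return "text/plain"


# Files using one consistent style stay text; mixed styles become binary.
assert guess_attachment_type(b"a\r\nb\r\n") == "text/plain"
assert guess_attachment_type(b"a\nb\n") == "text/plain"
assert guess_attachment_type(b"a\r\nb\nc") == "application/octet-stream"
```

As noted in the comment, a binary file that happens to use a single line-ending style would still be classified as text, so the manual content-type override remains necessary.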
Comment 9 Myriam Schweingruber 2012-08-18 08:49:23 UTC
Thank you for your feature request. Kmail1 is currently unmaintained so we are closing all wishes. Please feel free to reopen a feature request for Kmail2 if it has not already been implemented.
Thank you for your understanding.
Comment 10 Luigi Toscano 2012-08-19 00:51:45 UTC
Instead of creating a new feature request, please confirm here if the wishlist is still valid for kmail2.