Created attachment 110397 [details]
Screen shot 1 - Initial message preview display
When a message is received that has a .doc file (or multiple such files) attached, they are initially shown in the message preview as encrypted (see screen shot 1). Note that they are shown as such in the message body, but are shown in their unencrypted form with correct file names in the message header.
Clicking on the "Decrypt Message" action link does not prompt for any further information but simply displays the attachment as expected (see screen shot 2).
Clicking on either the attachment shown after "decryption", or the original attachment from the header, opens it in LibreOffice as expected.
Double clicking on the message (to open it in a separate viewer window) does not show this problem; the attachments are shown as expected.
The .doc file does not have to be from Microsoft Word; a file created in LibreOffice does the same. I have not observed this happening with anything other than .doc files.
A test case is attached; neither the message nor the attached document contain any personal information. Opening the .eml file in the KMail View will show the same behaviour.
This is using KMail with a single POP3 fetching account and no encryption or signing in use. Headers are set to "Fancy Headers", attachment display is set to "As Icons".
Created attachment 110398 [details]
Screen shot 2 - After clicking on "Decrypt message"
Created attachment 110399 [details]
Test message - open in KMail View or import
Have done some investigating and found that the root cause is the MS Word binary format being detected as PGP encrypted data, in EncryptedBodyPartFormatter::process(). The result of GpgME::data().type() is returned as GpgME::Data::PGPOther for the attachment part.
One solution that I have found so far is to remove the preset formatter from BodyPartFormatterFactoryPrivate::messageviewer_create_builtin_bodypart_formatters(). The detection from the binary data is too sensitive for the catch-all MIME type. This still allows PGP encrypted parts to be detected as long as they have the correct MIME type (which any decent sending mailer should use, it's 2019 now).
See also bug 435504 for possibly the same effect with ODT files.
Bug is still present in master branches of today, for the same reason as mentioned in comment 3.
Pulling Ingo into the discussion as a representative of GpgME (if not, please forward), to ask whether it should also be considered a bug in GpgME that an msword or odt data blob is detected to be of type GpgME::Data::PGPOther. For me, other data such as PDF blobs is correctly reported as GpgME::Data::Unknown.
On our side should registering EncryptedBodyPartFormatter as handler for "application/octet-stream" indeed be reconsidered and probably replaced with some special handling for known cases where the content type is not properly set to "text/pgp"?
*** Bug 406808 has been marked as a duplicate of this bug. ***
I had a quick look at what GpgME::data().type() is doing. It looks as if any data that starts with something that looks even remotely like a valid OpenPGP packet is identified as some OpenPGP data. The code trying to identify the data basically only looks at the first byte. If bit 7 is set, then there is a chance of 31:127 that the data is identified as some kind of OpenPGP data and a chance of 15:127 that the data is identified as GpgME::Data::PGPOther.
There is special code to prevent PNG files starting with "\x89PNG" from being identified as some kind of OpenPGP data, but that's the only blacklisting that's done.
Given this, I conclude that GpgME::data().type() is not suitable for checking if arbitrary data is some kind of OpenPGP data. In my opinion, it should only be used on data if there are good reasons to assume that this data is indeed OpenPGP data, but it's not clear what kind of OpenPGP data.
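To illustrate why a check that relies mostly on the first byte over-matches, here is a minimal sketch. It is not the GpgME implementation: mightLookLikeOpenPgp() is a hypothetical helper that models only the two points described above (bit 7 of the first byte, plus the "\x89PNG" exception), and the real code applies further checks before settling on a type. The sample bytes are the well-known file signatures: MS Word binary (.doc) files use the OLE2 compound file header D0 CF 11 E0 A1 B1 1A E1, and PDF files start with "%PDF".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, NOT GpgME code. It models only the coarse first-byte
// test discussed in this thread: data is a candidate for OpenPGP detection
// when bit 7 of its first byte is set, unless it starts with the PNG magic.
bool mightLookLikeOpenPgp(const std::uint8_t *data, std::size_t len)
{
    static const std::uint8_t pngMagic[4] = {0x89, 'P', 'N', 'G'};

    if (data == nullptr || len == 0) {
        return false;
    }
    // The one exception mentioned above: PNG files are excluded explicitly.
    if (len >= 4 && std::memcmp(data, pngMagic, 4) == 0) {
        return false;
    }
    // Every OpenPGP packet header has bit 7 set, but so does a lot of
    // unrelated binary data.
    return (data[0] & 0x80) != 0;
}

// Small demonstration using well-known file signatures.
void demo()
{
    const std::uint8_t doc[] = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    const std::uint8_t pdf[] = {'%', 'P', 'D', 'F', '-'};
    const std::uint8_t png[] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    assert(mightLookLikeOpenPgp(doc, sizeof doc));   // first byte 0xD0, bit 7 set
    assert(!mightLookLikeOpenPgp(pdf, sizeof pdf));  // first byte 0x25, bit 7 clear
    assert(!mightLookLikeOpenPgp(png, sizeof png));  // excluded by the PNG check
}
```

Under these assumptions the .doc header passes the first-byte test while the PDF header does not, which is consistent with the behaviour reported earlier in this thread.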
For identifying data of type application/octet-stream, we should use file type detection provided by other libraries. Isn't there something in KF5, maybe in KIO, that determines the mimetype? Or are we calling GpgME::data().type() only after KIO returns application/octet-stream?
Thanks for the quick reply and research, Ingo. Okay, so we need to fix/improve things on our side. I can only continue here next weekend, so anyone is invited to have a look and have a go themselves until then ;)
A possibly relevant merge request was started @ https://invent.kde.org/pim/messagelib/-/merge_requests/83
Git commit 3f53b7394ffc75264442a2dd09e845a7a80f3dae by Ingo Klöcker, on behalf of Jonathan Marten.
Committed on 22/02/2022 at 07:51.
Pushed by kloecker into branch 'master'.
Fix MS Word attachments being detected as encrypted
Caused by EncryptedBodyPartFormatter (via GpgME upstream) being too
enthusiastic and detecting the binary data as PGP encrypted. Only
enable this formatter for binary body parts explicitly declared as
encrypted by their MIME type. See the referenced bug for more
M +17 -0 mimetreeparser/autotests/basicobjecttreeparsertest.cpp
M +1 -0 mimetreeparser/autotests/basicobjecttreeparsertest.h
A +61 -0 mimetreeparser/autotests/data/binary-attachment-not-pgp.mbox
M +2 -3 mimetreeparser/autotests/data/openpgp-inline-encrypted-with-attachment.mbox.tree
M +27 -10 mimetreeparser/src/bodyformatter/encrypted.cpp
M +12 -4 mimetreeparser/src/bodyformatter/encrypted.h
M +2 -2 mimetreeparser/src/bodypartformatter.cpp
*** Bug 435504 has been marked as a duplicate of this bug. ***