Bug 388423 - [p7zip] ark says zip "archive corrupt" but both "zip -F" and "unzip -t" find no problem
Status: RESOLVED UPSTREAM
Alias: None
Product: ark
Classification: Applications
Component: plugins
Version: 17.08.1
Platform: Fedora RPMs Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Ragnar Thomsen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-01 13:35 UTC by David Tonhofer
Modified: 2021-11-18 21:17 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Plugins selected (110.66 KB, image/png)
2018-01-15 21:10 UTC, David Tonhofer
Beginning of "zipinfo -v" output (4.92 KB, text/plain)
2018-02-10 08:07 UTC, David Tonhofer
Atlassian Confluence Export Zip file, ok to 7z (3.17 KB, application/zip)
2018-02-10 21:44 UTC, David Tonhofer
Atlassian Confluence Export Zip file, not ok to 7z (650.40 KB, application/zip)
2018-02-10 21:45 UTC, David Tonhofer
Output of zipinfo -v for the file that works with 7z (3.46 KB, text/plain)
2018-02-10 21:50 UTC, David Tonhofer
Output of zipinfo -v for the file that does not work with 7z (6.33 KB, text/plain)
2018-02-10 21:50 UTC, David Tonhofer
Small Confluence export (ok with p7zip) (5.98 KB, application/zip)
2018-03-06 08:52 UTC, David Tonhofer
Small Confluence export (has subpage) (NOT ok with p7zip) (8.28 KB, application/zip)
2018-03-06 08:53 UTC, David Tonhofer

Description David Tonhofer 2018-01-01 13:35:11 UTC
A large archive file (909M, a backup of an Atlassian Confluence installation) opened via "ark" (via the KDE interface or "ark --batch") says the archive is corrupt.

On the command line:

ark.kerfuffle: Archive corrupt

A popup says "The archive you're trying to open is corrupt. Some files may be missing or damaged [Open as Read-Only] [Don't Open]". Unfortunately no further information.

Continuing with "Open as Read-Only" unpacks the archive.

It can be opened w/o problems using unzip and both "zip -F" and "unzip -t" find no problem.

Indeed, if I compare the MD5SUM of all the files unpacked by ark and unzip, everything matches up.
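The comparison described above could be scripted roughly as follows. This is only a sketch: the helper names `md5_tree` and `diff_trees` are made up for illustration, and it assumes both tools extracted into separate directories.

```python
import hashlib
from pathlib import Path


def md5_tree(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.md5()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def diff_trees(dir_a, dir_b):
    """Return sorted relative paths that are missing on one side or differ."""
    a, b = md5_tree(dir_a), md5_tree(dir_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty list from `diff_trees` means both extractions contain the same files with the same contents.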

Unfortunately I have no additional information as ark does not seem to have a verbose mode :-( ... the archive contains confidential info, so I can't provide it.
Comment 1 Elvis Angelaccio 2018-01-02 12:39:47 UTC
Can you try to disable the libzip plugin? (Settings -> Configure Ark -> Plugin Settings)

This could be bug #379964 or bug #383542
Comment 2 David Tonhofer 2018-01-15 21:10:25 UTC
Created attachment 109893
Plugins selected
Comment 3 David Tonhofer 2018-01-15 21:11:01 UTC
It works if I deselect the "P7zip" plugin.
Comment 4 Ragnar Thomsen 2018-02-09 18:00:36 UTC
It seems 7z detects some issue with your archive. Please provide output of "7z t <archive>"

You can also run Ark from terminal with:
QT_LOGGING_RULES=ark.*.debug=true ark
which provides more console output.
Comment 5 David Tonhofer 2018-02-10 08:07:23 UTC
Output of "7z t"

Sadly, informative messages are often treated as if they were made of precious gems. They are exceedingly sparse. So here:

----------------------
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs Intel(R) Xeon(R) CPU           W3520  @ 2.67GHz (106A5),ASM)

Scanning the drive for archives:
1 file, 952325789 bytes (909 MiB)

Testing archive: Confluence_Aikido_2018_01_01.zip

WARNINGS:
Headers Error

--
Path = Confluence_Aikido_2018_01_01.zip
Type = zip
WARNINGS:
Headers Error
Physical Size = 952325789

Everything is Ok                             

Archives with Warnings: 1

Warnings: 1
Files: 2569
Size:       1114914438
Compressed: 952325789
----------------------

WTF!

Running ark using

QT_LOGGING_RULES=ark.*.debug=true ark

we get

---------------
ark.main: Entering application loop
ark.part: Attempting to open archive "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Going to create archive "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Checking plugin "kerfuffle_cli7z"
ark.kerfuffle: Created read-only interface for "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Created read-write interface for "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.cli7z: Loaded cli_7z plugin
ark.cli7z: Setting up parameters...
ark.kerfuffle: Successfully loaded plugin "kerfuffle_cli7z"
ark.kerfuffle: Created archive instance
ark.kerfuffle: LoadJob created
ark.kerfuffle: Executing "/usr/bin/7z" ("l", "-slt", "/home/calvin/Confluence_Aikido_2018_01_01.zip") within directory "/home/calvin"
ark.cli7z: p7zip version "16.02" detected
ark.cli7z: Archive name:  "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.cli7z: Archive type:  "zip"
ark.kerfuffle: Archive corrupt
ark.part: Showing columns:  (0, 1, 2, 7, 8)
ark.kerfuffle: Process finished, exitcode: 0 exitstatus: QProcess::ExitStatus(NormalExit)
ark.kerfuffle: Executing LoadCorrupt prompt
ark.kerfuffle: Job finished, result: true , time: 5548 ms
---------------

Attaching the output of "zipinfo -v" too
Comment 6 David Tonhofer 2018-02-10 08:07:57 UTC
Created attachment 110501
Beginning of "zipinfo -v" output
Comment 7 Ragnar Thomsen 2018-02-10 08:29:57 UTC
Ok, so 7z detects a non-critical header error in your archive (standard zip with deflate compression method), while unzip/zipinfo doesn't. Which of the two is correct is hard to say without more details.

I don't think there is anything different we can do in Ark, since we do want to detect errors from the backends we use and it's still possible to open/extract the archive.

How was the archive created? It looks like it was created on Windows?
Comment 8 David Tonhofer 2018-02-10 10:10:38 UTC
It was created by Atlassian Confluence (the "cloud" solution), which is probably done via a Java library.

I can create an empty archive to see what happens, but in the end this seems to be the 7z library being fickle. I can open a bug at Fedora about this. What do you think?
Comment 9 David Tonhofer 2018-02-10 21:44:17 UTC
Created attachment 110517
Atlassian Confluence Export Zip file, ok to 7z

Create an "export" of a totally fresh & empty Atlassian Confluence Space.

The resulting zip file is ok to 7z.
Comment 10 David Tonhofer 2018-02-10 21:45:42 UTC
Created attachment 110518
Atlassian Confluence Export Zip file, not ok to 7z

Create an "export" of a fresh Atlassian Confluence Space with some test content

The resulting zip file is NOT ok to 7z.
Comment 11 David Tonhofer 2018-02-10 21:50:03 UTC
Created attachment 110519
Output of zipinfo -v for the file that works with 7z
Comment 12 David Tonhofer 2018-02-10 21:50:22 UTC
Created attachment 110520
Output of zipinfo -v for the file that does not work with 7z
Comment 13 David Tonhofer 2018-02-10 22:30:36 UTC
Hmmm....

$ rpm --query --file $(which 7z)
p7zip-plugins-16.02-9.fc27.x86_64

$ rpm --query --file $(which 7za)
p7zip-16.02-9.fc27.x86_64

Actually, if the version number of p7zip (the latest p7zip at sourceforge is indeed 16.02) corresponds to the version number of 7z, it's a bit out of date. The latest 7z is 18.01 (2018-01-28); 16.02 dates from 2016-05-21.

"Bugs have been fixed" in the meantime: 

http://www.7-zip.org/history.txt

Maybe it's one of those bugs.

(Incidentally, 15.05 was the one with the exploitable vulnerabilities: http://blog.talosintelligence.com/2016/05/multiple-7-zip-vulnerabilities.html - Completely unfunnily, the 7z history log says nothing about these; this does not inspire confidence in 7z. At all. Langsec: Serious business!)

I will test this on a Windows machine nearby.
Comment 14 David Tonhofer 2018-02-10 22:54:36 UTC
7zip 18.01 on Windows also says there are "header errors" but doesn't say what exactly the problem is.
Comment 15 Edmund Kasprzak 2018-02-27 20:49:32 UTC
It's pretty clear that the main issue is with the 7z library or a wrong usage of it by Confluence.
Nothing can be (or should be) done from Ark's side.

I think it should be closed as an UPSTREAM issue.
Comment 16 David Tonhofer 2018-03-01 21:24:15 UTC
I agree
Comment 17 Elvis Angelaccio 2018-03-03 15:04:31 UTC
Right. Ideally this should be reported upstream; unfortunately p7zip is weirdly maintained.
Comment 18 David Tonhofer 2018-03-04 19:49:25 UTC
Report opened at p7zip:

https://sourceforge.net/p/p7zip/bugs/206/
Comment 19 David Tonhofer 2018-03-06 08:52:30 UTC
Created attachment 111216
Small Confluence export (ok with p7zip)
Comment 20 David Tonhofer 2018-03-06 08:53:07 UTC
Created attachment 111217
Small Confluence export (has subpage) (NOT ok with p7zip)
Comment 21 David Tonhofer 2018-03-06 09:29:10 UTC
Igor Pavlov says:

"There are incorrect Date/Time values in headers.
ZIP uses MS-DOS time format.
Maybe your software (that was used to create these archives) writes Unix timestamps or some garbage data instead of MS-DOS timestamps.
You can report about that problem to "Atlassian Confluence Cloud" developers. Maybe it was already fixed. So check for new version of their software."
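
To illustrate the MS-DOS time format mentioned above, here is a minimal Python sketch. It decodes the two 16-bit date/time fields that ZIP headers carry and lists archive entries whose stored timestamp is not a valid calendar date. The function names are hypothetical, and treating "not a valid calendar date" as the condition 7-Zip warns about is an assumption; the exact check 7-Zip performs is not stated in this thread.

```python
import datetime
import zipfile


def decode_dos_datetime(dos_date, dos_time):
    """Decode the 16-bit MS-DOS date and time fields used in ZIP headers.

    Returns a datetime, or None when the fields do not form a valid
    calendar date/time (for example month 0 or hour 25).
    """
    year = 1980 + ((dos_date >> 9) & 0x7F)   # bits 9-15: years since 1980
    month = (dos_date >> 5) & 0x0F           # bits 5-8
    day = dos_date & 0x1F                    # bits 0-4
    hour = (dos_time >> 11) & 0x1F           # bits 11-15
    minute = (dos_time >> 5) & 0x3F          # bits 5-10
    second = (dos_time & 0x1F) * 2           # bits 0-4, 2-second resolution
    try:
        return datetime.datetime(year, month, day, hour, minute, second)
    except ValueError:
        return None


def find_bad_timestamps(zip_path):
    """List entry names whose stored date/time is not a valid date.

    Assumption: "invalid" here only means the decoded fields are rejected
    by datetime.datetime; 7-Zip may apply a different test.
    """
    bad = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            try:
                # zipfile decodes the DOS fields into a 6-tuple without validating it.
                datetime.datetime(*info.date_time)
            except ValueError:
                bad.append(info.filename)
    return bad
```

For example, the field pair (0x4C21, 0x6C65) decodes to 2018-01-01 13:35:10, while (0, 0) yields None because month and day are both zero.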
Comment 22 David Tonhofer 2018-03-06 09:49:41 UTC
Support request opened at Atlassian

https://getsupport.atlassian.com/servicedesk/customer/portal/23/JST-373392
Comment 23 David Tonhofer 2018-03-21 07:31:22 UTC
Atlassian says:

https://jira.atlassian.com/browse/CONFCLOUD-59271

"Thanks for bringing this issue to our attention. We've reviewed it with the team and decided we won't fix at this time. This is an issue that's unique to Zip7 software which uses the MS-DOS time format. We'd recommend using another zip software tool like WinZip, WinRAR, or PeaZip, none of which should have issues."

okay.jpg