A large archive file (909M, a backup of an Atlassian Confluence installation) opened via Ark (via the KDE interface or "ark --batch") is reported as corrupt. On the command line:

ark.kerfuffle: Archive corrupt

A popup says "The archive you're trying to open is corrupt. Some files may be missing or damaged [Open as Read-Only] [Don't Open]". Unfortunately there is no further information. Continuing with "Open as Read-Only" unpacks the archive. It can be opened without problems using unzip, and both "zip -F" and "unzip -t" find no problem. Indeed, if I compare the MD5 sums of all the files unpacked by ark and unzip, everything matches up. Unfortunately I have no additional information, as ark does not seem to have a verbose mode :-( ... the archive contains confidential info, so I can't provide it.
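For anyone who wants to reproduce the checksum comparison without extracting twice, here is a minimal sketch (the archive path is only an example) that hashes every entry directly via Python's zipfile module, giving a third independent reader besides ark and unzip:

```python
import hashlib
import zipfile

def entry_md5s(path):
    """Return {entry name: MD5 hex digest} for every file in the archive."""
    sums = {}
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data to hash
            with zf.open(info) as f:
                sums[info.filename] = hashlib.md5(f.read()).hexdigest()
    return sums

# Compare against the md5sum output of the files ark/unzip extracted.
# The path below is hypothetical:
# sums = entry_md5s("Confluence_Aikido_2018_01_01.zip")
```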
Can you try to disable the libzip plugin? (Settings -> Configure Ark -> Plugin Settings) This could be bug #379964 or bug #383542
Created attachment 109893 [details] Plugins selected
It works if I deselect the "P7zip" plugin.
It seems 7z detects some issue with your archive. Please provide the output of "7z t <archive>". You can also run Ark from a terminal with:

QT_LOGGING_RULES=ark.*.debug=true ark

which provides more console output.
Output of "7z t": Sadly, informative messages are often treated as if they were made of precious gems. They are exceedingly sparse. So here:

----------------------
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs Intel(R) Xeon(R) CPU W3520 @ 2.67GHz (106A5),ASM)

Scanning the drive for archives:
1 file, 952325789 bytes (909 MiB)

Testing archive: Confluence_Aikido_2018_01_01.zip

WARNINGS:
Headers Error

--
Path = Confluence_Aikido_2018_01_01.zip
Type = zip
WARNINGS:
Headers Error
Physical Size = 952325789

Everything is Ok

Archives with Warnings: 1

Warnings: 1
Files: 2569
Size: 1114914438
Compressed: 952325789
----------------------

WTF! Running ark using QT_LOGGING_RULES=ark.*.debug=true ark we get:

---------------
ark.main: Entering application loop
ark.part: Attempting to open archive "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Going to create archive "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Checking plugin "kerfuffle_cli7z"
ark.kerfuffle: Created read-only interface for "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.kerfuffle: Created read-write interface for "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.cli7z: Loaded cli_7z plugin
ark.cli7z: Setting up parameters...
ark.kerfuffle: Successfully loaded plugin "kerfuffle_cli7z"
ark.kerfuffle: Created archive instance
ark.kerfuffle: LoadJob created
ark.kerfuffle: Executing "/usr/bin/7z" ("l", "-slt", "/home/calvin/Confluence_Aikido_2018_01_01.zip") within directory "/home/calvin"
ark.cli7z: p7zip version "16.02" detected
ark.cli7z: Archive name: "/home/calvin/Confluence_Aikido_2018_01_01.zip"
ark.cli7z: Archive type: "zip"
ark.kerfuffle: Archive corrupt
ark.part: Showing columns: (0, 1, 2, 7, 8)
ark.kerfuffle: Process finished, exitcode: 0 exitstatus: QProcess::ExitStatus(NormalExit)
ark.kerfuffle: Executing LoadCorrupt prompt
ark.kerfuffle: Job finished, result: true , time: 5548 ms
---------------

Attaching the output of "zipinfo -v" too.
Created attachment 110501 [details] Beginning of "zipinfo -v" output
Ok, so 7z detects a non-critical header error in your archive (standard zip with deflate compression method), while unzip/zipinfo doesn't. Which of the two is correct is hard to say without more details. I don't think there is anything different we can do in Ark, since we do want to detect errors from the backends we use, and it's still possible to open/extract the archive. How was the archive created? It looks like it was created on Windows?
It was created by Atlassian Confluence (the "cloud" solution), which probably does it via a Java library. I can create an empty archive to see what happens, but in the end this seems to be the 7z library being fickle. I can open a bug at Fedora about this. What do you think?
Created attachment 110517 [details] Atlassian Confluence Export Zip file, ok to 7z I created an "export" of a completely fresh and empty Atlassian Confluence Space. The resulting zip file is fine according to 7z.
Created attachment 110518 [details] Atlassian Confluence Export Zip file, not ok to 7z I created an "export" of a fresh Atlassian Confluence Space with some test content. The resulting zip file is NOT ok according to 7z.
Created attachment 110519 [details] Output of zipinfo -v for the file that works with 7z
Created attachment 110520 [details] Output of zipinfo -v for the file that does not work with 7z
Hmmm....

$ rpm --query --file $(which 7z)
p7zip-plugins-16.02-9.fc27.x86_64
$ rpm --query --file $(which 7za)
p7zip-16.02-9.fc27.x86_64

Actually, if the version number of p7zip (the latest p7zip at SourceForge is indeed 16.02) corresponds to the version number of 7-Zip, it's a bit out of date. The latest 7-Zip is 18.01 (2018-01-28); 16.02 dates from 2016-05-21. "Bugs have been fixed" in the meantime: http://www.7-zip.org/history.txt Maybe it's one of those bugs. (Incidentally, 15.05 was the one with the exploitable vulnerabilities: http://blog.talosintelligence.com/2016/05/multiple-7-zip-vulnerabilities.html - Completely unfunnily, the 7-Zip history log says nothing about these; this does not inspire confidence in 7-Zip. At all. Langsec: serious business!) I will test this on a Windows machine nearby.
7-Zip 18.01 on Windows also says there are "header errors" but doesn't say what exactly the problem is.
It's pretty clear that the main issue is with the 7z library, or with a wrong usage of it by Confluence. Nothing can (or should) be done on Ark's side. I think this should be closed as an UPSTREAM issue.
I agree
Right. Ideally this should be reported upstream, unfortunately p7zip is weirdly maintained.
Report opened at p7zip: https://sourceforge.net/p/p7zip/bugs/206/
Created attachment 111216 [details] Small Confluence export (ok with p7zip)
Created attachment 111217 [details] Small Confluence export (has subpage) (NOT ok with p7zip)
Igor Pavlov says: "There are incorrect Date/Time values in headers. ZIP uses MS-DOS time format. Maybe your software (that was used to create these archives) writes Unix timestamps or some garbage data instead of MS-DOS timestamps. You can report about that problem to "Atlassian Confluence Cloud" developers. Maybe it was already fixed. So check for new version of their software."
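Igor's diagnosis can be checked from the outside. ZIP headers store each entry's timestamp as two packed 16-bit MS-DOS words (field layout per the ZIP specification), and Python's zipfile decodes those words into ZipInfo.date_time without validating them, so garbage written by a broken archiver survives as impossible months, hours, and so on. A small sketch (the function name and range checks are my own, not part of any tool mentioned above) that flags such entries:

```python
import zipfile

# MS-DOS timestamp layout in ZIP headers (per the ZIP appnote):
#   date word: bits 15-9 = year-1980, bits 8-5 = month (1-12),
#              bits 4-0 = day (1-31)
#   time word: bits 15-11 = hour (0-23), bits 10-5 = minute (0-59),
#              bits 4-0 = second/2 (0-29, i.e. 2-second resolution)

def suspicious_timestamps(path):
    """Return entries whose decoded date/time fields are out of range.

    zipfile decodes the raw MS-DOS words without validation, so an
    archiver that wrote Unix timestamps or garbage instead shows up
    here as impossible field values -- the kind of thing a stricter
    reader like 7z warns about as a "Headers Error".
    """
    bad = []
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            year, month, day, hour, minute, second = info.date_time
            ok = (1980 <= year <= 2107 and 1 <= month <= 12
                  and 1 <= day <= 31 and 0 <= hour <= 23
                  and 0 <= minute <= 59 and 0 <= second <= 58)
            if not ok:
                bad.append((info.filename, info.date_time))
    return bad
```

Running this over the "not ok to 7z" attachment should, if Igor is right, report at least one entry with out-of-range fields, while the "ok" export comes back clean.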
Support request opened at Atlassian https://getsupport.atlassian.com/servicedesk/customer/portal/23/JST-373392
Atlassian says: https://jira.atlassian.com/browse/CONFCLOUD-59271 "Thanks for bringing this issue to our attention. We've reviewed it with the team and decided we won't fix at this time. This is an issue that's unique to Zip7 software which uses the MS-DOS time format. We'd recommend using another zip software tool like WinZip, WinRAR, or PeaZip, none of which should have issues." okay.jpg