Bug 232843 - KGzipFilter cannot inflate multi-member gzip archives
Status: RESOLVED FIXED
Alias: None
Product: kdelibs
Classification: Frameworks and Libraries
Component: kdecore
Version: unspecified
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdelibs bugs
URL:
Keywords:
Duplicates: 241153 252036 269182 286949 333718
Depends on:
Blocks:
 
Reported: 2010-03-31 14:46 UTC by Christian (Fuchs)
Modified: 2016-07-05 11:25 UTC
CC List: 7 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
bug report .gz file (54.36 KB, application/x-gzip)
2010-05-09 11:23 UTC, Christian (Fuchs)
result of unzipping the provided gz with the ark context entry in dolphin (477 bytes, text/plain)
2010-05-09 11:47 UTC, Christian (Fuchs)

Description Christian (Fuchs) 2010-03-31 14:46:33 UTC
Version:           ark 2.14  (from KDE SC 4.4.2) (using KDE 4.4.1)
Compiler:          gcc 
OS:                Linux
Installed from:    Gentoo Packages

When extracting .gz archives (.tar.gz is not affected) with the "Autodetect Subfolders" entry in the Dolphin context menu, the file is only partially extracted.

A good way to reproduce this is with nvidia-bug-report.sh: it generates a gzipped text file, of which only a small part is extracted (about 90% of the content is missing).

Easy way to reproduce: 

1) sudo nvidia-bug-report.sh
2) right click on the generated file in dolphin
3) choose "Extract Archive Here, Autodetect Subfolders" 
4) extract the same file with either gunzip or only "Extract here" 
5) diff the files: note that the first one is missing about 90% of content 

If you need further information, just ask 

Kind regards, 

Christian
Comment 1 Raphael Kubo da Costa 2010-05-09 05:56:59 UTC
Can you please attach a .gz file that causes this problem for you (even a gzipped nvidia report)?
Comment 2 Christian (Fuchs) 2010-05-09 11:23:25 UTC
Created attachment 43391
bug report .gz file 

sample .gz file
Comment 3 Christian (Fuchs) 2010-05-09 11:47:41 UTC
Created attachment 43392
result of unzipping the provided gz with the ark context entry in dolphin

result of unzipping the provided gz with the ark context entry in dolphin. Just gunzip the .gz file and diff the two outputs; you'll see that about 95% of the content is missing.

Thanks :)
Comment 4 Raphael Kubo da Costa 2010-05-09 18:44:47 UTC
Have you checked whether this happens only with the files that nvidia-bug-report generates?
Comment 5 Christian (Fuchs) 2010-05-09 18:52:10 UTC
It happens to all nvidia bug reports generated. 

As I am a supporter in a local linux forum (especially for nvidia related problems) I get a lot of these, so in fact probably most (or all) tests I made happened with nvidia bug reports.

So I just used dd to write data from /dev/urandom to a text file, gzipped it and unzipped it with dolphin/ark, and indeed it did not happen there.

So it is related to something specific which is in every nvidia bug report. 

I think it's weird that this does not happen when I use gunzip to unzip the bug report, but only with ark. 

Sorry for not checking earlier. 

Kind regards

Christian
Comment 6 Raphael Kubo da Costa 2010-05-09 19:15:10 UTC
Indeed, it seems to be specific to those reports. Actually, it seems to be related to KFilterDev: accessing gzip:/path/to/gzipped/report.gz in Konqueror displays the same incomplete file.

What's more interesting is that nvidia-bug-report.sh simply uses gzip -c. gunzip -l shows:

         compressed        uncompressed  ratio uncompressed_name
              55664                  81 -68588.9% nvidia-bug-report.log

If I uncompress it and run gzip manually again, I get:

         compressed        uncompressed  ratio uncompressed_name
              41084              232633  82.4% bla

Reassigning to kdelibs and CC'ing dfaure.
Comment 7 Raphael Kubo da Costa 2010-06-20 23:45:21 UTC
*** Bug 241153 has been marked as a duplicate of this bug. ***
Comment 8 Raphael Kubo da Costa 2010-06-20 23:52:11 UTC
Well, after spending some hours on this one it looks like the problem is upstream: the ISIZE field in the attached gzip files (the one from bug 241153 too) is set to an erroneous value, as stated in comment 6.

I suspect gzip's code (which does not depend on zlib) simply ignores that field, whereas zlib respects that value when reading.

Writing a simple test case based on zlib/examples/zpipe.c, or simply using zlib/examples/zran.c shows that neither of them can read the gzipped files fully.

dfaure's commit 951845 may have something to do with it, as it now passes MAX_WBITS+32 to inflateInit2() to make zlib read and parse the gzip header itself.

Not sure what to do here, though...
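
The behaviour described above can be reproduced outside kdelibs. The following is only a minimal sketch using Python's zlib binding rather than the C API: the payloads and variable names are made up for illustration, and `wbits=zlib.MAX_WBITS + 32` is the Python counterpart of passing MAX_WBITS+32 to inflateInit2(). It suggests that a single zlib inflate stream stops at the end of the first gzip member and leaves the rest of the input unread.

```python
import gzip
import zlib

# Hypothetical payloads; mtime=0 keeps the gzip headers reproducible.
first = b"first member\n"
second = b"second member\n"
member1 = gzip.compress(first, mtime=0)
member2 = gzip.compress(second, mtime=0)
archive = member1 + member2  # same effect as `cat a.gz b.gz > both.gz`

# 32 + MAX_WBITS asks zlib to auto-detect a zlib or gzip header.
d = zlib.decompressobj(wbits=zlib.MAX_WBITS + 32)
out = d.decompress(archive)

print(out == first)              # only the first member was inflated
print(d.eof)                     # zlib reports the end of the stream
print(d.unused_data == member2)  # the second member is left unread
print(gzip.decompress(archive) == first + second)  # the gzip module reads both
```

All four checks should print True, which matches the symptom seen here: the data is intact, but a reader that stops at the first end-of-stream only returns the first member.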
Comment 9 Raphael Kubo da Costa 2010-06-21 01:33:23 UTC
Actually, I was not completely right.

Considering the attachment from bug 241153:

When run with -l, gunzip does read only the last 4 bytes of the archive to obtain ISIZE. However, the archive seems to contain several members, each with its own ISIZE value. The first member holds 1948 bytes, which is all that zlib and our code read; gzip, in turn, reads all the other members as expected (the last one indeed has 2831 stored in its ISIZE field, which is what gunzip -l reports).
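
To illustrate why gunzip -l can report a size far smaller than the real content, here is a rough Python sketch. It assumes a hypothetical two-member archive whose member sizes are borrowed from the numbers above (1948 and 2831 bytes); the real attachment contains more members. Per RFC 1952, ISIZE is the last 4 bytes of each member, stored little-endian.

```python
import gzip
import struct

# Hypothetical members; only the sizes mirror the figures quoted above.
member1 = gzip.compress(b"a" * 1948, mtime=0)
member2 = gzip.compress(b"b" * 2831, mtime=0)
archive = member1 + member2

# ISIZE is the trailing 4 bytes of a member (uncompressed size mod 2**32).
isize_first = struct.unpack("<I", member1[-4:])[0]
isize_last = struct.unpack("<I", archive[-4:])[0]

# Size obtained by actually inflating every member.
total = len(gzip.decompress(archive))

print(isize_first, isize_last, total)
```

The trailer of the whole file only describes the last member, so a tool that trusts it will under-report the size of a multi-member archive.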
Comment 10 Raphael Kubo da Costa 2010-09-22 20:04:52 UTC
*** Bug 252036 has been marked as a duplicate of this bug. ***
Comment 11 Christopher Yeleighton 2010-09-22 22:02:54 UTC
(In reply to comment #8)

> Writing a simple test case based on zlib/examples/zpipe.c, or simply using
> zlib/examples/zran.c shows that neither of them can read the gzipped files
> fully.
> 
> Not sure what to do here, though...

* <URL:http://www.zlib.net/zlib_faq.html#faq08>
Comment 12 Raphael Kubo da Costa 2011-03-22 22:14:22 UTC
*** Bug 269182 has been marked as a duplicate of this bug. ***
Comment 13 Technologov 2011-03-22 22:34:25 UTC
Bug is reproducible on openSUSE 11.4 (KDE 4.6.0)

-Technologov
Comment 14 Raphael Kubo da Costa 2011-11-18 18:39:06 UTC
*** Bug 286949 has been marked as a duplicate of this bug. ***
Comment 15 Raphael Kubo da Costa 2014-04-22 09:01:23 UTC
*** Bug 333718 has been marked as a duplicate of this bug. ***
Comment 16 Sune Vuorela 2016-07-03 12:20:49 UTC
Git commit c75a82d1645062c33fc62c6a7b719ba0202d80d5 by Sune Vuorela.
Committed on 03/07/2016 at 12:14.
Pushed by sune into branch 'master'.

Unit test for bug 232843. Extracting two gz files concatenated.

Just a unit test exposing the bug. No actual fix yet. The fix is
likely about advancing the zlib stream a little, and then parsing
a gz header again.
REVIEW: 128237

A  +-    --    autotests/data/twofiles.gz
M  +15   -0    autotests/kfiltertest.cpp
M  +1    -0    autotests/kfiltertest.h

http://commits.kde.org/karchive/c75a82d1645062c33fc62c6a7b719ba0202d80d5
Comment 17 Martin Sandsmark 2016-07-05 11:25:48 UTC
Git commit 853496897f5216909337ed8711e1b8391ae1b6a3 by Martin T. H. Sandsmark.
Committed on 04/07/2016 at 22:32.
Pushed by sandsmark into branch 'master'.

Handle multiple gzip streams

If kgzipfilter notices that zlib didn't read all the data, it tries to
re-init the stream and read the rest of the buffer. This is tested by
the unit test from Sune. The case where there are more than two streams
available in the current buffer is tested in a unit test added
separately.

If the split between the streams falls right between two buffers, we
need KCompressionDevice to notice that there's data left and try to
continue decompressing. This is easy to test by setting BUFFER_SIZE to
the size of the first stream in the unit test data (28 bytes).

REVIEW: 128369

M  +0    -1    autotests/kfiltertest.cpp
M  +7    -2    src/kcompressiondevice.cpp
M  +38   -10   src/kgzipfilter.cpp

http://commits.kde.org/karchive/853496897f5216909337ed8711e1b8391ae1b6a3
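
The approach described in the commit message above can be sketched roughly as follows. This is not the karchive code: it is a Python illustration of the same idea, and the function name `inflate_all_members` and the test data are hypothetical. Whenever zlib reports the end of a stream, the decompressor is re-initialised and fed whatever input is left, including the case where a member ends exactly on a buffer boundary.

```python
import gzip
import zlib


def new_stream():
    # 32 + MAX_WBITS: let zlib detect and parse the gzip header itself.
    return zlib.decompressobj(wbits=zlib.MAX_WBITS + 32)


def inflate_all_members(chunks):
    """Inflate every gzip member found in an iterable of byte chunks."""
    stream = new_stream()
    out = bytearray()
    for chunk in chunks:
        buf = chunk
        while buf:
            out += stream.decompress(buf)
            if stream.eof:
                # End of one member: keep the unread bytes and start a
                # fresh stream for the next member (if there is one).
                buf = stream.unused_data
                stream = new_stream()
            else:
                # The whole buffer was consumed; wait for the next chunk.
                buf = b""
    return bytes(out)


# Hypothetical test data: three concatenated members.
parts = [b"one\n", b"two\n", b"three\n"]
members = [gzip.compress(p, mtime=0) for p in parts]
archive = b"".join(members)
expected = b"".join(parts)

# Everything in a single buffer (several streams in one buffer).
whole = inflate_all_members([archive])

# Split exactly at the end of the first member (boundary between buffers).
cut = len(members[0])
split = inflate_all_members([archive[:cut], archive[cut:]])

# One byte per buffer, as a stress case.
bytewise = inflate_all_members(archive[i:i + 1] for i in range(len(archive)))

print(whole == expected, split == expected, bytewise == expected)
```

This sketch raises zlib.error if anything other than another gzip member follows a stream; how trailing garbage should be handled is a separate design decision that the sketch does not address.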