Bug 451816

Summary: improper usage of ZSTD_compressStream2 API
Product: [Frameworks and Libraries] frameworks-karchive Reporter: toolinfo
Component: generalAssignee: David Faure <faure>
Status: RESOLVED FIXED    
Severity: normal CC: a.samirh78, aacid, kdelibs-bugs-null
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Other   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description toolinfo 2022-03-23 09:31:11 UTC
When used through KCompressionDevice, compression using Zstd format is not efficient.

Root cause is the following line in API KZstdFilter::Result KZstdFilter::compress(bool finish)  in file kzstdfilter.cpp

Line 126:
    const size_t result = ZSTD_compressStream2(d->cStream, &d->outBuffer, &d->inBuffer, finish ? ZSTD_e_end : ZSTD_e_flush);

It shall be changed into :

    const size_t result = ZSTD_compressStream2(d->cStream, &d->outBuffer, &d->inBuffer, finish ? ZSTD_e_end : ZSTD_e_continue);

To get compression working correctly.
Comment 1 Albert Astals Cid 2022-05-30 12:30:56 UTC
Can you explain why?

Give some test? some reasoning?
Comment 2 toolinfo 2022-05-30 12:46:44 UTC
Hi,
From faceboot zstd Q&A:

https://github.com/facebook/zstd/issues/1360

Comment is: 

using ZSTD_e_flush will force the compressor to emit a block immediately. In general, this requirement is for communication scenarios, when some data must be sent "now".

Because it increases the nb of blocks produced, this technique can be associated with some lower compression ratio. Therefore, whenever there is no need to send data immediately, it's better (for compression ratio) to let the data accumulate in the internal buffers, and let the compressor decide when it's best to output a new block.

As stated above:

Using flush in a device is not a good idea. It result in a flush for each pieces of data that is send to the device using << operators, which breaks compression. That is why i suggest to use ZSTD_e_continue instead.

I am using it regularily for my project and without the changes, the API is just not working at all (Not compression anything).

Regards
Comment 3 toolinfo 2022-05-30 12:50:25 UTC
Usage model is:
@code
    KCompressionDevice compressor(&compressedModule, false, KCompressionDevice::Zstd);
    if (compressor.open(QIODevice::WriteOnly)) {
      QDataStream stream(&compressor);
      stream << [My class: makes many calls to << operator for each component containers part,...] ;
      compressor.close();
      compressedModule.close();
    }
@endcode
Comment 4 Bug Janitor Service 2022-05-30 21:54:29 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/karchive/-/merge_requests/38
Comment 5 Albert Astals Cid 2022-05-31 08:53:00 UTC
Git commit 028802998b11163dd761b579000f5cb89fab18d3 by Albert Astals Cid.
Committed on 31/05/2022 at 08:47.
Pushed by aacid into branch 'master'.

Fix zstd KCompressionDevice not compressing as much as it could

M  +37   -4    autotests/kfiltertest.cpp
M  +1    -1    autotests/kfiltertest.h
M  +1    -1    src/kzstdfilter.cpp

https://invent.kde.org/frameworks/karchive/commit/028802998b11163dd761b579000f5cb89fab18d3