Summary: | Valgrind does not support zstd-compressed debug sections | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Sam James <sam> |
Component: | general | Assignee: | Paul Floyd <pjfloyd> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | kocelfc, mark, mihmih1989, pjfloyd, seppo.yliolli+kde |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Other | ||
See Also: | https://bugs.kde.org/show_bug.cgi?id=303877 | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | first draft of zstd debug sections support |
Description
Sam James
2023-05-15 04:03:00 UTC
Created attachment 172283 [details] first draft of zstd debug sections support Made first draft of Zstandard-compressed debug sections support. Patch looks huge but most of the data is code of "zstddeclib.c", single-file version of zstd decompressor made with their tool as described here: https://github.com/facebook/zstd/blob/dev/build/single_file_libs/README.md . I used source from 1.5.6 version, latest stable at this time. After generation, it was manually fixed to use VG_(malloc), VG_(calloc) and other such functions, basically until it compiled. Aside from that, I used final attachment from https://bugs.kde.org/show_bug.cgi?id=303877 as a template. Except for NEWS, don't really know where should I add it and if I should be the one to add. Changes are quite trivial: * `UChar typeC; // type of compressed data` was added to struct CSlc * In `ML_(img_mark_compressed_part)` new argument was added, `UChar typeC` * In `check_compression`, type of compression now provided to `img_mark_compressed_part` * in `get_slowcase` now either `ZSTD_decompress` or `tinfl_decompress_mem_to_mem` depending on cslc->typeC * Tests were added to check if Zstd decompression is working. That's it basically it. I should say thanks to Aleksandar Rikalo for making code that can be easily adapted for new compression types and to Sam James for providing useful links. I do have a problem with compiling any of the tests, be it for ZLIB or ZSTD, but manual test building and checking works fine. How to build said tests, BTW? I'm not sure about the dual licensing of the zstd code. Mark what do you think of that? I haven't looked at the implementation yet. But the patch seems to include 3 slightly different license blocks: The main one, most used: +/* + * Copyright (c) Meta Platforms, Inc. and affiliates. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ and, used just a few times for code lifted from the FSE library: + * Part of FSE library + * Copyright (c) Meta Platforms, Inc. and affiliates. + * + * You can contact the author at : + * - Source repository : https://github.com/Cyan4973/FiniteStateEntropy + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. And finally once for code lifted from xxHash: + * xxHash - Extremely Fast Hash algorithm + * Header File + * Copyright (c) Yann Collet - Meta Platforms, Inc + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. I think with GPLv2 they mean GPLv2-only, which would be incompatible with GPLV2+ used by Valgrind. So we would pick the BSD-style license. The comment is misleading though, obviously when included in valgrind that license is NOT "found in the LICENSE file in the root directory of this source tree". They seem to mean: BSD License For Zstandard software Copyright (c) Meta Platforms, Inc. and affiliates. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name Facebook, nor Meta, nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Which is simple 3-clause BSD which is GPLV2+ compatible, so could be incorporated. For the FSE (FinateStateEntropy) library the actual license seems to be this: Copyright (c) 2013, Yann Collet All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Simple 2-clause BSD, also GPLv2+ compatible. Finally the xxhash code is under a similar license: xxHash Library Copyright (c) 2012-2021 Yann Collet All rights reserved. BSD 2-Clause License (https://www.opensource.org/licenses/bsd-license.php) Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. It would be good imho to include the actual license texts in the code instead of the meta/zstd statements. I don't think there is any problem including the code under these (original) licenses, but maybe we could check with upstream zstd why they add these (imho slightly misleading) license statements. So, I need to * Clarify with zstd upstream why they use three slightly different licenses * Clarify if they use GPLv2 or GPLv2+ * Add LICENSE.ZSTD, LICENSE.FSE and LICENSE.XXHASH to coregrind/m_debuginfo with licenses for zstd, FSE library and xxHash No problem, will do. > > I think with GPLv2 they mean GPLv2-only, which would be incompatible with > GPLV2+ used by Valgrind. > So we would pick the BSD-style license. In the COPYING in zstd repo there are lines: > 9. The Free Software Foundation may publish revised and/or new versions > of the General Public License from time to time. Such new versions will > be similar in spirit to the present version, but may differ in detail to > address new problems or concerns. > > Each version is given a distinguishing version number. If the Program > specifies a version number of this License which applies to it and "any > later version", you have the option of following the terms and conditions > either of that version or of any later version published by the Free > Software Foundation. If the Program does not specify a version number of > this License, you may choose any version ever published by the Free Software > Foundation. And also > This program is free software; you can redistribute it and/or modify > it under the terms of the GNU General Public License as published by > the Free Software Foundation; either version 2 of the License, or > (at your option) any later version. So zstd license is actually GPLV2+ (In reply to Mikhail Gorodetsky from comment #4) > So, I need to > * Clarify with zstd upstream why they use three slightly different licenses Yes, we do know why, the code come from three different "upstreams". Each using a slightly different BSD 2-clause vs 3-clause license. They "problem" is that they use identical header text for all three, all referencing the same LICENSE file, but that license file only contains one variant. So the question is why don't they list all three variants in that LICENSE file or have slightly different, but more accurate, copyright headers explicitly referencing the specific LICENSE.variant. > * Clarify if they use GPLv2 or GPLv2+ Yes, I'll comment on your other comment why I think they should be explicit in the header instead of relying just on the COPYING file text. > * Add LICENSE.ZSTD, LICENSE.FSE and LICENSE.XXHASH to coregrind/m_debuginfo > with licenses for zstd, FSE library and xxHash Yes, or simply add the text of those to the zstd.h file (they are short enough and imho it is slightly more convenient to just have them inline where possible). BTW. Is this zstd.h file "generated" from the upstream sources? Then we might have to adjust the (upstream) generator program (and document clearly how to regenerate to update the zstd version when needed. > No problem, will do. And my apologies. All this is kind of being pendantic. We can actually use the code since the BSD 2 and 3 clause licenses are compatible with GPLv2+. It would just be good to have proper documentation of the origin. I think including it with the license headers as they currently are is slightly misleading and might cause trouble for someone in the future who is reviewing the origins of this code and not knowing the exact background. (In reply to Mikhail Gorodetsky from comment #5) > > > > I think with GPLv2 they mean GPLv2-only, which would be incompatible with > > GPLV2+ used by Valgrind. > > So we would pick the BSD-style license. > > In the COPYING in zstd repo there are lines: > > > 9. The Free Software Foundation may publish revised and/or new versions > > of the General Public License from time to time. Such new versions will > > be similar in spirit to the present version, but may differ in detail to > > address new problems or concerns. > > > > Each version is given a distinguishing version number. If the Program > > specifies a version number of this License which applies to it and "any > > later version", you have the option of following the terms and conditions > > either of that version or of any later version published by the Free > > Software Foundation. If the Program does not specify a version number of > > this License, you may choose any version ever published by the Free Software > > Foundation. > > And also > > > This program is free software; you can redistribute it and/or modify > > it under the terms of the GNU General Public License as published by > > the Free Software Foundation; either version 2 of the License, or > > (at your option) any later version. > > > So zstd license is actually GPLV2+ I don't think you can conclude that from the COPYING file itself. The COPYING file is the standard GPLv2 license text. Note that the first part says "If the Program specifies a version number of this License which applies to it and "any later version", ..." But the header only says "GPLv2 found in the COPYING file" so it doesn't add the words "any later version". And the second quote is from the example of how to apply the licenses. But they clearly didn't follow that because the header text they use is slightly different (again doesn't include "any later version"). It might still be that they actually mean GPLv2+, but it would be good if they explicitly state that. > BTW. Is this zstd.h file "generated" from the upstream sources? Then we might have to adjust the (upstream) generator program (and document clearly how to regenerate to update the zstd version when needed. Initially, zstd.h indeed generated here: https://github.com/facebook/zstd/tree/dev/build/single_file_libs . But then all malloc, free and such are replaced with valgrind counterparts, manually. > Yes, or simply add the text of those to the zstd.h file (they are short enough and imho it is slightly more convenient to just have them inline where possible). So, I should add GPLv2 and BSD licenses text at the beginning of zstd.h and probably zstd.c, and above them a short instruction how to regenerate the files in the futute, and replace "found in the <FILENAME> file in the root directory of this source tree" with "found at the beginning of this file". Did I miss anything? Is there any hope of this progressing in near months? We just managed to get freedesktop-sdk (common flatpak base runtime) debuginfo several hundred megabytes smaller by switching to zstd but noticed valgrind doesn't yet support zstd which means we most likely have to revert and switch back to gzip. This affects many parties including KDE flatpak runtime. Obviously this affects all KDE flatpak apps as well. I'm taking another look. (In reply to Seppo Yli-Olli from comment #9) As a nit, that would be zlib, not gzip. Thank you Paul and Mikhail! I've made a couple of small fixes for the regtest. For the license I copied the BSD license into coregrind/m_debuginfo/zstddeclib.c I've also added a description of how to clone the zstd repo and generate zstddeclib.c in README_DEVELOPERS and a brief reference to it in docs/internals/release-HOWTO.txt I've pushed it to a sourceware git try branch - users/paulf/try-bug469782 Mark - is this OK now for the license? Sam, Yeppo - if you want to test it, clone https://sourceware.org/git/valgrind.git and then checkout the above mentioned branch. I see that there are a couple of new warnings and the new regtest files missing in EXTRA_DIST. I'll clean those up before proceeding. The warnings (both C and Makefile.am) should now be fixed. I've repushed to the same try branch, and buildbot seems happy. With users/paulf/try-bug469782 83be84dfafd9372c21d2653c8a48f53fe5383414 I tried to detect leaks with /bin/true and /usr/bin/python, there are no longer complaints unable to load libc debuginfo due to unsupported debuginfo compression and reports say no leaks detected. Thanks for the patch! commit 309d96d3e4f17930da28d76f6163957849553140 (HEAD -> master, origin/master, origin/HEAD) Author: Mikhail Gorodetsky <mihmih1989@gmail.com> Date: Sun Nov 17 21:14:17 2024 +0100 Bug 469782 - Valgrind does not support zstd-compressed debug sections |