Bug 464712 - KBackup archives cannot be restored. - tar files are corrupt
Summary: KBackup archives cannot be restored. - tar files are corrupt
Status: RESOLVED FIXED
Alias: None
Product: kbackup
Classification: Applications
Component: general
Version: 22.12.1
Platform: Neon Linux
Importance: NOR grave
Target Milestone: ---
Assignee: Martin Koller
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-23 19:08 UTC by Paul Hands
Modified: 2023-02-04 23:04 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
CLI tar tvf warning (302.66 KB, image/png)
2023-01-23 19:08 UTC, Paul Hands
Details
kbackup corruption message (66.22 KB, image/png)
2023-01-23 19:09 UTC, Paul Hands
Details

Description Paul Hands 2023-01-23 19:08:35 UTC
Created attachment 155538 [details]
CLI tar tvf warning

SUMMARY
kbackup tar archives cannot be restored - kbackup says the tar files are corrupt, and running tar tvf in a shell on the files confirms this.


STEPS TO REPRODUCE
1. Set up kbackup with a profile, run a full backup, and set up incremental backups as a cron job every 2 days
2. Attempt to restore part of a backup of .config data in the home directory

OBSERVED RESULT
Choosing an archive to restore almost always fails with the message:
"The archive you're trying to open is corrupt.
Some files may be missing or damaged."

kbackup then asks if I want to open the archive read-only, and if I do, I see that the recorded size of the archive is MUCH smaller than the tar file on the drive - 3.6 GB as opposed to 710 GB. See the attached screenshots.

EXPECTED RESULT

Restore should happen as desired.


SOFTWARE/OS VERSIONS
Operating System: KDE neon 5.26
KDE Plasma Version: 5.26.5
KDE Frameworks Version: 5.102.0
Qt Version: 5.15.8
Kernel Version: 5.15.0-58-generic (64-bit)
Graphics Platform: X11
Processors: 24 × AMD Ryzen 9 3900X 12-Core Processor
Memory: 47.0 GiB of RAM
Graphics Processor: AMD Radeon RX 6600
Manufacturer: ASUS

ADDITIONAL INFORMATION

This is a catastrophic problem... it makes kbackup worse than useless. It also let me down when I really needed to restore a backup.

More Info about the details.....
1. The backups are on a home network NAS unit which is mounted via NFS - this may be an issue!
2. The backups contain some very large binary files: ISOs of various OSs and many VirtualBox .vdi files for guest machines. These .vdi files are hundreds of GB in size. Running tar tvf on these archives almost always fails on a large binary file.
3. Running the backups (incremental or full) spends a lot of time on these .vdi files, and the kbackup GUI appears frozen. However, the backup completes with no errors after many hours.

All of this makes kbackup almost useless, and because it doesn't report any problems or do any checking on the tar files, a user is left with a false sense of security. I think kbackup needs a check (a checksum or other signature?) on the tar files, and perhaps a warning about NFS mounts, where tar may fail.
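A post-backup integrity check of the kind suggested here could be sketched as follows; `verify_archive` is a hypothetical helper written for illustration, not anything kbackup currently does:

```python
import hashlib
import tarfile

def verify_archive(path: str) -> str:
    """Hypothetical post-backup check: walk every member header
    (roughly what `tar tvf` does) and return a SHA-256 digest that
    could be stored next to the archive for later comparison.

    Raises tarfile.TarError if the archive is truncated or corrupt.
    """
    with tarfile.open(path) as tf:
        for _member in tf:  # forces every tar header to be parsed
            pass
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()
```

Storing the digest alongside the archive would let a later restore detect silent corruption before tar ever runs.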
Comment 1 Paul Hands 2023-01-23 19:09:35 UTC
Created attachment 155539 [details]
kbackup corruption message
Comment 2 Paul Hands 2023-01-23 19:11:38 UTC
I'm available for testing and can create screen capture videos as needed.
Comment 3 Martin Koller 2023-01-26 00:02:00 UTC
Found the problem.
It's a bug in the KTar implementation.
The size of a file is stored inside the tar file header as an octal-encoded string.
Strange, but obviously a very old format from the good old Unix days ;-)

The string is limited to 11 characters, which means the maximum file size is
8589934591 bytes.
Sadly, KTar has no check for an overflow in this case and simply drops the 12th
character. In my test case I have a file of ~12 GB, which is 12288000000 bytes.
That should be encoded as the string "133433000000", but KTar writes it as
"13343300000", so the size shows up as 1536000000 bytes.

In the short term this means I need to implement a check which limits the backup
to files of at most the above size, and kbackup will cancel a backup when a too-large
file is found.
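The truncation described above is easy to reproduce outside KTar. A minimal Python sketch of the failure mode (the encode helpers are hypothetical, written for illustration only, not KTar code):

```python
# Sketch of the size-field overflow described above (not KTar code).
# A ustar header stores the file size as an 11-digit octal string,
# so the largest representable size is 8**11 - 1 = 8589934591 bytes.

MAX_USTAR_SIZE = 8**11 - 1

def encode_size(size: int) -> str:
    """Encode a size the way a correct ustar writer would."""
    if size > MAX_USTAR_SIZE:
        raise OverflowError(f"{size} does not fit in 11 octal digits")
    return format(size, "011o")

def buggy_encode_size(size: int) -> str:
    """Mimic the bug: silently keep only the first 11 octal digits."""
    return format(size, "o")[:11]

size = 12_288_000_000                          # the ~12 GB test file
assert format(size, "o") == "133433000000"     # needs 12 octal digits
assert buggy_encode_size(size) == "13343300000"
assert int(buggy_encode_size(size), 8) == 1_536_000_000  # the bogus size
```

Dropping the final octal digit divides the stored size by 8, which is exactly the 12288000000 vs 1536000000 discrepancy in the comment.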
Comment 4 Paul Hands 2023-01-30 16:37:13 UTC
A thought.....

It's possible to use the "split" command to break a large file into pieces. That might be an option inside KBackup, instead of making changes to KTar... especially if you're not the maintainer. Maybe this could become a new option for KBackup - split files over a given size into chunks anyway?

There would be a bit of extra work at restore time to reassemble the pieces (xaa, xab, xac... and so on) from the split operation... but KBackup doesn't do restore, so perhaps some scripting for Ark?
Comment 5 Martin Koller 2023-01-30 16:52:13 UTC
I don't want to go that route. It would also need double the space on your disk and is probably slow.
I looked into the Ark sources, since Ark can open a 12 GB tar file when I create it on the command line.
What I found is that they are not using KTar but the libarchive library, which can handle a lot of other archive formats as well
(which I don't need in kbackup).
It seems the simplest way is to also use this library instead of reimplementing all the enhanced tar formats that libarchive
has already implemented.
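As an illustration of those enhanced formats, Python's standard tarfile module (not used by kbackup; shown here only as a reference implementation) exhibits the same split: strict ustar refuses sizes above the 11-digit octal limit, while the GNU extension encodes them in a base-256 size field:

```python
import tarfile

# A plain ustar header cannot encode sizes above 8**11 - 1 bytes;
# the GNU extension switches the 12-byte size field to base-256 instead.
info = tarfile.TarInfo("huge.vdi")
info.size = 12_288_000_000  # ~12 GB, over the ustar limit

try:
    info.tobuf(format=tarfile.USTAR_FORMAT)
except ValueError:
    print("ustar: size field overflow")  # the strict format refuses

gnu_header = info.tobuf(format=tarfile.GNU_FORMAT)
assert len(gnu_header) == 512     # still a single standard header block
assert gnu_header[124] & 0x80     # high bit flags the base-256 size encoding
```

This is essentially what libarchive brings along for free: writers that fall back to GNU or PAX encodings when a file exceeds the classic octal field.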
Comment 6 Paul Hands 2023-01-30 17:03:25 UTC
OK... I'll stop bothering you :-) I'm just trying to get a backup solution that works well in KDE. libarchive sounds like the right call.
Comment 7 Martin Koller 2023-02-04 23:04:47 UTC
Git commit 92b35cf7f4fed327d68bfc1608878f8723a679a9 by Martin Koller.
Committed on 04/02/2023 at 23:02.
Pushed by mkoller into branch 'master'.

Switch from KTar to libarchive to allow archival of files > ca. 8GB

M  +118  -98   src/Archiver.cxx
M  +21   -23   src/Archiver.hxx
M  +3    -0    src/CMakeLists.txt

https://invent.kde.org/utilities/kbackup/commit/92b35cf7f4fed327d68bfc1608878f8723a679a9