Bug 280446 - [patch] Determining MIME type of corrupted files hangs on repeated reads
Summary: [patch] Determining MIME type of corrupted files hangs on repeated reads
Status: RESOLVED FIXED
Alias: None
Product: kdelibs
Classification: Unmaintained
Component: kdecore (show other bugs)
Version: 4.7
Platform: Gentoo Packages Linux
: NOR normal
Target Milestone: ---
Assignee: kdelibs bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-19 21:13 UTC by Miroslav Ľos
Modified: 2011-10-19 07:37 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 4.7.3
Sentry Crash Report:


Attachments
proposed patch (1.47 KB, patch)
2011-08-19 21:28 UTC, Miroslav Ľos
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Miroslav Ľos 2011-08-19 21:13:26 UTC
Version:           4.7 (using KDE 4.7.0) 
OS:                Linux

 If KMimeTypeRepository::findFromContent tries to determine MIME from a file
that cannot be read, such as on a corrupted optical disc, a read attempt is
made in KMimeMagicMatch::match for every available rule, resulting in UI
hangs (e.g. file dialogs, dolphin) for tens of minutes.


Reproducible: Always

Steps to Reproduce:
List a directory containing corrupted files in e.g. Dolphin.

Actual Results:  
UI becomes unresponsive for a very long time, usually needing to kill the app.

Expected Results:  
Applications should stay responsive.

A typical strace then looks like:

open("/media/Fedora-15-x86_64-Live-KDE.iso/isolinux/initrd0.img", O_RDONLY|O_CLOEXEC) = 23
fcntl(23, F_SETFD, FD_CLOEXEC)          = 0
fstat(23, {st_mode=S_IFREG|0444, st_size=9577989, ...}) = 0
fstat(23, {st_mode=S_IFREG|0444, st_size=9577989, ...}) = 0
lseek(23, 2089, SEEK_SET)               = 2089
fstat(23, {st_mode=S_IFREG|0444, st_size=9577989, ...}) = 0
read(23, 0xcaaa30, 16384)               = -1 EIO (Input/output error)
read(23, 0xca9c08, 10)                  = -1 EIO (Input/output error)
lseek(23, 0, SEEK_SET)                  = 0
read(23, 0xcaaa30, 16384)               = -1 EIO (Input/output error)
read(23, 0xc9ffe8, 2)                   = -1 EIO (Input/output error)
lseek(23, 0, SEEK_SET)                  = 0
read(23, 0xcaaa30, 16384)               = -1 EIO (Input/output error)
read(23, 0xc9ffe8, 3)                   = -1 EIO (Input/output error)

(repeated hundreds of times for each file)

As KMimeMagicMatch::match returns a bool and cannot easily signal such a
broken file, I alleviated this problem by pre-reading enough data before
matching any rules. 

The maximum amount of data ever needed were the first 1029 bytes of each file, 
so I read 2K, leaving a buffer for future rules. 
I also had the file opened unbuffered in both KMimeType::findByUrlHelper and 
KMimeType::findByFileContent. 
It was trying to needlesly read whole 16K for the buffer, increasing the chance 
for the EIO (as some files fail after the 2K needed).

If the pre-read fails, matching by content fails also. This cuts the EIO's to
at most one per corrupted file, shrinking the wait to mere seconds.

For the proposed fix see attached patch (changing 2 files in kdecore/services).
Comment 1 Miroslav Ľos 2011-08-19 21:28:15 UTC
Created attachment 62988 [details]
proposed patch

Gentoo won't tether. Android browser won't upload files.

Waiting times after my fix are comparable to those in Nautilus (Ubuntu).
Comment 2 Peter Penz 2011-08-19 22:59:56 UTC
Thanks a lot for the patch! Could you please put the patch to https://git.reviewboard.kde.org and put faure@kde.org to the recipients-list? I'm quite sure that this patch will be "forgotten" here in bugs.kde.org within hundreds of other reports - on git.reviewboard.kde.org it usually takes only a few days until the patch can get committed.
Comment 3 Miroslav Ľos 2011-08-20 16:28:32 UTC
Well, I registered and created a review request (102382), but uploading a patch to the system seems to require the correct commit hashes, which in turn requires a git clone of the lot of kdelibs, which is not so great over my current connection.

I believe Mr. Faure could apply the patch as-is in his environment. I have confirmed it will apply cleanly to the current master. Sorry for the inconvenience (mine would be greater).
Comment 4 Peter Penz 2011-08-20 17:22:52 UTC
No problem, I've created a review request on https://git.reviewboard.kde.org/r/102391/ and will take care to push it if David found the time to review it.
Comment 5 David Faure 2011-08-21 10:09:29 UTC
Thanks. I have commented in the review request.

Would the problem be fixed for you without the use of Unbuffered?

(Alternatively we could make sure to always update the cache, in the code, when reading data after the current cache size, and then we could use Unbuffered)
Comment 6 Peter Penz 2011-09-01 17:53:33 UTC
Git commit 880a4e8512343462353f69ddecb63ae1748b9902 by Peter Penz.
Committed on 20/08/2011 at 19:14.
Pushed by ppenz into branch 'KDE/4.7'.

Don't hang when determining MIME type of corrupted files

The patch has been provided by Miroslav Ľos.

BUG: 280446

M  +8    -0    kdecore/services/kmimetyperepository.cpp
M  +2    -2    kdecore/services/kmimetype.cpp

http://commits.kde.org/kdelibs/880a4e8512343462353f69ddecb63ae1748b9902
Comment 7 Peter Penz 2011-09-01 19:17:22 UTC
Git commit 7705b25883c76affcc608e157e0d003de98ad531 by Peter Penz.
Committed on 20/08/2011 at 19:14.
Pushed by ppenz into branch 'KDE/4.7'.

Don't hang when determining MIME type of corrupted files

The patch has been provided by Miroslav Ľos.

BUG: 280446

M  +8    -0    kdecore/services/kmimetyperepository.cpp
M  +2    -2    kdecore/services/kmimetype.cpp

http://commits.kde.org/kdelibs/7705b25883c76affcc608e157e0d003de98ad531
Comment 8 Peter Penz 2011-09-01 19:28:37 UTC
Sorry, I accidentally have pushed the patch. I had some troubles with the history then in git (-> 2 commits listed here) but the patch is reverted now.
Comment 9 David Faure 2011-10-19 07:37:47 UTC
Git commit 6606337b9bdf98111276d0fd3d803a644462cb28 by David Faure.
Committed on 19/10/2011 at 09:35.
Pushed by dfaure into branch 'KDE/4.7'.

Read the first 16K of the file upfront, to minimize seeks and reads.

Patch by Miroslav Ľos - thanks!

BUG: 280446
FIXED-IN: 4.7.3
REVIEW: 102391

M  +8    -0    kdecore/services/kmimetyperepository.cpp

http://commits.kde.org/kdelibs/6606337b9bdf98111276d0fd3d803a644462cb28