Bug 346804

Summary: kmimetypefinder incorrectly detects some ISO files as 'text/plain' type, while other ISO files are detected correctly
Product: [Unmaintained] kdelibs Reporter: i.Dark_Templar <idarktemplar>
Component: general Assignee: kdelibs bugs <kdelibs-bugs>
Status: RESOLVED DUPLICATE    
Severity: normal CC: rdieter
Priority: NOR    
Version: 4.14.0   
Target Milestone: ---   
Platform: Gentoo Packages   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description i.Dark_Templar 2015-04-27 17:21:41 UTC
Some of the .iso images I have are detected in KDE as 'text/plain' instead of 'application/x-cd-image' or similar. KDE version is 4.14.3. The problem doesn't appear in pcmanfm-qt.

$ LC_ALL=C ls -lah *.iso
-rw-r--r-- 1 user group 588M Mar 27 12:52 FreeBSD-10.1-RELEASE-i386-disc1.iso
-rw-r--r-- 1 user group 158M Mar 19 11:43 FreeBSD-9.3-RELEASE-i386-bootonly.iso
-rw-r--r-- 1 user group 182M Oct  4  2014 install-x86-minimal-20140930.iso
-rw-r--r-- 1 user group 2.8G Oct  4  2014 livedvd-x86-amd64-32ul-20140826.iso
-rw-r--r-- 1 user group 916M Jan 24 20:47 xubuntu-14.04.1-desktop-i386.iso

$ file --mime-type *.iso
FreeBSD-10.1-RELEASE-i386-disc1.iso:   application/x-iso9660-image
FreeBSD-9.3-RELEASE-i386-bootonly.iso: application/x-iso9660-image
install-x86-minimal-20140930.iso:      application/x-iso9660-image
livedvd-x86-amd64-32ul-20140826.iso:   application/x-iso9660-image
xubuntu-14.04.1-desktop-i386.iso:      application/x-iso9660-image

$ mimetype *.iso
FreeBSD-10.1-RELEASE-i386-disc1.iso:   application/x-gamecube-rom
FreeBSD-9.3-RELEASE-i386-bootonly.iso: application/x-gamecube-rom
install-x86-minimal-20140930.iso:      application/x-gamecube-rom
livedvd-x86-amd64-32ul-20140826.iso:   application/x-gamecube-rom
xubuntu-14.04.1-desktop-i386.iso:      application/x-gamecube-rom

$ for i in *.iso ; do echo -n -e "$i:\t" ; kmimetypefinder $i ; done
FreeBSD-10.1-RELEASE-i386-disc1.iso:    application/x-cd-image
(accuracy 20)
FreeBSD-9.3-RELEASE-i386-bootonly.iso:  application/x-cd-image
(accuracy 20)
install-x86-minimal-20140930.iso:       text/plain
(accuracy 5)
livedvd-x86-amd64-32ul-20140826.iso:    application/x-cd-image
(accuracy 20)
xubuntu-14.04.1-desktop-i386.iso:       text/plain
(accuracy 5)

In a KDE session:
$ for i in *.iso ; do echo -n -e "$i:\t" ; xdg-mime query filetype $i ; done
FreeBSD-10.1-RELEASE-i386-disc1.iso:    application/x-cd-image
FreeBSD-9.3-RELEASE-i386-bootonly.iso:  application/x-cd-image
install-x86-minimal-20140930.iso:       text/plain
livedvd-x86-amd64-32ul-20140826.iso:    application/x-cd-image
xubuntu-14.04.1-desktop-i386.iso:       text/plain

Outside of a KDE session, when dev-perl/File-MimeInfo is installed (it provides the mimetype executable):
$ for i in *.iso ; do echo -n -e "$i:\t" ; xdg-mime query filetype $i ; done
FreeBSD-10.1-RELEASE-i386-disc1.iso:   application/x-gamecube-rom
FreeBSD-9.3-RELEASE-i386-bootonly.iso: application/x-gamecube-rom
install-x86-minimal-20140930.iso:      application/x-gamecube-rom
livedvd-x86-amd64-32ul-20140826.iso:   application/x-gamecube-rom
xubuntu-14.04.1-desktop-i386.iso:      application/x-gamecube-rom

Outside of a KDE session, when dev-perl/File-MimeInfo is not installed:
$ for i in *.iso ; do echo -n -e "$i:\t" ; xdg-mime query filetype $i ; done
FreeBSD-10.1-RELEASE-i386-disc1.iso:   application/x-iso9660-image
FreeBSD-9.3-RELEASE-i386-bootonly.iso: application/x-iso9660-image
install-x86-minimal-20140930.iso:      application/x-iso9660-image
livedvd-x86-amd64-32ul-20140826.iso:   application/x-iso9660-image
xubuntu-14.04.1-desktop-i386.iso:      application/x-iso9660-image

The problem appears in KDE sessions: xdg-open calls xdg-mime, which there uses kmimetypefinder, and kmimetypefinder incorrectly recognizes some of the ISO images as text files.

The images differ in some ways, which may explain this strange behaviour of kmimetypefinder. The FreeBSD images look like plain ISO files, while the other three are hybrid ones, and the partition tables of the xubuntu*.iso and install*.iso images look similar.

$ for i in *.iso ; do echo -e "\nFILE $i:\t" ; /sbin/fdisk -l $i ; done

FILE FreeBSD-10.1-RELEASE-i386-disc1.iso:

Disk FreeBSD-10.1-RELEASE-i386-disc1.iso: 587,4 MiB, 615966720 bytes, 1203060 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

FILE FreeBSD-9.3-RELEASE-i386-bootonly.iso:

Disk FreeBSD-9.3-RELEASE-i386-bootonly.iso: 157,3 MiB, 164935680 bytes, 322140 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

FILE install-x86-minimal-20140930.iso:

Disk install-x86-minimal-20140930.iso: 182 MiB, 190840832 bytes, 372736 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x653af967

Device                            Boot Start    End Sectors  Size Id Type
install-x86-minimal-20140930.iso1 *        0 372735  372736  182M 17 Hidden HPFS/NTFS


FILE livedvd-x86-amd64-32ul-20140826.iso:

Disk livedvd-x86-amd64-32ul-20140826.iso: 2,8 GiB, 2944650240 bytes, 5751270 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x133381df

Device                               Boot Start     End Sectors  Size Id Type
livedvd-x86-amd64-32ul-20140826.iso1 *        0 5751269 5751270  2,8G  0 Empty
livedvd-x86-amd64-32ul-20140826.iso2        256   16639   16384    8M ef EFI (FAT-12/16/32)
livedvd-x86-amd64-32ul-20140826.iso3      16640   82174   65535   32M  0 Empty


FILE xubuntu-14.04.1-desktop-i386.iso:

Disk xubuntu-14.04.1-desktop-i386.iso: 916 MiB, 960495616 bytes, 1875968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x7d2105db

Device                            Boot Start     End Sectors  Size Id Type
xubuntu-14.04.1-desktop-i386.iso1 *       64 1875967 1875904  916M 17 Hidden HPFS/NTFS 
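
Whatever the first sector contains, a hybrid image should still carry an ISO 9660 volume descriptor further into the file. A minimal sketch of checking for it, assuming the standard layout in which the identifier "CD001" sits at byte offset 32769 (one byte into sector 16, counting 2048-byte sectors). The function name has_iso9660_signature is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper, not part of any KDE or xdg tool.
# Succeeds if the file has the ISO 9660 identifier "CD001" at offset 32769.
has_iso9660_signature() {
    # Read exactly 5 bytes at the expected offset; strip NULs so the shell
    # does not complain when that region is zero-filled.
    sig=$(dd if="$1" bs=1 skip=32769 count=5 2>/dev/null | tr -d '\000')
    [ "$sig" = "CD001" ]
}

# Possible usage:
#   for i in *.iso ; do has_iso9660_signature "$i" && echo "$i: ISO 9660" ; done
```

This only looks at the signature and says nothing about how kmimetypefinder or shared-mime-info actually match the type.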

Reproducible: Didn't try

Not sure if I chose the correct KDE component to report against.
Comment 1 i.Dark_Templar 2015-04-27 18:32:10 UTC
Dump of first 128 bytes of each image:
$ for i in *.iso ; do echo FILE $i ; hexdump -n 128 -C $i ; echo ; done   
FILE FreeBSD-10.1-RELEASE-i386-disc1.iso
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000080

FILE FreeBSD-9.3-RELEASE-i386-bootonly.iso
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000080

FILE install-x86-minimal-20140930.iso
00000000  33 ed 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |3...............|
00000010  90 90 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |................|
00000020  33 ed fa 8e d5 bc 00 7c  fb fc 66 31 db 66 31 c9  |3......|..f1.f1.|
00000030  66 53 66 51 06 57 8e dd  8e c5 52 be 00 7c bf 00  |fSfQ.W....R..|..|
00000040  06 b9 00 01 f3 a5 ea 4b  06 00 00 52 b4 41 bb aa  |.......K...R.A..|
00000050  55 31 c9 30 f6 f9 cd 13  72 16 81 fb 55 aa 75 10  |U1.0....r...U.u.|
00000060  83 e1 01 74 0b 66 c7 06  f1 06 b4 42 eb 15 eb 00  |...t.f.....B....|
00000070  5a 51 b4 08 cd 13 83 e1  3f 5b 51 0f b6 c6 40 50  |ZQ......?[Q...@P|
00000080

FILE livedvd-x86-amd64-32ul-20140826.iso
00000000  45 52 08 00 00 00 90 90  00 00 00 00 00 00 00 00  |ER..............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  33 ed fa 8e d5 bc 00 7c  fb fc 66 31 db 66 31 c9  |3......|..f1.f1.|
00000030  66 53 66 51 06 57 8e dd  8e c5 52 be 00 7c bf 00  |fSfQ.W....R..|..|
00000040  06 b9 00 01 f3 a5 ea 4b  06 00 00 52 b4 41 bb aa  |.......K...R.A..|
00000050  55 31 c9 30 f6 f9 cd 13  72 16 81 fb 55 aa 75 10  |U1.0....r...U.u.|
00000060  83 e1 01 74 0b 66 c7 06  f1 06 b4 42 eb 15 eb 00  |...t.f.....B....|
00000070  5a 51 b4 08 cd 13 83 e1  3f 5b 51 0f b6 c6 40 50  |ZQ......?[Q...@P|
00000080

FILE xubuntu-14.04.1-desktop-i386.iso
00000000  33 ed 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |3...............|
00000010  90 90 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |................|
00000020  33 ed fa 8e d5 bc 00 7c  fb fc 66 31 db 66 31 c9  |3......|..f1.f1.|
00000030  66 53 66 51 06 57 8e dd  8e c5 52 be 00 7c bf 00  |fSfQ.W....R..|..|
00000040  06 b9 00 01 f3 a5 ea 4b  06 00 00 52 b4 41 bb aa  |.......K...R.A..|
00000050  55 31 c9 30 f6 f9 cd 13  72 16 81 fb 55 aa 75 10  |U1.0....r...U.u.|
00000060  83 e1 01 74 0b 66 c7 06  f1 06 b4 42 eb 15 eb 00  |...t.f.....B....|
00000070  5a 51 b4 08 cd 13 83 e1  3f 5b 51 0f b6 c6 40 50  |ZQ......?[Q...@P|
00000080

I've checked the sources.
It looks like KMimeTypeRepository::findFromContent is called; it doesn't recognize some of those images and falls back to KMimeType::isBufferBinaryData, which reports them as valid text files.

Here's one possible way to fix it:
1) Make sure that KMimeType::isBufferBinaryData uses 'isgraph || isspace' (or their locale-aware versions) to check that each byte is a valid printable, space, or newline character.
2) If content-based MIME detection fails (accuracy == 0), fall back to MIME detection by file name. This may be done either inside the KMimeType class or inside the kmimetypefinder application.
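
To illustrate point 1, here is a rough sketch of a stricter text check, done from the shell rather than inside KMimeType. The 32-byte window and the name looks_binary are assumptions for illustration, not taken from the kdelibs sources:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the first 32 bytes of the
# file contain any byte that is neither printable ASCII nor whitespace.
looks_binary() {
    # LC_ALL=C makes tr work byte by byte, so control characters and all
    # bytes >= 0x80 fall outside both [:graph:] and [:space:].
    leftover=$(dd if="$1" bs=32 count=1 2>/dev/null \
        | LC_ALL=C tr -d '[:graph:][:space:]' | wc -c)
    # Arithmetic expansion drops any padding some wc implementations add.
    [ "$((leftover))" -gt 0 ]
}
```

With the dumps above, the 0xed and 0x90 bytes at the start of install-x86-minimal-20140930.iso fall outside both classes in the C locale, so a check like this would treat that image as binary.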

Here's another one:
1) Make kmimetypefinder by default obtain the MIME type from both the file name and the contents; if the content-based accuracy is greater than the filename-based accuracy, report the content MIME type, otherwise report the filename MIME type.
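
The comparison in this variant could look roughly like the following sketch. pick_mime is a hypothetical helper; it takes the two candidate types and their accuracies as plain arguments instead of querying KMimeType:

```shell
#!/bin/sh
# Hypothetical helper.
# Usage: pick_mime CONTENT_MIME CONTENT_ACCURACY NAME_MIME NAME_ACCURACY
# Prints the content-based type only when its accuracy is strictly greater
# than the filename-based one; otherwise prints the filename-based type.
pick_mime() {
    if [ "$2" -gt "$4" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$3"
    fi
}
```

For example, `pick_mime text/plain 5 application/x-cd-image 20` prints application/x-cd-image. The numbers reuse the accuracy values from the output above, on the assumption that a filename match would score 20.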

The two fixes may also be combined.
Comment 2 i.Dark_Templar 2015-04-29 09:49:09 UTC
More information:
I've just downloaded xubuntu-15.04-desktop-i386.iso from http://xubuntu.org/getxubuntu/. I guess other images from that page would give the same results.
I've also downloaded FreeBSD-10.1-RELEASE-i386-bootonly.iso from ftp://ftp.freebsd.org/pub/FreeBSD/releases/i386/i386/ISO-IMAGES/10.1/

$ file --mime-type *.iso
FreeBSD-10.1-RELEASE-i386-bootonly.iso: application/x-iso9660-image
xubuntu-15.04-desktop-i386.iso:         application/x-iso9660-image

$ for i in *.iso ; do echo -n -e "\n$i:\t" ; kmimetypefinder $i ; done

FreeBSD-10.1-RELEASE-i386-bootonly.iso: application/x-cd-image
(accuracy 20)

xubuntu-15.04-desktop-i386.iso: text/plain
(accuracy 5)

$ LC_ALL=C kmimetypefinder --version
Qt: 4.8.6
KDE Development Platform: 4.14.3
MimeType Finder: 4.14.3
Comment 3 Rex Dieter 2015-05-02 03:47:28 UTC
see bug #337035 and freedesktop.org upstream bug,
https://bugs.freedesktop.org/show_bug.cgi?id=80877

*** This bug has been marked as a duplicate of bug 337035 ***