Bug 188355

Summary: (probably) undesirable file type check through file extension
Product: [Frameworks and Libraries] kdelibs
Reporter: David <spamaccountmeister>
Component: general
Assignee: kdelibs bugs <kdelibs-bugs>
Status: RESOLVED FIXED
Severity: normal
CC: andresbajotierra, faure, niels.misc, Regnaron, rrh, tanuva
Priority: NOR
Version: SVN
Target Milestone: ---
Platform: unspecified
OS: Linux
Latest Commit:
Version Fixed In:
Attachments: File attachment

Description David 2009-03-28 20:21:14 UTC
Version:           1.2.1 (using 4.2.1 (KDE 4.2.1), Gentoo)
Compiler:          i686-pc-linux-gnu-gcc
OS:                Linux (i686) release 2.6.29-gentoo

I'm sorry to report this as a bug when it probably isn't one. I'll try to be as clear as possible.

Typically in KDE 3.5, correct me if I'm wrong, a text file would be associated with a text editor (kate, kwrite, whatever) even if it had no file extension. It has come to my attention that this no longer occurs in KDE 4.2.1.

I assume this is a bug since, if I change any file extension to, for instance, ".pdf", okular will be the program trying to open the file.

I'm not sure if this is the expected behavior, but it reminds me of an old exploit for (cough cough) windows...

Once again, I'm sorry if this is not a bug and/or not the right place to report it.

Cheers
Comment 1 Peter Penz 2009-04-05 21:12:21 UTC
Thanks for the report. The file extension has a higher priority than the content of a file. So if you rename a text file to a .pdf file, Okular will get opened, and this is the expected behavior. Only if there is no file extension will KDE try to guess the file type by looking into the content (this is quite expensive, as the file needs to get opened twice: once for determining the content, a second time by the application that should show the file). So if you have a text file without an extension, the text editor (kwrite, ...) will still get opened. The current behavior is equal to the behavior of KDE 3. I'll close this bug as WORKSFORME (please let me know if I misunderstood your request).
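
For illustration only, the lookup order described above could be sketched roughly as follows. This is not the kdelibs code: the function name guessMimeType, the tiny extension table, and the content checks are all assumptions made up for this sketch. It only shows that a known extension wins outright and that the content is inspected only when no known extension is present.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not a kdelibs API): returns a MIME type name.
// A known extension takes priority; content is only inspected as a fallback.
std::string guessMimeType(const std::string &fileName, const std::string &content)
{
    // Illustrative extension table; a real database is far larger.
    static const std::map<std::string, std::string> byExtension = {
        {"pdf", "application/pdf"},
        {"txt", "text/plain"},
    };

    // Step 1: look at the extension, if there is one.
    const std::string::size_type dot = fileName.rfind('.');
    if (dot != std::string::npos && dot + 1 < fileName.size()) {
        const auto it = byExtension.find(fileName.substr(dot + 1));
        if (it != byExtension.end())
            return it->second; // extension wins, content is never examined
    }

    // Step 2: no known extension, so fall back to sniffing the content.
    if (content.compare(0, 5, "%PDF-") == 0)
        return "application/pdf";

    // Simplified text check: any control character other than tab, newline
    // or carriage return marks the data as binary.
    for (const char ch : content) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 32 && c != '\t' && c != '\n' && c != '\r')
            return "application/octet-stream";
    }
    return "text/plain";
}
```

With this sketch, a file named "notes.pdf" containing plain text is reported as application/pdf, while the same content in a file named "notes" is reported as text/plain.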
Comment 2 David 2009-04-06 16:23:45 UTC
Hi, you understood perfectly. The only thing is that the behaviour you explain (the expected one) sometimes doesn't really happen. I experienced this with a simple text file without any extension.

Thanks for your reply
Comment 3 Peter Penz 2009-04-06 16:28:05 UTC
David, could you maybe attach this text file to the bug report? I did some tests with text files having no extension, and KWrite was opened as expected in my environment...
Comment 4 Peter Penz 2009-04-12 15:22:22 UTC
(Update: The submitter of the report sent me the text file where the autodetection does not work. I could reproduce the issue in KDE 4.2 and trunk. When using the same file in KDE 3 with Konqueror, it is recognized as a text file.)

I've added David Faure to CC. This might be an issue with the encoding used by the text file, and this report should go to kdelibs.

@David (bug-report submitter): I know the text file you've sent me contains some names of your friends (that's why it is not attached here). Would it be possible to change the names in this file and attach it here? Otherwise it will get tricky to fix this issue. Thanks!
Comment 5 David 2009-04-14 18:12:58 UTC
Created attachment 32826 [details]
File attachment

Hi again

I'm really convinced it is the encoding. If I use characters such as "é", the file type is not correctly identified.

Cheers
Comment 6 David Faure 2009-05-21 03:27:02 UTC
SVN commit 970853 by dfaure:

shared-mime-info says "note that files with high-bit-set characters should still be treated as text since these can appear in UTF-8 text, unlike control characters" and that was the intent of the code; but the 127-255 range was treated as signed, so negative, so < 32 -- so text files with accents in them weren't recognized as text.
BUG: 188355


 M  +1 -1      kmimetype.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=970853
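
To make the signedness problem from the commit message concrete, here is a minimal sketch. It is not the actual kmimetype.cpp code: both function names are hypothetical, and the set of allowed control characters (tab, newline, carriage return) is a simplification. The first variant reads each byte as a signed value, so bytes with the high bit set become negative and fail the "< 32" test; the second reads them as unsigned, which matches the intent quoted from shared-mime-info.

```cpp
#include <cstddef>

// Hypothetical illustration of the bug: bytes are interpreted as signed, so
// values 128..255 turn negative and are mistaken for control characters.
bool looksBinarySigned(const char *data, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        const signed char c = static_cast<signed char>(data[i]);
        if (c < 32 && c != '\t' && c != '\n' && c != '\r')
            return true;
    }
    return false;
}

// Hypothetical illustration of the fix: bytes are interpreted as unsigned, so
// high-bit-set bytes (as found in UTF-8 text) are no longer below 32.
bool looksBinaryUnsigned(const char *data, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        if (c < 32 && c != '\t' && c != '\n' && c != '\r')
            return true;
    }
    return false;
}
```

For the UTF-8 bytes of "café" (where "é" is encoded as 0xC3 0xA9), the signed variant reports binary data while the unsigned variant accepts the data as text.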
Comment 7 Christoph Feck 2009-05-21 17:59:27 UTC
*** Bug 173370 has been marked as a duplicate of this bug. ***
Comment 8 Christoph Feck 2009-05-21 18:19:35 UTC
*** Bug 189660 has been marked as a duplicate of this bug. ***
Comment 9 Dario Andres 2009-05-25 00:23:18 UTC
*** Bug 193958 has been marked as a duplicate of this bug. ***
Comment 10 Dario Andres 2009-06-07 16:24:03 UTC
*** Bug 124757 has been marked as a duplicate of this bug. ***
Comment 11 Christoph Feck 2009-07-02 15:25:28 UTC
SVN commit 990397 by cfeck:

Let Qt render text thumbnails

We used a tiny bitmapped font which was unfortunately lacking any
non ASCII characters. We now detect non ASCII files as text files,
so we should render them better.

This code uses the smallest readable font, but currently forces it
to a very small pixel size (similar to what the old code used).

The text encoding detection might need some improvements, e.g. use
the default locale encoding, when the prober is not very sure.

Reviewed by Peter Penz
Reviewed by Darío Andrés

See http://reviewboard.kde.org/r/760/

BUG: 169381
CCBUG: 188355


 M  +0 -1      CMakeLists.txt  
 M  +39 -112   textcreator.cpp  
 M  +0 -3      textcreator.h  
 D             thumbnailfont_7x4.png  


WebSVN link: http://websvn.kde.org/?view=rev&revision=990397