| Summary: | Baloo attempts to index MPEG TS file as text | ||
|---|---|---|---|
| Product: | [Unmaintained] Baloo | Reporter: | Pontus Johannesson <hydrardraconis> |
| Component: | General | Assignee: | Vishesh Handa <me> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | ||
| Priority: | NOR | ||
| Version First Reported In: | 4.13 | ||
| Target Milestone: | --- | ||
| Platform: | Gentoo Packages | ||
| OS: | Linux | ||
| Latest Commit: | http://commits.kde.org/baloo/c19b7a9ded994009c49007d8336afe92acf513cd | Version Fixed/Implemented In: | 5.3.1 |
| Sentry Crash Report: | |||
Description
Pontus Johannesson
2015-03-21 21:42:58 UTC
Confirmed. This is even a problem with Qt5

Fast Mimetype: text/vnd.trolltech.linguist
Slow Mimetype: text/vnd.trolltech.linguist

The fix will probably need to go into Qt.

Git commit c19b7a9ded994009c49007d8336afe92acf513cd by Vishesh Handa.
Committed on 13/05/2015 at 14:07.
Pushed by vhanda into branch 'Plasma/5.3'.

Only use the file's content during mimetype detection

During the first indexing phase, we only use the filename as we do not want the overhead of reading the contents of the file.

During the second indexing phase, we are actually going to be indexing the contents of the file. At this time, it's perfectly fine to read the file's contents to determine the mimetype.

We were using QMimeDatabase::mimeTypeForFile with its default settings which takes both the filename and file contents into consideration. This results in interesting cases where if a file ends with '.ts' it is detected as a 'linguist' file, even though the magic byte mapping failed.

We want the mimetype to be as exact as possible. We now only use the files contents, and not the filename.

Related: bug 342312
FIXED-IN: 5.3.1

M +1 -1 src/file/extractor/app.cpp
M +2 -2 src/file/tests/indexerconfigtest.cpp

http://commits.kde.org/baloo/c19b7a9ded994009c49007d8336afe92acf513cd
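
To illustrate why the two strategies in the commit message disagree for a '.ts' file, here is a minimal standard-library sketch. It is not Baloo's or Qt's code: `mimeFromName` and `mimeFromContent` are hypothetical helpers, and the content check assumes only the well-known MPEG-TS layout (188-byte packets, each starting with the sync byte 0x47) rather than the shared-mime-info magic rules that Qt actually consults. In Qt itself, content-only matching is presumably what the `QMimeDatabase::MatchContent` mode of `mimeTypeForFile` provides; the diff is not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: guess a MIME type from the file name alone,
// the way a glob rule for "*.ts" would.
std::string mimeFromName(const std::string& name) {
    const std::string ext = ".ts";
    if (name.size() >= ext.size() &&
        name.compare(name.size() - ext.size(), ext.size(), ext) == 0) {
        return "text/vnd.trolltech.linguist";
    }
    return "application/octet-stream";
}

// Hypothetical helper: guess a MIME type from the bytes alone.
// Assumption: a stream counts as MPEG-TS if the first three 188-byte
// packets each begin with the sync byte 0x47.
std::string mimeFromContent(const std::vector<unsigned char>& data) {
    constexpr std::size_t kPacketSize = 188;
    constexpr unsigned char kSyncByte = 0x47;
    constexpr std::size_t kPacketsToCheck = 3;

    if (data.size() >= kPacketSize * kPacketsToCheck) {
        bool allSynced = true;
        for (std::size_t i = 0; i < kPacketsToCheck; ++i) {
            if (data[i * kPacketSize] != kSyncByte) {
                allSynced = false;
                break;
            }
        }
        if (allSynced) {
            return "video/mp2t";
        }
    }
    return "application/octet-stream";
}
```

For a recording named `recording.ts`, the name-based helper reports a Linguist translation file while the content-based helper reports a transport stream, which is the mismatch this report describes.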