Bug 242777 - MIME type detection fails
Status: RESOLVED WORKSFORME
Alias: None
Product: kdelibs
Classification: Frameworks and Libraries
Component: kdecore
Version: unspecified
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdelibs bugs
Reported: 2010-06-25 13:07 UTC by Christian Morales Vega
Modified: 2023-01-17 05:17 UTC
Description Christian Morales Vega 2010-06-25 13:07:24 UTC
Version:           unspecified (using KDE 4.4.4) 
OS:                Linux

See https://bugzilla.novell.com/616252

If you add <glob weight="60" pattern="*.xml"/> to the application/xhtml+xml entry in freedesktop.org's shared MIME database 0.71, I would expect this file:
<?xml version="1.0" encoding="UTF-8" ?>
<painting>
  <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
  <caption>This is Raphael's "Foligno" Madonna, painted in
    <date>1511</date>–<date>1512</date>.
  </caption>
</painting>

to be detected as application/xml. Instead it's detected as application/xhtml+xml.

Even worse: if you don't set a weight on that glob (just add <glob pattern="*.xml"/>), XHTML files are now detected as text/html. That's fine if you only look at the content, but the algorithm documented at http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html#id2554479 would do better (see the linked openSUSE bug for a detailed explanation of what I would expect).

Is the current guessing logic documented somewhere?


Reproducible: Always
Comment 1 Christian Morales Vega 2010-06-26 13:04:52 UTC
OK. The relevant code (and comment explaining the guessing logic) is at KMimeType::findByUrlHelper.

Then... my first example is plain wrong, sorry. With weight="60" the behavior is correct; adding the weight was my mistake. Still, the behavior without the weight attribute is wrong:
- XML files are detected as XHTML
- XHTML files are detected as HTML

It seems to me that the problem is that the root-XML info is ignored. Why? Isn't it also "magic"?
Because of that:
- For XML files: using globs there are two matches, application/xml and application/xhtml+xml. Using magic there is a single match, application/xml. Since application/xhtml+xml is a subclass of application/xml, it wins (or is it just luck? It seems the code uses the first glob match that is also a magic match, not preferring subclasses).
- For XHTML files: using globs there are two matches, application/xml and application/xhtml+xml. Using magic there are two matches, application/xml and text/html. Since text/html has the higher priority, it wins. Since none of the glob matches is a subclass of text/html, text/html itself is detected.
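
To make the two cases above easier to follow, here is a minimal Python sketch of the selection logic as I understand it from the behavior. It is not the KMimeType::findByUrlHelper code. The names PARENTS, is_a and guess are hypothetical, the magic priorities are illustrative numbers, and the order of the glob matches is an assumption chosen because it reproduces what I observed.

```python
# Hypothetical model of the observed behaviour; NOT the real KMimeType code.
# Subclass relations between the three types discussed in this report.
PARENTS = {
    "application/xhtml+xml": ["application/xml"],
    "application/xml": [],
    "text/html": [],
}


def is_a(mime, ancestor):
    """Return True if `mime` equals `ancestor` or inherits from it."""
    if mime == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS.get(mime, []))


def guess(glob_matches, magic_matches):
    """Pick a type from glob names (in lookup order) and (priority, name) magic pairs."""
    if not magic_matches:
        return glob_matches[0] if glob_matches else None
    # The magic match with the highest priority is the reference type.
    _, best_magic = max(magic_matches)
    # The first glob match that is the magic type, or a subclass of it, wins.
    for candidate in glob_matches:
        if is_a(candidate, best_magic):
            return candidate
    # No glob match agrees with the magic result, so the magic type is used.
    return best_magic


# Assumed glob order; the priorities 50 and 60 are made up for illustration.
GLOBS = ["application/xhtml+xml", "application/xml"]
xml_result = guess(GLOBS, [(50, "application/xml")])
xhtml_result = guess(GLOBS, [(50, "application/xml"), (60, "text/html")])
print(xml_result)
print(xhtml_result)
```

With these assumptions the plain XML file comes out as application/xhtml+xml and the XHTML file as text/html, which are the two wrong results listed above. If the glob matches were looked up in the opposite order, the first case would come out as application/xml, which is why I suspect it may be luck.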

If root-XML were used (with a high enough priority) XHTML files would have been correctly detected.
About XML files: it seems that would require an extra condition in the logic. But the fact that the root-XML of an XML file doesn't match that of an XHTML file should prevent that XML file from being detected as XHTML (at least when the file contents have already been read from disk).
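
As a rough illustration of the root-XML check I have in mind, the Python sketch below reads the root element of a document and compares its namespace and local name against the pair that, to my knowledge, the shared MIME database declares for application/xhtml+xml. The function name root_xml and the two sample documents are hypothetical, and the XHTML_ROOT value should be treated as an assumption to verify against the database.

```python
import xml.etree.ElementTree as ET

# Assumed root-XML pair (namespaceURI, localName) for application/xhtml+xml.
XHTML_ROOT = ("http://www.w3.org/1999/xhtml", "html")


def root_xml(data):
    """Return (namespaceURI, localName) of the root element of `data` (bytes)."""
    tag = ET.fromstring(data).tag
    # ElementTree encodes a namespaced tag as "{namespace}localname".
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


# Shortened version of the example file from the description.
PAINTING = b"""<?xml version="1.0" encoding="UTF-8" ?>
<painting>
  <img src="madonna.jpg" alt="Foligno Madonna, by Raphael"/>
</painting>
"""

# Minimal XHTML document used only for comparison.
XHTML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body/>
</html>
"""

print(root_xml(PAINTING) == XHTML_ROOT)
print(root_xml(XHTML) == XHTML_ROOT)
```

The painting file has a root element without a namespace, so it does not match the XHTML pair, while the XHTML document does. A check like this would only be possible once the file contents are available, as noted above.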
Comment 2 Andrew Crouthamel 2018-11-05 03:16:48 UTC
Dear Bug Submitter,

This bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? I am setting the status to NEEDSINFO pending your response, please change the Status back to REPORTED when you respond.

Thank you for helping us make KDE software even better for everyone!
Comment 3 Andrew Crouthamel 2018-11-17 05:01:10 UTC
Dear Bug Submitter,

This is a reminder that this bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? This bug will be moved back to REPORTED Status for manual review later, which may take a while. If you are able to, please lend us a hand.

Thank you for helping us make KDE software even better for everyone!
Comment 4 Justin Zobel 2022-12-18 08:18:29 UTC
Thank you for reporting this issue in KDE software. As it has been a while since this issue was reported, can we please ask you to see if you can reproduce the issue with a recent software version?

If you can reproduce the issue, please change the status to "REPORTED" when replying. Thank you!
Comment 5 Bug Janitor Service 2023-01-02 05:26:19 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 6 Bug Janitor Service 2023-01-17 05:17:43 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!