Version: unspecified (using KDE 4.4.4)
OS: Linux

See https://bugzilla.novell.com/616252

Add

    <glob weight="60" pattern="*.xml"/>

to the application/xhtml+xml entry of freedesktop's shared MIME database 0.71. I would then expect this file:

    <?xml version="1.0" encoding="UTF-8" ?>
    <painting>
      <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
      <caption>This is Raphael's "Foligno" Madonna, painted in <date>1511</date>–<date>1512</date>. </caption>
    </painting>

to be detected as application/xml. Instead it is detected as application/xhtml+xml.

Even worse: if you don't set a weight on that glob (just add <glob pattern="*.xml"/>), XHTML files are now detected as text/html. That is OK if you only look at the content, but the algorithm documented at http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html#id2554479 would do better (see the linked openSUSE bug for a detailed explanation of what I would expect).

Is the current guessing logic documented somewhere?

Reproducible: Always
OK. The relevant code (and the comment explaining the guessing logic) is in KMimeType::findByUrlHelper.

My first example is plain wrong, sorry. With weight="60" the behavior is correct; adding the weight was my mistake. The behavior without the weight attribute is still wrong:

- XML files are detected as XHTML
- XHTML files are detected as HTML

It seems to me that the problem is that the root-XML info is ignored. Why? Isn't it also "magic"? Because it is ignored:

- For XML files: using glob there are two matches, application/xml and application/xhtml+xml. Using magic there is a single match, application/xml. Since application/xhtml+xml is a subclass of application/xml, it wins (or is it just luck? It seems the code uses the first glob match that is also a magic match, without preferring subclasses).
- For XHTML files: using glob there are two matches, application/xml and application/xhtml+xml. Using magic there are two matches, application/xml and text/html. Since text/html has the higher priority, it wins. Since none of the glob matches is a subclass of text/html, text/html itself is detected.

If root-XML were used (with a high enough priority), XHTML files would be detected correctly.

XML files would seem to require an extra condition in the logic. But the fact that the root-XML of a plain XML file doesn't match that of an XHTML file should prevent that XML file from being detected as XHTML (at least when the file has already been read from disk anyway).
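To make the two cases above concrete, here is a rough Python model of the behavior as I describe it. It is not the KMimeType code. The tables `PARENTS` and `ROOT_XML`, the function names, the `use_root_xml` switch, and the order of the glob matches are all assumptions made for this sketch. The only part taken from the MIME database is that application/xhtml+xml is a subclass of application/xml and has a root-XML rule for the `html` element in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, hand-written subset of the MIME database for this sketch.
PARENTS = {"application/xhtml+xml": "application/xml"}
ROOT_XML = {"application/xhtml+xml": (XHTML_NS, "html")}


def is_a(mime, ancestor):
    """True if mime equals ancestor or inherits from it."""
    while mime is not None:
        if mime == ancestor:
            return True
        mime = PARENTS.get(mime)
    return False


def root_xml_matches(data):
    """MIME types whose root-XML rule matches the document element."""
    try:
        tag = ET.fromstring(data).tag
    except ET.ParseError:
        return []
    # ElementTree spells namespaced tags as "{namespace}localName".
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
    else:
        ns, local = "", tag
    return [m for m, rule in ROOT_XML.items() if rule == (ns, local)]


def guess(globs, magic, data=None, use_root_xml=False):
    """globs: glob matches in lookup order; magic: content matches, best first."""
    content = list(magic)
    if use_root_xml and data is not None:
        hits = root_xml_matches(data)
        # Proposed change: treat a root-XML hit as the best content match ...
        content = hits + content
        # ... and drop glob candidates whose root-XML rule did not match.
        globs = [g for g in globs if g not in ROOT_XML or g in hits]
    if not content:
        return globs[0] if globs else "application/octet-stream"
    best = content[0]
    for g in globs:
        if is_a(g, best):  # first glob match compatible with the content match
            return g
    return best


GLOBS = ["application/xhtml+xml", "application/xml"]  # assumed order
XML_DOC = b"<painting><caption>Foligno Madonna</caption></painting>"
XHTML_DOC = b'<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'
XML_MAGIC = ["application/xml"]
XHTML_MAGIC = ["text/html", "application/xml"]

print(guess(GLOBS, XML_MAGIC))
print(guess(GLOBS, XHTML_MAGIC))
print(guess(GLOBS, XML_MAGIC, XML_DOC, use_root_xml=True))
print(guess(GLOBS, XHTML_MAGIC, XHTML_DOC, use_root_xml=True))
```

With `use_root_xml` off, the model reproduces both wrong results from the list above. With it on, the XHTML file resolves to application/xhtml+xml and the plain XML file to application/xml. Note that the first wrong result depends on the assumed glob order, which is exactly the "or is it just luck?" question.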
Dear Bug Submitter, This bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? I am setting the status to NEEDSINFO pending your response, please change the Status back to REPORTED when you respond. Thank you for helping us make KDE software even better for everyone!
Dear Bug Submitter, This is a reminder that this bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? This bug will be moved back to REPORTED Status for manual review later, which may take a while. If you are able to, please lend us a hand. Thank you for helping us make KDE software even better for everyone!
Thank you for reporting this issue in KDE software. As it has been a while since this issue was reported, can we please ask you to see if you can reproduce the issue with a recent software version? If you can reproduce the issue, please change the status to "REPORTED" when replying. Thank you!
Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed. Thank you for helping us make KDE software even better for everyone!
This bug has been in NEEDSINFO status with no change for at least 30 days. The bug is now closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging Thank you for helping us make KDE software even better for everyone!