Bug 417761

Summary: kmimetypefinder5 misidentifies mimetype of python files containing certain strings
Product: [Plasma] kde-cli-tools Reporter: Nathaniel Beaver <nathanielmbeaver>
Component: general Assignee: Aleix Pol <aleixpol>
Status: RESOLVED FIXED    
Severity: normal CC: mitya57, nate
Priority: NOR    
Version: 5.12.8   
Target Milestone: ---   
Platform: Other   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=417248
Latest Commit: Version Fixed In: 5.81.80
Sentry Crash Report:
Attachments: example python script

Description Nathaniel Beaver 2020-02-16 23:15:53 UTC
SUMMARY
Python scripts with a string containing HTML can be misidentified as HTML files by kmimetypefinder5.


STEPS TO REPRODUCE
1. Create a Python file named "example.py" with the following contents:

#! /usr/bin/env python3
example_string = \
"""\
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example title</title>
  </head>
  <body>
    <p>Example body</p>
  </body>
</html>
"""
print('Hello, world!')

2. Run kmimetypefinder5 on that file.

$ kmimetypefinder5 example.py 

OBSERVED RESULT

$ kmimetypefinder5 example.py 
application/xhtml+xml

EXPECTED RESULT

$ kmimetypefinder5 example.py
text/x-python

or

$ kmimetypefinder5 example.py
text/plain
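
The steps above can be scripted. The following is a minimal sketch, not part of the original report: the helper names `write_example` and `detect_mimetype` are hypothetical, and it assumes only that `kmimetypefinder5` prints the detected MIME type on stdout, as shown in the output above. If the tool is not on PATH, the helper returns None instead of failing.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Same contents as example.py in the steps to reproduce.
# A raw string keeps the backslash line continuations verbatim.
EXAMPLE = r'''#! /usr/bin/env python3
example_string = \
"""\
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example title</title>
  </head>
  <body>
    <p>Example body</p>
  </body>
</html>
"""
print('Hello, world!')
'''


def write_example(directory: Path) -> Path:
    """Write the reproduction file into `directory` and return its path."""
    path = directory / "example.py"
    path.write_text(EXAMPLE, encoding="utf-8")
    return path


def detect_mimetype(path: Path, tool: str = "kmimetypefinder5"):
    """Run the MIME type finder on `path`.

    Returns the tool's stdout with surrounding whitespace stripped,
    or None if the tool is not installed.
    """
    exe = shutil.which(tool)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, str(path)], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = write_example(Path(tmp))
        # Affected versions report application/xhtml+xml here.
        print(detect_mimetype(script))
```

Running this on an affected system and on a fixed one gives a quick way to compare results without creating the file by hand.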


SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
KDE Plasma Version: 5.12.9
KDE Frameworks Version: 5.44.0
Qt Version: 5.9.5

ADDITIONAL INFORMATION

$ lsb_release -rd
Description:	Ubuntu 18.04.3 LTS
Release:	18.04
$ apt-cache policy kde-cli-tools
kde-cli-tools:
  Installed: 4:5.12.8-0ubuntu0.1
  Candidate: 4:5.12.8-0ubuntu0.1
  Version table:
 *** 4:5.12.8-0ubuntu0.1 500
        500 http://us.archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages
        100 /var/lib/dpkg/status
     4:5.12.4-0ubuntu1 500
        500 http://us.archive.ubuntu.com/ubuntu bionic/universe amd64 Packages

Downstream bug report:

https://bugs.launchpad.net/ubuntu/+source/kde-cli-tools/+bug/1857824
Comment 1 Nathaniel Beaver 2020-02-16 23:17:37 UTC
Created attachment 126088 [details]
example python script
Comment 2 Dmitry Shachnev 2021-05-18 18:06:22 UTC
Probably a duplicate of https://bugs.kde.org/show_bug.cgi?id=411718?

The upstream Qt fix is https://codereview.qt-project.org/c/qt/qtbase/+/328240, but as far as I can see it was not cherry-picked to the kde/5.15 branch.
Comment 3 Christian David 2023-11-21 19:42:50 UTC
I just tried the example file with kde-cli-tools commit 1e68eb303b42a05e613020519d97189c622d78aa, and the result was `text/x-python`. I assume the upstream changes were added in the meantime.

The kmimetypefinder built from source reports its version as `5.81.80`, so I'll use that as the Version Fixed In.