Bug 445147

Summary: Index TMDX content
Product: [Frameworks and Libraries] frameworks-baloo Reporter: noticon
Component: general Assignee: baloo-bugs-null
Status: RESOLVED DOWNSTREAM    
Severity: wishlist CC: nate
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description noticon 2021-11-08 04:02:05 UTC
SUMMARY
Baloo won't index the content of TMDX files from SoftMaker Office, specifically TextMaker, the word processor. I assume the same is true for the rest of the office suite, such as Presentations and PlanMaker. No doubt this is because TMDX is a non-standard format, but it is a text format. I reached out to SoftMaker, and they said to reach out to you. I know, it's probably a run-around, but could you please look into this? :D

Comment 1 Nate Graham 2021-11-09 14:29:42 UTC
Can you run `file --mime-type [path to file]` on one of the files that isn't getting indexed, and paste the output here?
Comment 2 noticon 2021-11-09 14:33:57 UTC
terminal returns:
This is a test.tmdx: application/octet-stream
Comment 3 Nate Graham 2021-11-09 14:37:37 UTC
Well, that's the problem. It's actually not a text file, or at least it doesn't tell the system it's a text file. It tells the system it's binary content. And Baloo avoids indexing these.
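
To illustrate what "binary content" means here, the following is a minimal sketch of a common text-versus-binary heuristic. It is not the logic that `file`, shared-mime-info, or Baloo actually use; the function name `looks_like_text` and the rule (no NUL bytes, valid UTF-8) are assumptions chosen for illustration only.

```python
import codecs


def looks_like_text(sample: bytes) -> bool:
    """Rough heuristic (an assumption, not Baloo's real check):
    treat data as text if it has no NUL bytes and is valid UTF-8."""
    # NUL bytes almost never appear in plain text files.
    if b"\x00" in sample:
        return False
    # An incremental decoder with final=False tolerates a multi-byte
    # character cut off at the end of the sample, but still raises
    # on byte sequences that can never be valid UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_text(path: str, sample_size: int = 4096) -> bool:
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as handle:
        return looks_like_text(handle.read(sample_size))
```

Calling `file_looks_like_text("This is a test.tmdx")` on one of the affected files would give a rough second opinion on whether the content is plain text or a binary container.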

If these files really are text, then this is a bug in the file format itself. The correct solution would be one of the following:
1. Correct the error in the file structure that causes the system to mis-identify it as binary content rather than text.
2. Submit a patch to shared-mime-info (https://gitlab.freedesktop.org/xdg/shared-mime-info/) to make it aware of this file format, so that it identifies the files as some type of text file rather than a binary file.
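
For option 2, a shared-mime-info definition is an XML package file. The sketch below builds a minimal one in Python. The MIME type name `application/x-tmdx`, the comment text, and the helper `build_mime_package` are hypothetical placeholders, not names registered anywhere. The optional `parent_type` argument (for example `text/plain`) would only be appropriate if the files really are plain text.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# XML namespace used by shared-mime-info package files.
MIME_NS = "http://www.freedesktop.org/standards/shared-mime-info"


def build_mime_package(mime_type: str, comment: str, glob_pattern: str,
                       parent_type: Optional[str] = None) -> str:
    """Return a minimal shared-mime-info package as an XML string."""
    # Serialize with a default namespace instead of an ns0: prefix.
    ET.register_namespace("", MIME_NS)
    root = ET.Element(f"{{{MIME_NS}}}mime-info")
    entry = ET.SubElement(root, f"{{{MIME_NS}}}mime-type", {"type": mime_type})
    ET.SubElement(entry, f"{{{MIME_NS}}}comment").text = comment
    if parent_type is not None:
        # Declares the new type as a subclass of an existing one.
        ET.SubElement(entry, f"{{{MIME_NS}}}sub-class-of", {"type": parent_type})
    # Match files by name, here by extension.
    ET.SubElement(entry, f"{{{MIME_NS}}}glob", {"pattern": glob_pattern})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical type name and description, for illustration only.
    print(build_mime_package("application/x-tmdx", "TextMaker document", "*.tmdx"))
```

To try such a definition locally before proposing it upstream, the output can be saved as an `.xml` file under `~/.local/share/mime/packages/`, followed by running `update-mime-database ~/.local/share/mime`. Running `file --mime-type` again is not a good check, since `file` uses its own magic database. A tool that queries shared-mime-info, such as `xdg-mime query filetype`, should reflect the change.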
Comment 4 noticon 2021-11-09 14:39:05 UTC
Thanks! I'll do that.