Bug 406571 - False "invalid character" while opening UTF-8 files (NULL character?)
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: part (other bugs)
Version First Reported In: unspecified
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
Reported: 2019-04-15 16:45 UTC by Daniel Klein
Modified: 2019-05-18 18:24 UTC

Attachments
Text file containing invalid characters (119 bytes, text/x-log)
2019-04-15 16:45 UTC, Daniel Klein

Description Daniel Klein 2019-04-15 16:45:09 UTC
Created attachment 119422
Text file containing invalid characters

This happens every time I open the file. Enabling read-write mode and saving the file does not solve it. Other editors (e.g. Leafpad) have no issues. A hex editor showed no special characters, and neither did grepping for NULL etc.

These are important log files I work on all day long, so I have to disable the "read-only mode" every 5 minutes for 8 hours (!).

It also happens with other Kate-based editors/frontends, like KrViewer in Krusader, so I will have to move my entire DE to GNOME (I need simple code highlighting and a minimal DE, no big editors).

This is a log file created by other servers, so I can't 'avoid using NULL'.

Adding an "edit anyway" button won't help, because the message returns after saving: Kate won't 'clean' the invalid characters, so it is a FALSE WARNING. I wish it would corrupt the file and thus end the warnings ;) But it does nothing except warn and disable editing. Repeatedly.

This behaviour is simply wrong. Control characters are not 'invalid'.

STEPS TO REPRODUCE
1. Open the file.

OBSERVED RESULT
"The file _sched.log was opened with UTF-8 encoding but contained invalid characters.
It is set to read-only mode, as saving might destroy its content.
Either reopen the file with the correct encoding chosen or enable the read-write mode again in the tools menu to be able to edit it."


EXPECTED RESULT
Allow normal editing?
Provide a button to remove or at least show the "invalid" characters?

KDE Frameworks 5.36.0
Qt 5.6.1 (built against 5.6.1)
Kate Part Version 5.36.0
Comment 1 Daniel Klein 2019-04-15 16:48:52 UTC
If it's relevant: 

Linux 4.4.0-142-generic #168-Ubuntu SMP Wed Jan 16 21:00:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.5 LTS
Release:        16.04
Codename:       xenial
Comment 2 Christoph Cullmann 2019-05-18 18:23:09 UTC
Git commit e1bb897f3e53a5778ffbab41b8a17c238d2f3ebf by Christoph Cullmann.
Committed on 18/05/2019 at 18:22.
Pushed by cullmann into branch 'master'.

improve invalid character check on loading
don't use the ConvertInvalidToNull variant + check for null chars but
check the invalidChars field of the decoder state
allows to load files with 0 bytes

M  +4    -10   src/buffer/katetextloader.h

https://commits.kde.org/ktexteditor/e1bb897f3e53a5778ffbab41b8a17c238d2f3ebf
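
The difference between the two checks named in the commit message can be illustrated with a small sketch. It uses only the C++ standard library, not Qt, and it is not the code from katetextloader.h. All names in it (DecodeResult, decodeUtf8, oldCheckFlagsFile, newCheckFlagsFile, selfTest) are hypothetical, and the decoder is deliberately simplified: it does not reject overlong forms or surrogates. It only shows why "replace invalid input with NUL, then search for NUL" misfires on a valid file that contains 0 bytes, while a counter kept by the decoder does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decoded code points plus a count of malformed sequences.
// The counter plays the role of an "invalid characters" field in a decoder state.
struct DecodeResult {
    std::u32string text;
    std::size_t invalidChars = 0;
};

// Minimal UTF-8 decoder (illustrative only, not Qt's implementation).
// Each malformed byte is replaced by U+0000 and counted.
DecodeResult decodeUtf8(const std::string &bytes)
{
    DecodeResult r;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0; // number of continuation bytes expected
        char32_t cp = 0;
        bool ok = true;
        if (b < 0x80) {
            cp = b; // plain ASCII, including a legitimate 0 byte
        } else if ((b & 0xE0) == 0xC0) {
            cp = b & 0x1F;
            extra = 1;
        } else if ((b & 0xF0) == 0xE0) {
            cp = b & 0x0F;
            extra = 2;
        } else if ((b & 0xF8) == 0xF0) {
            cp = b & 0x07;
            extra = 3;
        } else {
            ok = false; // stray continuation byte or invalid lead byte
        }
        if (ok && i + extra >= bytes.size()) {
            ok = false; // sequence is truncated at the end of the input
        }
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                ok = false;
            } else {
                cp = (cp << 6) | (c & 0x3F);
            }
        }
        if (!ok) {
            r.text.push_back(U'\0');
            ++r.invalidChars;
            ++i; // skip only the offending byte and resynchronise
            continue;
        }
        r.text.push_back(cp);
        i += extra + 1;
    }
    return r;
}

// Old-style check: look for NUL in the decoded text.
// Cannot tell a replaced invalid byte from a real 0 byte in the file.
bool oldCheckFlagsFile(const std::string &bytes)
{
    return decodeUtf8(bytes).text.find(U'\0') != std::u32string::npos;
}

// New-style check: ask the decoder how many invalid sequences it saw.
bool newCheckFlagsFile(const std::string &bytes)
{
    return decodeUtf8(bytes).invalidChars > 0;
}

// Small demonstration of both cases.
inline void selfTest()
{
    const std::string withNul("a\0b", 3);  // valid UTF-8 that contains a 0 byte
    const std::string broken("a\xFFz", 3); // 0xFF never occurs in UTF-8
    assert(oldCheckFlagsFile(withNul));    // false positive
    assert(!newCheckFlagsFile(withNul));   // correctly accepted
    assert(oldCheckFlagsFile(broken));
    assert(newCheckFlagsFile(broken));     // still rejected
}
```

Calling selfTest() from a main function exercises both inputs: the file with an embedded 0 byte is flagged only by the old-style check, and the genuinely malformed input is flagged by both.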
Comment 3 Christoph Cullmann 2019-05-18 18:24:01 UTC
Thanks for the report. The invalid character check did not handle null characters correctly, sorry.
The new code opens your file without issues.