Summary: | make katepart add BOM mark into the beginning of UTF-8 files | ||
---|---|---|---|
Product: | [Applications] kate | Reporter: | Nick Shaforostoff <shafff> |
Component: | general | Assignee: | KWrite Developers <kwrite-bugs-null> |
Status: | RESOLVED INTENTIONAL | ||
Severity: | wishlist | CC: | christoph |
Priority: | NOR | ||
Version First Reported In: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Debian testing | ||
OS: | Unspecified | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | file produced by kate right now; file with exactly same text, but reencoded by notepad | ||
Description
Nick Shaforostoff
2010-05-04 10:56:06 UTC
Created attachment 43216
file produced by kate right now
Created attachment 43217
file with exactly same text, but reencoded by notepad
Note that the Notepad-reencoded file is readable by both sides.
A BOM would also allow us to always autodetect UTF-8 encoding.
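To illustrate the autodetection point, here is a minimal Python sketch of how a reader could use the BOM as an encoding hint. The helper names `has_utf8_bom` and `guess_encoding` are hypothetical and are not part of kate or katepart; the fallback for BOM-less data is only one possible heuristic.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def guess_encoding(data: bytes) -> str:
    """Hypothetical helper: guess a Python codec name for the given bytes."""
    if has_utf8_bom(data):
        # The BOM is an explicit marker, so no guessing is needed.
        return "utf-8-sig"
    try:
        # Without a BOM we can only check that the bytes are valid UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


print(guess_encoding(codecs.BOM_UTF8 + b"hello"))  # utf-8-sig
print(guess_encoding("привет".encode("utf-8")))    # utf-8
print(guess_encoding(b"\xff\xfe"))                 # unknown
```

Without the BOM, the check can only say that the bytes happen to be valid UTF-8, which is a guess rather than a declaration.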
Hey Nick! Please try the same in KDE 4.4 or, better yet, 4.5; the latter especially saw quite some improvements in encoding support. Furthermore, a BOM should never be inserted by default: many scripting languages have problems with it (e.g. PHP), and AFAIK even some XML parsers.

There is one positive change from 4.3 to 4.4: in 4.4, if I open any of the *.txt files and save it under a different name, the result is equal to a binary copy (i.e. everything is preserved). In 4.3, kwrite removes the BOM. But if I open win-utf8.txt and copy-paste it into a new kwrite window, then save it, it creates a file equal to kate-utf8.txt.

But I would like to have an option to explicitly specify the encoding style for files, i.e. Windows-friendly or simple (command-line friendly). It would also be cool to have it automatically select Windows-friendly for certain types of files, for example .srt files.

Tools -> Add Byte Order Mark (BOM)? I don't know if there is a corresponding modeline.

I thought about extending the file save dialog. It already allows us to specify the encoding, so why not extend it with another option?

What if we add a BOM to a file only if it has a .txt extension? This way no XML files will be harmed, and we'll get nice interoperability with OS X and Windows.

Extending the file dialog is not so easy, as the QFileDialog API in Qt5 must support that first somehow. You can already add a BOM now by specifying it in the filetype through a variable: http://docs.kde.org/stable/en/applications/kate/config-variables.html#variable-byte-order-marker Besides that, are you proposing to add a BOM by default to Unicode-encoded files?

"are you proposing to add a BOM by default to unicode encoded files?" Yes, but only for those that get saved with a .txt extension (so we get around the mentioned use case of broken XML processors). That is its purpose, after all.

http://en.wikipedia.org/wiki/Byte_order_mark Wikipedia says: "The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8.[ http://www.unicode.org/versions/Unicode6.0.0/ch02.pdf ] The presence of the UTF-8 BOM may cause interoperability problems with existing software that could otherwise handle UTF-8[...]" Sorry, per default we won't add BOMs; that only leads to problems.
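The interoperability concern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `add_bom` and `strip_bom` are hypothetical helpers, not kate functions, and the PHP snippet stands in for any consumer that expects a file to begin with specific bytes.

```python
import codecs


def add_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM unless the data already starts with one."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


script = "<?php echo 'hi';\n".encode("utf-8")
with_bom = add_bom(script)

# A BOM-unaware consumer no longer finds the expected opening bytes.
print(with_bom.startswith(b"<?php"))                           # False
# The plain "utf-8" codec keeps the BOM as a U+FEFF character.
print(with_bom.decode("utf-8")[0] == "\ufeff")                 # True
# The "utf-8-sig" codec strips it on decode.
print(with_bom.decode("utf-8-sig") == script.decode("utf-8"))  # True
print(strip_bom(with_bom) == script)                           # True
```

Software that decodes with a BOM-aware codec is unaffected, while software that reads raw bytes sees three extra bytes at the start of the file.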