Version: (using KDE 4.3.1)
OS: Linux
Installed from: Ubuntu Packages

Nowadays there are a lot of multilingual text files going around, and hence Kate/KWrite should support them fully. For this, it is important that they write the proper BOM when saving text files. Otherwise the interoperability of files created by Kate/KWrite is affected. I checked how Kate/KWrite saved Devanagari text in the UTF-8 and UTF-16 encodings, and there was no BOM. (I verified this using Okteta.) For any text file that contains non-ASCII characters, Kate/KWrite should write a proper BOM. I am marking this as a bug instead of a wishlist item because the current behaviour causes interoperability problems.
Please don't. Or if that really is required for some people, make it optional and off by default. At least PHP, and I bet other languages or file types, can easily choke on BOMs. If Kate always inserted BOMs, it would become useless for me. I also seem to remember that at least UTF-8 BOMs are optional and not required. Hence this is not a bug.
milian: really‽ even notepad prepends a BOM on unicode files.
Automatically and always writing BOM markers sucks, especially with template languages, e.g. with Django, Zope, ... I got bitten often enough by stupid Windows editors just inserting a BOM. If we add the feature, it has to be easy to enable/disable. As far as I know, BOMs are not required for being Unicode compliant.
In addition, it appears that for UTF-8 the BOM is optional, while for UTF-16 and UTF-32 it is required.
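For anyone who wants to check a saved file without a hex editor, here is a minimal Python sketch. It is not Kate code: `detect_bom` is a hypothetical helper name, and the byte sequences come from the constants in Python's standard `codecs` module.

```python
import codecs

# BOM byte sequences as defined by the standard library's codecs module.
BOMS = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-16-be": codecs.BOM_UTF16_BE,
    "utf-32-le": codecs.BOM_UTF32_LE,
    "utf-32-be": codecs.BOM_UTF32_BE,
}


def detect_bom(data: bytes):
    """Return the name of the encoding whose BOM starts `data`, or None."""
    # Check longer BOMs first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    for name, bom in sorted(BOMS.items(), key=lambda kv: len(kv[1]), reverse=True):
        if data.startswith(bom):
            return name
    return None
```

Reading the first four bytes of a file in binary mode and passing them to `detect_bom` is enough to see whether an editor wrote a BOM.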
Then it means that Kate does not have Unicode compliance for UTF-16, since I tried with UTF-16 and Kate did not place the BOM. The same is true for KWrite. So the request is now for an option to prepend a BOM. Some more points:

1. This option should be added not in the Configure Kate or Configure Editor menu dialog, but in the Save As window.
2. This option should be set to off in the default distribution.
3. It must be decided whether Kate/KWrite should remember my last choice, or always revert to the default state of off and make the user manually select it every time they want it. Maybe one choice is to remember the on state only for a file which was recently (in the current session) saved with BOM turned on.
4. When a file with a BOM is loaded, the option should automatically be turned on. Care must be taken that BOMs are not duplicated at the head of the file, however.
5. If merely "Save" is done and not "Save As", then a BOM should be prepended if it existed when the file was loaded.
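Points 4 and 5 can be sketched roughly as follows, for UTF-8 only. This is a Python illustration of the idea, not how Kate does it; `load` and `encode_for_save` are hypothetical names.

```python
import codecs


def load(data: bytes):
    """Return (text, had_bom) for UTF-8 input, with any leading BOM removed."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8"), had_bom


def encode_for_save(text: str, write_bom: bool) -> bytes:
    """Encode `text` as UTF-8 with exactly zero or one leading BOM."""
    # A BOM that slipped into the text decodes to U+FEFF; drop it so that
    # prepending a BOM below can never produce two of them.
    if text.startswith("\ufeff"):
        text = text[1:]
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if write_bom else body
```

Passing the `had_bom` flag from `load` back into `encode_for_save` on a plain "Save" reproduces the file's original BOM state.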
Adding to the file dialog is not that good, especially for applications embedding the Kate part. I think the easiest/best way would be:

*) Detecting the existence of a BOM when opening a UTF-8/16 file and keeping that setting.
*) An option for the default for new files (BOM on/off) in the config dialog (Open/Save section), defaulting to off as the system default.
*) An option in the tools menu, just like the "End of line" option in the tools menu.
Another idea is to add a document variable (modeline) for that, e.g. prevent-bom=true/false. This could be in the moderc for some files, e.g. php and others. And in theory, it could even be in the highlighting information.
I'm working on it
Okay, it's not committed yet, but the result will be:

*) If the user enabled/disabled the byte order marker explicitly in the tools menu, this setting is honoured; otherwise
*) If bom or byte-order-marker is set in the file mode config line, the boolean value will be used for saving. (This variable is ignored if it is within the document since, as it appears, the mode overwrites local settings before saving.)
*) If the variable is not specified and there was a byte order marker at load time, the byte order marker is kept.
*) For new files, if the filetype used for saving does not have the variable set and the user didn't explicitly set the option in the menu, the default set in the main open/save configuration is used.

Does that sound reasonable?
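The precedence described above can be written down as a small decision function. This is a Python sketch for illustration, not the committed C++ code; the function and parameter names are made up, and it assumes that a loaded file without a BOM is saved without one.

```python
from typing import Optional


def should_write_bom(
    menu_choice: Optional[bool],      # explicit tools menu setting, None if untouched
    mode_variable: Optional[bool],    # bom / byte-order-marker from the file mode line
    had_bom_on_load: Optional[bool],  # None for a new file that was never loaded
    global_default: bool,             # default from the open/save configuration
) -> bool:
    """Decide whether to write a BOM, checking the sources in priority order."""
    if menu_choice is not None:
        return menu_choice
    if mode_variable is not None:
        return mode_variable
    if had_bom_on_load is not None:
        # Assumption: keep whatever state the file had when it was loaded.
        return had_bom_on_load
    return global_default
```

For example, a file loaded with a BOM under a file mode that says "bom off" would be saved without one, unless the user flips the menu option.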
Hello. I'm pleasantly surprised to see the speed at which this bug is being fixed. Is there a record for the fastest bugfix? This is certainly the fastest *I* have seen. Anyway, can anyone give me a brief idea about this modeline thing? I have read: http://kate-editor.org/article/katepart_modelines but, perhaps it's just me, I didn't get what exactly I would have to type into my text file to turn the BOM on or off. So can anyone please give me (via direct email, if more appropriate) examples for C/C++ and Python? Thanks.
I have implemented it, and it appears to work. I'm going to commit it as soon as svn is up again. BOMs can now be turned on/off for UTF-8 and UTF-16. I don't see UTF-32 in the encoding selection combo.

To force BOMs off for a specific file type, for instance Python, even though they are generally turned on, change the Variable line for Python in Settings -> Open/Save -> Modes & Filetypes from:

kate: presave-postdialog python-encoding

to:

kate: presave-postdialog python-encoding; bom off

If you want to force byte order markers on for a specific file type, although they are generally turned off, this can be done with "bom on". A synonym is "byte-order-marker on/off".
SVN commit 1022826 by jowenn:

*) New RPMSPEC file from Tim Fechtner
*) Configuration option for writing BOM (byte order markers) for UTF-8/16

BUG: 206142
BUG: 207174

 M  +2 -1    data/katepartsimpleui.rc
 M  +2 -1    data/katepartui.rc
 M  +4 -1    dialogs/katedialogs.cpp
 M  +15 -0   dialogs/opensaveconfigwidget.ui
 M  +74 -11  document/katebuffer.cpp
 M  +18 -1   document/katedocument.cpp
 M  +7 -1    document/katedocument.h
 M  +46 -7   syntax/data/rpmspec.xml
 M  +25 -1   utils/kateconfig.cpp
 M  +5 -0    utils/kateconfig.h
 M  +19 -1   view/kateview.cpp
 M  +2 -0    view/kateview.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=1022826
You can also put it into a comment to enable it in a single file. Just add "kate: bom on;" somewhere in a comment within the first or last 10 lines of your file. Note that this only works with KDE 4.4 or later.
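In a Python file, that comment would be a line such as `# kate: bom on;`. To show which lines count, here is a rough Python sketch that scans the first and last 10 lines for the variable. It is only an illustration and not Kate's actual parser: `bom_from_modeline` is a hypothetical helper, and accepting `true`/`false` next to `on`/`off` is an assumption.

```python
import re

# Hypothetical matcher for "bom on/off" or "byte-order-marker on/off".
# Accepting true/false as well is an assumption made for this sketch.
_BOM_VAR = re.compile(r"\b(?:bom|byte-order-marker)\s+(on|off|true|false)\b")


def bom_from_modeline(text: str, window: int = 10):
    """Return True/False if a kate: line near the top or bottom of the
    text sets the BOM variable, or None if no such line is found."""
    lines = text.splitlines()
    # Only the first and last `window` lines are searched.
    for line in lines[:window] + lines[-window:]:
        _, sep, variables = line.partition("kate:")
        if not sep:
            continue  # not a kate: variable line
        match = _BOM_VAR.search(variables)
        if match:
            return match.group(1) in ("on", "true")
    return None
```

A modeline buried in the middle of a long file is ignored by this sketch, which matches the "first or last 10 lines" rule quoted above.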
<applauds> That's the fastest bugfix I've ever seen guys! Keep it up!