Version: unspecified (using KDE 4.5.1)
OS: Linux

[This relates to my work on an Okteta Structure Definition for the Compound File Binary File Format, which is described in the MS-CFB specification.]

It would be nice to have the ability to handle strings (including wide character strings).

MS-CFB describes a 64-byte block, which is an array of up to 31 UTF-16 characters followed by a null terminator. I'd like to be able to show the text (as a Unicode string), but the best I can do is an array of 32 uint16 elements, which isn't easy to read.

    <array name="Directory Entry Name" length="32">
        <primitive type="UInt16" />
    </array>

Some options that could be considered:
- treat it as an array (as I do now), but add an extra attribute that says how it should be rendered (like display="number", display="string")
- add a new primitive type (like type="string") and some attributes for fixed length, null terminated, ucs2/utf8/ascii
- add several new primitive types (like type="FixedLengthString", type="FixedLengthWideString", type="NullTerminatedString", type="NullTerminatedWideString")

The display attribute could be used to avoid conflicts between char and int8/uint8, but I recognise that this would still require backwards-compatibility hacks.

Reproducible: Always

Steps to Reproduce:
It's probably easiest to see in a file with UTF-16 strings. I can provide the .osd and .desktop that I'm working on, if required.

Actual Results:
The array is displayed vertically, one element per row, in hex.

Expected Results:
I'd prefer to show the string.
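To illustrate the expected result, decoding the 64-byte name block described above could be sketched in Python roughly as follows. This is only an illustration and not Okteta code: the function name `decode_directory_entry_name` is hypothetical, and little-endian byte order is assumed.

```python
def decode_directory_entry_name(raw: bytes) -> str:
    """Decode a 64-byte field of UTF-16 code units ending at a null terminator.

    Assumption: the code units are stored little-endian.
    """
    if len(raw) != 64:
        raise ValueError("expected a 64-byte block")

    # Walk the block one 16-bit code unit at a time and stop at the
    # first 0x0000 unit; anything after it is padding.
    end = len(raw)
    for offset in range(0, len(raw), 2):
        if raw[offset:offset + 2] == b"\x00\x00":
            end = offset
            break

    return raw[:end].decode("utf-16-le")


# Build a sample block: the name followed by zero padding up to 64 bytes.
sample_block = "Root Entry".encode("utf-16-le").ljust(64, b"\x00")
print(decode_directory_entry_name(sample_block))
```

The scan is done per 16-bit unit rather than per byte, since a single zero byte is part of most ASCII-range characters in UTF-16.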
Better support for strings is already on my TODO list. I wanted to add support for null-terminated ASCII C strings, null-terminated UTF-8, and both null-terminated and fixed-length UTF-16 and UTF-32. Latin1 with a configurable charset would probably also be a good idea.

I'm afraid I'm fairly busy for the next few weeks, but I should have enough time to implement this in February.
SVN commit 1230377 by arichardson:

Add basic support for strings in structures.

Currently supported encodings are ASCII and UTF16-LE/BE.
Strings can be added to .osd by using the <string> element.
Strings can have a fixed length (byte count or character count), be terminated by a certain Unicode code point, or both (whichever occurs first).

CCBUG: 263489

 M  +6 -0     CMakeLists.txt
 M  +0 -6     view/structures/datatypes/abstractarraydatainformation.cpp
 M  +2 -3     view/structures/datatypes/abstractarraydatainformation.h
 M  +22 -11   view/structures/datatypes/datainformation.cpp
 M  +21 -10   view/structures/datatypes/datainformation.h
 M  +6 -1     view/structures/datatypes/datainformationbase.cpp
 M  +3 -2     view/structures/datatypes/datainformationbase.h
 A            view/structures/datatypes/dummydatainformation.cpp   [License: LGPL]
 A            view/structures/datatypes/dummydatainformation.h   [License: LGPL]
 M  +0 -5     view/structures/datatypes/dynamiclengtharraydatainformation.h
 A            view/structures/datatypes/strings (directory)
 A            view/structures/datatypes/strings/asciistringdata.cpp   [License: LGPL]
 A            view/structures/datatypes/strings/asciistringdata.h   [License: LGPL]
 A            view/structures/datatypes/strings/stringdata.cpp   [License: LGPL]
 A            view/structures/datatypes/strings/stringdata.h   [License: LGPL]
 A            view/structures/datatypes/strings/stringdatainformation.cpp   [License: LGPL]
 A            view/structures/datatypes/strings/stringdatainformation.h   [License: LGPL]
 A            view/structures/datatypes/strings/utf16stringdata.cpp   [License: LGPL]
 A            view/structures/datatypes/strings/utf16stringdata.h   [License: LGPL]
 M  +2 -2     view/structures/datatypes/topleveldatainformation.h
 M  +86 -8    view/structures/parsers/osdparser.cpp
 M  +3 -2     view/structures/parsers/osdparser.h
 M  +11 -10   view/structures/structtool.cpp
 M  +3 -0     view/structures/structtreemodel.cpp

WebSVN link: http://websvn.kde.org/?view=rev&revision=1230377
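The "fixed length, terminator, or both (whichever occurs first)" rule from the commit message can be sketched as below. This is a rough Python illustration of the stopping logic for UTF-16 only, not the code from the commit: the function `read_utf16_string` and its parameters `max_chars`, `max_bytes` and `terminated_by` are hypothetical names, and "character" is assumed here to mean one Unicode code point (a surrogate pair counts once).

```python
import struct
from typing import Optional


def read_utf16_string(data: bytes, offset: int = 0, *,
                      little_endian: bool = True,
                      max_chars: Optional[int] = None,
                      max_bytes: Optional[int] = None,
                      terminated_by: Optional[int] = None) -> str:
    """Read UTF-16 text until a length limit or the terminator is hit."""
    fmt = "<H" if little_endian else ">H"
    # The byte limit caps how far into the buffer we may read.
    limit = len(data) if max_bytes is None else min(len(data), offset + max_bytes)
    chars = []
    pos = offset
    while pos + 2 <= limit:
        if max_chars is not None and len(chars) >= max_chars:
            break  # character-count limit reached
        (unit,) = struct.unpack_from(fmt, data, pos)
        pos += 2
        code_point = unit
        # Combine a high surrogate with a following low surrogate.
        if 0xD800 <= unit <= 0xDBFF and pos + 2 <= limit:
            (low,) = struct.unpack_from(fmt, data, pos)
            if 0xDC00 <= low <= 0xDFFF:
                code_point = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                pos += 2
        if terminated_by is not None and code_point == terminated_by:
            break  # terminator found; it is not part of the string
        chars.append(chr(code_point))
    return "".join(chars)


data = "abc\0def".encode("utf-16-le")
print(read_utf16_string(data, terminated_by=0))               # terminator wins
print(read_utf16_string(data, max_chars=2, terminated_by=0))  # length wins
```

With both limits given, whichever condition is met first ends the string, which matches the behaviour described above for the 32-unit null-terminated name field in the original report.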
Closing this bug, since Latin1, UTF8 and UTF32 support has been added by now.
Thanks for this functionality, and all your other work too. Much appreciated.