Bug 514151 - Okteta reads structure fields at incorrect offset following "oversized" array
Status: RESOLVED FIXED
Product: okteta
Classification: Applications
Component: Structures Tool (other bugs)
Version First Reported In: 0.26.24
Platform: Debian testing Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Alex Richardson
Reported: 2026-01-04 17:20 UTC by Brendon Higgins
Modified: 2026-01-13 20:33 UTC
Version Fixed/Implemented In: 0.26.25

Description Brendon Higgins 2026-01-04 17:20:56 UTC
SUMMARY
I'm finding the Structures tool very useful! When dealing with structures that contain large arrays, though, Okteta only reads the first 65535 values if the array is specified (dynamically from a length field in the file, for example) as having a greater length. This is understandable as a way to ensure memory isn't exhausted. However, from my testing it appears that if other structure fields exist after an "oversized" array, Okteta reads the remaining file data from the wrong offset in the file. It would be more useful if Okteta simply skipped the "extra" array elements and read the remaining structure fields from the correct offsets in the file.

STEPS TO REPRODUCE
1. Configure a file format structure which contains an array with length greater than 65535, e.g. 100k. This can be as simple as an array of uint8. Add another field after the array, e.g. a uint32.
2. Create a file to match, e.g. 100k zeros followed by 4 bytes of 0xFF.
3. Open the file in Okteta and read it using the structure.
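
The file from step 2 can be generated with a short script. A minimal sketch for Node.js, assuming the array length of 100000 and the 4-byte 0xFF trailer from the steps above; the function names and the output file name used below are made up for illustration:

```javascript
const ARRAY_LENGTH = 100000; // declared length of the uint8 array (step 1)
const TRAILER_SIZE = 4;      // size of the uint32 that follows the array

// Build the file contents: ARRAY_LENGTH zero bytes, then four 0xFF bytes.
function buildReproData() {
    const data = new Uint8Array(ARRAY_LENGTH + TRAILER_SIZE); // zero-filled
    data.fill(0xFF, ARRAY_LENGTH); // fill from index ARRAY_LENGTH to the end
    return data;
}

// Write the contents to disk (hypothetical helper, not called automatically).
function writeReproFile(path) {
    const fs = require('fs'); // Node.js file system module
    fs.writeFileSync(path, buildReproData());
}
```

Calling `writeReproFile('bug65535.bin')` produces a 100004-byte file that can then be opened in Okteta.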

OBSERVED RESULT
The array is truncated, as expected. The following (uint32) field is read at the offset immediately following the 65535th element of the array, and not where the actual field is located within the file.

EXPECTED RESULT
The field following the array is read from the correct offset in the file, after the entire oversized array.
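
The difference between the observed and the expected result comes down to the offset the uint32 is read from. A small sketch, assuming the 100000-element uint8 array and the 65535-element limit described above; `readUint32At` is a helper made up for illustration, and the little-endian byte order is an arbitrary choice that does not matter for these values:

```javascript
const DECLARED_LENGTH = 100000; // array length given in the structure definition
const MAX_SHOWN = 65535;        // number of elements Okteta actually reads

// Same layout as the file from step 2: zeros, then four 0xFF bytes.
const file = new Uint8Array(DECLARED_LENGTH + 4);
file.fill(0xFF, DECLARED_LENGTH);

// Read an unsigned 32-bit value at the given byte offset.
function readUint32At(bytes, offset) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return view.getUint32(offset, true); // true = little-endian
}

const observed = readUint32At(file, MAX_SHOWN);       // right after the 65535th element
const expected = readUint32At(file, DECLARED_LENGTH); // after the whole array

console.log(observed.toString(16)); // 0
console.log(expected.toString(16)); // ffffffff
```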

SOFTWARE/OS VERSIONS
Operating System: Debian GNU/Linux 
KDE Plasma Version: 6.5.4
KDE Frameworks Version: 6.20.0
Qt Version: 6.9.2
Kernel Version: 6.17.13+deb14-amd64 (64-bit)
Graphics Platform: X11
Comment 1 Friedrich W. H. Kossebau 2026-01-07 16:35:16 UTC
Hi Brendon Higgins, thanks for the report and interest. While I am not Alexander, the original developer of the Structures tool, I am the one currently trying to keep the Structures tool alive and to bring it into a Qt6 future. I am not a full expert on the code yet, but getting there, and I am now looking into using this bug report/feature request to enhance things some more.

After checking the code, I can imagine that, at least for non-complex arrays, keeping the full original array length as a separate piece of data and using it when calculating the bit size() of the array might have the desired effect of giving a proper offset to whatever follows the array.
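
The idea can be sketched in a few lines. This is a simplified illustration in JavaScript, not the Okteta C++ code: the class name `CappedPrimitiveArray`, its fields and the constant `MAX_SHOWN_LENGTH` are hypothetical, and only the 65535 limit is taken from this report.

```javascript
const MAX_SHOWN_LENGTH = 65535; // limit on listed elements, as reported above

class CappedPrimitiveArray {
    constructor(declaredLength, elementBits) {
        // Full length from the definition, kept separately from the listing.
        this.declaredLength = declaredLength;
        // Number of elements that actually get listed.
        this.shownLength = Math.min(declaredLength, MAX_SHOWN_LENGTH);
        this.elementBits = elementBits;
    }

    // Size used for placing whatever follows the array: based on the
    // declared length, not on the truncated listing.
    sizeInBits() {
        return this.declaredLength * this.elementBits;
    }
}

const zeros = new CappedPrimitiveArray(100000, 8);
const endDataOffsetBytes = zeros.sizeInBits() / 8; // the array starts at offset 0
console.log(zeros.shownLength, endDataOffsetBytes); // 65535 100000
```

This only works as written when every element has the same fixed size, which is presumably why the idea is limited to non-complex arrays.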

I sketched your scenario with the following structure definition:
--- 8< --- bug65535.osd
<?xml version="1.0" encoding="UTF-8"?>
<data>
  <struct name="Bug 65535">
    <array name="zeros" length="100000">
      <primitive name="data" type="UInt8" />
    </array>
    <primitive name="enddata" type="UInt32"/>
  </struct>
</data>
--- 8< ---

--- 8< --- bug65535.desktop
[Desktop Entry]
Icon=text/plain
Type=Service
ServiceTypes=KPluginInfo
Name=Bug 65535
Comment=
X-KDE-PluginInfo-Name=bug65535
X-KDE-PluginInfo-Category=structure
--- 8< ---

However, for some reason I have yet to explore, I currently hit an "Out of memory" exception with that OSD.

While I wait for the next time slot to continue my investigation, could you please share the OSD file you used, if possible, so I can compare?
Did you experience any crashes while experimenting with your large data structure description?
Comment 2 Brendon Higgins 2026-01-07 17:19:26 UTC
Hi Friedrich,

Thanks for looking! We've stumbled on something interesting, here. I usually use the JavaScript definitions because that seems more flexible. My array lengths are typically specified using functions which refer to some length field found earlier in the file. So my equivalent is more like this:

--- 8< --- bug65535/main.js
function init()
{
    var bug65535 = struct({
        zeros: array(uint8(), function() { return 100000; }),
        enddata: uint32(),
    });

    return bug65535;
}
--- 8< ---

The .desktop file is essentially the same, except X-KDE-PluginInfo-Category becomes "structure/js". I don't get any out-of-memory problems with this; Okteta takes only the first 65535 elements as I described above. But if I replace the function with a literal "100000", now Okteta crashes with signal 11 and core dump on start-up!
Comment 3 Friedrich W. H. Kossebau 2026-01-07 18:20:43 UTC
Thanks for the quick reply :) Okay, I will then switch to testing against the JS variant for now, to stay as close as possible to your experience (with the XML variant tested only afterwards).

> But if I replace the function with a literal "100000", now Okteta crashes with signal 11 and core dump on start-up!

What about using "65536", so the limit + 1? Would you have an example where you are above the limit (and thus hit the bad following offsets) without a crash?

Just above the limit is where things started crashing for me: in the branch that handles a length above the limit, somewhere during the execution of "logWarn()", an out-of-memory exception is thrown for reasons I have yet to understand, going by what the log showed:
https://invent.kde.org/utilities/okteta/-/blob/0.26/kasten/controllers/view/structures/datatypes/array/arraydatainformation.cpp#L30
Comment 4 Friedrich W. H. Kossebau 2026-01-07 20:01:31 UTC
Interesting: when using your JS variant, things do not crash for me, even with 100000 :) IIRC there is a different code path when the length of an array is defined by a method, as is always the case with the JS-based definition. Going to pursue that some more.

On your actual request, seems the initial code I drafted works for a first start (with no more crash in the way), my sample file gets the enddata: uint32 properly matching the very FF FF FF FF bytes I placed there (as by your example) :) Let's see if I can harden the logic around that, so things will not break otherwise.

For the user experience, I could imagine a last entry in the array listing which points out explicitly that there are further array items that are just not listed due to resource restrictions. Do you have any ideas/wishes for what such a final entry should look like, any info you would expect to see?

I am also considering making the "warning" symbol show the related warning in a tooltip; it was not directly obvious to me that one has to look for the actual warning in the script console window. Any related ideas/wishes here as well?

((For the future also wonder if array entries for overlarge arrays could be just estimated on-the-fly with a cache... but well, porting to a Qt6-compatible JavaScript engine is #1 task for now))
Comment 5 Brendon Higgins 2026-01-07 20:44:37 UTC
(In reply to Friedrich W. H. Kossebau from comment #3)
> What about using "65536", so the limit + 1?

With literal "65536", I get the crash. "65535" does not crash. If I return the value through a function, though, it does not crash - it seems to simply limit the value to a maximum of 65535 in that case.

(In reply to Friedrich W. H. Kossebau from comment #4)
> On your actual request, seems the initial code I drafted works for a first start (with no more crash in the way), my sample file gets the enddata: uint32 properly matching the very FF FF FF FF bytes I placed there (as by your example) :) Let's see if I can harden the logic around that, so things will not break otherwise.

That sounds great!

> For the user experience, I could imagine a last entry in the array listing which points out explicitly that there are further array items that are just not listed due to resource restrictions. Do you have any ideas/wishes for what such a final entry should look like, any info you would expect to see?

Even something as simple as an ellipsis "..." would be great. More elaborate could be "%d more entries (%d bytes) not shown...", or something along those lines. I can't imagine what else you might add that would be useful in the general case.
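
As an illustration of the more elaborate variant, a label like that could be computed from the declared length, the number of listed entries and the element size. A minimal sketch; the function name `hiddenEntriesLabel` and its parameters are made up for this example and are not part of Okteta:

```javascript
// Build the text for a trailing placeholder entry,
// or return null if no entries are hidden.
function hiddenEntriesLabel(declaredLength, shownLength, elementBytes) {
    const hidden = declaredLength - shownLength;
    if (hidden <= 0) {
        return null;
    }
    return `${hidden} more entries (${hidden * elementBytes} bytes) not shown...`;
}

console.log(hiddenEntriesLabel(100000, 65535, 1));
// 34465 more entries (34465 bytes) not shown...
```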

> I am also considering making the "warning" symbol show the related warning in a tooltip; it was not directly obvious to me that one has to look for the actual warning in the script console window. Any related ideas/wishes here as well?

The array's tooltip already displays some information, and I don't know if adding a tooltip to the warning symbol itself might compete with that? I generally agree, though: if I haven't been using Okteta like this for a while, it takes me some time before I eventually realize the Script console has more information for when some structure is upset. A tooltip pointing the user to "check the script console for more details", or something along those lines, would be a good addition.

> ((For the future also wonder if array entries for overlarge arrays could be just estimated on-the-fly with a cache... but well, porting to a Qt6-compatible JavaScript engine is #1 task for now))

I personally think the next step here would be to make the "maximum array entries to read" value user adjustable. Another option could be that the user could press the "..." entry at the end of what's displayed in order to reveal further hidden entries.

But for my purposes, simply correcting the offset of the next field (as it sounds like you've done) would solve the main nuisance.
Comment 6 Friedrich W. H. Kossebau 2026-01-07 22:20:53 UTC
> With literal "65536", I get the crash. "65535" does not crash. If I return the value through a function, though, it does not crash - it seems to simply limit the value to a maximum of 65535 in that case.

Okay, so consistent with what I see, starting to have a first working theory for the cause there, let's see...

> Even something as simple as an ellipsis "..." would be great. More elaborate could be "%d more entries (%d bytes) not shown...", or something along those lines. I can't imagine what else you might add that would be useful in the general case.

I guess an ellipsis is what I can use when doing a fix for this bug report that is still released for 0.26, where translators would not like a new UI string to appear. Mentioning the complete missing byte size was not on my plate; it seems sensible though, and I will consider it for the development version.

Good, thanks again Brendon for your quick replies & input, I will see to turning the started work into a proper fix over the next days/weeks. And while I had not planned further releases until the Qt6 port is done, another intermediate release, showing life, should help the spirit for now, and with this fix might make at least one more person happy :) I would target the date of 2nd of February, by some tradition.

Will report back here once I have something worth testing, for anyone who is interested in/able to test self-built Okteta versions.
Comment 7 Friedrich W. H. Kossebau 2026-01-13 20:18:42 UTC
Git commit 4483b9ce0c422fcf05bd8b684cce199c91943df7 by Friedrich W. H. Kossebau.
Committed on 13/01/2026 at 20:04.
Pushed by kossebau into branch '0.26'.

ArrayDataInformation constructor: fix crash on warning for too large length

logWarn() sees to get the logger from the toplevel datainformation, which
at this point is not yet around.

Passing a LoggerWithContext to the constructor for now, to get access to
the logger.

M  +3    -2    kasten/controllers/view/structures/datatypes/array/arraydatainformation.cpp
M  +2    -0    kasten/controllers/view/structures/datatypes/array/arraydatainformation.hpp
M  +6    -3    kasten/controllers/view/structures/parsers/datainformationfactory.cpp
M  +5    -0    kasten/controllers/view/structures/script/scriptlogger.hpp
M  +2    -2    kasten/controllers/view/structures/tests/arraydatainformationtest.cpp
M  +4    -4    kasten/controllers/view/structures/tests/basicdatainformationtest.cpp
M  +5    -3    kasten/controllers/view/structures/tests/primitivearraytest.cpp
M  +2    -1    kasten/controllers/view/structures/tests/scriptclassestest.cpp

https://invent.kde.org/utilities/okteta/-/commit/4483b9ce0c422fcf05bd8b684cce199c91943df7
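
The general pattern the commit message describes can be illustrated outside of Okteta. The following is a simplified JavaScript sketch, not the actual C++ change: the classes `Logger` and `ArrayNode`, the `parent` field and the message text are all hypothetical. It shows why a logger lookup through the top-level element fails inside a constructor, and how handing the logger in as an argument avoids that.

```javascript
const MAX_LENGTH = 65535; // supported array length, as reported above

class Logger {
    constructor() { this.messages = []; }
    warn(text) { this.messages.push(text); }
}

class ArrayNode {
    // The logger is passed in explicitly, because the node is not yet
    // attached to a top-level element while the constructor runs.
    constructor(length, logger) {
        this.parent = null;
        this.length = length;
        if (length > MAX_LENGTH) {
            logger.warn(`array length ${length} exceeds ${MAX_LENGTH}, truncating`);
            this.length = MAX_LENGTH;
        }
    }

    // Lookup through the tree: walks up to the top-level element and uses
    // its logger. Throws for a detached node, which has no logger to find.
    logWarn(text) {
        let node = this;
        while (node.parent !== null) {
            node = node.parent;
        }
        node.logger.warn(text);
    }
}

const logger = new Logger();
const node = new ArrayNode(65536, logger);
console.log(node.length, logger.messages.length); // 65535 1
```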
Comment 8 Friedrich W. H. Kossebau 2026-01-13 20:18:44 UTC
Git commit f2ac3b64fd3c48db6b91f861f0e0698cdc470960 by Friedrich W. H. Kossebau.
Committed on 13/01/2026 at 20:04.
Pushed by kossebau into branch '0.26'.

ArrayDataInformation: use correct offsets after too large primitive arrays

While not being able to show all items of arrays whose size is larger than
what is supported, for arrays of primitive types the offsets of elements
after the array can be still calculated correctly.
FIXED-IN: 0.26.25

M  +19   -18   kasten/controllers/view/structures/datatypes/array/abstractarraydata.cpp
M  +3    -2    kasten/controllers/view/structures/datatypes/array/abstractarraydata.hpp
M  +29   -20   kasten/controllers/view/structures/datatypes/array/arraydatainformation.cpp
M  +27   -18   kasten/controllers/view/structures/datatypes/array/complexarraydata.cpp
M  +4    -2    kasten/controllers/view/structures/datatypes/array/complexarraydata.hpp
M  +19   -14   kasten/controllers/view/structures/datatypes/array/primitivearraydata.cpp
M  +18   -8    kasten/controllers/view/structures/datatypes/array/primitivearraydata.hpp

https://invent.kde.org/utilities/okteta/-/commit/f2ac3b64fd3c48db6b91f861f0e0698cdc470960
Comment 9 Friedrich W. H. Kossebau 2026-01-13 20:33:02 UTC
As the automatic messages show, over the weekend I got to polish the first fix for the badly calculated offsets of structure elements after arrays with a length larger than supported, at least for arrays of primitive types.
Also fixed the crash happening with the static OSD array definition in case of an unsupported array length.

For now, though, the idea of a trailing item in the array item list as a replacement for the non-supported part of the array is not yet implemented; that needs to be done in a separate task. The warning sign consistently displayed with the array should for now serve as a visual hint that something is up with the array. Not sure if I will get to this soonish, as I currently consider it a nice-to-have feature, so of lower priority.

As mentioned before, a release with the fix is scheduled for Feb 2nd. If you are up for building from the sources, the 0.26 branch and the master branch both have the fix now.
Real-world testing welcome. Not being an active user of the Structures tool myself, I might have missed something #)