Trying to use Valgrind with a binary compiled by Clang14 with DWARF5 enabled results in Valgrind complaing about possibly corrupted debuginfo and aborting. Attached are 3 patches that improve the situation. I'm not very familiar with Valgrind internals or DWARF, but with these patches Valgrind is now able to run Clang14-compiled programs without complaining and generally seems to work. The one thing I've noticed that still does not work is backtraces with split-dwarf, they report the binary name instead of source file location, I don't know if the 3rd patch is incorrect in the assumption that rnglistx/loclistx may be ignored, or if it's something else missing.
Created attachment 148233 [details] valgrind patch
Created attachment 148234 [details] valgrind patch
Created attachment 148235 [details] valgrind patch
Created attachment 148239 [details] read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx
Created attachment 148240 [details] avoid warning about missing DW_AT_*_base in skip_DIE()
Created attachment 148242 [details] read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx Removed left-over debug statement.
Created attachment 148243 [details] implement support for missing DW_LLE_* and DW_RLE_* values
Here are my results with clang-devel (15.0.0) on FreeBSD 13.0. == 757 tests, 15 stderr failures, 4 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures == memcheck/tests/amd64/insn-pmovmskb (stderr) memcheck/tests/clientperm (stderr) memcheck/tests/freebsd/static_allocs (stderr) memcheck/tests/gone_abrt_xml (stderr) memcheck/tests/origin5-bz2 (stderr) memcheck/tests/varinfo2 (stderr) memcheck/tests/varinfo5 (stderr) memcheck/tests/varinfo6 (stderr) memcheck/tests/wrap6 (stdout) helgrind/tests/tls_threads (stderr) drd/tests/atomic_var (stderr) drd/tests/omp_matinv (stdout) drd/tests/omp_matinv (stderr) drd/tests/omp_matinv_racy (stdout) drd/tests/omp_matinv_racy (stderr) drd/tests/omp_prime_racy (stderr) drd/tests/omp_printf (stderr) drd/tests/pth_mutex_signal (stderr) none/tests/amd64/amd64locked (stdout) Most of those fail with clang 11.0.1. There are a few extra OMP failes, but I doubt that they are related. The extra failures are memcheck/tests/freebsd/static_allocs ^^^ this does look like it could be dwarf related, seems to be failing to read the client stack for the errror memcheck/tests/wrap6 ^^^ missing wrap output none/tests/amd64/amd64locked ^^^ difference in crc calculation in guest However, none of the above uses dwarf5. A small leaking C++ exe gives me paulf> ../vg_llvm15/vg-in-place ./leak ==60775== Memcheck, a memory error detector ==60775== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==60775== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info ==60775== Command: ./leak ==60775== ### unhandled dwarf2 abbrev form code 0x25 ### unhandled dwarf2 abbrev form code 0x25 ### unhandled dwarf2 abbrev form code 0x25 ### unhandled dwarf2 abbrev form code 0x1b ==60775== Valgrind: debuginfo reader: ensure_valid failed: ==60775== Valgrind: during call to ML_(img_get) ==60775== Valgrind: request for range [86779272, +4) exceeds ==60775== Valgrind: valid image size of 15768 for image: ==60775== Valgrind: "/usr/home/paulf/scratch/vg_examples/leak" ==60775== ==60775== Valgrind: debuginfo reader: Possibly corrupted debuginfo file. ==60775== Valgrind: I can't recover. Giving up. Sorry.
I'll try the patches tomorrow.
The only remaining issue I can see is that DWARF5 + -gsplit-dwarf does not output source locations in backtraces. So far I have not been able to find out which part of Valgrind code is responsible, the furthest I know so far is that search_one_loctab() does not find the location, possibly because di->loctab_used is slightly smaller than in non-split case. I also don't see any difference in DWARF info when compared to split DWARF4 or non-split DWARF5 that would seem relevant (llvm-dwarfdump claims .debug_line are the same in both cases when generated by GCC 11). Testcase: $ cat a.cpp static int func(int a, int* b) { if(*b) return a + 2; else return a - 2; } int main() { int v; return func(4, &v); } $ g++-11 -g -gdwarf-5 a.cpp -gsplit-dwarf $ valgrind ./a.out ... ==22491== Conditional jump or move depends on uninitialised value(s) ==22491== at 0x400559: func(int, int*) (in /home/seli/tmp/a.out) ==22491== by 0x400583: main (in /home/seli/tmp/a.out) $ g++-11 -g -gdwarf-5 a.cpp $ valgrind ./a.out ... ==22512== Conditional jump or move depends on uninitialised value(s) ==22512== at 0x400559: func(int, int*) (a.cpp:3) ==22512== by 0x400583: main (a.cpp:12)
On FreeBSD 13 and64 with clang and clang devel (11 and 15 respectively) no problems building, no new regression test failures. This will need some new regression tests. I know little about DWARF so I'll leave that to Mark.
Created attachment 148366 [details] read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14 Removed forgotten todo note in 2nd patch.
Created attachment 148367 [details] read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx Added one forgotten switch case in 4th patch.
Created attachment 148368 [details] implement support for missing DW_LLE_* and DW_RLE_* values Removed forgotten todo comment in 5th patch.
Created attachment 148374 [details] read properly unit headers depending on dwarf5 unit_type
Created attachment 148375 [details] avoid warning about missing DW_AT_*_base in skip_DIE()
Created attachment 148376 [details] treat DW_TAG_skeleton_unit like DW_TAG_compile_unit Ok, I've figured out the split-dwarf debug line problem too, this fixes it for both GCC and Clang. So these patches should be all.
Created attachment 148380 [details] support DW_FORM_addrx3 and DW_FORM_strx3
Hi Lubos Do all these patches only affect llvm 14+? Or will there be any improvement for older llvm versions?
They affect compilations does using Clang when it generates DWARF5. So presumably they also affect older versions if explicitly asked to generate DWARF5, but otherwise no. BTW, since you're not going to actually handle this, maybe you shouldn't be the assignee for the bugreport?
I'm keen to get these changes in and if Mark approves I'll push the changes.
I am really happy to see these patches. But have not read them yet. Also I am on vacation till next week. Sorry for being slow,
FYI: several people are experiencing this when running Valgrind on code compiled with the current nightly Rust compiler.
Patch 1 looks good, pushed as: commit 61ddbc1fc395c787192e569d8f2238f713bdfd8e Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 10:58:44 2022 +0200 read properly unit headers depending on dwarf5 unit_type There may be additional fields that need to be skipped over, otherwise further reading will interpret these incorrectly. Thanks
(In reply to Nick Nethercote from comment #23) > FYI: several people are experiencing this when running Valgrind on code > compiled with the current nightly Rust compiler. Unfortunately, this is still happening. E.g if you have an up-to-date rustc (e.g. updated via `rustup` with `rustup update nightly`). > [gulf:~/dev/rustc-perf] vg1 --trace-children=yes rustc +nightly --version > ==787430== Memcheck, a memory error detector > ==787430== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. > ==787430== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info > ==787430== Command: /home/njn/.rustup/toolchains/532be942ddf8f40d086e54d157453434b16e9647/bin/rustc > ==787430== > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x23 > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x25 > ### unhandled dwarf2 abbrev form code 0x23 > ==787430== Valgrind: debuginfo reader: ensure_valid failed: > ==787430== Valgrind: during call to ML_(img_get) > ==787430== Valgrind: request for range [1903633023, +4) exceeds > ==787430== Valgrind: valid image size of 1932688 for image: > ==787430== Valgrind: "/home/njn/.rustup/toolchains/532be942ddf8f40d086e54d157453434b16e9647/bin/rustc" > ==787430== > ==787430== Valgrind: debuginfo reader: Possibly corrupted debuginfo file. > ==787430== Valgrind: I can't recover. Giving up. Sorry. > ==787430==
Patch 2 doesn't handle DW_AT_GNU_addr_base, DW_FORM_GNU_addr_index and DW_FORM_GNU_str_index. But those are only used as extensions to DWARF4 with split-dwarf. Support could be added later if people really use pre-DWARF5 split-dwarf. setup_cu_bases could check for cc->version >= 5 to prevent some work for older DWARF, but it is probably not noticeable. Looks good. Pushed as: commit 61dfba4232d795c00603c74e6d0573f2d16641e2 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 11:14:52 2022 +0200 read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14 DW_FORM_addrx* are offsets into .debug_addr containing addresses. DW_FORM_strx* are offsets into .debug_str_offsets, which contain offsets into .debug_str. Support for these also requires reading DW_AT_addr_base and DW_AT_str_offsets_base before any other field in the abbrev table entry, as those may use this form.
Patch3 looks correct on its own. So I'll push it as is: commit 383f36462fdc25a0e849391cf845e22703a95c08 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Mon Apr 25 22:11:27 2022 +0200 avoid warning about missing DW_AT_*_base in skip_DIE() Similarly to setup_cu_bases(), DW_FORM_addrx etc. may depend on DW_AT_addr_base etc. that have not been read yet. But this also handles DW_FORM_rnglistx and DW_FORM_loclistx. Shouldn't setup_cu_bases also handle these? And if so, should we have an skip_Form_contents function that can be used in both places?
> But this also handles DW_FORM_rnglistx and DW_FORM_loclistx. > Shouldn't setup_cu_bases also handle these? The 2nd patch does that. The 7 patches should be applied in the order they are attached. You appear to be applying them based on their subject line, but those got meaningless as I was adding and updating the patches and I didn't realize somebody might look at those (I don't know if I'd ever submitted several patches as actual patches before). > And if so, should we have an skip_Form_contents function that can be used in both places? I don't know what you mean, I see no need for such a function.
(In reply to Mark Wielaard from comment #27) > But this also handles DW_FORM_rnglistx and DW_FORM_loclistx. > Shouldn't setup_cu_bases also handle these? And that is what patch4 does. > And if so, should we have an skip_Form_contents function that can be used in > both places? Still a good idea, but not super urgent. Found once small issue with patch4: + case DW_FORM_rnglistx: + case DW_FORM_loclistx: + return VARSZ_FORM; + return VARSZ_FORM; Removed the duplicate return and pushed as: commit d19bbdf1200685ddbb6976cf915fba25f2097162 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 12:20:16 2022 +0200 read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx The .debug_rnglists/.debug_loclists sections first have a list of offsets and then a list of the actual data that those offsets point to.
(In reply to Lubos Lunak from comment #28) > The 7 patches should be applied in the order they > are attached. You appear to be applying them based on their subject line, > but those got meaningless as I was adding and updating the patches and I > didn't realize somebody might look at those (I don't know if I'd ever > submitted several patches as actual patches before). Ah, sorry, I assumed because they were numbered one to seven that was the order to review them. So I reviewed, tested and applied the first four numbered ones. What would be correct order for the next three patches/attachements? > > And if so, should we have an skip_Form_contents function that can be used in both places? > > I don't know what you mean, I see no need for such a function. Just to make sure that whenever we skip an attribute we always do it the same way. So skip_Form_contents would basically be: if(form == DW_FORM_addrx || form == DW_FORM_strx || form == DW_FORM_rnglistx || form == DW_FORM_loclistx) { /* Skip without interpreting them, they may depend on e.g. DW_AT_addr_base that has not been read yet. */ (void) get_ULEB128(c_die); } else get_Form_contents( &cts, cc, c_die, False /*td3*/, &abbv->nf[nf_i] ); And it could then also have a simplified get_Form_contents inlined that really just skips the values.
> So I reviewed, tested and applied the first four numbered ones. What would > be correct order for the next three patches/attachements? In the order they are listed in the attachment list here. If you managed to apply those 4 ones without conflicts, then apparently they can all be applied independently. > Just to make sure that whenever we skip an attribute we always do it the > same way. > So skip_Form_contents would basically be: I see. I did not find any need to do that while writing the patches.
For "implement support for missing DW_LLE_* and DW_RLE_* values" the code looks good. But the error handling in get_debug_addr_entry looks off. The second argument is DW_FORM form. But it is given either a DW_FORM, DW_LLE or DW_LRE. I think it would be better if the second argument would be a const HChar*. And we introduce new functions: const HChar* ML_(pp_DW_LLE) ( DW_LLE loc ) const HChar* ML_(pp_DW_RLE) ( DW_RLE range ) So get_debug_addr_entry can print the actual operator in the error message.
treat DW_TAG_skeleton_unit like DW_TAG_compile_unit looks good, pushed as: commit e95904b99c87606eae1372ee4661e9db03833f08 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 26 08:35:16 2022 +0200 treat DW_TAG_skeleton_unit like DW_TAG_compile_unit It's basically the same, except for being split-dwarf. Handling it is required e.g. for reading line info.
For support DW_FORM_addrx3 and DW_FORM_strx3 it would be easier/clearer imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick. Is the big endian variant correct? return c1 << 16 | c2 << 8 | c3; Should that be: return c1 << 32 | c2 << 16 | c3 << 8;
With the additional commits things are working again for Rust code, thanks!
Created attachment 149672 [details] implement support for missing DW_LLE_* and DW_RLE_* values
Created attachment 149673 [details] support DW_FORM_addrx3 and DW_FORM_strx3
> But the error handling in get_debug_addr_entry looks off. > The second argument is DW_FORM form. But it is given either a DW_FORM, > DW_LLE or DW_LRE. > I think it would be better if the second argument would be a const HChar*. You're right, too much copy&paste. Updated (but not using const HChar*, as I find it inefficient to go through a large switch for every call even though it normally wouldn't be used). > imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick. Updated. > Is the big endian variant correct? I don't have actual big endian HW, but I think it is. The input is 3-byte bigendian that needs byte-by-byte handling, not the output.
(In reply to Lubos Lunak from comment #38) > > But the error handling in get_debug_addr_entry looks off. > > The second argument is DW_FORM form. But it is given either a DW_FORM, > > DW_LLE or DW_LRE. > > I think it would be better if the second argument would be a const HChar*. > > You're right, too much copy&paste. Updated (but not using const HChar*, as I > find it inefficient to go through a large switch for every call even though > it normally wouldn't be used). Yes, your introduction of an intermediary access function for the 3 cases is nicer than what I proposed. Thanks. It looks almost correct. I would just remove the following debug (TD3) part from get_debug_addr_entry_common: + if (TD3) { + HChar* tmp = ML_(cur_read_strdup)(get_DiCursor_from_Cursor(&cur), "di.getFC.1"); + TRACE_D3("(indirect address, offset: 0x%lx): %s", addr_pos, tmp); + ML_(dinfo_free)(tmp); + } That only makes sense for strings, not addresses. We could instead print the address, but I think that is too noisy even for a debug run. I missed this in the "read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14" where it was introduced for the get_Form_contents_addr function. sorry.
(In reply to Lubos Lunak from comment #38) > > imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick. > > Updated. Thanks, looks good. > > Is the big endian variant correct? > > I don't have actual big endian HW, but I think it is. The input is 3-byte > bigendian that needs byte-by-byte handling, not the output. It is a little tricky to know how to interpret endianness for these "3-byte" values. But what I mean is that I believe others interpret this as a 4-byte value where the 4th byte is zero. Which doesn't matter for little endian, but does for big endian. With the assumption of the 4th byte is zero the code would look like: c1 = ML_(img_get_UChar)(c->sli.img, c->sli_next); c2 = ML_(img_get_UChar)(c->sli.img, c->sli_next+1); c3 = ML_(img_get_UChar)(c->sli.img, c->sli_next+2); c4 = 0; c->sli_next += 3; #if defined(VG_BIGENDIAN) return c1 << 24 | c2 << 16 | c3 << 8 | c4; #else return c1 | c2 << 8 | c3 << 16 | c4 << 24; #endif
Created attachment 149692 [details] implement support for missing DW_LLE_* and DW_RLE_* values
> It looks almost correct. I would just remove the following debug (TD3) part > from get_debug_addr_entry_common: Updated. > It is a little tricky to know how to interpret endianness for these "3-byte" > values. > But what I mean is that I believe others interpret this as a 4-byte value > where the 4th byte is zero. > Which doesn't matter for little endian, but does for big endian. > With the assumption of the 4th byte is zero the code would look like: > > c1 = ML_(img_get_UChar)(c->sli.img, c->sli_next); > c2 = ML_(img_get_UChar)(c->sli.img, c->sli_next+1); > c3 = ML_(img_get_UChar)(c->sli.img, c->sli_next+2); > c4 = 0; > c->sli_next += 3; > #if defined(VG_BIGENDIAN) > return c1 << 24 | c2 << 16 | c3 << 8 | c4; > #else > return c1 | c2 << 8 | c3 << 16 | c4 << 24; > #endif I don't see the dwarf spec being specific on how to interpret this, but then I think my way is the only logical one. This code would be weird/broken, for the following reasons: - It's inconsistent between little and big endian. On LE this would encode offsets 0-16777215, on BE it would encode offsets in multiples of 256 in the range 0-4278190080. - It's inconsistent with the 1,2,4 variants. Those encode 1,2,4 lowest bytes (e.g. str2 doesn't mean 3rd and 4th byte being zero). - It doesn't make much sense to encode multiples of 256. Although it's not spelled out, I think it's clear that it's meant to be a binary representation of a value using the minimal number of bytes needed.
(In reply to Lubos Lunak from comment #42) > > It looks almost correct. I would just remove the following debug (TD3) part > > from get_debug_addr_entry_common: > > Updated. Thanks, pushed as: commit 4bb0164e6b2067376f085452a40137f13436384c Author: Luboš Luňák <l.lunak@centrum.cz> Date: Mon Apr 25 22:11:27 2022 +0200 implement support for missing DW_LLE_* and DW_RLE_* values
(In reply to Lubos Lunak from comment #42) > I don't see the dwarf spec being specific on how to interpret this, but then > I think my way is the only logical one. Yes, I keep getting confused. We only want the read the value in. So pushed as is: commit 026cda6c8111ef6ff79bd5c6e083ab50ccf3ded6 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 26 13:53:14 2022 +0200 support DW_FORM_addrx3 and DW_FORM_strx3 Apparently these may get used after all with large enough binaries, despite being somewhat tricky with regard to endianess.
(In reply to Nick Nethercote from comment #35) > With the additional commits things are working again for Rust code, thanks! And that was without some of the patches been pushed yet :) Could you test again with top of tree and see whether --read-var-info=yes also works OK?
> Could you test again with top of tree and see whether --read-var-info=yes > also works OK? My normal workflow works well. It uses Cachegrind and DHAT, which don't need `--read-var-info=yes`. I also tested Memcheck on a trivial Rust program and `--read-var-info=yes` had problems: > ==19085== Memcheck, a memory error detector > ==19085== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. > ==19085== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info > ==19085== Command: ./a > ==19085== > > parse DIE(readdwarf3.c:3994): confused by: > <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit) > DW_AT_producer : (indirect string, offset: 0x0): clang LLVM (rustc version 1.63.0-nightly (420c970cb 2022-06-09)) > DW_AT_language : 28 > DW_AT_name : (indirect string, offset: 0x41): library/std/src/lib.rs/@/std.85a61a41-cgu.0 > DW_AT_stmt_list : 0 > DW_AT_comp_dir : (indirect string, offset: 0x6d): /rustc/420c970cb1edccbf60ff2aeb51ca01e2300b09ef > DW_AT_GNU_pubnames: 1 > DW_AT_low_pc : 0x0 > DW_AT_ranges : 322832 > parse_type_DIE: > --19085-- WARNING: Serious error when reading debug info > --19085-- When reading debug info from /home/njn/dev/rustc-perf/a: > --19085-- confused by the above DIE > hello world > ==19085== > ==19085== HEAP SUMMARY: > ==19085== in use at exit: 0 bytes in 0 blocks > ==19085== total heap usage: 11 allocs, 11 frees, 3,181 bytes allocated > ==19085== > ==19085== All heap blocks were freed -- no leaks are possible > ==19085== > ==19085== For lists of detected and suppressed errors, rerun with: -s > ==19085== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
My test Rust program was this, in `a.rs`: > fn main() { > println!("hello world"); > } And I compiled with `rustc +nightly a.rs` to produce the executable `a`.
*** Bug 457412 has been marked as a duplicate of this bug. ***
Looks like I never properly closed this. Apologies. All Luboš have been committed and made it into valgrind 3.20.0 commit 026cda6c8111ef6ff79bd5c6e083ab50ccf3ded6 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 26 13:53:14 2022 +0200 support DW_FORM_addrx3 and DW_FORM_strx3 Apparently these may get used after all with large enough binaries, despite being somewhat tricky with regard to endianess. commit 4bb0164e6b2067376f085452a40137f13436384c Author: Luboš Luňák <l.lunak@centrum.cz> Date: Mon Apr 25 22:11:27 2022 +0200 implement support for missing DW_LLE_* and DW_RLE_* values commit e95904b99c87606eae1372ee4661e9db03833f08 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 26 08:35:16 2022 +0200 treat DW_TAG_skeleton_unit like DW_TAG_compile_unit It's basically the same, except for being split-dwarf. Handling it is required e.g. for reading line info. commit d19bbdf1200685ddbb6976cf915fba25f2097162 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 12:20:16 2022 +0200 read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx The .debug_rnglists/.debug_loclists sections first have a list of offsets and then a list of the actual data that those offsets point to. commit 383f36462fdc25a0e849391cf845e22703a95c08 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Mon Apr 25 22:11:27 2022 +0200 avoid warning about missing DW_AT_*_base in skip_DIE() Similarly to setup_cu_bases(), DW_FORM_addrx etc. may depend on DW_AT_addr_base etc. that have not been read yet. commit 61dfba4232d795c00603c74e6d0573f2d16641e2 Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 11:14:52 2022 +0200 read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14 DW_FORM_addrx* are offsets into .debug_addr containing addresses. DW_FORM_strx* are offsets into .debug_str_offsets, which contain offsets into .debug_str. Support for these also requires reading DW_AT_addr_base and DW_AT_str_offsets_base before any other field in the abbrev table entry, as those may use this form. commit 61ddbc1fc395c787192e569d8f2238f713bdfd8e Author: Luboš Luňák <l.lunak@centrum.cz> Date: Tue Apr 19 10:58:44 2022 +0200 read properly unit headers depending on dwarf5 unit_type There may be additional fields that need to be skipped over, otherwise further reading will interpret these incorrectly.