Bug 495808

Summary: valgrind: m_translate.c:1833 (vgPlain_translate): Assertion 'tres.status == VexTransOK' failed.
Product: [Developer tools] valgrind
Reporter: Tyson <tyson.w.smith>
Component: vex
Assignee: Julian Seward <jseward>
Status: REPORTED ---
Severity: normal
CC: pjfloyd
Priority: NOR
Version First Reported In: unspecified
Target Milestone: ---
Platform: Other
OS: Other
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: assertion.txt, mozconfig, suppressions

Description Tyson 2024-11-05 02:23:32 UTC
Created attachment 175526: assertion.txt

SUMMARY
Running Firefox via Valgrind leads to:

VEX temporary storage exhausted.
Pool = TEMP,  start 0x597d76c8 curr 0x59c68778 end 0x59c9c207 (size 5000000)

vex: the `impossible' happened:
   VEX temporary storage exhausted.
Increase N_{TEMPORARY,PERMANENT}_BYTES and recompile.
vex storage: T total 74523934088 bytes allocated
vex storage: P total 512 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().

host stacktrace:
==1186454==    at 0x580445CA: show_sched_status_wrk (m_libcassert.c:407)
==1186454==    by 0x580446F7: report_and_quit (m_libcassert.c:478)
==1186454==    by 0x58044960: panic (m_libcassert.c:554)
==1186454==    by 0x58044960: vgPlain_core_panic_at (m_libcassert.c:559)
==1186454==    by 0x58044990: vgPlain_core_panic (m_libcassert.c:564)
==1186454==    by 0x5805964A: failure_exit (m_translate.c:761)
==1186454==    by 0x58136800: vpanic (main_util.c:253)
==1186454==    by 0x58136891: private_LibVEX_alloc_OOM (main_util.c:181)
==1186454==    by 0x581C7827: LibVEX_Alloc_inline (main_util.h:176)
==1186454==    by 0x581C7827: doRegisterAllocation_v3 (host_generic_reg_alloc3.c:494)
==1186454==    by 0x581348C8: libvex_BackEnd (main_main.c:1133)
==1186454==    by 0x581348C8: LibVEX_Translate (main_main.c:1236)
==1186454==    by 0x5805BE02: vgPlain_translate (m_translate.c:1831)
==1186454==    by 0x5809A73A: handle_chain_me (scheduler.c:1166)
==1186454==    by 0x5809CE9D: vgPlain_scheduler (scheduler.c:1562)
==1186454==    by 0x580E9A7D: thread_wrapper (syswrap-linux.c:102)
==1186454==    by 0x580E9A7D: run_a_thread_NORETURN (syswrap-linux.c:155)
==1186454==    by 0x580E9D60: vgModuleLocal_start_thread_NORETURN (syswrap-linux.c:329)
==1186454==    by 0x580AEB8D: ??? (in /usr/local/libexec/valgrind/memcheck-amd64-linux)
==1186454==    by 0xDEADBEEFDEADBEEE: ???
==1186454==    by 0xDEADBEEFDEADBEEE: ???
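
The "Pool = TEMP" line in the log above can be decoded with a little arithmetic. The sketch below is not Valgrind code; the function names are made up for illustration, and it assumes that `end` is the address of the last byte of the pool (which is consistent with the reported size of 5000000).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Addresses copied from the "Pool = TEMP" line in the log. */
#define POOL_START UINT64_C(0x597d76c8)
#define POOL_CURR  UINT64_C(0x59c68778)
#define POOL_END   UINT64_C(0x59c9c207)

/* Assumption: POOL_END is the last valid byte (inclusive), so the
   pool spans end - start + 1 bytes. */
uint64_t pool_size(void) { return POOL_END - POOL_START + 1; }

/* Bytes already handed out when the allocation failed. */
uint64_t pool_used(void) { return POOL_CURR - POOL_START; }

/* Bytes still free when the allocation failed. */
uint64_t pool_free(void) { return POOL_END - POOL_CURR + 1; }

/* Hypothetical helper: print the three figures. */
void print_pool_usage(void) {
    assert(POOL_START <= POOL_CURR && POOL_CURR <= POOL_END);
    printf("size=%" PRIu64 " used=%" PRIu64 " free=%" PRIu64 "\n",
           pool_size(), pool_used(), pool_free());
}
```

Under that assumption, about 4.79 MB of the 5 MB pool was in use and roughly 211600 bytes were free, so the request made from doRegisterAllocation_v3 must have been larger than that.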

I increased N_TEMPORARY_BYTES from 5000000 to 7000000 in VEX/priv/main_util.c.

With that change I now hit the following assertion: "valgrind: m_translate.c:1833 (vgPlain_translate): Assertion 'tres.status == VexTransOK' failed."

STEPS TO REPRODUCE
1. Build Firefox (use the attached mozconfig).

2. Launch Firefox under Valgrind:
valgrind --exit-on-first-error=yes --expensive-definedness-checks=yes --fair-sched=yes --gen-suppressions=all --leak-check=no --num-transtab-sectors=48 --read-inline-info=yes --show-mismatched-frees=no --show-possibly-lost=no --smc-check=all-non-file --trace-children=yes --trace-children-skip=python*,*/lsb_release,*/dbus-launch --track-origins=yes --vex-iropt-register-updates=allregs-at-mem-access --vgdb=no --suppressions=x86_64-pc-linux-gnu.sup /home/user/code/mozilla-central/objdir-ff-valgrind/dist/bin/firefox -new-instance -headless

I am running Ubuntu 22.04 LTS with the 5.15 kernel.
Comment 1 Tyson 2024-11-05 02:24:45 UTC
Created attachment 175527: mozconfig
Comment 2 Tyson 2024-11-05 02:27:08 UTC
Created attachment 175528: suppressions
Comment 3 Tyson 2024-11-05 02:27:57 UTC
I am using Valgrind from git: commit 9a439e5c
Comment 4 Tyson 2024-11-05 02:30:48 UTC
Setting --track-origins=no seems to make the issue go away.
Comment 5 Paul Floyd 2024-11-05 08:53:16 UTC
What happens if you do as the message says
"Increase N_{TEMPORARY,PERMANENT}_BYTES and recompile." (in this case I presume N_TEMPORARY_BYTES, I suggest that you try 10 million).

That's in VEX/priv/main_util.c

At present the size is 5 million bytes, and the comment says that it should be less than the L2 cache size. I think the comment is out of date and should probably refer to the LLC (typically L3, which has been 10 MB or more on most machines since about 2020).
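
To illustrate why the constant matters, here is a minimal sketch of a fixed-size bump-pointer pool. It is not the VEX implementation: all `sketch_*` names, the 8-byte rounding, and the NULL return are assumptions made for the example (per the log, VEX panics on exhaustion instead). The "T total 74523934088 bytes allocated" figure, far larger than the pool itself, suggests the real pool is recycled rather than grown, which is what `sketch_reset` models.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the compile-time pool size; 5000000 is
   the size reported in the log. */
#define SKETCH_POOL_BYTES 5000000

static unsigned char sketch_pool[SKETCH_POOL_BYTES];
static size_t sketch_used = 0;

/* Bytes still available in the pool. */
size_t sketch_remaining(void) {
    return SKETCH_POOL_BYTES - sketch_used;
}

/* Hand out nbytes from the pool, rounded up to 8 bytes (assumed
   alignment). Returns NULL when the pool cannot satisfy the request. */
void *sketch_alloc(size_t nbytes) {
    size_t rounded = (nbytes + 7) & ~(size_t)7;
    if (rounded < nbytes || rounded > sketch_remaining()) {
        return NULL; /* overflow or pool exhausted */
    }
    void *p = &sketch_pool[sketch_used];
    sketch_used += rounded;
    return p;
}

/* Recycle the whole pool, e.g. before the next unit of work. */
void sketch_reset(void) {
    assert(sketch_used <= SKETCH_POOL_BYTES);
    sketch_used = 0;
}
```

In a design like this, a single unit of work that needs more than the pool size can only succeed if the constant is raised and the code is recompiled, which matches the advice in the error message.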
Comment 6 Tyson 2024-11-05 15:14:48 UTC
(In reply to Paul Floyd from comment #5)
> What happens if you do as the message says
> "Increase N_{TEMPORARY,PERMANENT}_BYTES and recompile." (in this case I
> presume N_TEMPORARY_BYTES, I suggest that you try 10 million).

I did, that's where the assertion message came from. Please see the attached assertion.txt
Comment 7 Paul Floyd 2024-11-05 18:54:29 UTC
Sorry, I didn't read carefully enough. The assert will be trickier.