In function doRegisterAllocation(), array "reg_usage_arr" is currently allocated with:

   LibVEX_Alloc_inline(sizeof(HRegUsage) * instrs_in->arr_used - 1)

However, I think this is wrong. It should be allocated with:

   LibVEX_Alloc_inline(sizeof(HRegUsage) * instrs_in->arr_used);

That's because reg_usage_arr is indexed with [0 .. arr_used-1], so its length needs to be at least arr_used elements (and not arr_used-1).
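For anyone not looking at the VEX sources, a minimal self-contained illustration of the same pattern (plain malloc instead of LibVEX_Alloc_inline, and a made-up stand-in struct; purely to show why the shorter allocation is an overrun):

   #include <stdlib.h>

   typedef struct { int live_after; int dead_before; } HRegUsageLike;  /* stand-in */

   void demo(int arr_used)
   {
      /* Undersized: the last element, arr[arr_used-1], partly falls
         outside the allocation. */
      HRegUsageLike* arr
         = malloc(sizeof(HRegUsageLike) * arr_used - 1);

      for (int i = 0; i < arr_used; i++)   /* indexes [0 .. arr_used-1] */
         arr[i].live_after = 0;            /* the last write runs off the end */

      free(arr);
   }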
Created attachment 106081 [details]
proposed patch
(In reply to Ivo Raisr from comment #0)
> In function doRegisterAllocation(), array "reg_usage_arr" is currently
> allocated with:
> LibVEX_Alloc_inline(sizeof(HRegUsage) * instrs_in->arr_used - 1).

Even more bizarrely, if you consider the priority of * vs -, it is:

   ( sizeof(HRegUsage) * instrs_in->arr_used ) - 1

which I'm sure isn't what the original author intended :-)

But that might explain why it works. And why Philippe's self-hosting Memcheck runs never detected it: because LibVEX_Alloc_inline will surely have rounded that value up to the next word size (or allocation unit, I think 8 or 16 bytes) and in so doing will have removed the effect of the "-1". Bizarre, huh!
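To make the rounding argument concrete, a toy calculation (the numbers, including the 8-byte rounding unit, are made up for illustration):

   /* Suppose sizeof(HRegUsage) == 56 and instrs_in->arr_used == 10. */
   size_t presumably_intended = 56 * (10 - 1);   /* 504 bytes */
   size_t actually_requested  = 56 * 10 - 1;     /* 559 bytes, by precedence */
   /* An allocator that rounds requests up to an 8-byte unit turns 559 into
      560 == 56 * 10, i.e. exactly enough room for arr_used elements, so for
      these numbers the stray "-1" has no effect at all. */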
(In reply to Ivo Raisr from comment #1)
> Created attachment 106081 [details]
> proposed patch

r+
Fixed in Valgrind SVN commit r16445 (NEWS) and VEX SVN commit r3392.
(In reply to Julian Seward from comment #2)
> But that might explain why it works. And why Philippe's self-hosting
> Memcheck runs never detected it: because LibVEX_Alloc_inline will surely
> have rounded that value up to the next word size (or allocation unit,
> I think 8 or 16 bytes) and in so doing will have removed the effect
> of the "-1". Bizarre, huh!

In the particular case of the lib VEX 'fast allocator', there is no instrumentation of this allocator: the outer Memcheck is simply not informed that the 'big memory piece' is cut up into small blocks. To detect bugs in this area, LibVEX_Alloc_inline (in the inner) would need to be changed to:
* add some redzone before/after each block
* increase the permanent and temporary sizes to cope with the redzones
* use some mempool client requests to inform the outer Memcheck of the allocated blocks (see the sketch below)
* release all the blocks in one single operation, when the temporary zone is cleared via vexSetAllocModeTEMP_and_clear
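Just to sketch the shape of those changes (this is not a patch: the names temp_pool/temp_curr/temp_end, the helper functions and the REDZONE_SZB value are invented for illustration; only the VALGRIND_* client-request macros are the real interface from valgrind.h/memcheck.h, and the real VEX allocator is of course more involved):

   #include <stddef.h>
   #include "valgrind.h"    /* VALGRIND_CREATE_MEMPOOL, VALGRIND_MEMPOOL_ALLOC, ... */
   #include "memcheck.h"    /* VALGRIND_MAKE_MEM_NOACCESS */

   #define REDZONE_SZB 16   /* invented redzone size; the zones would need to
                               grow by this much per allocated block */

   /* Assume these are set elsewhere to delimit the temporary zone. */
   static unsigned char *temp_pool, *temp_curr, *temp_end;

   /* Once, when the temporary zone is created: register it as a pool with a
      redzone and make the whole area unaddressable until blocks are handed out. */
   static void temp_zone_init(void)
   {
      VALGRIND_CREATE_MEMPOOL(temp_pool, REDZONE_SZB, /*is_zeroed*/0);
      VALGRIND_MAKE_MEM_NOACCESS(temp_pool, temp_end - temp_pool);
      temp_curr = temp_pool;
   }

   /* What LibVEX_Alloc_inline would do in the inner: skip a redzone, then
      tell the outer Memcheck about the individual block. */
   static void* temp_zone_alloc(size_t nbytes)
   {
      unsigned char* p = temp_curr + REDZONE_SZB;
      temp_curr = p + nbytes;               /* exhaustion check omitted */
      VALGRIND_MEMPOOL_ALLOC(temp_pool, p, nbytes);
      return p;
   }

   /* What vexSetAllocModeTEMP_and_clear would add: drop every block in one
      operation by destroying and re-creating the pool, and re-poison the zone. */
   static void temp_zone_clear(void)
   {
      VALGRIND_DESTROY_MEMPOOL(temp_pool);
      VALGRIND_CREATE_MEMPOOL(temp_pool, REDZONE_SZB, /*is_zeroed*/0);
      VALGRIND_MAKE_MEM_NOACCESS(temp_pool, temp_end - temp_pool);
      temp_curr = temp_pool;
   }

The unaddressable redzones between blocks are what would let the outer Memcheck catch an overrun like the one in comment #0.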
(In reply to Philippe Waroquiers from comment #5)
> In the particular case of the lib VEX 'fast allocator', there is
> no instrumentation of this allocator [..]

Ah. Good point. Hmm. Given that VEX is a big(ish) and complex bit of code, and that it will be getting more complex with Ivo's IR work, it would be nice if we could Memcheck it properly. In particular I'd like to be able to check for uninitialised value uses in it. Is it possible you could make a patch to do what you said, at some point?
I concur.

I hope this will shed some light on a mysterious bug I see on x86/Solaris (not amd64/Solaris), which manifests as the following debug output printed for almost all gdbserver_tests:

   vex iropt: 4 x unrolling (25 sts -> 100 sts)
   vex iropt: 2 x unrolling (50 sts -> 100 sts)
   vex iropt: 2 x unrolling (34 sts -> 68 sts)
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t47,t47)
   vex iropt: 8 x unrolling (13 sts -> 104 sts)
   vex iropt: 2 x unrolling (56 sts -> 112 sts)
   vex iropt: 8 x unrolling (14 sts -> 112 sts)
   vex iropt: 2 x unrolling (57 sts -> 114 sts)
   vex iropt: 2 x unrolling (47 sts -> 94 sts)
   vex iropt: 2 x unrolling (33 sts -> 66 sts)
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t68,t68)
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t62,t62)
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t66,t66)
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t57,t57)
   vex iropt: 2 x unrolling (39 sts -> 78 sts)
   vex iropt: 2 x unrolling (33 sts -> 66 sts)
   vex iropt: 2 x unrolling (57 sts -> 114 sts)
   vex iropt: 2 x unrolling (39 sts -> 78 sts)
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: 2 x unrolling (32 sts -> 64 sts)
   vex iropt: 2 x unrolling (44 sts -> 88 sts)
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: fold_Expr: no ident rule for: 32HLto64(t118,t118)
   vex iropt: 2 x unrolling (33 sts -> 66 sts)

This is printed only for a brief duration, and I can reproduce it outside the 'regtest' harness. For example with nlvgdbsigqueue, following the GDB commands in nlvgdbsigqueue.stdinB.gdb, it is printed only after I hit "continue":

   ../vg-in-place --tool=none --vgdb=yes --vgdb-error=0 --vgdb-prefix=./vgdb-prefix-nlvgdbsigqueue ./sleepers 1000000000 1000000000 1 BSBSBSBS 1
   ==9979== Nulgrind, the minimal Valgrind tool
   ==9979== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote.
   ==9979== Using Valgrind-3.14.0.SVN and LibVEX; rerun with -h for copyright info
   ==9979== Command: ./sleepers 1000000000 1000000000 1 BSBSBSBS 1
   ==9979==
   ==9979== (action at startup) vgdb me ...
   ==9979==
   ==9979== TO DEBUG THIS PROCESS USING GDB: start GDB like this
   ==9979==   /path/to/gdb ./sleepers
   ==9979== and then give GDB the following command
   ==9979==   target remote | /export/home/tester1/valgrind-32/gdbserver_tests/../.in_place/../../bin/vgdb --vgdb-prefix=./vgdb-prefix-nlvgdbsigqueue --pid=9979
   ==9979== --pid is optional if only one valgrind process is running
   ==9979==

   [here all the signals are queued as described in nlvgdbsigqueue.stdinB.gdb]
   [then do "continue" in GDB]

   vex iropt: 4 x unrolling (20 sts -> 80 sts)
   vex iropt: 2 x unrolling (41 sts -> 82 sts)
   vex iropt: 4 x unrolling (29 sts -> 116 sts)
   vex iropt: 2 x unrolling (38 sts -> 76 sts)
   vex iropt: 8 x unrolling (12 sts -> 96 sts)
   vex iropt: 2 x unrolling (32 sts -> 64 sts)
   vex iropt: 8 x unrolling (13 sts -> 104 sts)
   vex iropt: 4 x unrolling (30 sts -> 120 sts)
   vex iropt: 2 x unrolling (39 sts -> 78 sts)
   vex iropt: 4 x unrolling (16 sts -> 64 sts)
   vex iropt: 8 x unrolling (15 sts -> 120 sts)
   vex iropt: 2 x unrolling (34 sts -> 68 sts)
   vex iropt: not unrolling (75 sts)
   vex iropt: 4 x unrolling (29 sts -> 116 sts)
   vex iropt: 4 x unrolling (16 sts -> 64 sts)
   vex iropt: 8 x unrolling (15 sts -> 120 sts)
   vex iropt: 4 x unrolling (23 sts -> 92 sts)
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: 8 x unrolling (13 sts -> 104 sts)
   vex iropt: 8 x unrolling (12 sts -> 96 sts)
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: 2 x unrolling (35 sts -> 70 sts)
   loops/sleep_ms/burn/threads_spec/affinity: 1000000000 1000000000 1 BSBSBSBS 1
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: 2 x unrolling (50 sts -> 100 sts)
   vex iropt: IRStmt_Exit became unconditional
   vex iropt: 2 x unrolling (31 sts -> 62 sts)
   vex iropt: 2 x unrolling (34 sts -> 68 sts)
   vex iropt: 8 x unrolling (13 sts -> 104 sts)
   vex iropt: 2 x unrolling (50 sts -> 100 sts)
   Brussels ready to sleep and/or burn
   London ready to sleep and/or burn
   Petaouchnok ready to sleep and/or burn
   main ready to sleep and/or burn

Unfortunately I cannot attach to the main Valgrind process with gdb to watch who has supposedly mistakenly overwritten vex_control.iropt_verbosity, because by that time the procfs agent thread has been created in the process, and gdb gets confused by it and crashes.
(In reply to Ivo Raisr from comment #7)
> I hope this will shed some light on a mysterious bug I see on x86/Solaris
> (not amd64/Solaris), which manifests as the following debug output printed
> for almost all gdbserver_tests:

Wow. That's pretty scary. It is as if vex_control.iropt_verbosity has been changed unexpectedly. Can you use nm on none-x86-solaris to find out what variables/structures are the immediate neighbours of vex_control? Maybe that would give a clue.
Hmm, does not strike me as something obvious...

   ...
   580e1a50 t varstack_show
   58003b80 R version
   58108f90 T vex_assert_fail
   58108f20 T vex_bzero
   5904a72c B vex_control        <==
   5904a74c B vex_debuglevel
   5904a754 B vex_failure_exit
   58179740 b vex_init_done.25661
   5904a758 B vex_initdone
   58113d00 T vex_inject_ir
   ...
(In reply to Ivo Raisr from comment #9)
> Hmm, does not strike me as something obvious...
> 5904a754 B vex_failure_exit
> 58179740 b vex_init_done.25661
> 5904a758 B vex_initdone
> 58113d00 T vex_inject_ir

I think those symbols are sorted by name, not address. -n flag for nm maybe?
Indeed:

   ...
   5904a71c b p_CmpNEZ32_Or32.5344
   5904a720 b p_CmpNEZ32_And32.5340
   5904a724 b p_zwiden_load64.5765
   5904a728 b p_CmpwNEZ64_Or64.5612
   5904a72c B vex_control        <===
   5904a748 B vex_traceflags
   5904a74c B vex_debuglevel
   5904a750 B vex_log_bytes
   5904a754 B vex_failure_exit
   5904a758 B vex_initdone
   ...
(In reply to Ivo Raisr from comment #11)
> 5904a71c b p_CmpNEZ32_Or32.5344
> 5904a720 b p_CmpNEZ32_And32.5340
> 5904a724 b p_zwiden_load64.5765
> 5904a728 b p_CmpwNEZ64_Or64.5612
> 5904a72c B vex_control <===
> 5904a748 B vex_traceflags
> 5904a74c B vex_debuglevel
> 5904a750 B vex_log_bytes
> 5904a754 B vex_failure_exit
> 5904a758 B vex_initdone

I don't know. That all looks pretty harmless. The p_* values are single pointers to IR trees used in insn selection (they are templates used for matching) and I can't imagine how they could be involved in an array overrun. I do notice that iropt_verbosity is the first word of VexControl, but whether that is significant, I also don't know.
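Just to make the address arithmetic in that nm listing concrete (this assumes 32-bit x86, so pointers and Ints are 4 bytes, and that the .bss symbols really are laid out back to back):

   5904a728  p_CmpwNEZ64_Or64   4-byte pointer, ends at 5904a72c
   5904a72c  vex_control        first field is iropt_verbosity (per comment #12)
   5904a748  vex_traceflags     so vex_control spans 0x1c (28) bytes here

Under those assumptions, an errant write even one word past the end of the last p_* pointer would land on the first word of vex_control, i.e. exactly on iropt_verbosity.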
(In reply to Julian Seward from comment #6)
> (In reply to Philippe Waroquiers from comment #5)
> > In the particular case of the lib VEX 'fast allocator', there is
> > no instrumentation of this allocator [..]
>
> Ah. Good point. Hmm. Given that VEX is a big(ish) and complex bit of
> code, and that it will be getting more complex with Ivo's IR work, it would
> be nice if we could Memcheck it properly. In particular I'd like to be able
> to check for uninitialised value uses in it. Is it possible you could make
> a patch to do what you said, at some point?

Yes, I will take a look. Note that this might heavily slow down the inner operations, as every VEX allocation will imply significantly heavier work: take a stack trace, record it, etc. I will watch the impact (of course, no impact is expected unless configured as an inner Valgrind).