394361 – [Enhancement] : Client request to control thread-yielding in valgrind

Summary: [Enhancement] : Client request to control thread-yielding in valgrind

Status:	REPORTED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	general (other bugs)
Version First Reported In:	3.14 SVN
Platform:	RedHat Enterprise Linux Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2018-05-17 09:12 UTC by Manish Goel
Modified:	2018-09-04 03:14 UTC
CC List:	0 users

Attachments
patch-file (2.46 KB, application/mbox) 2018-05-17 09:12 UTC, Manish Goel

Description Manish Goel 2018-05-17 09:12:13 UTC

Created attachment 112702
patch-file

Hi, 

I have created a valgrind client request, "VALGRIND_YIELD", which makes the thread currently running under valgrind yield.
This helps in a scenario where an app has multiple consumer threads that grab objects and execute them, and a data race is possible between the execution of two grabbed objects. Because helgrind by default runs a thread for 100000 basic blocks before switching, a single consumer thread tends to grab all the objects, so no race is reported under helgrind.
With this client request, a consumer thread can yield to another consumer thread after a client-specified number of grabbed objects, which lets us reproduce the race-causing scenario under helgrind as well.
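
A minimal sketch of how a consumer loop might use the request. It assumes VALGRIND_YIELD takes no arguments and comes from the attached patch; it is not part of a stock valgrind.h, so the sketch falls back to sched_yield() as a stand-in when the macro is absent. The queue, the constants and the function names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* ASSUMPTION: VALGRIND_YIELD is provided by the attached patch, not by a
   stock valgrind.h. Without it, sched_yield() is used as a stand-in so the
   sketch still compiles. */
#ifndef VALGRIND_YIELD
#define VALGRIND_YIELD() ((void)sched_yield())
#endif

#define NUM_OBJECTS   8  /* hypothetical number of queued objects */
#define NUM_CONSUMERS 2
#define YIELD_EVERY   2  /* client-chosen: yield after this many grabbed objects */

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_object;                /* index of the next object to grab */
static int processed_by[NUM_OBJECTS];  /* consumer id that ran each object */

/* Grab the next object under the lock; returns -1 when the queue is empty. */
static int grab_object(void)
{
    int obj = -1;
    pthread_mutex_lock(&queue_lock);
    if (next_object < NUM_OBJECTS)
        obj = next_object++;
    pthread_mutex_unlock(&queue_lock);
    return obj;
}

static void *consumer(void *arg)
{
    int id = *(const int *)arg;
    int grabbed = 0;
    int obj;

    while ((obj = grab_object()) >= 0) {
        processed_by[obj] = id;  /* stands in for executing the object */
        if (++grabbed % YIELD_EVERY == 0)
            VALGRIND_YIELD();    /* give another consumer a chance to grab */
    }
    return NULL;
}

/* Run the consumers over a fresh queue; returns how many objects were processed. */
static int run_consumers(void)
{
    pthread_t threads[NUM_CONSUMERS];
    int ids[NUM_CONSUMERS];
    int i, processed = 0;

    next_object = 0;
    for (i = 0; i < NUM_OBJECTS; i++)
        processed_by[i] = 0;

    for (i = 0; i < NUM_CONSUMERS; i++) {
        int rc;
        ids[i] = i + 1;
        rc = pthread_create(&threads[i], NULL, consumer, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < NUM_CONSUMERS; i++)
        pthread_join(threads[i], NULL);

    for (i = 0; i < NUM_OBJECTS; i++)
        if (processed_by[i] != 0)
            processed++;
    return processed;
}
```

Calling run_consumers() from main() and running the binary under helgrind would exercise the loop. YIELD_EVERY plays the role of the client-specified number of grabbed objects.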

I have attached the patch to this bug. Kindly review it and merge it into valgrind.

Thanks & Regards
Manish Goel

Comment 1 Julian Seward 2018-09-03 06:40:36 UTC

Did you try without your patch, but with the flag --fair-sched=yes?

Comment 2 Manish Goel 2018-09-04 03:14:41 UTC

(In reply to Julian Seward from comment #1)
> Did you try without your patch, but with the flag --fair-sched=yes?

Yes, I did try with --fair-sched=yes, but it didn't work and there were still data races.

Further elaboration:
Imagine we have 3 jobs, J1, J2 and J3, to be done in parallel by 2 threads, T1 and T2, with a syncpoint between the threads after all the jobs are done. The jobs are of unequal size, possibly a few hundred basic blocks each.
The threads compete with each other to capture and process these jobs.
So in a normal run, T1 would acquire, say, J1 and T2 would acquire, say, J2. Whoever finishes first (say T1) acquires J3.

But with helgrind, only one of the threads, say T1 (picked fairly), would be scheduled to run, and it would end up capturing and processing all the jobs because of the 100K basic-block heuristic. Since all the jobs ran in a single thread, no data race between them would be reported.

With this patch, we start with a thread, say T1 (picked fairly), which processes J1 and then yields (because of the newly added client request). Thread T2 then processes J2, reports data races between T1.J1 and T2.J2, and yields. T1 then acquires J3, reports data races between T2.J2 and T1.J3, and goes to the syncpoint.
With the help of this patch we were able to see data races between T1.J1 and T2.J2 and between T2.J2 and T1.J3. We still missed data races between T1.J1 and T1.J3, since both jobs ran in the same thread.
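
The reasoning above can be checked with a small model. This is only a sketch under simplifying assumptions: execution is serialized, a thread keeps the CPU until it has run a fixed number of jobs (standing in for the basic-block timeslice or for an explicit yield), threads take turns round-robin, and only job pairs run by different threads can be reported as races. The function names are hypothetical and nothing here calls valgrind.

```c
#include <assert.h>

#define NUM_JOBS    3  /* J1, J2, J3 */
#define NUM_THREADS 2  /* T1, T2 */

/* Model of serialized execution: the running thread grabs jobs until it has
   run jobs_per_slice of them, then the next thread takes over (round-robin).
   owner[j] receives the 1-based number of the thread that ran job j. */
static void simulate(int jobs_per_slice, int owner[NUM_JOBS])
{
    int current = 0;       /* 0-based index of the running thread */
    int run_in_slice = 0;  /* jobs run since the last switch */
    int j;

    assert(jobs_per_slice > 0);
    for (j = 0; j < NUM_JOBS; j++) {
        owner[j] = current + 1;
        if (++run_in_slice == jobs_per_slice) {
            current = (current + 1) % NUM_THREADS;
            run_in_slice = 0;
        }
    }
}

/* Count job pairs that ran in different threads; under the model's
   assumption, only these pairs can produce a race report. */
static int cross_thread_pairs(const int owner[NUM_JOBS])
{
    int a, b, pairs = 0;

    for (a = 0; a < NUM_JOBS; a++)
        for (b = a + 1; b < NUM_JOBS; b++)
            if (owner[a] != owner[b])
                pairs++;
    return pairs;
}
```

With jobs_per_slice equal to NUM_JOBS (one thread drains the queue within its timeslice), every job is owned by T1 and there are no cross-thread pairs. With jobs_per_slice equal to 1 (yield after each job), the owners are T1, T2, T1, which gives two cross-thread pairs, J1/J2 and J2/J3, while J1/J3 stays within T1.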

Kindly let me know if you have any further thoughts or suggestions on this.

Thanks & Regards
Manish Goel