Created attachment 123081
Add a per-thread simulate flag to Callgrind

This patch makes it possible to enable and disable Callgrind's cache simulation on a per-thread basis, in the same way that the collect flag controls event collection per thread.

This is useful, for example, when using Callgrind to profile the memory footprint of embedded code running inside a larger simulator process. By enabling cache simulation only for the threads doing the interesting work, we keep memory accesses from irrelevant threads from disturbing the state of the simulated cache.
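To illustrate the interference described above, here is a toy model in C. It is not code from the patch and does not use Callgrind's internals: `ToyCache`, `Access`, `count_misses`, and `demo_misses` are hypothetical names, and the cache is a four-line direct-mapped cache with 64-byte lines chosen only to keep the example small. The sketch replays a fixed access trace and lets an access touch the cache only when the issuing thread's simulate flag is set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LINES  4   /* toy direct-mapped cache with 4 lines (assumption) */
#define LINE_SHIFT 6   /* 64-byte cache lines (assumption) */

typedef struct {
    uint64_t tag[NUM_LINES];
    bool     valid[NUM_LINES];
} ToyCache;

typedef struct {
    int      tid;   /* index of the thread issuing the access */
    uint64_t addr;  /* accessed address */
} Access;

/* Look up addr in the cache; on a miss, install the line and return true. */
static bool toy_cache_access(ToyCache *c, uint64_t addr)
{
    uint64_t line = addr >> LINE_SHIFT;
    size_t   idx  = (size_t)(line % NUM_LINES);

    if (c->valid[idx] && c->tag[idx] == line)
        return false;              /* hit */
    c->valid[idx] = true;          /* miss: replace whatever was there */
    c->tag[idx]   = line;
    return true;
}

/* Replay a trace against a fresh cache. Accesses from threads whose
 * simulate flag is false never touch the cache. Returns the number of
 * misses attributed to thread `target`. */
static unsigned count_misses(const Access *trace, size_t n,
                             const bool *simulate, int target)
{
    ToyCache c = {0};
    unsigned misses = 0;

    for (size_t i = 0; i < n; i++) {
        if (!simulate[trace[i].tid])
            continue;              /* per-thread simulate flag is off */
        bool miss = toy_cache_access(&c, trace[i].addr);
        if (miss && trace[i].tid == target)
            misses++;
    }
    return misses;
}

/* Thread 0 re-reads address 0x000. Thread 1 touches 0x100, which maps to
 * the same cache line index and therefore evicts thread 0's line. */
static unsigned demo_misses(bool simulate_noise_thread)
{
    static const Access trace[] = {
        {0, 0x000}, {1, 0x100}, {0, 0x000}, {1, 0x100}, {0, 0x000},
    };
    const bool simulate[2] = { true, simulate_noise_thread };

    return count_misses(trace, sizeof trace / sizeof trace[0], simulate, 0);
}
```

In this toy trace, `demo_misses(true)` attributes three misses to thread 0, because thread 1 keeps evicting its line, while `demo_misses(false)` attributes only the single cold miss.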