Bug 350405

Summary: Support for Intel DPDK custom allocator rte_malloc
Product: [Developer tools] valgrind Reporter: Luca Boccassi <luca.boccassi>
Component: generalAssignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: wishlist CC: tom
Priority: NOR    
Version: 3.10 SVN   
Target Milestone: ---   
Platform: Debian testing   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Patch to add support for Intel DPDK allocator rte_malloc family
1 - VALGRIND_NON_SIMD_CALL4 patch
2 - VALGRIND_NON_SIMD_CALL5 patch
3 - drd_realloc alignment
4 - dhat realloc alignment
5 - h_replace realloc alignment
6 - hg_cli realloc alignment
7 - massif realloc alignment
8 - memcheck realloc alignment
9 - DPDK (rte_*alloc) family) support

Description Luca Boccassi 2015-07-20 11:07:53 UTC
Created attachment 93665 [details]
Patch to add support for Intel DPDK allocator rte_malloc family

Dear Maintainers,

Intel's DPDK framework (http://www.dpdk.org/) ships with a custom family of heap allocators with the following signatures:

void *rte_malloc(const char *type, size_t size, unsigned align);
void *rte_zmalloc(const char *type, size_t size, unsigned align);
void *rte_calloc(const char *type, size_t num, size_t size, unsigned align);
void *rte_realloc(void *ptr, size_t size, unsigned align);
void *rte_malloc_socket(const char *type, size_t size, unsigned align, int socket);
void *rte_zmalloc_socket(const char *type, size_t size, unsigned align, int socket);
void *rte_calloc_socket(const char *type, size_t num, size_t size, unsigned align, int socket);
void rte_free(void *ptr);

We are building an application using this framework, and given how amazingly useful Valgrind is, I've created an internal build with a patch to support these allocators, so that we can fully debug our product.

What I have done is simply wrap those functions in libc's malloc family. The result seems to work quite well, so I'd like to ask for feedback (especially on whether there are there other, simpler ways to solve this problem!) and, if you like, to contribute back the patch, which is attached (taken from the SVN head).

Thank you!

Kind regards,
Luca Boccassi
Brocade Communications Systems
Comment 1 Tom Hughes 2015-07-20 11:59:31 UTC
I'm not sure that's safe - you're ignoring the requested alignment (assuming that is what the "align" parameter is) which could easily cause the program to abort if it relies on the alignment being respected.

The arena_malloc call and friends only align to 8 bytes (on 32 bit x86, arm and mips platforms) or 16 bytes (on everything else) so if an alignment larger than that was requested it would be chance if you got it or not.

Also what does the "socket" argument do and what is the effect of ignoring it?
Comment 2 Luca Boccassi 2015-07-20 12:21:28 UTC
Hello Tom,

(In reply to Tom Hughes from comment #1)
> I'm not sure that's safe - you're ignoring the requested alignment (assuming
> that is what the "align" parameter is) which could easily cause the program
> to abort if it relies on the alignment being respected.
> 
> The arena_malloc call and friends only align to 8 bytes (on 32 bit x86, arm
> and mips platforms) or 16 bytes (on everything else) so if an alignment
> larger than that was requested it would be chance if you got it or not.

Yes, that is my major concern. The application we build uses quite heavily a 64 bytes alignment, but I haven't seen it explode in my face (yet) :-)

Do you have any suggestion on how this could be implemented? Any particular area of the Valgrind code I can look at?

> Also what does the "socket" argument do and what is the effect of ignoring
> it?

It's the NUMA socket. DPDK can allocate memory on a specific CPU socket, to improve memory latency (each thread is usually pinned to a specific core). The framework allows the number of NUMA sockets to be a runtime parameter, so I think it's safe to ignore it when running in Valgrind for testing purposes. As far as I know it should not affect correctness, only latency.

Thanks for your feedback!

Kind regards,
Luca Boccassi
Brocade Communications Systems
Comment 3 Luca Boccassi 2015-11-05 17:05:20 UTC
Created attachment 95335 [details]
1 - VALGRIND_NON_SIMD_CALL4 patch
Comment 4 Luca Boccassi 2015-11-05 17:05:52 UTC
Created attachment 95336 [details]
2 - VALGRIND_NON_SIMD_CALL5 patch
Comment 5 Luca Boccassi 2015-11-05 17:06:50 UTC
Created attachment 95337 [details]
3 - drd_realloc alignment
Comment 6 Luca Boccassi 2015-11-05 17:07:32 UTC
Created attachment 95338 [details]
4 - dhat realloc alignment
Comment 7 Luca Boccassi 2015-11-05 17:08:04 UTC
Created attachment 95339 [details]
5 - h_replace realloc alignment
Comment 8 Luca Boccassi 2015-11-05 17:08:37 UTC
Created attachment 95340 [details]
6 - hg_cli realloc alignment
Comment 9 Luca Boccassi 2015-11-05 17:09:03 UTC
Created attachment 95341 [details]
7 - massif realloc alignment
Comment 10 Luca Boccassi 2015-11-05 17:09:25 UTC
Created attachment 95342 [details]
8 - memcheck realloc alignment
Comment 11 Luca Boccassi 2015-11-05 17:10:01 UTC
Created attachment 95343 [details]
9 - DPDK (rte_*alloc) family) support
Comment 12 Luca Boccassi 2015-11-05 17:15:26 UTC
(In reply to Tom Hughes from comment #1)
> I'm not sure that's safe - you're ignoring the requested alignment (assuming
> that is what the "align" parameter is) which could easily cause the program
> to abort if it relies on the alignment being respected.
> 
> The arena_malloc call and friends only align to 8 bytes (on 32 bit x86, arm
> and mips platforms) or 16 bytes (on everything else) so if an alignment
> larger than that was requested it would be chance if you got it or not.

Hi Tom,

Found some time to work again on this, apologies for the delay.

I have removed the previous patch and split my changes in 9 patches, since in order to support alignment I had to refactor the implementation of the helper functions that the various tools use to implement realloc. Also I added VALGRIND_NON_SIMD_CALL4 and VALGRIND_NON_SIMD_CALL5, since some of the rte_* functions have 4 and 5 parameters.

I thought it would be cleaner and nicer to see each refactor in its own patch. They apply cleanly on the current HEAD of SVN.

I tested them with some very large alignment values with our DPDK application in Debian stable 64bit and I had no issues.
Comment 13 Luca Boccassi 2015-11-05 17:30:38 UTC
Forgot to mention, since DPDK is often statically compiled, I added support for that too. Tested by running valgrind with --soname-synonyms=somalloc=NONE