Bug 356174

Summary: Enhance the embedded gdbserver to allow LLDB to use it
Product: [Developer tools] valgrind Reporter: Daniel Trebbien <dtrebbien>
Component: generalAssignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: wishlist CC: dtrebbien, ivosh, philippe.waroquiers, pjfloyd, sam
Priority: NOR    
Version: 3.10.0   
Target Milestone: ---   
Platform: unspecified   
OS: All   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Add support for 'qC' packets
Fix handling of an 'Hc-1' packet
Swap declarations of r15 and rip
Add generic register info
Add generic register info (2)

Description Daniel Trebbien 2015-12-01 22:36:05 UTC
Because Mac OS X no longer ships with gdb, it would be useful to allow LLDB's support for gdb-remote debugging to be used with the embedded gdbserver.

Reproducible: Always

Steps to Reproduce:
On a Mac system:
1. Install Homebrew's valgrind package.
2. In one terminal, start valgrind with --vgdb-error=0:
valgrind --vgdb-error=0 ./myprogram
3. In another terminal, start LLDB:
lldb ./myprogram
4. In another terminal, start vgdb listening on an unused port (e.g. port 12851):
vgdb --port=12851
5. In the LLDB terminal, run the gdb-remote command to connect:
(lldb) gdb-remote 12851
6. Try to continue by running 'c' in LLDB:
(lldb) c

Actual Results:  
LLDB will display "error: Process must be launched." because it thinks that the ./myprogram process has not been started.

Expected Results:  
Running 'c' in LLDB after connecting should continue execution of the program and LLDB should be usable to debug.

I am using:
- valgrind-3.11.0
- Mac OS X 10.11.1 'El Capitan'
- lldb-340.4.110.1
Comment 1 Daniel Trebbien 2015-12-01 23:00:20 UTC
Created attachment 95852 [details]
Add support for 'qC' packets

When LLDB connects, it sends the following packets:
QStartNoAckMode
qSupported:xmlRegisters=i386,arm,mips
QThreadSuffixSupported
QListThreadsInStopReply
qHostInfo
vCont?
qVAttachOrWaitSupported
qProcessInfo
qC

(You can view the log of packets sent and received by enabling gdb-remote 'packets' logging in LLDB:
log enable gdb-remote packets )

Adding support for the 'qC' packet causes LLDB to send a '?' packet.  From the response LLDB then figures out that the program is currently halted, and you can then run the 'c' command in LLDB.

While this patch makes some progress in allowing LLDB to use the embedded gdbserver, LLDB currently segfaults after continuing.  I am looking into this segfault issue to see how it might be fixed.
Comment 2 Daniel Trebbien 2015-12-01 23:23:55 UTC
Created attachment 95853 [details]
Fix handling of an 'Hc-1' packet

Looking through the packets logged just before the segfault, I noticed that the response to an 'Hc-1' packet was not correct.  For example:
<   8> send packet: $Hc-1#09
<   7> read packet: $E01#a6

This is due to two issues:
1. Within server_main(), `strtoul (&own_buf[2], NULL, 16);' actually expands to a call to VG_(strtoull16).  However, unlike the standard C library strtoul() routine, VG_(strtoull16) did not support a minus sign.
2. gdb_id_to_thread_id() does not handle the case where the gdb thread ID is -1 (which means "all threads" in this case:  https://sourceware.org/gdb/current/onlinedocs/gdb/Packets.html#Packets ).

The attached patch adds minus sign and overflow handling to VG_(strtoull10) and VG_(strtoull16) so that their behavior is closer to the standard strtoull() function:  http://pubs.opengroup.org/onlinepubs/000095399/functions/strtoull.html
The patch also modifies server_main() to handle the case where `gdb_id' is `(unsigned long)-1'.
Comment 3 Daniel Trebbien 2015-12-02 13:30:14 UTC
The two patches allow LLDB 3.4, 3.5, and 3.6 to debug using the embedded gdbserver on an Ubuntu 14.04.3 'Trusty Tahr' system.

The crash on Mac has the following backtrace:
```
Thread 5 Crashed:
0   com.apple.LLDB.framework      	0x0000000108b637ad ABISysV_x86_64::GetArgumentValues(lldb_private::Thread&, lldb_private::ValueList&) const + 147
1   com.apple.LLDB.framework      	0x00000001089aba51 DynamicLoaderMacOSXDYLD::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 399
2   com.apple.LLDB.framework      	0x000000010882ce04 lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 48
3   com.apple.LLDB.framework      	0x0000000108829f7e lldb_private::BreakpointLocation::InvokeCallback(lldb_private::StoppointCallbackContext*) + 82
4   com.apple.LLDB.framework      	0x000000010882a6de lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*) + 116
5   com.apple.LLDB.framework      	0x000000010882b418 lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*) + 76
6   com.apple.LLDB.framework      	0x0000000108833884 lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*) + 48
7   com.apple.LLDB.framework      	0x0000000108ac0db5 lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*) + 241
8   com.apple.LLDB.framework      	0x0000000108adabdf lldb_private::Thread::ShouldStop(lldb_private::Event*) + 715
9   com.apple.LLDB.framework      	0x0000000108ae0562 lldb_private::ThreadList::ShouldStop(lldb_private::Event*) + 502
10  com.apple.LLDB.framework      	0x0000000108aae550 lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) + 396
11  com.apple.LLDB.framework      	0x0000000108aab685 lldb_private::Process::HandlePrivateEvent(std::__1::shared_ptr<lldb_private::Event>&) + 471
12  com.apple.LLDB.framework      	0x0000000108aaed3a lldb_private::Process::RunPrivateStateThread() + 568
13  com.apple.LLDB.framework      	0x0000000108aae771 lldb_private::Process::PrivateStateThread(void*) + 9
14  libsystem_pthread.dylib       	0x00007fff8e3b59b1 _pthread_body + 131
15  libsystem_pthread.dylib       	0x00007fff8e3b592e _pthread_start + 168
16  libsystem_pthread.dylib       	0x00007fff8e3b3385 thread_start + 13
```

This appears to be Apple-specific.

If these two patches are committed, then I think that this bug can be closed as "fixed".
Comment 4 Philippe Waroquiers 2015-12-03 22:15:54 UTC
Thanks for the analysis and patches. I can reproduce on debian8/x86 the problem that
implies to have support for qC.
I have quickly looked at the patches, which look ok.
I should be able to handle and commit the patches in a few days.
Comment 5 Daniel Trebbien 2015-12-04 12:18:11 UTC
I contacted Apple Developer Relations about the crash and they informed me that unlike gdb, lldb does not have an initial target definition set, and relies on the gdbserver to tell it which registers the gdbserver supports.  This can be done either by responding to 'qRegisterInfo XX' packets or to 'qXfer:features:read:target.xml'.

They also informed me about the plugin.process.gdb-remote.target-definition-file LLDB setting and the example target definitions at http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/
I can confirm that using either x86_64_linux_target_definition.py or x86_64_target_definition.py fixes the segfault issue.

I have posted a message to lldb-dev to ask if anyone can see why the XML target definition sent back by the embedded gdbserver might be causing the crash:
http://lists.llvm.org/pipermail/lldb-dev/2015-December/009011.html
Comment 6 Daniel Trebbien 2015-12-04 18:37:41 UTC
Created attachment 95898 [details]
Swap declarations of r15 and rip

Through trial and error I found that the declaration of the 'r15' register is the problematic part of the XML target definition.  Simply commenting out the declaration of 'r15' prevents the crash; however, this also hides the r15 register.

Another solution is to swap the declarations of the 'r15' and 'rip' registers.  The patch implements this approach of swapping the register declarations.

The patch also adds group="general" to the declarations of 'rax' through 'r15'.  This makes the output of the `register read` command nicer in LLDB.  Without group="general", `register read` prints out a number of blank lines and also categorizes these registers as "unknown".
Comment 7 Daniel Trebbien 2015-12-05 13:08:49 UTC
Created attachment 95906 [details]
Add generic register info

Swapping the 'rip' and 'r15' register declarations is not the right fix.

In this case, as explained by Greg Clayton (http://lists.llvm.org/pipermail/lldb-dev/2015-December/009021.html ), LLDB is looking for the 'arg1' through 'arg6' generic registers.  This information can be supplied by adding appropriate `generic' attributes to the <reg> elements.

This new patch adds `generic' attributes to the declarations of 'rcx', 'rdx', 'rsi', 'rdi', 'rbp', 'rsp', 'r8', 'r9', 'rip', and 'eflags'.
Comment 8 Philippe Waroquiers 2015-12-05 23:41:25 UTC
I already committed (svn 15743) the support for qC (slightly modified patch).

With this, lldb somewhat works on debian8/x86 or Ubuntu14/amd64
(e.g. continue till a breakpoint works).

However, in both environments, I encounter several things not working:
* unwind and/or frame discovery seems to not work properly.
    e.g.  bt   shows an empty frame:
* thread #1: tid = 29071, , stop reason = signal SIGTRAP
  * frame #0: 

            register read  gives an error
(lldb) register read
error: invalid frame

Despite the fact that valgrind gdbserver tells qXfer:features:read+; is supported,
lldb does not send requests to read the target description.

Finally, it is not very clear how to send monitor commands from lldb.
'process plugin packet monitor v.info scheduler'
sends the command v.info scheduler, but the behaviour is strange:
* the first output line is shown (encoded in protocol layout).
* then if continue is given, the rest of the output is shown properly.

Note that the Hc-1 handling seems not very critical, but applying the patch does
not improve the above in any case.
Comment 9 Philippe Waroquiers 2015-12-06 00:09:38 UTC
An additional note: on x86 linux, valgrind gdbserver only reports  the Xfer:features:read+ 
supported if --vgdb-shadow-registers=yes is given.
On amd64 linux, Xfer:features:read+ is reported as supported if either shadow registers are
requested or if the host has avx register.
Otherwise, valgrind gdbserver expects that the debugger knows the register layout
of x86 or amd64.
Comment 10 Daniel Trebbien 2015-12-07 18:03:25 UTC
(In reply to Philippe Waroquiers from comment #9)
> Otherwise, valgrind gdbserver expects that the debugger knows the register
> layout of x86 or amd64.

I think that this is what is causing the problem; i.e. that unlike gdb, lldb does not have built-in knowledge of the x86 and amd64 architectures.

On my system (OS X 10.11.1 and Core i7 with AVX support), the embedded gdbserver responds with qXfer:features:read+ and lldb retrieves the target.xml.  In my case, the embedded gdbserver sends back amd64-avx-coresse.xml.  When I comment out the <xi:include>s leaving just the <architecture>i386:x86-64</architecture> element, then I also see "frame #0: 0xffffffffffffffff" and `register read' says "error: invalid frame".

I think that the solution is to always respond with qXfer:features:read+ and for the XML target descriptions to have the generic register information.

I am looking at the LLDB sources, within source/Plugins/ABI, to see what the appropriate generic registers are for the different architectures.
Comment 11 Daniel Trebbien 2015-12-07 21:55:22 UTC
Created attachment 95931 [details]
Add generic register info (2)

This patch adds LLDB generic register info to 32bit-core.xml, 64bit-core.xml, arm-core.xml, mips-cpu.xml, mips64-cpu.xml, power-core.xml, and power64-core.xml based on LLDB sources:
http://llvm.org/svn/llvm-project/lldb/trunk/source/Plugins/ABI/

s390x-core64.xml is not touched because it appears that LLDB does not support s390.

This patch also includes a small change in server.c so that qXfer:features:read+ is always included in the response to a 'qSupported' packet.
Comment 12 Daniel Trebbien 2015-12-07 21:58:19 UTC
Comment on attachment 95852 [details]
Add support for 'qC' packets

The "Add support for 'qC' packets" patch is obsolete because a slightly modified version was committed as Subversion r15743.
Comment 13 Philippe Waroquiers 2015-12-07 21:59:38 UTC
(In reply to Daniel Trebbien from comment #10)
> (In reply to Philippe Waroquiers from comment #9)
> > Otherwise, valgrind gdbserver expects that the debugger knows the register
> > layout of x86 or amd64.
> 
> I think that this is what is causing the problem; i.e. that unlike gdb, lldb
> does not have built-in knowledge of the x86 and amd64 architectures.
> 
> On my system (OS X 10.11.1 and Core i7 with AVX support), the embedded
> gdbserver responds with qXfer:features:read+ and lldb retrieves the
> target.xml.  In my case, the embedded gdbserver sends back
> amd64-avx-coresse.xml.  When I comment out the <xi:include>s leaving just
> the <architecture>i386:x86-64</architecture> element, then I also see "frame
> #0: 0xffffffffffffffff" and `register read' says "error: invalid frame".
> 
> I think that the solution is to always respond with qXfer:features:read+ and
> for the XML target descriptions to have the generic register information.
> 
> I am looking at the LLDB sources, within source/Plugins/ABI, to see what the
> appropriate generic registers are for the different architectures.

On debina8/x86, when using --vgdb-shadow-registers=yes, valgrind gdbserver reports
qXfer:features:read+;
but lldb 3.5.0 does not read target.xml

When I use gdb with the same setup, it requests to read target.xml
Comment 14 Daniel Trebbien 2015-12-07 22:51:43 UTC
(In reply to Philippe Waroquiers from comment #13)
> On debina8/x86, when using --vgdb-shadow-registers=yes, valgrind gdbserver
> reports
> qXfer:features:read+;
> but lldb 3.5.0 does not read target.xml
> 
> When I use gdb with the same setup, it requests to read target.xml

Hmmm, yes, I can confirm that I see this issue (using lldb 3.5 on debian8/x86).  I am taking a look at this.
Comment 15 Daniel Trebbien 2015-12-07 23:32:44 UTC
Looking through the sources of the release_35 and release_36 branches, I see that LLDB 3.5 and 3.6 do not support the target.xml or target definition file ways of specifying register information.  See ProcessGDBRemote::BuildDynamicRegisterInfo():
http://llvm.org/svn/llvm-project/lldb/branches/release_35/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
http://llvm.org/svn/llvm-project/lldb/branches/release_36/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp

LLDB 3.7 is the first to add this support:
http://llvm.org/svn/llvm-project/lldb/branches/release_37/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp

I think that to get this working with LLDB 3.5 and 3.6, the embedded gdbserver would need to respond to 'qRegisterInfo XX' packets.  'qRegisterInfo' is specific to LLDB.  It is documented here:
http://llvm.org/svn/llvm-project/lldb/trunk/docs/lldb-gdb-remote.txt

Adding support for 'qRegisterInfo' would entail using an XML parser in the embedded server which can build up the XML target description (resolving the <xi:include>s if any), and then process the <reg> elements.

Would adding a dependency on an XML parser be out of the question?  Preferably libxml2.
Comment 16 Philippe Waroquiers 2015-12-08 21:56:04 UTC
(In reply to Daniel Trebbien from comment #15)
> Looking through the sources of the release_35 and release_36 branches, I see
> that LLDB 3.5 and 3.6 do not support the target.xml or target definition
> file ways of specifying register information.  See
> ProcessGDBRemote::BuildDynamicRegisterInfo():
> http://llvm.org/svn/llvm-project/lldb/branches/release_35/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> http://llvm.org/svn/llvm-project/lldb/branches/release_36/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> 
> LLDB 3.7 is the first to add this support:
> http://llvm.org/svn/llvm-project/lldb/branches/release_37/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> 
> I think that to get this working with LLDB 3.5 and 3.6, the embedded
> gdbserver would need to respond to 'qRegisterInfo XX' packets. 
> 'qRegisterInfo' is specific to LLDB.  It is documented here:
> http://llvm.org/svn/llvm-project/lldb/trunk/docs/lldb-gdb-remote.txt
> 
> Adding support for 'qRegisterInfo' would entail using an XML parser in the
> embedded server which can build up the XML target description (resolving the
> <xi:include>s if any), and then process the <reg> elements.
> 
> Would adding a dependency on an XML parser be out of the question? 
> Preferably libxml2.
Valgrind core cannot be linked with a library (to avoid problems of interactions
with guest processes that are using the same library).
So, linking with libxml2 is out of the question.

Also, if lldb 3.7 properly supports target.xml, then it looks to me good enough
to make the changes needed to have lldb 3.7 working properly:
As I understood, it is needed to do some modifications to the xml files, to add
some lldb specific xml elements, such as altname or generic.
What we have to ensure is that these elements unknown to gdb are not causing a problem
(the idea is that valgrind gdbserver supports various gdb versions).
 Otherwise, either lldb has to be modified to work with the current xml files,
or alternatively, we maintain 2 sets of xml files : the 'gdb' version and the 'lldb' version
(we can maybe build the first one by automatically editing the 2nd one during build phase).
Comment 17 Philippe Waroquiers 2015-12-08 22:00:16 UTC
Also, without a reasonable working equivalent of gdb 'monitor' command, a lot
of Valgrind gdbserver features are not usable.