Native Client Support for Debugging, Crash Reporting and Hardware Exception Handling
High Level Design
Contributors: Brad Chen, Noel Allen, Mark Seaborn, Evgeny Eltsin
Link to earlier draft: https://docs.google.com/a/google.com/document/d/1tu2FEA4EKhBH669iUgRZBDBcEd6jzNQ-0OVn9JI4_qk/edit?disco=AAAAAEJ6JP0
This document provides a high-level design for a set of related functional areas in Native Client: debugging, crash reporting, and hardware exception handling. It covers key differences between 32-bit and 64-bit x86 systems, and how they are addressed. It also attempts to present a design sketch consistent with the combined requirements from all these functional areas, although it may not fully document all of these requirements.
Debugging, crash reporting and hardware exception handling are areas that have been slow to arrive in Native Client. They share the common property of involving asynchronous interactions between the untrusted Native Client code and the operating system. Note that crash reporting is a special case of hardware exception handling, in which a hardware exception is handled by a crash reporting infrastructure rather than a general-purpose exception handling infrastructure. Also note that when a debugger is attached to a Native Client process, hardware exceptions should be passed to the debugger before any crash reporting or exception handling mechanism.
The use of segment registers by the 32-bit sandbox, and the operating system interaction this requires, makes the 32-bit case somewhat more complicated, so we will consider the 64-bit case first.
For debugging, our ambitions for the first half of 2012 are limited to providing command-line GDB support. This document will not discuss longer-term team goals. With command-line GDB as a base, IDE support should be possible following the approach taken by the Eclipse IDE, as should Visual Studio support in the manner of WinGDB (www.wingdb.com). While third parties may undertake such efforts, they are out of scope for our own work.
For 64-bit systems, there’s nothing special about a Native Client process from the perspective of the operating system. While the sandboxed code sequences and address space layout of a Native Client computation are somewhat peculiar, these properties don’t affect normal interaction with the operating system via system calls, exception handling, the CPU scheduler, memory management, etc. One caveat: on certain Linux distributions, the large virtual address space reservation (~100GB) conflicts with system configurations. Such configuration issues could conceivably occur on other systems as well. Note that MacOS always uses 32-bit Native Client.
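To make the address space caveat concrete, the following is a minimal sketch of reserving a region of virtual address space on Linux without committing memory. It is not the sel_ldr implementation; the function names reserve_address_space and release_address_space are hypothetical, and the use of mmap with PROT_NONE and MAP_NORESERVE is an assumption about one reasonable way to make such a reservation.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `size` bytes of inaccessible address space.
 * Returns the base address, or NULL if the kernel refuses the mapping
 * (for example because of a per-process address space limit).
 */
static void *reserve_address_space(size_t size) {
  assert(size > 0);
  void *base = mmap(NULL, size, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical helper: release a reservation. Returns 0 on success. */
static int release_address_space(void *base, size_t size) {
  return munmap(base, size);
}
```

On a 64-bit system, calling reserve_address_space with a size near 100GB may return NULL when the process runs under an address space limit (for example one set with ulimit -v), which is one way the kind of configuration conflict described above can surface.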
For specific functionality:
GDB has good DWARF support, suggesting we might use it as the basis of debug support on Windows, Mac, and Linux. As it is open source, modifications to support Native Client address interpretation are relatively straightforward. There is an additional question of whether to use an attached external debugger, which uses OS interfaces to update debuggee memory remotely, or a debug stub that implements the GDB Remote Serial Protocol (RSP). In the short term, Evgeny Eltsin has a Linux implementation of the external debugger approach and should be able to port it to Windows with minimal effort to support our current 64-bit platforms.
As the interaction between Native Client and 64-bit operating systems is relatively simple, we should be able to use the standard approaches on these platforms to implement the untrusted Native Client exception handling ABI.
On Linux the POSIX-style signal() API makes it possible to implement Native Client exception handling and crash reporting in a relatively straightforward way.
On MacOS, Breakpad crash reporting uses Mach interfaces for exception handling, which take priority over MacOS signal() support. For this reason, Native Client must also use the Mach interfaces to catch hardware exceptions in coordination with Breakpad.
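For the Linux case described above, the signal-based approach can be sketched in C as follows. This is an illustration under stated assumptions, not the Native Client implementation: all function and variable names are hypothetical, sigaction() is used in place of the older signal() call, raise(SIGSEGV) stands in for a real hardware fault, and recovery via siglongjmp is only one possible policy (a crash reporter would instead record state and exit).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for sigaction, siginfo_t, sigsetjmp */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state shared between the handler and the guarded caller. */
static sigjmp_buf g_recovery_point;
static volatile sig_atomic_t g_last_signal = 0;

/* Handler: record which signal arrived, then unwind to the recovery point. */
static void on_hardware_exception(int signum, siginfo_t *info, void *context) {
  (void)info;    /* would carry the faulting address for a real fault */
  (void)context; /* would carry the saved register state */
  g_last_signal = signum;
  siglongjmp(g_recovery_point, 1);
}

/* Install the handler for one signal. Returns 0 on success, -1 on error. */
static int install_exception_handler(int signum) {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = on_hardware_exception;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  return sigaction(signum, &sa, NULL);
}

/*
 * Run fn under the handler. Returns 0 if fn completed normally, or the
 * signal number if the handler intercepted an exception. Passing 1 to
 * sigsetjmp saves the signal mask so it is restored after siglongjmp.
 */
static int run_guarded(void (*fn)(void)) {
  assert(fn != NULL);
  g_last_signal = 0;
  if (sigsetjmp(g_recovery_point, 1) == 0) {
    fn();
    return 0;
  }
  return (int)g_last_signal;
}

/* Stand-ins for untrusted code: one raises SIGSEGV, one does nothing. */
static void faulting_code(void) { raise(SIGSEGV); }
static void well_behaved_code(void) {}
```

After install_exception_handler(SIGSEGV) succeeds, run_guarded(faulting_code) returns SIGSEGV rather than terminating the process, while run_guarded(well_behaved_code) returns 0. The same entry point could hand off to either a crash reporter or a general-purpose exception handler, which matches the observation that crash reporting is a special case of exception handling.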
For 32-bit Windows, a Native Client process will confuse the OS due to its use of x86 segmented memory. As a result, the Windows kernel will terminate a Native Client process that raises a hardware exception, rather than attempting to deliver the exception. It will, however, deliver the exception to a debugger if one is associated with the process. So, given the use of segments in the current sandbox, we must associate a debug process with a Native Client process in order to catch hardware exceptions. While the same debug process might also implement the GDB Remote Serial Protocol (RSP) to support a 32-bit GDB debugger, this would introduce differences between Windows and the other platforms.
Instead, the proposed design would build RSP support into sel_ldr on all platforms. On Linux and Mac, no other processes are needed, and the implementation is relatively straightforward. On Windows, an exception would initially be delivered to the NaCl process’s debugger, which could shuttle the exception over to sel_ldr via IPC. In this way, all three platforms would use substantially the same RSP implementation, with the only divergence being how exception events are delivered on Windows.
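As a small illustration of what an RSP implementation shared across platforms has to do, the sketch below frames a payload as an RSP packet. RSP packets take the form `$<data>#<checksum>`, where the checksum is the modulo-256 sum of the data bytes written as two hex digits. The function names are hypothetical and this is not sel_ldr code; escaping of special characters, run-length encoding, and the `+`/`-` acknowledgement handshake are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Modulo-256 sum of all payload bytes, as RSP requires. */
static unsigned char rsp_checksum(const char *payload) {
  unsigned int sum = 0;
  assert(payload != NULL);
  for (const char *p = payload; *p != '\0'; ++p) {
    sum += (unsigned char)*p;
  }
  return (unsigned char)(sum & 0xff);
}

/*
 * Write "$<payload>#<xx>" into out, where xx is the checksum in hex.
 * Returns the packet length, or -1 if out is too small.
 * Assumes the payload contains no characters that need escaping.
 */
static int rsp_frame_packet(const char *payload, char *out, size_t out_size) {
  assert(payload != NULL);
  assert(out != NULL);
  size_t needed = strlen(payload) + 5; /* '$', '#', two hex digits, NUL */
  if (out_size < needed) {
    return -1;
  }
  return snprintf(out, out_size, "$%s#%02x", payload,
                  (unsigned int)rsp_checksum(payload));
}
```

For example, framing the payload "S05" (a stop reply reporting signal 5) yields "$S05#b8", and framing "OK" yields "$OK#9a". Under the proposed design, only the source of the exception event would differ on Windows; packet handling like this would be common to all three platforms.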
Once we have stabilized the sel_ldr debug stub for 32-bit debugging, we may consider using it for 64-bit debugging as well. This should be investigated after 32-bit debugging is stabilized.
Both the 32-bit and 64-bit NaCl versions of GDB should provide the following benefits:
Exception handling paths:
Linux-32 and 64:
MacOS: 32-bit only
Windows-32 and 64: