PLCrashReporter 1.2-beta1 (and ARM64 Support!)

September 13, 2013, by Landon Fuller

I’m pleased to announce the first beta release of PLCrashReporter 1.2. Plausible CrashReporter provides an open source in-process crash reporting framework for use on both the iPhone and Mac OS X, and is used by most of the first-tier commercial crash reporting services for Mac OS X and iOS.

This is the first major update to PLCrashReporter’s design since the 1.0 release, and there are a lot of significant improvements; we’ve also set the stage for some even more significant enhancements for Mac OS X and iOS in the future. The extensive work on this release was funded by Plausible Labs and HockeyApp via the PLCrashReporter Consortium.

New features in this release include:

  • Experimental ARM64 support.
  • Mach-based exception handling on Mac OS X and iOS (configurable).
  • Client-side symbolication using the Mach-O symbol table and Objective-C meta-data (configurable).
  • Enhanced stack unwinding using both DWARF and Apple’s Compact Unwind data when available (i386 and x86-64; ARM64 forthcoming).
  • Support for tracking preserved non-volatile registers across frame walking. This makes it possible to provide non-volatile register state for frames other than the first frame when using compact or DWARF-based unwinding.
  • Back-end support for out-of-process execution.
  • A unique incident identifier is now included in all reports.
  • Reports now include the application’s start time. This can be used to determine (along with the crash report timestamp) if an application is crashing on launch.
  • Build and runtime configuration support for enabling/disabling local symbolication and Mach exception-based reporting.
  • Mac OS X x86-64 is now a fully supported target.
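
As a rough illustration of the start-time item above, a report consumer could compare the two timestamps like this. This is only a sketch: `crashed_on_launch` and its threshold parameter are hypothetical names made up for this example, not part of PLCrashReporter's API, and the right threshold depends on your application.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper (not a PLCrashReporter API): decide whether a crash
 * happened "on launch" from the two timestamps carried in a report. */
bool crashed_on_launch(time_t app_start, time_t crash_time, double threshold_secs) {
    /* Seconds the process was alive before it crashed. */
    double uptime = difftime(crash_time, app_start);

    /* A negative uptime means the timestamps are inconsistent (for example,
     * the clock was changed), so do not classify it as a launch crash. */
    if (uptime < 0.0)
        return false;

    return uptime <= threshold_secs;
}
```

With a five second threshold, a report whose crash timestamp is three seconds after the start time would be flagged, while one that is a minute later would not.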

You can download the latest release here, or review the full API Documentation.

More details on a few of the big (and cool) features:

ARM64 Support

We’ve implemented baseline support for ARM64, including all the necessary assembly/architecture code changes. For this initial release — prior to the availability of actual iPhone 5S hardware — we’re providing a separate binary release that includes ARM64 support. This is intended to allow projects that depend on PLCrashReporter to experiment with integrating arm64 into their builds; applications should not be released with PLCrashReporter/ARM64 until the implementation has been validated against actual hardware.

Once we have ARM64 hardware in hand, we’ll validate our implementation via our test suite and fix any issues we find. One of the most exciting changes that we’ll be investigating after the release of the iPhone 5S is support for frame unwinding using the now-available ARM64 compact unwind and DWARF eh_frame data; this will provide the best possible stack traces on iOS, and has not been available for arm32 targets.

Mach Exception Handling

This release supports the use of optional Mach exception handling, rather than a standard POSIX signal handler. The use of Mach exception handling can be configured at runtime, or can easily be excluded/included from the entire build at compile-time.

Mach exceptions differ from POSIX signals in three significant ways:

  • Exception information is delivered as a Mach message via a Mach IPC port, rather than by the kernel calling into a userspace trampoline.
  • Exception handlers may be registered by any process that has the appropriate Mach port rights for the target process.
  • Exception handlers may be registered for a specific thread, a specific task (process), or for the entire host. The kernel will search for handlers in that order.

These properties can be useful for a crash reporter. They allow us to operate a reporter entirely out-of-process on platforms where this is supported (e.g., Mac OS X). They allow us to create multiple tiers of crash reporting (e.g., we can register a per-thread Mach exception handler that detects only crashes that occur in our own crash reporter). And they allow us to catch crashes that leave the currently executing thread in a non-viable state, such as a stack overflow, in which case there is no room on the target thread’s stack for the signal handler’s frame.
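
To make the thread, task, host search order concrete, here is a toy model in plain C. It is not kernel code and uses no Mach APIs; the `handler_scope` enum and `first_registered_scope` function are names invented for this sketch, and a real registration is a Mach port rather than a boolean.

```c
#include <stdbool.h>

/* Scopes at which a Mach exception handler may be registered, listed in the
 * order in which the kernel searches them. (Names are illustrative only.) */
typedef enum {
    SCOPE_THREAD = 0,
    SCOPE_TASK   = 1,
    SCOPE_HOST   = 2,
    SCOPE_COUNT  = 3
} handler_scope;

/* Toy model: registered[s] is true if a handler exists at scope s.
 * Returns the first scope that has a handler, or -1 if none does. */
int first_registered_scope(const bool registered[SCOPE_COUNT]) {
    for (int s = SCOPE_THREAD; s < SCOPE_COUNT; s++) {
        if (registered[s])
            return s;
    }
    return -1;
}
```

In this model, the tiered setup described above corresponds to setting the thread-level entry only for the crash reporter's own thread: exceptions on that thread are seen by the per-thread handler first, while every other thread falls through to the task-level handler.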

However, there are also some downsides, which is why Mach exceptions have been such a long time coming to PLCrashReporter, and why they remain optional:

  • On iOS, the APIs required to implement Mach exception handling are not fully public — more details on the implications of this may be found in the API documentation referenced below.
  • A Mach exception handler may conflict with any managed runtime that registers a BSD signal handler that can safely handle otherwise fatal signals, allowing execution to proceed. This includes products such as Xamarin for iOS.
  • Interpretation of particular fault types often requires information that is architecture/kernel specific, and either partially defined or undefined.

In some circles, Mach exception handling has been described as the “holy grail” of crash reporting. I think that’s a bit of a misnomer; I’d be tempted to call it the “holy hand grenade”: it provides some advantages, but can just as easily explode in an implementor’s hand. In the process of implementing this feature, we found (and worked around) two separate kernel bugs that resulted in an in-kernel deadlock caused by in-process use of Mach exceptions. The fact is that Apple treats Mach exceptions as a partially exposed private API, and the only truly supported consumer of Mach exceptions is Apple’s own Crash Reporting implementation.

Our general recommendation is to continue to use POSIX signal handlers on iOS; for further information, refer to PLCrashReporter’s Mach Exceptions on Mac OS X and iOS documentation.

Mach exception handling may be enabled via -[PLCrashReporter initWithConfiguration:].

Client-side Symbolication

While DWARF debugging information is necessary for first-class symbolication, it’s not always available; for example, when running an in-development copy of your code on your phone for which you lost the dSYM by performing a rebuild. Traditionally, the crash report generated in such a case is useless, as you have no reasonable way of matching it up even to symbol names.

To help in these instances, we’ve implemented support for client-side symbolication, which will provide basic symbol information even when the dSYM is long gone. Our implementation goes quite a bit beyond most other systems: in addition to using the Mach-O symbol table (which is often stripped or, in the case of iOS, has all symbol names replaced with <redacted>), Mike Ash implemented async-safe introspection of the runtime Objective-C metadata to fetch class and method names for all symbols implemented in Objective-C. As far as we know, we’re the only crash reporting implementation to do this, and we think it’s pretty neat.

Client-side symbolication may be enabled via -[PLCrashReporter initWithConfiguration:]; since release builds should always have dSYMs, we recommend only enabling client-side symbolication for non-release builds.
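
For a feel of what symbol-table lookup involves, here is a minimal sketch of mapping a crash address to the nearest preceding symbol. It is an assumption-laden illustration, not PLCrashReporter's implementation: `symbol_entry` is a simplified stand-in for a real Mach-O symbol table entry, and `lookup_symbol` is a hypothetical name. The linear scan needs no allocation or sorting, which is the kind of constraint async-safe code has to live with.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a symbol table entry (hypothetical layout; a real
 * Mach-O table stores string-table offsets rather than name pointers). */
typedef struct {
    uint64_t address;   /* start address of the symbol */
    const char *name;   /* symbol name */
} symbol_entry;

/* Return the entry with the greatest start address that is <= pc, or NULL if
 * every symbol starts after pc. The table does not need to be sorted. */
const symbol_entry *lookup_symbol(const symbol_entry *table, size_t count, uint64_t pc) {
    const symbol_entry *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (table[i].address <= pc &&
            (best == NULL || table[i].address > best->address)) {
            best = &table[i];
        }
    }
    return best;
}
```

Note that "nearest preceding symbol" is a heuristic: if the function that actually contains the address was stripped from the table, the lookup will report whichever earlier symbol survived.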

Enhanced Stack Unwinding

On x86-64 and i386 (and soon, ARM64!), additional unwinding data is provided in-binary, and may be used both to produce better stack traces and to provide the state of non-volatile registers at each frame of the stack. To support this, we’ve implemented fully async-safe and portable implementations of DWARF eh_frame stack unwinding, as well as support for Apple’s Compact Unwind encoding. This should significantly improve stack traces on Mac OS X and, once we have our hands on the hardware and can add support, on ARM64.
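
To show roughly how unwind data recovers a caller's registers, here is a toy single-step unwinder. DWARF call frame information describes a Canonical Frame Address (CFA) as a register plus an offset, and locates saved registers at offsets from the CFA; the sketch applies one such rule to a fake stack. Everything here (`regs`, `unwind_rule`, `fake_stack`, `unwind_step`) is invented for illustration and tracks only three registers; it does not parse eh_frame or compact unwind data and is not PLCrashReporter's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal register set for the sketch: program counter, stack pointer,
 * and frame pointer (a non-volatile register). */
typedef struct { uint64_t pc, sp, fp; } regs;

/* One already-decoded unwind rule (hypothetical, heavily simplified). */
typedef struct {
    bool cfa_from_fp;    /* CFA base register: fp if true, otherwise sp */
    int64_t cfa_offset;  /* CFA = base register + cfa_offset */
    int64_t ra_offset;   /* return address is stored at CFA + ra_offset */
    bool fp_saved;       /* true if this frame saved the caller's fp */
    int64_t fp_offset;   /* caller's fp is stored at CFA + fp_offset */
} unwind_rule;

/* Fake stack memory: 8-byte words starting at address `base`. */
typedef struct { uint64_t base; const uint64_t *words; size_t count; } fake_stack;

/* Bounds-checked, alignment-checked read of one word from the fake stack. */
static bool read_word(const fake_stack *m, uint64_t addr, uint64_t *out) {
    if (addr < m->base || (addr - m->base) % 8 != 0)
        return false;
    size_t idx = (size_t)((addr - m->base) / 8);
    if (idx >= m->count)
        return false;
    *out = m->words[idx];
    return true;
}

/* Apply one rule to the current frame's registers to produce the caller's. */
bool unwind_step(const fake_stack *m, const unwind_rule *rule,
                 const regs *cur, regs *caller) {
    /* Unsigned wraparound makes negative offsets subtract as intended. */
    uint64_t cfa = (rule->cfa_from_fp ? cur->fp : cur->sp) + (uint64_t)rule->cfa_offset;

    if (!read_word(m, cfa + (uint64_t)rule->ra_offset, &caller->pc))
        return false;
    caller->sp = cfa;      /* the caller's sp is the CFA itself */
    caller->fp = cur->fp;  /* unchanged unless this frame saved it */
    if (rule->fp_saved &&
        !read_word(m, cfa + (uint64_t)rule->fp_offset, &caller->fp))
        return false;
    return true;
}
```

For a typical x86-64 frame that has pushed the frame pointer, the rule would be CFA = fp + 16, return address at CFA - 8, saved fp at CFA - 16. The same mechanism extends to any other callee-saved register, which is how non-volatile register state can be reported for frames beyond the first.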

In the future, we will also be exposing the enhanced register state for all frames, making it even easier to dig into the state of the process at the time of the crash.