Avoiding and debugging deadlocks

The recommended order for locking a very important object is to acquire a handler lock or an update lock first, if one is required, followed by a read lock or a write lock. Other orderings are possible, but this is the recommended one.

The following example demonstrates a common pitfall:

vip_t::handlerlock(vip).install(notifier, *vip_t::readlock(vip));

// elsewhere...

vip_t::updatelock(vip).update(vip2_value);

This innocent-looking code acquires its locks in an unspecified order, and results in subtle deadlocks. The compiler is allowed to evaluate the argument to install(), which acquires a read lock, before or after it instantiates the temporary handler lock object. The generated code can acquire the read lock before or after the handler lock, sometimes depending on the compilation options. The order can even differ between occurrences of the same code sequence, depending on other code in the same function or method.

The end result: one thread gets a read lock on the very important object. At the same time, another thread gets an update lock. Then, the first thread tries to get a handler lock, which gets blocked by the update lock held by the second thread. The second thread invokes update(), which tries to get a write lock, which gets blocked by the read lock held by the first thread. Deadlock.
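The ordering hazard is not specific to very important objects. The following minimal sketch reproduces it with standard C++ only. traced_lock and install() are hypothetical stand-ins, not part of LIBCXX, and they record construction order instead of locking anything. The sketch passes both temporaries as function arguments, because the relative order of argument evaluation is unspecified in every C++ standard. The second function shows that named objects declared in separate statements are always constructed in the order written.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a lock object: it only records when it is created.
struct traced_lock {
    traced_lock(std::vector<std::string> &log, const char *name)
    {
        log.push_back(name);
    }
};

// Takes two lock objects. When both are temporaries created in the call
// expression, the standard does not say which one is constructed first.
inline void install(const traced_lock &, const traced_lock &) {}

// Returns the construction order that this compiler chose for this call.
inline std::vector<std::string> temporaries_in_one_expression()
{
    std::vector<std::string> log;
    install(traced_lock(log, "handler"), traced_lock(log, "read"));
    return log;
}

// Named objects in separate statements are constructed top to bottom.
inline std::vector<std::string> named_locks_in_separate_statements()
{
    std::vector<std::string> log;
    traced_lock handler(log, "handler");
    traced_lock read(log, "read");
    install(handler, read);
    return log;
}
```

Calling temporaries_in_one_expression() may return either order, and the result can change with the compiler or its options. named_locks_in_separate_statements() always returns "handler" followed by "read".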

Always instantiate each lock on a very important object as a named object in its own statement, so that the order of lock acquisition is consistent and does not depend on the compiler. Avoid temporary lock objects, whose construction order within a single expression is unspecified. Use the x::vipobjdebug class template to find locks that get acquired in the wrong order:

typedef x::vipobjdebug<vipintvalue> vip_t;

x::vipobjdebug derives from, and implements the same interface as x::vipobj, and adds additional checks that throw an exception if:

x::vipobjdebug adds significant overhead, and should be used only for debugging. It is only a runtime check: it cannot flag code whose lock order is left to the compiler if the compiler happens to produce the correct order in this particular build.

When a conflicting lock sequence gets detected, a fatal message gets logged, a backtrace gets logged at the trace level, and an exception gets thrown. Setting the x::vipobjdebug_base::abort property to true results in a call to abort(3) and a core dump, instead of a thrown exception.
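The general idea of a runtime lock-order check can be sketched in a few lines of standard C++. This is an illustration only, and not how x::vipobjdebug is implemented. lock_rank, held_ranks(), checked_lock and order_is_rejected() are hypothetical names, and the guard checks ordering without locking anything. Each thread keeps a list of the ranks it holds, and acquiring an earlier-ranked lock while holding a later-ranked one throws an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical ranks mirroring the recommended order: a handler or an update
// lock first, then a read or a write lock.
enum class lock_rank { handler_or_update = 0, read_or_write = 1 };

// Ranks currently held by the calling thread, in acquisition order.
inline std::vector<lock_rank> &held_ranks()
{
    thread_local std::vector<lock_rank> ranks;
    return ranks;
}

// Hypothetical guard that only checks ordering; it does not lock anything.
class checked_lock {
public:
    explicit checked_lock(lock_rank rank)
    {
        auto &ranks = held_ranks();
        // Acquiring an earlier-ranked lock while holding a later-ranked one
        // is the inverted order that can deadlock against another thread.
        if (!ranks.empty() && rank < ranks.back())
            throw std::logic_error("lock acquired in the wrong order");
        ranks.push_back(rank);
    }
    ~checked_lock() { held_ranks().pop_back(); }
    checked_lock(const checked_lock &) = delete;
    checked_lock &operator=(const checked_lock &) = delete;
};

// Returns true if acquiring `first` and then `second` trips the check.
inline bool order_is_rejected(lock_rank first, lock_rank second)
{
    try {
        checked_lock a(first);
        checked_lock b(second);
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}
```

order_is_rejected(lock_rank::read_or_write, lock_rank::handler_or_update) returns true, and the recommended order returns false. The check fires only when the wrong order actually occurs at run time. The sketch also assumes that guards are destroyed in reverse order of construction, which holds for objects allocated on the stack.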

Warning

x::vipobjdebug gives unreliable results if lock instances are allocated on the heap and passed between different threads, or if pthread_cancel(3) terminates a thread without unwinding the stack. This debugging class gives accurate results only if lock instances are allocated on the stack. As noted elsewhere, pthread_cancel(3) cannot be used with LIBCXX.