Possible deadlock in qcc::Timer::AddAlarm()?

If I host a session on a relatively slow device, I often experience deadlocks when several peers try to join the session all at once. In the debugger, I'm seeing multiple threads blocked on select() inside a call to Event::Wait(Event::neverSet, Event::WAIT_FOREVER), which is called from qcc::Timer::AddAlarm() when there are already maxAlarm alarms installed.

Most of the blocked threads are of type TimerThread, but the main thread also gets blocked in Timer::AddAlarm(), deep below a call to -[PGMPeerGroupManager getHostPeerIdOfGroup:].

As far as I can tell, the waiting threads in the Timer's addWaitQueue can only be released in TimerThread::Run() or TimerThread::Stop(). Since most of the blocked threads are timer threads, they can't reach the point where they release the next waiter, and the main thread call from the peer group manager doesn't appear to ever release any of those waiters, hence the deadlock.
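
To make the suspected mechanism concrete, here is a minimal, self-contained C++ model of it. This is not the qcc::Timer source: BoundedTimerModel, RunOne(), DemoResult and RunSelfBlockingCallbackDemo() are hypothetical names, and the point at which a waiter is released is my assumption based on reading TimerThread::Run(). The model also uses a 50 ms timeout in place of Event::WAIT_FOREVER so that the stall shows up as a failed add rather than a hang.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical model of a bounded alarm queue; NOT the qcc::Timer source.
class BoundedTimerModel {
public:
    explicit BoundedTimerModel(std::size_t maxAlarms) : maxAlarms_(maxAlarms) {}

    // Waits while maxAlarms_ alarms are queued. The timeout stands in for
    // WAIT_FOREVER so the demo can report the stall instead of hanging.
    bool AddAlarm(std::function<void()> callback, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        bool hasRoom = notFull_.wait_for(
            lock, timeout, [this] { return alarms_.size() < maxAlarms_; });
        if (!hasRoom) {
            return false;  // nobody drained the queue in time
        }
        alarms_.push_back(std::move(callback));
        return true;
    }

    // One iteration of a timer thread's loop (assumed ordering): take an
    // alarm, release one waiter, then run the callback outside the lock.
    bool RunOne() {
        std::function<void()> callback;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (alarms_.empty()) {
                return false;
            }
            callback = std::move(alarms_.front());
            alarms_.pop_front();
        }
        notFull_.notify_one();
        callback();
        return true;
    }

private:
    const std::size_t maxAlarms_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::deque<std::function<void()>> alarms_;
};

struct DemoResult {
    bool firstAddSucceeded = false;
    bool secondAddSucceeded = false;
};

// The calling thread plays the only timer thread. Its callback adds two
// alarms to a queue that holds one, so the second add could only be released
// by the very thread that is waiting.
DemoResult RunSelfBlockingCallbackDemo() {
    using std::chrono::milliseconds;
    BoundedTimerModel timer(1);
    DemoResult result;

    bool queued = timer.AddAlarm([&timer, &result] {
        result.firstAddSucceeded = timer.AddAlarm([] {}, milliseconds(50));
        result.secondAddSucceeded = timer.AddAlarm([] {}, milliseconds(50));
    }, milliseconds(50));
    assert(queued);
    (void)queued;  // avoids an unused-variable warning when NDEBUG is set

    timer.RunOne();
    return result;
}
```

Calling RunSelfBlockingCallbackDemo() from main() gives firstAddSucceeded == true and secondAddSucceeded == false: the second add times out because the only thread that could drain the queue is the one waiting in AddAlarm(). With an infinite wait and several timer threads all in that state, I would expect exactly the hang I am seeing.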

Is there something I can do to avoid this situation?