Fix potential race-condition in DisplayLink::drop on macOS (#32116)
Fix a segfault in CVDisplayLink We see 1-2 crashes a day on macOS on the `CVDisplayLink` thread. ``` Segmentation fault: 11 on thread 9325960 (CVDisplayLink) CoreVideo CVHWTime::reset() CoreVideo CVXTime::reset() CoreVideo CVDisplayLink::runIOThread() libsystem_pthread.dylib _pthread_start libsystem_pthread.dylib thread_start ``` With the help of the Zed AI, I dove into the crash report, which looks like this: ``` Crashed Thread: 49 CVDisplayLink Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x00000000000001f6 Exception Codes: 0x0000000000000001, 0x00000000000001f6 Thread 49 Crashed:: CVDisplayLink 0 CoreVideo 0x18c1ed994 CVHWTime::reset() + 64 1 CoreVideo 0x18c1ee474 CVXTime::reset() + 52 2 CoreVideo 0x18c1ee198 CVDisplayLink::runIOThread() + 176 3 libsystem_pthread.dylib 0x18285ac0c _pthread_start + 136 4 libsystem_pthread.dylib 0x182855b80 thread_start + 8 Thread 49 crashed with ARM Thread State (64-bit): x0: 0x0000000000000000 x1: 0x000000018c206e08 x2: 0x0000002c00001513 x3: 0x0001d4630002a433 x4: 0x00000e2100000000 x5: 0x0001d46300000000 x6: 0x000000000000002c x7: 0x0000000000000000 x8: 0x000000000000002e x9: 0x000000004d555458 x10: 0x0000000000000000 x11: 0x0000000000000000 x12: 0x0000000000000000 x13: 0x0000000000000000 x14: 0x0000000000000000 x15: 0x0000000000000000 x16: 0x0000000182856a9c x17: 0x00000001f19bc540 x18: 0x0000000000000000 x19: 0x0000600003c56ed8 x20: 0x000000000002a433 x21: 0x0000000000000000 x22: 0x0000000000000000 x23: 0x0000000000000000 x24: 0x0000000000000000 x25: 0x0000000000000000 x26: 0x0000000000000000 x27: 0x0000000000000000 x28: 0x0000000000000000 fp: 0x000000016b02ade0 lr: 0x000000018c1ed984 sp: 0x000000016b02adc0 pc: 0x000000018c1ed994 cpsr: 0x80001000 far: 0x00000000000001f6 esr: 0x92000006 (Data Abort) byte read Translation fault Binary Images: 0x1828c9000 - 0x182e07fff com.apple.CoreFoundation (6.9) <df489a59-b4f6-32b8-9bb4-9b832960aa52> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation ``` Using lldb to disassemble `CVHWTime::reset()` (and the AI to interpret it), the crash is caused by dereferencing the pointer at the start of the CVHWTime struct + 0x1c8. In this case the pointer has (the clearly nonsense) value 0x2e (and 0x2e + 0x1c8 = 0x1f6, the failing address). As to how this could happen... Looking at the implementation of `CVDisplayLinkRelease`, it calls straight into `CFRelease` on the main thread; and so it is not safe to call `CVDisplayLinkRelease` concurrently with other threads that access the CVDisplayLink. While we already stopped the display link, it turns out that `CVDisplayLinkStop` just sets a flag on the struct to instruct the io-thread to exit "soon", and returns immediately. That means we don't know when the other thread will actually exit, and so we can't safely call `CVDisplayLinkRelease`. So, for now, we just leak these objects. They should be created relatively infrequently (when the app is foregrounded/backgrounded), so I don't think this is a huge problem. Release Notes: - Fix a rare crash on macOS when putting the app in the background.
This commit is contained in:
parent
7c64737e00
commit
a2e98e9f0e
1 changed files with 12 additions and 4 deletions
|
@ -12,7 +12,7 @@ use std::ffi::c_void;
|
|||
use util::ResultExt;
|
||||
|
||||
pub struct DisplayLink {
|
||||
display_link: sys::DisplayLink,
|
||||
display_link: Option<sys::DisplayLink>,
|
||||
frame_requests: dispatch_source_t,
|
||||
}
|
||||
|
||||
|
@ -59,7 +59,7 @@ impl DisplayLink {
|
|||
)?;
|
||||
|
||||
Ok(Self {
|
||||
display_link,
|
||||
display_link: Some(display_link),
|
||||
frame_requests,
|
||||
})
|
||||
}
|
||||
|
@ -70,7 +70,7 @@ impl DisplayLink {
|
|||
dispatch_resume(crate::dispatch_sys::dispatch_object_t {
|
||||
_ds: self.frame_requests,
|
||||
});
|
||||
self.display_link.start()?;
|
||||
self.display_link.as_mut().unwrap().start()?;
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
@ -80,7 +80,7 @@ impl DisplayLink {
|
|||
dispatch_suspend(crate::dispatch_sys::dispatch_object_t {
|
||||
_ds: self.frame_requests,
|
||||
});
|
||||
self.display_link.stop()?;
|
||||
self.display_link.as_mut().unwrap().stop()?;
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
@ -89,6 +89,14 @@ impl DisplayLink {
|
|||
impl Drop for DisplayLink {
|
||||
fn drop(&mut self) {
|
||||
self.stop().log_err();
|
||||
// We see occasional segfaults on the CVDisplayLink thread.
|
||||
//
|
||||
// It seems possible that this happens because CVDisplayLinkRelease releases the CVDisplayLink
|
||||
// on the main thread immediately, but the background thread that CVDisplayLink uses for timers
|
||||
// is still accessing it.
|
||||
//
|
||||
// We might also want to upgrade to CADisplayLink, but that requires dropping old macOS support.
|
||||
std::mem::forget(self.display_link.take());
|
||||
unsafe {
|
||||
dispatch_source_cancel(self.frame_requests);
|
||||
}
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue