linux: run runnables only when event loop is idle (#12839)

This change ensures that the event loop prioritizes enqueueing another render or handling user input over executing runnables. It's a subtle change as a result of a week of digging into performance on X11. It's also not perfect: ideally we'd get rid of the intermediate channel here and had more control over when and how we run runnables vs. X11 events, but I think short of rewriting how we use an event loop, this is good cost/benefit change. To illustrate: Before this change, it was possible to block the app from rendering for a long time by just creating a ton of futures that were executed on the "main" thread (we don't have a "main" thread on Linux, but we have a single thread in which we run the event loop). That was relatively easy to reproduce by opening the `zed` repository and starting `rust-analyzer`: at some point `rust-analyzer` sends us so many notifications, that are all handled in futures, that the event loop is busy just working off the runnables, never getting to the events that X11 sends us or our own timer to re-enqueue another render. When you put print statements into the code to show when which event was handled, you'd see something like this **before this change**: ``` [ ... hundreds of runnable.run() ... ] runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() new render tick timer. lag: 56.942049ms X11 event new render tick timer. lag: 9.668µs X11 event new render tick timer. lag: 9.955µs X11 event runnable.run() runnable.run() runnable.run() runnable.run() new render tick timer. lag: 12.462µs X11 event new render tick timer. lag: 14.868µs X11 event new render tick timer. lag: 11.234µs X11 event new render tick timer. lag: 11.681µs X11 event new render tick timer. 
lag: 13.926µs X11 event ``` Note the `lag: 56ms`: that's the difference between when we wanted to execute the callback that enqueues another render and when it ran. Longer lags are possible, this is just the first one I grabbed from the logs. Now, compare this with the logs **after this change**: ``` runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() new render tick timer. lag: 36.051µs runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() X11 event runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() runnable.run() ``` In-between many `runnable.run()` we'll always handle events. So, in essence, what this change does is to introduce 2 priorities into the X11 event queue: - high: X11 events (user events, render events, ...), render tick, XIM events, ... - low: all async rust code I've tested this with a debug build and release build and I think the app now feels more responsive. It doesn't feel perfect still, especially in the slow debug builds, but I couldn't observe 10s lockups anymore. Since it's a pretty small change, I think we should go for it and see how it behaves. Thanks to @maan2003 this now also includes the same change to Wayland. Release Notes: - N/A --------- Co-authored-by: maan2003 <manmeetmann2003@gmail.com>
2024-06-10 14:04:41 +02:00 · 2024-06-10 14:04:41 +02:00 · 43d1a8040d
commit 43d1a8040d
parent e829a8c3b0
2 changed files with 19 additions and 6 deletions
--- a/crates/gpui/src/platform/linux/wayland/client.rs
+++ b/crates/gpui/src/platform/linux/wayland/client.rs
@@ -403,9 +403,14 @@ impl WaylandClient {

        let handle = event_loop.handle();
        handle
-            .insert_source(main_receiver, |event, _, _: &mut WaylandClientStatePtr| {
-                if let calloop::channel::Event::Msg(runnable) = event {
-                    runnable.run();
+            .insert_source(main_receiver, {
+                let handle = handle.clone();
+                move |event, _, _: &mut WaylandClientStatePtr| {
+                    if let calloop::channel::Event::Msg(runnable) = event {
+                        handle.insert_idle(|_| {
+                            runnable.run();
+                        });
+                    }
                }
            })
            .unwrap();
--- a/crates/gpui/src/platform/linux/x11/client.rs
+++ b/crates/gpui/src/platform/linux/x11/client.rs
@@ -165,9 +165,17 @@ impl X11Client {
        let handle = event_loop.handle();

        handle
-            .insert_source(main_receiver, |event, _, _: &mut X11Client| {
-                if let calloop::channel::Event::Msg(runnable) = event {
-                    runnable.run();
+            .insert_source(main_receiver, {
+                let handle = handle.clone();
+                move |event, _, _: &mut X11Client| {
+                    if let calloop::channel::Event::Msg(runnable) = event {
+                        // Insert the runnables as idle callbacks, so we make sure that user-input and X11
+                        // events have higher priority and runnables are only worked off after the event
+                        // callbacks.
+                        handle.insert_idle(|_| {
+                            runnable.run();
+                        });
+                    }
                }
            })
            .unwrap();
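
The two-priority idea behind both hunks can be sketched without `calloop`. The following is a toy model only, using nothing but the standard library: `ToyLoop`, `Event`, `dispatch_once`, and `demo` are hypothetical names made up for illustration, and the scheduling shown here is a simplification that does not claim to match how `calloop` orders sources and idle callbacks internally. It only shows the core move: the handler for a runnable does not run it right away but defers it to an idle queue, so pending X11 events and render ticks get handled first.

```rust
use std::collections::VecDeque;

/// Hypothetical event kinds for the toy loop. `Runnable` stands in for
/// async tasks arriving over the main-thread channel.
enum Event {
    X11(u32),
    RenderTick,
    Runnable(u32),
}

/// A deferred callback that writes to the log when it finally runs.
type IdleCallback = Box<dyn FnOnce(&mut Vec<String>)>;

/// Toy single-threaded loop: one queue of incoming events and one queue
/// of idle callbacks that only run after the pending events are handled.
struct ToyLoop {
    sources: VecDeque<Event>,
    idle: VecDeque<IdleCallback>,
    log: Vec<String>,
}

impl ToyLoop {
    fn dispatch_once(&mut self) {
        // Phase 1 (high priority): handle every event that is pending.
        // Runnables are not executed here, only deferred to the idle queue.
        loop {
            match self.sources.pop_front() {
                Some(Event::X11(code)) => self.log.push(format!("X11 event {code}")),
                Some(Event::RenderTick) => self.log.push("render tick".to_string()),
                Some(Event::Runnable(id)) => {
                    self.idle.push_back(Box::new(move |log: &mut Vec<String>| {
                        log.push(format!("runnable.run() #{id}"));
                    }));
                }
                None => break,
            }
        }
        // Phase 2 (low priority): work off the deferred runnables.
        while let Some(callback) = self.idle.pop_front() {
            callback(&mut self.log);
        }
    }
}

/// Queues runnables *before* the X11 event and render tick, then
/// dispatches once and returns the order in which things were handled.
fn demo() -> Vec<String> {
    let mut toy = ToyLoop {
        sources: VecDeque::new(),
        idle: VecDeque::new(),
        log: Vec::new(),
    };
    toy.sources.extend([
        Event::Runnable(1),
        Event::Runnable(2),
        Event::X11(7),
        Event::RenderTick,
        Event::Runnable(3),
    ]);
    toy.dispatch_once();
    toy.log
}

fn main() {
    for line in demo() {
        println!("{line}");
    }
}
```

In this model the X11 event and the render tick are logged before any runnable, even though two runnables were queued ahead of them. The real change is coarser than this sketch suggests: it still goes through the intermediate channel, which is the limitation the commit message calls out.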