***Update**: after rebasing on top of
https://github.com/zed-industries/zed/pull/16915/ (that also changed how
search worked), the results are still good, but not 10x. Instead of
going from 10s to 500ms, it goes from 10s to 3s.*
This improves the performance of project-wide search by an order of
magnitude.
After digging in, @as-cii and I found that opening buffers was the
bottleneck for project-wide search (since Zed opens, parses, ... buffers
when finding them, which is something VS Code doesn't do, for example).
So this PR improves the performance of opening multiple buffers at once.
It does this by doing two things:
- It batches scan-requests in the worktree. When we search, we search
files in chunks of 64. Previously we'd handle all 64 scan requests
separately. The new code checks if the scan requests can be batched and
if so it, it does that.
- It batches `git status` calls when reloading the project entries for
the opened buffers. Instead of calling `git status` for each file, it
calls `git status` for a batch of files, and then extracts the status
for each file.
(It has to be said that I think the slow performance on `main` has been
a regression introduced over the last few months with the changes made
to project/worktree/git. I don't think it was this slow ~5 months ago.
But I also don't think it was this fast ~5 months ago.)
## Benchmarks
| Search | Before | After (without
https://github.com/zed-industries/zed/pull/16915) | After (with
https://github.com/zed-industries/zed/pull/16915)
|--------|--------|-------|------|
| `zed.dev` at `2b2a501192e78e`, searching for `<` (`4484` results) |
3.0s<br>2.9s<br>2.89s | 489ms<br>517ms<br>476ms | n/a |
| `zed.dev` at `2b2a501192e78e`, searching for `:` (`25886+` results) |
3.9s<br>3.9s<br>3.8s | 70ms<br>66ms<br>72ms | n/a |
| `zed` at `55dda0e6af`, searching for `<` (`10937+` results) |
10s<br>11s<br>12s | 500m<br>499ms<br>543ms | 3.4s<br>3.1s<br> |
(All results recorded after doing a warm-up run that would start
language servers etc.)
Release Notes:
- Performance of project-wide search has been improved by up to 10x.
---------
Co-authored-by: Antonio <antonio@zed.dev>
Previously, each git `Repository` object was held inside of a mutex.
This was needed because libgit2's Repository object is (as one would
expect) not thread safe. But now, the two longest-running git operations
that Zed performs, (`status` and `blame`) do not use libgit2 - they
invoke the `git` executable. For these operations, it's not necessary to
hold a lock on the repository.
In this PR, I've moved our mutex usage so that it only wraps the libgit2
calls, not our `git` subprocess spawns. The main user-facing impact of
this is that the UI is much more responsive when initially opening a
project with a very large git repository (e.g. `chromium`, `webkit`,
`linux`).
Release Notes:
- Improved Zed's responsiveness when initially opening a project
containing a very large git repository.
I realized that somehow, the `git` executable is able to compute `git
status` much more quickly than libgit2, so I've switched our git status
logic to use `git`. Follow-up to
https://github.com/zed-industries/zed/pull/12266.
Release Notes:
- Improved the performance of git status updated when working in large
git repositories.
This PR adds a registry for `GitHostingProvider`s.
The intent here is to help decouple these provider-specific concerns
from the lower-level `git` crate.
Similar to languages, the Git hosting providers live in the new
`git_hosting_providers` crate.
This work also lays the foundation for if we wanted to allow defining a
`GitHostingProvider` from within an extension. This could be useful if
we wanted to extend the support to work with self-hosted Git providers
(like GitHub Enterprise).
I also took the opportunity to move some of the provider-specific code
out of the `util` crate, since it had leaked into there.
Release Notes:
- N/A