This introduces semantic indexing in Zed based on chunking text from
files in the developer's workspace and creating vector embeddings using
an embedding model. As part of this, we've created an embeddings
provider trait that allows us to work with OpenAI, a local Ollama model,
or a Zed hosted embedding.
The semantic index is built by breaking down text for known
(programming) languages into manageable chunks that are smaller than the
max token size. Each chunk is then fed to a language model to create a
high dimensional vector which is then normalized to a unit vector to
allow fast comparison with other vectors with a simple dot product.
Alongside the vector, we store the path of the file and the range within
the document where the vector was sourced from.
Zed will soon grok contextual similarity across different text snippets,
allowing for natural language search beyond keyword matching. This is
being put together both for human-based search as well as providing
results to Large Language Models to allow them to refine how they help
developers.
Remaining todo:
* [x] Change `provider` to `model` within the zed hosted embeddings
database (as its currently a combo of the provider and the model in one
name)
Release Notes:
- N/A
---------
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Conrad Irwin <conrad@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
Co-authored-by: Antonio <antonio@zed.dev>
This PR adds an `zed: Install Local Extension` action, which lets you
select a path to a folder containing a Zed extension, and install that .
When you select a directory, the extension will be compiled (both the
Tree-sitter grammars and the Rust code for the extension itself) and
installed as a Zed extension, using a symlink.
### Details
A few dependencies are needed to build an extension:
* The Rust `wasm32-wasi` target. This is automatically installed if
needed via `rustup`.
* A wasi-preview1 adapter WASM module, for building WASM components with
Rust. This is automatically downloaded if needed from a `wasmtime`
GitHub release
* For building Tree-sitter parsers, a distribution of `wasi-sdk`. This
is automatically downloaded if needed from a `wasi-sdk` GitHub release.
The downloaded artifacts are cached in a support directory called
`Zed/extensions/build`.
### Tasks
UX
* [x] Show local extensions in the Extensions view
* [x] Provide a button for recompiling a linked extension
* [x] Make this action discoverable by adding a button for it on the
Extensions view
* [ ] Surface errors (don't just write them to the Zed log)
Packaging
* [ ] Create a separate executable that performs the extension
compilation. We'll switch the packaging system in our
[extensions](https://github.com/zed-industries/extensions) repo to use
this binary, so that there is one canonical definition of how to
build/package an extensions.
### Release Notes:
- N/A
---------
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
This PR updates the tenses used by the summary line of doc comments to
match the [Rust API documentation
conventions](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#summary-sentence).
Specifically:
> The summary line should be written in third person singular present
indicative form. Basically, this means write ‘Returns’ instead of
‘Return’.
I'm sure there are plenty occurrences that I missed.
Release Notes:
- N/A
Fix mislocation of caller query in detach_and_log_error
Fix incorrect wording on livekit integration
Add share_mic action for manually enabling the microphone
Make mic sharing wait until the room has been fully established
Drop dependency on tokio introduced by async-openai and do it ourselves.
The approach I'm taking of replacing instead of appending is causing issues. Need to just append.
When a host sends a buffer to a guest for the first time, they record that
they have done so in a set tied to that guest's peer id. When the guest
reconnects and syncs buffers, they do so under a different peer id, so we
need to be sure we track which buffers we have sent them to avoid sending
them the same buffer twice, which violates the guest's assumptions.
We were seeing non-deterministic behavior in randomized tests when
generating backtraces took enough time to cause transactions to group
in some cases, but not group in others.
Tests will need to explicitly opt into grouping if they want it by
setting the interval explicitly. We have tests in the text module that
currently test the history grouping explicitly, but I'm not sure
it's needed elsewhere.
We now run tests that interact with the real database under a Tokio reactor. We make the tests run multi-threaded so we can block on the main thread on database teardown and still make progress actually tearing down the DB.
Co-Authored-By: Max Brunsfeld <maxbrunsfeld@gmail.com>
* Make advance_clock more realistic by waking timers in order,
instead of all at once.
* Don't advance the clock when simulating random delays.
Co-Authored-By: Keith Simmons <keith@zed.dev>
Co-Authored-By: Nathan Sobo <nathan@zed.dev>