ZIm/crates/assistant_tools/src/schema.rs
Michael Sloan 5fafab6e52
Migrate to schemars version 1.0 (#33635)
The major change in schemars 1.0 is that now schemas are represented as
plain json values instead of specialized datatypes. This allows for more
concise construction and manipulation.

This change also improves how settings schemas are generated. Each top
level settings type was being generated as a full root schema including
the definitions it references, and then these were merged. This meant
generating all shared definitions multiple times, and might have bugs in
cases where there are two types with the same names.

Now instead the schemar generator's `definitions` are built up as they
normally are and the `Settings` trait no longer has a special
`json_schema` method. To handle types that have schema that vary at
runtime (`FontFamilyName`, `ThemeName`, etc), values of
`ParameterizedJsonSchema` are collected by `inventory`, and the schema
definitions for these types are replaced.

To help check that this doesn't break anything, I tried to minimize the
overall [schema
diff](https://gist.github.com/mgsloan/1de549def20399d6f37943a3c1583ee7)
with some patches to make the order more consistent + schemas also
sorted with `jq -S .`. A skim of the diff shows that the diffs come
from:

* `enum: ["value"]` turning into `const: "value"`
* Differences in handling of newlines for "description"
* Schemas for generic types no longer including the parameter name, now
all disambiguation is with numeric suffixes
* Enums now using `oneOf` instead of `anyOf`.

Release Notes:

- N/A
2025-06-30 21:07:28 +00:00

63 lines
2.1 KiB
Rust

use anyhow::Result;
use language_model::LanguageModelToolSchemaFormat;
use schemars::{
JsonSchema, Schema,
generate::SchemaSettings,
transform::{Transform, transform_subschemas},
};
pub fn json_schema_for<T: JsonSchema>(
format: LanguageModelToolSchemaFormat,
) -> Result<serde_json::Value> {
let schema = root_schema_for::<T>(format);
schema_to_json(&schema, format)
}
fn schema_to_json(
schema: &Schema,
format: LanguageModelToolSchemaFormat,
) -> Result<serde_json::Value> {
let mut value = serde_json::to_value(schema)?;
assistant_tool::adapt_schema_to_format(&mut value, format)?;
Ok(value)
}
fn root_schema_for<T: JsonSchema>(format: LanguageModelToolSchemaFormat) -> Schema {
let mut generator = match format {
LanguageModelToolSchemaFormat::JsonSchema => SchemaSettings::draft07().into_generator(),
// TODO: Gemini docs mention using a subset of OpenAPI 3, so this may benefit from using
// `SchemaSettings::openapi3()`.
LanguageModelToolSchemaFormat::JsonSchemaSubset => SchemaSettings::draft07()
.with(|settings| {
settings.meta_schema = None;
settings.inline_subschemas = true;
})
.with_transform(ToJsonSchemaSubsetTransform)
.into_generator(),
};
generator.root_schema_for::<T>()
}
#[derive(Debug, Clone)]
struct ToJsonSchemaSubsetTransform;
impl Transform for ToJsonSchemaSubsetTransform {
fn transform(&mut self, schema: &mut Schema) {
// Ensure that the type field is not an array, this happens when we use
// Option<T>, the type will be [T, "null"].
if let Some(type_field) = schema.get_mut("type") {
if let Some(types) = type_field.as_array() {
if let Some(first_type) = types.first() {
*type_field = first_type.clone();
}
}
}
// oneOf is not supported, use anyOf instead
if let Some(one_of) = schema.remove("oneOf") {
schema.insert("anyOf".to_string(), one_of);
}
transform_subschemas(self, schema);
}
}