ZIm/crates/fuzzy
Umesh Yadav 8db0333b04
Fix out-of-bounds panic in fuzzy matcher with Unicode/multibyte characters (#30546)
This PR fixes a crash in the fuzzy matcher that occurred when handling
Unicode or multibyte characters (such as Turkish `İ` or `ş`). The issue
was caused by the matcher attempting to index beyond the end of internal
arrays when lowercased Unicode characters expanded into multiple
codepoints, resulting in an out-of-bounds panic.
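
The expansion described above can be reproduced with the standard library alone. This is a minimal sketch: `lowercase_len` and `lowercase_chars` are hypothetical helpers written for illustration, not functions from the fuzzy crate, and how the matcher builds its internal arrays is not shown in this PR description.

```rust
/// Hypothetical helper: number of `char`s in the lowercase form of `c`.
fn lowercase_len(c: char) -> usize {
    // `char::to_lowercase` returns an iterator because some characters
    // lowercase to more than one codepoint.
    c.to_lowercase().count()
}

/// Hypothetical helper: lowercases a string char by char and collects
/// the result, which may be longer than the input measured in chars.
fn lowercase_chars(s: &str) -> Vec<char> {
    s.chars().flat_map(|c| c.to_lowercase()).collect()
}
```

With these helpers, `lowercase_len('İ')` is 2 (an `i` followed by the combining dot U+0307), so any index computed against the original character count can point past the end of an array sized for it.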

#### Root Cause

The loop in `recursive_score_match` used an upper bound (`limit`)
derived from `self.last_positions[query_idx]`, which could exceed the
actual length of the arrays being indexed, especially with multibyte
Unicode input.

#### Solution

The fix clamps the loop’s upper bound to the maximum valid index for the
arrays being accessed:
```rust
let max_valid_index = (prefix.len() + path_lowercased.len()).saturating_sub(1);
let safe_limit = limit.min(max_valid_index);
for j in path_idx..=safe_limit { ... }
```
With the bound clamped, `j` never exceeds the last valid index, so the loop no longer indexes past the end of the arrays.
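
The clamping pattern can be tried in isolation. The sketch below assumes a simplified setting: `clamp_limit` and `count_matches` are hypothetical names, and the real `recursive_score_match` does considerably more work per iteration. Only the bound computation mirrors the snippet above.

```rust
/// Hypothetical helper: clamps an inclusive upper bound to the last
/// valid index of an array of length `prefix_len + path_len`.
fn clamp_limit(limit: usize, prefix_len: usize, path_len: usize) -> usize {
    // `saturating_sub` avoids underflow when both lengths are zero.
    let max_valid_index = (prefix_len + path_len).saturating_sub(1);
    limit.min(max_valid_index)
}

/// Hypothetical stand-in for the scoring loop: counts occurrences of
/// `needle` in `haystack[start..=limit]` without indexing out of bounds.
fn count_matches(haystack: &[char], needle: char, start: usize, limit: usize) -> usize {
    // For an empty slice the clamped bound is 0, which is still not a
    // valid index, so this sketch handles that case separately.
    if haystack.is_empty() {
        return 0;
    }
    let safe_limit = clamp_limit(limit, 0, haystack.len());
    let mut count = 0;
    // If `start > safe_limit` the inclusive range is empty and the body never runs.
    for j in start..=safe_limit {
        if haystack[j] == needle {
            count += 1;
        }
    }
    count
}
```

Calling `count_matches(&['a', 'b', 'a'], 'a', 0, 10)` walks indices 0 through 2 rather than 0 through 10, which is the effect the clamp is meant to have.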

Closes #30269 

Release Notes:

- N/A

---------

Signed-off-by: Umesh Yadav <git@umesh.dev>
2025-05-12 14:43:14 +00:00
src Fix out-of-bounds panic in fuzzy matcher with Unicode/multibyte characters (#30546) 2025-05-12 14:43:14 +00:00
Cargo.toml Add workspace-hack (#27277) 2025-04-02 13:26:34 -07:00
LICENSE-GPL chore: Change AGPL-licensed crates to GPL (except for collab) (#4231) 2024-01-24 00:26:58 +01:00