Mar 2026 • 9 min read
Rewriting the Silver Searcher in Rust
A Rust rewrite of ag that replaces PCRE with the regex crate, reproduces the multiline print state machine, and cuts median runtime roughly 2x on the measured workload -- but rg and ugrep are still faster.
Rewrite study
First benchmark pass
Historical first pass on the literal-simple workload. rust-ag roughly halved ag's median runtime, but ripgrep and ugrep still led.
ag to rust-ag: 1.96x-2.03x faster
Parity: 214/214 Rust tests
Relative rank: Still behind rg and ugrep
Coverage: 1 of 38 scenarios
Claim gate: Fails on reproducibility
- On the only measured performance scenario, rust-ag cut the local median from 19.64 ms to 9.69 ms.
- rg and ugrep were still faster on the same workload, at 7.74 ms and 4.02 ms median.
- The Rust rewrite has about 49% fewer lines than the C original (3,128 vs 6,098, or roughly 51% of the original size) for equivalent search functionality.
- This was a narrow result: one workload, three measured samples per tool, one Apple M4 machine, and an overall claim gate that still failed on reproducibility bundle validation.
I rewrote the core search path of The Silver Searcher in Rust and benchmarked that rewrite against the original tool. On the only measured performance workload, rust-ag was roughly 2x faster than ag.
That was the honest headline, but not the whole story. rg and ugrep were still faster, the performance data covered only 1 of 38 registered scenarios, and the overall claim gate still returned fail because the reproducibility bundle validation step was not finished. So this was a promising first result, not a victory lap.
The result in one view
The measured workload was the literal-simple scenario: search for "foo" across the repository working tree.
| Tool | Local median | Relative to ag | What to keep in mind |
|---|---|---|---|
| ag | 19.64 ms | 1.00x | Baseline |
| rust-ag | 9.69 ms | 2.03x faster | Apples-to-apples comparison target |
| rg | 7.74 ms | 2.54x faster | Faster here, but it searches a different file set by default |
| ugrep | 4.02 ms | 4.89x faster | Fastest here, also with different default traversal semantics |
The key comparison is ag versus rust-ag. That pair was checked for output parity. The cross-tool numbers provide context but are not the same kind of comparison because rg and ugrep make different default choices about traversal and ignore handling.
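The "Relative to ag" column is simply the ag median divided by each tool's median. A minimal sketch of that arithmetic, using only the local medians from the table (the function name `speedup` is illustrative, not part of the benchmark harness):

```rust
/// Speedup of a candidate over a baseline, both given as median runtimes in ms.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

fn main() {
    // Local medians as reported in the table above.
    let ag_ms = 19.64;
    let tools = [("rust-ag", 9.69), ("rg", 7.74), ("ugrep", 4.02)];

    for (name, median_ms) in tools {
        println!("{name}: {:.2}x faster than ag", speedup(ag_ms, median_ms));
    }
}
```

A ratio of medians says nothing about variance, which matters here because each median comes from only three samples.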
What was rewritten
The Rust rewrite replaces ag's 23 C source files (~6,098 lines) with 6 Rust modules (~3,128 lines) -- about 49% fewer lines for equivalent search functionality. The architectural decisions were deliberate:
| Aspect | C ag | Rust ag |
|---|---|---|
| Regex engine | PCRE (linked C library) | regex crate (pure Rust, no unsafe) |
| Ignore engine | Hash-table with fnmatch | Stack-based IgnoreEngine with push/pop per directory |
| CLI parsing | Custom C parser (options.c, 32K) | Hand-rolled parser (683 lines, no clap/structopt) |
| Threading | pthreads worker pool | Single-threaded (accepts --workers for compat, ignores it) |
| Binary detection | is_binary() in util.c | Byte-for-byte port of ag's heuristics |
| Decompression | zlib/lzma (decompress.c, zfile.c) | Not ported (not needed for search parity) |
| Dependencies | pcre, xz, zlib, pthreads | regex + libc (2 runtime crates) |
The single-threaded design was a deliberate scope cut. ag's pthreads pool adds complexity but the benchmark runs both tools with --workers=1 for fairness, so the comparison measures search + I/O, not thread scheduling.
The search core
The heart of the rewrite is search_text_multiline(), which faithfully reproduces ag's print state machine. ag walks the input byte by byte, tracking whether the current position is inside a match region, to determine which lines to emit as matches versus context:

    fn search_text_multiline(
        text: &str, re: &Regex, opts: &Opts,
    ) -> Vec<Match> {
        // First pass: collect match regions, limited by max_count
        let mut regions = Vec::new();
        let mut search_start = 0;
        while let Some(m) = re.find_at(text, search_start) {
            regions.push(MatchRegion {
                start: m.start(), end: m.end(),
            });
            if regions.len() >= opts.max_count { break; }
            if m.start() == m.end() {
                // Zero-length match: step past it so the loop makes progress
                search_start = next_char_boundary(text, m.end());
            } else {
                search_start = m.end();
            }
        }
        // Second pass: simulate ag's byte-by-byte print state machine
        // to determine which lines are matches vs context
        // ...
    }

The two-pass approach separates regex matching from line attribution. The first pass collects match regions (up to max_count) with the regex crate's find_at, which resumes the search at a given byte offset while still taking the preceding text into account, so anchors and word boundaries behave as they would in a single scan. The second pass walks the byte stream to map those regions onto line boundaries -- the same state machine ag uses in its print_file_matches function.
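The second pass is elided in the snippet above. As a rough illustration of what "mapping regions onto line boundaries" involves, here is a simplified sketch that uses a line-start index and binary search instead of ag's byte-by-byte state machine. It only computes which lines a match touches and does not handle context lines. `MatchRegion` mirrors the struct used above; `line_starts`, `line_of`, and `matched_lines` are hypothetical names, not functions from rust-ag.

```rust
/// Byte range of one regex match.
struct MatchRegion {
    start: usize,
    end: usize,
}

/// Byte offset at which each line of `text` begins.
fn line_starts(text: &str) -> Vec<usize> {
    let mut starts = vec![0];
    for (i, b) in text.bytes().enumerate() {
        // A trailing newline does not open a new line.
        if b == b'\n' && i + 1 < text.len() {
            starts.push(i + 1);
        }
    }
    starts
}

/// 0-based index of the line containing byte `offset`.
fn line_of(starts: &[usize], offset: usize) -> usize {
    // partition_point counts the line starts that are <= offset.
    starts.partition_point(|&s| s <= offset) - 1
}

/// 1-based numbers of all lines touched by at least one region.
/// Assumes `regions` is sorted and non-overlapping, as the first pass produces.
fn matched_lines(text: &str, regions: &[MatchRegion]) -> Vec<usize> {
    let starts = line_starts(text);
    let mut lines: Vec<usize> = Vec::new();
    for r in regions {
        let first = line_of(&starts, r.start);
        // An empty match touches only the line of its start offset.
        let last_byte = if r.end > r.start { r.end - 1 } else { r.start };
        let last = line_of(&starts, last_byte);
        for line in first..=last {
            // Skip lines already recorded by an earlier region.
            if lines.last().map_or(true, |&seen| seen < line + 1) {
                lines.push(line + 1);
            }
        }
    }
    lines
}

fn main() {
    let text = "foo\nbar foo\nbaz\n";
    let regions = [
        MatchRegion { start: 0, end: 3 },
        MatchRegion { start: 8, end: 11 },
    ];
    assert_eq!(matched_lines(text, &regions), vec![1, 2]);
    println!("match lines: {:?}", matched_lines(text, &regions));
}
```

A match that spans a newline marks every line it covers, which is the case the multiline state machine exists to handle.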
The ignore engine
ag's ignore handling is one of its trickiest subsystems. The Rust rewrite uses a stack-based IgnoreEngine that pushes a new rule scope when entering a directory and pops it when leaving.
This prevents scope leakage -- a .gitignore rule in src/ should not affect files in tests/. ag's C implementation uses the same conceptual approach but with a different data structure. The stack-based design makes the Rust version easier to verify: each directory's ignore rules are an isolated frame, and the frame lifetime matches the directory traversal lifetime.
The ignore engine handles .gitignore, .hgignore, .ignore, --ignore patterns, hidden file rules, and VCS directory skipping. 24 parity tests verify that rust-ag and ag agree on which files to include or exclude.
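The push/pop discipline can be sketched in a few lines. This is an illustration of the scoping idea only, not rust-ag's actual IgnoreEngine: the field and method names are assumptions, and the matching is reduced to exact names and `*.ext` suffixes, far short of real gitignore semantics.

```rust
/// One frame of rules per directory currently on the traversal path.
struct IgnoreEngine {
    frames: Vec<Vec<String>>,
}

impl IgnoreEngine {
    fn new() -> Self {
        Self { frames: Vec::new() }
    }

    /// Entering a directory: its rules become the innermost scope.
    fn push(&mut self, rules: Vec<String>) {
        self.frames.push(rules);
    }

    /// Leaving a directory: its rules stop applying.
    fn pop(&mut self) {
        self.frames.pop();
    }

    /// Simplified matching (assumption): a rule is either an exact entry
    /// name or a `*suffix` pattern such as `*.log`.
    fn is_ignored(&self, name: &str) -> bool {
        self.frames
            .iter()
            .flatten()
            .any(|rule| match rule.strip_prefix('*') {
                Some(suffix) => name.ends_with(suffix),
                None => rule.as_str() == name,
            })
    }
}

fn main() {
    let mut engine = IgnoreEngine::new();
    engine.push(vec!["target".to_string()]); // rules from the root directory

    engine.push(vec!["*.log".to_string()]); // entering src/
    assert!(engine.is_ignored("debug.log"));
    engine.pop(); // leaving src/

    engine.push(Vec::new()); // entering tests/, which has no rules of its own
    assert!(!engine.is_ignored("debug.log")); // the src/ rule did not leak
    assert!(engine.is_ignored("target")); // the root rule still applies
    println!("scopes behave as expected");
}
```

Because a frame is dropped the moment traversal leaves its directory, a sibling directory can never observe it, which is the scope-leak property the parity tests check.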
Correctness
Before looking at speed, I checked whether the rewrite still behaves like ag. The test suite includes 214 Rust tests covering the full search surface:
| Category | Tests | What they cover |
|---|---|---|
| Core matching | 23 | Literal, regex, case-sensitive, smart-case, word-boundary, multiline |
| Recursion and ignore | 24 | Gitignore, .ignore, hidden files, VCS skip, symlinks, depth limits, scope leak |
| Edge cases | 45 | Binary detection, max-count, zero-length regex, large files, one-device |
| Count/filename/stream | 43 | --count, --nofilename, stdin, --numbers, per-line counts |
| Context/color/filters | 37 | -A/-B/-C context, --color, -g/-G filename filters |
| Exit codes and errors | 28 | Exit codes 0/1/2, bad regex, nonexistent paths, -v invert |
| Smoke and skeleton | 14 | Build verification, --version, --help, template parity |
The parity tests run rust-ag and ag on identical inputs and assert matching stdout, stderr, and exit codes. The 38 declared scenarios in manifests/scenarios.json define the full test surface across literal search, regex search, CLI behavior, edge cases, and exit code verification.
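The shape of such a parity check is straightforward: run both binaries with the same arguments and compare the three observable channels. The sketch below is an assumption about how one could write it with only the standard library. It is not the project's test harness, and the binary paths are supplied on the command line rather than hard-coded.

```rust
use std::process::Command;

/// Everything the parity check compares for one invocation.
#[derive(Debug, PartialEq)]
struct Outcome {
    stdout: Vec<u8>,
    stderr: Vec<u8>,
    code: Option<i32>,
}

/// Runs `program` with `args` and captures its output and exit code.
fn run(program: &str, args: &[&str]) -> std::io::Result<Outcome> {
    let out = Command::new(program).args(args).output()?;
    Ok(Outcome {
        stdout: out.stdout,
        stderr: out.stderr,
        code: out.status.code(),
    })
}

/// Names of the channels on which two outcomes differ; empty means parity.
fn parity_diff(reference: &Outcome, candidate: &Outcome) -> Vec<&'static str> {
    let mut diff = Vec::new();
    if reference.stdout != candidate.stdout {
        diff.push("stdout");
    }
    if reference.stderr != candidate.stderr {
        diff.push("stderr");
    }
    if reference.code != candidate.code {
        diff.push("exit code");
    }
    diff
}

fn main() -> std::io::Result<()> {
    // Hypothetical usage: parity <reference-bin> <candidate-bin> <args>...
    let argv: Vec<String> = std::env::args().collect();
    if argv.len() < 4 {
        eprintln!("usage: parity <reference-bin> <candidate-bin> <args>...");
        return Ok(());
    }
    let args: Vec<&str> = argv[3..].iter().map(String::as_str).collect();
    let diff = parity_diff(&run(&argv[1], &args)?, &run(&argv[2], &args)?);
    if diff.is_empty() {
        println!("parity: ok");
    } else {
        println!("parity: mismatch on {}", diff.join(", "));
    }
    Ok(())
}
```

Comparing raw bytes rather than decoded strings keeps the check honest for binary-file and color-escape cases, where lossy decoding could hide a difference.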
Where the 2x comes from
The regex engine swap is the primary contributor. ag uses PCRE (Perl-Compatible Regular Expressions), a backtracking engine written in C. rust-ag uses the regex crate, which is built on a Thompson NFA with lazy DFA optimization.
For the literal-simple scenario (search for "foo"), the regex crate detects the literal pattern and uses Aho-Corasick or memchr for the search. This is a fast-path that PCRE does not have for this pattern class -- PCRE compiles the pattern into bytecode and interprets it, even when the pattern contains no metacharacters. The difference is significant at scale: scanning a working tree involves matching the pattern against thousands of files, and the per-match overhead compounds.
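To make the idea of a literal fast path concrete, here is a deliberately simplified sketch: if a pattern contains no regex metacharacters, it can be handed to a plain substring search instead of a regex engine. This is not how the regex crate implements its literal optimizations, which are considerably more sophisticated. The function names and the metacharacter list are assumptions for illustration.

```rust
/// True when `pattern` contains none of the common regex metacharacters,
/// so it can be searched for as a plain substring.
fn is_plain_literal(pattern: &str) -> bool {
    !pattern.chars().any(|c| r"\.+*?()|[]{}^$".contains(c))
}

/// Byte offsets of every non-overlapping occurrence of `needle` in `haystack`.
fn literal_matches(haystack: &str, needle: &str) -> Vec<usize> {
    haystack.match_indices(needle).map(|(i, _)| i).collect()
}

fn main() {
    let pattern = "foo";
    let haystack = "foo bar foo";

    if is_plain_literal(pattern) {
        // Fast path: substring search, no regex compilation or interpretation.
        let hits = literal_matches(haystack, pattern);
        assert_eq!(hits, vec![0, 8]);
        println!("literal hits at {:?}", hits);
    } else {
        println!("pattern needs a regex engine");
    }
}
```

The per-file saving from a path like this is small, but it is paid on every file in the tree, which is why it shows up in a whole-repository scan.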
The I/O path also differs. ag uses mmap for file access, which avoids explicit read syscalls but involves page fault handling and TLB pressure. rust-ag uses std::fs::read_to_string, which reads the whole file into a heap-allocated String with ordinary read syscalls. For the file sizes in this working tree (mostly small source files), a plain read is competitive with mmap because the kernel's readahead heuristic effectively prefetches the data.
That said, both rg and ugrep are still faster. rg uses the same regex crate but adds parallelism, a custom directory walker, and a memory-mapped searcher. ugrep uses a SIMD-accelerated DFA with its own parallel traversal. rust-ag's single-threaded, non-SIMD search is a baseline that validates the regex engine advantage, not the ceiling of what Rust can do.
Speedup across run types
The 2x result was not a one-off. Across local, nightly, and manual runs, rust-ag stayed between 1.96x and 2.03x faster than ag.
| Pair | Local | Nightly | Manual |
|---|---|---|---|
| rust-ag vs ag | 2.03x | 1.96x | 1.96x |
| rg vs ag | 2.54x | 2.45x | 2.48x |
| ugrep vs ag | 4.89x | 4.76x | 4.95x |
| rg vs rust-ag | 1.25x | 1.25x | 1.27x |
| ugrep vs rust-ag | 2.41x | 2.43x | 2.52x |
The ranking stayed stable: ugrep > rg > rust-ag > ag.
Limitations
- One of 38 scenarios. Only literal-simple has measured performance data. A regex-heavy, output-heavy, or large-file workload could change the ranking shape.
- Three samples per tool. Each tool has 3 measured samples after 1 warmup iteration. That is enough for a first read but not enough to compute tight confidence intervals.
- Single machine. Everything ran on one Apple M4, 16 GiB RAM, macOS Darwin 25.3.0. Linux, x86_64, or a different filesystem could move the numbers.
- Claim gate still failed. The parity checks, threshold checks, and cross-run agreement passed individually, but the full gate stayed red because the reproducibility bundle validation step is incomplete.
- Single-threaded rewrite. rust-ag accepts but ignores --workers/--parallel flags. The comparison is fair (both use --workers=1) but does not reflect ag's multi-threaded potential.
- No committed evidence artifacts. The benchmark infrastructure generates evidence at runtime but does not commit timing results. Numbers are reproducible on-demand via run_matrix.py, not from stored artifacts.
How this was built
The implementation was built using Factory mission mode. The mission system planned the rewrite across milestones (search core, ignore engine, CLI surface, parity tests, benchmark harness), ran worker sessions for each feature, and executed scrutiny and parity validators after every step.
The test-to-source ratio is 1.6:1 by line count (4,923 test lines vs 3,128 source lines across 14 Rust files). An additional 7 Python test scripts in scripts/parity/ validate manifest integrity, fixture hashes, and stderr parity, providing a second layer of verification outside the Rust test harness.
Reproducibility

    git clone --branch mission/rust-rewrite-benchmark-publication \
      https://github.com/sagaragas/the_silver_searcher.git
    cd the_silver_searcher

    # Build both
    ./build.sh                  # C ag
    cargo build --workspace     # Rust ag

    # Run parity tests
    python3 tests/edge-cases/setup_fixtures.py
    cargo test --workspace -- --test-threads=5
    python3 -m pytest scripts/parity -v

    # Run benchmark scenarios
    python3 scripts/parity/run_matrix.py --target rust --group smoke
    python3 scripts/parity/validate_outputs.py --run latest

The 38 scenarios are defined in manifests/scenarios.json with cross-linked corpus hashes (manifests/corpus.json, 126 files with SHA-256) and query patterns (manifests/queries.json, 13 patterns). The manifest integrity chain ensures that changes to the corpus or queries invalidate the scenario hashes.