
Mar 2026 • 9 min read

Rewriting the Silver Searcher in Rust

A Rust rewrite of ag that replaces PCRE with the regex crate, reproduces the multiline print state machine, and cuts median runtime roughly 2x on the measured workload -- but rg and ugrep are still faster.

benchmarks · rust · rewrite study · search

Rewrite study: First benchmark pass
Baseline: ag 2.2.0
Rewrite: rust-ag 0.1.0
Language: Rust

Historical first pass on the literal-simple workload. rust-ag roughly halved ag's median runtime, but ripgrep and ugrep still led.

ag to rust-ag: 1.96x-2.03x faster
Parity: 214/214 Rust tests
Relative rank: still behind rg and ugrep
Coverage: 1 of 38 scenarios
Claim gate: fails on reproducibility bundle validation

  • On the only measured performance scenario, rust-ag cut the local median from 19.64 ms to 9.69 ms.
  • rg and ugrep were still faster on the same workload, at 7.74 ms and 4.02 ms median.
  • The Rust rewrite is roughly half the size of the C original (3,128 vs 6,098 lines) for equivalent search functionality.
  • This was a narrow result: one workload, three measured samples per tool, one Apple M4 machine, and an overall claim gate that still failed on reproducibility bundle validation.

I rewrote the core search path of The Silver Searcher in Rust and benchmarked that rewrite against the original tool. On the only measured performance workload, rust-ag was roughly 2x faster than ag.

That was the honest headline, but not the whole story. rg and ugrep were still faster, the performance data covered only 1 of 38 registered scenarios, and the overall claim gate still returned fail because the reproducibility bundle validation step was not finished. So this was a promising first result, not a victory lap.

Horizontal bar chart comparing the local median runtimes for ag, rust-ag, rg, and ugrep on the measured literal-simple workload. Lower is better.
Local median runtime comparison for the only measured workload. rust-ag roughly halves ag's runtime, but rg and ugrep remain faster.

The result in one view

The measured workload was the literal-simple scenario: search for "foo" across the repository working tree.

| Tool | Local median | Relative to ag | What to keep in mind |
| --- | --- | --- | --- |
| ag | 19.64 ms | 1.00x | Baseline |
| rust-ag | 9.69 ms | 2.03x faster | Apples-to-apples comparison target |
| rg | 7.74 ms | 2.54x faster | Faster here, but it searches a different file set by default |
| ugrep | 4.02 ms | 4.89x faster | Fastest here, also with different default traversal semantics |

The key comparison is ag versus rust-ag. That pair was checked for output parity. The cross-tool numbers provide context but are not the same kind of comparison because rg and ugrep make different default choices about traversal and ignore handling.

What was rewritten

The Rust rewrite replaces ag's 23 C source files (~6,098 lines) with 6 Rust modules (~3,128 lines) -- roughly half the code for equivalent search functionality. The architectural decisions were deliberate:

| Aspect | C ag | Rust ag |
| --- | --- | --- |
| Regex engine | PCRE (linked C library) | regex crate (pure Rust, no unsafe) |
| Ignore engine | Hash table with fnmatch | Stack-based IgnoreEngine with push/pop per directory |
| CLI parsing | Custom C parser (options.c, 32K) | Hand-rolled parser (683 lines, no clap/structopt) |
| Threading | pthreads worker pool | Single-threaded (accepts --workers for compat, ignores it) |
| Binary detection | is_binary() in util.c | Byte-for-byte port of ag's heuristics |
| Decompression | zlib/lzma (decompress.c, zfile.c) | Not ported (not needed for search parity) |
| Dependencies | pcre, xz, zlib, pthreads | regex + libc (2 runtime crates) |

The single-threaded design was a deliberate scope cut. ag's pthreads pool adds complexity but the benchmark runs both tools with --workers=1 for fairness, so the comparison measures search + I/O, not thread scheduling.
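The binary-detection row above refers to the port of ag's is_binary() heuristics. As a rough illustration of the core idea only -- a NUL byte near the start of a file marks it binary -- here is a std-only sketch; the 512-byte sample size is an assumption here, and ag's real heuristic handles additional cases (such as UTF-16 byte-order marks) that are omitted:

```rust
// Illustrative sketch, not the byte-for-byte port from the rewrite.
// The core test: sample the leading bytes and treat a NUL as a
// binary marker, since text encodings ag searches rarely contain NUL.
fn looks_binary(buf: &[u8]) -> bool {
    let sample = &buf[..buf.len().min(512)]; // assumed sample size
    sample.contains(&0u8)
}
```

The appeal of this style of check is that it is a single linear scan of a small prefix, so it adds near-zero cost per file.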

The search core

The heart of the rewrite is search_text_multiline(), which faithfully reproduces ag's print state machine. ag walks the input byte by byte, tracking whether the current position is inside a match region, to determine which lines to emit as matches versus context:

```rust
fn search_text_multiline(
    text: &str, re: &Regex, opts: &Opts,
) -> Vec<Match> {
    // First pass: collect match regions, limited by max_count
    let mut regions = Vec::new();
    let mut search_start = 0;
    while let Some(m) = re.find_at(text, search_start) {
        regions.push(MatchRegion {
            start: m.start(), end: m.end(),
        });
        if regions.len() >= opts.max_count { break; }
        if m.start() == m.end() {
            // zero-width match: step past one char to avoid an infinite loop
            search_start = next_char_boundary(text, m.end());
        } else {
            search_start = m.end();
        }
    }

    // Second pass: simulate ag's byte-by-byte print state machine
    // to determine which lines are matches vs context
    // ...
}
```

The two-pass approach separates regex matching from line attribution. The first pass collects match regions (up to max_count) with the regex crate's find_at, which resumes the search at a given byte offset. The second pass walks the byte stream to map those regions onto line boundaries -- the same state machine ag uses in its print_file_matches function.
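The elided second pass can be sketched as an overlap test between match regions and line spans. This is a simplified stand-in for ag's state machine, not the rewrite's actual code: zero-width matches, context lines, and inverted matching all need extra handling that is omitted here.

```rust
// Hypothetical stand-in for the struct built in the first pass.
struct MatchRegion { start: usize, end: usize }

// Map byte-offset match regions onto 1-based line numbers by testing
// half-open interval overlap: [r.start, r.end) vs [line_start, line_end).
fn matching_lines(text: &str, regions: &[MatchRegion]) -> Vec<usize> {
    let mut out = Vec::new();
    let mut line_start = 0;
    for (lineno, line) in text.split_inclusive('\n').enumerate() {
        let line_end = line_start + line.len();
        if regions.iter().any(|r| r.start < line_end && r.end > line_start) {
            out.push(lineno + 1);
        }
        line_start = line_end;
    }
    out
}
```

For "foo" matched at bytes 0..3 and 12..15 of "foo bar\nbaz\nfoo\n", this attributes lines 1 and 3 as matches, which is the attribution the printer needs before it can decide what counts as context.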

The ignore engine

ag's ignore handling is one of its trickiest subsystems. The Rust rewrite uses a stack-based IgnoreEngine that pushes a new rule scope when entering a directory and pops it when leaving.
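A minimal sketch of that push/pop scoping, with plain suffix rules standing in for the real engine's gitignore glob matching:

```rust
// Sketch of the stack-based scoping idea only; the real IgnoreEngine
// parses full gitignore glob syntax, which is elided here.
struct IgnoreEngine {
    frames: Vec<Vec<String>>, // one frame of rules per directory level
}

impl IgnoreEngine {
    fn new() -> Self {
        IgnoreEngine { frames: Vec::new() }
    }

    // Entering a directory: push its rules as a new scope.
    fn push_dir(&mut self, rules: Vec<String>) {
        self.frames.push(rules);
    }

    // Leaving a directory: its rules stop applying.
    fn pop_dir(&mut self) {
        self.frames.pop();
    }

    // A file is ignored if any rule in any active frame matches.
    fn is_ignored(&self, name: &str) -> bool {
        self.frames
            .iter()
            .flatten()
            .any(|rule| name.ends_with(rule.as_str()))
    }
}
```

Because a frame lives exactly as long as the traversal of its directory, rules cannot outlive the subtree they were declared in.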

This prevents scope leakage -- a .gitignore rule in src/ should not affect files in tests/. ag's C implementation uses the same conceptual approach but with a different data structure. The stack-based design makes the Rust version easier to verify: each directory's ignore rules are an isolated frame, and the frame lifetime matches the directory traversal lifetime.

The ignore engine handles .gitignore, .hgignore, .ignore, --ignore patterns, hidden file rules, and VCS directory skipping. 24 parity tests verify that rust-ag and ag agree on which files to include or exclude.

Correctness

Before looking at speed, I checked whether the rewrite still behaves like ag. The test suite includes 214 Rust tests covering the full search surface:

Grid showing eight smoke scenarios where ag and rust-ag matched on stdout, stderr, and exit code.
ag and rust-ag matched across all 8 tested parity scenarios in the correctness gate.
| Category | Tests | What they cover |
| --- | --- | --- |
| Core matching | 23 | Literal, regex, case-sensitive, smart-case, word-boundary, multiline |
| Recursion and ignore | 24 | Gitignore, .ignore, hidden files, VCS skip, symlinks, depth limits, scope leak |
| Edge cases | 45 | Binary detection, max-count, zero-length regex, large files, one-device |
| Count/filename/stream | 43 | --count, --nofilename, stdin, --numbers, per-line counts |
| Context/color/filters | 37 | -A/-B/-C context, --color, -g/-G filename filters |
| Exit codes and errors | 28 | Exit codes 0/1/2, bad regex, nonexistent paths, -v invert |
| Smoke and skeleton | 14 | Build verification, --version, --help, template parity |

The parity tests run rust-ag and ag on identical inputs and assert matching stdout, stderr, and exit codes. The 38 declared scenarios in manifests/scenarios.json define the full test surface across literal search, regex search, CLI behavior, edge cases, and exit code verification.
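The shape of such a parity assertion can be sketched with std::process::Command; the binary names and arguments below are placeholders, not the harness's actual code:

```rust
use std::process::Command;

// Run two commands on identical arguments and require byte-identical
// stdout and stderr plus equal exit codes -- the parity contract the
// test suite enforces between ag and rust-ag.
fn outputs_match(bin_a: &str, bin_b: &str, args: &[&str]) -> std::io::Result<bool> {
    let a = Command::new(bin_a).args(args).output()?;
    let b = Command::new(bin_b).args(args).output()?;
    Ok(a.stdout == b.stdout
        && a.stderr == b.stderr
        && a.status.code() == b.status.code())
}
```

Comparing raw bytes rather than parsed output is the strict choice: it catches differences in ordering, trailing newlines, and color escapes, not just in match content.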

Where the 2x comes from

The regex engine swap is the primary contributor. ag uses PCRE (Perl-Compatible Regular Expressions), a backtracking engine written in C. rust-ag uses the regex crate, which is built on a Thompson NFA with lazy DFA optimization.

For the literal-simple scenario (search for "foo"), the regex crate detects the literal pattern and uses Aho-Corasick or memchr for the search. This is a fast path that PCRE does not have for this pattern class -- PCRE compiles the pattern into bytecode and interprets it, even when the pattern contains no metacharacters. The difference compounds at scale: scanning a working tree means matching the pattern against thousands of files, and the per-file overhead adds up.
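To see why the literal case is cheap, note that the whole search degenerates to substring scanning. A dependency-free sketch using std's str::find makes the point; the regex crate's internal fast path uses memchr and Aho-Corasick rather than this exact routine:

```rust
// Count non-overlapping occurrences of a literal needle. No regex
// machinery is involved: str::find runs an optimized substring search.
fn count_literal(haystack: &str, needle: &str) -> usize {
    if needle.is_empty() {
        return 0; // avoid an infinite loop on the empty needle
    }
    let mut count = 0;
    let mut pos = 0;
    while let Some(i) = haystack[pos..].find(needle) {
        count += 1;
        pos += i + needle.len(); // resume after the match
    }
    count
}
```

A backtracking engine interpreting bytecode does strictly more work per byte than this loop, which is the gap the literal fast path exploits.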

The I/O path also differs. ag uses mmap for file access, which avoids explicit read syscalls but involves page fault handling and TLB pressure. rust-ag uses std::fs::read_to_string, which does a buffered read syscall. For the file sizes in this working tree (mostly small source files), the buffered read is competitive with mmap because the kernel's readahead heuristic effectively prefetches the data.

That said, both rg and ugrep are still faster. rg uses the same regex crate but adds parallelism, a custom directory walker, and a memory-mapped searcher. ugrep uses a SIMD-accelerated DFA with its own parallel traversal. rust-ag's single-threaded, non-SIMD search is a baseline that validates the regex engine advantage, not the ceiling of what Rust can do.

Speedup across run types

The 2x result was not a one-off. Across local, nightly, and manual runs, rust-ag stayed between 1.96x and 2.03x faster than ag.

Range chart showing speedup ranges across local, nightly, and manual benchmark runs.
Speedup ranges across local, nightly, and manual runs. rust-ag consistently beat ag, but rg and ugrep still led the field.
| Pair | Local | Nightly | Manual |
| --- | --- | --- | --- |
| rust-ag vs ag | 2.03x | 1.96x | 1.96x |
| rg vs ag | 2.54x | 2.45x | 2.48x |
| ugrep vs ag | 4.89x | 4.76x | 4.95x |
| rg vs rust-ag | 1.25x | 1.25x | 1.27x |
| ugrep vs rust-ag | 2.41x | 2.43x | 2.52x |

The ranking stayed stable: ugrep > rg > rust-ag > ag.
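Each ratio above is simply the slower tool's median divided by the faster one's, rounded to two decimals; recomputing the local column from the medians reported earlier:

```rust
// Speedup = baseline median / tool median, rounded to 2 decimal places.
fn speedup(base_ms: f64, tool_ms: f64) -> f64 {
    (base_ms / tool_ms * 100.0).round() / 100.0
}
```

speedup(19.64, 9.69) gives 2.03 for rust-ag vs ag, and speedup(19.64, 4.02) gives 4.89 for ugrep vs ag, matching the local column of the table.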

Limitations

Four cards summarizing the study limitations: only 1 of 38 scenarios measured, 3 measured samples after 1 warmup, one Apple M4 machine, and a failing overall claim gate due to incomplete reproducibility validation.
The result is useful, but it is still bounded by narrow scenario coverage, small sample counts, one machine, and an incomplete reproducibility bundle validation step.
  1. One of 38 scenarios. Only literal-simple has measured performance data. A regex-heavy, output-heavy, or large-file workload could change the ranking shape.

  2. Three samples per tool. Each tool has 3 measured samples after 1 warmup iteration. That is enough for a first read but not enough to compute tight confidence intervals.

  3. Single machine. Everything ran on one Apple M4, 16 GiB RAM, macOS Darwin 25.3.0. Linux, x86_64, or a different filesystem could move the numbers.

  4. Claim gate still failed. The parity checks, threshold checks, and cross-run agreement passed individually, but the full gate stayed red because the reproducibility bundle validation step is incomplete.

  5. Single-threaded rewrite. rust-ag accepts but ignores --workers / --parallel flags. The comparison is fair (both use --workers=1) but does not reflect ag's multi-threaded potential.

  6. No committed evidence artifacts. The benchmark infrastructure generates evidence at runtime but does not commit timing results. Numbers are reproducible on-demand via run_matrix.py, not from stored artifacts.

How this was built

The implementation was built using Factory mission mode. The mission system planned the rewrite across milestones (search core, ignore engine, CLI surface, parity tests, benchmark harness), ran worker sessions for each feature, and executed scrutiny and parity validators after every step.

The test-to-source ratio is 1.6:1 by line count (4,923 test lines vs 3,128 source lines across 14 Rust files). An additional 7 Python test scripts in scripts/parity/ validate manifest integrity, fixture hashes, and stderr parity, providing a second layer of verification outside the Rust test harness.

Reproducibility

```sh
git clone --branch mission/rust-rewrite-benchmark-publication \
  https://github.com/sagaragas/the_silver_searcher.git
cd the_silver_searcher

# Build both
./build.sh              # C ag
cargo build --workspace # Rust ag

# Run parity tests
python3 tests/edge-cases/setup_fixtures.py
cargo test --workspace -- --test-threads=5
python3 -m pytest scripts/parity -v

# Run benchmark scenarios
python3 scripts/parity/run_matrix.py --target rust --group smoke
python3 scripts/parity/validate_outputs.py --run latest
```

The 38 scenarios are defined in manifests/scenarios.json with cross-linked corpus hashes (manifests/corpus.json, 126 files with SHA-256) and query patterns (manifests/queries.json, 13 patterns). The manifest integrity chain ensures that changes to the corpus or queries invalidate the scenario hashes.
