Syzygy

Dual Code-Test C to Rust Translation using LLMs and Dynamic Analysis


UC Berkeley
*Equal Contribution, Equal Advising

[Teaser figure]

Despite its extensive use in high-performance, low-level systems programming, C is susceptible to vulnerabilities due to manual memory management and unsafe pointer operations. Rust, a modern systems programming language, offers a compelling alternative: through its ownership model and type system, it ensures memory safety without sacrificing performance. We present Syzygy, an automated approach for translating C to safe Rust. Our technique uses a synergistic combination of LLM-driven code and test generation, guided by execution information collected through dynamic analysis. We apply our approach to successfully translate Zopfli, a high-performance compression library with ~3,000 lines of code and 98 functions. We validate the translation by testing equivalence with the source C program on a set of inputs. To our knowledge, this is the largest automated, test-validated translation of C to safe Rust achieved to date.

Dualities for Safe and Valid C to Rust

As the name suggests, Syzygy exhibits two kinds of dualities that form the core principles of our approach.
  • LLMs meet Dynamic Analysis. First, we synergistically combine the generative capabilities of LLMs with semantic execution information collected by dynamic analysis of the source C codebase.
  • Dual Code-Test Translation. Second, rather than using the LLM + dynamic analysis recipe for Rust code generation alone, we also build a test-translation approach for reliable equivalence testing.

Syzygy Overview

Overview. The Slicer decomposes the codebase into translation units, the CodeGenerator translates each unit, and the EqTester checks equivalence of the generated Rust code with the original C code. The ArgTranslator maps C function arguments to their Rust counterparts, enabling appropriate equivalence checks in the EqTester. The SpecMiner, our dynamic analysis module, mines property and I/O specifications for internal C functions using top-level tests; these specifications later guide our LLM-driven dual code-test translation.
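To make the mining step concrete, the following is a hypothetical sketch of the kind of record the SpecMiner could produce for a single observed call. The struct and its field names are illustrative, not the paper's actual format; `ZopfliCalculateEntropy` is a real Zopfli function used here only as an example.

```rust
// Hypothetical shape of a mined specification for one observed C call.
// Field names and values are illustrative, not the paper's format.
#[derive(Debug)]
struct MinedSpec {
    function: &'static str,
    input_summary: &'static str,   // concrete inputs seen on a top-level test
    output_summary: &'static str,  // concrete outputs observed for that call
    properties: Vec<&'static str>, // dynamic properties: types, aliasing, bounds
}

fn main() {
    let spec = MinedSpec {
        function: "ZopfliCalculateEntropy",
        input_summary: "count: size_t[288] (non-null), n = 288",
        output_summary: "bitlengths: double[288]",
        properties: vec![
            "count and bitlengths do not alias",
            "all array accesses stay within n",
        ],
    };
    println!("{spec:?}");
}
```

Such a record gives the LLM both concrete I/O for test construction and property hints (aliasing, bounds) for choosing safe Rust types.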

Scaling LLM Inference with Dynamic Analysis

[Figure: Translation pipeline]

Translation Pipeline. As shown above, given a C function and test suite, we run four modules:

  1. SpecMiner: Analyzes function execution, collecting I/O and key properties like types, aliasing, and bounds.
  2. CodeGenerator: An LLM uses the spec to craft a candidate Rust translation.
  3. ArgTranslator: An LLM creates a function that translates C I/O to Rust inputs and expected outputs, aligned with the generated Rust translation.
  4. EqTester: An LLM generates a test that executes the candidate translation, along with the previously generated Rust code (Invariant), and checks whether the outputs match expected values.

Rejection Sampling. As shown below, we use rejection sampling to scale LLM inference throughout the pipeline: we alternate between generation and verification steps. An LLM first generates N candidate Rust functions, which are filtered for compilation; the pipeline then samples K translate_args functions for each valid candidate. Subsequent execution-filtering and equivalence-testing phases create and verify triples, ultimately yielding successfully translated and equivalent Rust functions.

[Figure: Rejection sampling pipeline]
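Each generate-then-verify round reduces to a simple rejection-sampling loop. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: `generate` stands in for an LLM sampling call, and `verify` for a filter such as compilation or equivalence testing.

```rust
// Generic rejection sampling: draw n candidates, keep those passing `verify`.
fn rejection_sample<T, G, V>(n: usize, mut generate: G, verify: V) -> Vec<T>
where
    G: FnMut(usize) -> T, // stand-in for an LLM sampling call
    V: Fn(&T) -> bool,    // stand-in for a verification filter
{
    (0..n).map(|i| generate(i)).filter(|c| verify(c)).collect()
}

fn main() {
    // Stub: "candidates" are integers and "verification" keeps even ones,
    // standing in for the compilation / equivalence filters in the pipeline.
    let survivors = rejection_sample(10, |i| i, |c| c % 2 == 0);
    println!("{} candidates survived", survivors.len()); // prints "5 candidates survived"
}
```

In the actual pipeline, the survivors of one filter seed the next stage's sampling (e.g., K translate_args functions per compiling candidate), so the rounds chain rather than run independently.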

Translating Zopfli from C to Rust

We completely translate Zopfli, a compression library known to achieve superior compression ratios by extensively optimizing the compression process. The codebase consists of over 3,000 lines of C code (excluding comments), comprising 98 functions and 10 structs across 21 files. It provides a challenging testbed for evaluating our approach due to the diversity of source-code constructs, including heap-based data structures (such as linked lists and array iterators), function pointers, and void* arguments. Our translated Zopfli program spans over 7,000 lines (including verbose comments and docstrings).

Pass Rate

We study the quality of the translation candidates across functions with two metrics: compilation pass rate and execution pass rate. The compilation pass rate is the fraction of generated translations that compile; the execution pass rate is the fraction of compiling translations that pass the equivalence test.
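As a worked example with made-up numbers (not results from the paper): if 100 translations are sampled, 82 compile, and 64 of those pass the equivalence test, the two rates are computed over different denominators.

```rust
fn main() {
    // Illustrative counts only; not the paper's measured numbers.
    let generated = 100u32;  // translations sampled
    let compiled = 82u32;    // pass the Rust compiler
    let equivalent = 64u32;  // pass the equivalence test

    // Compilation rate is over all samples; execution rate is over compiling ones.
    let compilation_rate = compiled as f64 / generated as f64;
    let execution_rate = equivalent as f64 / compiled as f64;
    println!("compilation pass rate: {:.0}%", compilation_rate * 100.0); // 82%
    println!("execution pass rate: {:.1}%", execution_rate * 100.0);     // 78.0%
}
```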
The mean pass rates are high, exceeding 75% for both compilable and executable translations. However, the distribution has a long tail of challenging functions with pass rates below 20%.

Efficiency

We further evaluate the final translated Rust code on varied test inputs, comparing its output and execution time with the original C code. We construct two test suites: the first contains six strings formed by repeating 'a' (between 10 and 1,000,000 repetitions); the second contains twelve random strings of varying lengths (between 1,000 and 1,000,000 characters). Additionally, we compare execution times under the default compilation configuration (gcc -O0 -g and cargo build, debug profile) and the optimized configuration (gcc -O3 and cargo build --release) for both the C and Rust implementations.
Performance Comparison of C and Rust Implementations

Input Type        Default                     Optimized
                  C       Rust    Slowdown    C       Rust    Slowdown
Repeated Input    0.17    1.56    8.9x        0.040   0.059   1.47x
Random Input      0.242   3.336   13.8x       0.094   0.330   3.67x

We find that the translated Rust code is functionally equivalent but slower, with larger slowdowns under the default configuration than under the optimized one. Even with compiler optimizations, the Rust code is up to 3.67x slower than the C code.