I've used my homebrew c2rust converter[1] to translate the lodepng and pngquant[2] libraries to Rust. My two key takeaways are:
Good test coverage is essential for this. Count how many bugs you wrote when you were writing this code for the first time. Even if your bug rate is 99% better during the rewrite, that may still be a significant number. Fine-grained tests aren't necessary, but end-to-end tests that touch every feature are crucial to catch regressions.
Once the rough conversion is done, it is necessary to refactor the code to take advantage of Rust's idioms to get the safety benefits. A plain 1:1 conversion is underwhelming, and it feels like replacing gcc with rustc. I did not realize just how recklessly pointer-heavy C tends to be until I saw it through a Rusty lens.
The lodepng conversion was "meh". It's good C code, but its structure is very different from what you'd do in Rust (e.g. Rust prefers generics over pointer casts, iterators over indexing or pointer arithmetic, and has interfaces for streaming processing that C lacks). I don't know how far I can refactor the code to Rusty idioms and still call it lodepng :)
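To illustrate the "generics and streaming interfaces" point, here is a minimal sketch of a checksum written against `std::io::Read` instead of a pointer plus length. This is not lodepng's code; the function name `adler32` and the 4096-byte buffer are assumptions made for the example.

```rust
use std::io::{self, Read};

/// Largest prime below 2^16, as used by the Adler-32 checksum.
const ADLER_MOD: u32 = 65_521;

/// Streaming Adler-32 over any byte source (file, socket, in-memory slice).
/// Hypothetical helper for illustration, not part of any library named above.
pub fn adler32<R: Read>(mut input: R) -> io::Result<u32> {
    let (mut a, mut b) = (1u32, 0u32);
    let mut buf = [0u8; 4096];
    loop {
        let n = match input.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Iterating over the slice replaces C-style index or pointer arithmetic.
        for &byte in &buf[..n] {
            a = (a + u32::from(byte)) % ADLER_MOD;
            b = (b + a) % ADLER_MOD;
        }
    }
    Ok((b << 16) | a)
}
```

Because `&[u8]` implements `Read`, the same function works on an in-memory buffer, e.g. `adler32(&b"Wikipedia"[..])`, as well as on a `File`, with no casts involved.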
OTOH the pngquant codebase was mine, and I'm happy with the results. When converting I took advantage of Rust idioms, and the Rust version is nicer to maintain and even a bit faster.
Thanks for citrus. I started experimenting with it after reading your pngquant blog post[1] the other day, and it's exactly what I was looking for. c2rust does a semantically exact conversion, which isn't what I need: I need a tool to automate the boring syntax conversion, and when doing the idiomatic rustification by hand I can take care of the different semantics of the two languages.
Of course there's a different trade-off when bugs are involved: c2rust shouldn't add any bugs during the conversion, while citrus will.
I just started using c2rust on openjpeg [0] (jpeg 2000 encoder/decoder) today and already have it working as a drop-in replacement for the C libopenjp2.so on Linux. It still has a lot of unsafe code, but it does work, which will be a big help with testing during refactoring to idiomatic safe Rust.
c2rust also has a refactor command that helps with refactoring the generated Rust code.
I found refactoring the resulting Rust code somewhat error-prone and didn't have great success with the automated tools. I'd recommend having a good test suite, and I suggest adjusting the C before the conversion to avoid C features that don't translate well, like the C preprocessor.
There was a while there where they were trying to test this out on the cvs codebase, IIRC. It's a good candidate: upstream doesn't exactly move quickly, but is very much a real-world codebase, still in use.
The transpiled code will use pointers; it will not transform C-isms that could map to Rust-isms (like borrowing instead of pointers, or iterators instead of pointer arithmetic). This is meant to be only the first step in a Rust refactor.
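A rough sketch of the difference between the two stages. The first function is hand-written in the style of a literal translation (it is not actual c2rust output), and the second is what a later manual refactor might look like; both function names are hypothetical.

```rust
/// Literal, C-shaped translation: raw pointer plus a separate length.
///
/// # Safety
/// `data` must point to at least `len` readable bytes.
pub unsafe fn sum_bytes_c_style(data: *const u8, len: usize) -> u32 {
    let mut total: u32 = 0;
    let mut i: usize = 0;
    while i < len {
        // SAFETY: the caller guarantees that `data..data + len` is readable.
        total = total.wrapping_add(u32::from(unsafe { *data.add(i) }));
        i += 1;
    }
    total
}

/// Refactored version: the slice borrows the data and carries its own length,
/// and the iterator cannot step out of bounds.
pub fn sum_bytes(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &byte| acc.wrapping_add(u32::from(byte)))
}
```

Both compute the same result, but only the first one can be called with a wrong length, which is why it has to be `unsafe`.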
You can compile the C code to WASM and then compile the WASM to safe Rust (I wrote a prototype for this and it works). Though, as with all WASM, it protects the environment from the C code, but the C code can still corrupt its own WASM memory. IIRC Firefox is even starting to use this approach to sandbox some of its components (though they compile the WASM back to C).
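A toy model of why that is, assuming the sandbox's memory is represented as one bounds-checked byte vector and C pointers become offsets into it. This is not the prototype mentioned above; `LinearMemory`, `Trap` and `unchecked_copy` are made-up names for the sketch.

```rust
/// Error returned when an access falls outside the sandbox's memory.
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Hypothetical stand-in for a WASM linear memory.
pub struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    pub fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Bounds-checked read: out-of-range addresses trap instead of reading host memory.
    pub fn load(&self, addr: u32) -> Result<u8, Trap> {
        self.bytes.get(addr as usize).copied().ok_or(Trap)
    }

    /// Bounds-checked write: out-of-range addresses trap instead of corrupting the host.
    pub fn store(&mut self, addr: u32, value: u8) -> Result<(), Trap> {
        let slot = self.bytes.get_mut(addr as usize).ok_or(Trap)?;
        *slot = value;
        Ok(())
    }
}

/// Mimics an unchecked C copy (think `strcpy`) running inside the sandbox:
/// it has no idea how large the destination buffer was meant to be.
pub fn unchecked_copy(mem: &mut LinearMemory, dst: u32, src: &[u8]) -> Result<(), Trap> {
    for (offset, &byte) in src.iter().enumerate() {
        let offset = u32::try_from(offset).map_err(|_| Trap)?;
        let addr = dst.checked_add(offset).ok_or(Trap)?;
        mem.store(addr, byte)?;
    }
    Ok(())
}
```

If an 8-byte buffer lives at offset 0 and another variable at offset 8, copying 9 bytes to offset 0 silently overwrites that variable, since every write is still inside the vector. Only a write past the end of the whole memory is stopped.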
There are non-standard extensions to C syntax that provide some amount of safety. It might be interesting to implement support for those within c2rust.
Fun fact: in order to be valid, Rust code in an `unsafe` block must uphold all of Rust's invariants. So (correct) Unsafe Rust is the only code written in Rust that's explicitly safe.
Idiomatically the unsafe block should have a comment explaining why this is actually fine, and if it's an unsafe public API it should have a doc-comment explaining how it can be used safely by other unsafe code.
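A small sketch of that convention, with hypothetical function names: a `# Safety` section in the doc-comment of the unsafe API, and a `SAFETY:` comment on each unsafe block saying why the contract holds at that spot.

```rust
/// Returns the first byte of `bytes` without a bounds check.
///
/// # Safety
/// `bytes` must not be empty.
pub unsafe fn first_unchecked(bytes: &[u8]) -> u8 {
    // SAFETY: the caller promises `bytes` is non-empty, so index 0 is in bounds.
    unsafe { *bytes.get_unchecked(0) }
}

/// Safe wrapper: establishes the precondition itself before calling in.
pub fn first_or_zero(bytes: &[u8]) -> u8 {
    if bytes.is_empty() {
        return 0;
    }
    // SAFETY: we just checked that `bytes` is not empty.
    unsafe { first_unchecked(bytes) }
}
```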
If you're using unsafe functions to flag something other than Rust's own safety considerations, the same likely applies there too. For example, Rust's core concept of safety doesn't care that this flag bit disables the interrupt controller and that getting it wrong means the product doesn't work, but you probably do, so you mark it "unsafe".
One of the things I like in Jon Gjengset's live coding YouTube videos is that he takes the time to write such comments, which means both the final code and the live session explain why he thinks this is safe. Once in a while there's a realisation while doing this: aha, this is the wrong way to do it, we need to change other things.
> But can we get 80% there and then have a human help out?
The problem is that for complex enough projects, the architectural redesign is considerably more demanding than a "remaining 20%". I can imagine it also being quite irritating, because C code often takes shortcuts that rely on implicit application logic (whether warranted or not), and those can't be translated directly due to Rust's strictness.
From another non-Rust programmer: all bugs you could write in C will still be there in the transpiled code, and the parts that happen to not be bugs will still look as indistinguishable from bugs as they did before. Likely even more indistinguishable than before, because chances are you are better at reading idiomatic C than at reading the anti-idiomatic Rust created by the transpiler. But there will likely be some low-hanging fruit: code that can subsequently be changed into idiomatic Rust that simply cannot encode certain classes of bugs (without becoming as un-idiomatic as the transpiler output).
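One example of that kind of low-hanging fruit, sketched under the assumption that the C code passed a PNG colour type around as a plain integer. The `ColorType` enum and its method names are invented for the example; the numeric codes are the ones from the PNG specification.

```rust
/// PNG colour types as a closed set: an invalid code cannot be represented.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColorType {
    Grey,
    Rgb,
    Palette,
    GreyAlpha,
    Rgba,
}

impl ColorType {
    /// Validation happens once, at the boundary where the raw byte is parsed.
    pub fn from_png_code(code: u8) -> Option<Self> {
        match code {
            0 => Some(Self::Grey),
            2 => Some(Self::Rgb),
            3 => Some(Self::Palette),
            4 => Some(Self::GreyAlpha),
            6 => Some(Self::Rgba),
            _ => None,
        }
    }

    /// The match is exhaustive: adding a variant without handling it here
    /// is a compile error, not a forgotten `case` that falls through.
    pub fn channels(self) -> usize {
        match self {
            Self::Grey | Self::Palette => 1,
            Self::GreyAlpha => 2,
            Self::Rgb => 3,
            Self::Rgba => 4,
        }
    }
}
```

After the parse step, code that takes a `ColorType` never has to consider the "colour type is 5" case that an `int` parameter would allow.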
What they've stated previously is that they plan to go with the approach of "compile C to unsafe Rust" and "compile unsafe Rust to safe Rust" as two separate things that could be chained together. I don't remember if it's meant to be literally another tool or just the general approach, but seems very interesting and sensible to me!