I wrote a toy language along these lines a while back[0]. Basically, type and function signatures, plus comments in English, produce a valid program: you write a type signature and a comment, and the compiler sends them through GPT to generate the code it runs. Fun, novel idea.
This paper suggests that pseudocode is more understandable to LLMs than natural language. Building on that, you wouldn't write comments inside method bodies; instead you'd translate each function body (in mixed English and pseudocode) as a unit. LLMs can keep a lot in context, but there is a limit, so I'd guess it would have to be structured as a multi-level translation.

You mention in the README that it is persnickety; that is probably due to framing the prompt as "auto-complete". I would use an intro like "here is a function body in pseudocode", followed by prompts along the lines of "split it into chunks by inserting labels", "translate the whole body preserving chunk labels", "re-translate each chunk using the whole body as context", and "fix type errors" (rough sketch below). That said, I think the biggest issues besides prompt engineering are cost and latency.
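Something like this minimal sketch, where call_llm is just a placeholder for whatever completion API you're using and the prompts are illustrative, not tuned:

    # Minimal sketch of the multi-level translation idea, assuming a generic
    # call_llm(prompt) -> reply helper; prompts are illustrative, not tuned.
    import re

    def call_llm(prompt: str) -> str:
        """Placeholder: send the prompt to your model of choice, return its reply."""
        raise NotImplementedError("wire this up to whatever LLM API you use")

    def translate(signature: str, pseudocode_body: str) -> str:
        source = f"Here is a function body in pseudocode:\n{signature}\n{pseudocode_body}"

        # 1. Split the body into chunks by inserting labels.
        labelled = call_llm(
            source + "\n\nSplit it into chunks by inserting labels like [CHUNK 1], [CHUNK 2], ..."
        )

        # 2. Translate the whole body at once, preserving the chunk labels.
        draft = call_llm(
            "Translate this pseudocode into code, preserving the chunk labels:\n" + labelled
        )

        # 3. Re-translate each chunk, giving the model the whole draft as context.
        for label in re.findall(r"\[CHUNK \d+\]", labelled):
            draft = call_llm(
                f"Here is a draft translation:\n{draft}\n\n"
                f"Re-translate the section marked {label}, using the whole body as "
                "context, and return the full updated body."
            )

        # 4. Fix type errors (real compiler errors could be fed back in here too).
        return call_llm("Fix any type errors in this code and return only the code:\n" + draft)

Of course, every extra round trip makes the cost and latency problem worse, so in practice you'd want to re-run as few of these steps as possible.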
[0] - https://github.com/eeue56/neuro-lingo