
I build a static analyzer in my spare time and read a bunch of +fravia in my youth, so I have some familiarity with the space, but I'm not an expert and could be completely off here. Could the LLM advances we've seen recently be used to improve the state of the art in structuring?

As in, could you abstract away the CFG and feed it into an LLM trained on a bunch of CFGs and corresponding ASTs? That seems especially plausible given that the LLM could be trained on a lot of code that may very well have been reused in the code being decompiled. Even if the code is not the same, the algorithms may be similar enough to improve the structuring output.
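
One way to read "abstract away the CFG" is to serialize the graph into text that no longer depends on addresses, so that structurally identical functions produce identical model input. Here is a minimal sketch of that idea, not a method from any of the tools or papers discussed. The dict-of-successors representation, the `linearize_cfg` name, and the `B0 -> B1` output format are all assumptions made for illustration.

```python
from collections import deque


def linearize_cfg(cfg, entry):
    """Serialize a CFG into text using opaque block names.

    `cfg` maps a block label to its list of successor labels (a
    hypothetical representation chosen for this sketch). Blocks are
    renamed B0, B1, ... in breadth-first order from `entry`, so the
    output does not depend on the original addresses.
    """
    names = {}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        if block in names:
            continue  # already visited via another edge
        names[block] = f"B{len(names)}"
        queue.extend(cfg.get(block, []))

    lines = []
    for block, name in names.items():
        succs = " ".join(names[s] for s in cfg.get(block, []))
        lines.append(f"{name} -> {succs}" if succs else f"{name} -> exit")
    return "\n".join(lines)


# Toy example: a single loop (header, body, exit).
cfg = {
    "0x400": ["0x410"],
    "0x410": ["0x420", "0x440"],  # loop header: enter body or leave
    "0x420": ["0x410"],           # loop body jumps back to the header
    "0x440": [],
}
print(linearize_cfg(cfg, "0x400"))
```

Pairs of such strings and the matching serialized ASTs would be the training examples. A real pipeline would presumably also need to encode branch conditions and block contents, which this sketch drops.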



Yes, I suspect so; there's been some work on this already (from a cursory arXiv search: [0], [1], [2], but there are more). I'm also curious whether graph neural networks can be used for structuring: given a decompiled CFG, can the original CFG be predicted?

There's a lot left to research here, but as the post alludes to, (public) development is limited. My hope is that there will be more work in this space to enable porting closed-source applications from older architectures to newer ones; I have a few ideas on how to go about that, but not enough time to look into them.

[0]: https://arxiv.org/abs/2310.06530

[1]: https://arxiv.org/abs/2306.02546

[2]: https://arxiv.org/abs/2304.03854



