
Very interesting. I especially like the second item.

What happens if the function you want to differentiate calls multiple other functions, in multiple other compilation units?

(I haven't read the paper yet but definitely will)




Enzyme needs access to the IR of any potentially active functions (calls that it deduces could impact the gradient) in order to differentiate them.

If all of the code you care about is in one compilation unit, you're immediately good to go.

Multiple compilation units can be handled in a couple of ways, depending on how much effort you want to put into setup (and we're working on making this easier).

The easiest way is to compile with Link-Time Optimization (LTO) and have Enzyme run during LTO, which ensures it has access to bitcode for all potentially differentiated functions.

The slightly more difficult approach is to have Enzyme emit derivatives ahead of time, rather than lazily, for any functions you may call in an active way (incidentally, this is where Enzyme's rather aggressive activity analysis is super useful). Enzyme supports custom derivatives, in which an LLVM function declaration carries metadata that marks its derivative function, so it can then be told to use the "custom" derivatives it generated while compiling the other compilation units. This obviously requires more setup, so I'm usually lazy and use LTO, but it can definitely be made easier as a workflow.
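
To make the ahead-of-time idea a bit more concrete, here is a toy model in Python. It is not Enzyme and uses none of its APIs: `Dual`, `CUSTOM_DERIVATIVES`, `external`, and `grad` are all made-up names for illustration, and the differentiation is plain forward-mode with dual numbers. The point is only the lookup: when the differentiator can't see a function's body, it can still proceed if a derivative for that function was registered beforehand, and it has to give up otherwise.

```python
import math


class Dual:
    """A value together with its derivative (forward-mode differentiation)."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


# Hypothetical registry: stands in for derivatives that were emitted
# ahead of time while compiling *other* compilation units.
CUSTOM_DERIVATIVES = {}


def external(fn):
    """Wrap a function whose body is not visible (it only takes floats)."""
    def call(x):
        x = _lift(x)
        if fn not in CUSTOM_DERIVATIVES:
            # No body and no registered derivative: nothing we can do.
            raise LookupError(f"no derivative available for {fn.__name__}")
        # Chain rule using the pre-registered derivative.
        return Dual(fn(x.val), CUSTOM_DERIVATIVES[fn](x.val) * x.dot)
    return call


def grad(f, x):
    """Derivative of f at x, seeded with dx/dx = 1."""
    return f(Dual(x, 1.0)).dot


# The "other compilation unit" ships sin together with its derivative.
CUSTOM_DERIVATIVES[math.sin] = math.cos


def f(x):
    # x * x is visible code; sin is an opaque external call.
    return x * x + external(math.sin)(x)
```

Here `grad(f, 2.0)` combines the derivative of the visible part (`2 * x`) with the registered `cos`, whereas wrapping a function that was never registered, e.g. `external(math.exp)`, raises `LookupError` when called. In this analogy, LTO corresponds to simply making every body visible so that no registry is needed.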


Thanks. LTO definitely looks like the more natural option.



