
You don't always need the input to compute the gradient. For example, the gradient of a sum doesn't require the original input values: the partial derivative with respect to each input is simply 1.
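
A minimal sketch of the sum case in plain Python, assuming a hand-written backward pass. The names `sum_forward` and `sum_backward` are made up for illustration and aren't from any library:

```python
def sum_forward(xs):
    # Keep only the length for the backward pass, not the values.
    return sum(xs), len(xs)

def sum_backward(n, grad_out):
    # d(sum)/d(x_i) = 1 for every i, so each input just receives grad_out.
    return [grad_out * 1.0 for _ in range(n)]

y, saved_n = sum_forward([3.0, -1.0, 4.0])
grads = sum_backward(saved_n, 1.0)
```

The only thing carried from forward to backward here is the number of inputs (in a tensor library, the shape), never the values themselves.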



To be more precise, in reverse-mode autodiff an input only needs to be saved if it is used in a non-linear way. For an operation that is linear in its input, the backward pass doesn't depend on that input's value.
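
To make the linear vs. non-linear distinction concrete, here's a rough sketch in plain Python with hand-written backward functions. All four function names are hypothetical, and `c` is assumed to be a fixed constant rather than something we differentiate with respect to:

```python
def scale_forward(x, c):
    # Linear in x: the backward pass needs only the constant c, not x.
    return c * x, c

def scale_backward(saved_c, grad_out):
    # d(c*x)/dx = c
    return saved_c * grad_out

def square_forward(x):
    # Non-linear in x: d(x*x)/dx = 2*x, so x itself has to be saved.
    return x * x, x

def square_backward(saved_x, grad_out):
    return 2.0 * saved_x * grad_out

y1, saved1 = scale_forward(5.0, 3.0)
y2, saved2 = square_forward(5.0)
g1 = scale_backward(saved1, 1.0)   # depends only on c
g2 = square_backward(saved2, 1.0)  # depends on the saved input x
```

Note that if `c` were also a differentiable input, `c * x` would be bilinear, and the gradient with respect to `c` would need `x` saved after all.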



