Interesting concept, but note this can't be used for training an LSTM network.

It's the training that involves far more computation and memory: rather than storing only the latest state of the LSTM cells, backpropagation through time needs all of the past states and activations of the cells.

For forward inference this logic looks sound, although I'm unclear on what use cases it would apply to.
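
To make the memory difference concrete, here's a minimal numpy sketch (hypothetical shapes and names, not the posted design): inference overwrites a fixed-size (h, c) state each step, while training with backpropagation through time has to cache every step's activations for the backward pass, so memory grows linearly with sequence length.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h, c, W, U, b):
        # One LSTM cell step; also returns the per-step activations
        # that a backward pass would need.
        n = h.shape[0]
        z = W @ x + U @ h + b
        i = sigmoid(z[:n])          # input gate
        f = sigmoid(z[n:2*n])       # forget gate
        o = sigmoid(z[2*n:3*n])     # output gate
        g = np.tanh(z[3*n:])        # candidate cell state
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new, (i, f, o, g, c, c_new)

    n_in, n_hid, T = 8, 16, 100    # hypothetical sizes
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4*n_hid, n_in))
    U = rng.normal(size=(4*n_hid, n_hid))
    b = np.zeros(4*n_hid)
    xs = rng.normal(size=(T, n_in))

    # Inference: O(1) state, overwritten every step.
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    for x in xs:
        h, c, _ = lstm_step(x, h, c, W, U, b)

    # Training (BPTT): O(T) state, every step's activations kept
    # so gradients can flow backward through time.
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    caches = []
    for x in xs:
        h, c, cache = lstm_step(x, h, c, W, U, b)
        caches.append(cache)   # memory grows with sequence length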




It makes much more sense to design ASICs for inference than for training. Any useful network will be run for inference many millions of times more often than it is trained, so inference accounts for far more computation in aggregate. Inference may also have to run in embedded environments, on battery power with no network connectivity, while training can typically run in the cloud.
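
A back-of-envelope check, with entirely made-up numbers, of why aggregate inference cost tends to dominate a one-time training cost:

    # Hypothetical figures for illustration only.
    train_flops = 1e18                  # one-time training run
    infer_flops = 1e9                   # one forward pass
    lifetime_queries = 1e12             # inference calls over deployment
    ratio = infer_flops * lifetime_queries / train_flops
    print(ratio)                        # 1000.0 -> inference dominates ~1000x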