Interesting concept, but note this can't be used for training an LSTM network.

It's the training that involves far more computation and memory: rather than storing only the latest state of the LSTM cells, backpropagation through time needs all of the past states and activations of the cells.

For forward inference this logic looks sound, although I'm unclear on what use cases it would apply to.
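
To make the memory difference concrete, here's a minimal numpy sketch (hypothetical shapes and names, not the posted design): inference overwrites a fixed-size (h, c) state each step, while training with backpropagation through time has to cache every step's activations for the backward pass, so memory grows linearly with sequence length.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h, c, W, U, b):
        # One LSTM cell step; also returns the per-step activations
        # that a backward pass would need.
        n = h.shape[0]
        z = W @ x + U @ h + b
        i = sigmoid(z[:n])          # input gate
        f = sigmoid(z[n:2*n])       # forget gate
        o = sigmoid(z[2*n:3*n])     # output gate
        g = np.tanh(z[3*n:])        # candidate cell state
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new, (i, f, o, g, c, c_new)

    n_in, n_hid, T = 8, 16, 100    # hypothetical sizes
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4*n_hid, n_in))
    U = rng.normal(size=(4*n_hid, n_hid))
    b = np.zeros(4*n_hid)
    xs = rng.normal(size=(T, n_in))

    # Inference: O(1) state, overwritten every step.
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    for x in xs:
        h, c, _ = lstm_step(x, h, c, W, U, b)

    # Training (BPTT): O(T) state, every step's activations kept
    # so gradients can flow backward through time.
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    caches = []
    for x in xs:
        h, c, cache = lstm_step(x, h, c, W, U, b)
        caches.append(cache)   # memory grows with sequence length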




It makes much more sense to design ASICs for inference than for training. Any useful network will be run for inference many millions of times more often than it is trained, so inference accounts for far more computation in aggregate. Inference may also have to run in embedded environments, on battery power with no network connectivity, while training can typically run in the cloud.
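
A back-of-envelope check, with entirely made-up numbers, of why aggregate inference cost tends to dominate a one-time training cost:

    # Hypothetical figures for illustration only.
    train_flops = 1e18                  # one-time training run
    infer_flops = 1e9                   # one forward pass
    lifetime_queries = 1e12             # inference calls over deployment
    ratio = infer_flops * lifetime_queries / train_flops
    print(ratio)                        # 1000.0 -> inference dominates ~1000x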