Same here, I guess it's a complex function of hardware and what you're trying to do. Personally I ended up writing my own autodiffing tensor library in C++, because all existing solutions had abysmal performance on my problem (lots of local updates in large tensors). The speedup is >50x compared to TF, PyTorch, Julia, and JAX.
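Concretely, the access pattern looks roughly like this (numbers invented, just to show that the handful of touched entries per step is tiny compared to the tensor size):

    #include <cstddef>
    #include <vector>

    // Shape of the workload: each step touches only k entries of an
    // N-element tensor, with k << N, and afterwards needs gradients
    // for exactly those entries.
    int main() {
        const std::size_t N = 100'000'000;                  // large tensor
        std::vector<float> params(N, 0.0f);
        for (int step = 0; step < 1000; ++step) {
            const std::size_t touched[] = {3, 17, 123456};  // k = 3 this step
            for (std::size_t i : touched) params[i] += 0.01f;   // local update
            // ...the backward pass should also only visit `touched`...
        }
    }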
> Personally I ended up writing my own autodiffing tensor library in C++
But this won't be a general library, right? You must have included only a certain subset of the functions in TF or PyTorch or whatever. Autodiff is also included in certain proprietary libraries, like the ones from NAG. But I doubt it's possible to achieve a 50x speedup without compromising on functionality.
Of course it's a herculean task to write a library with that many features. But I don't think that's the issue; it's more that the devs of TF can't possibly optimize for every use case. For me, I knew what kinds of ops I needed, so I could focus on getting those as fast as possible.
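To give a flavour of the kind of thing I mean, here's a stripped-down sketch (not my actual library; Tensor, Tape and local_axpy are made-up names for the example). The point is that each recorded backward op only touches the indices the forward update actually wrote, so both the forward update and the backward pass cost O(touched entries) rather than O(tensor size):

    #include <cstddef>
    #include <functional>
    #include <vector>

    // Dense tensor as one flat buffer; no per-op allocations.
    struct Tensor {
        std::vector<float> data, grad;
        explicit Tensor(std::size_t n) : data(n, 0.0f), grad(n, 0.0f) {}
    };

    // Reverse-mode tape: each entry knows how to push gradients back
    // for the few indices its forward op touched.
    struct Tape {
        std::vector<std::function<void()>> ops;
        void backward() {
            for (auto it = ops.rbegin(); it != ops.rend(); ++it) (*it)();
        }
    };

    // Local update: out[i] += w * x[i] for a handful of indices of a
    // huge tensor. A generic framework tends to dispatch a whole-tensor
    // (or scatter) op here; done by hand it's a short loop.
    void local_axpy(Tensor& out, Tensor& x, float w,
                    const std::vector<std::size_t>& idx, Tape& tape) {
        for (std::size_t i : idx) out.data[i] += w * x.data[i];
        tape.ops.push_back([&out, &x, w, idx] {
            // d(out[i])/d(x[i]) = w, so only the touched entries get gradient.
            for (std::size_t i : idx) x.grad[i] += w * out.grad[i];
        });
    }

    int main() {
        Tensor x(10'000'000), y(10'000'000);   // large tensors
        Tape tape;
        x.data[42] = 3.0f;
        local_axpy(y, x, 2.0f, {7, 42, 99}, tape);
        y.grad[42] = 1.0f;   // seed dL/dy at one entry
        tape.backward();     // x.grad[42] == 2.0f; nothing else touched
    }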
Does that mean you have efficient "CRUD" operators? This interests me because I'm building a relational language and toyed with the idea of using tensors as the "table" structure, but dropped it because of updates...