Using Deep Learning models trained on static images is usually not a good idea. ...

londons_explore · on Feb 5, 2020

And have the amount of training data and time scale from n^2 to n^3... That's why nobody has achieved good results with this yet...

bitL · on Feb 5, 2020

Oh yes, but that might be comparable to attention-based models for NLP, complexity-wise. Using C3D is quite common since 2015 actually:

Training a great model is expensive these days, FB routinely pays 6 figures per training run.