Yes, the bulk of our business is time series. This includes everything from hardware breakdowns to fraud detection.
I think Jeremy has some good points, but in general I wouldn't assume that everything is binary. (By this, I mean: take these kinds of terse statements with a bit of nuance.)
Usually, as long as you apply heavy regularization and use truncated backprop through time during training, you can solve some fairly useful classification and forecasting problems.
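To make that concrete, here's a minimal sketch of truncated BPTT with regularization in PyTorch. The toy data, window size, and layer sizes are all illustrative, not from our actual setup:

```python
import torch
import torch.nn as nn

TBPTT_WINDOW = 64          # backprop through at most this many steps
model = nn.LSTM(input_size=1, hidden_size=32, num_layers=2,
                dropout=0.5, batch_first=True)   # dropout = regularization
head = nn.Linear(32, 1)    # one-step-ahead forecast
opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()),
                       lr=1e-3, weight_decay=1e-4)  # L2 regularization
loss_fn = nn.MSELoss()

x = torch.randn(8, 1024, 1)   # (batch, time, features) toy series
y = x.roll(-1, dims=1)        # target: predict the next value

state = None
for t in range(0, x.size(1), TBPTT_WINDOW):
    chunk = x[:, t:t + TBPTT_WINDOW]
    target = y[:, t:t + TBPTT_WINDOW]
    out, state = model(chunk, state)
    loss = loss_fn(head(out), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Detach so gradients never flow past the window boundary: this is
    # what makes the backprop "truncated".
    state = tuple(s.detach() for s in state)
```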
Beyond that, standard neural net tuning applies, e.g. normalize your data, pay attention to your weight initialization, understand which loss function you're using, etc.
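And a quick sketch of those tuning steps, again in PyTorch; the particular init scheme shown (Xavier for input weights, orthogonal for recurrent weights) is just one common convention, not the only valid one:

```python
import torch
import torch.nn as nn

# Normalize with statistics from the training split only, so the
# validation/test data never leaks into the scaling.
train = torch.randn(8, 1024, 1)
mean, std = train.mean(), train.std()
train = (train - mean) / (std + 1e-8)

lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)

# Recurrent nets are sensitive to initialization, so set it explicitly.
for name, p in lstm.named_parameters():
    if "weight_ih" in name:
        nn.init.xavier_uniform_(p)   # input-to-hidden weights
    elif "weight_hh" in name:
        nn.init.orthogonal_(p)       # hidden-to-hidden weights
    elif "bias" in name:
        nn.init.zeros_(p)

# Know what your loss assumes: MSE for forecasting real values,
# cross-entropy (on raw logits) for classification.
forecast_loss = nn.MSELoss()
classify_loss = nn.CrossEntropyLoss()
```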
LSTMs don't "forget" so much as "remember the things that matter". They don't necessarily need less data. They do have a limit on the number of time steps they can handle, though.
E.g., you can't do thousands of steps into the future (maybe a few hundred or so).
The "long" in "LSTM" means it's good at remembering long-range dependencies.
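In practice that limit means windowing: rather than feeding thousands of steps at once, chop the series into chunks of a few hundred. A minimal sketch, assuming a single long univariate series (the WINDOW and STRIDE values are illustrative):

```python
import torch

WINDOW, STRIDE = 256, 128          # a few hundred steps per example
series = torch.randn(100_000, 1)   # one very long univariate series

# unfold turns the long series into overlapping (num_windows, WINDOW, 1)
# examples short enough for an LSTM to learn structure within.
windows = series.unfold(0, WINDOW, STRIDE).transpose(1, 2)
print(windows.shape)               # torch.Size([780, 256, 1])
```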