Is there any other good resource on time series modeling and forecasting besides exponential smoothing and variants of ARIMA? Pretty much every tutorial on the web covers exponential smoothing and ARIMA, or is some lazy LSTM tutorial.
Some good free textbooks are Rob Hyndman's online book https://otexts.com/fpp2/ and Brockwell and Davis' older textbook https://link.springer.com/book/10.1007/978-3-319-29854-2. They focus heavily on ARIMA and exponential smoothers because most time series datasets are pretty small (a few dozen to at most a few thousand samples), so there's really not much else that can be done.
Most of the approaches in Hyndman's textbook (mostly ARIMA and various exponential smoothers) are implemented in his 'forecast' R package.
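To make the exponential-smoothing idea concrete, here's a minimal sketch of simple exponential smoothing (the simplest member of the ETS family covered in Hyndman's book) in plain Python rather than R; the function name and the alpha value are my own illustrative choices, not anything from the 'forecast' package:

```python
def ses_forecast(y, alpha=0.3, horizon=1):
    """Simple exponential smoothing: level_t = alpha*y_t + (1-alpha)*level_{t-1}.
    The h-step-ahead forecast is flat at the last smoothed level."""
    level = y[0]                      # initialize the level at the first observation
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return [level] * horizon          # flat forecast path

# Example: a noisy-but-level series; the forecast sits near the series mean.
series = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3]
fc = ses_forecast(series, alpha=0.3, horizon=3)
```

In practice you'd estimate alpha by minimizing one-step forecast errors (which is what `forecast::ses` does), rather than fixing it by hand.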
ARIMA and exponential smoothers tend to be a bit hard to get working well on daily data (they come from an era when most data was monthly or quarterly). A modern take on classical frequency-domain Fourier regression is Facebook Prophet (https://facebook.github.io/prophet/), which tends to work pretty well if you have a few years of daily data.
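The Fourier-regression idea behind Prophet's seasonality can be sketched in a few lines: build sin/cos features at harmonics of the seasonal period and fit them by least squares. This is a hedged illustration of the underlying trick, not Prophet's actual implementation (which adds trend, changepoints, and a Stan backend); all names here are my own:

```python
import numpy as np

def fourier_features(t, period, order):
    """Sin/cos Fourier terms at harmonics 1..order of the given period --
    the same device Prophet uses to model seasonality."""
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

# Fit a weekly pattern in ~3 years of simulated daily data by ordinary least squares.
rng = np.random.default_rng(0)
t = np.arange(3 * 365, dtype=float)
y = 2.0 + np.sin(2 * np.pi * t / 7) + 0.1 * rng.standard_normal(t.size)
X = fourier_features(t, period=7.0, order=3)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fit = X @ beta
```

With enough harmonics this approximates any smooth periodic pattern, which is why a few years of daily data is usually enough for it to work well.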
FPP is great, but limited to the simplest possible timeseries: a single number recorded at evenly-spaced intervals.
Anyone know of good resources for multivariate, multimodal, irregular timeseries forecasting? I know some great practical tools and tutorials (prophet, fast.ai), but I'd love to inject some statistical knowledge like FPP offers.
- Multi-variate: textbook treatments tend to focus mainly on Vector Auto Regression (VAR) models. Unrestricted VARs scale very badly in vector dimension, so they often end up in some regularized form (dimension reduced by PCA or Bayesian priors). Lütkepohl's textbook is the standard reference.
VAR-type models are, in my view, not very practical for most business time series. You should probably not waste too much time on them unless you're really into macro-economic forecasting, in which case you're wasting your time anyway :). VAR forecast accuracy in macro-economics is not great, to put it mildly, but we have nothing really better.
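For the curious, a VAR(1) is just each variable regressed on one lag of every variable, so an unrestricted fit is ordinary least squares. A minimal sketch (my own illustrative code, assuming a stable simulated system; real work would use something like `statsmodels.tsa.api.VAR`):

```python
import numpy as np

def fit_var1(Y):
    """Least-squares fit of a VAR(1): y_t = c + A @ y_{t-1} + e_t.
    Y has shape (T, k). Returns the intercept c and coefficient matrix A."""
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])   # regressors [1, y_{t-1}]
    B, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)        # B has shape (k+1, k)
    return B[0], B[1:].T                                 # c: (k,), A: (k, k)

# Simulate a stable bivariate VAR(1) and recover A from the data.
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.1], [0.0, 0.4]])
Y = np.zeros((2000, 2))
for t in range(1, 2000):
    Y[t] = A_true @ Y[t - 1] + 0.1 * rng.standard_normal(2)
c_hat, A_hat = fit_var1(Y)
```

The scaling problem is visible here: a VAR(p) on k variables has k*(k*p + 1) coefficients, which is why high-dimensional VARs need PCA or Bayesian shrinkage.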
An alternative to VARs for multivariate time series are state space models, which are described mostly in Durbin & Koopman's and Andrew Harvey's time series textbooks. This model class was recently popularized in tech circles by Google's CausalImpact R package (though I think that package only implements the univariate model).
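The workhorse for state space models is the Kalman filter. Here is a minimal sketch for the local level model, the simplest case treated at length in Durbin & Koopman (function name, initialization, and variance values are my own illustrative choices):

```python
def local_level_filter(y, q=0.01, r=1.0):
    """Kalman filter for the local level model in state space form:
        state:       mu_t = mu_{t-1} + w_t,  w_t ~ N(0, q)
        observation: y_t  = mu_t + v_t,      v_t ~ N(0, r)
    Returns the filtered level estimates."""
    mu, P = y[0], 1.0              # crude initialization at the first observation
    out = [mu]
    for obs in y[1:]:
        P = P + q                  # predict: state variance grows by q
        K = P / (P + r)            # Kalman gain
        mu = mu + K * (obs - mu)   # update toward the new observation
        P = (1 - K) * P
        out.append(mu)
    return out
```

The same predict/update recursion generalizes to multivariate states, which is how the multivariate state space alternatives to VARs are computed.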
- Multi-modal: if you need to model some generic non-Gaussian time series process, you'll need some slow generic simulation method (MCMC, particle filtering). I can't recommend a good reference since I haven't kept up with the literature for about 15 years; I only remember a bunch of dense journal papers from that era (e.g. https://en.wikipedia.org/wiki/Particle_filter#Bibliography).
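The particle filtering mentioned above can be sketched in a few lines. This is a bootstrap particle filter for the local level model; the point of the method is that it works for arbitrary (non-Gaussian) state and observation densities, but I use Gaussian noise here purely so the example is easy to check. All names and parameter values are my own:

```python
import math
import random

def bootstrap_particle_filter(y, n=1000, q=0.05, r=0.5, seed=0):
    """Bootstrap particle filter: propagate particles through the state
    equation, weight each by the observation likelihood, then resample.
    Returns the sequence of filtered state estimates."""
    rng = random.Random(seed)
    particles = [y[0] + rng.gauss(0, 1) for _ in range(n)]
    estimates = []
    for obs in y:
        # propagate each particle through the (random walk) state equation
        particles = [p + rng.gauss(0, math.sqrt(q)) for p in particles]
        # weight by the observation density N(obs; particle, r)
        w = [math.exp(-(obs - p) ** 2 / (2 * r)) for p in particles]
        total = sum(w)
        w = [wi / total for wi in w]
        estimates.append(sum(wi * pi for wi, pi in zip(w, particles)))
        # multinomial resampling to avoid weight degeneracy
        particles = rng.choices(particles, weights=w, k=n)
    return estimates
```

Replacing the Gaussian weighting step with any other observation density is all it takes to handle non-Gaussian models, which is exactly why these methods are generic and slow.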
- Irregular: if the irregularity is mild (filling a relatively small number of gaps/missing data points), you can use LOESS, smoothing splines, or Kalman filtering, which should all give you pretty similar results. If your time series are extremely irregular, probably no generic method will do well and you'll need to invest days/weeks/months into a fairly problem/data-specific method (probably some heavily tuned smoothing spline).
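For the mild case, even the crudest gap-filler, linear interpolation between the surrounding observations, often lands close to what LOESS, smoothing splines, or Kalman smoothing would give. A small sketch (my own illustrative function, using `None` to mark missing values):

```python
def fill_gaps(y):
    """Linearly interpolate interior None gaps in a regularly sampled series.
    The simplest of the options above; for mild gaps, LOESS, smoothing
    splines, and Kalman smoothing give broadly similar results."""
    y = list(y)
    known = [i for i, v in enumerate(y) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):          # fill the gap between anchors a and b
            frac = (i - a) / (b - a)
            y[i] = y[a] + frac * (y[b] - y[a])
    return y

# fill_gaps([1.0, None, None, 4.0]) fills the interior gap linearly.
```

Leading or trailing gaps have no second anchor and are left as-is here; that's where the smarter methods start to differ.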
If you're only talking about forecasting, and not medical/inference applications, then most statistical models are those plus GARCH variations.
There are multivariate models, but I don't know much about those. Most of the good resources are in the econometrics domain. Multivariate time series within econometrics, from what I've seen, is mostly used for portfolio balancing.
For a general overview of the statistics domain I would recommend:
For GARCH: Financial Modeling Under Non-Gaussian Distributions
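To make the GARCH reference concrete: a GARCH(1,1) lets today's volatility depend on yesterday's squared shock and yesterday's volatility. A minimal simulation sketch (my own function and illustrative parameter values, following the usual omega/alpha/beta convention):

```python
import math
import random

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate a GARCH(1,1) return process:
        sigma2_t = omega + alpha * e_{t-1}**2 + beta * sigma2_{t-1}
        e_t = sqrt(sigma2_t) * z_t,  z_t ~ N(0, 1)
    Requires alpha + beta < 1 for a finite unconditional variance."""
    rng = random.Random(seed)
    sigma2 = omega / (1 - alpha - beta)   # start at the unconditional variance
    e_prev, out = 0.0, []
    for _ in range(n):
        sigma2 = omega + alpha * e_prev ** 2 + beta * sigma2
        e_prev = math.sqrt(sigma2) * rng.gauss(0, 1)
        out.append(e_prev)
    return out

# The long-run sample variance hovers near omega / (1 - alpha - beta) = 1.0,
# but returns cluster into calm and turbulent stretches (volatility clustering).
```

The volatility clustering and heavy tails this produces are exactly the non-Gaussian features the book's title refers to.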
If you want to learn more within statistics about time series in medical data, there are (1) longitudinal analysis and (2) survival analysis. There are nonlinear time series models, but those are rare because most of our tools assume linearity. There are also circular time series and spatio-temporal statistics, but I don't have any relevant knowledge in those to share. I'm sure there are others within statistics that I don't know about.
There are 4 papers now, and most of them are on the statistical models that traditionally dominate this domain. Data science/ML models are slowly getting in there. In M4 the best model was a highly tailored hybrid of ML and statistical techniques; the person who created it was employed by Uber and wrote an article about it.
The 5th competition, M5, is currently underway and is split into 2 contests. I'm eagerly waiting to read the paper on the results.
I can recommend this [0] book. It's focused on financial time series and trading, but the techniques covered in the book are generic enough to apply to all kinds of time series, you can just ignore the finance parts. If you search hard enough you can find the PDF for free online. The way they treat convolution operators and efficiently approximate them with fixed-size EMAs was quite interesting to me. It's definitely a bit dated, but that's some of its charm.
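The EMA trick the book uses is worth sketching: an exponential moving average is an O(1)-per-sample recursion, and iterating it a few times produces smoother, gamma-shaped kernels that approximate more general convolution operators without ever storing a window. This is my own minimal rendering of the idea, not the book's code:

```python
import math

def ema(y, tau):
    """Exponential moving average with decay time tau (in sample units) --
    the O(1)-per-sample building block composed into larger operators."""
    alpha = 1 - math.exp(-1.0 / tau)
    out, level = [], y[0]
    for v in y:
        level += alpha * (v - level)
        out.append(level)
    return out

def iterated_ema(y, tau, n):
    """Applying the EMA n times yields a smoother, hump-shaped kernel,
    a cheap fixed-memory stand-in for a general convolution."""
    for _ in range(n):
        y = ema(y, tau)
    return y
```

Linear combinations of iterated EMAs at different tau then approximate differentials, moving norms, and the book's other convolution operators, all in constant memory per series.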
It hasn't really, at least not in production. Academics are now publishing a lot of papers using Deep Learning or RL, but you won't usually see those in live systems.
In live systems, latency is usually more important than a "better" model - a model that takes milliseconds to make slightly better predictions is too slow when you're working on nano- to microsecond scales, often on specialized hardware. Really, the "AI" part is less important in HFT than you might think. It's often more about systems/infrastructure.
This is for HFT specifically, perhaps it has had more impact on longer time horizons, or something like portfolio management. My impression is (but I may be wrong) that there aren't that many people doing something in between HFT and much longer (minutes to days) time horizons, something like milliseconds to seconds. Maybe there is an opportunity there for some of the newer AI techniques.
Try looking under the name "signal processing" instead. The toolbox under "time series analysis" is usually a variation on the contents of the old book by Box.