Lots of people have already made good library recommendations, so I will make a non-recommendation for all the data science students out there: stop thinking about libraries, and start thinking about models.
"What library do I use?" is the wrong question. "What model do I use?" is the right question. Libraries are just part of the process of answering that question.
That said, high-quality implementations of interesting time series models seem hard to come by, so it's still a legitimate question to ask about libraries. But consider the goal of asking: you want to find high-quality implementations of useful models, not a magic black box that you can crank data through.
It's useful as a study guide and reference even for someone who ostensibly learned all this stuff in school. It's a tremendously good book, and it's even more impressive that it's free to read online in a high-quality HTML document.
Can you reframe the problem to suit a more classical approach - regression using xgboost or lgbm? If so, go for that!
As an example, imagine you want to calculate only a single sample into the future. Say furthermore that you have six input timeseries sampled hourly, and you don't expect meaningful correlation beyond 48h old samples.
You create 6x48 input features, take the single target value that you want to predict as output, and feed this into your run-of-the-mill gradient boosted tree.
The above gives you a less complex approach than reaching for bespoke time-series stuff; I've personally had success doing something like this.
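A minimal sketch of what I mean, with synthetic placeholder data and sklearn's GradientBoostingRegressor standing in for xgboost/lgbm:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor  # xgboost or lgbm work the same way

    rng = np.random.default_rng(0)
    X_raw = rng.normal(size=(1000, 6))   # six hourly input series (synthetic placeholders)
    y_raw = X_raw.sum(axis=1).cumsum()   # target series (synthetic placeholder)

    n_lags = 48
    rows, targets = [], []
    for t in range(n_lags, len(X_raw)):
        rows.append(X_raw[t - n_lags:t].ravel())  # 6 x 48 = 288 lag features
        targets.append(y_raw[t])                  # the single next sample we want to predict
    X, y = np.array(rows), np.array(targets)

    model = GradientBoostingRegressor().fit(X, y)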
If your regressor does not support multiple outputs, you can always wrap it in sklearn's MultiOutputRegressor (or optionally RegressorChain; check it out). This is useful if, in the above example, you are not looking to predict only the next sample, but maybe the next 12 samples.
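Continuing the sketch above, a rough example of the multi-output variant (the horizon of 12 and the variable names are just for illustration):

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.multioutput import MultiOutputRegressor

    # Reuses X, y_raw and n_lags from the previous sketch.
    horizon = 12
    Y = np.array([y_raw[t:t + horizon] for t in range(n_lags, len(y_raw) - horizon)])
    X_multi = X[:len(Y)]   # align the lag windows with the multi-step targets

    multi = MultiOutputRegressor(GradientBoostingRegressor()).fit(X_multi, Y)
    next_12 = multi.predict(X_multi[-1:])   # 12 future samples in one call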
fbprophet is mostly just regression though, with features for trend, yearly and weekly periodicity (smoothed a bit using trigonometric regressors), and holiday features. The only non-standard linear regression part is that it includes a flexible piecewise linear trend, with regularization to select where the trend is allowed to change. Once the change points are selected, it's literally linear regression, that you could fit with anything that can handle regression (statsmodels, sklearn, xgboost, keras, tf, scipy or even just plain numpy).
So you could roll your own using lots of different libraries and do the feature engineering, but for plain-Jane business time series, fbprophet usually works pretty decently with minimum effort. That's because most of the predictable variation in business time series comes from patterns in human behavior, which are mostly regulated by the weekly and yearly cycles, interrupted a few times per year by holidays (and the 24h cycle if you forecast intra-day).
That said, fbprophet is somewhat tuned for a few years of regular data (ideally not interrupted by pandemics), sampled at the daily frequency. If your data is different (e.g. weekly or intra-day) it can start to break in unexpected ways, and it becomes worthwhile to dive into the model and customize or roll your own.
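To make the "roll your own" point concrete, here's a rough sketch of a Prophet-ish regression: a linear trend plus Fourier terms for the yearly and weekly cycles, fit with plain ridge regression. The data, harmonic counts and names below are made up for illustration; holiday dummies and trend change points would be added as extra columns the same way.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    dates = pd.date_range("2019-01-01", periods=730, freq="D")
    y = (100 + 0.05 * np.arange(len(dates))              # trend
         + 5 * (dates.dayofweek < 5)                     # weekday bump
         + 10 * np.sin(2 * np.pi * dates.dayofyear / 365.25)
         + rng.normal(0, 2, len(dates)))                 # fake daily sales

    features = {"trend": np.arange(len(dates))}
    for k in range(1, 4):   # yearly cycle, 3 harmonics
        features[f"yr_sin{k}"] = np.sin(2 * np.pi * k * dates.dayofyear / 365.25)
        features[f"yr_cos{k}"] = np.cos(2 * np.pi * k * dates.dayofyear / 365.25)
    for k in range(1, 3):   # weekly cycle, 2 harmonics
        features[f"wk_sin{k}"] = np.sin(2 * np.pi * k * dates.dayofweek / 7)
        features[f"wk_cos{k}"] = np.cos(2 * np.pi * k * dates.dayofweek / 7)
    X = pd.DataFrame(features)

    model = Ridge(alpha=1.0).fit(X, y)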
Your description of this technique as "6x48" reminds me of 1D convolutional neural networks.
I always had a gut feeling that there was some kind of unifying principle between NN and tree ensemble models. I wonder if the latter is kind of like a highly quantized, compressed version of the former, which explains why tree ensembles seem to work better than NNs on "lower resolution" data.
Peter Cotton has at least a dozen very credible studies/results on prophet vs other time series libraries. Before committing to prophet, please check out a few of these (all over LinkedIn). His tone is acerbic because he believes prophet is suboptimal & makes poor forecasts compared to the other contenders. That said, you can ignore the tone; just download the packages & test out the scenarios for yourself. I personally will not use prophet. Like most stat tools in the python ecosystem, it is super easy to deploy & code up, but often inaccurate if you actually care about the results. Of course, if it's some sales prediction forecast where everything's pretty much made up & data is sparse/unverifiable, then prophet ftw.
I think acerbity is warranted to some extent. We are data scientists. We get paid the big bucks because we have big brains and have the skills and training to use those big brains in order to reason about the work we are doing.
Data science has become so easy and accessible nowadays that basically anyone who can write code can fit and use models. That's a great thing in general, but it means that those of us who do this for a living really should hold ourselves to higher standards.
Even if you are thoroughly mediocre at your job (like me) and aren't smart enough to come up with something like Prophet on your own, you absolutely do need the ability to reason about the models that you use, and to evaluate their performance correctly.
You got it. It's unbelievably difficult to get model devs out of the mindset of training on their own VM, saving model outputs and metrics dumps to arcanely named file shares, etc. Once you can convince them that using stuff like workflow pipelining tools and centralized model repo servers isn't going to impede their creative process and that it prevents the mad scramble to find artifacts when there's turnover on the team, things become much more efficient.
Pretty much this. "ML engineering" has come to refer to the somewhat specialized task of implementing models and algorithms, and "ML ops" has come to refer to all of the other stuff that you just mentioned.
Don't forget "keeping the train data from leaking into the test data" and "having a way to reproduce the exact same model I trained last week".
Those two are too often forgotten.
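For the leakage point specifically, a minimal sketch using sklearn's TimeSeriesSplit so every fold trains strictly on the past (features, target and model here are placeholders):

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import TimeSeriesSplit

    X = np.random.default_rng(0).normal(size=(500, 10))  # placeholder features
    y = np.random.default_rng(1).normal(size=500)        # placeholder target

    # Each fold trains only on earlier rows and validates on later rows,
    # so no future information leaks into training.
    for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
        model = GradientBoostingRegressor().fit(X[train_idx], y[train_idx])
        score = model.score(X[test_idx], y[test_idx])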
- Prophet - seems to be the current 'standard' choice
- ARIMA - Classical choice
- Exponential Moving Average - dead simple to implement (see the sketch just below this list), works well for stuff that's a time series but not very seasonal
- Kalman/Statespace model - used by Splunk's predict[1] command (pretty sure I always used LLP5)
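On the "dead simple" EMA point, it really is one line of state; a generic sketch (alpha is a smoothing parameter you pick, this is not the Splunk implementation):

    def ema(xs, alpha=0.3):
        """Exponential moving average: one line of state, no library needed."""
        s, out = xs[0], []
        for x in xs:
            s = alpha * x + (1 - alpha) * s
            out.append(s)
        return out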
I did some anomaly detection work on business transactions and found the best approach was a sort of ensemble: we applied all the models, kept any anomalies, then used simple rules to only alert on 'interesting' anomalies, like:
- 2-3 anomalies in a row
- high deviation from expected
- multiple models all detected anomaly
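Roughly, the alerting rules boiled down to something like this (names and thresholds are made up; the real system was more involved):

    def interesting(flags, deviations, t, threshold=3.0):
        """flags: {model_name: [bool per time step]}, deviations: [float per time step]."""
        votes = sum(f[t] for f in flags.values())
        in_a_row = t >= 2 and all(
            any(f[t - i] for f in flags.values()) for i in range(3)
        )
        return (
            in_a_row                       # 2-3 anomalies in a row
            or deviations[t] > threshold   # high deviation from expected
            or votes >= 2                  # multiple models flagged the same point
        )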
Statistical process control always seemed like something that would benefit me in my work, but I don't know anything about it. I have looked up random Wikipedia articles, but that's all I know. Do you know of any more "serious" learning resources in that area?
I think the most succinct intro I have found is Donald Wheeler's Understanding Variation.
I've long wanted to write an open article series for someone like you but never gotten around to it. There's so much information out there, but you sort of have to piece it together on your own, which is suboptimal.
Yup, as per my other comment (https://news.ycombinator.com/context?id=33448802), fbprophet is largely tuned for a few years of somewhat regular business data sampled daily (e.g. sales per day). Outside its comfort zone (e.g. if you have monthly, or hourly/minutely data, or step changes / level shifts) it can fall apart pretty quickly. But its comfort zone happens to be very popular in business settings.
To be fair, it's pretty hard to create generic models that can robustly handle any random time series.
Get some forecasts quick => FB Prophet. It's not as good as they'd have you believe, but it's fast and analysts can play with it to some extent.
Outlier detection => Hand-rolled C++ ETS framework.
Multilevel predictions and/or more complex tasks => That's where neural models start to have the edge, but at that point it's a costly project. I like simpler stuff to start, moving to the big guns if/when it's needed.
For cases that Prophet doesn't cover I recommend bsts [0], which is much more flexible and powerful. Anything too complicated for bsts, I'll typically implement in Stan.
I once built a forecasting framework for a unicorn startup. Revenue and pipeline predictability was key as the company was going through its IPO phase. I took three approaches and 'ensemble'd them to predict revenue and pipeline.
1. Time series based forecast based on revenue (the one OP is referring to). All the statistical time-series models come here. I primarily used H2O.ai for this.
2. Conversion based revenue forecast (input -> pipeline, output -> revenue). This proved to be quite tricky as there was a time lag between pipeline creation and revenue conversion
3. Delphi-method: Got the sales/pre-sales folks on-ground to predict a bottom-up number and used that as a forecast.
Finally, I combined them by applying weightages to the above approaches - based on how accurate they were on the test dataset.
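The weighting step was essentially inverse-error averaging; a toy sketch with made-up numbers, not the original system:

    import numpy as np

    # Test-set errors (e.g. MAPE) for the three approaches; numbers are made up.
    errors = {"time_series": 0.12, "conversion": 0.18, "delphi": 0.25}

    # Weight each approach by inverse error, normalized to sum to 1.
    inv = {k: 1.0 / v for k, v in errors.items()}
    weights = {k: v / sum(inv.values()) for k, v in inv.items()}

    forecasts = {"time_series": 10.2, "conversion": 9.6, "delphi": 11.0}  # made up
    blended = sum(weights[k] * forecasts[k] for k in forecasts)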
IMHO, like many others have pointed out, the model/assumptions are more important than the library. The job of a data scientist is to make the prediction as reliable and explainable as possible.
As a few other people have mentioned, I find R to be the easiest tool for this job, specifically the forecast package [0]. I had to use this package for an applied econometrics course in college a few years ago, and I have been using it ever since. I find the syntax to be more straightforward than comparable libraries in Python. I also assume that this library (and other libraries in R) offer higher quality models and results than their counterparts in Python, but this is just an assumption.
Sktime is the best toolkit for time series out there. It provides an sklearn-like API for many models, plus modules for validation, metrics for evaluation, and all that sklearn jazz.
Besides that, I also like statsmodels as the docs are pretty good.
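The sktime quickstart is about as short as it gets; something like this (the ThetaForecaster and the bundled airline dataset are just one example setup):

    from sktime.datasets import load_airline
    from sktime.forecasting.theta import ThetaForecaster

    y = load_airline()                         # monthly series bundled with sktime
    forecaster = ThetaForecaster(sp=12)        # sp = seasonal periodicity
    forecaster.fit(y)
    y_pred = forecaster.predict(fh=[1, 2, 3])  # forecast 1-3 steps ahead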
I've had someone on a team implement feature engineering using tsfresh. It led to a malignantly under-performing, complicated heap of spaghetti that was a nightmare to get into production. Weird API, slow code, little added value over simple features found in a day of manual exploration.
The person doing the implementation wasn't a rock star coder, so we couldn't fix the performance and complexity issues in time; it was left out of the release. Maybe with more expertise, tsfresh can add value. The experience was pretty off-putting for me, all in all. Maybe others have different experiences?
I've tried it a number of times and had a similar experience. The whole stack is orders of magnitude slower to compute compared to "simple" features (i.e. rolling averages), without showing real predictive improvements.
Actually, I had a similar off-putting experience with tsfresh. I just thought it was due to me not understanding how to use it properly. The API is indeed pretty weird.
I don't think they necessarily perform well; it's just that time-series forecasting is notoriously hard and we don't seem to have progressed that far beyond ARIMA-type models (to the best of my knowledge).
Xgboost is a classifier/regressor for tabular data, prophet is for time series prediction. They are different use cases, though you can likely massage xgboost into doing time series prediction if you really wanted to. So the answer to "which is better" is "it depends".
Time delay embedding (i.e. the observed value at different time lags as a feature) is the usual trick to turn time series data into a tabular form for this sort of regression.
Check out my top level comment in this thread for a (hopefully clear) example. Sometimes you can rephrase a time series problem into boring classical regression.
It can make the implementation and maintainability of a codebase better (IMHO), without sacrificing predictive power.
Create features for day of week, day of year, month of year, lagged values of y, lagged values of y for each period (eg: 1, 2, 3 weeks and years ago etc). You then predict forward 1 time step at a time.
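Something like this with pandas (column names and the particular lags are just an example); you then roll forward one step at a time, feeding each prediction back in as the newest lag:

    import pandas as pd

    # df has a DatetimeIndex and a column "y"; names here are assumptions.
    def add_features(df):
        out = df.copy()
        out["dow"] = out.index.dayofweek
        out["doy"] = out.index.dayofyear
        out["month"] = out.index.month
        for lag in [1, 2, 3, 7, 14, 21, 365]:   # recent lags plus weekly/yearly lags
            out[f"y_lag_{lag}"] = out["y"].shift(lag)
        return out.dropna()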
/e: I've been out of this game for 2-3 years now, but Python had nothing comparable. Prophet is suboptimal at best. Some things are implemented here or there, but it's all over the place. So I agree that R really shines here.
I would generally prefer R for this kind of stuff as the experts generally write the code, but Darts seems OK and is well-tested, at the very least (haven't had a chance to use it in anger yet).
This is in part why we built Darts. Now I think we can say the situation is quite different. Darts offers many things offered by the R forecast package, and then some (for instance the ability to train ML models on large datasets made of multiple potentially high-dimensional series).
I like statsmodels. So far it has all the methods I need, and it is very well documented.
But I am just fiddling a little bit with my 'weather station'. No bleeding edge here.
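For that kind of hobby data, statsmodels' Holt-Winters class is usually enough; a sketch on synthetic hourly data with a daily seasonal period:

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # Synthetic hourly "temperature" stand-in for weather-station data.
    idx = pd.date_range("2022-01-01", periods=24 * 60, freq="H")
    y = pd.Series(10 + 5 * np.sin(2 * np.pi * np.arange(len(idx)) / 24)
                  + np.random.default_rng(0).normal(0, 0.5, len(idx)), index=idx)

    model = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=24).fit()
    forecast = model.forecast(24)   # next day, hour by hour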
Easiest is to use cvxpy with your own objective function; you can easily add seasonality regularization etc. Other things are too much of a black box. Also: pivot tables. They're free now in the online versions of Google Sheets and Excel. Set the time as a row field and it will automatically aggregate, or if you want irregular spacing you can group by every 100 samples.
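A rough cvxpy sketch of what that looks like: decompose the series into a smooth trend plus a periodic component, with the smoothness and seasonality penalties as your own regularizers (the data, period and penalty weights here are placeholders):

    import numpy as np
    import cvxpy as cp

    y = np.random.default_rng(0).normal(size=200).cumsum()   # placeholder series
    n, period = len(y), 7                                     # e.g. weekly seasonality

    trend = cp.Variable(n)
    seasonal = cp.Variable(n)
    lam_smooth, lam_season = 10.0, 1.0

    objective = cp.Minimize(
        cp.sum_squares(y - trend - seasonal)                                   # fit the data
        + lam_smooth * cp.sum_squares(cp.diff(trend, 2))                       # smooth trend
        + lam_season * cp.sum_squares(seasonal[period:] - seasonal[:-period])  # stable seasonality
    )
    cp.Problem(objective, [cp.sum(seasonal[:period]) == 0]).solve()
    fitted = trend.value + seasonal.value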
When modeling time series, you will want a model that is sensitive to both short-term and longer-term movements. In other words, a Long Short-Term Memory (LSTM) network.
Sepp Hochreiter invented this concept in his Master's thesis supervised by Jürgen Schmidhuber in Munich in the 1990s; today, it's the most-cited type of neural network.
LSTMs have been going the way of the dinosaurs since 2018. If you really need a complex neural network (over 1D convolution approaches), transformers are the current SOTA. Example implementation in "temporal fusion": https://pytorch-forecasting.readthedocs.io/en/stable/tutoria...
Mind you, in practice I've found these DL approaches overkill for simple problems of the "trends + cyclics + noise" kind.
My impression is that these kinds of models need a lot of data to train properly. I made a comment elsewhere in this thread, musing that tree ensemble models could do the same job, as a kind of low resolution quantized approximation. If you have experience in this area of research, I'd love to hear your thoughts on that.
* Statistical models (ETS, (V)ARIMA(X), etc)
* ML models (sklearn models, LGBM, etc)
* Many recent deep learning models (N-BEATS, TFT, etc)
* Seamlessly works on multi-dimensional series
* Models can be trained on multiple series
* Several models support taking in external data (covariates), known either in the past only, or also in the future
* Many models offer rich support for probabilistic forecasts
* Model evaluation is easy: Darts has many metrics, offers backtest etc
* Deep learning scales to large datasets, using GPUs, TPUs, etc
* You can do reconciliation of forecasts at different hierarchical levels
* There's even now an explainability module for some of the models - showing you what matters for computing the forecasts
* (coming soon): an anomaly detection module :)
* (also, it even includes FB Prophet if you really want to use it)
Warning: I'm probably biased because I'm the creator of Darts.
[1] https://github.com/unit8co/darts
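For anyone who wants to try it, the quickstart looks roughly like this (file and column names are placeholders):

    import pandas as pd
    from darts import TimeSeries
    from darts.models import ExponentialSmoothing

    df = pd.read_csv("sales.csv")                            # placeholder file
    series = TimeSeries.from_dataframe(df, "date", "units")  # placeholder column names

    # Hold out the last 36 points, fit, and forecast over the held-out window.
    train, val = series[:-36], series[-36:]
    model = ExponentialSmoothing()
    model.fit(train)
    forecast = model.predict(len(val))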