If you have a spreadsheet model, it lets you answer questions like "Which variable in my model is the most important?" and "If I underestimate X by 10%, what's the effect on Y?".
Most models are built on assumptions and estimates — if you run a company, your financial model will probably include an assumption about your future customer growth rate. If you use this model to make decisions — "How many people can we hire in Q4?" — then it's important to understand how sensitive those decisions are to your assumptions. It may turn out that overestimating your growth rate by 10% means you can only hire half as many people, in which case you'll probably want to hire a bit more conservatively. This analysis is pretty cumbersome to do in spreadsheets.
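As a toy illustration (all numbers made up, not from any real model), here's roughly what that kind of what-if looks like: a small error in the growth-rate assumption compounds over a year and eats directly into the budget left over for hiring.

    # Toy sketch (made-up numbers): how a growth-rate assumption feeds a hiring decision.
    def q4_hires(monthly_growth, start_mrr=100_000, fixed_costs=180_000, cost_per_hire=10_000):
        # Project monthly recurring revenue 12 months out, then spend whatever
        # is left over fixed costs on new hires in Q4.
        mrr = start_mrr * (1 + monthly_growth) ** 12
        return int(max(0.0, mrr - fixed_costs) // cost_per_hire)

    print(q4_hires(0.10))  # assumed growth of 10%/month -> 13 hires
    print(q4_hires(0.09))  # actual growth 10% lower than assumed -> 10 hires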
More broadly, we think there should be a better way than Excel to crunch numbers and do modelling, and we're trying to build it (https://causal.app). Would love to chat if this sounds interesting: taimur@causal.app :)
I see you're using a variance based approach. How does it handle a dependent structure? To my knowledge, Sobol indices assume that the input variables are independent.
Yep — we assume that the input variables are independent. This is more acceptable in some settings than others, so as always, it's good to understand the limitations of the model. Thanks for pointing this one out :)
I am not aware of any software package that implements these algorithms, although I am working towards creating or contributing to a Python library that does. Ideally I'd publish a paper in a few months showing how the results (e.g., ranking of importance) differ from the independent case on particular datasets.
Edit: It turns out SALib recently added some of this functionality. I'll check it out!
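For anyone who wants to try this kind of variance-based analysis in Python, here's a minimal sketch using SALib's Saltelli sampling plus Sobol analysis (which, as discussed above, assumes independent inputs; the model and bounds below are made up for illustration):

    from SALib.sample import saltelli
    from SALib.analyze import sobol
    import numpy as np

    # Made-up problem: 3 independent inputs, each uniform over its bounds.
    problem = {
        "num_vars": 3,
        "names": ["growth", "churn", "price"],
        "bounds": [[0.0, 0.2], [0.0, 0.1], [50.0, 150.0]],
    }

    def model(x):
        growth, churn, price = x
        return price * (1 + growth - churn) ** 12

    X = saltelli.sample(problem, 1024)      # N * (2D + 2) input samples
    Y = np.apply_along_axis(model, 1, X)    # evaluate the model on each sample
    Si = sobol.analyze(problem, Y)
    print(Si["S1"])  # first-order indices
    print(Si["ST"])  # total-effect indices (first-order + interactions)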
I'm late to the party. Hoping you can answer a question. On the one hand, you're lowering the bar to do useful statistical work, which is awesome. A new set of people will be able to do better analysis. On the other hand, you're baking in assumptions which this new set of people won't understand, so you aren't necessarily lowering the bar to do correct statistical work. That seems challenging, and I'm curious how you think about it.
The process of backpropagation figures out which inputs have the greatest effect on the output and adjusts the weights on those inputs to fit the output curve. In other words, assuming a trained single-layer neural network, the weights alone will tell you which inputs have the greatest effect on the output.
Sorry for the late reply, just saw your comment. I think what you're describing is training a model and then using feature importance. This is also a valid method for calculating sensitivities and it's used if the function from inputs to outputs is unknown. However, in our case, this function is known so there's no need to train a predictive model.
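For the "function unknown" case, a common pattern is to fit a surrogate model to observed input/output pairs and read sensitivities off something like permutation importance. A rough sketch with scikit-learn, on synthetic data, just to show the shape of the approach (not how our tool works, since we evaluate the known function directly):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.inspection import permutation_importance

    # Pretend we only observe (X, y) pairs and don't know the true function.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(2000, 3))
    y = 3 * X[:, 0] + X[:, 0] * X[:, 2] + rng.normal(0, 0.05, size=2000)

    # Fit a surrogate model, then rank inputs by how much shuffling each one hurts it.
    surrogate = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
    result = permutation_importance(surrogate, X, y, n_repeats=10, random_state=0)
    print(result.importances_mean)  # rough sensitivity ranking of the 3 inputs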
- Do you really want to be using important models in such an error-prone and difficult to test environment as Excel? There's a whole Spreadsheet Risks Interest Group (http://www.eusprig.org/horror-stories.htm) that collects tales of billion-dollar errors that are attributable to (poor use of) Excel.
- A simple sensitivity analysis such as a tornado plot (while clearly much better than the common practice of reporting nothing on sensitivity of uncertain model inputs) is a local approach: it's telling you which variables are important around one specific point in the input space (generally the median for each input). A global sensitivity analysis method such as implemented here gives you information on the entire input space (which can be quite different if your input model has non-linear features).
- A tornado plot corresponds to a "one-at-a-time" sensitivity analysis (modifying each input variable individually), whereas a global method varies all inputs simultaneously and can reveal interactions between inputs, which can matter a lot if your model is non-linear (a small sketch of the difference follows below).
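To make the one-at-a-time vs. global distinction concrete, here's a tiny made-up pure-interaction model where a tornado-style analysis around the medians sees nothing, while a global look at the input space does:

    import numpy as np

    # f(x1, x2) = x1 * x2 with both inputs uniform on [-1, 1]; the median of each is 0.
    f = lambda x1, x2: x1 * x2

    # One-at-a-time ("tornado") around the medians: varying either input alone
    # never moves the output, so both inputs look unimportant.
    print(f(1.0, 0.0), f(-1.0, 0.0))   # vary x1, hold x2 at its median -> 0.0 0.0
    print(f(0.0, 1.0), f(0.0, -1.0))   # vary x2, hold x1 at its median -> 0.0 0.0

    # Global view: sample the whole input space at once.
    rng = np.random.default_rng(0)
    x1 = rng.uniform(-1, 1, 100_000)
    x2 = rng.uniform(-1, 1, 100_000)
    print(np.var(f(x1, x2)))           # ~0.11: the output clearly varies, via the interaction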
You can run a simple sensitivity analysis in Excel with data tables (vary one input, keep the others constant), but large data tables usually slow your Excel file down a lot.
In addition to these (first-order) sensitivities, our tool also looks at global sensitivities, which take into account how certain inputs interact (we give an example in the description on the site).
We’re working on a new modelling tool that will provide a lot of this stuff out-of-the-box (sensitivity analysis, dealing with uncertainties, ...).
In fact, there is an example of this kind of sensitivity analysis tool already offered at https://www.getguesstimate.com (not my venture, but a friend's). It's like a spreadsheet with uncertainty and sensitivity analysis woven in throughout.
We're big fans of Guesstimate! We hung out with Ozzie the other day actually :)
Guesstimate can't run calculations over Excel files, though, and as far as I know it only runs first-order sensitivities on Guesstimate models. Our tool computes total-effect indices as well (https://en.wikipedia.org/wiki/Variance-based_sensitivity_ana...).
I don't want to dismiss this out of hand, but what does it do that the Excel Solver doesn't? Excel, LibreOffice Calc, etc. are very comprehensive tools that include a lot of stuff beyond the basics everyone is familiar with.
Hi, we’re familiar with Excel Solver but it doesn’t let you run a sensitivity analysis. You can do it using data tables which is (a) not that easy to set up and (b) can be quite slow for larger models.
This tool does a slightly different analysis than @RISK's — we use a different method under the hood that can account for interaction effects between variables, whereas I think @RISK performs one-way sensitivities. But it's essentially the same use-case, yes :)
@RISK is pretty cool and does a lot of important things that spreadsheets alone don't do, like working with distributions + monte carlo instead of single values, and sensitivity analysis. It has a pretty steep learning curve, though, and inherits all the issues of the spreadsheet paradigm.
We actually assume uniform distributions for the inputs of the sensitivity analysis.
This tool is a standalone thing, and you're right — it's similar to TopRank. Our actual product, Causal (https://causal.app) has aspects of @RISK, but packaged in our own (non-spreadsheet) modelling paradigm.
Nice demo! It's perhaps a little confusing if you're using uniform probability distributions for the sensitivity analysis inputs, but normal distributions for the uncertainty analysis (shown at https://causal.app/buy-to-rent).
Thanks! The sensitivity analysis tool is just a little standalone thing, our actual product is a SaaS app for crunching numbers and building models — trying to replace spreadsheets. More info here: https://causal.app
Do you have a sense of how many people are competent enough modelers to use a tool like this, AND who work primarily in Excel? My instinct would be to say the number is small, but it is Excel, and I suppose it wouldn't be the craziest thing I've heard to learn otherwise.