Show HN: Excel Sensitivity Analysis Tool (causal.app)
148 points by Lukas1994 on Nov 25, 2019 | 30 comments



Hi folks! Lukas (Lukas1994) and I built this.

If you have a spreadsheet model, it lets you answer questions like "Which variable in my model is the most important?" and "If I underestimate X by 10%, what's the effect on Y?".

Most models are built on assumptions and estimates — if you run a company, your financial model will probably include an assumption around your customer growth rate in the future. If you use this model to make decisions — "How many people can we hire in Q4?" — then it's important to understand how sensitive these decisions are to your assumptions. It may turn out that overestimating your growth rate by 10% means you can only hire half as many people, in which case you'll probably want to hire a bit more conservatively. This analysis is pretty cumbersome to do in spreadsheets.
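
As a toy illustration of that hiring question in code (the revenue model and all the numbers below are made up purely for illustration):

    # Toy illustration of "how sensitive is the hiring decision to the
    # growth assumption?". The revenue model and numbers are made up.
    def affordable_hires(monthly_growth, current_revenue=100_000, cost_per_hire=10_000):
        q4_revenue = current_revenue * (1 + monthly_growth) ** 3
        surplus = q4_revenue - current_revenue   # extra revenue available for hiring
        return int(surplus // cost_per_hire)

    print(affordable_hires(0.10))  # plan built on 10% monthly growth -> 3 hires
    print(affordable_hires(0.09))  # growth comes in 10% lower -> 2 hires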

More broadly, we think there should be a better way than Excel to crunch numbers and do modelling, and we're trying to build it (https://causal.app). Would love to chat if this sounds interesting: taimur@causal.app :)


I see you're using a variance-based approach. How does it handle a dependence structure between the inputs? To my knowledge, Sobol indices assume that the input variables are independent.


Yes, right now we assume independence. SALib recently added a few algorithms that can handle correlation: https://github.com/SALib/SALib/issues/193#issuecomment-55195... I'm planning to try them out soon :)


Yep — we assume that the input variables are independent. This is more acceptable in some settings than others, so as always, it's good to understand the limitations of the model. Thanks for pointing this one out :)
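
For anyone curious what a variance-based analysis with independent inputs looks like in practice, here's a minimal SALib sketch (the model function, input names and bounds are made up for illustration, not taken from the tool):

    # Minimal Sobol sensitivity sketch with SALib. Inputs are assumed
    # independent and uniform over their bounds, matching the assumption
    # discussed above. Model and bounds are illustrative only.
    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    problem = {
        "num_vars": 3,
        "names": ["growth_rate", "churn", "price"],
        "bounds": [[0.01, 0.10], [0.01, 0.05], [10, 50]],
    }

    def model(X):
        # Toy revenue-like function with an interaction between growth and price
        return X[:, 0] * X[:, 2] * 1000 - X[:, 1] * X[:, 2] * 500

    param_values = saltelli.sample(problem, 1024)  # N * (2D + 2) samples
    Y = model(param_values)

    Si = sobol.analyze(problem, Y)
    print(Si["S1"])  # first-order indices
    print(Si["ST"])  # total-effect indices (capture interactions too)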


There has been some exploration into variance-based global sensitivity analysis for models with dependent inputs. [1][2][3]

[1]https://www.sciencedirect.com/science/article/pii/S095183201... [2]https://www.sciencedirect.com/science/article/pii/S001046551... [3]https://www.sciencedirect.com/science/article/pii/S095183201...

I am not aware of any software package that implements these algorithms, although I am working towards creating or contributing to a Python library that does. Ideally I'd publish a paper in a few months showing how the results (e.g., ranking of importance) differ from the independent case on particular datasets.

Edit: It turns out SALib recently added some of this functionality. I'll check it out!


I'm late to the party. Hoping you can answer a question. On the one hand, you're lowering the bar to do useful statistical work, which is awesome. A new set of people will be able to do better analysis. On the other hand, you're baking in assumptions which this new set of people won't understand, so you aren't necessarily lowering the bar to do correct statistical work. That seems challenging, and I'm curious how you think about it.


Are there any benefits to using backpropagation for sensitivity analysis on larger datasets?


Can you elaborate on that?


The process of backpropagation is figuring out which input has the greatest effect on output and adjusting the weights on the input to fit the output curve. In other words, assuming a trained single layer neural network, the weights alone will tell you which inputs have the greatest effect on output.


Sorry for the late reply, just saw your comment. I think what you're describing is training a model and then using feature importance. This is also a valid method for calculating sensitivities and it's used if the function from inputs to outputs is unknown. However, in our case, this function is known so there's no need to train a predictive model.
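
To illustrate the distinction, the "train a model, then read off importances" approach looks roughly like this (made-up data; a linear model stands in for the single-layer network):

    # Crude sensitivity via a fitted linear model's coefficients, i.e. the
    # "learn the function, then inspect the weights" idea described above.
    # The data and feature names are made up for illustration.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(500, 3))  # three inputs on [0, 1]
    y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

    reg = LinearRegression().fit(X, y)
    # Scale coefficients by each input's spread so they're comparable
    importance = np.abs(reg.coef_) * X.std(axis=0)
    print(dict(zip(["x1", "x2", "x3"], importance.round(3))))

When the mapping from inputs to outputs is already given by the spreadsheet formulas, as it is here, you can query it directly instead of approximating it with a learned model.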


Hi Lukas.

Thanks for making this.

I will ask the obvious question: how is this different from the sensitivity analysis provided by the built-in data table functionality?

I'll say a little more: I've typically seen people take a data table, run it over a range of variables, and then make some tornado plots.

I just googled around and there is a lot of noise, but this one seems to have a clear explanation of what I mean:

https://www.f1f9.com/wp-content/uploads/2019/05/F1F9_Tornado...


There are three answers to this:

- Do you really want to be using important models in such an error-prone and difficult-to-test environment as Excel? There's a whole Spreadsheet Risks Interest Group (http://www.eusprig.org/horror-stories.htm) that collects tales of billion-dollar errors that are attributable to (poor use of) Excel.

- A simple sensitivity analysis such as a tornado plot (while clearly much better than the common practice of reporting nothing on sensitivity of uncertain model inputs) is a local approach: it's telling you which variables are important around one specific point in the input space (generally the median for each input). A global sensitivity analysis method such as implemented here gives you information on the entire input space (which can be quite different if your input model has non-linear features).

- A tornado plot corresponds to a "one-at-a-time" sensitivity analysis (modifying each input variable individually), whereas a global method varies all inputs simultaneously and can reveal potential interactions between inputs, potentially very important if your model is non-linear.

Some background material on these points from a course that I run: https://risk-engineering.org/sensitivity-analysis/
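
To make the one-at-a-time vs. global point concrete, here's a toy model where a tornado-style sweep around the midpoint sees nothing, while a variance-based analysis attributes all of the output variance to the interaction (the model is chosen purely for illustration):

    # y = x1 * x2 on [-1, 1]^2: both inputs matter globally, but varying one
    # at a time around the centre point (0, 0) shows no effect at all.
    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    def model(X):
        return X[:, 0] * X[:, 1]

    # One-at-a-time sweep around the midpoint: vary x1 with x2 fixed at 0
    x1 = np.linspace(-1, 1, 5)
    print(model(np.column_stack([x1, np.zeros_like(x1)])))  # all zeros

    # Global, variance-based analysis picks up the interaction
    problem = {"num_vars": 2, "names": ["x1", "x2"], "bounds": [[-1, 1], [-1, 1]]}
    samples = saltelli.sample(problem, 1024)
    Si = sobol.analyze(problem, model(samples))
    print(Si["S1"])  # first-order indices: roughly 0 for both inputs
    print(Si["ST"])  # total-effect indices: roughly 1, all from the interaction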


You can run a simple sensitivity analysis using Excel with data tables (vary one input, keep the others constant). Large data tables usually slow down your Excel file a lot. In addition to these (first-order) sensitivities, our tool also looks at global sensitivities. This also takes into account how certain inputs interact (we’re giving an example in the description on the site). We’re working on a new modelling tool that will provide a lot of this stuff out-of-the-box (sensitivity analysis, dealing with uncertainties, ...).


In fact there is an example of this kind of sensitivity analysis tool already offered at https://www.getguesstimate.com (not my venture, but a friend’s). It’s like a spreadsheet with uncertainty and sensitivity analysis woven in throughout.


We're big fans of Guesstimate! We hung out with Ozzie the other day actually :) Guesstimate can't run calculations over Excel files though + as far as I know, it's running only first-order sensitivities on Guesstimate models. Our tool is computing total-effect indices as well (https://en.wikipedia.org/wiki/Variance-based_sensitivity_ana...).


Also: https://sheetless.io/

Relevant to understanding causality in systems: https://en.wikipedia.org/wiki/Twelve_leverage_points


I don't want to dismiss this out of hand, but what does it do that the Excel Solver doesn't? Excel, LibreOffice Calc, etc. are very comprehensive tools that include a lot of stuff beyond the basics everyone is familiar with.


Hi, we’re familiar with Excel Solver but it doesn’t let you run a sensitivity analysis. You can do it with data tables, but they're (a) not that easy to set up and (b) can be quite slow for larger models.


Does it cover a different use case compared to Palisade @Risk?


This tool does a slightly different analysis than @RISK's — we use a different method under the hood that can account for interaction effects between variables, whereas I think @RISK performs one-way sensitivities. But it's essentially the same use case, yes :)

@RISK is pretty cool and does a lot of important things that spreadsheets alone don't do, like working with distributions + Monte Carlo instead of single values, and sensitivity analysis. It has a pretty steep learning curve, though, and inherits all the issues of the spreadsheet paradigm.


Thanks for the reply. Does your tool assume that the variables are normally distributed for the Monte Carlo analysis you run?

It looks like your tool is closer to TopRank from Palisade in its functionality?


We actually assume uniform distributions for the inputs of the sensitivity analysis.

This tool is a standalone thing, and you're right — it's similar to TopRank. Our actual product, Causal (https://causal.app) has aspects of @RISK, but packaged in our own (non-spreadsheet) modelling paradigm.


Nice demo! It's perhaps a little confusing if you're using uniform probability distributions for the sensitivity analysis inputs, but normal distributions for the uncertainty analysis (shown at https://causal.app/buy-to-rent).

A previous comment mentions you are using SALib in Python, which can (even if it's not documented) use normal probability distributions for the inputs: here's a notebook with an example: https://risk-engineering.org/notebook/sensitivity-analysis.h...
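
For reference, newer SALib releases let you declare non-uniform input distributions directly in the problem spec via a "dists" key (lightly documented; the variable names and parameters below are only illustrative):

    # Sketch of a SALib problem spec with normally distributed inputs.
    # With "dists", each "bounds" entry is reinterpreted: for "norm" it is
    # (mean, standard deviation) rather than (lower, upper).
    from SALib.sample import saltelli

    problem = {
        "num_vars": 2,
        "names": ["growth_rate", "churn"],
        "bounds": [[0.05, 0.01], [0.03, 0.005]],  # (mean, std) per input
        "dists": ["norm", "norm"],
    }

    samples = saltelli.sample(problem, 1024)
    # ...evaluate the model on these samples, then run SALib.analyze.sobol as usual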


Congratulations on launching!

I tried sending an email to the address given on your website, but my message bounced back with an error:

"<hi@casual.app>: Host or domain name not found. Name service error for name=casual.app type=A: Host not found"


It’s @causal.app :)


Oops! I looked at the name 20 times when retyping and still missed that. Thanks!


Haha no worries - people read Casual all the time. We might have to change our name :D


Cool idea! If I may ask: how do you plan on monetizing this?


Thanks! The sensitivity analysis tool is just a little standalone thing, our actual product is a SaaS app for crunching numbers and building models — trying to replace spreadsheets. More info here: https://causal.app


Do you have a sense of how many people are competent enough modelers to use a tool like this, AND who work primarily in Excel? My instinct would be to say it's a small group, but it is Excel, and I suppose it wouldn’t be the craziest thing I’ve heard to learn otherwise.



