I feel like articles like this are always… behind. SHAP isn’t a perfect tool by any means, but it would catch the low-hanging fruit like the “R” example.
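For concreteness, here's a rough sketch of the kind of first-pass check I have in mind; the dataset and model choices are made up for illustration, not taken from the article:

    # Hedged sketch: rank features by mean |SHAP value| to surface a feature
    # that dominates the model for suspicious reasons. Everything here
    # (data, model choice) is an illustrative assumption.
    import numpy as np
    import shap
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor

    X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)         # shape: (n_samples, n_features)
    importance = np.abs(shap_values).mean(axis=0)  # global importance per feature

    # A single feature with wildly outsized importance is exactly the kind of
    # low-hanging fruit worth investigating before trusting the model.
    for idx in np.argsort(importance)[::-1]:
        print(f"feature {idx}: mean |SHAP| = {importance[idx]:.3f}")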
> Data cleansing is something that not many people talk about
Perhaps because it is only of limited use in practice. For example, Kaggle competitions (https://www.kaggle.com/) often have problems with bad data. Instead of cleaning it, the response is always: welcome to the real world.
It's not that it is a black box that can somehow be unblacked. Neural networks are inherently messy things that have contrived, complex, and partial or ad-hoc representations.
Messy and complex, yes, but not altogether immune to analysis.
And, if the training data is diverse enough, it appears that individual neurons can reflect things we find meaningful, while being expressed in terms of neurons in previous layers which are themselves semantically meaningful, in a way we can comprehend.
Of course, the amount of time and effort needed to collectively understand the entirety of such a network (to the point that a similar network could be made by people choosing weights by hand (not copying the specific numbers, only the interpreted meanings/reasons behind them) and producing something which is not too much worse than the trained network) would be gargantuan, and I suspect it might require multiple generations, possibly even many.
But, I don't think it is impossible?
(presumably it will never happen, because it would not come anywhere close to being worth it to do the whole thing, but, still.)
> the individual neurons can reflect things we find meaningful, while being expressed in terms of neurons in previous layers
It’s easy to find relationships & explanations of how a network works that seem to hold, until they don’t.
Consider a circle. Assume that the y-coordinate corresponds to the centre of the circle. You can now describe which points lie inside the circle as a function of the x-coordinate. You can vary y to a certain extent, which changes the function, but not fundamentally. But now shift the y-coordinate outside the circle. Now your function of x, which seemed so convincing, is entirely useless.
How does the probability of being inside the circle depend on x? What weight does the model ascribe to x in making the decision? The answer always has to be ‘it depends’.
Sorry, I don't know what you mean by the thing about "the y-coordinate corresponds to the centre of the circle".
Do you mean that the y coordinate is the y coordinate of the center of the circle? or, is the y coordinate a position and not just a number?
Do you just mean that you have one parameter which determines the center of the circle, and another (which you called x), which determines the radius?
no, that can't be it, because you then talk about moving the y coordinate outside the circle.
Do you mean, if you have a given y value, and you are describing "is the point (x,y) inside the circle" as a function of x,
then if you pick the y value to be further than r away from the y coordinate of the center of the circle, where r is the radius, that the function stops depending on x and, for all x, is "no"?
I'm really not sure what point you are trying to make.
I'm not convinced that whatever your point is applies to the results I linked.
iirc, they did, for a handful of the neurons, try replacing them with human-designed ones using the same principles, and I believe that generally worked without all that much degradation of the functioning? I could be accidentally making that up, though (as in, possibly I'm wrong in thinking they did that).
> Do you mean, if you have a given y value, and you are describing "is the point (x,y) inside the circle" as a function of x, then if you pick the y value to be further than r away from the y coordinate of the center of the circle, where r is the radius, that the function stops depending on x and, for all x, is "no"?
Yes
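To spell out the toy version of that (my own illustration, with made-up numbers):

    # Whether "inside the circle" depends on x at all is controlled by y.
    def inside(x, y, cx=0.0, cy=0.0, r=1.0):
        return (x - cx) ** 2 + (y - cy) ** 2 < r ** 2

    # y close to the centre: the answer flips as x varies.
    print([inside(x, 0.5) for x in (-1.0, 0.0, 1.0)])  # [False, True, False]

    # y further than r from the centre: the answer is "no" for every x.
    print([inside(x, 2.0) for x in (-1.0, 0.0, 1.0)])  # [False, False, False]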
> I'm really not sure what point you are trying to make.
The point is that it is non-linear, and people want linear “explanations”. If you imagine that x is, say, income, then people want simple explanations for the effect of income on the model’s output.
People don’t like it if you say that, well sometimes income has an effect and sometimes it doesn’t. They don’t like it when you say that it actually depends on the relationship of a number of other variables in a non-linear way. They tend to assume that you’re hand-waving or making excuses, when instead what they’re asking for is a mathematical impossibility.
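As a concrete (made-up) illustration of why a single linear “weight for x” can’t exist here: fit a linear model to the circle problem and the coefficient on x is essentially meaningless, while the actual effect of x is different for every y. This assumes scikit-learn and synthetic data; it isn’t a claim about any specific tool from the article.

    # The point: a linear "weight for x" says nothing about this problem.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(5000, 2))                  # columns: x, y
    label = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)   # inside the unit circle?

    clf = LogisticRegression().fit(X, label)
    print(clf.coef_)  # both coefficients come out near zero, by symmetry

    # The real "effect of x" depends on y: the fraction of points inside the
    # circle at a given y shrinks as |y| grows, and is exactly 0 once |y| > 1.
    for y0 in (0.0, 0.9, 1.5):
        band = np.abs(X[:, 1] - y0) < 0.05
        print(y0, label[band].mean())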
Science is often probabilistic. "Facts" are often just theories. In the beginning many of those theories have holes, like Newtonian physics before the discovery of relativity. But we didn't just throw our hands up when we found a counterexample. Just because our understanding of something seems incomplete doesn't mean that thing is undecipherable.
I’m not sure it follows that the entire circuit can be subject to this kind of interrogation resulting in a neat story. We can always try to find narratives, but at the end of the day there’s no reason for all the parameter values learned to adhere to some nice human-parsable reason for existing.
Edit: Maybe in theory you could “understand” anything arbitrarily large if you just had enough memory to handle very complicated narratives. But I’m doubting that models can always be compressed into a simpler narrative representation.
It is true that many of the neurons don’t seem to correspond to a single clean concept (e.g. mixing two largely unrelated concepts), but, well, that particular issue isn’t a problem with the hypothetical I describe, because “this neuron represents ‘either a fire truck or a nebula’” is still something we can describe.
But, yes, I suppose it is possible that there could be important parts of how the network works that aren’t amenable to understanding what a local part is doing in order to achieve a task.
Perhaps details in how parts behave which seem meaningless individually, and only have a useful effect when many such parts are put together across many layers.
But, I still think it is important to note that, when trained in a diverse way that is representative of the actual distribution, there are at least many parts of the functioning which can be understood.
Trained NNs aren’t inherently completely inscrutable in how they accomplish things.
But yes, you do have a point that we probably can’t really rule out that there are important aspects of the functioning of these networks that can’t be understood locally in a human-meaningful way, even if many parts of it can be understood.
Yeah, when people want to see e.g. an image classification model explain the different features it saw in the image and the weights it assigned them in making its decision (this is an example in the article), they are asking for something that isn't what the model does.
ML models have tacit knowledge in a sense: you can't tractably write down a process for it. That's not to say you can't describe the situations in which a model works.
People want brief explanations of reasons for outputs, and a 10 GB pile of weights isn't really what they mean.
Human explanations, meanwhile, are often invented to fit evidence to personal biases and beliefs, and are thus typically deeply flawed. But we're more OK with humans making suspect decisions than with ML, in many cases.