I feel like articles like this are always… behind. SHAP isn’t a perfect tool by any means, but it would catch the low-hanging fruit like the “R” example.
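For concreteness, here's a rough sketch of the kind of first-pass check I have in mind; the dataset and model choices are made up for illustration, not taken from the article:

    # Hedged sketch: rank features by mean |SHAP value| to surface a feature
    # that dominates the model for suspicious reasons. Everything here
    # (data, model choice) is an illustrative assumption.
    import numpy as np
    import shap
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor

    X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)         # shape: (n_samples, n_features)
    importance = np.abs(shap_values).mean(axis=0)  # global importance per feature

    # A single feature with wildly outsized importance is exactly the kind of
    # low-hanging fruit worth investigating before trusting the model.
    for idx in np.argsort(importance)[::-1]:
        print(f"feature {idx}: mean |SHAP| = {importance[idx]:.3f}")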
> Data cleansing is something that not many people talk about
Perhaps because it is only of limited use in practice. For example, Kaggle competitions (https://www.kaggle.com/) often have problems with bad data. Instead of cleaning it, the response is always: welcome to the real world.
It's not that it is a black box that can somehow be unblacked. Neural networks are inherently messy things that have contrived, complex, and partial or ad-hoc representations.
Messy and complex, yes, but not altogether immune to analysis.
And, if the training data is diverse enough, it appears that individual neurons can reflect things we find meaningful, while being expressed in terms of neurons in previous layers which are themselves semantically meaningful, in a way we can comprehend.
Of course, the amount of time and effort needed to collectively understand the entirety of such a network (to the point that a similar network could be made by people choosing weights by hand (not copying the specific numbers, only the interpreted meanings/reasons behind them) and producing something which is not too much worse than the trained network) would be gargantuan, and I suspect it might require multiple generations, possibly even many.
But, I don't think it is impossible?
(presumably it will never happen, because it would not come anywhere close to being worth it to do the whole thing, but, still.)
> the individual neurons can reflect things we find meaningful, while being expressed in terms of neurons in previous layers
It’s easy to find relationships & explanations of how a network works that seem to hold, until they don’t.
Consider a circle. Assume that the y-coordinate corresponds to the centre of the circle. You can now describe which points lie inside the circle as a function of the x-coordinate. You can vary y to a certain extent, which changes the function, but not fundamentally. But now shift the y-coordinate outside the circle. Now your function of x, which seemed so convincing, is entirely useless.
How does the probability of being inside the circle depend on x? What weight does the model ascribe to x in making the decision? The answer always has to be ‘it depends’.
Sorry, I don't know what you mean by the thing about "the y-coordinate corresponds to the centre of the circle".
Do you mean that the y coordinate is the y coordinate of the center of the circle? or, is the y coordinate a position and not just a number?
Do you just mean that you have one parameter which determines the center of the circle, and another (which you called x), which determines the radius?
no, that can't be it, because you then talk about moving the y coordinate outside the circle.
Do you mean, if you have a given y value, and you are describing "is the point (x,y) inside the circle" as a function of x,
then if you pick the y value to be further than r away from the y coordinate of the center of the circle, where r is the radius, that the function stops depending on x and, for all x, is "no"?
I'm really not sure what point you are trying to make.
I'm not convinced that whatever your point is applies to the results I linked.
iirc, they did, for a handful of the neurons, try replacing them with human-designed ones using the same principles, and I believe that generally worked without all that much degradation of the functioning? I could be accidentally making that up, though (as in, possibly I'm wrong in thinking they did that).
> Do you mean, if you have a given y value, and you are describing "is the point (x,y) inside the circle" as a function of x, then if you pick the y value to be further than r away from the y coordinate of the center of the circle, where r is the radius, that the function stops depending on x and, for all x, is "no"?
Yes
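To spell out the toy version of that (my own illustration, with made-up numbers):

    # Whether "inside the circle" depends on x at all is controlled by y.
    def inside(x, y, cx=0.0, cy=0.0, r=1.0):
        return (x - cx) ** 2 + (y - cy) ** 2 < r ** 2

    # y close to the centre: the answer flips as x varies.
    print([inside(x, 0.5) for x in (-1.0, 0.0, 1.0)])  # [False, True, False]

    # y further than r from the centre: the answer is "no" for every x.
    print([inside(x, 2.0) for x in (-1.0, 0.0, 1.0)])  # [False, False, False]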
> I'm really not sure what point you are trying to make.
The point is that it is non-linear, and people want linear “explanations”. If you imagine that x is, say, income, then people want simple explanations for the effect of income on the model’s output.
People don’t like it if you say that, well sometimes income has an effect and sometimes it doesn’t. They don’t like it when you say that it actually depends on the relationship of a number of other variables in a non-linear way. They tend to assume that you’re hand-waving or making excuses, when instead what they’re asking for is a mathematical impossibility.
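As a concrete (made-up) illustration of why a single linear “weight for x” can’t exist here: fit a linear model to the circle problem and the coefficient on x is essentially meaningless, while the actual effect of x is different for every y. This assumes scikit-learn and synthetic data; it isn’t a claim about any specific tool from the article.

    # The point: a linear "weight for x" says nothing about this problem.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(5000, 2))                  # columns: x, y
    label = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)   # inside the unit circle?

    clf = LogisticRegression().fit(X, label)
    print(clf.coef_)  # both coefficients come out near zero, by symmetry

    # The real "effect of x" depends on y: the fraction of points inside the
    # circle at a given y shrinks as |y| grows, and is exactly 0 once |y| > 1.
    for y0 in (0.0, 0.9, 1.5):
        band = np.abs(X[:, 1] - y0) < 0.05
        print(y0, label[band].mean())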
Science is often probabilistic. "Facts" are often just theories. In the beginning many of those theories have holes, like Newtonian physics before the discovery of relativity. But we didn't just throw our hands up when we found a counterexample. Just because our understanding of something seems incomplete doesn't mean that thing is undecipherable.
I’m not sure it follows that the entire circuit can be subject to this kind of interrogation resulting in a neat story. We can always try to find narratives, but at the end of the day there’s no reason for all the parameter values learned to adhere to some nice human-parsable reason for existing.
Edit: Maybe in theory you could “understand” anything arbitrarily large if you just had enough memory to handle very complicated narratives. But I’m doubting that models can always be compressed into a simpler narrative representation.
It is true that many of the neurons don’t seem to correspond to a single clean concept (e.g. mixing two largely unrelated concepts), but, well, that particular issue isn’t a problem with the hypothetical I describe, because “this neuron represents ‘either a fire truck or a nebula’” is still something we can describe.
But, yes, I suppose it is possible that there could be important parts of how the network works that aren’t amenable to understanding what a local part is doing in order to achieve a task.
Perhaps details in how parts behave which seem meaningless individually, and only have a useful effect when many such parts are put together across many layers.
But, I still think it is important to note that, when trained in a diverse way that is representative of the actual distribution, there are at least many parts of the functioning which can be understood.
Trained NNs aren’t inherently completely inscrutable in how they accomplish things.
But yes, you do have a point that we probably can’t really rule out that there are important aspects of the functioning of these networks that can’t be understood locally in a human-meaningful way, even if many parts of it can be understood.
Yeah, when people want to see e.g. an image classification model explain the different features it saw in the image and the weights it assigned them in making its decision (this is an example in the article), they are asking for something that isn't what the model does.
ML models have tacit knowledge in a sense: you can't tractably write down a process for it. That's not to say you can't describe the situations in which a model works.
People want brief explanations of reasons for outputs, and a 10 GB pile of weights isn't really what they mean.
Human explanations, meanwhile, are often invented to fit evidence to personal biases and beliefs, and are thus typically deeply flawed. But we're more OK with humans making suspect decisions than with ML, in many cases.