> Do you think racial cases of bias outweigh other demographic biases? And what's your opinion of the general level of intent (or lack of, to fix) relative to other issues?
When talking about systemic bias, I think it's less about intent, but that lack of intent is one of the things that makes it so dangerous in terms of the level and longevity of its impact; it's much more difficult to fix when there is no single bad actor at whom to point the finger. That makes it easier for people to deny, either outright or at least in extent, and harder for the systems to be fixed.
I don't know that I could rank one type of demographic bias over another, but I think it's better for people to understand and acknowledge systemic bias of any form if we're to have any hope of rectifying any of them. I also think a lot of them are woefully entangled. For example, I've heard the argument that some bias isn't racial, it's economic, as in, a bias against poor people not people of color. But when there are already so many systemic economic biases against people of color, a lot of times it ends up being a distinction without a difference. I have plenty of examples of this, but don't want to dilute the impact of that last thought by going into detail.
> That just shifts the definitional question to the scale of the institution, structure, or process.
I agree. Systemic bias is everywhere, for almost all scales of systems.
> Is the definition, "a system is broad" or (I think you mean) "broad as in flexible", such that a system can be anything? Any micro or macro population?
Yes, you're right, I meant flexible. We can debate the severity of the impact of some form of systemic bias based on the size of the system (and the population or sub-population impacted), but to the individuals who are the victims of systemic bias, the result ends up being the same. At the end of the day, I don't think I'd fault any individual for taking up the cause of identifying and rectifying bias at any scale.
> I guess where I was going with this is that the list seemed to be about some things that are very broadly impactful but not specifically racist, like facial-recognition not recognizing anyone but white men very well, but then a related issue of how this is encoded in the system. Is usage required, or was it random, etc.
Yeah, I didn't really provide much context for why I keep a list in the first place. It's not intended to single out racism; in fact, it's intended to focus on the unintentional codification of bias, because I think there are more software developers out there at risk of unintentionally codifying bias into their algorithms than there are developers who would intentionally encode it.
The reason I keep the list is as a reminder of the level of impact you can unintentionally have in the systems you build without extremely deep thought and broad context. It's a reminder that what we do can have real impact on real people in ways we never imagined.
This is an especially recurrent theme in machine learning and AI systems, which are trained on limited datasets that often systematically under-represent the people to whom the systems will ultimately be applied.
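To make that concrete, here's a minimal sketch of the kind of representation check that's easy to skip when assembling training data. The group names and counts are entirely hypothetical; the point is just to compare who the model learns from against who it will be applied to:

```python
# Hypothetical illustration: compare training-set composition against the
# population a model will actually be applied to.
from collections import Counter

train_groups = ["A"] * 800 + ["B"] * 150 + ["C"] * 50     # who the model learns from
deploy_groups = ["A"] * 500 + ["B"] * 300 + ["C"] * 200    # who it will be applied to

train_counts = Counter(train_groups)
deploy_counts = Counter(deploy_groups)

for group in sorted(deploy_counts):
    train_share = train_counts.get(group, 0) / len(train_groups)
    deploy_share = deploy_counts[group] / len(deploy_groups)
    flag = "  <-- under-represented in training" if train_share < 0.5 * deploy_share else ""
    print(f"group {group}: {train_share:.0%} of training vs {deploy_share:.0%} of deployment{flag}")
```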
> Such that maybe top-to-bottom it'd be sorted by breadth. How many people does each impact, and how deeply-mandated or intertwined are the issues. And then different populations and problem to both give contrast, but also to indicate (when complete) the demographics of the problems.
I think this is a good idea.
> So I was wondering if you'd focused on black issues to make the list.
I did not specifically focus on that. I started tracking the list, I think, 2 or 3 years ago, and issues of systemic racism have been especially visible in the mainstream over that period, for reasons I won't get into. So it's largely a reflection of the systems I've been made aware of through research and publications.
> So, statistically, how much subpopulation misrepresentation would you expect in various ways? And are you saying there's more in general, or more racial, than expected?
I have no idea. I just know that I've not yet been able to find any real systemic bias, at least in the US, against rich, Caucasian males. If the question is which demographic slice is the most biased against, I think there are some studies that have looked into that, but I couldn't say myself.
> I interpreted the one about kidneys as an error because of changing demographics and how people had caught it, made note that this sort of thing happens, and started checking other research for similar sampling bias. It read as a success story where the one about facial-recognition was more actively black mirror.
There is certainly a bright side in these things being identified as problems that need to be solved. However, even with the kidney example, I don't know that it's a success so much as further evidence of how difficult these things are to rectify at a systemic level, since the vast majority of doctors using the still-biased eGFR formula have no idea that it has this problem.
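For context on what the bias actually looks like in the formula itself: the race adjustment is a literal multiplier. Here's a minimal sketch of the older 4-variable MDRD study equation as I understand it; the coefficients are the commonly published ones, but treat this as illustrative rather than anything clinical:

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, is_female, is_black):
    """Approximate 4-variable MDRD eGFR (mL/min/1.73 m^2), illustrative only.

    The 1.212 multiplier applied when the patient is recorded as Black is
    the race coefficient discussed above; the newer 2021 equations drop it.
    """
    egfr = 175.0 * (serum_creatinine_mg_dl ** -1.154) * (age_years ** -0.203)
    if is_female:
        egfr *= 0.742
    if is_black:
        egfr *= 1.212  # the race adjustment at issue
    return egfr

# Same labs, same patient, different "race" flag: roughly a 21% difference,
# which can move someone across a staging or treatment threshold.
print(egfr_mdrd(1.2, 55, is_female=False, is_black=False))
print(egfr_mdrd(1.2, 55, is_female=False, is_black=True))
```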
> More that I'm wondering if it was an attempt to model, like for predictive policing, or a tool sold to simplify the sorting of people? Because models are good, even when they're wrong, but crappy predictive tools are worse than useless and - where I was ultimately going with this - perhaps fraudulent to sell.
I mostly agree with this sentiment. The biggest issue I see is that most of the models weren't built with the intention of being biased or badly predictive, and those selling them often don't know it until after they've already been sold and broadly adopted. Fraud generally requires mal-intent, which most of these models and products didn't have, at least if we're being generous and optimistic. Negligence is probably closer to it, but I think it's difficult to prove in such a new frontier.
I just finished reading a book called Weapons of Math Destruction, which talks about the damage models can do. The author posits a set of tests to tell whether a model is beneficial or destructive. One hallmark of a good predictive modelling system, the author argues, is that it feeds the outcomes of its predictions back into the system. Many examples, especially those which alter the outcome by making the prediction in the first place (e.g. criminal sentencing based on predicted criminal activity, excluding candidates from the hiring process, firing teachers based on opaque models), have no feedback mechanism for directly determining how successful their predictions were.
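As a minimal, hypothetical sketch (mine, not the book's) of what that feedback loop means in practice: the crucial part is recording how each prediction actually turned out, which is exactly the step many of these deployed systems never take.

```python
class ScoredDecision:
    """Toy example: a prediction plus, eventually, the observed outcome."""
    def __init__(self, features, predicted_risk):
        self.features = features
        self.predicted_risk = predicted_risk
        self.observed_outcome = None  # filled in later, if anyone bothers to check

history = []

def predict_and_record(model, features):
    """Score a case and keep the prediction so it can be audited later."""
    decision = ScoredDecision(features, model(features))
    history.append(decision)
    return decision

def record_outcome(decision, outcome):
    """The feedback step: attach what actually happened to the prediction."""
    decision.observed_outcome = outcome

def hit_rate():
    """Without recorded outcomes, there is no way to know whether the model works."""
    scored = [d for d in history if d.observed_outcome is not None]
    if not scored:
        return None
    hits = sum(1 for d in scored if (d.predicted_risk > 0.5) == d.observed_outcome)
    return hits / len(scored)
```

A sentencing or hiring model with no equivalent of record_outcome() can only ever be judged on how plausible its inputs looked, never on whether its predictions were right.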
> But yeah, to recommend sentences, total crap.
One of the other primary tests is dependent on the context of how the model is used, so it's not really reasonable to try to determine the benefits of a model in isolation with no consideration for the appropriateness of its implementation and utility. In this case, it's hard to defend a model with bad predictive data on the presumption that it could have been more beneficial in another use, when the reality is that it was only sold for one use and that one use was harmful. Though even this ignores the fact that it had no built-in feedback mechanism to determine how good a job it was doing with its predictions in the first place.
> When talking about systemic bias, I think it's less about intent, but that lack of intent is one of the things that makes it so dangerous in terms of the level and longevity of its impact; it's much more difficult to fix when there is no single bad actor at whom to point the finger.
Thanks for approaching it this way. I think we often (societally, looking for blood) want to find a bad actor and are intentionally blind to bad things that just sort of happen at the edges but that need oversight to find and fix.
> For example, I've heard the argument that some bias isn't racial, it's economic, as in, a bias against poor people not people of color. But when there are already so many systemic economic biases against people of color, a lot of times it ends up being a distinction without a difference.
It doesn't help the sufferer, at the time, to be told that someone else would be suffering equally, but I think it does help fix the problem to realize when it's circular, via poverty or whatever, rather than simply racial, because in some areas and at some times it has simply been racial, and that hugely changes how you fix it. If it's overt, you can't just offer change; you have to prevent further damage during the repair process.
> The reason I keep the list is as a reminder of the level of impact you can unintentionally have in the systems you build without extremely deep thought and broad context.
Have you ever read comp.risks? I really like it as a source of Therac-25 type stories (across all fields) that engineering types should think about when building things.
> I just know that I've not yet been able to find any real systemic bias, at least in the US, against rich, Caucasian males.
Is Twitter not a system? :D
It gets a bit fuzzy with bias against the majority. Every model that isn't right disadvantages everyone, and the majority is part of everyone, so bad drug laws impact white people too. But because actual race is only encoded in one direction (affirmative action, "positive" measures), anything that impacts white people also impacts everyone, whereas there are often specific laws (such as those for constructing "The Projects" in the first place) that do directly exclude whites from the harm they caused. So subgroups definitely experience more exclusive problems, even aside from the sheer amount of problem.
> how difficult these things are to rectify at a systemic level, since the vast majority of doctors using the still-biased eGFR formula have no idea that it has this problem.
I think it's just that the story is best told from that moment. That's the OMG moment. From there it improves, and I'm sure they sent a copy of the report worldwide ASAP. But no solution is ever 100%, so there's no wrap-up party and it will never look done and solved. (Even one doctor who didn't check their email...)
> Fraud generally requires mal-intent, which most of these models and products didn't have, at least if we're being generous and optimistic.
Probably, but if they're saying "our product does X" maybe there's something to grab onto and investigate. Maybe they did misrepresent it.
> I just finished reading a book called Weapons of Math Destruction, which talks about the damage models can do. The author posits a set of tests to tell whether a model is beneficial or destructive. One hallmark of a good predictive modelling system, the author argues, is that it feeds the outcomes of its predictions back into the system.
Good point, and thanks for the book recommendation.
A lot of things aren't amenable to that, though, because the hypothetical city/community meeting can't take years to watch the outcomes and continue to train a model; they've got to work from the historical data up to that point and make policy decisions in the meeting.
> One of the other primary tests is dependent on the context of how the model is used, so it's not really reasonable to try to determine the benefits of a model in isolation with no consideration for the appropriateness of its implementation and utility.
I've been seeing that as the predictive vs directive use. City planning instead of sentencing guidelines.