This isn't an "algorithm" the way people here are thinking. It is a risk formula. The thing it does is standardize which risk factors have been shown to be important. The inputs are:
Age at current arrest
Current violent offense
Current violent offense & 20 years old or younger
Pending charge at the time of the offense
Prior misdemeanor conviction
Prior felony conviction
Prior conviction (misdemeanor or felony)
Prior violent conviction
Prior failure to appear in the past two years
Prior failure to appear older than two years
Prior sentence to incarceration
Each input is given a simple 0-2 integer importance weighting. Scores are summed, then rescaled by a translation table.
Outputs are a 1-6 "fail to appear" risk score, a 1-6 "new criminal activity" risk score, and a yes/no "new violent criminal activity" risk flag.
What to do with the scores (and any other factors to consider) is still left up to the judge. The advantage is in translating "does this defendant have a record?" into a numerical risk score in an evidence-based way.
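To make the "weights, sum, translation table" description concrete, here is a minimal sketch of that kind of points-based score. The factor names loosely follow the list above, but the weights and the cut points in the translation table are made up for illustration; they are not the published values.

```python
# Illustrative only: these weights and cut points are made up,
# not the published PSA values.
FTA_WEIGHTS = {
    "pending_charge": 1,
    "prior_conviction": 1,
    "prior_fta_past_two_years": 2,
    "prior_fta_older_than_two_years": 1,
}

# Translation table from raw point total (0-5 here) to the 1-6 scale.
FTA_TRANSLATION = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6}


def fta_score(factors):
    """Sum the weights of the factors that apply, then rescale to 1-6."""
    raw = sum(
        weight
        for name, weight in FTA_WEIGHTS.items()
        if factors.get(name, False)
    )
    return FTA_TRANSLATION[raw]


# A defendant with no listed factors, and one with two of them.
no_record = fta_score({})
repeat_fta = fta_score({"pending_charge": True, "prior_fta_past_two_years": True})
```

The point of the sketch is that there is no learning step at run time: the same inputs always give the same score, and the whole "model" fits in two small tables.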
Thanks for the clarification. I wish journalists wrote like this :)
Whether or not we want to call this an “algorithm,” it is a step en route to a probability based decision making machine. Not a bad example to frame discussions around.
The obvious plus-side is that if it works (sounds pretty plausible), this scoring system could help optimise jail time and its costs against the societal costs of re-offense. If it takes pressure off the justice system and reduces wait times, it could improve suspects’ civil and republican rights. The right to a timely trial, habeas corpus..
The downsides are a little foggier, and depend more on where this goes and what kinds of patterns emerge. In this simple scenario, there’s not a lot of data here and few obvious opportunities for bias or some other injustice. Doesn’t sound too bad.
..except age. If people your age reoffend more often, it doesn’t seem fair or just that you are effectively implicated for their wrongdoing. Does it?
As this system gets bigger it will want more data, for improved accuracy. Race, gender & political affiliation are obvious no-nos. We can’t discriminate on those grounds. How about arrest location, home address, income in the previous year, marital status and level of education? Rent or own? Have a dog? File tax? This starts sounding like an insurance quote.
If we go down a more google/facebook/ML route, we can probably eliminate any data points that raise specific objections. With enough data, we can find proxies for anything. More specifically, our black boxes will find proxies for anything.
From a blind justice perspective, I think it’s all the same problem regardless. We are being implicated by affiliation, a statistical affiliation. That’s not blind.
I think there you have the core issue. It’s present in a pen-and-paper scoring system. It’ll be present in a Google CrimeRank algorithm.
We’ve gotten very used to talking about bias in nonspecific terms, with a lot of emphasis on chauvinist bias. But, we’re bias machines. That’s what judges are, to a large extent. Not all bias is irrational, bigoted or chauvinist. Do we really want to eliminate only rational bias, or is bias a problem for reasons other than that?
While people complain about the racial bias in the courtroom, the sex bias is substantially larger.
Heck, the Supreme Court of Canada recently gave a woman a complete pass for trying to have her husband killed. Because after she was charged with attempted murder, she conveniently remembered that she was the victim of domestic violence.
Despite her claims being provably false.
i.e. claimed incidents occurred when they lived hundreds of miles apart. There's no record of her ever calling the police, despite her claims of multiple police visits, etc.
We found that judges take gender, but not race, into account in determining the amount of bail for certain types of cases; more specifically, Black females faced lower bail than Black males in less serious cases. In contrast, we found that both race and gender affected the likelihood of pretrial release. White defendants were more likely than black defendants to be released pending trial and females were more likely than males to be released prior to trial. In fact, white females, white males, and black females all were more likely than black males to be released.
Okay, I think I see what you mean now. It's a bit hard to see since you're so focused on gender bias.
I think you're right that these factors, while answering fairly objective questions, reinforce bias due to things like prior convictions. Once someone starts down this path, they get treated worse by the system based on history. Even though they did their time, they aren't starting fresh.
Still, I think it's an improvement (to a very flawed system) because it's not adding new bias. It also doesn't seem practical to reexamine previous convictions to see if they were fair when setting bail.
> Okay, I think I see what you mean now. It's a bit hard to see
I must admit to not understanding how it's difficult to see the correlation. If bail is set on factors X, Y, and Z, AND those factors are shown to be biased, then by definition, bail is also biased.
> since you're so focused on gender bias.
That's just a weird statement to make. The research shows bias and I quoted the research... how does that make me "so focused" on gender bias?
> it's not adding new bias.
That is a good point, but continuing existing bias is a serious problem.
The logic behind this is that any bias resulting in unjust convictions will later also cause bias when setting bail. It seems like making sure unjust convictions don't happen is probably the more important of the two? And fixing unjust convictions would also fix the issue with setting bail.
I cannot read the paper there - How is "judges take gender, but not race, into account in determining the amount of bail" determined? Purely from correlation between bail amounts and above factors?
It's better to move towards a more open and transparent system that's data driven. However, it isn't clear whether the system used here will be transparent, only that it can be. The builders of these systems can choose not to share the algorithms behind them.
A podcast I listened to a day or two ago [1], You Are Not So Smart, discussed something related to this that I feel is important to point out here as well.
It's about how we transfer our biases into the algorithms and ML solutions that we build. Given the move towards using algorithms for making decisions like this, it's something we should definitely consider.
If you have the time, definitely listen to this episode. It's an amazing podcast, but this one episode really hit me about how software can affect people's lives – and not always for the better. It also made me think about how we as software engineers should be more aware of how the solutions we build will be used down the line.
tl;dr of the episode would be:
ML solutions, and algorithms designed by humans, are designed by looking at historic data. Historically, a number of races have not been treated fairly when it comes to the justice system, e.g. black people being treated more harshly than non-black people for the same crime.
When our ML solutions are built on historical data, it learns those biases as well. Which means that the "racism" is built into the algorithm too. Of course the algorithm doesn't have any concept of racism; it's just another feature it uses to compute its decision. But it is something we as designers of algorithms should keep in mind.
> When our ML solutions are built on historical data, it learns those biases as well.
It's actually worse than that, because the algorithm is giving you the correct result.
For example, suppose that black men are more likely to fail to appear in part because they have worse, less flexible jobs because of racist hiring practices.
That isn't fair, but it's still true that they're more likely to fail to appear.
Are we supposed to give bail to people we know aren't going to show up because the reason they won't show up isn't fair?
That seems like an obviously bad idea, where the good idea is to do something about racist hiring practices so that doesn't happen anymore. At which point the algorithm will see the new data and un-bias itself.
The issue comes when features correlated with race are used to make inferences. This can occur in complex ways but a simple example, used elsewhere on this thread, is zip code. It might be that people from a mainly black area are more likely to exhibit whatever negative behaviour the model is looking for than a mainly white area, perhaps for the reasons you mention. What then happens is two otherwise identical people in the model are treated differently based on race.
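A toy sketch of the zip code point, under made-up numbers: the area names, the rates, and the `risk` function are all hypothetical and only illustrate how a location feature can separate two otherwise identical people.

```python
# Made-up historical failure-to-appear rates for two hypothetical areas.
AREA_FTA_RATE = {"area_a": 0.30, "area_b": 0.10}


def risk(prior_ftas, area):
    """Toy risk estimate; race is never an input, only the area is."""
    return 0.05 * prior_ftas + AREA_FTA_RATE[area]


# Two defendants with identical records who differ only in where they live.
risk_a = risk(prior_ftas=1, area="area_a")
risk_b = risk(prior_ftas=1, area="area_b")
gap = risk_a - risk_b  # driven entirely by the area feature
```

If the two areas are racially segregated, that gap lines up with race even though race never appears in the model.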
> The issue comes when features correlated with race are used to make inferences.
Practically all features are correlated with race. Income, culture, religion, education, proximity to gangs, parental involvement, etc. etc. And many factors cause other factors, which then also correlate.
The difference between zip code as a mechanism for Bayesian inference and redlining is that redlining is disproportionate. Redlining doesn't adjust probabilities in response to data, it just outright bans anyone in a neighborhood, which overcompensates. It makes inaccurate predictions to the detriment of the people in that neighborhood, because some of the people there have mitigating factors that overcome the negatives of living there, and it doesn't take that into account. An algorithm that considers all available data does.
You still have to address the things that cause trouble for the people in that neighborhood, but those are separate problems. You have to fix them, not pretend they don't exist.
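The contrast between a blanket ban and updating on evidence can be sketched with odds and likelihood ratios. Everything here is an assumption for illustration: the base rate, the two likelihood ratios, and the naive treatment of the factors as independent.

```python
def posterior(prior, likelihood_ratios):
    """Update a prior probability by multiplying odds by each likelihood ratio.

    Treats the factors as independent (a naive-Bayes style assumption).
    """
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)


def redline(lives_in_flagged_area):
    """Blanket rule: refuse everyone in the area, whatever else is known."""
    return "deny" if lives_in_flagged_area else "consider"


base_rate = 0.10        # hypothetical overall failure rate
neighborhood_lr = 2.0   # hypothetical: the area doubles the odds
stable_job_lr = 0.4     # hypothetical mitigating factor

p_area_only = posterior(base_rate, [neighborhood_lr])
p_with_mitigation = posterior(base_rate, [neighborhood_lr, stable_job_lr])
```

With these numbers the person with the mitigating factor ends up below the base rate, while `redline(True)` returns the same "deny" for both people.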
Another problem in the same vein is that when ML algorithms make use of Bayesian inference they can bake in correlations (e.g. between race and credit score) that we would normally purposefully avoid using as a factor, because while it enhances predictive power, it again codifies our existing biases, prejudices, and injustices. For example, if you were deploying an ML model to determine whether someone deserved a loan, features such as ZIP code or race could encode discrimination into the model.
Including race as a parameter should reduce the impact of those correlations on the output of the model (by allowing the model to measure and control for the bias that exists in the input data).
Incautiously using race just because it reflects those existing biases would be a problem. This is what lots of humans do, overestimating the information provided by their own inferences of race. Like internet assholes who blather about how it is rational to be afraid of black men because of their higher rates of assault. Never mind that the absolute rate is still so low that there is ~0 predictive power from the race of a given individual.
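The base-rate argument is just arithmetic. A sketch with purely hypothetical rates for two unnamed groups (the numbers are invented to show the shape of the argument, not taken from any data):

```python
# Purely hypothetical rates for two unnamed groups.
rate_a = 0.002
rate_b = 0.004

# The relative risk looks dramatic...
relative_risk = rate_b / rate_a

# ...but the absolute difference is a fraction of a percentage point,
absolute_gap = rate_b - rate_a

# and guessing "offender" for a random member of group B is almost always wrong.
chance_guess_is_wrong = 1 - rate_b
```

A doubled relative risk and a guess that is wrong more than 99% of the time can both be true at once, which is the gap between the statistic and its usefulness for judging any one individual.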
The technique I'm referring to is including race as a factor, and then instructing your model to disregard that factor. Not being race-blind, but being race-aware while attempting to be neutral to race.
Yeah, I think this is an interesting idea, but I don't actually understand how the great-grandparent comment believes that this could be done. I don't think it works in a strict Bayesian sense. You would have to go out of your way to instruct your model to operate correctively.
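One way the "aware but neutral" idea could be sketched, in a plain linear model rather than a Bayesian one: control for the group when fitting, then score everyone as if they belonged to a single reference group. The data below is invented, and it assumes the group penalty in the historical labels is pure bias, which is a strong assumption.

```python
def mean(values):
    return sum(values) / len(values)


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Toy data: x is a proxy feature, g a group flag, y a historical "risk" label.
# The labels were built as y = x + 2*g, so group 1 carries a +2 penalty
# that (by assumption) reflects bias rather than behaviour.
xs = [0, 1, 2, 2, 3, 4]
gs = [0, 0, 0, 1, 1, 1]
ys = [0, 1, 2, 4, 5, 6]

# Group-blind model: the slope on x soaks up part of the group penalty.
blind_slope = slope(xs, ys)


def centred(values):
    """Subtract each value's own group mean."""
    out = []
    for v, g in zip(values, gs):
        group_mean = mean([w for w, h in zip(values, gs) if h == g])
        out.append(v - group_mean)
    return out


# Group-aware model: centring within groups is equivalent to adding g as a
# regressor, so the slope on x is measured net of the group effect.
aware_slope = slope(centred(xs), centred(ys))

# Intercept taken from the reference group (g == 0).
x0 = [x for x, g in zip(xs, gs) if g == 0]
y0 = [y for y, g in zip(ys, gs) if g == 0]
intercept = mean(y0) - aware_slope * mean(x0)


def neutral_score(x):
    """Score everyone as if they belonged to the reference group."""
    return intercept + aware_slope * x
```

Here the blind slope comes out at 1.6 and the aware slope at 1.0, so leaving the group out makes the proxy feature carry the penalty. Which reference group to use, and whether the group effect really is bias, are choices the model cannot make for you.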
By definition, any model is basically bound to be discriminatory. Taking data, extracting common key features, and discarding the rest is essentially generalisation.
But the model is amoral. It's (morally) neither good nor bad for utilising certain features.
If it turned out that race was the most accurate attribute for a particular situation, it would be nonsensical to ignore it.
The current trend of trying to paper over biases, while generally born of noble sentiment, probably only perpetuates the problem. Because it's usually done far downstream, and doesn't necessitate change at the source. Functionally, it's like a cover-up by a large corporation.
> If it turned out that race was the most accurate attribute for a particular situation, it would be nonsensical to ignore it.
Only if you are optimizing for prediction accuracy. If you want to optimize for something like "justice" or "citizen wellbeing" then you might want to come to a different conclusion.
Some of the most interesting and thorough public work on just how problematic these algorithms are was done by ProPublica[0]. They have a series of articles specifically focused on how machine learning/algorithms can either enable or create racist outcomes[1] that are all worth reading.
The most encouraging thing about this approach is the change to a data-driven systematic approach from the previous “whatever the judge thinks by looking at you” approach. It really doesn’t matter if the algorithm is completely correct, as once you make the conceptual change to using such a system, you can tune the parameters to fix any problems that arise.
Unfortunately, no detail on the actual algorithm in question. Is it executed manually or is it using a statistical model optimized for some tradeoff of metrics?
Some anecdotal outcomes [0], [1], [2] of the system below. In general, people were outraged for a few months and local news seemed to mention it in most drug-bust stories, but that seems to have died off. My law enforcement friends were beside themselves at first, but I think both sides are adjusting to the system.
I think in general I agree with it. Without context of witnessing the [alleged] crime and without the algorithm in front of me, sometimes I wonder if certain situations have exposed some holes in the algorithm. In addition to the three cited below, I remember reading about a guy in Ocean County being released after they found him with $1M of drugs... maybe his supplier took care of that problem for the community...
I got arrested 26 months ago. I'd gone back to the hospital with a court order that said the hospital's behavior was not in compliance with the law, and that they had to let their patient (my friend) go.
I made the mistake of going without a police escort. The ER staff didn't know that their employer actually had no legal authority to hold their patient in their psychiatric ward. I called the police and was patiently waiting for their help to enforce the court's order, but before the police arrived I irritated one of the security guards and they all attacked me. I was charged with 'misdemeanor assault' and was released on my own recognizance after a night at the city jail.
Two months later I got the police report, and learned what I'd allegedly done. 9 months later I learned the "old man" security guard that freaked out on me was a retired police officer. He knew exactly what to say to the responding officer to cover his ass, and was well-practiced at testifying on the stand. It's easy to make shit up if you know what you're doing. His co-conspirator was kind of a doofus - he'd been a hospital security guard for 20+ years.
Two months after my trial, my girlfriend got arrested - her court-ordered mental health professionals didn't think her substance use was of any concern to them, and were heroically treating her presenting symptoms without any concern as to their cause. While I was arrested for assault (misdemeanor/victim), she was arrested for heroin possession (felony/victimless). Her bail was $5000. The various bail bondsmen I called weren't interested, or wanted more collateral than I had - apparently people arrested for drug charges have a tendency to not show up for their court dates. She spent 2 weeks in jail while her attorney negotiated with the justice system on her behalf.
(The system she was arrested into has since made some efforts to keep people out of the jails, or to reduce bails, but I don't know how these changes would've affected her experience...)
In a separate anecdote, my ex-wife recently picked a fight with her then-boyfriend (California). He responded to her provocation by escalating the fight into a very physical confrontation. Bystanders called the police, he got arrested... Bond was set for $50,000. He bailed himself out by charging $5,000 to his credit card with a bail bondsman. The charges were dropped a few days later. The bondsman apparently got to keep all $5,000 (edit: I think the term 'racket' applies).
I don't know that an algorithmic approach to justice will be much help at this stage, because the system is so corrupt. Any algorithm will have to start with "Is there an actual victim [0]? No: release without bond. Yes: [...]". I don't think the system will take well to considering that they've spent decades making work for themselves...
> He knew exactly what to say to the responding officer to cover his ass
Replacing human judgment with algorithms is a terrible idea for many reasons, but this might be the worst. Algorithms are deterministic, and thus they can be manipulated. "SEO" services exist in spite of Google's efforts to protect their ranking algorithm. Does anybody seriously believe their local government will be more successful than Google at preventing people (i.e. police, prosecutors, etc) from "optimizing" their input to the algorithm?
edit - justification for the claim that a bail algorithm would be easy to manipulate:
You lead an interesting life. Your comment reminded me how curious it is that some people tend to have regular (yearly-ish?) clashes with police and the criminal justice system, and others go their entire life without anything more than a "Good morning" with a cop.
Well. There are factors that increase your risk of dealing with cops: 1) being non-white (or not the same skin color/ethnic background as the majority population) 2) being in any way "non-conforming to mainstream" - e.g. wearing dreads, "gangsta clothing", looking like a punk or basically anything that makes you stand out.
Cops always go for those looking different than the mass when they're bored.
The longer your history, the more the cops will treat you like a criminal during even the most trivial interactions. Because of the large amount of discretion they have, the more scrutiny you get, the more likely you are to wind up with a lengthier criminal history.
Old men who have an (irrelevant) DUI from the '70s on their history get asked if they've been drinking when they roll a 4-way stop on their way to church on Sunday morning.
If a series of stupid policy decisions is made that constricts jail cell supply so much that even dangerous suspects are not detained, then yeah.
But there is a lot of social benefit to not having people sit in jail awaiting trial if they don't need to. People on limited income can't afford to miss work or not be around to take care of their kids or other loved ones. People who turn out to be innocent still end up experiencing difficult life consequences as a result of unnecessary jail time.
I think the public interest is served by reducing the number of people in jail in a more fair and transparent manner. It will take sensible policy-making to make sure that it doesn't result in unintended consequences (as some others have pointed out) down the road, but it's worth starting down the road in the first place.
That situation in the UK is simply due to under-funding.
The article points out that the current US situation uses economic status as a reason for keeping people in jail, more so than actual risk to the public. That results in huge over-use, and (as pointed out) going to jail even briefly increases re-offending risk.
[1] http://www.arnoldfoundation.org/wp-content/uploads/PSA-Risk-...