Hacker News
Police are using ChatGPT-4-powered body cams that turn audio into reports (forbes.com/sites/thomasbrewster)
15 points by giuliomagnifico 11 months ago | 25 comments



The main thing that keeps me from using AI for things like document summaries is that it regularly trips up on the nuances in logic: mistaking positives for negatives when multiple points are being discussed, misattributing motivation, and so on. I can't think of a much worse use for it than police reports.

You could say that because the bodycam footage still exists it won't be a problem, but the point of these reports is to let decisions be made without reviewing the full footage. People are going to get dragged through court based on this, and if they're lucky, they'll come out with thousands spent in legal bills just to get someone to review the bodycam footage where they said the opposite of what the report claims.


Humans make the same mistakes when writing up reports too. I'm curious which is more reliable.

For example, I've done interviews for various media outlets, and often enough they report that I said the opposite of what I actually said, for reasons similar to the ones you mention above.


> interviews for various media outlets

In that case the errors are not accidents. They start the interview with a conclusion to get to, and whatever you say will be cut and pasted to push that narrative.


I hate when they do that. They're always out to get me. They're tricky like that. They have evil agendas. Damn them.


My suggestion is that humans are still noticeably better at this, not least because humans are able to backtrack and re-adjust their interpretations, which LLMs cannot do in the same way. An LLM's "thought process" and "output" are essentially the same thing.

Now maybe I'm wrong about this, but even if so I think it's still a risky change. We have understood human error for pretty much all of recorded history. We have a very poor understanding of computer error, to the point that there are some legal systems where computer software is assumed to be correct by default.

Courts and laws around the world are designed to cope with humans making mistakes and lying.


> Humans make the same mistakes when writing up reports too

Sure, but then again humans are all independent and biased in different ways, while the automated tools we replace them with will make the same mistakes and apply the same biases over and over again.

One is a single entity, the other is a multitude of independent entities.


If the folks at OpenAI value the company's future, then they should get on top of this and stop them. Axon's representatives claim that they have "turned off the creativity" for GPT-4 Turbo. Full quote here:

     > Axon senior principal AI product manager Noah Spitzer-Williams told Forbes that to counter racial or other biases, the company has configured its AI, based on OpenAI’s GPT-4 Turbo model, so it sticks to the facts of what’s being recorded. “The simplest way to think about it is that we have turned off the creativity,” he said. “That dramatically reduces the number of hallucinations and mistakes… Everything that it's produced is just based on that transcript and that transcript alone.”
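(As a technical aside, "turning off the creativity" presumably refers to something like decoding with the sampling temperature at or near zero. That is my guess; Axon has not published its configuration. A minimal sketch of what such a setup could look like with an OpenAI-style chat API, with the model id and prompts invented for illustration:)

    from openai import OpenAI

    client = OpenAI()

    transcript = "[00:01:05] Officer: Stopped a vehicle for a broken tail light..."

    # Assumption: "turning off the creativity" ~ near-deterministic decoding.
    # Lower temperature reduces variation between runs; it does not, by itself,
    # stop the model from misstating what is in the transcript.
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # the article only says "GPT-4 Turbo"; exact model id assumed
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Draft an incident report using only facts from the transcript."},
            {"role": "user", "content": transcript},
        ],
    )
    print(response.choices[0].message.content)

Even at temperature zero the output is still next-token prediction; determinism is not the same thing as fidelity to the transcript.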
For a company that was founded to safeguard us against AI risk, it is striking that no one at OpenAI thought about the risk of people being imprisoned over the outputs of their next-token prediction models.

Perhaps it is my personal bias rearing its head, but it is striking to me that the company currently lobbying Congress for AI regulation over "AI risk" (including regulation that would forbid others from training models) had no one capable of making the observation: "if our LLM leads to innocent people being jailed, that will make us look very bad."


It’s quite a stretch to jump straight to innocent people being jailed because they’re using an LLM to summarise audio.


> It’s quite a stretch to jump straight to innocent people being jailed because they’re using an LLM to summarise audio.

98% of cases end in plea bargains, especially for lower-level offences. These cases are decided in assembly-line fashion based on summaries and reports. It will be a blue moon when someone at the DA's office sits down and listens through the audio.

The DA will use these summaries to pressure someone poor and not that well educated into taking a plea deal. Their public defender will do the same.

And the innocent person will often be so frightened out of their mind that they will say yes.

It happens every day.

    > Pleas can allow police and government misconduct to go unchecked, because mistakes and misbehavior often only emerge after defense attorneys gain access to witness interviews and other materials, with which they can test the strength of a government case before trial.
https://www.npr.org/2023/02/22/1158356619/plea-bargains-crim...

    > Eyster believed that Sweatt was innocent of the drug charges against her. “This is a hardworking woman who lived in a heavily policed community for 10 years,” she told me. “If she were a drug dealer, she would have already been evicted. She doesn’t have a history of drug use.” But the idea of taking this case to trial was a nonstarter. The best path forward, Eyster decided, was to humanize Sweatt to the prosecutor—hence those time sheets—and then try to negotiate a plea bargain. In exchange for a guilty plea, the prosecutor might not recommend a prison sentence.

    > The strategy worked. The prosecutor reduced the charge from a felony to a Class A misdemeanor and offered Sweatt a six-month suspended sentence (meaning she wouldn’t have to serve any of it) with no probation. Her paraphernalia charge was dismissed, and her conviction would result in a fine and fees that totaled $1,396.15.
https://www.theatlantic.com/magazine/archive/2017/09/innocen...


Implicit in your argument is that the AI summary will be more biased than a report written by the arresting officers. I am not sure that is the case; a more neutral summary could actually reduce bias, which would run counter to your argument.


I think the people who keep harping about bias are trying to convey a narrow point. My point is broader. Probabilistic models are probabilistic.

I am simplifying the behavior of these systems, but my argument is twofold. First, just because an output has a high probability of being correct doesn't mean that any particular output is correct. Second, "low probability" events that are acceptable in a limited use case are disastrous in broader use.

For example, if I were being generous, I'd say that GPT-4 makes an error 0.1% of the time, i.e. 99.9% of the time it doesn't make an error. That is a generous assumption: I use this model daily, and in my limited sample the error rate has been higher than that.

If you are dealing with 10 cases, a 0.1% error rate is immaterial. If you are dealing with 100,000 cases, that's 100 cases where an error was introduced.
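A quick back-of-the-envelope check (the 0.1% figure is my generous assumption from above, not a measured rate):

    # Expected number of erroneous AI-written reports at an assumed error rate.
    error_rate = 0.001  # assumed: 0.1% of summaries contain a material error
    for cases in (10, 100_000):
        expected = error_rate * cases
        at_least_one = 1 - (1 - error_rate) ** cases  # P(at least one bad report)
        print(f"{cases} cases -> ~{expected:g} bad reports, P(>=1) = {at_least_one:.4f}")

At 10 cases an error is unlikely to show up at all; at 100,000 cases you expect about 100 erroneous reports, and at least one is a near-certainty.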

The true error rate is likely to be higher; see, for example, https://www.ncbi.nlm.nih.gov/corecgi/tileshop/tileshop.fcgi?...

Is it acceptable for a few hundred to a few thousand innocent people to face legal action because an overgrown next-token prediction model made a mistake?

I love GPT-4. I love its promise, but I don't think it should be implemented in safety-critical situations.


I am also worried about officers "gaming" the system to bolster whatever stat their superiors want them to optimize, to the detriment of those being "policed".

We've all seen how prompting can change GPT output and this is what worries me the most.

Imagine officers finding out that starting every recorded interaction with a certain sentence acts as a pre-prompt that subtly biases the GPT output, making judges or prosecutors less sympathetic to defendants, e.g. when the police department is optimizing for higher conviction rates. (That is, doing the opposite of the public defender "humanizing" the drug-case defendant in the plea-deal example quoted above.)

There will be a lot of reports transcribed, people will find ways to optimize the output to their gain, and we've all seen how easy it is to bias LLM output by prompting.
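To make that concrete, here is a minimal, entirely hypothetical sketch of how such a pipeline might assemble its prompt. Nothing here reflects Axon's actual system, but if the raw transcript is pasted into the prompt, then anything said on camera reaches the model verbatim and can act as framing or as an injected instruction:

    # Hypothetical prompt assembly for a transcript-to-report tool.
    SYSTEM = "Write a factual incident report based only on the transcript below."

    def build_prompt(transcript: str) -> str:
        # The raw transcript is concatenated into the prompt unchanged.
        return f"{SYSTEM}\n\nTRANSCRIPT:\n{transcript}"

    # A rehearsed opening line spoken for the camera becomes part of the model
    # input, framing the encounter before any "neutral" summarization happens.
    transcript = (
        "[Officer] Be advised, subject is extremely agitated and non-compliant.\n"
        "[Subject] I was only asking why you pulled me over.\n"
    )
    print(build_prompt(transcript))

Guarding against this means treating the transcript as untrusted input, not just tuning the summarization prompt.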


I interviewed with these guys. At the time I could not fathom why you would want to pair LLMs with body cams and thought it was a case of hype and of sticking LLMs in everything. What’s next, toasters? Guess this is a lesson to think a little more outside the box. But the certainty that hallucinations will happen, and the potential to use models or prompts that prefer a certain version of the truth, makes me very worried.

Also, they make tasers. What could go wrong?


Axon has a history of doing (imho) shady crap, like claiming Tasers won’t kill people or pumping up “excited delirium” claims.


Let's hope this never gets accepted in the courts, because we all know AI never makes mistakes.


Not to mention alignment. AIs are anything but neutral. An AI might refuse to report that a PoC was acting aggressively for fear of appearing racist. Just an example.


Yeah. Refusing to arrest a "PoC" is definitely the thing to be worried about here. Because if there's one thing American police are known for, it's their deference to "PoC".

Seriously, what the hell? Your first reaction to this article is that you're worried it will prevent "PoC" from being arrested? At what point in American history has that ever been an issue?


My first reaction was to raise a concern about alignment, and I came up with an example that has nothing to do with current or past events, instead describing a hypothetical future scenario that might play out if current AIs, which are (IMO) tuned to be exceedingly sensitive to racial issues, were tasked with issuing a report; they would thus likely fail to be neutral.


Great. Police reports can be leaked through GPT leaks without needing FOIA hassles now. ;)


My dad's a doctor, and the amount of paperwork you need to write these days has gone up a whole bunch, even higher for him than the 40% of their time mentioned in the article. There's a big temptation to fudge the reports too, since it's mostly checkmarking to CYA for lawsuits. Thus, I think this is actually one area where, because the people writing the report are somewhat misaligned with the purpose of the report, ML-powered report writing is a better solution.


If the purpose of the report is to CYA, then you 100% do not want an AI filling it out. Your A will not be C'd.


It would be, if it became standard practice and the report-filling software were an FDA-approved medical device.


A report which may or may not contain hallucinations and random shit will never be enough to cover your ass.

How would that work?


Reports can already contain errors and random shit. All you need to cover your ass is the ability to pass liability onto someone else. If the errors are because of a software defect rather than because of user error, your ass is more covered, not less. Though it's true that certain sorts of errors in certain sorts of reports could be problematic in and of themselves.


From what I've heard, a lot of doctors and cops make up shit in reports today anyway. If the summarization cites back to the source in the audio transcript and generally does RAG stuff, I actually think it's more aligned with the interests of the public, especially in cases where the doctor or cop wants to fudge or smooth over a bad situation.
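If traceability is the goal, one hypothetical mechanism (my own sketch, not anything Axon has described) is to require every sentence of the generated report to cite transcript timestamps and to flag anything that can't be tied back to the audio:

    import re

    # Hypothetical convention: the model must end every report sentence with the
    # transcript timestamps it relied on, e.g. "... refused to exit the vehicle [00:03:12]."
    TIMESTAMP = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]")

    def unsupported_sentences(report: str, transcript: str) -> list[str]:
        """Flag report sentences with no citation, or whose citations don't
        match any timestamp actually present in the transcript."""
        known = set(TIMESTAMP.findall(transcript))
        flagged = []
        for sentence in (s.strip() for s in report.split(".")):
            if not sentence:
                continue
            cited = TIMESTAMP.findall(sentence)
            if not cited or any(ts not in known for ts in cited):
                flagged.append(sentence)
        return flagged

Anything flagged would go to a human reviewer rather than straight into the case file.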



