The lawyer didn't just cite "bogus" cases; when challenged, he attached the entire "bogus" case contents hallucinated by ChatGPT (attachments to #29 at the link above)
Thanks for the great context. The lawyer should be disbarred. He doubled down when he was caught, and then blamed ChatGPT. What do you bet he was trying to settle really quickly to make this all go away?
There are plenty of people on the internet (including here) who think ChatGPT is a "smart expert" and who don't understand that ChatGPT can easily make up stuff that looks very convincing at first glance.
And if you challenge them, they also double down and say "ChatGPT is the future" etc.
And I'm putting 90% of the blame for that on the mainstream media, reporting irresponsibly and perhaps maliciously in order to milk a fear cycle out of AI. And some of it on OpenAI/Google execs, now cynically drumming up nonexistent existential threats to facilitate regulatory capture.
Is it an interesting and relevant story that many people would be keen on reading? Nah, must be a "msm" conspiracy. Can't wait for the Weinstein take on this.
Stuff like it is certainly the future though. The primary flaw of OpenAI's GPTs is that they're just dumb text transformers: "what's the next best word to follow?" It just so happens that they can answer a variety of questions factually, simply because they were trained on real (and imaginary) facts.
They need to be backed up by a repository of hard knowledge, using the transformer part only to generate sentences based on that knowledge.
At the same time, as much as it currently hallucinates, it's still nothing compared to the misinformation that humans perpetuate on a daily basis.
The original bogus citations may be excusable as a genuine misunderstanding of ChatGPT, i.e. he falsely thought he had a research assistant feeding him accurate quotes.
But there is simply no good-faith excuse for filing the transcripts of the cases without so much as skimming them, once doubts had been raised. I’m not a lawyer, but even a cursory look at the Varghese case transcript shows that it’s gibberish: the name of the plaintiff changing inexplicably, the plaintiff filing bankruptcy (of two different kinds) as a tactical move, etc. Another transcript purports to be about a passenger suing an airline over being denied an exit-row seat. As soon as you start reading the “transcripts”, you see that something is seriously off about them, compared to the two real (but irrelevant) cases cited.
Of course I do not know, but he should have come clean: "Hey, I can't find this case in Westlaw, but ChatGPT found it and produced it." Instead he just submitted it as-is, right out of ChatGPT. Alarm bells had to be going off in his mind when a federal court decision in a lawsuit came to less than 5 pages.
Forward-thinking of him to try out ChatGPT for his work. Nothing wrong with experimenting with a potentially helpful tool.
But just as I review and correct the code snippets it produces, he should have verified the results, because nothing indicated that they were any good (besides the fact that they were well written).
I'm pretty sure plenty of other lawyers are experimenting with ways to use ChatGPT without being quite as naive.
It matters in terms of remediation: incompetence implies that lawyers require better technical education on LLMs, while malice implies that the lawyer has violated an already established rule or law.
Lawyers undergo continuing legal education throughout their careers; in many (most?) jurisdictions, it’s mandatory. “LLMs are not legal search engines” as a CLE topic in the next decade would not surprise me remotely.
That should be read as "the presence of incompetence implies that those incompetent lawyers [...]." Sorry if you found the phrasing ambiguous.
(The only really important part of the original comment is the part about CLEs: we have an entire professional educational system that ought to be able to accommodate subjects like this.)
Yes but be merciful to an unfortunate fool who believed in technology! ChatGPT proved, like the Ouija board, to be the very voice of Satan himself for this lawyer. Bwahahahaaaaaah!8-)
I think the big question is... what was this guy doing 2 years ago? Was his stuff real work, or was he finding a less sophisticated way of phoning it in?
It seems improbable that someone who did all the hard work and knew how to do it would suddenly stop doing that. Such a work ethic tends to be habit-forming, or so I had thought.
The lawyer kept digging the hole deeper and deeper, and (as a non-expert) I agree that it seems that the lawyer is at serious risk of being disbarred.
Interesting documents are from #24 onwards:
- #24 (https://storage.courtlistener.com/recap/gov.uscourts.nysd.57...): "unable to locate most of the case law cited in Plaintiff’s Affirmation in Opposition, and the few cases which the undersigned has been able to locate do not stand for the propositions for which they are cited"
- #29: attached the cases, later revealed to be a mixture: made up (bogus) for some, real but irrelevant for others
- #30 (https://storage.courtlistener.com/recap/gov.uscourts.nysd.57...): "the authenticity of many of these cases is questionable" - polite legal speak for bogus. And "these cases do exist but submits that they address issues entirely unrelated to the principles for which Plaintiff cited them" - irrelevant. And a cutting aside that "(The Ehrlich and In re Air Crash Disaster cases are the only ones submitted in a conventional format.)" - drawing attention to the smoking gun for the bogus cases
- #31 (https://storage.courtlistener.com/recap/gov.uscourts.nysd.57...): an unhappy federal judge: "The Court is presented with an unprecedented circumstance. A submission filed by plaintiff’s counsel in opposition to a motion to dismiss is replete with citations to non-existent cases. ... Six of the submitted cases appear to be bogus judicial decisions with bogus quotes and bogus internal citations". This PDF is worth reading in full; it is only 3 pages, and excoriating.
- #32 affidavits, including the ChatGPT screenshot
We're a very tech-forward law firm, and we're bullish on AI. The issue is that lawyers are traditionally tech-illiterate, and they treat Gen AI like a search engine that puts results in narrative form. Realistically, I think AI-generated motions and contracts are the future, and this instance will be pointed to by every tech-averse lawyer trying to stymie progress in the field. These lawyers deserve their sanctions for being so reckless with things they don't understand, but rather than taking away the lesson that lawyers need to learn tech, lawyers will say tech is bad. I almost wish this were a non-story so it wouldn't push the legal industry further into the past, but those clients were wronged, and I guess people need to know what to beware of when hiring a lawyer.
Personally, we get really good results from using AI; it's already present in all of our processes, but we tell it what to generate rather than relying on it to know better.
It's not just lawyers who think that ChatGPT is a search engine. I've observed this many times in my vicinity; people from all walks of life think that Star Trek is here and computers now respond accurately to natural language queries. For non-techies, "just asking the computer" is so much more convenient than translating your question into traditional search queries.
So I guarantee you that stuff like this is happening daily across all industries. Depending on the profession, people will lose money or get hurt as a result of someone blindly trusting this technology. I can't prove it, but statistically, that's basically a certainty.
In my opinion, you can't overstate the importance of articles like this, which point out the limitations and highlight the dangers. I'm also against banning it. But lay people need to be informed about what ChatGPT is and is not, and OpenAI won't do it because they want to ride the hype train.
In local-to-me politics, we have a report on changing the admissions process for specialty programs in the public school system, which had a bunch of fake citations and which people suspect was written with the "aid" of ChatGPT.
Computer, what's the formula for transparent aluminum? Seriously, I got ChatGPT to spit out a scientific-seeming paper on the formula and manufacturing process for transparent aluminum. It did note that there's a real thing, aluminum oxynitride, which is the closest thing we have to the Star Trek material. It even wrote the following abstract, based on my prompt:
> This scientific description provides an overview of the formula and manufacturing process of transparent aluminum, a material used in applications where both structural strength and transparency are required. Transparent aluminum finds extensive use in diverse fields, including public aquaria, where it allows for the display of large marine organisms. The description outlines the chemical composition, key properties, and the manufacturing steps involved in creating transparent aluminum.
Whether the six-step manufacturing process it came up with is correct, I haven't the expertise to say.
Idk, it's pretty reliable even now if you chuck in a vector DB of "knowledge" and inform the GPT in the overall prompt that it must not go outside the bounds of the knowledge provided with the user's query, and that if no hard knowledge is provided it should respond along the lines of "I don't know" (or search the internet and parse the results for them).
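Something like this, as a minimal sketch (the shape of the vector-DB hits, the score threshold, and the prompt wording are all illustrative assumptions, not any particular product's API):

```python
def grounded_prompt(query: str, hits: list[tuple[float, str]],
                    min_score: float = 0.75) -> str:
    """Build a prompt that bounds the model to retrieved passages.

    `hits` is assumed to come from a vector-DB similarity search,
    as (score, passage) pairs.
    """
    passages = [text for score, text in hits if score >= min_score]
    if not passages:
        # No hard knowledge retrieved: instruct the model to say so.
        return ("No reference material was found for this question. "
                "Reply that you don't know rather than guessing.\n\n"
                f"Question: {query}")
    context = "\n---\n".join(passages)
    return ("Answer using ONLY the reference material below. If it does "
            "not contain the answer, say you don't know.\n\n"
            f"Reference material:\n{context}\n\nQuestion: {query}")
```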
I imagine at some point this behaviour will be built in. People are treating GPTs like they're knowledge bases, but they're not. The fact that they can answer simple or complex questions correctly is only a byproduct of the transformer being taught how to string language together.
It's like an artist who learns to draw by making paintings of raccoons. You can't then ask them "what do raccoons like to eat?" or "what foods are poisonous to a raccoon?" just because they learnt to draw by painting raccoons. This is how people are treating GPTs atm. They believe in it because they ask the artist "what colour fur do raccoons have?" and because the artist can answer that correctly, they assume all other answers are factual.
This is the main reason I think disbarment as the punishment in this specific instance may not be fair. There are people who are unaware of the limitations of these systems and the risk of these confabulations occurring.
While I don’t think disbarment is inappropriate, I would rather see the New York State Bar use this to require some better understanding of these emergent technologies or even better have all the State Bars start discussing some standardized training about this because it’s easy to see a person trying to treat this as LexisNexis.
If your doctor asked ChatGPT to tell him how to remove your appendix, followed the directions, and subsequently removed a kidney instead, would you want him to lose his medical license?
For sure, but the difference there is that someone was actually severely wronged. The worst that happened here was some people had time wasted.
Personally, I think a punishment where the lawyer had to pay for all the time he wasted for the judges and various legal clerks (and his client) would be sufficient.
He is unlikely to make the same kind of mistake again I would think.
> The worst that happened here was some people had time wasted.
At least one party to the suit, if not both, is going to end up spending extra money. Plus it wasted public resources (the time of the judge and court staff, their salaries, and more) and cost taxpayer money. Your remedy of having the lawyer pay might bankrupt him, and it doesn't really make the other party whole. In cases where one party has limited financial resources or perhaps is close to death (think capital punishment, or malpractice), this isn't just waste. Someone could be severely wronged.
That is not a meaningful difference when the issue is that both professions can use tools but are responsible for the results of using them, and are thus obligated to apply their professional judgment before and after using a tool that can hurt people (in this case, their clients, at least)
Is disbarment about fairness? Is the primary goal of such proceedings to rehabilitate and apply a sort of justice?
Certainly, civil and criminal courts have those as their raison d'être. But I thought licensing boards had an entirely different purpose. If a surgeon were a good guy who genuinely wanted to help people and who didn't engage in any sort of malfeasance, but even so kept slicing aortas open accidentally through incompetence, should the board say "aw shucks, he's had some bad luck, but he really wants to heal people"?
This is the same. The court system is replete with circumstances where a client does not get a second chance at pursuing justice. A lawyer who fucks that up, even if doing so in good faith, leaves the client with zero remedies. This might have been a bullshit "Slippin' Jimmy" case this time, but the stakes could've easily been higher.
I don't think I want to live in a world where fairness plays any part in the decision by the bar on this matter.
Disbarment is usually considered a punishment of last resort: the attorney's failure to carry out their obligations is so absolute that it justifies taking away their right to practice law in a given state. There are certainly other measures available here that carry a similar rebuke, just not as final. A suspension or temporary disbarment is also possible.
We don’t know the full situation here, but a personal injury case against a bankrupt airline for striking someone in the knee with the serving cart seems remediable?
Disbarment usually happens in cases where attorneys repeatedly fail to file in a timely manner at the expense of their clients, and after multiple admonishments to stop; or where they utterly fail in their fiduciary obligations (e.g., they were acting as an escrow agent and instead gambled the money away in Vegas).
> that it justifies taking away their right to practice law in a given state.
This seems a little weird. As far as I understand it, no one has a right to practice law.
There is a privilege that can be acquired, if one meets the requirements. If you somehow got through without meeting those, or if you start to fail to meet them... time for a new career.
> We don’t know the full situation here, but a personal injury case against a bankrupt airline for striking someone in the knee with the serving cart seems remediable?
I don't know about this particular case, but many cases and circumstances can be a "one shot at it" scenario. You fuck it up, it's tossed and you can't refile. There are many reasons and details, any of which might be messed up by a lawyer relying on a silly chat program to draft motions. One might miss an absolute deadline. It might be dismissed with prejudice. Appeals might be exhausted. This could even be true of the case in question.
In some cases, it might even be true if it was a criminal trial and your defense attorney was incompetent, that you don't get a chance to appeal. In California, I think, those are Marsden cases (someone correct me if I'm wrong). For those, you have to raise an objection during the trial.
So, if someone found out that ChatGPT gave their lawyer bad advice the day after their conviction... well, oops. No appeal for you.
I'll say it again. I do not want to live in a world where law license proceedings are decided on a "what's fair to the bad lawyer" basis. No one has a right to be a lawyer, if you're bad at it there are plenty of other occupations you might make a living with where incompetence doesn't threaten so many lives and livelihoods.
Imagine getting fired and barred from writing code ever again over a bug you introduced because you used copilot and didn't spot the issue.
Pretty sure that would be considered an unacceptable infringement of basic human rights here.
You can assert your ideals all you want but the fact is that professions that govern themselves invariably end up with "what's fair to the bad lawyer".
Licensed professionals are licensed (should be, there are notorious exceptions) because if those professions remain unlicensed, horrible things happen.
Lawyers and medical doctors are two of those. Yes, it would be wrong to prohibit the Starbucks barista from making coffees, no matter how many times such a person burned it.
Software engineering probably falls between licensed professional and burger-flipper on that scale... but let's not fool ourselves. If you were working on firmware for medical equipment, then yes, banning you from ever doing it again because you used ChatGPT when making a heart-rate monitor is just and fair.
Not all of our software matters. But the people working on code for space vessels or aircraft or as in my example, medical equipment? I'm more than happy to see them banned from these things for life if they were to do that.
> You can assert your ideals all you want but the fact is that professions that govern themselves invariably end up with "what's fair to the bad lawyer".
This is irrelevant. We're all aware of how oversight tends to underperform. The point is to fix that, to push back against its decline. I certainly don't know why anyone would want to embrace your attitude of defeat and acceptance.
Out of curiosity, why do you think AI generated contracts are the future? Do you draw a distinction between contracts generated by AI and, say, contracts “generated” by first-year associates (i.e., using precedent to generate a first draft appropriate for the deal that’s then iterated by more experienced lawyers)?
Also, how is this incident going to push the legal industry further into the past? Do you think lawyers are going to, like, stop using email because of this?
... retrieval augmented search is here today and is available in ChatGPT with plugins or integrations with vectorDB systems. A lot of AI systems are "search engines that give you narrative outputs"
Man, why do you guys get paid so much? Contracts need semantic correctness, i.e., the verification of the logical consequences of natural language. That is an AGI-complete problem. By the point such a thing exists, humans will be basically obsolete as workers, and you won't have to worry about your law firm keeping up anymore.
I claimed that it was a task most people can do, but one that is really, genuinely, intelligent. I actually thought it was everyone, but then you responded with your comment.
Contracts have lots of standard clauses, or near-standard variations of such, and as a consequence plenty of "dumb" template-based generators already exist; you may come across contracts where just the occasional line was manually written.
At least one company already integrates LLMs into theirs, or is about to (not sure if it's in production yet), to do effectively smarter completions.
It won't be fully automated any time soon, but it will certainly eat into a lot of the simpler work.
> Judge Castel said in an order that he had been presented with “an unprecedented circumstance,” a legal submission replete with “bogus judicial decisions, with bogus quotes and bogus internal citations.” He ordered a hearing for June 8 to discuss potential sanctions.
Disbarment should be a no-brainer and a minimum.
Just “Southern China Airlines” should have raised eyebrows. This lawyer has shown obvious disrespect to the court, the court time, and to his client.
The lawyer should lose his license. Imagine they had turned in this brief and follow up and ChatGPT had not been involved. Instant disbarment. ChatGPT is not an excuse for a professional to produce nonsense in front of the court. Bye bye.
Yeah, interestingly, being bad at lawyering isn't a typical reason for discipline. Discipline is integrity- and process-based: things like stealing client funds or failing to communicate promptly and reasonably with the client.
Arguably the leading goof was using a technology the lawyer didn't understand and failing to inform the client of the risks of using it. Between that and citing garbage precedent, half the bar might be eligible for discipline on any given day. The judge might issue sanctions, but bar discipline is a different ball of wax.
The way I see it, AI cannot claim legal ownership of the output produced; it belongs to the human who generated it (e.g., someone using generative AI to produce a painting can copyright the result).
Which makes me think this case should be treated just as if the lawyer had written it all on his own. If the lawyer enters ChatGPT output into court records as if it were his own, it was his own. The lawyer wrote those made-up cases into the documents, and the entire matter should be treated as such.
It blows my mind that the lawyer didn't double-check the fake cases cited by ChatGPT, but did have the idea to ask ChatGPT whether those citations were legit (and then had his concern satisfied with a simple "yes").
I suspect a lot of lawyers should be disbarred, GPT aside. They don't always care about their clients. I've often been more expert than my lawyers in France, and I've seen at least one guy go to prison where the lawyer, publicly shamed by a dozen YouTubers for not pulling the various correct levers, offered the excuse that he "only had an hour to review the case".
What are we paying for, if the guy spends 4 months in prison before the faulty judgment is overturned, and the lawyer says he didn't even work on the file?
When you hire a lawyer, you have no guarantee he will work for you.
The guy’s YouTube name was “Marvel Fitness”; if you search, you’ll find dozens of videos pointing at (among the guy’s own mistakes) the lawyer’s mistake (in French):
The title of the made-up case seems to be "China Southern Airlines", which is correct. But it is misquoted in Document 31 (Order to Show Cause) as "China South Airlines".
He swore under oath that everything in the affidavit was correct. So he lied under oath, which means he committed perjury. Where I live, perjury carries a 3-year jail sentence; however, it is very rarely enforced.
The standard for perjury is (generally) a standard of belief, which is much stronger than factuality. Ignorance of what ChatGPT is, even shocking ignorance, may not clear that bar.
In other words: I would be very surprised if perjury was on the table here. This falls under the kind of basic competence and client obligation that the Bar exists to address.
I don’t think disbarment is out of the question (and that would indeed send a very strong message), but I disagree that this comes anywhere close to a serious risk of prison for this lawyer.
Some context: any litigator will have access to Westlaw or Lexis-Nexis to look up and verify cited authorities like cases. It’s considered bad practice, at best, to cite authorities that one has not reviewed—for example, case citations drawn from a treatise or article.
As a practical matter, it is inconceivable to me that the attorney here, at least upon being ordered by the court to provide copies of the cases he cited, did not look them up in West or Lexis and see that they don’t exist. That he appears to have pressed on at that point, and asked ChatGPT to generate them—which would take some pointed prompting—was just digging his own hole. That, more than anything, may warrant professional discipline.
"Hm, maybe I should double check ChatGPTs output... Hey, ChatGPT, does your output make sense?" - "Yeah, my output definitely makes sense". "Are you sure?" - "Yeah".
People are scarily willing to take eloquent text on trust if it says something that benefits them and sounds plausible.
The biggest problem with ChatGPT is that it sounds like a smart adult that people want to give the benefit of the doubt, because it agrees with them a lot, rather than like a child whom people would be more wary of trusting blindly.
The willingness most people show to jump on the bandwagon when hearing confident-sounding language actually explains a lot of societal problems. Also a lot of workplace dynamics, and the careers of a lot of very powerful and rich people.
> Mr. Schwartz said that he had never used ChatGPT, and “therefore was unaware of the possibility that its content could be false.”
> He had, he told Judge Castel, even asked the program to verify that the cases were real.
These two ideas are incompatible with each other. You can't claim that you didn't know to question the source, and then also that you questioned the source, even if it was done in the least effective possible manner.
To quote Richard "Racehorse" Haynes in the Wikipedia article:
"Say you sue me because you say my dog bit you. Well, now this is my defense: My dog doesn't bite. And second, in the alternative, my dog was tied up that night. And third, I don't believe you really got bit. And fourth, I don't have a dog."
So, here the defence is:
* I didn't believe the content could be false,
* Even if it is legally determined that I (beyond a reasonable doubt) knew that the content could be false, I asked the program to verify that the cases were real.
There are more details in the Wikipedia article, but I believe this is legally valuable because a defendant is required to formally enter a defence and cannot easily change it.
While I have no reason to doubt that it's valid to argue "A || (!A && B)", in this particular case B => !A.
When the lawyer brought evidence that he had tried to verify the information (B), that evidence itself automatically disproves his plea that he didn't think the content could be false (A).
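Spelling the logic out, with A = "I didn't believe the content could be false" and B = "I asked the program to verify the cases":

```latex
% Pleading in the alternative is consistent in general:
A \lor (\lnot A \land B) \equiv A \lor B
% but asking for verification presupposes doubt:
B \implies \lnot A
% so the evidence offered to establish B simultaneously refutes A.
```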
(Note that this didn't necessarily have to be the case: for example, if he had claimed that it was his assistant who asked the verification question, rather than he himself.)
So, unless he made the second plea and brought the evidence at a later time, shouldn't he have just skipped A altogether and preserved his credibility? (And possibly avoided a perjury charge? IDK if in an American court both claims would be sworn statements.)
Perhaps, but it can be confusing if the listener interprets it as a witness making a factual statement, leading to the rhetorical question "were you lying when you said A, or else when you said not-A?"
For legal arguments, it's more like "we contend that you can't prove A, and even if you can, you can't also prove B, and both A and B must be proven for legal liability." Which most people can understand isn't inconsistent at all. That's why the legal & ethical guidelines spell it out.
This comes from the fact that sometimes you claim something that you can't prove.
Suppose you got rid of your dog a month ago. If that's true then the non-existent dog certainly didn't bite anyone. But you still have dog food and leashes and there is a dog registered to you, so that may be tough for you to prove. Even if it's true.
But now suppose you can establish that there was a fresh pie on the doorstep when the plaintiff claims to have been there getting bit by your dog. If there was a dog loose in your yard at the time, the dog would have eaten the pie. Since that didn't happen, if you had a dog then it must have been tied up. It's also perfectly consistent with you not having a dog, but it doesn't help you prove that because it's equally consistent with you having a dog that was tied up.
The reason this makes people uncomfortable is that the system is supposed to work, but you can easily imagine a case where you in fact don't have a dog but there was also no pie, so the only way for you to win is to establish the thing the jury disbelieved. People don't want to have to conclude that the system would arrive at the wrong outcome in that case, therefore how dare you claim you don't have a dog when there is some evidence that you do.
Related article: "End of the Billable Hour? Law Firms Get On Board With Artificial Intelligence, Lawyers start to use GPT-4 technology to do legal research, draft documents and analyze contracts" [0]
Critical thinking skills are more important than ever in the age of AI. Used correctly, ChatGPT(4) can sometimes be a huge time saver, but you cannot believe all the bullshit it serves you.
The thing is that if you are not an expert in the field you cannot tell gibberish from legit facts. Especially if the writing style and grammar are top-notch.
I know it is a bad bias, but we typically associate good, clear writing with legitimacy. Here we have ChatGPT, which can do exactly that, yet spit out complete BS.
One thing I like about phind.com is that it ties each specific assertion to the specific web page it was drawn from. That allows me to check the sources.
However, like all generative AI, it’s good at forming narratives, and not many people are aware how powerfully influential narrative frames are because people rarely step back to examine the frame itself.
> The thing is that if you are not an expert in the field you cannot tell gibberish from legit facts.
Some answers are more easily verifiable than others.
If I ask about an explanation of quantum mechanics, I might check if the names of particles and equations are correct (which they probably will be), but verifying the overall reasoning is basically the same as knowing the subject in the first place.
But if I ask 'tell me the 5 most important neutrino experiments performed at the LHC and their outcomes', I can relatively easily check the results by finding the papers, their citation numbers, and reading the abstracts - I don't need to understand every detail. Maybe it will have missed one that should have been on the list, much like a manual search could have, but I won't fall for an outright hallucination.
And if I ask about, oh I don't know, let's say to find some precedents for a case I'm working on, it's straightforward to take the case names that the LLM spat out, look for them in a legal database, and see if the judge actually ruled what the LLM claimed he had.
Or use it to retrieve alleged facts from its own output, pass those alleged facts to another tool (like a legal search engine) for verification, and then get GPT to edit its output accordingly…
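A sketch of that check-and-correct loop (everything here is hypothetical: `lookup_case` stands in for whatever legal search API you actually have, and the citation regex is deliberately naive):

```python
import re

# Naive pattern for "Name v. Name" style citations; a real extractor
# would handle "In re ...", reporters, pin cites, and much more.
CASE_PATTERN = re.compile(
    r"[A-Z][\w.'-]+(?: [\w.'-]+)* v\. [A-Z][\w.'-]+(?: [\w.'-]+)*")

def unverified_citations(llm_output: str, lookup_case) -> list[str]:
    """Extract alleged case citations from model output and return the
    ones the external database cannot find. `lookup_case` is a
    placeholder for a real legal search client, assumed to return
    None for unknown cases."""
    cited = set(CASE_PATTERN.findall(llm_output))
    return sorted(c for c in cited if lookup_case(c) is None)
```

Anything that function returns goes back to the model with an instruction to drop or replace it, rather than into a court filing.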
Yes, he is guaranteed to have access to some case-library software.
Why he didn't verify the existence of these cases, spat out by a tool he was using for the first time, is beyond me. It would have taken him a few minutes at most to discover the fake ones by skimming the bogus references (assuming some legal competence).
Events like this are needed to show to society the flaws of the technology and possible misuse. They can then lead to regulations or guidelines, or even lawsuits that can further define how and when this technology can be used.
In a thread on Reddit I saw some people who were using ChatGPT as their personal lawyer around some tech issues, and the community seemed to have absolute confidence it was providing good advice. I tried to comment that I wouldn't trust GPT as a legal reference and got downvoted into the deep negatives.
I'm guessing we will see a lot more of these type of issues. ChatGPT is sometimes useful but sometimes full of misinformation and many people don't seem to be wary enough of the potential for error.
Unpossible! I’m a 20x coder who spends hours a day writing regex, tried asking it a high-school-level coding question, and it answered perfectly. I’m 100% sure that it’s just as accurate everywhere else.
Even if the lawyer had carefully checked and corrected ChatGPT's answers, isn't he going against his professional duty by submitting private information to a website that doesn't ensure confidentiality?
I'm only peripherally involved in the law field, and I am aware that ChatGPT and similar do not consistently provide correct legal citations. It also gets things wrong some of the time; the only way to tell is to be an expert or to look up the facts yourself. Don't use ChatGPT for legal writing, nor for anything else that requires accuracy.
When it's correct, it can be a good search aid. But for a lot of things, it is just incorrect with high levels of confidence.
You can also ask it Bluebook questions and it will often get the right answer. At other times, it will get the right answer but cite to the wrong rule (not that it matters that much).
Another issue is that it can cite to the correct case, but misunderstand what it is citing to. You can be really specific and ask something like "what is the x-factor test from Doe v. Doe" and it will get three factors correct and invent the other two.
The thing with law, though, is that there are often already many quick reference materials that have already been extensively published that will get you the answer you are looking for more quickly than you can get it through either search or a chat interface. Many state bar associations make available the equivalent of a "practice area in a box" full of checklists, templates, and other material geared towards making it possible for you to start working in that area almost immediately.
I have had it be useful in course correcting my research in an unfamiliar area of law. I was wasting a lot of time reading secondary sources and cases that were not relevant to my problem because I knew nothing about that area of law and my search queries were just leading me in unproductive directions. ChatGPT pointed me towards a more relevant case that opened up the rest of my research for me using conventional tools like Westlaw. It saved me a lot of time. But I did not use it at all for the final work product and never used it blind without looking at a source.
That's right: generative model outputs are worthless unless checked by a human. You are using it right. For the moment I don't think there is a single domain where AI can work on its own; autonomy has been reached in 0% of fields. That makes me think the removal of the human from the loop will take a long time. We are still safe; AI will be our sidekick.
It's crazy how AI seems to progress at incredible speed and yet we don't get closer to full autonomy anywhere. It's as if we discover new problems at the same speed we solve them. Just 5 years ago nobody would have thought hallucinations would become a central issue in AI; we may discover other unknown unknowns hiding in our future.
I've been thinking that it's funny how these AI tools are framed as assistants, but it seems they are actually the opposite. They are great at big picture stuff but sloppy when it comes to details. So the more logical division of labor is to make the human the assistant.
GPT-4 hooked up to a legal DB and instructed to respond negatively if no knowledge from the DB can be referenced would've solved this completely.
Even a hard link that rewrites the prompt from "use the provided knowledge to answer the question" to "no knowledge could be found; inform the user of this fact in a way that keeps the conversation fluid, and recommend that they search other sources themselves" would be more reliable.
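A minimal sketch of that hard link, where plain code rather than the model decides which instruction gets sent (`retrieve` is an assumed placeholder for the legal-DB lookup):

```python
def build_prompt(query: str, retrieve) -> str:
    """Deterministic branch: the code inspects the retrieval result
    and rewrites the prompt accordingly. `retrieve` is a placeholder
    for a legal-DB search returning a list of matching passages."""
    passages = retrieve(query)
    if passages:
        knowledge = "\n".join(passages)
        return ("Use the provided knowledge to answer the question.\n"
                f"Knowledge:\n{knowledge}\n\nQuestion: {query}")
    # Hard link: nothing found, so swap in the refusal instruction.
    return ("No knowledge could be found. Inform the user of this fact "
            "in a way that keeps the conversation fluid, and recommend "
            f"that they search other sources themselves.\n\nQuestion: {query}")
```

Because the branch happens outside the model, it can't be talked out of it the way a purely prompt-level instruction can.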
The problem, as always, isn't the tool. It's the tool using the tool.
Hmmm, I suppose we could, if there was a focus on it.
However it's always interesting to watch how people react:
* Traditional programming creates very strict functionality but is not flexible/is rigid.
* Okay, so let's train models to act like biological life, which is super flexible but has a chance to make mistakes.
But then people complain about the latter making mistakes. Realistically, though, we can combine the two to ensure that what a model says is based on fact.
The easiest way would be just for people to interact with an LLM to browse hard facts and then double-check via sources. But relying on humans not to be lazy is about as good as relying on an ML model to act like traditional programming, so it's probably better to train our wild model with as many built-in constraints as possible and then decorate those with more traditionally programmed hard limits.
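A toy example of such a traditionally programmed hard limit layered on top of model output (the rule itself is just an illustrative assumption):

```python
import re

def quote_check(model_output: str, source_text: str) -> list[str]:
    """Hard limit: every quoted passage in a model-written summary
    must appear verbatim in the source text. Returns the offending
    quotes. Crude, but unlike the model it is binary and auditable."""
    quotes = re.findall(r'"([^"]+)"', model_output)
    return [q for q in quotes if q not in source_text]
```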
It's really hard though, because of the output data. We're going from a binary output of 1/0 for traditional programming to a continuous output of 0-1 for models; if I ask it to summarise something, whether it's correct depends not only on whether it summarised based solely on the original text, but also on the reviewer's individual biases and wants/needs as to whether the summarisation is sufficient.
The continuous nature of an LLM's response does make it difficult to determine whether it's sufficiently factually correct, though. If it's not repeating word for word, you need to be able to parse the output in order to check it... using another LLM (so compounding error...).
I was able to coax ChatGPT into inventing an entirely fake trend in cocktails and distilling, complete with stories about newly-opening bars and made-up concoctions (including recipes) supposedly created and served at these bars.
(The article does not say anything about this, but I'd guess that if they are as clueless as they claim to be by calling it "a source that has revealed itself to be unreliable", they may well have been using the free version without knowing about the latest version.)
OpenAI deserves some blame here. Their product UI does a very poor job conveying that it will just make things up when asked for specific information. And in the conversation there's no particular hint that the confidently stated "facts" are not in fact true. A large number of people get tripped up by this when first using ChatGPT. I did myself, it took me a day or two to realize the papers or web pages it was claiming existed didn't. It's pathological.
This problem will be solved with LawGPT which will be a generative pre-trained transformer trained on a vast array of legal cases, rulings, and briefings and designed with accuracy in mind and free from hallucinations.
It will cost $3,400/month for a legal firm the size of Levidow, Levidow & Oberman, P.C.
I, Saul Goodman, attorney-at-law and mastermind of legal brilliance, on behalf of the humble but feisty David Rosencrantz, submit this amicus curiae brief to the esteemed Supreme Court. This Court, known for its unflinching pursuit of justice, now has the opportunity to rectify a grave miscarriage of justice perpetrated by the lower courts.
SUMMARY OF ARGUMENT
The case before this honorable Court presents a classic tale of a downtrodden individual, David Rosencrantz, pitted against the formidable Goliath that is the United States government. Rosencrantz, a simple taxi driver trying to earn an honest buck in this dog-eat-dog world, has been unjustly accused of tax evasion. However, this Amicus respectfully submits that the government's case is nothing but smoke and mirrors, an elaborate scheme to crush the dreams of an honest, hardworking citizen.
ARGUMENT
I. Violation of Due Process
The lower courts, in their zealous pursuit of victory, trampled upon the sacred principles of due process. Mr. Rosencrantz was denied his fundamental right to a fair trial when the government's star witness, a notorious jailbird with a penchant for singing, was allowed to testify without any corroborating evidence. The government's reliance on the word of a jailhouse informant, who likely had ulterior motives, is an affront to the Constitution.
II. Reasonable Doubt
Ladies and gentlemen of the Court, this case is riddled with reasonable doubt like a cheap suit in a thrift store. The government's case rests on flimsy evidence and a convoluted web of speculation. The alleged tax discrepancies are based on questionable calculations and fuzzy math, all while conveniently ignoring legitimate deductions and expenses rightfully claimed by Mr. Rosencrantz. Reasonable doubt casts its long shadow on this entire proceeding, and it is the duty of this Court to bring it into the light.
III. Selective Prosecution
This Court has never turned a blind eye to injustice, and I implore you to scrutinize the government's motives. Mr. Rosencrantz, an unassuming member of society, has become the target of selective prosecution. Why, might you ask? It's because he dared to stand up against the establishment, questioning the powers that be. This case is not about taxes; it is about silencing dissent and punishing those who refuse to play by their rigged rules.
CONCLUSION
In conclusion, the government's case against David Rosencrantz reeks of a witch hunt, an attempt to crush the spirit of an ordinary citizen who dares to question authority. The violation of due process, the presence of reasonable doubt, and the specter of selective prosecution are grave injustices that cannot be ignored. This Court, armed with the sword of justice, must rise above the fog of deceit and deliver a resounding verdict in favor of Mr. Rosencrantz.
For these reasons, I, Saul Goodman, implore this Court to right the wrongs committed by the lower courts and restore faith in the sanctity of our legal system.
Original court documents: https://www.courtlistener.com/docket/63107798/mata-v-avianca...
In the second #32 affidavit, there are screenshots of ChatGPT itself! https://storage.courtlistener.com/recap/gov.uscourts.nysd.57...
A legendary example for the legal risks of hallucination in LLMs https://en.wikipedia.org/wiki/Hallucination_(artificial_inte...