Show HN: Full text search on 630M US court cases

pseingatl · on Feb 20, 2022

A lot of commentators are missing the point. Caselaw is what lawyers rely on to determine what the law is. For years, there was a monopoly, the privately-held West Corporation in Minnesota. In the 1970s, the field became a duopoly, with the advent of Lexis. West eventually sold out to Thompson, the company is now Thompson-West. Computer-assisted legal research was extremely expensive. The Air Force set up FLITE, but efforts to access their database of cases failed. Manual research is effective, but paper publication meant you ran the risk of not knowing yesterday's results. In the 90's with the advent of the Internet, a lot of new players moved into the scene; some more successful than others. Public-interest organizations sued for access to the cases, which, after all, were all in the public domain. There was a Washington-based group that called the caselaw the "Crown Jewels" of this effort--though EDGAR filings probably have more significance for the public. Time has provided a solution; access to anything older than the most recent 400 volumes of the Federal Reporter is really not needed, as any Thompson-West salesman will tell you. But these efforts to open caselaw to the public are not scammy and have nothing to do with shaming. In the United States, these cases state the law which governs our lives. I'm glad they've succeeded in the effort.

TheMiller · on Feb 20, 2022

A minor correction: West was bought by Thomson (no "p"), which later merged with Reuters to become Thomson Reuters.

dspoka · on Feb 27, 2022

Do you have any good write-up on this? I was always interested in how this space is still locked up.

drewmol · on Feb 19, 2022

I recently had some criminal charges expunged, and I notice they show up here. Is there any way to request removal of court records which are no longer publicly available from the originating court?

richardbarosky · on Feb 19, 2022

This is a possibility that there aren't any great solutions for currently. Can you message me on reddit with the link to check?

jka · on Feb 19, 2022

I'm not a lawyer:

In the absence of a reporting mechanism for issues like this, I'd suggest at least a notice / message alongside results to indicate that they may not reflect the current state of official and amended records.

(I think you may be wise to take this issue fairly seriously; there's a risk of people considering the search engine to be an authority in itself -- which, to be fair, is already a risk for any search engine, but since this one is more domain-focused, it's possible that some users could overdevelop a sense that the results are accurate and complete)

richardbarosky · on Feb 19, 2022

This is stated in simple language on the terms page, which is linked at the top/middle of every page. You have to decide between putting the same text on every page vs. a high visibility place vs. a low visibility place. I opted for the 2nd to make sure it's clear.

jka · on Feb 19, 2022

Do most people read and comprehend terms pages before using the information they discover from search engines? (I don't know)

runnerup · on Feb 19, 2022

Absolutely not. A vanishingly tiny percent ever click on the terms link.

sharken · on Feb 20, 2022

Indeed, so the only good option would be to only search public records.

Everything else should not be searchable.

haswell · on Feb 20, 2022

It’s good that this is listed in the terms, but this raises two concerns:

1) People don’t read terms. Should they? Yes. Do they? No.

2) This language is more like a disclaimer than a term of use. I would not assume that disclaimers about the accuracy of the result would be found in the terms.

So yes, the terms link is prominent, but no, I don’t think it addresses this issue.

michaelmrose · on Feb 20, 2022

Wouldn't it be easier for people to just individually sue not so much to achieve the end result but to shut down your operation under the costs incurred?

drewmol · on Feb 19, 2022

I tried your HN handle on Reddit; it says the user does not exist.

ngold · on Feb 20, 2022

I'm curious as well. Reddit me at napoleongoldfinger if that's cool?

richardbarosky · on Feb 19, 2022

aoeusnth48

boomer918 · on Feb 20, 2022

Aren't there laws protecting people against this? Expunged records should not be visible, right?

richardbarosky · on Feb 20, 2022

If a record is made public, it's always part of the public record. If a case has been expunged, those results are removed though because it's the right thing to do.

Flashtoo · on Feb 20, 2022

There are in Europe, e.g. the GDPR, but not in the US afaik.

tomrod · on Feb 19, 2022

Neat. I'll add this to my sources on case law -- another one I've come across is https://case.law/

Per my close friend, the value of these (or, why people subscribe to LexisNexis) isn't solely the texts, but the cross referencing. It would be really cool to see that get implemented (and no doubt a non-trivial problem!).

How do you source your case inputs, as it is bigger than PACER?

richardbarosky · on Feb 19, 2022

CourtListener is a free source that does this very well for high-level courts. (i.e., US Supreme Court, Federal Courts, State Courts of Last Resort/State Supreme Courts).

For that, you have to detect references of cases which is a difficult problem itself, and CourtListener's search ranking also takes into account the citation weight of certain cases. This generally works well, but my understanding is that sometimes a not-so-important case can end up having many citations. Or if a case with many citations is overturned completely or partially, these things complicate which cases might be most relevant in search results too.
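To make the citation-weight idea concrete, here is a minimal sketch. It is not CourtListener's actual ranking; the field names (`relevance`, `citations`, `overturned`), the log dampening, and the penalty factor are all assumptions chosen for illustration.

```python
import math

def rank_cases(cases, citation_weight=0.5):
    """Sort cases by text relevance boosted by a dampened citation count."""
    def score(case):
        # log1p dampens large citation counts so a heavily cited but
        # marginal case does not completely swamp text relevance.
        boost = 1.0 + citation_weight * math.log1p(case["citations"])
        # Hypothetical penalty for cases flagged as overturned.
        if case.get("overturned"):
            boost *= 0.25
        return case["relevance"] * boost
    return sorted(cases, key=score, reverse=True)

cases = [
    {"name": "A", "relevance": 0.9, "citations": 0},
    {"name": "B", "relevance": 0.6, "citations": 200},
    {"name": "C", "relevance": 0.6, "citations": 200, "overturned": True},
]
print([c["name"] for c in rank_cases(cases)])
```

Here the heavily cited case B outranks the better text match A, while the overturned case C drops to the bottom. That is the kind of trade-off described above, and the weights would need tuning against real data.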

The data source is provided for each case. In some cases, a direct reference/link is provided.

dd36 · on Feb 20, 2022

lpf.io

heavyset_go · on Feb 20, 2022

Cool, I guess. I'm not really a fan of anything that can be used to further persecute people who have already paid their debts to society. I know this is public information, but the information itself has been relatively opaque for generations.

Beyond mild curiosity, the only paying customers for a service like this will be groups like employers, schools, creditors, landlords etc. It removes the pain around paid background checks, and it includes data that was already legally expunged.

The fact that it includes data that was legally expunged actually makes this service more valuable than traditional background checks to certain types of people.

asimilator · on Feb 20, 2022

Legal research services (access to dockets, case text, etc.) tend to be extremely expensive. I think making that more widely available is a good thing for people with limited resources. It's kind of nuts how impenetrable the legal system is if you don't have any resources.

richardbarosky · on Feb 20, 2022

Great point.

It's extremely difficult to represent yourself pro se if you don't have access to information about how cases like yours might unfold, arguments that have been used, how well those arguments have worked, how the cases have been decided, whether a company has settled a similar case as yours, and so on.

arcticbull · on Feb 20, 2022

Is that really a bad thing? Nobody should be representing themselves because they have no idea beyond Law and Order what all is supposed to be happening around them. That's why you get a complementary attorney if you can't afford one.

It's just too important to risk no?

Like DIY surgery. It's quite expensive and impenetrable to be doing your own appendectomy and I'm not mad about that. In both cases you could if you really had to but a high barrier to entry for me is not a bug but a feature.

HWR_14 · on Feb 20, 2022

> That's why you get a complementary attorney if you can't afford one.

Not in US you don't. Leaving aside how you don't have a right to a lawyer when you are on either side of a civil matter, most states will send you a bill should you avail yourself of the "a lawyer will be provided for you" part of the Miranda rights. Now, if you are broke, they'll still provide services to you.

In Canada you actually get free legal advice before any questioning.

iudqnolq · on Feb 20, 2022

To clarify, that's not a bill you only have to pay if you get a windfall. That's a bill that (in some places) will be aggressively collected as a part of your fines.

If you're in prison you also have nearly no rights to a lawyer. (For example, if you want to sue the prison system for inhumane conditions)

Edit: For ex you may need to pay fees to be allowed to drive, which you may need to go to work/buy food.

wolverine876 · on Feb 20, 2022

> that's not a bill you only have to pay if you get a windfall

You can't get a windfall as a criminal defendant, just a conviction or not.

iudqnolq · on Feb 20, 2022

That's (mostly) true, but unrelated to what I meant. I meant to say if the government determines you can't afford a lawyer, it will (sometimes) bill you for the lawyer it provides. Sometimes if you're very poor you get debts that in theory you have to pay but in practice won't be pursued unless you get a windfall, because you don't have any money. This isn't one of those.

wolverine876 · on Feb 21, 2022

I see what you mean - a windfall independently of the criminal prosecution. That makes sense.

pseingatl · on Feb 21, 2022

Anna Sorokin got 300k from Netflix. Incarcerated individuals may still inherit. Sometimes money comes. The States are not always aggressive in pursuing such funds; but the Feds usually (not always) are.

ngold · on Feb 20, 2022

So poor people with an interest in surgery should have zero access to medical journals? This is all just taxpayer funded information that you and I pay for every year in taxes.

iudqnolq · on Feb 20, 2022

Just like surgery, your argument only works given effective public funding.

nanidin · on Feb 20, 2022

Our legal system is built on case law. Reading the law as it was passed is not enough - one must consult rulings in the relevant jurisdiction for precedent. I think the benefits of making this type of information widely available outweigh the potential downsides.

CPLX · on Feb 20, 2022

What does that have to do with listing the calendar calls of 25 year old divorce proceedings

vixen99 · on Feb 20, 2022

You and I have no idea but does that mean no one else might and for reasons of which we'd been unaware?

greggsy · on Feb 20, 2022

Absolutely, but I think the general public tends to completely misinterpret what they’re looking at and will often make faulty conclusions if they’re not given the full context provided through formal training (eg. Dr. Google).

The repercussions of an employer doing their own unqualified background check on someone can be detrimental to a person’s future wellbeing.

richardbarosky · on Feb 20, 2022

When you say used to persecute, that's not accurate in any legal sense. Many states have statewide repositories of court data. Also, when you say "can be used for", that's extremely broad and as far as a standard for taking a position on something, I don't think that makes a lot of sense. Do you mind clarifying?

loxias · on Feb 19, 2022

Fantastic. Love it. Wish I could download the whole 630M DB, not just 700K cases from Texas.

I especially love the interface. It's light and fast. Not unnecessarily burdened by JavaScript. Bravo to that.

richardbarosky · on Feb 19, 2022

thank you!

cryptnotic · on Feb 19, 2022

Today I learned that 20 years ago I was a defendant in an unlawful detainer (eviction) lawsuit regarding an apartment I shared in college. I had moved out of the apartment after graduation. Apparently my roommate stopped paying the rent and the landlord sued both of us. I was never served and didn't know about the case until now.

reaperducer · on Feb 19, 2022

Today I learned that my boss has a lead foot. 15 speeding tickets in six states, all over 85 MPH.

amelius · on Feb 19, 2022

Sounds like a privacy nightmare, though.

giantg2 · on Feb 20, 2022

That would be more of a problem with the legal system, not this search page.

boomer918 · on Feb 20, 2022

That's not true. Court records used to be accessible in person only. You would have to be motivated to find anything. This puts everyone's past and often forgotten transgressions on display.

This might look cool or even useful to some, but it's straight up immoral.

EvanAnderson · on Feb 20, 2022

Court records, including traffic violations, have been accessible online in the jurisdiction where I live for nearly 25 years. They've been aggregated from that website by third-parties for most of that time, too (which I know because I've done contract sysadmin work for one of the courts for most of that 25 years).

ghaff · on Feb 20, 2022

You and others are essentially arguing that we should put artificial barriers in place for accessing information that is public as a matter of law. Arguably there is information that's public today only because the circumstances at the time when that decision was made were different. But the solution would be to make less information public if it's a problem. Not to tell people they can only access this information when the town clerk is in the office--and he doesn't come in much.

distances · on Feb 20, 2022

I disagree. That is an excellent throttling mechanism. Keeps records open but preserves a modicum of privacy.

It's another topic if court records should be public at all, but I gather that's a pretty integral element of US judicial system.

giantg2 · on Feb 20, 2022

How does that work? Lawyers and others can run searches and have for decades. This isn't anything new, except now you don't have to pay a lot of money.

Either it's private or it's not. Fake throttling that discriminates based on one's ability to pay or suffer inconvenience is ridiculous. Not to mention, this information has been available in this format for decades.

chaps · on Feb 20, 2022

Mixed feelings on this, and a lot of it unfortunately stems from these systems requiring access in-person only as well as courts generally being inaccessible through FOIA. Had these records been available through FOIA from the beginning, for example, then these records would have gone through a review/redaction process. But, they're being released now, which can in many ways be seen as a reaction to the lack of access to this information, generally. The extensive overuse of courts for non-violent cases definitely doesn't help either.

As a researcher, there are deep problems with the inaccessibility of court information in that it prevents the general public from learning about systemic issues, for example identifying extensive abuse by judges (singular, or in a group), or identifying whether bail is applied uniformly.

I don't know what the solution is and things get trickier the more you look at them. Restrictive access isn't a perfect answer, since it allows gatekeeping of those critical. Having talked with lawyers who have access, they basically have to keep completely out of public spot light while they have restrictive access, at the fear of losing it. And our massive systems around incarceration have shown themselves as being uninterested in providing information to those who are critical of them. We've dug ourselves into a pretty deep hole.

giantg2 · on Feb 20, 2022

The documents being released are currently publicly available. It looks like records of minors either don't show up, or don't show details. It also looks like expunged convictions do not have the case showing up.

FOIA doesn't apply for two reasons. One is like you say - many court documents are considered privileged and not subject to FOIA (which I agree has many issues around things like complaints). The other is that documents that are publicly available like this don't need to be requested through FOIA since they are already available.

I can go to my state's website and get the same information as on this site. The thing this site does is allow you to search all state's for free. There are plenty of sites that will allow you to search for people like this, but they currently charge money.

chaps · on Feb 20, 2022

Access to justice and information about justice shouldn't be gatekept through money. I understand where you're going with it, but it's very close to the same reasoning used to justify ex[tp]ensive bail -- it only fucks the poor.

giantg2 · on Feb 21, 2022

"Access to justice and information about justice shouldn't be gatekept through money."

I agree. My comment was about how this site allows that access for free. And also that FOIA is moot for this information - it's publicly available without a request.

chaps · on Feb 21, 2022

My point is that we should have gone through a FOIA route from the beginning.

And no, a lot of important information still isn't publicly available, so we still have these issues that need to be ironed out for the exact same reason as I'm describing. This release still only scratches the surface. I'm able to get court documents from city law departments, but the process takes forever (ten complaints per week, for example) and the alternative is to go downtown to work with computer systems that have poor search capabilities.

Had we been (and been able to be) aggressive from a FOIA perspective from the beginning, the inefficiencies of these systems (e.g., segmenting how private information is stored in court records) would be more ironed out.

giantg2 · on Feb 20, 2022

"This might look cool or even useful to some, but it's straight up immoral."

Again, that's an issue with the legal system. States make this information available to the public online, and have done so for a long time. This is an aggregation, and similar services have been around for a long time.

If you want this information to be private, then you need legislation to be passed. Frankly, this isn't even the most immoral thing the system is involved with. For example, 2-10% of the incarcerated are wrongly convicted. Or the fact that complaints and misconduct of judges are so secret that even if they contain exculpatory evidence they are not required to be exposed. Or that magistrates in most places are not required to have a law degree nor pass the bar, leading to the farcical outcome that the lawyers arguing the case have more knowledge of the law than the "judge" who is supposed to be the authority on the law. The list goes on and on. One day, enough people will be screwed over by the system that there won't be support for it anymore.

spaetzleesser · on Feb 20, 2022

Very true. We have to rethink a lot of the rules around privacy, considering the scale at which technology has made mass surveillance possible.

tinco · on Feb 20, 2022

Why? There's no private information being shared right?

spaetzleesser · on Feb 20, 2022

Technically no but it makes a difference if you have all that information available easily through the internet vs having to put some effort into obtaining the records one by one. Same goes for surveillance: technically you have no reasonable expectation of privacy in public spaces but once it becomes feasible that somebody can plaster a whole state/country with cameras and can analyze the data automatically it gets a little scary. I think a lot of privacy rules need to be revised in the context of current and future technological capabilities.

amelius · on Feb 20, 2022

Not officially, no. But that's not the only possible interpretation.

Pixeleen · on Feb 20, 2022

There are records here connecting my legal name to my deadname, not even on a court name change order. I had gone to lengths to keep things private. This is devastating. I want to cry.

TigeriusKirk · on Feb 20, 2022

I learned there's an attorney with my exact name and middle initial and his kajillion filings obscure my traffic tickets.

trhway · on Feb 19, 2022

That is great. Regular people's access to information is a great power equalizer. I had lost a small case - fine print and a lot of undelivered promises - after 3 lawyers said I'd lose, and won it on appeal after finding in an online database (not available anymore, sadly) a similar precedent referring to the law exactly for my situation. According to Yelp and case search, the company I had this case with was regularly taking people for a ride, and the people very grudgingly paid hundreds to several thousands of dollars a pop, mostly because of the fine print, and I became the first with a winning case in that list.

richardbarosky · on Feb 19, 2022

That's a great use case. Thank you for sharing!

hbcondo714 · on Feb 19, 2022

OP submitted this site in November 2020 with 400M cases[1]. Other than the increase in cases, what else has changed?

[1] https://news.ycombinator.com/item?id=25150702

richardbarosky · on Feb 19, 2022

Right, more cases primarily. The performance has been optimized so the searches, search result pages, and individual pages load significantly faster. Most searches load in under 200ms and most pages including SRPs load in less than 20 ms. Search syntax improvements (see info page for details). The search is still not very granular and field-specific, but definitely an area of improvement.

hbcondo714 · on Feb 19, 2022

Thanks, I think it would be nice to have a changelog.

dang · on Feb 19, 2022

Not as a criticism but just FEI (For Everyone's Information), reposts are ok on HN after a year or so. This is in the FAQ: https://news.ycombinator.com/newsfaq.html.

magicjosh · on Feb 19, 2022

Here's Steve Jobs' speeding ticket: https://www.judyrecords.com/record/vde11sdzw25ac

richardbarosky · on Feb 19, 2022

hmmm, middle initial checks out. though it's possible it's another steve.

sva_ · on Feb 19, 2022

Was trying to find speeding tickets of John von Neumann, but in vain. It would be nice if one could limit search by years.

jonbraun · on Feb 19, 2022

“One does not have to be a Richard Feynman to figure out that 200 tons is 100% greater than 100 tons.“ https://www.judyrecords.com/record/dhuql2nm6942

hervature · on Feb 19, 2022

Apparently importing a Jaguar through Canada went horribly wrong for him: https://www.judyrecords.com/record/0vctgni5684d

sva_ · on Feb 19, 2022

> Argued and Submitted June 3, 1981.

John von Neumann died in 1957. The name is a bit generic, so many results show up. Hence I wished there was a way to limit search to a range of years.

hervature · on Feb 19, 2022

Good call, now I'm embarrassed. I should've known that. Funny how the mind works. I knew he died in his 50's and was involved in the Manhattan project but somehow was content lumping him in with all the other scientists from Operation Paperclip and using loose math that 1981 was possible.

airstrike · on Feb 19, 2022

This seems pretty good at first glance but there's significant room for improvement. Since this is HN, allow me to nitpick...

- "630M" is a big number, sure, but I don't have a sense for what % of total court cases it corresponds to. Is it closer to 10% or 90%? And either way, which ones are included vs. excluded? What were the criteria used? Accessibility, date, costs?

- I get the artistic view behind the choice of typography but the font is just too large. I find myself having to scroll to get just as far as the 5th result. Information density is good in search engines

- The results consist of two pieces: the name of the court (followed by "record", which is unnecessary) and a short snippet, but not the actual name of the case... which is an interesting choice given that the name of the case is stored in a database field as evidenced by the fact that it is in the <title> tag of any detail view

- I also think the snippets are too short. Together with the previous point, this site is basically forcing me to click on each potential match to see if it is what I wanted or not

- The URLs are... interesting. Searching for anything takes you to "https://www.judyrecords.com/getSearchResults/?page=1" which does not identify your search. Somehow this is using GET but not storing the form input in the URL but locally somehow... so searching for "foo" in one tab, "bar" in a different tab, and hitting refresh on your "foo" tab will then show "bar" results there. Which is not only "Not Cool", but seems actually harder to accomplish than a straight up form using GET

- And then the actual results have URLs like "https://www.judyrecords.com/record/qxemfajbcae3". I'd be fine with a slug, really, but in 2022 I expect URLs to be API-like

- I can't search for specific cases, e.g. "paramount communications, inc. v. qvc network, inc" returns a bunch of results, none of which are the actual case I'm looking for which is a hugely influential precedent
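The URL point in the list above (keeping the query in the GET parameters so each tab and each refresh is self-describing) can be sketched roughly like this. It is only an illustration: the `example.com` host and the `q` and `page` parameter names are assumptions, not the site's actual routes.

```python
from urllib.parse import urlencode, urlparse, parse_qs

BASE = "https://example.com/search"  # hypothetical endpoint

def build_search_url(query, page=1):
    """Encode the search state in the URL so it can be bookmarked or refreshed."""
    return f"{BASE}?{urlencode({'q': query, 'page': page})}"

def read_search_url(url):
    """Recover (query, page) from a URL produced by build_search_url."""
    params = parse_qs(urlparse(url).query)
    return params["q"][0], int(params.get("page", ["1"])[0])

url = build_search_url('"brad knowles"', page=2)
print(url)
print(read_search_url(url))
```

With the state in the URL, a "foo" tab and a "bar" tab cannot overwrite each other, because nothing about the search lives in a shared session.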

richardbarosky · on Feb 19, 2022

Valid criticisms, thanks for pointing them out as areas of improvement. Good question about the % of total cases though I think there are some estimates on that. My guess would maybe be 100M+ cases per year.

Hammershaft · on Feb 19, 2022

> My guess would maybe be 100M+ cases per year.

If this is close then that blows my mind. That would be roughly 1 court case per ~2.5 American adults per year.
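For anyone checking the arithmetic, a quick back-of-the-envelope version follows. The adult population figure is an assumption (roughly the 2020 Census count of US residents aged 18 and over); the case count is just the guess quoted above.

```python
cases_per_year = 100_000_000  # the "100M+ cases per year" guess quoted above
us_adults = 258_000_000       # assumption: approximate 2020 US adult population

adults_per_case = us_adults / cases_per_year
print(f"about 1 case per {adults_per_case:.1f} adults per year")
```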

richardbarosky · on Feb 19, 2022

I think it depends on how narrowly you define what a court case is. The number doesn't seem too high if you factor in traffic cases. But you're probably right on a narrower definition, that would be too high.

ALittleLight · on Feb 19, 2022

Some random searches I did showed it does include at least some traffic cases. I saw one for speeding and another for texting while driving.

ghaff · on Feb 19, 2022

I note that this isn't just court cases. I have a long ago (paid) traffic ticket in there--well, not the ticket but a record pointing to a no longer existing ticket. (Maybe that's technically a court case though.) Something I wrote is also in a footnote to a patent filing.

paulnpace · on Feb 20, 2022

Whenever I see stuff from lawyers especially stuff with litigation, seems like the font size, typeface, and white space always punch me in the eye it's so gruesome.

chmod775 · on Feb 19, 2022

There are quite a few court cases with the string "ASDF".

Are people just being lazy? https://www.judyrecords.com/record/vfa3d40l07812

gamekathu · on Feb 20, 2022

I'm always amazed at the rampant patent trolling that happens with deep learning papers/ideas. In this dump, if you search with the names of famous researchers in ML (such as Yoshua Bengio [1] or Yann Lecun [2]) you will find 100s of troll patents citing their work. Not all of them are trolls, though. Maybe this corpus can be used to automatically identify them, perhaps by merging data from arxiv?

[1] https://www.judyrecords.com/record/vibvgc4w4e2bc [2] https://www.judyrecords.com/record/v9kbckceib0c9

YPPH · on Feb 20, 2022

Does anyone know why American court opinions tend to omit the word "the" before plaintiff and defendant? For instance

"Afterwards plaintiff sued defendant claiming damages".

In Australia and the UK, this would be

"Afterwards the plaintiff sued the defendant claiming damages".

In general US opinions seem more concise and formulaic than their Anglo counterparts. This is just one striking example. I'm just curious about the origin of this distinction. Perhaps there is some text on concise legal writing prescribed at US law schools which offers such a suggestion?

Another curious difference, it's an opinion in the US, a decision or judgment in Australia/UK.

not-a-lawyer · on Feb 20, 2022

American courts generally give very strict and tight page limits when submitting briefs, so American legal writing has evolved to be as concise as possible. The best lawyers regularly and repeatedly cut their briefs to length, removing any superfluous words. There's no requirement to omit "the" before plaintiff and defendant, but it's acceptable and saves space, so everyone does it.

YPPH · on Feb 22, 2022

Thanks, this is a very convincing answer.

NightMKoder · on Feb 20, 2022

Aren’t those usually capitalized as well? I’ve always thought that style in legal texts means “a proper noun defined previously” - in case of plaintiff and defendant, probably on the first page. That said I have no legal background so take that with a grain of salt.

YPPH · on Feb 20, 2022

They would appear capitalised on the cover page, not generally in text.

Here is a recent example from a 2022 SCOTUS opinion.

"In rejecting petitioners’ allegations, the Seventh Circuit did not apply Tibble’s guidance. [...] The court determined that respondents had provided an adequate array of choices, including “the types of funds plaintiffs wanted (low-cost index funds).”"[0]

By contrast, a decision of the High Court of Australia:

"The appellants applied to the Supreme Court of New South Wales for orders that the third respondent, a former director of Arrium, appear for examination and produce documents. Orders were also sought for the second respondent (the auditor) and the bank who advised on the capital raising to produce certain documents.[1]

[0] https://supreme.justia.com/cases/federal/us/595/19-1401/

[1] https://eresources.hcourt.gov.au/showCase/2022/HCA/3

HWR_14 · on Feb 20, 2022

At least in US contracts, it's fairly common to establish role based pseudonyms at the beginning of the contract. Especially for reused contracts. Presumably, the same style applies to our legal system.

For example, a contract may read:

This contract is between John Smith (hereafter employee) and XYZ LLC, a Delaware company (hereafter employer). Employee agrees to provide Employer with services for...

imajoredinecon · on Feb 20, 2022

Guessing: it’s an old legal system (not in relative terms, perhaps, but 250 years is a decent chunk of time) and a bunch of the language has stayed pretty similar over time.

beardedetim · on Feb 20, 2022

I don't think it's the plaintiff. It's John Doe, hereby referenced as Plaintiff.

bradknowles · on Feb 19, 2022

You should be able to do an exact match search here. Trying to use double quotes on my name turns up a boatload of hits, but most of them appear to be cases where my first name is found somewhere on the page, and somewhere else my last name is found somewhere on the page.

It should also be possible to limit the search by city, state, and/or region, as well as by timeframe.

Not very useful.

richardbarosky · on Feb 19, 2022

Exact match searches are supported.

Can you give an example?

Also, to limit by other qualifiers you can add those to the search criteria. However, the search isn't field-specific, so that can only be done loosely (like Google) and not in a strict field-by-field sense. It's difficult and time consuming, but something that could improve the search.

bradknowles · on Feb 19, 2022

How are exact match searches supported? There is absolutely no evidence on the page that this is possible.

How does searching for “brad knowles” match “brad alan knowles”, when I put my name in quotes? How does it match a case where “brad knowles” does not appear to be used anywhere on the page, but where one line matches “knowles” and then another line matches “brad”?

richardbarosky · on Feb 19, 2022

"brad knowles" - 14 results returned. The exact phrase is found in each.

brad knowles - 1925 results returned, which include results with brad and knowles in the text, where close proximity cases are ranked higher. Results with brad and knowles further apart will be ranked toward the bottom.

When you use quotes above, I'm not sure if that means what's in quotes above is what you searched or that you searched with quotes, but based on these checks and what I see, exact match searches as well as weighting without use of exact match quotes is working correctly.
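The two behaviours described in this reply (quoted queries requiring the exact phrase, unquoted queries ranking by how close the terms are) can be sketched as below. This is a toy illustration under my own assumptions, not the site's implementation; `tokenize`, `min_span`, and `search` are hypothetical helpers.

```python
import re

def tokenize(text):
    # Lowercase word tokens; punctuation and line breaks act as separators.
    return re.findall(r"[a-z0-9]+", text.lower())

def parse_query(query):
    """Return (terms, exact); exact is True for an ASCII-quoted phrase."""
    q = query.strip()
    exact = len(q) >= 2 and q.startswith('"') and q.endswith('"')
    return tokenize(q), exact

def contains_phrase(tokens, terms):
    n = len(terms)
    return any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))

def min_span(tokens, terms):
    """Length of the smallest token window containing every term, or None."""
    needed, last_seen, best = set(terms), {}, None
    for i, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = i
            if len(last_seen) == len(needed):
                span = i - min(last_seen.values()) + 1
                best = span if best is None else min(best, span)
    return best

def search(docs, query):
    terms, exact = parse_query(query)
    if not terms:
        return []
    hits = []
    for doc in docs:
        tokens = tokenize(doc)
        if exact:
            if contains_phrase(tokens, terms):
                hits.append((len(terms), doc))
        else:
            span = min_span(tokens, terms)
            if span is not None:
                hits.append((span, doc))
    hits.sort(key=lambda h: h[0])  # closer terms rank first
    return [doc for _, doc in hits]

docs = [
    "Brad Alan Knowles appeared",
    "Counsel Brad Knowles filed",
    "Knowles v. Smith, witness list: Brad",
]
print(search(docs, '"brad knowles"'))
print(search(docs, "brad knowles"))
```

Because this tokenizer treats a line break like any other whitespace, a phrase split across two lines still counts as an exact match in this sketch.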

bradknowles · on Feb 20, 2022

Here's a case that shouldn't match an exact search, because the two words are separated by a line break: https://www.judyrecords.com/record/nqt11i9y33dec

I've found other cases where most of the hits shown on the screen are for one word or the other, but not both together. There is at least one hit in each of those cases where the two words are properly found side by side and in the correct order, so it is technically a hit for the search. But the display is not correct, because when displaying the record it highlights each word hit separately from the others.

Using proper ASCII quotes to force an exact match instead of somehow getting smart quotes is definitely an improvement, but there's still more work to be done here with regard to line breaks and display of hits.

epberry · on Feb 19, 2022

You can put 3 commas after your name to search for an exact match, eg brad knowles,,,

Tempest1981 · on Feb 20, 2022

I think this is a smart-quotes issue -- smart-quotes are removed:

“brad knowles” -> brad knowles

"brad knowles" -> "brad knowles"

I always set my OS to disable smart-quotes. Worst is when you paste them into source code.
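For what it's worth, a search box could avoid the problem by mapping curly quotes to ASCII quotes before parsing the query, instead of dropping them. A minimal sketch, assuming a hypothetical `normalize_quotes` step on the server side (not the site's actual code):

```python
# Map the common "smart" quote characters to their ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}
_QUOTE_TABLE = str.maketrans(SMART_QUOTES)

def normalize_quotes(query: str) -> str:
    # Replace curly quotes so a phrase query typed on iOS/iPadOS
    # is treated the same as one typed with plain ASCII quotes.
    return query.translate(_QUOTE_TABLE)
```

After this step, a query typed with curly quotes would reach the phrase parser with plain `"` characters.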

richardbarosky · on Feb 20, 2022

Maybe that can be handled better automatically. Good point, thanks for mentioning.

bradknowles · on Feb 20, 2022

So, a couple of weird things here.

First, when I go to the website, whether it's the mobile version or the desktop version, the on-screen iPadOS keyboard is immediately hidden from me. I have no way to type anything into the website, unless I flip out the physical keyboard that just happens to be attached to this case. I have never seen that kind of behaviour before on any other website, ever.

Second, Apple does not make it easy to figure out where the "turn off smart quotes" option is located. I think I turned it off under the switch for Settings > General > Keyboard > Smart Punctuation but I'm not 100% certain. Nevertheless, this is the first case where I recall using quotes where they were not honored as I would have expected. I'm not sure where the blame lies on this -- is it a user expectation problem, an iPadOS problem, or a website problem?

I'll try again, this time trying to make sure I use the proper type of quotes.

Too · on Feb 20, 2022

What kind of smart-ass OS would do that in a single line text input form?

This is a feature that belongs to a word processor.

bradknowles · on Feb 20, 2022

Fuck me.

How did I not spot that?!?

btdmaster · on Feb 19, 2022

Just an FYI -- you probably need to declare the use of Google Analytics explicitly in your terms. (Although my personal preference is something that does not require consent, like Matomo or Plausible Analytics :)

ejb999 · on Feb 19, 2022

why would that be? I don't think I have ever seen a site that disclosed they are using GA?

FWIW: I also prefer Plausible, and have all GA traffic blocked in my hosts file

btdmaster · on Feb 19, 2022

Since it collects personally identifiable information (at least IP addresses, but it's not clear where it stops) this requires special treatment under GDPR: https://en.wikipedia.org/wiki/Google_Analytics#Privacy

brobinson · on Feb 20, 2022

Why does an unmonetized website about US court cases, presumably targeted towards Americans, need to care about GDPR?

extortomatic · on Feb 20, 2022

Who says this website is unmonetized? The search query stream alone is very valuable.

brobinson · on Feb 21, 2022

There's no way to make an account, and there isn't any functionality for payment. There aren't even any PayPal/Patreon accounts or donation links. The info page on the site literally says "judyrecords is a 100% free nationwide search engine".

How do you think this website is monetized in the absence of those things?

dgellow · on Feb 20, 2022

Because the website is accessed by Europeans, meaning it is collecting Europeans' data (via Google Analytics). But also because California has the CCPA, which is more or less equivalent to GDPR (as far as I understand, at least; I might be incorrect).

brobinson · on Feb 21, 2022

Got it. In that case, whatever European commission is in charge of GDPR fines can charge them a % of their $0/year income like other GDPR violators.

dgellow · on Feb 21, 2022

I'm not sure that you understand GDPR fines. The annual revenue is only used as an upper limit for companies with massive revenue.

The service being free doesn't protect you.

> The less severe infringements could result in a fine of _up to €10 million, or 2% of the firm’s worldwide annual revenue_ from the preceding financial year, _whichever amount is higher_.

> The more serious infringements go against the very principles of the right to privacy and the right to be forgotten that are at the heart of the GDPR. These types of infringements could result in a fine of _up to €20 million, or 4% of the firm’s worldwide annual revenue_ from the preceding financial year, _whichever amount is higher_.

Source: https://gdpr.eu/fines/
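To spell out the "whichever amount is higher" part, here is the arithmetic as a small sketch. The function name and parameters are made up for illustration, and the result is only the ceiling described in the quotes above, not a prediction of any actual fine.

```python
def max_fine_eur(annual_revenue_eur, serious=False):
    # Ceiling per the quoted tiers: a fixed amount or a percentage of
    # worldwide annual revenue, whichever is higher.
    fixed, percent = (20_000_000, 4) if serious else (10_000_000, 2)
    return max(fixed, annual_revenue_eur * percent / 100)

print(max_fine_eur(0))                  # free service with no revenue
print(max_fine_eur(5_000_000_000))      # large company, lower tier
```

So with zero revenue the percentage term is zero and the fixed amount is the one that applies.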

brobinson · on Feb 21, 2022

Good to know. Hope the author of the site is aware of this if they ever go to Europe.

They should just block European IPs like other sites do, though. It'd be safer for them and also less work.

dgellow · on Feb 21, 2022

Yep. But as I said, California CCPA is more or less equivalent to GDPR, so even this might not be enough.

btdmaster · on Feb 20, 2022

It was just an example; similar issues occur under the CCPA and other legislation. (Assuming no user is covered by GDPR, which is likely not the case.)

brobinson · on Feb 21, 2022

This is why services simply block European IP addresses. If your options are:

1. do extra work to make your service comply with laws in areas you don't live or have any customers

2. put a blanket IP ban in place for these places where you don't live or have customers

3. do nothing

Bigger companies will do number 2, and individuals/small business/small and unmonetized projects will do number 3.

Brajeshwar · on Feb 20, 2022

[EDIT]

Sorry everyone and thanks everyone for the sentiments. I have been advised not to write anything public about this further.

This may be nothing and this will also pass. Unfortunately, I've to delete my comment here.

About: IBM seem to have stolen my idea and patented it. I have the full source code, and multiple proofs that I'm the owner.

Brajeshwar · on Feb 20, 2022

Lesson learnt: if you have an idea, even the most mundane, boring one, don't just go about writing it openly -- there are too many IBMs that will claim it as their own.

richardbarosky · on Feb 20, 2022

Do you have any references to the post? Original post/diagrams or other references?

richardbarosky · on Feb 20, 2022

That's pretty unbelievable, wow.

Simon_O_Rourke · on Feb 19, 2022

Searched my former boss on this. Hoooo doggy, I knew he was up to some questionable financial practices, but it looks like it caught up with him.

nabla9 · on Feb 19, 2022

603 total cases for: emacs

260 total cases for: "mind control"

768 total cases for: "donald j. trump"

State of Minnesota vs Steven Captain America Rogers https://www.judyrecords.com/record/vfvd30smme78f

btdmaster · on Feb 19, 2022

> mind control

I love it! (Is witchcraft constitutionally protected?!)

mostlystatic · on Feb 19, 2022

It's much more limited in what's covered, but when I had some questions around VAT I found the website of the British and Irish Legal Information Institute really helpful: https://www.bailii.org/

It's noindex, so it would normally be super hard to find the cases if you don't search on the BAILII site directly.

throwaway-PII · on Feb 19, 2022

The fact that this is free is mind boggling. Maybe four or five years ago I had access to a commercial court search API which had 850mn cases nationwide, and it cost a pretty penny.

donatj · on Feb 20, 2022

Crazy. Searching my name finds almost every traffic ticket I’ve ever had, as well as a bunch of people with the same name as me getting tickets.

richardbarosky · on Feb 20, 2022

Using Apache mpm prefork and server load < 1. 132 busy workers. Not bad.

https://ibb.co/4J7STV6

https://ibb.co/NYwWd3T

cperciva · on Feb 19, 2022

TIL that I'm cited in a lot of patents.

MaknMoreGtnLess · on Feb 20, 2022

Entity recognition on this would be an extreme value add.

Don't you think?

richardbarosky · on Feb 20, 2022

It's a difficult problem, but that would definitely be useful for sure.

andrewguenther · on Feb 19, 2022

https://patents.google.com is great for this

VikingCoder · on Feb 20, 2022

Really neat.

I found my dad's DUI.

He went into rehab and cleaned up. I got to spend more than a decade with a sober father, before lung cancer got him.

gallerdude · on Feb 20, 2022

What the heck does this use to search so fast? We use Elastic at work, and for 100K entries, it crawls compared to this.

richardbarosky · on Feb 20, 2022

Thanks!

Replied to this comment here with some additional info: https://news.ycombinator.com/item?id=30399881#unv_30400160

momothereal · on Feb 20, 2022

Sounds like you're resource-starving it if it crawls at 100k. Either that or ES is perhaps not your bottleneck?

motohagiography · on Feb 20, 2022

As a corpus for ML training, I'd be interested in whether there are linguistic predictors for court victories in opening statements and whether optimizing for them could yield an advantage.

alangibson · on Feb 19, 2022

This site will be the first stop for anyone wanting to harass another person online. Sometimes a little friction is a good thing.

I love projects like these, but they're the digital equivalent of "dual use technologies". They can be used for good or evil.

That said, nice work.

ghaff · on Feb 19, 2022

There's a whole lot of information that the collective "we" decided to make public for various reasons. But those decisions making things public were in the context of the information being in some dusty town, county, or state office somewhere.

With more and more of that information being digital, we've more or less punted on the question of whether all that information should still be public. Overall, more transparency is probably good but, as you say, it's not an unalloyed good, as most of this information will live forever and be cheap/easy to access.

thr0wawayf00 · on Feb 19, 2022

> I love projects like these, but they're the digital equivalent of "dual use technologies". They can be used for good or evil.

Isn't pretty much every technology "dual use"? Just look at social media. You need a platform that gives you the ability to harass someone in order to actually do it.

> Sometimes a little friction is a good thing.

We as a market repeatedly justify the frictionless experience of being spied on for ads in ways that we have little to no control over, but we're gonna deny ourselves the frictionless experience of being able to see public records because we're worried about our privacy?

vintermann · on Feb 19, 2022

On the other hand, powerful people who wanted to harass you or hurt you have had access like this for a long time.

It's how I feel about facial recognition technology or other ML-based technology too. The worst people who could ever have access to it, already had access to it. Giving everyone access to it is just leveling the field.

EvanAnderson · on Feb 20, 2022

I'd love a world where the "powerless" have the same ability to leverage surveillance as the "powerful".

I'd love it if we could achieve that balance by eliminating surveillance, but I don't see that as a realistic outcome (at least initially).

In the absence of eliminating surveillance I'll take full public transparency. Maybe such transparency would even drive the elimination of surveillance.

duped · on Feb 19, 2022

I tried some rather specific queries for things I know should return some records and it was fairly useless, so I'm not terribly worried.

Just anecdotally, I have a fairly uncommon last name but common first name, I know what states/counties I have appeared in court in and couldn't find any of the records. If you search something like <name> <county> <state> the results are overloaded with <county> <state>, for example.

rmbyrro · on Feb 19, 2022

Yeah, the only missing piece for fulltext harassment is a "Google alert" for particular keywords. Put the names you wanna track and receive a delightful alert in your inbox with rocks to throw over other people's roof.

EDIT: the tech is great, but I think there should be a record of who is accessing the data, for what purpose, terms for how it can be used in a civil way, and means to go after misuse.

alangibson · on Feb 19, 2022

How is harassment as a service not a thing yet?

You get a "Google alert" for your target. The service presents you with several buttons:

1. Send an AI-written email

2. Post a link to the new info on their Facebook page

3. Tweet an image macro with the incriminating text embedded @ them

rmbyrro · on Feb 19, 2022

It is a thing, but making it so easy to find and access court documents mentioning someone's name will add to the pile of rocks malevolent people can throw at anyone.

inetknght · on Feb 19, 2022

> How is harassment as a service not a thing yet?

What makes you think it isn't?

bryanrasmussen · on Feb 19, 2022

Given that one third of Americans have criminal records of one sort or another, so that somebody almost certainly has a criminal in their family or near circle of friends, I suppose criminality is about the same as finding out somebody watches porn.

on edit: actually one third is probably overstating but close.

iqanq · on Feb 19, 2022

>This site will be the first stop for anyone wanting to harass another person online. Sometimes a little friction is a good thing.

Precisely. I was thinking of how much fun we'll be having in efnet with this.

richardbarosky · on Feb 19, 2022

I think broadly the same tradeoffs exist for any search system, like Google or PACER for example.

stjohnswarts · on Feb 19, 2022

Not sure how good this is on a "regular citizen" level. I tried several drug/alcohol related incidents that I knew about and nothing came up.

astura · on Feb 20, 2022

Seems to completely lack records from some states.

I searched a close friend's last name and I got all the stuff I expected - his civil suit, his divorce, his sister's paternity suit, a foreclosure involving his cousin. Seemed very complete.

Searched my own last name and a whole bunch of records of my uncle's various criminal activities came up. Surprisingly what was missing was records of my dad's various criminal activities.

My friend, my dad, and my uncle all live in three different states. My dad is currently incarcerated.

gitgud · on Feb 19, 2022

Is this what Aaron Swartz was trying to achieve [1] here?

[1] https://arstechnica.com/tech-policy/2013/02/the-inside-story...

stefan_ · on Feb 20, 2022

No, this is just "metadata" more or less - who was sued by whom over what and what were the individual events in the case. PACER has the individual filings - the complaints, briefs and orders and so on.

sharken · on Feb 20, 2022

Cool project. It led me to a patent filing, which I guess is also a part of court cases.

Anyway, now I know someone has tried to patent a scooter that looks like a firetruck.

https://insight.rpxcorp.com/patent/USD552186S1

tptacek · on Feb 20, 2022

This is wild, I have so many speeding ticket judgements. I forgot I’d ever set foot in Idaho.

bredren · on Feb 20, 2022

Maryland driver with license plate “HIJINKS” pleads guilty to speeding ticket https://www.judyrecords.com/record/r28qv86bva834

markbnj · on Feb 20, 2022

Nice work, very fast, simple display. Also, I learned that a speeding ticket I never paid back in the 80's is still an open case, and that a few articles I wrote in the 90's have been cited in dozens of patents. Yippee.

idontwantthis · on Feb 20, 2022

Doesn’t that mean you can be arrested in that state?

markbnj · on Feb 20, 2022

Probably? I haven't been there in years, but I'll be contacting them to pay the fine.

MaknMoreGtnLess · on Feb 20, 2022

OP, will you consider allowing download of the actual dataset? (Wikipedia style)

If not, what would hold you back from doing so?

I would be OK hosting a mirror of this service including the infrastructure (and handling the associated costs) if it helps.

jtmlis · on Feb 20, 2022

Is there a Canadian version? Or something similar within a Canadian context.

rnotaro · on Feb 20, 2022

In Quebec, we have the SOQUIJ (Société québécoise d'information juridique) [1] that allows you to search court cases. Other provinces might have something similar.

[1] http://citoyens.soquij.qc.ca/

richardbarosky · on Feb 21, 2022

Stumbled on this: https://www.canlii.org/en/

https://www.reddit.com/r/MadeMeSmile/comments/swy4ge/comment...

richardbarosky · on Feb 20, 2022

Good question. Not that I'm aware of.

fastaguy88 · on Feb 20, 2022

Wow. What a resource. Do a search for "cDNA patent" and read how many things the Supreme Court can get wrong in just a few paragraphs. (Hint: cDNA is not "composite DNA").

channel_t · on Feb 19, 2022

Wow I just found out that a lot of distant family members on the opposite side of the country who I've never met are really bad drivers. Found one of my own moving violations in there too.

linuxhansl · on Feb 19, 2022

Funny. I found my various software patents in there.

For most of them - even though they are my patents - I cannot determine from the patent what it is that I have invented.

supernova87a · on Feb 19, 2022

I know there is some open source (?) effort to publish and give access to court cases instead of having it behind a paid subscription channeled through the federal court system. Does anyone know how that's going?

And also, are only the primary filings of the court and parties available to be searched? What happens to depositions, evidence records, etc. that are part of the case? Are those ever available to the public?

richardbarosky · on Feb 19, 2022

It sounds like you're referring to this: Open Courts Act of 2021

Some commentary at these links:

- https://free.law/pacer-facts

- https://www.politico.com/magazine/story/2019/03/20/pacer-cou...

- https://abovethelaw.com/legal-innovation-center/2021/03/11/t...

- https://unicourt.com/blog/modernizing-pacer-realizing-crimin...

busymom0 · on Feb 19, 2022

Mind sharing info on server, backend, costs etc?

richardbarosky · on Feb 19, 2022

Replied to this comment here with some additional info: https://news.ycombinator.com/item?id=30399881#unv_30400160

dheera · on Feb 19, 2022

Damn, even traffic citations in there. Wow.

freediver · on Feb 20, 2022

Great job Richard. Do you have or plan to have an API available? We (Kagi Search) would like to use this.

ChrisMarshallNY · on Feb 19, 2022

TIL that a lot of sad MFers share my name...

This tool is awesome, but, in knucklehead hands, could be fairly awful.

weird-eye-issue · on Feb 19, 2022

I have a speeding ticket listed as "Criminal" in here... That's news to me

richardbarosky · on Feb 19, 2022

Hmm, sounds like a broad classification associated with the record. In the broadest sense, I think all cases are defined as either civil or criminal.

weird-eye-issue · on Feb 20, 2022

I see another speeding ticket listed as "Non Criminal".

I've got another ticket not listed as anything and actually it has my full DOB and previous address, cool

heavyset_go · on Feb 20, 2022

> I've got another ticket not listed as anything and actually it has my full DOB and previous address, cool

If it's a street you grew up on, what banks do you use? ;)

weird-eye-issue · on Feb 20, 2022

Chase, why?

heavyset_go · on Feb 20, 2022

I was making a joke because some basic info like name, DOB and address are enough to get into a bank account if the password was forgotten, especially if you know the answer to security questions like "What is the name of the street you grew up on?"

weird-eye-issue · on Feb 20, 2022

Yeah I was joking too

heavyset_go · on Feb 20, 2022

Woosh

sytelus · on Feb 20, 2022

I also learned recently that if documents weren’t sealed, then they are available to anyone. Is it possible to create a DB of all such public documents attached to the cases? What would it take to do that?

jordanpg · on Feb 20, 2022

How is this different from other free legal DBs like Justia and Casetext?

richardbarosky · on Feb 20, 2022

Justia is a general legal info portal and has many high-level court opinions within that portal. Casetext is primarily legal research software and has many US/state codes within its database. (https://casetext.com/coverage) I think the broad strokes are right in that summary. judyrecords has many more cases than Justia or casetext. More than 600M+ if I had to guess quick.

motyar · on Feb 20, 2022

Why are patents there? https://www.judyrecords.com/record/v5j59cngw2c8e

richardbarosky · on Feb 20, 2022

You can think of a patent as a legally binding decision itself, and a kind of legal case in its own right.

motyar · on Feb 20, 2022

Even I can see my gist links in cases https://www.judyrecords.com/record/va3htxa9i5abd

ww520 · on Feb 19, 2022

This is great. I tried it and found the traffic tickets I got.

raegis · on Feb 20, 2022

I can't find any of mine. I always did traffic school, so I suppose that's why.

zmix · on Feb 20, 2022

I wonder what technology the database is implemented in.

pwned1 · on Feb 20, 2022

Do you really need to include divorce cases?

hammock · on Feb 19, 2022

This is unbelievable. It has speeding tickets.

bitxbitxbitcoin · on Feb 19, 2022

Found out an article I wrote in 2014 was cited in a patent application and found a speeding ticket I paid off. Cool!

spoonjim · on Feb 20, 2022

I discovered that I have an open warrant for an unpaid fine and will get that sorted out. Very useful site

skilled · on Feb 19, 2022

Page 1 of 78 total cases for: wikileaks

leoxvi · on Feb 20, 2022

Nice. Could you add a URL-parameter for the search-term, so it is possible to link to a query?

fosshogg · on Feb 19, 2022

Thankfully none of the (many) speeding tickets I got in my youth are showing up.

mikewarot · on Feb 19, 2022

It found my 2 patents, cool!

boomer918 · on Feb 20, 2022

This is a pretty big violation of privacy, especially for people whose criminal records have been expunged.

It could satisfy some people's curiosity or if they are a lawyer, they could save a few grand on PACER, but for everyone else this is a privacy disaster.

I hope you take this down eventually.

nickphx · on Feb 20, 2022

You don't need to be a lawyer to get a PACER account. I'm a felon, I have a pacer account.