A lot of commentators are missing the point. Caselaw is what lawyers rely on to determine what the law is. For years, there was a monopoly, the privately-held West Corporation in Minnesota. In the 1970s, the field became a duopoly, with the advent of Lexis. West eventually sold out to Thomson; the company is now Thomson-West. Computer-assisted legal research was extremely expensive. The Air Force set up FLITE, but efforts to access their database of cases failed. Manual research is effective, but paper publication meant you ran the risk of not knowing yesterday's results. In the 90's, with the advent of the Internet, a lot of new players moved into the scene, some more successful than others. Public-interest organizations sued for access to the cases, which, after all, were all in the public domain. There was a Washington-based group that called the caselaw the "Crown Jewels" of this effort--though EDGAR filings probably have more significance for the public. Time has provided a solution; access to anything older than the most recent 400 volumes of the Federal Reporter is really not needed, as any Thomson-West salesman will tell you. But these efforts to open caselaw to the public are not scammy and have nothing to do with shaming. In the United States, these cases state the law which governs our lives. I'm glad they've succeeded in the effort.
I recently had some criminal charges expunged, and I notice they show up here. Is there any way to request removal of court records which are no longer publicly available from the originating court?
In the absence of a reporting mechanism for issues like this, I'd suggest at least a notice / message alongside results to indicate that they may not reflect the current state of official and amended records.
(I think you may be wise to take this issue fairly seriously; there's a risk of people considering the search engine to be an authority in itself -- which, to be fair, is already a risk for any search engine, but since this one is more domain-focused, it's possible that some users could overdevelop a sense that the results are accurate and complete)
This is stated in simple language on the terms page, which is linked at the top/middle of every page. You have to decide between putting the same text on every page vs. a high visibility place vs. a low visibility place. I opted for 2nd to make sure it's clear.
It’s good that this is listed in the terms, but this raises two concerns:
1) People don’t read terms. Should they? Yes. Do they? No.
2) This language is more like a disclaimer than a term of use. I would not assume that disclaimers about the accuracy of the result would be found in the terms.
So yes, the terms link is prominent, but no, I don’t think it addresses this issue.
Wouldn't it be easier for people to just individually sue not so much to achieve the end result but to shut down your operation under the costs incurred?
If a record is made public, it's always part of the public record. If a case has been expunged, those results are removed though because it's the right thing to do.
Neat. I'll add this to my sources on case law -- another one I've come across is https://case.law/
Per my close friend, the value of these (or, why people subscribe to LexisNexis) isn't solely the texts, but the cross referencing. It would be really cool to see that get implemented (and no doubt a non-trivial problem!).
How do you source your case inputs, as it is bigger than PACER?
CourtListener is a free source that does this very well for high-level courts. (i.e., US Supreme Court, Federal Courts, State Courts of Last Resort/State Supreme Courts).
For that, you have to detect references of cases which is a difficult problem itself, and CourtListener's search ranking also takes into account the citation weight of certain cases. This generally works well, but my understanding is that sometimes a not-so-important case can end up having many citations. Or if a case with many citations is overturned completely or partially, these things complicate which cases might be most relevant in search results too.
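To make the "detect references, then weight by citations" idea concrete, here is a minimal sketch in Python. It is not how CourtListener does it; the reporter list, the regex, and the function names are all assumptions for illustration, and real citation formats are far messier than this.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny reporter list; real-world citations vary much more.
REPORTERS = ["U.S.", "S. Ct.", "F.2d", "F.3d", "F. Supp."]
_reporter_alt = "|".join(re.escape(r) for r in REPORTERS)

# Matches "volume reporter page", e.g. "410 U.S. 113".
CITATION_RE = re.compile(rf"\b(\d{{1,4}})\s+({_reporter_alt})\s+(\d{{1,5}})\b")


def extract_citations(text):
    """Return 'volume reporter page' strings found in the text, in order."""
    return [f"{vol} {rep} {page}" for vol, rep, page in CITATION_RE.findall(text)]


def citation_counts(opinions):
    """Count how many distinct opinions cite each citation string."""
    counts = Counter()
    for text in opinions:
        # A set, so repeating a citation inside one opinion counts once.
        counts.update(set(extract_citations(text)))
    return counts
```

A raw count like this is exactly where the problems above show up: it cannot tell a heavily cited but unimportant case from a landmark one, and it knows nothing about whether a cited case was later overturned.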
The data source is provided for each case. In some cases, a direct reference/link is provided.
Cool, I guess. I'm not really a fan of anything that can be used to further persecute people who have already paid their debts to society. I know this is public information, but the information itself has been relatively opaque for generations.
Beyond mild curiosity, the only paying customers for a service like this will be groups like employers, schools, creditors, landlords etc. It removes the pain around paid background checks, and it includes data that was already legally expunged.
The fact that it includes data that was legally expunged actually makes this service more valuable than traditional background checks to certain types of people.
Legal research services (access to dockets, case text, etc.) tend to be extremely expensive. I think making that more widely available is a good thing for people with limited resources. It's kind of nuts how impenetrable the legal system is if you don't have any resources.
It's extremely difficult to represent yourself pro se if you don't have access to information about how cases like yours might unfold, arguments that have been used, how well those arguments have worked, how the cases have been decided, whether a company has settled a similar case as yours, and so on.
Is that really a bad thing? Nobody should be representing themselves because they have no idea beyond Law and Order what all is supposed to be happening around them. That's why you get a complimentary attorney if you can't afford one.
It's just too important to risk, no?
Like DIY surgery. It's quite expensive and impenetrable to be doing your own appendectomy and I'm not mad about that. In both cases you could if you really had to but a high barrier to entry for me is not a bug but a feature.
> That's why you get a complimentary attorney if you can't afford one.
Not in the US you don't. Leaving aside how you don't have a right to a lawyer when you are on either side of a civil matter, most states will send you a bill should you avail yourself of the "a lawyer will be provided for you" part of the Miranda rights. Now, if you are broke, they'll still provide services to you.
In Canada you actually get free legal advice before any questioning.
To clarify, that's not a bill you only have to pay if you get a windfall. That's a bill that (in some places) will be aggressively collected as a part of your fines.
If you're in prison you also have nearly no rights to a lawyer. (For example, if you want to sue the prison system for inhumane conditions)
Edit: For example, you may need to pay fees to be allowed to drive, which you may need to go to work/buy food.
That's (mostly) true, but unrelated to what I meant. I meant to say if the government determines you can't afford a lawyer, it will (sometimes) bill you for the lawyer it provides. Sometimes if you're very poor you get debts that in theory you have to pay but in practice won't be pursued unless you get a windfall, because you don't have any money. This isn't one of those.
Anna Sorokin got 300k from Netflix. Incarcerated individuals may still inherit. Sometimes money comes. The States are not always aggressive in pursuing such funds; but the Feds usually (not always) are.
So poor people with an interest in surgery should have zero access to medical journals? This is all just taxpayer funded information that you and I pay for every year in taxes.
Our legal system is built on case law. Reading the law as it was passed is not enough - one must consult rulings in the relevant jurisdiction for precedent. I think the benefits of making this type of information widely available outweigh the potential downsides.
Absolutely, but I think the general public tends to completely misinterpret what they’re looking at and will often make faulty conclusions if they’re not given the full context provided through formal training (eg. Dr. Google).
The repercussions of an employer doing their own unqualified background check on someone can be detrimental to a person’s future wellbeing.
When you say used to persecute, that's not accurate in any legal sense. Many states have statewide repositories of court data. Also, when you say "can be used for", that's extremely broad and as far as a standard for taking a position on something, I don't think that makes a lot of sense. Do you mind clarifying?
Today I learned that 20 years ago I was a defendant in an unlawful detainer (eviction) lawsuit regarding an apartment I shared in college. I had moved out of the apartment after graduation. Apparently my roommate stopped paying the rent and the landlord sued both of us. I was never served and didn't know about the case until now.
That's not true. Court records used to be accessible in person only. You would have to be motivated to find anything. This puts everyone's past and often forgotten transgressions on display.
This might look cool or even useful to some, but it's straight up immoral.
Court records, including traffic violations, have been accessible online in the jurisdiction where I live for nearly 25 years. They've been aggregated from that website by third-parties for most of that time, too (which I know because I've done contract sysadmin work for one of the courts for most of that 25 years).
You and others are essentially arguing that we should put artificial barriers in place for accessing information that is public as a matter of law. Arguably there is information that's public today only because the circumstances at the time when that decision was made were different. But the solution would be to make less information public if it's a problem. Not to tell people they can only access this information when the town clerk is in the office--and he doesn't come in much.
How does that work? Lawyers and others can run searches and have for decades. This isn't anything new, except now you don't have to pay a lot of money.
Either it's private or it's not. Fake throttling that discriminates based on one's ability to pay or suffer inconvenience is ridiculous. Not to mention, this information has been available in this format for decades.
Mixed feelings on this, and a lot of it unfortunately stems from these systems requiring access in-person only as well as courts generally being inaccessible through FOIA. Had these records been available through FOIA from the beginning, for example, then these records would have gone through a review/redaction process. But they're being released now, which can in many ways be seen as a reaction to the lack of access to this information, generally. The extensive overuse of courts for non-violent cases definitely doesn't help either.
As a researcher, there are deep problems with the inaccessibility of court information in that it prevents the general public from learning about systemic issues, for example identifying extensive abuse by judges (singular, or in a group), or identifying whether bail is applied uniformly.
I don't know what the solution is and things get trickier the more you look at them. Restrictive access isn't a perfect answer, since it allows gatekeeping of those critical. Having talked with lawyers who have access, they basically have to keep completely out of public spot light while they have restrictive access, at the fear of losing it. And our massive systems around incarceration have shown themselves as being uninterested in providing information to those who are critical of them. We've dug ourselves into a pretty deep hole.
The documents being released are currently publicly available. It looks like records of minors either don't show up, or don't show details. It also looks like expunged convictions do not have the case showing up.
FOIA doesn't apply for two reasons. One is like you say - many court documents are considered privileged and not subject to FOIA (which I agree has many issues around things like complaints). The other is that documents that are publicly available like this don't need to be requested through FOIA since they are already available.
I can go to my state's website and get the same information as on this site. The thing this site does is allow you to search all state's for free. There are plenty of sites that will allow you to search for people like this, but they currently charge money.
Access to justice and information about justice shouldn't be gatekept through money. I understand where you're going with it, but it's very close to the same reasoning used to justify ex[tp]ensive bail -- it only fucks the poor.
"Access to justice and information about justice shouldn't be gatekept through money."
I agree. My comment was about how this site allows that access for free. And also that FOIA is moot for this information - it's publicly available without a request.
My point is that we should have gone through a FOIA route from the beginning.
And no, a lot of important information still isn't publicly available, so we still have these issues that need to be ironed out for the exact same reason as I'm describing. This release still only scratches the surface. I'm able to get court documents from city law departments, but the process takes forever (ten complaints per week, for example) and the alternative is to go downtown to work with computer systems that have poor search capabilities.
Had we been (and been able to be) aggressive from a FOIA perspective from the beginning, the inefficiencies of these systems (eg, segmenting the private information that is stored in court records) would be more ironed out.
"This might look cool or even useful to some, but it's straight up immoral."
Again, that's an issue with the legal system. States make this information available to the public online, and have done so for a long time. This is an aggregation, and similar services have been around for a long time.
If you want this information to be private, then you need legislation to be passed. Frankly, this isn't even the most immoral thing the system is involved with. For example, 2-10% of the incarcerated are wrongly convicted. Or the fact that complaints and misconduct of judges are so secret that even if they contain exculpatory evidence they are not required to be exposed. Or that magistrates in most places are not required to have a law degree nor pass the bar, leading to the farcical outcome that the lawyers arguing the case have more knowledge of the law than the "judge" who is supposed to be the authority on the law. The list goes on and on. One day, enough people will be screwed over by the system that there won't be support for it anymore.
Technically no but it makes a difference if you have all that information available easily through the internet vs having to put some effort into obtaining the records one by one. Same goes for surveillance: technically you have no reasonable expectation of privacy in public spaces but once it becomes feasible that somebody can plaster a whole state/country with cameras and can analyze the data automatically it gets a little scary. I think a lot of privacy rules need to be revised in the context of current and future technological capabilities.
There are records here connecting my legal name to my deadname, not even on a court name change order. I had gone to lengths to keep things private. This is devastating. I want to cry.
That is great. Regular people's access to the information is a great power equalizer. I had lost a small case - fine print and a lot of undelivered promises - after 3 lawyers said I'd lose, and won it on appeal after finding in an online database (not available anymore, sadly) a similar precedent applying the law exactly to my situation. According to Yelp and case search, the company I had this case with was regularly taking people for a ride, and the people very grudgingly paid hundreds to several thousands of dollars a pop, mostly because of the fine print, and I became the first with a winning case in that list.
Right, more cases primarily. The performance has been optimized so the searches, search result pages, and individual pages load significantly faster. Most searches load in under 200ms and most pages including SRPs load in less than 20 ms. Search syntax improvements (see info page for details). The search is still not very granular and field-specific, but definitely an area of improvement.
Not as a criticism but just FEI (For Everyone's Information), reposts are ok on HN after a year or so. This is in the FAQ: https://news.ycombinator.com/newsfaq.html.
Good call, now I'm embarrassed. I should've known that. Funny how the mind works. I knew he died in his 50's and was involved in the Manhattan project but somehow was content lumping him in with all the other scientists from Operation Paperclip and using loose math that 1981 was possible.
This seems pretty good at first glance but there's significant room for improvement. Since this is HN, allow me to nitpick...
- "630M" is a big number, sure, but I don't have a sense for what % of total court cases it corresponds to. Is it closer to 10% or 90%? And either way, which ones are included vs. excluded? What were the criteria used? Accessibility, date, costs?
- I get the artistic view behind the choice of typography but the font is just too large. I find myself having to scroll to get just as far as the 5th result. Information density is good in search engines
- The results consist of two pieces: the name of the court (followed by "record", which is unnecessary) and a short snippet, but not the actual name of the case... which is an interesting choice given that the name of the case is stored in a database field as evidenced by the fact that it is in the <title> tag of any detail view
- I also think the snippets are too short. Together with the previous point, this site is basically forcing me to click on each potential match to see if it is what I wanted or not
- The URLs are... interesting. Searching for anything takes you to "https://www.judyrecords.com/getSearchResults/?page=1" which does not identify your search. Somehow this is using GET but not storing the form input in the URL but locally somehow... so searching for "foo" in one tab, "bar" in a different tab, and hitting refresh on your "foo" tab will then show "bar" results there. Which is not only "Not Cool", but seems actually harder to accomplish than a straight up form using GET
- I can't search for specific cases, e.g. "paramount communications, inc. v. qvc network, inc" returns a bunch of results, none of which are the actual case I'm looking for which is a hugely influential precedent
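On the URL point, a minimal sketch of what "a straight up form using GET" amounts to: the query lives in the URL, so each tab, refresh, or shared link identifies its own search. The endpoint and the parameter names `q` and `page` are hypothetical, not the site's actual API.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint, for illustration only.
BASE = "https://example.com/search"


def search_url(query, page=1):
    """Build a URL that fully identifies the search (query text and page)."""
    return f"{BASE}?{urlencode({'q': query, 'page': page})}"


def read_query(url):
    """Recover the query text and page number from such a URL."""
    params = parse_qs(urlparse(url).query)
    return params["q"][0], int(params["page"][0])
```

With this shape, refreshing a "foo" tab can only ever show "foo" results, because nothing about the search is kept in shared server-side or browser-side state.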
Valid criticisms, thanks for pointing them out as areas of improvement. Good question about the % of total cases though I think there are some estimates on that. My guess would maybe be 100M+ cases per year.
I think it depends on how narrow you define what a court case is. The number doesn't seem too high if you factor in traffic cases. But you're probably right on a narrower definition, that would be too high.
I note that this isn't just court cases. I have a long ago (paid) traffic ticket in there--well, not the ticket but a record pointing to a no longer existing ticket. (Maybe that's technically a court case though.) Something I wrote is also in a footnote to a patent filing.
Whenever I see stuff from lawyers especially stuff with litigation, seems like the font size, typeface, and white space always punch me in the eye it's so gruesome.
I'm always amazed at the rampant patent trolling that happens with deep learning papers/ideas. In this dump, if you search with the names of famous researchers in ML (such as Yoshua Bengio [1] or Yann Lecun [2]) you will find 100s of troll patents citing their work. Not all of them are trolls though. Maybe this corpus can be used to automatically identify them, perhaps by merging data from arxiv?
"Afterwards the plaintiff sued the defendant claiming damages".
In general US opinions seem more concise and formulaic than their Anglo counterparts. This is just one striking example. I'm just curious about the origin of this distinction. Perhaps there is some text on concise legal writing prescribed at US law schools which offers such a suggestion?
Another curious difference, it's an opinion in the US, a decision or judgment in Australia/UK.
American courts generally give very strict and tight page limits when submitting briefs, so American legal writing has evolved to be as concise as possible. The best lawyers regularly and repeatedly cut their briefs to length, removing any superfluous words. There's no requirement to omit "the" before plaintiff and defendant, but it's acceptable and saves space, so everyone does it.
Aren’t those usually capitalized as well? I’ve always thought that style in legal texts means “a proper noun defined previously” - in case of plaintiff and defendant, probably on the first page. That said I have no legal background so take that with a grain of salt.
They would appear capitalised on the cover page, not generally in text.
Here is a recent example from a 2022 SCOTUS opinion.
"In rejecting petitioners’ allegations, the Seventh Circuit did not apply Tibble’s guidance. [...] The court determined that respondents had provided an adequate array of choices, including “the types of funds plaintiffs wanted (low-cost index funds).”"[0]
By contrast, a decision of the High Court of Australia:
"The appellants applied to the Supreme Court of New South Wales for orders that the third respondent, a former director of Arrium, appear for examination and produce documents. Orders were also sought for the second respondent (the auditor) and the bank who advised on the capital raising to produce certain documents."[1]
At least in US contracts, it's fairly common to establish role based pseudonyms at the beginning of the contract. Especially for reused contracts. Presumably, the same style applies to our legal system.
For example, a contract may read:
This contract is between John Smith (hereafter Employee) and XYZ LLC, a Delaware company (hereafter Employer). Employee agrees to provide Employer with services for...
Guessing: it’s an old legal system (not in relative terms, perhaps, but 250 years is a decent chunk of time) and a bunch of the language has stayed pretty similar over time.
You should be able to do an exact match search here. Trying to use double quotes on my name turns up a boatload of hits, but most of them appear to be cases where my first name is found somewhere on the page, and somewhere else my last name is found somewhere on the page.
It should also be possible to limit the search by city, state, and or region, as well as by timeframe.
Also, to limit by other qualifiers you can add those to the search criteria. However, the search isn't field-specific, so that can only be done loosely (like Google) and not in a strict field-by-field sense. Making it field-specific is difficult and time consuming, but it's something that could improve the search.
How are exact match searches supported? There is absolutely no evidence on the page that this is possible.
How does searching for “brad knowles” match “brad alan knowles”, when I put my name in quotes? How does it match a case where “brad knowles” does not appear to be used anywhere on the page, but where one line matches “knowles” and then another line matches “brad”?
"brad knowles" - 14 results returned. The exact phrase is found in each.
brad knowles - 1925 results returned, which include results with brad and knowles in the text, where close proximity cases are ranked higher. Results with brad and knowles further apart will be ranked toward the bottom.
When you use quotes above, I'm not sure if that means what's in quotes above is what you searched or that you searched with quotes, but based on these checks and what I see, exact match searches as well as weighting without use of exact match quotes is working correctly.
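For anyone following along, here is a minimal Python sketch of the two behaviours described above: a quoted query requires the exact phrase, while an unquoted query requires every term and ranks closer-together matches higher. This is only an illustration under those assumptions; the tokenizer, the scoring, and the function names are made up and are not the site's actual implementation.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_tokens, phrase_tokens):
    """True if phrase_tokens occur consecutively, in order, in doc_tokens."""
    n = len(phrase_tokens)
    return any(doc_tokens[i:i + n] == phrase_tokens
               for i in range(len(doc_tokens) - n + 1))


def min_gap(doc_tokens, a, b):
    """Smallest token distance between any a and any b, or None if one is missing."""
    pos_a = [i for i, t in enumerate(doc_tokens) if t == a]
    pos_b = [i for i, t in enumerate(doc_tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def search(docs, query):
    """Quoted query: exact phrase only. Unquoted: all terms, closest first."""
    quoted = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    terms = tokenize(query)
    tokenized = [(doc, tokenize(doc)) for doc in docs]
    if quoted:
        return [doc for doc, toks in tokenized if contains_phrase(toks, terms)]
    scored = []
    for doc, toks in tokenized:
        gaps = [min_gap(toks, a, b) for a, b in zip(terms, terms[1:])]
        if all(t in toks for t in terms) and None not in gaps:
            scored.append((sum(gaps), doc))  # smaller total gap ranks higher
    scored.sort(key=lambda pair: pair[0])
    return [doc for _, doc in scored]
```

Under this sketch, "brad alan knowles" would not match the quoted query but would still rank near the top of the unquoted one, which is the distinction being argued about here.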
I've found other cases where most of the hits shown on the screen are for one word or the other, but not both together. There is at least one hit on each of those cases where the two words are properly found side by side and in the correct order, and so it is technically a hit for the search. But the display is not correct, because on displaying the article it is showing each word hit individually from the others.
Using proper ASCII quotes to force an exact match instead of somehow getting smart quotes is definitely an improvement, but there's still more work to be done here with regards to line breaks and display of hits.
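One way a site could sidestep the smart-quote problem entirely is to normalize the query before parsing it. A minimal Python sketch, assuming the search box hands over the raw query string; the function name and mapping are illustrative and not taken from the site.

```python
# Map the common "smart" quote characters to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}
_TABLE = str.maketrans(SMART_QUOTES)


def normalize_quotes(query):
    """Replace curly quotes with ASCII quotes so phrase syntax is recognized."""
    return query.translate(_TABLE)
```

That way a query typed on an iPad with Smart Punctuation on would be treated the same as one typed with straight quotes.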
First, when I go to the website, whether it's the mobile version or the desktop version, the on-screen iPadOS keyboard is immediately hidden from me. I have no way to type anything into the website, unless I flip out the physical keyboard that just happens to be attached to this case. I have never seen that kind of behaviour before on any other website, ever.
Second, Apple does not make it easy to figure out where the "turn off smart quotes" option is located. I think I turned it off under the switch for Settings > General > Keyboard > Smart Punctuation but I'm not 100% certain. Nevertheless, this is the first case where I recall using quotes where they were not honored as I would have expected. I'm not sure where the blame lies on this -- is it a user expectation problem, an iPadOS problem, or a website problem?
I'll try again, this time trying to make sure I use the proper type of quotes.
Just an FYI -- you probably need to declare the use of Google Analytics explicitly in your terms. (Although my personal preference is something that does not require consent, like Matomo or Plausible Analytics :)
There's no way to make an account, and there isn't any functionality for payment. There aren't even any Paypal/Patreon accounts or donation links. The info page on the site literally says "judyrecords is a 100% free nationwide search engine".
How do you think this website is monetized in the absence of those things?
Because the website is accessed by Europeans, meaning it is collecting Europeans' data (via Google Analytics). But also because California has CCPA which is more or less equivalent to GDPR (as far as I understand at least. I might be incorrect).
I'm not sure that you understand GDPR fines, the annual revenue is only used as an upper limit for companies with massive revenue.
The service being free doesn't protect you.
> The less severe infringements could result in a fine of _up to €10 million, or 2% of the firm’s worldwide annual revenue_ from the preceding financial year, _whichever amount is higher_.
> The more serious infringements go against the very principles of the right to privacy and the right to be forgotten that are at the heart of the GDPR. These types of infringements could result in a fine of _up to €20 million, or 4% of the firm’s worldwide annual revenue_ from the preceding financial year, _whichever amount is higher_.
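To spell out the "whichever amount is higher" arithmetic in the two quoted tiers, here is a small Python sketch. It only computes the upper limit as quoted above, not an actual fine; the function name and the integer-euro convention are assumptions.

```python
def gdpr_fine_cap(annual_revenue_eur, serious):
    """Upper limit of the fine in whole euros, per the two tiers quoted above."""
    # Serious tier: EUR 20M or 4% of revenue; less severe tier: EUR 10M or 2%.
    fixed, percent = (20_000_000, 4) if serious else (10_000_000, 2)
    # "Whichever amount is higher": the fixed figure dominates for small revenue.
    return max(fixed, annual_revenue_eur * percent // 100)
```

For a free site with zero revenue the cap is still the fixed figure (10 or 20 million euros), which is why the service being free offers no protection.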
It was just an example; similar issues occur under the CCPA and other legislation. (Assuming no user is covered by GDPR, which is likely not the case.)
Lesson learnt: if you have an idea, even the most mundane, boring one, don't just go about writing it openly -- there are too many IBMs that will claim it as their own.
It's much more limited in what's covered, but when I had some questions around VAT I found the website of the British and Irish Legal Information Institute really helpful: https://www.bailii.org/
It's noindex, so it would normally be super hard to find the cases if you don't search on the BAILII site directly.
The fact that this is free is mind boggling. Maybe four or five years ago I had access to a commercial court search API which had 850mn cases nationwide, and it cost a pretty penny.
As a corpus for ML training, I'd be interested in whether there are linguistic predictors for court victories in opening statements and whether optimizing for them could yield an advantage.
There's a whole lot of information that the collective "we" decided to make public for various reasons. But those decisions making things public were in the context of the information being in some dusty town, county, or state office somewhere.
With more and more of that information being digital, we've more or less punted on the question of whether all that information should still be public. Overall, more transparency is probably good but, as you say, it's not an unalloyed good as most of this information will live forever and be cheap/easy to access.
> I love projects like these, but they're the digital equivalent of "dual use technologies". They can be used for good or evil.
Isn't pretty much every technology "dual use"? Just look at social media. You need a platform that gives you the ability to harass someone in order to actually do it.
> Some times a little friction is a good thing.
We as a market repeatedly justify the frictionless experience of being spied on for ads in ways that we have little to no control over, but we're gonna deny ourselves the frictionless experience of being able to see public records because we're worried about our privacy?
On the other hand, powerful people who wanted to harass you or hurt you have had access like this for a long time.
It's how I feel about facial recognition technology or other ML-based technology too. The worst people who could ever have access to it, already had access to it. Giving everyone access to it is just leveling the field.
I'd love a world where the "powerless" have the same ability to leverage surveillance as the "powerful".
I'd love it if we could achieve that balance by eliminating surveillance, but I don't see that as a realistic outcome (at least initially).
In the absence of eliminating surveillance I'll take full public transparency. Maybe such transparency would even drive the elimination of surveillance.
I tried some rather specific queries for things I know should return some records and it was fairly useless, so I'm not terribly worried.
Just anecdotally, I have a fairly uncommon last name but common first name, I know what states/counties I have appeared in court in and couldn't find any of the records. If you search something like <name> <county> <state> the results are overloaded with <county> <state>, for example.
Yeah, the only missing piece for fulltext harassment is a "Google alert" for particular keywords. Put the names you wanna track and receive a delightful alert in your inbox with rocks to throw over other people's roof.
EDIT: the tech is great, but I think there should be a record of who is accessing the data, for what purpose, terms for how it can be used in a civil way, and means to go after misuse.
It is a thing, but making it so easy to find and access court documents mentioning someone's name will add to the pile of rocks malevolent people can throw at anyone.
Given that one third of Americans have criminal records of one sort or another, so that somebody almost certainly has a criminal in their family or near circle of friends, I suppose criminality is about the same as finding out somebody watches porn.
on edit: actually one third is probably overstating but close.
Seems to completely lack records from some states.
I searched a close friend's last name and I got all the stuff I expected - his civil suit, his divorce, his sister's paternity suit, a foreclosure involving his cousin. Seemed very complete.
Searched my own last name and a whole bunch of records of my uncle's various criminal activities came up. Surprisingly what was missing was records of my dad's various criminal activities.
My friend, my dad, and my uncle all live in three different states. My dad is currently incarcerated.
No, this is just "metadata" more or less - who was sued by whom over what and what were the individual events in the case. PACER has the individual filings - the complaints, briefs and orders and so on.
Nice work, very fast, simple display. Also, I learned that a speeding ticket I never paid back in the 80's is still an open case, and that a few articles I wrote in the 90's have been cited in dozens of patents. Yippee.
In Quebec, we have the SOQUIJ (Société québécoise d'information juridique) [1] that allows you to search court cases. Other provinces might have something similar.
Wow. What a resource. Do a search for "cDNA patent" and read how many things the Supreme Court can get wrong in just a few paragraphs. (Hint: cDNA is not "composite DNA").
Wow I just found out that a lot of distant family members on the opposite side of the country who I've never met are really bad drivers. Found one of my own moving violations in there too.
I know there is some open source (?) effort to publish and give access to court cases instead of having it behind a paid subscription channeled through the federal court system. Does anyone know how that's going?
And also, are only the primary filings of the court and parties available to be searched? What happens to depositions, evidence records, etc. that are part of the case? Are those ever available to the public?
I was making a joke because some basic info like name, DOB and address are enough to get into a bank account if the password was forgotten, especially if you know the answer to security questions like "What is the name of the street you grew up on?"
I also learned recently that if things weren’t sealed then they are available to anyone. Is it possible to create a DB of all such public documents attached to the cases? What would it take to do that?
Justia is a general legal info portal and has many high-level court opinions within that portal. Casetext is primarily legal research software and has many US/state codes within its database. (https://casetext.com/coverage) I think the broad strokes are right in that summary. judyrecords has many more cases than Justia or Casetext. More than 600M if I had to guess quick.
This is a pretty big violation of privacy, especially for people whose criminal records have been expunged.
It could satisfy some people's curiosity or if they are a lawyer, they could save a few grand on PACER, but for everyone else this is a privacy disaster.
Hard agree, it feels like an invasion of privacy in line with the cancerous people search/background search websites that make available everything from your home address and phone number to your pet's name.
This is... not great. It's crucial that these records be open to public inspection. But instant full-text search of the entire dockets of 630M cases feels wrong, invasive, and dangerous to me.
It's yet another instance of panopticon surveillance now being too cheap to meter. I think our society needs to come to grips with this new reality and figure out what to do about it.
These records have always been available to people with money to spend on a lawyer with a subscription. So what you're complaining about is that normal people can also access the information now.
Quantity has a quality of its own. To use a similar example, arrest and imprisonment records are public data in my country. But you have to actually go to the courthouse and fill out some paperwork and/or hire a lawyer to do it for you.
This has consequences. For example, in some US states it takes a few seconds for an employer to find out a candidate was once arrested while drunk, or has a conviction for a minor offense from 15 years ago. And employers do that sort of search routinely, because it's free and easy. Only someone being targeted for a specific background check gets that treatment here, because it's not so easy.
Same argument applies to, for example, reading the previous divorce case for someone you're dating. Only a real weirdo would do that here, in part because it involves time and money. If it's freely available online, I do think it would be a lot more common.
I don't know whether it'd be better or worse to have such information more accessible, but it can change things.
> Only a real weirdo would do that here, in part because it involves time and money.
I think your parent's point is that money isn't an issue for the rich. A billionaire doesn't care that it costs $150 to find out, they don't care that it costs $1,000 to find out. So suddenly information becomes a class issue. Either it should be available to nobody or everybody, money shouldn't factor into it.
A lot of policies or laws don't affect billionaires the same way. We don't base speeding fines on income like Norway. Nobody is changing any laws to make the impact on billionaires proportional.
You're all missing the point... ANYbody with a LexisNexis subscription, or a Bloomberg terminal, or one of those background check sites, already has this exact capability. It's not new.
You don't need to be a lawyer to access any of it... I think the other poster simply meant that lawyers generally have Lexis subscriptions, already.
Also, the various court databases this site is searching are ALREADY online and publicly available, and have been for years. This is just providing a free, unified interface with a fast search index.
So you think the $10/month fee for existing services is what induced the phase change? I've done hundreds of these searches, paid maybe $20 total.
I have a different theory about why this bugs you... Previous to today, you were unaware that these records were available online so cheaply and quickly. Nothing has really changed, except your own anxiety levels as your worldview struggles to absorb this information. But your brain wants the change to be external, because that's less threatening than the realization that this capability has been lurking out there in the world, all along.
At some level I get the angst about typing someone's name, especially if it's fairly unusual, and getting back a whole lot of information about, in this case, mostly legal-related stuff and in others past addresses, things they've written etc. for free. (And, if you know something about them you can probably sift the returns somewhat effectively.) You may be able to find out a lot about your date, your neighbor, etc.
On the other hand, outside of casually checking out someone, the reality is that this has long been available to anyone willing to spend a very few bucks to do so.
> - Three or four years of study at a law school accredited by the American Bar Association
> - Four years of study at a State Bar-registered, fixed-facility law school
> - Four years of study with a minimum of 864 hours of preparation at a registered unaccredited distance-learning or correspondence law school
> - Four years of study under the supervision of a state judge or attorney
> - A combination of these programs
If all you've done is pass the bar exam, you can go to hell. It may not even be possible; the requirements for the California exam are behind a login wall, but other states have restricted eligibility to take the exam based on the testee's education. I assume they didn't want to be embarrassed by having the wrong sort of person pass the bar.
I think you may be misunderstanding what this is... All of these documents were ALREADY public records, and were ALREADY available online. Most US courts have been publishing these records online, for a while now.
And there are ALREADY other websites/search products that provide a unified search interface... LexisNexis is probably the biggest/oldest, and I believe Bloomberg also has this feature... There are dozens (if not hundreds) of cheap public record search websites that charge $10/month for it, too.
If you're surprised by all this, you haven't been paying attention... For a few decades now.
Powerful corporate and government actors have massive surveillance and data warehousing capabilities that aren't going away. At the very least, putting those powers into the hands of the public helps to level the playing field.
Society will have to change to accommodate the digital panopticon. I don't see the digital panopticon going away, though.
> putting those powers into the hands of the public helps to level the playing field
Agreed, but ...
> Powerful corporate and government actors have massive surveillance and data warehousing capabilities that aren't going away.
To nitpick: They aren't going away as long as we spread that message. It's not easy, but we can make them go away. People do accomplish things and change the world - just compare today's world with 500 years ago; all the differences are the result of people changing things. Defeatism is trendy, and who benefits? (The status quo.)
> To nitpick: They aren't going away as long as we spread that message. ... Defeatism is trendy, and who benefits?
It's not defeatism-- it's just being realistic. I don't believe there's any useful method to make government actors comply with the law. I have an admittedly US perspective, but take as examples the FBI under J. Edgar Hoover, the NSA and the subsequent Church Committee hearings, and Snowden's disclosures. The power afforded by mass surveillance and data warehousing is too attractive not to be abused.
You must have heard that line from pessimists 10,000 times before. There would be no startups, science, democracy, etc. if people believed it. We'd be living in caves - 'let's be realistic, we've been living in them for 190,000 years!'
> I don't believe there's any useful method to make government actors comply with the law.
The evidence is overwhelmingly otherwise: Many, many government actors have been caught, prosecuted and punished, at every level - including multiple Presidents, at least one Vice President (off the top of my head), members of Congress, federal judges, generals and admirals, and more, and that's just the federal level.
Governments have widely differing levels of corruption, and the US has in the past been one of the best - so it's effective. Other countries are also more and less effective than the US, and we can see what they do and what works. There is plenty of research. Nothing is stopping you, but you.
The current trend in defeatism - against all evidence, in a country with one of the most effective governments in the history of humanity - serves someone's interests. Who? Who benefits from spreading this message?
I don't see any problem with this. These cases are in the public record, why should the public not have the ability to search them for free without requiring access to expensive legal indices?
The public has always had the ability to search them, it's just been more difficult to do in the past.
Lowering the barriers to this is not necessarily a good thing. For example: if I have to drive to the county clerk's office to get records, I am unlikely to do so. This means that if I need the information, I will go get it, but if I do not need it, I won't bother.
The patent section being so easily searched is very useful and I see no downsides there.
When the giant Equifax hack happened creditors had to start doing more diligence to make sure they weren't allowing fraudulent borrowing since it was happening so frequently.
So optimistically maybe having lots of information public and easily searchable like this ultimately leads to a similar outcome. Like maybe it forces us to stop using date of birth to verify identity since it's so easy to find anyone's date of birth.
If the information is going to be out there maybe it's safer for everyone to have it than just a few people. If just a few people have it, and they use it to screw just a few other people, there's no pressure to fix it.
Generally you’ve had to pay for an expensive service (LexisNexis), or go to the courthouse yourself to pull the records. Search was also a bit of a black art.
So generally easy to hide in the noise. Here you can just put in a name, and off you go.
The public has access to most local court data in my state (Ohio, US) thru websites run by the various local courts. A state-level database for government use is, as far as I know, still not actually available (though it has been in planning and some phase of execution for 10+ years).
There are public court records (criminal, civil), and there are non-public court records (e.g. sealed - juvenile, divorce, etc.)
As far as I can tell, all of this data is of the public nature.
While it may feel weird to type in someone's name and see their history with regard to legal filings... that is the society we live in: an open society.
Aggregating a number of disconnected data sources for search I think is absolutely a legitimate usage of the data.
I have some records that are sealed, but show up in this database. So there are records that were once ‘public’ but are no more, but this database makes them public again.
Fair enough - in my state they are limited to parties involved and their counsel.
The public can still see the filing and result (when the divorce was granted), but the actual documents are restricted so as not to air all of one's dirty laundry unnecessarily.
This is amazing. Can you share any info on how you were able to compile so much info from different sources? In my limited experience of hunting for legal filings, it seemed like every court had its own system, with nothing standardized or programmatic.
The search uses elasticsearch 7 for full text search. It's been extremely fast and worked very well. You're right court data is scattered across many different systems and needs to be aggregated, which is a difficult process.
Are you using freelaw's code to scrape all the different servers? Why are there no contact details on the site? I don't understand the mystery and black ops nature of this thing. It feels like there is some sort of conspiracy here that I've yet to uncover!
There are 2 search boxes going. One for storing the search index without source and another which stores the source, which is only used for highlighting. Searches usually take under 200ms and SRP and individual pages usually take less than 20ms. The 2 ES nodes are not formally part of a single cluster due to the index storage difference. Another box uses a traditional LAMP setup. Feel free to send a message on reddit if interested in more detail.
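For anyone curious what that two-index split might look like, here is a minimal sketch in Python. It only builds the Elasticsearch 7 request bodies as plain dicts; it is not the site's actual configuration. The field name `text` and all function names are hypothetical. The one real feature it leans on is the `_source` mapping option, which can be disabled so an index keeps only the inverted index, plus the standard `match`, `ids`, and `highlight` parts of the query DSL.

```python
from typing import Any, Dict, List

# Hypothetical field name; the real site's schema is not described in the thread.
TEXT_FIELD = "text"


def search_index_settings() -> Dict[str, Any]:
    """Index body for the node that only answers 'which documents match?'.

    Disabling _source means the original document is not stored,
    which keeps this index smaller.
    """
    return {
        "mappings": {
            "_source": {"enabled": False},
            "properties": {TEXT_FIELD: {"type": "text"}},
        }
    }


def highlight_index_settings() -> Dict[str, Any]:
    """Index body for the node that keeps _source (the default) for highlighting."""
    return {"mappings": {"properties": {TEXT_FIELD: {"type": "text"}}}}


def search_body(query: str, size: int = 10) -> Dict[str, Any]:
    """Query for the source-less index; only document ids and scores come back."""
    return {"query": {"match": {TEXT_FIELD: query}}, "_source": False, "size": size}


def highlight_body(query: str, doc_ids: List[str]) -> Dict[str, Any]:
    """Query for the second index: same text query, restricted to the ids
    already found, asking for highlighted snippets of the text field."""
    return {
        "query": {
            "bool": {
                "must": {"match": {TEXT_FIELD: query}},
                "filter": {"ids": {"values": doc_ids}},
            }
        },
        "highlight": {"fields": {TEXT_FIELD: {}}},
        "size": len(doc_ids),
    }
```

In this sketch a results page would take two round trips: send `search_body(...)` to the first node, collect the `_id` of each hit, then send `highlight_body(...)` with those ids to the second node to get the snippets.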
Think of it like a search engine, like Google: There is a lot of editorial power in a search engine, such as what is listed, what is listed first, what is excluded, and accuracy (correctness and completeness).
Always know your source; there are no exceptions - especially in the 'post-truth' era.
Not the original poster but: Your service costs money to run. You're not, so far as I can tell, a recognized non-profit doing this out of the goodness of your heart. Given that you're not running ads, all the monetization angles are yucky.
The search query stream of your site alone is immediately monetizable for targeting by "legitimate businesses" who sell "reputation management services." (People who expect to find records are searching for their own names. People who suspect others might have records are searching for those names. Source: see every second comment on this topic) Notably, nothing in your terms suggests this isn't already your business model - you don't need to collect or retain personal information about the visitors to your site to monetize the search query stream. In fact, it's probably "better" not to collect PII, because then you can claim that the search queries are "just strings the visitor provided voluntarily" and that you have no idea whose names those are.
Also: your dataset is littered with records that are properly under seal, including juvenile records. You are NOT relieved of potential liability for disseminating these just because they're included in your dragnet, should they be found through your service and used in a way that causes actual damages.
I'd strongly caution anyone against using this site at all, much less searching names. This site is plausibly an input to an extortion machine at scale.