I suspect Wikipedia has peaked and shall forever be fighting with itself. Larry Sanger's article on how badly political influence has taken over Wikipedia is revealing. Then you start to see how bad it really is on the subjects you know well.
Wikipedia does sports teams stuff well. If you ever need to look up who won a Formula 1 race in 1990, you know Wikipedia will have the right answer, but as soon as you dive even a millimeter into anything touching current events, politics, or religion, you aren't going to find real information.
It would be more interesting to have estimated the hazard function of the fibs. Here's my bogus estimation: a third of the fibs being removed within 2 days implies a half-life of about 3.5 days and an average fib duration of about 5 days.
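For the curious, here's roughly how those numbers fall out if you assume a constant hazard rate (exponential survival), which is of course part of what makes the estimate bogus:

```python
# Back-of-the-envelope check, assuming fib removal follows an exponential
# (constant-hazard) survival model -- a big assumption in itself.
import math

removed_frac = 1 / 3   # fraction of fibs removed within the window
window_days = 2

# S(t) = exp(-hazard * t); S(2) = 2/3
hazard = -math.log(1 - removed_frac) / window_days  # ~0.20 per day
half_life = math.log(2) / hazard                    # ~3.4 days
mean_duration = 1 / hazard                          # ~4.9 days

print(f"hazard ~{hazard:.2f}/day, half-life ~{half_life:.1f} days, "
      f"mean duration ~{mean_duration:.1f} days")
```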
While the topic is intriguing, I dislike the use of public services as subjects for this type of research.
For instance, adding substances to a water reservoir to study their effects is unacceptable without permission or supervision.
Similarly, conducting such research without Wikipedia's permission/supervision should not be accepted.
Someone tried something similar but with higher risk: inserting security backdoors into the Linux kernel. They were caught and (AFAIK) their entire school was permabanned from sending pull requests.
This was also my thought. Search for "hypocrite commits"; here's a link to an LWN article: https://lwn.net/Articles/853717/. They did ban their whole school.
I'm of quite the opposite opinion. Within reason (importantly), I believe any public service that is managed by an anonymous, decentralized community ought to be under constant test, by anyone. What's the alternative, really?
Imagine if it was taboo to independently test the integrity of bitcoin for example.
The sibling mentioned the linux kernel case. I admit that one felt wrong. It was a legitimate waste of contributor time and energy, with the potential to open real security holes.
I don't pretend to have reconciled why one seems right to me and the other wrong.
> Imagine if it was taboo to independently test the integrity of bitcoin for example.
> The sibling mentioned the linux kernel case. I admit that one felt wrong.
> I don't pretend to have reconciled why one seems right to me and the other wrong.
The "how" is what matters here, not just the "what". "Testing the integrity of Bitcoin" by breaking the hash on your own machine (and publishing the results, or not) is one thing. "Testing" it by sending transactions that might drain someone else's wallet is quite another. Similarly with Linux, hacking it on your own machine and publishing the result is one thing. Introducing a potential security hole on others' machines is another. Similarly with water: messing with your own drinking water is one thing. Messing with someone else's water is quite another.
> Similarly with Linux, hacking it on your own machine and publishing the result is one thing. Introducing a potential security hole on others' machines is another.
Playing devil's advocate for a moment: how else do you test the robustness of the human process to prevent bad actors? Don’t you need someone to attempt to introduce a security hole to know that you are robust to this kind of attack?
You do it with buy-in, e.g. permission from some of the maintainers, so they are aware. If you do not get permission, you do nothing. It's similar to penetration testing.
Interestingly, while I 100% agree with you regarding the parent's question about security holes, I'm actually not sure how an experiment like the one on Wikipedia could be performed even with proper buy-in from all the owning entities (the Wikimedia Foundation?). Is it even possible, in principle, to test this ethically without risking misleading the users (the public)? If not, does that mean it's better if nobody researches it at all? The best I can think of is making edits that are as harmless as possible, but their very inconsequentiality would make them inherently less likely to be removed. Any thoughts?
The usual answer is a chain of trust. However, that might go against Wikipedia's principles. There is an "importance scale" for articles; for anything rated C-class or higher, editing could work like a pull request, or the page could carry a warning that it contains unverified info.
It's a hard problem: keeping storage fully editable by anyone while maintaining integrity.
You sift through the edit log to find edits correcting factual errors.
Then you find the edit where the error was introduced.
You can probably let an LLM do the first pass to identify likely candidates. With maybe 20 hours of work you could probably identify hundreds of factual errors. (Number is drawn from a hat.)
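A rough sketch of what that first pass could look like, if anyone's curious. The MediaWiki revisions API is real, but the keyword filter, the example article, and the idea of handing candidates to an LLM afterwards are just my assumptions about how you'd do it:

```python
# Rough sketch: pull an article's revision history and flag edit summaries
# that look like factual corrections, as candidates for closer (LLM or
# human) review. Keywords and limits are placeholders, not a real method.
import requests

API = "https://en.wikipedia.org/w/api.php"
KEYWORDS = ("correct", "fix", "wrong", "inaccurate", "error")

def candidate_corrections(title, limit=500):
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|comment",
        "rvlimit": limit,
    }
    data = requests.get(API, params=params).json()
    page = next(iter(data["query"]["pages"].values()))
    for rev in page.get("revisions", []):
        comment = rev.get("comment", "").lower()
        if any(k in comment for k in KEYWORDS):
            yield rev["revid"], rev["timestamp"], rev.get("comment", "")

for revid, ts, comment in candidate_corrections("Formula One"):
    print(revid, ts, comment)
```

Finding the edit that originally introduced the error would then mean walking back through the diffs around each flagged revision, which is where most of the manual (or LLM) time would go.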
Excellent point. That's more difficult, but I think the ethical way to do it would be to recruit subject-matter experts to fact-check articles across a variety of disciplines. Bonus: you can then contribute corrections.
In general what I'm saying is, this is a fertile ground for natural experiments. We don't need to manufacture factual errors in Wikipedia. They occur naturally.
I mean, you're asking for a retrospective study, as opposed to a randomized controlled trial. It's useful and a great idea, but it's not like it's an equivalent way of getting equal quality data.
But is the goal to conduct a randomized controlled trial, or to measure the correction rate within the bounds of ethics? You go to war with the army you have.
Well, the goal is to measure the correction rate within the bounds of ethics, but the question is how accurate the result would be without an RCT. Intuitively I would hope it's accurate, but how would you know without an experiment actually doing it? How do you know there aren't confounding factors greatly skewing the result?
If you'll grant that we're also able to replicate the study many times, we're left with errors that are not caught by Wikipedians or independent teams of experts. At that point I think we're looking at errors that have been written into history - the kind of error that originates in primary sources and can't be identified through fact checking. We could maybe estimate the size of that set by identifying widely-accepted misconceptions that were later overturned, but then we're back to my first suggestion and your objection to it.
But more importantly we probably won't catch that sort of error by introducing fabrications, either. Fabrications might replicate a class of error we're interested in, but if we just throw it onto Wikipedia, it's not going to be a longstanding misunderstanding which is immune to fact checking (at least without giving it a lot of time to develop into a citogenesis event, but that's exactly the kind of externality we're trying to avoid).
(Of course, "how many times do we need to replicate it?" remains unanswered. I think maybe after we have several replications and have data on false negatives by our teams of experts, we could come up with an estimate.)
> Playing devil's advocate for a moment: how else do you test the robustness of the human process to prevent bad actors? Don’t you need someone to attempt to introduce a security hole to know that you are robust to this kind of attack?
How do you test that the White House perimeters are secure, or that the president is adequately protected by the Secret Service?
I think the key difference is supervision: is there another party keeping an eye on what is tested and how? And maybe ensuring that no permanent damage is done at the end.
That's frankly one of the first thoughts that came to my mind.
I've asked the author about ethical review and processes on the Fediverse.
That said, both Wikipedia and the Linux kernel (mentioned in another response to this subthread) should anticipate and defend against either research-based or purely malicious attacks.
If it's a mature product, you should be able to pick it up and rattle it without it breaking. If it's still maturing, then maybe the odd shock here and there will prepare it for maturity?
It's true that the system must be tolerant to these sorts of faults, but that doesn't mean we have a right to stress it. The margin for error is not infinite, and by consuming some of it we increase the likelihood of errors going undetected for longer.
Sometimes it will be worth it anyway, and I don't have an opinion about this Wikipedia example, but I think it's pretty uncontroversial that the Linux example was out of line.
I think one would have to weigh the pros and cons of this kind of research. In particular, the main cons (IMO) are:
* users are misled about facts
* trust is lost in Wikipedia
* other users/organizations use this as a blueprint to insert false information
The third harm (the blueprint effect) seems to be the most serious, but I suspect it has happened/will happen irrespective of this research.
Compared to the water reservoir example, these harms seem quite small. I would have liked to see a section discussing this in the blog post, but perhaps that's included in the original paper.
Everything was reverted within 48 hours. Your arguments might all apply in theory, but given the scope, size, practice, and handling here, I wonder, apart from the theory, what your opinion is on how they practically apply to this case.
I didn't make it very clear, but I agree that the specific example isn't problematic. The false claims weren't meant to be any sort of targeted disinformation, and like you mention they reverted it in 48 hours.
So the author, a philosophy professor, casually permitted themselves to vandalize a public resource "for science", with the only accommodation given to the public being that they generously fixed their own vandalism after 48 hours if others hadn't done so already.
Not a single word about the ethics of all this in the entire article, as far as I can tell. It's not just that they decided to do this, spreading misinformation and wasting other people's time in the process – they seem to not even have considered the impact of it.
Please re-take Foundations of Ethics 101 before presuming to teach others about philosophy.
15 years ago is almost exactly the date I have in my head for when Wikipedia stopped being interesting or exciting. Remember when it felt like something new and exciting, something that would tell you everything about the whole world? And then somehow it got captured by a bunch of bureaucrats who decided they wanted it to be a free copy of Britannica, and it achieved that, and has since just sat there begging for money and throwing it into the endless pit of irrelevant projects that is the rest of the Wikimedia Foundation?
More concretely, the new user experience got worse and worse: power-tripping mods squatting on their pet pages grew out of control, and all the notability/sourcing rules haven't really done anything to improve the quality of the content; they've just made it more tedious to contribute. And all of this felt predictable at the time. The whole site took a massive wrong turn, and yet there never seemed to be any way to stop it.
How could you possibly call the largest encyclopedia in human history, freely accessible to all and correct nearly all of the time, irrelevant?
There are imperfections to Wikipedia and the Foundation, of course, but what more can you reasonably expect? If you want old-internet fun, we have neocities.org for that. Wikipedia serves a functional purpose and does it exceedingly well in most cases.
> How could you possibly call the largest encyclopedia in human history, freely accessible to all and correct nearly all of the time, irrelevant?
I didn't - I called the other stuff the Wikimedia Foundation does, the stuff that it spends the vast majority of its immense donation income on, irrelevant.
> There are imperfections to Wikipedia and the Foundation, of course, but what more can you reasonably expect?
I expected policies that better reflected what most users and contributors wanted, or that helped make something better. Almost everyone was against the notability rules (and, while it's a minor thing really, almost everyone was in favour of spoiler warnings; they may have had theoretical problems but they worked well in practice). All the measures to make it harder for new users to edit were based on a false assumption - actually new users are no more likely to be vandals or politically biased than anyone else.
There's a major difference that sets Wikipedia apart from traditional encyclopedias. Britannica was written primarily by experts, ranging from Albert Einstein to Isaac Asimov and everybody in between, and these writers would offer their own perspectives, insights, and expertise. It was a rich primary source.
Wikipedia, by contrast, is not a primary source, and writers are supposed to do little more than paraphrase reliable sources. This is a good idea in theory to solve the problem of going from articles written by Einstein to articles written by random people, but it doesn't seem to work so well in practice. The problem is that any topic whose framing is seen as relevant by some group or another (an absurdly large number of topics) drags in partisans, corporations, politicians, intelligence agencies, and the rest of the dregs. And it leaves Wikipedia really hurting for quality on those pages.
I think it ceased to be a wiki a long time ago. The whole point of a wiki is to have the lowest barrier to entry possible for editing so any non-expert can jump in and change something quickly. The word “wiki” is Hawaiian for “quick”.
But these days, a huge amount of bureaucracy has sprung up around editing things on Wikipedia. There’s an immense number of rules; there are people with seemingly infinite time gatekeeping their favourite topics; there are protected pages; there are notability requirements; there are vast IP bans; and so on. Wikipedia has a bureaucracy fetish and an army of rules lawyers. You might argue that for the scale of Wikipedia this is required, and I’m sure there’s at least some truth to that, but that is incompatible with the idea of a wiki.
I know that the new user experience is awful now. I can't even correct misinformation, because the whole range of dynamically assigned IP addresses from my ISP (the largest in the country) is blocked, and I can't find a sign-up path for an account that's worth navigating, even weighed against the benefit of contributing a huge body of errata on a wide range of topics. That's my example.
Beyond that, a bureaucracy has arisen, and much like the nomenklatura, it exhibits many of the problems of the system it sought to replace.
Wikipedia is still great.
But the new stuff is not as fun. The user-added photographs from 20 years ago pleasantly surprise me a lot more than the recent additions. But I don’t keep a file on hand.
Wikipedia is great for science content and older, research-based content: things where the sources can be found and cited clearly. Wikipedia is not great for current events. There is a clear bias in Wikipedia's content.
Take even something as simple as the "Did you know" section on the homepage. For nearly all of 2022 and the early part of this year, one of the 5 items in the "Did you know" section was about Ukraine every single day. I'm not trying to make a post about the war; I'm really trying to say I wish Wikipedia didn't do this kind of stuff, so it could portray itself as a "neutral information hub" rather than a "wiki admin propaganda tool".
I arbitrarily selected the month of September 2022 and found that on only 5 out of the 30 days of the month was there an entry related to Ukraine or Ukrainians in the "Did you know" section. Of those 5, only 1 was related to the war. I disagree that 5/30 constitutes "nearly every day".
Furthermore, I don't see what's inherently wrong with including information in that section that relates to a topic people are likely to be interested in at the moment.
Additionally, the "Did you know" section isn't really for current events. That would be the "In the news" section.
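For anyone who wants to repeat that sort of count, here's a rough sketch. The archive page title and the per-day heading format are my guesses about how the DYK archives are laid out, so they may need adjusting:

```python
# Rough sketch: count how many per-day sections of a monthly "Did you know"
# archive mention Ukraine. The page title and the '==' day headings are
# assumptions about how the archive is organized -- adjust as needed.
import requests

API = "https://en.wikipedia.org/w/api.php"
TITLE = "Wikipedia:Recent additions/2022/September"  # assumed archive title

params = {
    "action": "query", "format": "json", "prop": "revisions",
    "titles": TITLE, "rvprop": "content", "rvslots": "main",
}
data = requests.get(API, params=params).json()
page = next(iter(data["query"]["pages"].values()))
wikitext = page["revisions"][0]["slots"]["main"]["*"]

day_sections = wikitext.split("\n==")[1:]  # crude per-day split
hits = sum("ukrain" in section.lower() for section in day_sections)
print(f"{hits} of {len(day_sections)} day sections mention Ukraine")
```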