AI beats human sleuth at finding problematic images in research papers (nature.com)
148 points by webmaven on Oct 4, 2023 | hide | past | favorite | 91 comments



How is the software only 2-3x faster than human review, which lasted months? Little in this article is very clear


I guess they had a system that required humans to confirm the validity, or something like that.

I had a similar problem when I realized someone had put up CSAM (child sexual abuse material) on the public demo of my CDN, PictShare [1].

I didn't want to look through all of these images, so I built a Raspberry Pi with a Neural Compute Stick [2] that used an AI model trained by Yahoo to filter out "nudity" images. I put the flagged images in an encrypted ZIP file along with the access logs and sent them to Interpol.
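
Conceptually the whole pipeline was very simple: score every upload with the classifier, keep whatever crosses a threshold, and bundle it with the access log. A rough sketch of that logic (heavily simplified; the threshold, paths and the scoring stub below are placeholders, not my actual setup):

    # Rough sketch, not the real code: score each uploaded image with an
    # NSFW classifier and collect everything above a cut-off, plus the
    # access log, into one archive for law enforcement.
    import os
    import zipfile

    UPLOAD_DIR = "/var/www/pictshare/data"        # placeholder path
    ACCESS_LOG = "/var/log/pictshare/access.log"  # placeholder path
    THRESHOLD = 0.8                               # placeholder cut-off

    def score_image(path):
        # Placeholder stub: the real setup ran Yahoo's open_nsfw model on
        # the Neural Compute Stick here; plug in any classifier that
        # returns an NSFW probability in [0, 1].
        return 0.0

    flagged = [os.path.join(UPLOAD_DIR, name)
               for name in os.listdir(UPLOAD_DIR)
               if score_image(os.path.join(UPLOAD_DIR, name)) >= THRESHOLD]

    # Bundle flagged images plus the access log (archive encryption is
    # left out of this sketch; I used an encrypted ZIP).
    with zipfile.ZipFile("report.zip", "w") as zf:
        for path in flagged:
            zf.write(path, arcname=os.path.basename(path))
        zf.write(ACCESS_LOG, arcname="access.log")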

This led to the arrest of a teacher here in Austria, so I'm glad I could do my part.

This even led to a BBC article about my system [3].

[1] https://github.com/HaschekSolutions/pictshare

[2] https://blog.haschek.at/2018/fight-child-pornography-with-ra...

[3] https://www.bbc.com/news/technology-44525358


Good thing you aren't German, where this would now probably be illegal because we have politicians who are scared of thinking: possession of CSAM no longer leaves a judge or prosecutor any way to decline or drop a prosecution, no matter the reason, even if the reason was bringing the material to the police. Making the removal of CSAM harder is a proud tradition in Germany that Ursula von der Leyen started.

A German legal discussion about this is here [0], concerning a teacher who wanted to help and is now being prosecuted (simpler summary here [1]). Apparently those idiots now finally plan to change the law, after exactly these issues were explained to them when they started implementing it.

[0]: https://community.beck.de/2023/08/30/wenn-gesetzgeber-und-ju...

[1]: https://www.lawblog.de/archives/2023/08/28/staatsanwaltschaf...


Oh wow that's really fucked up.

Here in Austria the police are not very well trained on any computer-related crimes either. When I initially found the first CSAM image, my first instinct was to call the non-emergency line of the police, and the officer told me to "print out the images, and bring them to the next police station".

I had a feeling this would be a very, very stupid thing to do, so I contacted Interpol and they basically said "omg whatever you do, don't print those out and don't carry them around".


> print out the images, and bring them to the next police station

That lets them solve an easier crime than finding the person who uploaded them or even who made them. Sounds like the police are pretty smart there. "Today we busted an individual who was intending to distribute child porn."


Ahh, Austrian police. A few years ago, my neighbour had an aggressive episode and ended up telling me "I bring di um" ("I'll kill you"). Naive as I was, I called the police, who promptly answered "Do kemma nix tuan" ("There's nothing we can do").


Can't you get a lawyer to file a restraining order? It unfortunately costs time and money, but at least it would deal with your neighbor.


How does a restraining order help with a mentally unstable person? Do you seriously believe in time of crisis, they will remember "Oh yeah, but there is this outstanding restraining order, so I need to calm down"?


Oh classic. Nothing is a "gefährliche Drohung" (dangerous threat) unless it's pointed at public figures.


True that. Quod licet Iovi, non licet bovi (what is permitted to Jupiter is not permitted to the ox).


> "print out the images, and bring them to the next police station".

Wow, I think that would have been an issue here even before the change; it's just that back then the prosecutor could have decided to stop the proceedings before it even went to court.


This is awesome, but since you mention you used a nudity detector and haven't reviewed them...

I would expect for a popular service you would have gigabytes of random legal nudity for every actual Interpol-worthy image, right? I guess Interpol is OK sifting through all that?

And wouldn't that mean that for all of those nude-but-legal pics you were unnecessarily disclosing regular user PII by sending along access logs to law enforcement? Not to say it wasn't worth it in the end


I asked Interpol upfront if it would be okay to send them the data for them to review, and they said OK.

There would probably not be a reason to upload "legal nudity" to a demo site of a self-hostable service, and I also wipe the whole database at irregular intervals.

But actually now I'm using the CSAM detection tool from Cloudflare, which automatically detects, flags and reports CSAM, which takes the hassle out of it for me. Not sure about false positives, but after all, it's just a demo site and nobody should upload anything other than test things.


The more detailed stories in the link do not mention that the material was forwarded unreviewed. In fact, they specifically mention that this concerns 16 images.


To me it did (and does) sound like the service operator did a very straightforward "detect nudity -> zip -> Interpol" at first.


Note that they said "I didn't want to look through all of these images", meaning all uploads, not the ones that were detected. They probably manually reviewed the ones flagged by the automated system before sending them to Interpol.


> Little in this article is very clear

It also wasn't clear why finding duplicated images requires "AI" or in what sense Imagetwin, which advertises:

Imagetwin is an AI-based software for detecting integrity issues in figures of scientific articles.

is AI. Perhaps it's just marketing.


It largely is. Many use AI and machine learning interchangeably these days. I’ve even seen a random forest model called “AI”…


The article seems to discuss manipulation (a valid concern) but the research focuses on duplication. I'm not entirely clear what the link between the two is?


Duplicated images are often a substitute for real data. https://www.newyorker.com/science/elements/how-a-sharp-eyed-...


Yikes. She contacted the journal to report this clear fraud they were perpetuating and received no response for 6 months. Yet people equate “peer reviewed” and “absolute fact”.


Peer review means "has no blatantly obvious methodological issues" at best, and not even that a decent amount of the time. A lot of people take it to mean that the paper's results/conclusions are accurate.


Peer review standards also differ between fields and journals.


I think when lay people say "peer reviewed research" they mean it as a metonym for "the scientific consensus as established by empirical research and vigorous debate," which isn't absolute fact to be sure, but is about as sturdy a foundation as you can hope for. I think I know more about peer review and retractions than most people - and I know next to nothing.


I think this demonstrates a hypothesis: The institution is failing, the technology which exploits the failure is close to irrelevant.


> I think this demonstrates a hypothesis: The institution is failing

This claim needs to be more specific to be meaningful: How is the 'institution' defined? What evidence do we have that it's performing worse than before, or not functioning (producing valuable scientific knowledge)?


If you want to hear back from any organization, you’ve got to write it on lawyer letterhead.


Which indeed doesn't work for journals because publishing wrong research isn't (and shouldn't be) illegal...


I just mean, it’s so easy to ignore inquiries and I’ve found that if you ever get ignored and need to be heard, a lawyer’s letterhead attracts attention.

I agree with the legal issue you put forth.


Yes, but the social engineering aspect of that trick might still work.


Frustrating that an article about images has none.


I worry about the ongoing ratcheting of the arms races that AI creates. It's nice that we have better tools to detect problems, but those tools are also readily available to bad actors, and they will run their attempted fraud through these tools multiple times until they succeed in bypassing them. We see the same thing in malware, where every malware producer makes sure before they deploy something that it doesn't get detected by any known tools or methods.


That's not quite the full picture you're painting here.

First off, at the moment the effort is super asymmetrical: it takes way more effort to check a paper for image duplication than it takes to alter an existing image and add it to a paper. So the journals are losing against the counterfeiters as we speak.

What's more, it is possible to run the detection tools long after publication, which makes it harder for someone to fake images because they run the risk of being detected later on with a more advanced version of the tooling. Typically malware devs try to erase all traces of their bad deeds and to run away undetected for this exact reason. But how do you hide an already published paper?

Finally, it takes effort to run the tools. If this effort is greater than the effort it takes to fake an image, which it is, then the cost may become prohibitive, so fakers will try to avoid faking+prechecking in favour of faking+hiding (by publishing in unchecked journals, for example). That is, until they group together to organize faking at scale (setting up and automating a tool like that is probably where the cost is, not running it). Faking at scale does exist; some countries practice it eagerly. So now the question becomes: do we turn a blind eye to such blatant misuse of research, or do we try to catch them red-handed and ban/punish bad behavior? Not such a hard question, since being on the offensive is the only scenario where we have a remote chance of stopping or at least reducing the influx of faked papers. You don't stop the mafia by ignoring them.


I think it might stay the same as it's always been: perfectly executed fraud is impossible to detect, but fraudsters are lazy and leave clues.


Sounds like a problem that AI can solve! It can help both protect against and perpetrate fraud!


Remember that "AI" doesn't actually mean anything. In this case it seems to mean "perceptual hashing image search".
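
For the curious, the core of a perceptual hash is only a few lines: shrink the image, build a bit string by comparing neighbouring pixels, and count differing bits between two images. A minimal dHash-style sketch in Python with Pillow (the 8x8 size, the distance threshold and the file names are arbitrary illustration values, and this is just the textbook idea, not necessarily what Imagetwin does):

    # Minimal dHash: resize to 9x8 greyscale, set a bit wherever a pixel is
    # brighter than its right-hand neighbour, compare hashes by Hamming distance.
    from PIL import Image

    def dhash(path, size=8):
        img = Image.open(path).convert("L").resize((size + 1, size))
        pixels = list(img.getdata())
        bits = 0
        for row in range(size):
            for col in range(size):
                left = pixels[row * (size + 1) + col]
                right = pixels[row * (size + 1) + col + 1]
                bits = (bits << 1) | (1 if left > right else 0)
        return bits

    def hamming(a, b):
        return bin(a ^ b).count("1")

    # Images whose hashes differ in only a few bits are likely duplicates,
    # even after rescaling or recompression.
    if hamming(dhash("figure_a.png"), dhash("figure_b.png")) <= 10:
        print("likely duplicate / reused image")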


Yea, that’s super annoying!

Just like the word car doesn’t mean anything. Last time someone threw out the buzzword car to me, it turned out they were actually talking about a 2008 four door Honda civic with four speed automatic in blue paint. I don’t know why they couldn’t have said that instead of trying to bs me by saying they were buckling their kid in the booster seat in their car.


Soon enough we’ll re-learn that you can’t trust anything except what you can personally reproduce.

Personally I’m glad. Too many years of people subscribing to the most boneheaded theories and saying “look, it’s peer reviewed!”.


> [Y]ou can’t trust anything except what you can personally reproduce.

Do you believe that matter is composed of atoms, or that stars are distant suns, or that black holes exist, or that the speed of light is 299,792,458 m/s? Have you reproduced any of those observations?

Personally I don't really want to be confined to my own humble means. I'm just not that smart, life is much richer when I can learn from others. I'd rather stand on the shoulders of giants and inherit a body of knowledge from my society, accepting that it contains flaws and even outright lies that I will need to do my best to discard.

There's a lot of space between blindly accepting all peer reviewed research and rejecting science altogether.


> Do you believe that matter is composed of atoms, or that stars are distant suns, or that black holes exist, or that the speed of light is 299,792,458 m/s?

Not particularly. I don’t actively disbelieve them either. But whether or not any of those are true has zero measurable impact on my life, so it is of no consequence.

I do however believe that exciting electricity in a periodic manner will cause a wave to be emitted that travels “really fucking fast” and can be received by a similar device a great distance away.

And that such excitations can be received by antennae I have constructed which communicate with satellites in a precise manner that can allow me to locate myself on a coordinate plane. And that those calculations depend on the speed of light being as you say. So sure, it is likely to be as you say. Or at least isomorphic to that, for my purposes.

How has your blind acceptance of these various other non-observed phenomena benefited you?


I reject that my acceptance is "blind": just because I haven't taken spectrograms of distant stars and verified that they are composed of the same stuff as our sun doesn't mean I haven't engaged critically with the idea and evaluated it on its merits. Honestly I am irked to be told that I need to be more humble about what I do and don't know by someone willing to immediately jump to conclusions about me, and I feel it undermines your point.

Engaging with and understanding the world does routinely enrich my life, yes. I've seen a lot of really beautiful things because I've known what to look for, or what is surprising or out of place, and I've known those things by being educated about the world.

Here's one arbitrary recent example I discussed on HN: https://news.ycombinator.com/item?id=37442127

But hey, you do you, if neither believing nor disbelieving in stars does it for you, go for it.


Not sure why this is downvoted. Is there some glaringly obvious life-changing impact of holding faith in the common astronomer’s view of black holes / stars / etc. that I am missing?


I would wager it's because of the "blind acceptance" bit, which is rude and presumptive.


> that the speed of light is 299,792,458 m/s

This is currently a fixed constant, not an observed quantity, because of the way the metre is defined.


True, but that means the length of a meter is a scientific observation we can't individually duplicate.


While you shouldn't trust everything, trust is certainly still necessary since few people have the experience or resources to do independent scientific studies.


Of course you will still need to trust others to reproduce results in areas with which you are unfamiliar, but this is still better than implicitly trusting peer reviewers.

Same thing with software security: if software is open source then I still need to trust someone else to determine if the source code is malicious, but it is better than implicitly trusting software because e.g. “a contractor has done a security audit.”

Basically I don’t believe you can truly trust closed-anything, especially scientific analysis. You need _less_ trust when the thing is reproducible, which is the best you can ask for.


Science is very reproducible by design; half the goal of peer review is to ensure that.

The problem is that the incentive structure is not aligned with that, and so reproduction is never done.

I would instead argue that consensus is the best bar we have to judge science. If the majority of authors' papers align, that means either there is a conspiracy or some fraction of things was reproducible.


If that is the case, why did the replication crisis occur despite peer review? And why don’t most studies publish code or data? If you look closely most papers today are not reproducible.

Consensus is a horrible way to judge science. The whole point of basing knowledge on empirical data is so that you don’t need to defer to authorities or consensus (nullius in verba). Consensus based science is just a synonym for Kuhnian paradigms (aka a scientific popularity contest). See Lakatos and Feyerabend for why this is not a logical or useful way to judge science.

Imagine a field where the consensus methods are later found to be in error. In this case the only valuable thing to do is to challenge the consensus so the field adopts better methods. In the meantime, following the consensus guarantees following falsehoods because everyone is wrong in the same way. In many cases the minority heterodox views are the most correct.


The replication crisis occurred because nobody replicated. It wasn't that you couldn't (although I slightly overgeneralized the ease of replication, I will admit).

Pretending that the tools we have aren't useful because they aren't perfect isn't valuable.

Heliocentric models were good for science even if they were wrong.

You aren't saying anything that moves the line towards a better situation, just pointlessly complaining that it isn't good enough.


I have read a lot of papers that are vague with the details of implementation. Some release code that isn't portable at all and probably works only on their machine with 0 documentation.

A lot of obfuscation in general. Poor documentation and omission of critical implementation details means it's way too hard to replicate a paper.

And even if you do replicate there's no way of finding out if the output is wrong because of your implementation or the idea itself is flawed.

Doing proper documentation should be mandatory. I can only talk about CS, but yeah, more information.


I think you should reread the original replication crisis paper: a huge proportion of studies simply could not be replicated based on the information in the original papers.

I agree heliocentric models were useful, but if scientists had followed the consensus methods that produced heliocentrism then we would not have gone past it. It was heterodox theorists who resisted Catholic authority who progressed science. The Vatican are not good peer-reviewers.

If you want a better solution look to Lakatos’ Methodology of Scientific Research Programmes. It is much better than the status quo of relying on peer review. It is rarely used due to inertia of the status quo, but methods like it will become essential as scientific failures mount.

I don’t think my complaints are pointless, we need to take scientific failures very seriously because they eventually become policy failures and cause all kinds of suffering.

Also, I’m not pretending that tools like peer review are useless; I’m arguing that they are useless, and providing alternatives that have been developed over decades of epistemological progress.


My understanding of the paper is less "we couldn't reproduce this paper" and more "insufficient data is available to guarantee any replication is identical".

Basically how can you be sure your difference points to a mistake in the original vs a mistake in repetition.

Additionally I think it was also a little "replication would be cheaper if we did this".

However, I think calling it fundamental is a bit of a stretch. After all, such a paper is going to heavily focus on such details by necessity. You have to categorize to perform such a study, and blockers become glaringly obvious.

However IMHO I think that a failed replication caused by inaccurate replication isn't a lost cause and instead acts as a new lens to view the original in. "You forgot to mention whether you controlled for X so I did but it didn't work" is valuable but certainly harder to catch in peer review (where the explicit mention might have caused questions).

Put another way we can replicate as well as we need to if we prioritize it and while better steps are important they aren't as important as actually replicating.

You only once mentioned a single theory, which only touches on a part of the replication crisis and not the core issue of "no one can replicate with no funds to do so"...


I completely agree that replication is essential, but am arguing that peer review and consensus based science do not help improve replication and thus do not produce better science (generally).

We put huge amounts of (unpaid) scientific resources towards peer review and it has hardly improved anything. It’s the opposite: “it’s peer reviewed” has become a reflexive defence of published papers with the incorrect implicit assumption that peer review makes the science better.

We should focus on the initiatives that have a proven track record of improving things (like open science) and abandon peer review and reliance on consensus.

As you point out, financial limits prevent replication. So why do we spend so much time and money on fancy journals and reviewers who have such a terrible track record of improving things? We could be replicating instead.

For a recent example of consensus failures, see Nobel Prize winner Katalin Karikó, whose seminal paper was rejected by Nature and whose research programme was rejected by her university for being too incremental. Her peers failed because they relied on flawed heuristics like consensus.


> We put huge amounts of (unpaid) scientific resources towards peer review and it has hardly improved anything.

What was the alternative (and when?), and what were the results? Who will fund all this replication - basically double-funding all research? Do we want our scientists spending their limited time on old things instead of discovering new ones?


I have more ideas in my other comments, but fundamentally I think open review is a better norm. Publish a preprint and let the world have at it! You don’t need a committee to decide who your ‘peers’ are: if they care enough they will be reading your paper anyway. Normalising publishing these reviews (like PubPeer) would ensure review without an arbitrary choice of ‘peers’. Think a paper needs more reviews? Then pay the reviewers and publish their reviews. The benefits of secrecy and gatekeeping are negligible compared to the downside.

Currently billions are spent on academic publishing even though servers are cheap and the labour is unpaid. This would be a good starting point to fund replications. But that will never happen while the status quo is “peer reviewed = good science”

Also, scientists should be spending time on what they think is valuable. Scientists happily replicate important papers using their existing resources. If no one cares about your result and won't replicate it for free, then pay someone to do it! This is the basis of adversarial collaborations (see the FEP collaboration funded by a philanthropic foundation). No one is advocating that e.g. we replicate Newton's laws for the Nth time. And how do you determine whether N is enough? Use Lakatos' MSRP to compare rival theories.

Again, these dreams are predicated on the status quo changing to where scientists care about the content of a paper, not where is it published or that a faceless committee has accepted its foibles. A world where the public response to a non-replicated result is:

"I wonder if any groups plan on replicating?" vs "New paper published in __nature__ has <newsworthy counter-intuitive yet unreplicable result>! Book deal / gov. contract / faculty promotion here I come!"


Without consensus-building there is only madness. Either you accept that some authority (possibly you) is the ultimate arbiter of truth. Anyone who disagrees with them is automatically wrong. Or everyone who believes that their opinions are scientifically justified is equally right, as long as they stick to their beliefs. Or you choose to believe that you can always trust your own judgment, ignoring the evidence that often points to the contrary.

Consensus-building is about crowdsourcing the truth. You reject both authorities and relativism. You believe that, in many cases, the objective truth exists, even if it can be difficult to reach by fallible humans. You are also optimistic enough to believe that experts who share your worldview will accept true arguments and reject false arguments. Eventually, on the average, and with a high probability.


Basing truth on consensus is madness.

Science _is_ crowdsourcing the truth, by using ideas that other people generate and then testing them empirically to verify them.

This does not require consensus; rather, consensus causes problems which make science worse. Disagreements are always better for progress. We don’t have to accept current problems when there are better alternatives (see Lakatos for a good example).


But how do you know that the idea was verified successfully? It's common that some people manage to replicate a result, while others fail at it. No matter how many levels of indirection you add, you are still facing the same problem. Do you choose to believe in an authority, or is everyone equally right? Do you believe in your own infallibility? Or do you resort to consensus-building?

Disagreements have a key role in building a consensus. But those who disagree with the consensus are usually wrong, because the world is full of capable people with contrarian tendencies and weird ideas.


You will never “know” with certainty because of the limits of inductive reasoning to prove theories, but you _can_ choose demarcation methods that don’t rely on a consensus.

The consensus might be correct more of the time, but that shouldn’t be a scientific reason to follow the consensus. You should follow the consensus because you have some empirical reason to agree with it and disagree with it if you have reason to believe a better theory.

The only person who needs to defer to an authority reflexively is someone who is completely ignorant of science. They don’t have the sophistication to judge, but thinking that this extends to sophisticated actors is an error. Consensus is for the ignorant.


> The only person who needs to defer to an authority reflexively is someone who is completely ignorant of science. They don’t have the sophistication to judge, but thinking that this extends to sophisticated actors is an error. Consensus is for the ignorant.

This is where we disagree. The ignorant believe in their ability to judge. The deeper your expertise gets, the narrower the scope where you trust your judgment becomes. Because you have seen so many ways things can go subtly wrong. And because you have already been confidently wrong so many times.


I don’t understand how we disagree? The expert has more confidence in their field and thus can more safely reject the consensus than the ignorant.


This is a pretty rosy interpretation of how the science sausage is made.

1. The scientific method might be reproducible by design, but there can be (and often is) a very large gap between "Science" and what gets published as a paper.

2. Peer review is a grab bag. Sometimes it's obvious the reviewers barely read or understood what was written. Other times they provide feedback that is asinine. Or they simply provide comments, you tell the editor you're ignoring those comments, and the paper still gets published.

3. Only a fraction of reviewers (depending on field) will check that the paper includes enough information to be reproduced. This is easy to verify, just go ask different academics how many papers they read that don't include enough information to reproduce. And only a fraction of a fraction make any attempt at reproducing the paper to any degree before signing off on it. (and if they say that they couldn't reproduce it, then see the end of point 2 above)

> If the majority of authors' papers align, that means either there is a conspiracy or some fraction of things was reproducible.

This is a false dichotomy. It misses some very real dynamics, none of which are conspiratorial.

1. It's hard to publish negative results, especially if the paper was written by someone influential in the field. Which means that there's a great disincentive to even try.

2. Competing theories are difficult to study not just because of peer review issues but also because of financial and reputational issues. Getting funding is significantly harder if your work doesn't use the methods that the funding agencies expect. Same idea with reputation: it's hard to get your career started if you want to work on theories that go against the prevailing theory or if you have work that contradicts it.

This is the whole idea behind the concept of paradigm shifts.


The former issue will be easier to fix than the latter.

After all going against the establishment generally results in failure.

Quantum Mechanics and General Relativity may not agree with each other but they both have destroyed mountains of attempts to disprove them.

I am not being rosy; I am on the side of "we have to do something", contrasted against "all science is bad".

Not saying you are saying that but some posters are.

We can certainly improve things on many vectors but throwing away all existing science isn't the way forward either.


> Quantum Mechanics and General Relativity may not agree with each other but they both have destroyed mountains of attempts to disprove them.

I agree. Unfortunately for all of us, the rest of the published literature is nowhere near as battle-tested as these two theories, nor would it survive such testing.

> Not saying you are saying that but some posters are.

I didn’t mean to imply that you were. But it doesn’t sound like the person you replied to was on the side of “all science is bad” either.

I am closer to “much ‘science’ is junk”. There is much good work being done, but the garbage is very much there, more so in some fields than others.

The null hypothesis at this point for most new papers one reads in certain fields (looking at you, nutrition & health) is “this won’t replicate”. It’s certainly also true for much work primarily involving modelling; the null hypothesis as a reader should be “if this doesn’t have code, I won’t be able to reproduce/replicate it”.

We should not throw the baby out with the bath water, but we should be frank about the current state of things.


No one (in this thread) has suggested science is bad or we should throw it away, that is your own straw man.

I have suggested we throw away a part that is causing problems. You seem to be rushing to defend scientific failures by saying “it doesn’t matter that a lot of studies don’t replicate, we should follow the status quo regardless.” We can throw out the status quo and keep (the good parts of) science intact.


Peer review has little to nothing to do with reproducibility. In fact I’d say is has absolutely nothing at all to do with it. Why? Because exactly as you say: you can get a paper “peer reviewed” despite it being totally bogus and irreproducible.

But, as you also say, it’d be a Really Good Thing if it was about reproducibility! Imagine a world where instead of some people writing an essay, their “peers” giving it meaningless comments, and some editors at a paper selecting it to be enshrined as “valid”, we totally flipped the script:

People perform a scientific observation. They record their methods and results, and put it into some freely accessible store of data regarding the question at hand. Anyone is free to consult the store for any question, and observe how many entries it has and how their results compare. If an entry has very few results, the person consulting it with the question would be encouraged to create a reproduction of their own, and share the results they derived as a sibling of the original paper.


It’s related because peer reviewers have the power to sink or float a paper based on its ability to reproduce, but currently they don’t use it. Thus there are not many incentives to try to make a paper reproducible.

I couldn’t understand the last couple of paragraphs. Sarcasm?


An institution which fails to use a power is precisely equivalent to one that doesn’t hold that power at all.

As for the last few, I’m basically just saying we should dismantle the current “journal” concept entirely (it stands only to benefit those who receive the fees it takes and those who derive self-worth from being published in a “prestigious” journal), and replace it with a system by which for any given scientifically testable hypothesis, a collection of many different reproduction attempts and their respective methodologies and results are immediately available all side-by-side. With that in place, no scientific result would derive any credibility from being “peer reviewed” or not, but rather from the quantity and diversity of reproduction attempts it has faced.

This database should be free to query and free to insert into. Individual papers may support community comments to serve as the weak “peer review” we currently have, but at no point should these comments be considered anywhere near equivalent to a full reproduction attempt.


How much research have you replicated?


A handful of papers. Lots more if you include high school and undergrad lab assignments.

In my field replication usually involves a lot of guesswork since papers don't usually have enough information to be certain of their exact methods (no code or data unless the planets align or the authors are friendly) and yet peer reviewers happily apply their rubber stamp.


A large proportion of the population subscribes to one or more false conspiracy theories. On the other hand, man-made global warming is just barely accepted in the general population. We shall learn no such thing.

People believe the papers they want to believe and don’t believe the papers they don’t want to believe. AI doesn’t change that equation because people don’t read papers to start with and journalists are easily fooled.


Bad actors have always been able to take advantage of the same tools that good actors have. There is nothing new about this with regards to this technology iteration.

You have to simply believe (which is fairly easily to validate) that there are far more good actors than bad.

At least outside China and Mexico. I don't know what's happening in those countries. Systemic breakdown of morality perhaps. But that's a people problem, not an AI problem.


I hope AI will find the nurses that hate their jobs and are harming their patients. There has to be a vast number, just going by the large percentage of other people in other professions that hate their jobs, and are doing it badly.


> I hope AI will find the nurses that hate their jobs and are harming their patients.

I think it would be better to just find the nurses that are harming their patients irrespective of whether they enjoy it while they do it.


> arms race

I would guess it mostly isn't an arms race because it is not symmetric.

There are significant asymmetries in tool costs and benefits, especially build versus SaaS. Any tool created would sell in different markets (selling to cheaters versus selling to verifiers).

Also there is a time component for publications. A cheat could test their paper against a tool, only to find that the tool adds new detection capabilities in the future and the cheating is detected.

Look at the markets for plagiarism tools and game cheats.


I thought the same thing: the next crisis will be "We've run the tool over it, it's all good, no further checking required", yet it fails to detect fraud. I know, I know, it's an improvement on humans, until humans don't bother verifying anything anymore and it turns out we've been accepting fraudulent papers for 10 years.


I agree that malicious actors will always be able to exploit holes in existing tools, it's an endless arms race.

I worry as much about the use of these fears as a means to establish centralised control of AI tools, rather than strengthening the institutions which are already supposed to deal with this kind of thing: illegal things are already illegal, detect them and follow up. Consequences stop crime.

For example, fake news is/was nothing new when it became a weaponised buzzword in the Trump era - lying has always existed, and social networks of trust plus legal consequences for malicious/dangerously negligent mistruths are a proven solution. The fact that these institutions might not be functioning properly (i.e. allowing the mass uncritical publication of mistruths for profit) is more dangerous than the technology that exploits those gaps.


The tool in question isn't openly available, you have to apply to get access to it. So I don't think the arms race problem exists here.


Compared to malware, fraud has a lot more of a paper trail, and the AI tools of tomorrow can detect the fraud of today, limiting the damage the fraud can cause. Meanwhile malware can cause a cascading wave of damage very quickly.


This article is very confusing. It switches between talking about image duplication and image manipulation. It sounds like the software detects image duplication, ie, reusing an image from some other source. Then there are other references to image manipulation being a problem, but no indication that this software detects manipulated images (unless they mean it finds images that are mostly duplicates, but have been manipulated to hide the duplication).


This would have been a good article title for, let's say, the 1970s or 1980s.


For one, this definitely seems like a submarine ad for the AI tool (Imagetwin, henceforth called AIT for "AI tool" to not give them too much more free press). Not sure if Nature has an angle (probably also free scans) but they gave the researcher free access to the AIT per the preprint disclaimer [0]. Also, it's not really giving you enough info to assess whether the AIT is better, because it's basically PR, so allow me:

For one, the fact-checking against the AIT per the preprint is "Duplications highlighted by ImageTwin.ai were evaluated as appropriate or inappropriate by one reviewer (the author)". Not a great way to reduce false positives given the conflict of interest (more free scans for the author!) and the fact that he'd already reviewed most of the offenders.

Also, the article says the AIT missed four papers the author flagged. But in the preprint the category with "At least one inappropriate duplication was identified during the manual review, none were highlighted by the ImageTwin.ai software" has 34 members. Per the author's results text, out of 715 papers with images, the author caught 34 inappropriate duplications the AIT missed, the AIT caught 57 inappropriate duplications that the author subsequently agreed were bad, and they both caught 24 together. But these numbers disagree with the venn diagram shown in figure 3 and referenced in the conclusion, which the Nature article also references. So... am I missing something or is that inconsistent? And were there any AIT flagged papers that the author disagreed were problematic or not?

AIT charges a pretty non-trivial amount per scan; at high volume it's still >$2 before you get to custom pricing [1]. It was "only" 2-3x faster than the researcher, and at least per the article it still needs to be checked for false positives. Taking some normal researcher pay and reasonable estimates for the rate at which they can review papers, it seems like the AIT is pretty damn expensive and might be priced competitively with "research intern".
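
To put a rough (and entirely made-up) number on that comparison, here's the kind of back-of-envelope arithmetic I mean; the pay rate and screening rate below are illustrative assumptions, not figures from the preprint, and only the >$2/scan comes from the pricing page [1]:

    # Back-of-envelope sketch; every number except the tool price is an
    # illustrative assumption.
    reviewer_hourly_pay = 30.0      # assumed USD/hour for a human screener
    papers_screened_per_hour = 15   # assumed manual screening rate
    tool_price_per_scan = 2.0       # lower bound from the public pricing page [1]

    human_cost_per_paper = reviewer_hourly_pay / papers_screened_per_hour
    print(f"human: ~${human_cost_per_paper:.2f}/paper, tool: >${tool_price_per_scan:.2f}/scan")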

This technology has existed for a really long time in the form of reverse image search. Google launched it in 2011 and later neutered it (probably not profitable enough), but Yandex has had a pretty good one since 2014 [2]. Overall this seems like a pretty sloppy preprint with an obvious conflict of interest, and a tool that doesn't have any apparent innovation beyond commercializing a probably-underserved vertical.

[0] https://www.biorxiv.org/content/10.1101/2023.09.03.556099v2

[1] https://imagetwin.ai/pricing/

[2] https://www.searchenginewatch.com/2014/06/19/yandexs-sibir-r...


Machines are very good at detecting other machines.


Why is the article's title phrased with the word "problematic" rather than "duplicated" or outright "fraudulent"? Are we trying not to offend someone's sensibilities, or not leap to conclusions (when in fact, journals have been far too reluctant to do what they should have been doing about this kind of academic dishonesty)?


Things like this are why you cannot immediately know it's fraud: a bug in Xerox copiers changes numbers to other numbers due to some OCR nonsense.

http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...


Jesus christ


AI can't do a full investigation and identify it as fraudulent; it can only flag things that look weird.


Because it’s trying to collapse multiple types of “problems” into one word in the title.


And also because "creating problems" that need to be double-checked is a much lower bar than "fraudulent", especially since accusing someone of fraud can carry slander/libel charges.


I am always skeptical when I see the word "problematic". It is often used to handwave over what the actual problem is, and is very often used when the supposed problem is either trivial or a subjective judgement that many people wouldn't even deem to be a problem.



