I had a disreputable eBay seller use a similar trick: The Apple product they sold turned out to be counterfeit (unbeknownst to them, they claimed), so they took down the original eBay listing. For some reason, eBay prohibits you from leaving feedback for sellers on orders from listings that are taken down in this way. So this seller still has like 99.7% positive feedback and continues operating even though they at best wasted the time of dozens/hundreds of people who received counterfeit goods and either didn't notice or had to fight for a refund.
Had an eBay seller sell me refurbished hard drives as brand new. I purchased new Western Digital Red 5k drives; they sent refurbished 10k enterprise drives with 5k stickers.
High-99% feedback, a large number of items marked as sold.
When I contacted the seller to say I'd received refurbished drives, citing the reported drive stats, I got abusive phone calls: they had my address, they don't sell fakes, and they were going to come to my house and teach me a lesson for calling the drives fake. It turned out the registered address was 2,000 km away at some random suburban residential address, and the person on the phone spoke broken English and was calling from a Thai phone number, so I wasn't too worried about that; I laughed it off and said good luck.
I contacted eBay and left a poor review saying I'd received fake goods and gotten threats of violence from the seller. eBay refunded my money, and a week later my review was removed, leaving the seller to continue scamming customers; there must be many, judging by the items sold and the number of reviews.
I ignore positive reviews now. If bad reviews get removed, what's left is curated and worth nothing.
RE "......the person on the phone had broken English phoning from a Thai phone number......" Probably a person that works at the hard drive factory IN Thialand. Hard drive capacity reduced in this way when part of the drive has a problem. Maybe its a warranty return. I would complain directly to Western Digital . If the problem was detected at factory, most likely it never would have had a 10K sticker put on ....
Next-day delivery: you pay, it ships, then the delivery estimate gets an "oops, your delivery had a problem and has been rescheduled for 6 weeks in the future".
No refund because it has already shipped, and the original listing disappears after a couple of days ("listing not found"), so no feedback is possible either.
The last thing I bought on Amazon was a potato peeler. Instead of the quality European peeler I paid for, I received a cheap Chinese peeler that bent the first time I tried it. No returns allowed because the item was considered "dangerous".
You'd think that a consistent customer would be worth more than $30, but apparently not to Amazon. They wouldn't make it right so I shop elsewhere now.
> No returns allowed because the item was considered "dangerous".
If you have prime, talk to customer support until you get the answer you want to hear. In general, they have been extremely supportive of all the various issues I've run across, but then again, I buy a lot of stuff from there.
Do you need Prime to get basic support? I only ask because your post begins "if you have prime".
Resolving a situation where a different, cheap, unusable item is delivered is absolutely basic customer service, which was paid for when GP bought the item, regardless of how frequent or premium a customer they are.
You can't shop at Costco without being a member; I don't see how that's related.
The other reply was my experience too. I ordered a large box of a food product, and it arrived with an expiration date the same week (technically a best-by date, but I doubt it would have been good by the time I finished all of it). The site said no returns on food; I contacted support and got a chatbot that told me the same. Got through to a person and they credited my account.
This sounds like Facebook Marketplace. I bought counterfeit AirPods Max. They showed up as legitimate on the phone, so I thought they were good, but they were fake. Before I realized what had happened the seller blocked me, which apparently prevents leaving feedback. I reported the fraud to Facebook and it went into a black hole.
I had found someone else scammed by the same seller and I was annoyed enough to go to the police. The police called it a civil matter and refused to do anything. Obviously I didn’t have the seller’s real name so I couldn’t pursue that angle. I saw from another Facebook account that the seller is still hawking counterfeit goods. So, I guess that’s effectively legal.
The ridiculous thing I learned recently is that negative eBay feedback falls off after 12 months. Helped explain why I keep buying things from 100% positive feedback sellers and receiving broken junk…
The seller tried to only refund half the purchase amount because I didn't have the original box it was sent in. (I threw the box away before I realized the item was counterfeit.) I had to appeal his partial refund to eBay, who agreed with me that if someone sells you counterfeit crap they have to fully refund you even without the box.
After exchanging several emails with the seller, it was clear that he believed that as long as he gave refunds when people sent everything back to him in mint condition, he had done nothing wrong and did not deserve to have his reputation damaged. The problem with that, obviously, is that he wasted the time of dozens/hundreds of people, and defrauded people who did not notice the item was counterfeit. (People only got refunds if they noticed and requested; they were not automatically told they could get one, as an honest seller would have done.)
Under the current system, sellers face little incentive to make sure they aren't unwittingly acting as a fence for stolen/counterfeit goods. Without reputational hits, they only have to reimburse buyers for the purchase price, not for the time wasted, and many/most buyers won't bother to return the item anyway.
On those simple ideas to fix it: I don't think it's simple. Once you deploy simple heuristics, the other side gets just a little bit more sophisticated to get around them. So you improve the heuristics to catch the spammers. And again, the spammers get around the improved tests as well. And so on.
In the end, you end up with spam filter methods similar to what we have for email, and probably to what other social networks run as well. But this is far from simple. I don't think a huge pile of hand-crafted heuristics is really a good solution. I think it should be a machine learning model that you train, which does it all automatically without too many false positives (and also not too many false negatives).
I think issues like these will continue along the path you describe until we tie authorship to real humans and their reputation, or perhaps something else they care about, in one way or another.
I'm not saying that we should give up on the idea of anonymity on the internet, at least not completely. But real humans have to put something at stake when they use space on any part of the internet that other humans should care about.
I've participated in plenty of communities without spam problems where I have no idea of the real identity of the other participants, and neither does anyone else.
I can't help but note that neither of us attach our real identity to our participation here.
I think a better solution might be structural. In this GitHub example, why can someone tag people they're not collaborating with to begin with? Why are identities so easily discoverable and contactable? The risk seems to far outweigh the value.
I hope there are other ways, although I suspect the communities you're thinking about are somewhat niche? That would only motivate a small investment in setting up spambots, so a little resistance is enough to make them move on. If that explanation doesn't hold, I wonder what the explanation would be.
I also suspect that the barrier to spamming absolutely everywhere will drop as soon as somebody figures out how to make LLMs configure spambots for arbitrary domains.
I don't understand your comment; if reddit is part of the dead internet, then it is full of spam bots, no? Or are you saying that users perceive most other users as bots when they are not?
> I think issues like these will continue along the path you describe until we tie authorship to real humans and their reputation, or perhaps something else they care about, in one way or another.
I think this is attacking the problem from the wrong end. This is just increasing the penalty for spamming. (Real reputational damage.)
Why not shift the other end—increase the difficulty of spamming in the first place.
Could be as simple as charging for account creation (even if just a deposit that’s refunded after some amount of non-spam activity).
But without getting into all the baggage payments bring—don’t allow people to ping others until they’ve reached a certain “reputation” by hitting some threshold of non-spam activity? Rate limit how many people you can ping based on reputation? Limit to only users associated with the repo you’re making the MR on? Rate limit how quickly you can create MRs and comments based on reputation?
Seems like there are a lot of levers to pull here that could curtail the problem without jumping straight to tying accounts to real life IDs.
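To make the reputation-gating idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the Account shape, the thresholds, and the budget formula are made up for illustration and would need tuning against real abuse data.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would need tuning against abuse data.
MIN_REP_TO_MENTION = 10    # earned via merged PRs, accepted issues, account age, etc.
MENTIONS_PER_DAY_BASE = 5  # daily @-mention budget at the minimum reputation

@dataclass
class Account:
    reputation: int            # accumulated non-spam activity score
    mentions_today: int        # @-mentions already sent today
    is_repo_collaborator: bool

def may_mention(author: Account) -> bool:
    """Gate @-mentions behind reputation and a per-day budget."""
    if author.is_repo_collaborator:
        return True   # collaborators are vouched for by the repo itself
    if author.reputation < MIN_REP_TO_MENTION:
        return False  # brand-new accounts can't ping strangers at all
    # The budget scales with reputation: trusted users get more headroom.
    budget = MENTIONS_PER_DAY_BASE * (1 + author.reputation // 100)
    return author.mentions_today < budget

# A fresh throwaway account gets rejected immediately:
print(may_mention(Account(reputation=0, mentions_today=0, is_repo_collaborator=False)))  # False
```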
A one-time administration fee for account creation would make most of these spam tactics unprofitable. People are probably too used to having everything for free online (paid for by ads) for anyone to try this right now.
The fix isn't ever more heuristics - that is an uphill battle that can't ever be won. The fix is following the money and disconnecting the bad actors' ISPs from the Internet.
There was a time when abuse@<isp-domain.tld> emails were honored and administrators actually took notice of what came in, but these days are long since gone - ISPs simply don't want to spend the money, and so the cost of abuse is externalized to society at large.
ETA: Also, a fix would be to have a human with more than ten seconds to spare look at even 1% of spam reports. Spammers are lazy; they always use the same template, so having a human actually look at the template and then route every match to /dev/null is far more effective. Like... I can do this on Twitter for every new variation of some scam, why can't Twitter do it on its own?!
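A rough sketch of that template-routing idea, assuming Python; the normalization rules below are just plausible guesses at the bits spammers vary between posts (links, amounts, whitespace).

```python
import hashlib
import re

known_spam_templates: set[str] = set()  # filled from human-reviewed spam reports

def normalize(text: str) -> str:
    """Collapse the parts spammers vary so every instance of the
    same template hashes identically."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "<url>", text)  # links rotate per campaign
    text = re.sub(r"\d+", "<num>", text)           # amounts/dates vary
    return re.sub(r"\s+", " ", text).strip()

def template_key(text: str) -> str:
    return hashlib.sha256(normalize(text).encode()).hexdigest()

def report_as_spam(text: str) -> None:
    # One human review of one instance blocks the whole template.
    known_spam_templates.add(template_key(text))

def should_drop(text: str) -> bool:
    return template_key(text) in known_spam_templates

report_as_spam("Claim your 500 USDT airdrop at https://scam.example/1")
print(should_drop("claim  your 900 usdt AIRDROP at https://scam.example/42"))  # True
```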
Do you really want ISPs to be in the business of deciding what should and should not be on the internet though? That sort of thing typically doesn't work too well.
Deciding what is or isn't a "scam" is really a job for the independent judiciary. But getting a ruling is difficult and time-consuming, and also largely pointless because a new website can be created almost instantly, and there are many foreign ISPs where you can't get a ruling at all.
I don't really disagree with your basic premise that "disconnecting the bad actors' ISPs from the Internet" is the ideal solution, but this is far more difficult than your comment implies – almost impossible with how the internet currently works.
I think the parent comment was referring to the days when ISPs were local, or network services were provided by a school or some other entity with a vested interest in keeping their network clear of bad actors.
Author/OP here: yeah, "simple" is a relative word, and I tried to address that with a few sentences around it. I didn't want to come across as an "internet armchair expert", since I don't have a lot of experience in at-scale content moderation.
On the other hand, posting the same spam nearly 1k times in 24 hours should raise some sort of alarm. One manual takedown of a spam comment by a human should trigger a search (or offer the option to trigger one) through recent comments for ones that are close. If I can search their whole site in 3 seconds, they should have some sort of system that looks for 95% similar comments.
Seems simple, but I am sure it's actually much, much harder and has caveats and gotchas along the way as it scales.
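For what it's worth, a naive version of that "95% similar" check fits in a few lines of standard-library Python; the hard parts are the threshold choice and making it scale, which this sketch does not attempt.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.95  # the "95% similar" figure suggested above

def find_near_duplicates(removed_spam: str, recent_comments: list[str]) -> list[str]:
    """After one manual takedown, flag recent comments that are
    nearly identical to the removed spam."""
    return [
        c for c in recent_comments
        if SequenceMatcher(None, removed_spam, c).ratio() >= SIMILARITY_THRESHOLD
    ]

# Pairwise diffing won't scale to GitHub-sized volumes; a production
# system would use shingling plus MinHash/LSH to find candidate pairs
# cheaply and only then confirm with an exact ratio like this one.
```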
I mean, you suggest using AI to scan comments/code.
Maybe think harder and avoid a certain tech that is in a bubble and may soon be sued out of existence.
I do agree that github and other places that allow user-generated content need to do a better job. I have more bots than humans following me on twitter (and I have thousands of followers), for example. However, trying to say "AI CAN FIX IT" won't win you any favors.
FYI, I have a suggestion: maybe spin up said AI model on AWS and try scanning all the comments/issues github receives in a day. When even Microsoft can't pay your bill, you'll see why they do not yet use AI for that... though I'm sure they will try eventually.
Yes, it's a game of cat and mouse, but if the cat starts out with a defeatist attitude, it makes things way too easy and the mice will simply roam free. Making life hard for the spammers can and will get rid of all but the most persistent.
The other issue I have with this particular GitHub spam is the notifications persist even after the spam has been removed. You get notified and subscribed to some random thread because you were previously tagged in it.
After GH removes the spam (which is currently too slow) they should also retract any notifications or subscriptions that were made as a result of the spam comment.
That's right. If you log in to GitHub even days after the spam is removed, you will see the bell icon with a notification about a since-cleaned-up thread in your feed.
I have a different problem: Github's notification settings are far too coarse, and if you're subscribed to a lot of repos, or a lot of actions are happening on those repos, the flood of email messages you get for every comment or action a person or a CI process takes is just unmanageable.
All I want is "If someone (ie, not a bot) specifically tags me on a PR where the CI is passing, send email once". This granularity unfortunately doesn't seem to be possible - that said, I would love to be wrong about this.
I ended up turning off Github's email notifications for this reason, as the signal to noise is horrible.
Try having a generic username on there where everybody @s you as a shorthand for the real person's username and you end up getting notified. Or added to random repos and removed minutes later, but still getting the notification. :)
I recall a HN user whose GitHub username was some common word related to software development (like @deploy or @prod, but I can't recall the right name).
There was a thread about six months ago mentioning that they get @'ed constantly by mistake. They had a funny attitude towards it, though. They said they always enjoy getting to see what everyone else is working on and didn't mind the notifications.
If anyone remembers what I'm talking about, let me know because now I'm so curious about this username I can't remember.
I have the exact same problem, but I assumed I was too stupid to work out how to configure Github to stop this, and I didn't want to waste any more time trying, so I just turned the emails off.
I work for a fairly large org with lots of Github repos which I occasionally contribute to, and there seems to be no way to configure email alerts in a manageable way: I must either get an email for everything or ignore everything. And ignoring everything is obviously preferable when the signal to noise is this bad.
Sounds like the same problem! I'm working at a large org, was involved in setting up lots of repos, therefore I end up getting pinged on so many of them that I'm no longer directly working on or directly interested in.
There's also no bulk method anywhere (that I can find) in Github to unsubscribe from or change settings on many repos at the same time.
I concur. I saw the React Native repo getting spammed with hundreds of similar issues/PRs: so many unique usernames, and such a cumbersome process to report a painfully obvious spam account. I hit the limit of open abuse reports you can have, so my attempt to help ended only 4 accounts in.
I thought I would get creative and add comments to one of my existing reports listing the other 10 or so spam accounts. The tickets were closed and only the main account was deleted, not the others mentioned in the ticket.
The internet is 5% useful, important stuff and 95% spam. When a more intelligent organism finds our planet, they will be so confused about why we wasted so much digital space on senseless spam.
I think a more intelligent organism will understand perfectly well that it is the inevitable result of combining virtually free message-sending with simple economic self-interest, and why there are no perfect solutions to it (at least not yet).
While I like bashing on capitalism as much as the next guy, I don't think it's very relevant here. For the general phenomenon of spam and trolling, I don't think capitalism matters at all, and for the scams on eBay and Amazon, I don't think a system where those are fixed has to be much different from today's.
Also github pages and "app" pages are used to distribute scam dating site spam on social media platforms. The bad actors try to use the domain reputation of github to evade detection. It's extremely bad and seems to be out of control on github.
Another thing, men, please, PLEASE, stop falling for these scams. No, beautiful women will not message you at random and show interest in you. Even unattractive ones won't. Please stop falling for these scams. Tell everyone you know to stop falling for these. If a random woman messages you to meet for sex, it's a scam. Do not fall for it, it will seem real and authentic, it's not. If you send nudes they will extort you out of money.
Unless you are a male sex worker or in some kind of niche hookup community, it's a scam. Even on a dating site, it's probably a scam unless you are exceptionally attractive.
Women like sex, just as much as men (if not more so). Sure not every popup about “hot singles in your area” is legit, but women on dating sites messaging you with the goal of a quick night is definitely a thing. And I’m certainly not exceptionally attractive.
Same thing happens on Twitter. I log in every week or so and my notifications are full of NFT scams. People tag me with an image and a "new mint dropped!!!1" post; by the time I see it the tag is deleted but the notification is still there.
Github/Microsoft could sue the beneficiary of the spam. It's clear who that is.
Binance is in legal trouble with the SEC right now.[1] Send this to the SEC lawyers going after Binance. You can find out who they are from SEC litigation announcements. If Binance can identify someone else to blame, they have a big incentive to do the work.
This very likely has nothing to do with Binance or even AltLayer. The scam is to fool someone into signing a transaction that sends their tokens to the scammer. They are using the well known name of Binance to make it look legit. Trying to sue Binance for this would be like trying to sue Apple for "You won a free iPad" scams. Also, it has nothing at all to do with the SEC's jurisdiction.
I don't think that's what the commenter is claiming. I think they want to be able to sue the companies who indirectly benefit from this kind of spam, which is pretty ridiculous.
However, crypto has become such a menace to society that it's time that governments do something about it, if they even can at this point.
> I think they want to be able to sue the companies who indirectly benefit from this kind of spam, which is pretty ridiculous.
There's the legal concept of an implicit or implied conspiracy. Usually comes up in antitrust law, where sellers raise prices at about the same time without actually getting together to talk about it.[1][2]
It's a difficult area of law.
> With the rise of generative AI and ChatGPT being able to write endless variations of 1 spam template to bypass the similarity check I just proposed above, content moderation will continue to be an uphill battle. It most likely will get even harder!
Thanks to LLMs, the spam issue will get even worse on Github.
> Thanks to LLMs, the spam issue will get even worse on Github.
I think we'll quickly develop NLP/LLM filters for this. And while that may lead to an arms race, we'll likely simultaneously develop distributed systems for credibility attestation.
We already rely on professional networks. We'll just grow all the more robust and capable along these lines.
I'm actually extremely excited for automated systems that increase the signal. We've been in a noise trough for a while, and now we have means of filtering it.
Edit: I don't necessarily refer to anything cryptocurrency related. We can build distributed networks of trust like the semantic web tried with PGP and FOAF, though I'm sure there are valuable tools and lessons we can borrow from the crypto folks' algorithms and research.
Even though this post is about cryptocurrency spam, decentralized cryptocurrency networks could have a solution to this spam problem with so-called Soulbound Tokens.
> Bayesian filters worked very well for email spam.
As someone flagged me, I'd like to know how well current filters catch LLM-generated spam. By hypothesis, not well, especially given that you can always run an LLM's output through f(g(h(...))).
It doesn't matter whether the source is a human, an LLM, or simply preset text. Spam has to be consumed and "bought" by a human, so it is easy to classify, since humans are predictable in which scams we fall for.[1]
Most spam is advertising for something to make you click. In the 90s and 00s, in the early days of email, it was things like Viagra; today it is "coin drops" and crypto-related things.
Naive Bayesian filtering is just a matter of training on the probability of such words appearing in regular issue/PR/discussion comment threads, assigning a probability to each post, and flagging it when it crosses a threshold.
In the case of Github, they would probably refine and improve this by adjusting the weights for different topics.
There is a good chance they already do this, and it's just that the sensitivity for crypto scam words is set too low in crypto-related projects, as those words are probably used more by real people there as well. That would be why OP noticed this issue while the rest of us rarely see much spam on Github.
You could add reputation for the user, globally and with respect to the project (akin to ESP reputation), and many other refinements on top of Bayesian filtering.
[1] Nigerian prince emails are written in poor English for a reason, for example. You could bypass a filter, yes, but people are far less likely to fall for such evolved language, defeating the purpose of sending spam in the first place.
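A minimal sketch of that word-probability idea using scikit-learn; the training comments, labels, and the 0.9 threshold are all placeholders, and a real deployment would train on large labeled corpora, ideally per topic as suggested above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Placeholder training data standing in for labeled issue/PR/discussion comments.
comments = [
    "claim your airdrop now, connect wallet for free tokens",
    "limited coin drop, mint today!!!",
    "this fails on line 42 when the config file is missing",
    "could you rebase this PR onto main and re-run CI?",
]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(comments, labels)

# predict_proba yields per-class probabilities; flag a post when its
# spam probability crosses a tunable threshold.
spam_idx = list(model.classes_).index("spam")
p_spam = model.predict_proba(["free token airdrop, connect your wallet"])[0][spam_idx]
if p_spam > 0.9:
    print(f"flag for review (p_spam={p_spam:.2f})")
```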
With a sample size of 2, because I've not tried this before and it's late here, gpt-4-1106-preview seems to work OK at this just by asking for a probability.
Though you will need to post-process the response, because it says the usual blah blah blah ${number} blah blah reason blah.
Hopefully the smaller models can also be used for spam checks; perhaps even some of the open source models? My insomnia-driven test last night was the complete opposite of an exhaustive study.
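For reference, this is roughly the kind of call described above, assuming the OpenAI Python SDK; only the model name comes from the parent comment, while the prompt wording and the regex post-processing are guesses at one way to pull the number out of the chatty response.

```python
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def spam_probability(comment: str) -> float:
    """Ask the model for a spam probability and extract the number
    from its usually chatty reply."""
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # the model the parent comment tried
        messages=[
            {"role": "system",
             "content": "Reply with the probability (0.0 to 1.0) that the "
                        "following GitHub comment is spam, then a short reason."},
            {"role": "user", "content": comment},
        ],
    )
    text = response.choices[0].message.content
    # Post-process: grab the first 0-to-1 number out of the
    # "blah blah ${number} blah blah reason" style of response.
    match = re.search(r"[01]?\.\d+|[01]\b", text)
    return float(match.group()) if match else 0.0
```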
Why even use an LLM? A classifier is perfectly suited for this kind of thing, and they aren't new. As far as I can tell, this is what is often used in the real world, and is incredibly cheap compared to an LLM, so GitHub/$OTHER_PLATFORM could totally run it on everything posted. They could even use a classifier as a first filtering step, then run a smarter model on flagged comments. (Then let a human double-check, right? Right?)
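A sketch of that cascade, with trivial stand-ins so it runs on its own; a real deployment would slot in a trained classifier (like the naive Bayes sketch above) and an actual LLM call.

```python
def cheap_spam_score(comment: str) -> float:
    """Stand-in for a fast classifier. Here: a keyword heuristic,
    just so this example is self-contained and runnable."""
    keywords = ("airdrop", "connect wallet", "coin drop", "free tokens")
    hits = sum(k in comment.lower() for k in keywords)
    return min(1.0, hits / 2)

def expensive_spam_score(comment: str) -> float:
    """Stand-in for a smarter, costlier model, e.g. an LLM call."""
    return cheap_spam_score(comment)  # placeholder so this file runs

def moderate(comment: str) -> str:
    if cheap_spam_score(comment) < 0.5:        # runs on everything, ~free
        return "publish"                       # most traffic stops here
    if expensive_spam_score(comment) < 0.8:    # only the suspicious slice
        return "publish"
    return "queue_for_human_review"            # a person makes the final call

print(moderate("claim your free tokens, connect wallet for the airdrop"))
# -> queue_for_human_review
```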
In my opinion, the fix should be to make it expensive to post spam.
For example, every legitimate user of my open source project is probably fine with paying $1 to file an issue report. So I'd like to have a user setting that says "don't let anyone contact me unless they pay for it".
It's not just issues either. Fake repo spam is terrible as well, usually some form of credentials or cryptocurrency theft software. GitHub really needs to implement moderation, and fast.
If it was an SEO trick, you'd think they wouldn't bother at-mentioning people. Usually SEO stuff tries to fly under the radar so it can stay online, and visible to search engines, for as long as possible. (Like the classic technique of spamming generic "nice post!" comments on WordPress blogs with your profile URL set to the spam site.)
God, these crypto "airdrop" scams annoy me... it's just as bad on Twitter. I'm active in Community Notes, and the inbox is like 1/3rd far-right conspiracies, 1/3rd other politics, and 1/3rd zkSync scam alerts. I never figured out what zkSync is or whether it itself is some scam; it wouldn't surprise me.
Some of these are even able to fake the target URL - the Tweet Card shows them going to "starknet [.] io", but hover over the link and it will actually point to "reward - zksync [.] club". I wonder what the fuck is going on at Twitter that they're unable to spot and hammer down on this.
The reason that zkSync and Starknet are targeted is that neither has run an "airdrop" or token distribution scheme, and both are rather well known projects in the scene. It's a bit of a pity for them.
Lack of available staff, and a CEO who is probably mostly focused on finding some way to salvage advertising and has no time to spare on thinking about anything else because the advertising situation is just abysmally dire…
I mean, I think fully a quarter of the community notes I see directly on things in my timeline (not people sharing screenshots of particularly funny/interesting community notes) are notes on overpriced drop-shipping scammers.
The advertising situation seems completely doomed. When the CEO was picked to remedy that, and the owner sits in a nebulous executive role as "product owner" while simultaneously being a hyper-emotionally-invested product user (based on credible reports and the publicly verifiable behaviour)... what can she really do besides a good enough job for long enough that it won't seem weird when she quits or takes another position.