NYT Exposes J.C. Penney Link Scheme That Causes Plummeting Rankings in Google (searchengineland.com)
47 points by bjonathan on Feb 13, 2011 | 26 comments


I'm sure these SEO finds follow the rule of cockroaches... For every one you see, there are a dozen more that you don't. I'm sure this makes it frustrating for everyone who's playing the game legitimately. If one guy jumps the turnstile in the subway, it's fine. If everyone jumps the turnstile, you feel like a chump for paying your fare.

I'm starting to think that Google's influence on the web is more and more negative. By giving links economic value, Google has made spam rampant. I run a coding wiki, and every day there are dozens of spammers flooding in no matter what kind of wall I put up (nofollow, captchas, delays in posting, IP blacklisting...). All of these walls are bad for users and discourage participation. Websites are now being coded more for Googlebot than for humans. Every link is treated as suspect. Nofollow was supposed to help, but from the looks of it, spammers don't care, since they get paid by the link.
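For anyone unfamiliar, the nofollow wall just means tagging every user-submitted link so search engines won't count it as an endorsement. A minimal sketch in Python (an illustration, assuming BeautifulSoup; not my wiki's actual code):

    from bs4 import BeautifulSoup

    def nofollow_user_links(html):
        # Tag every anchor in user-submitted HTML so search
        # engines ignore it when computing link-based rankings.
        soup = BeautifulSoup(html, "html.parser")
        for a in soup.find_all("a", href=True):
            a["rel"] = "nofollow"
        return str(soup)

And it still doesn't stop them, since the spammers get paid whether or not anyone checks that the links they planted actually pass PageRank.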

We desperately need an end to the PageRank monoculture, either from Google or from their competitors. Hopefully it's replaced by a truly open standard, and not a walled garden like Facebook's.


You're right about the cockroach rule. It's not just you facing wiki vandalism en masse, incidentally.

Lately, I've been amusing myself by looking into a group that has been vandalizing hundreds if not thousands of wikis in order to create a fake "community" relevant to certain juicy SEO terms (jobs, porn, counterfeit luxury goods, and viagra, mostly).

They appear to all be working together to promote http://starsea rchtool.com/ (link deliberately broken). They have so many links that I should probably code my own bot to follow them. They might be promoting other pages that I don't know about, too. After all, they can easily change their code to link to anyone they want to.
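If I ever write that bot, it isn't much work; a rough sketch of the idea (hypothetical, using requests and BeautifulSoup, seeded with any one spammed wiki page):

    from collections import deque
    from urllib.parse import urljoin, urlparse

    import requests
    from bs4 import BeautifulSoup

    def map_spam_network(seed_url, max_pages=200):
        # Breadth-first walk of outbound links from one spammed
        # page, collecting every domain the network points at.
        seen, queue, domains = set(), deque([seed_url]), set()
        while queue and len(seen) < max_pages:
            url = queue.popleft()
            if url in seen:
                continue
            seen.add(url)
            try:
                html = requests.get(url, timeout=10).text
            except requests.RequestException:
                continue
            for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
                link = urljoin(url, a["href"])
                if urlparse(link).netloc:
                    domains.add(urlparse(link).netloc)
                    queue.append(link)
        return domains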

It's quite a mess. Some people have just deleted their wikis in frustration, others have fought back by banning spambot accounts en masse, and some are still oblivious to the fact that their wikis are now filled with spam pages.

I've followed the spam around, tried to clue people in, submitted some reports via Google's feedback page, etc., but I'm not sure how much good it's doing.


We'll check that out--thanks for mentioning it. Feel free to write up something on the web and tweet me a link to it, if you want to send more details.


I'm just glad to know it's getting some attention. They attacked a small game wiki I use and I started to get alarmed once I realized the scale of their operation.

I don't know if you got it, but I wrote down a bunch of URLs for spammed-up wikis and some information on one of the Google feedback forms a few days back. I realize that that kind of thing can be lost in the noise, though.

EDIT: Here's one search that will show you a (now deleted) wiki that was spam-filled. You can follow the links on the spam pages to other vandalized wikis.

http://google.com/search?q=site%3Apreventioncommunities.com+...


I finished a more complete writeup and sent it to you. There's probably nothing you couldn't figure out from that spam-filled search alone, but maybe it'll save you guys some time.


Could you post a link here, please?


I've found a circumstantial connection to the Rustock spam botnet. That's still far from certain, but the botnet already uses wiki-like code in its control network, so there are good reasons to think it might be related.



Thanks.


How do you propose solving the problem of helping people find what they are searching for if it doesn't involve a search engine?

As long as there are search engines, there will be algorithms that people try to game. And as long as people need help finding things, there will be search engines. It seems the only way around this would be to remove the ability to make money on the Internet.


Although the NY Times nofollowed a handful of the links to the spammy sites that were linking to JCPenney, they surprisingly failed to nofollow all of them.

The link to the SEO firm that JCPenney used for the blackhat campaign was also DO-followed!

So, the NYTimes is unwittingly contributing to the problem by exposing it.


You would think the author would double-check, seeing as he wrote the previous piece about DecorMyEyes, where followed links from the NYT and Bloomberg got the site to rank high.


Matt Cutts tweeted that Google's algorithm had started to work after he was notified about the SEO practices by the NYT.

What I don't understand:

1) Since the search results disappeared after it was brought to Google's attention, this was a manual effort. Matt makes it sound as if the automated algorithm removed the offending sites.

2) What in the world does 'started to work' mean? Was it 'not working' earlier?


We were already in the middle of deploying an algorithmic change in how we trust links in our scoring. So between Feb. 1st and Feb. 8th, lots of rankings for the pages in question had already dropped. The author of the NYT article, David Segal, noticed that independently and we confirmed that algorithmic changes had been responsible for those drops. After Segal contacted us, we investigated and ended up taking corrective manual action as well.


My interpretation of the 'started to work' comment was that (they were claiming that) the algorithms had actually begun to detect the badness of the links (based on their structure or whatever) and they were starting to automatically be dispreferred; but that manual intervention had hastened the process.


> What in the world does 'started to work' mean?

I would assume that 'starting to work' would involve investigating what, exactly, is going on. It probably wouldn't do them much good to start making changes before they understand what's happening and why. But that's only a conjecture on my part.


To answer a few of the questions asked in the comments:

My interpretation of the combined algorithmic and manual efforts is this:

- One of Google's paid-link algorithms (possibly a new one, or possibly an existing one that was recently tweaked) flagged some of the links or one of the link networks. This caused those links to no longer count towards PageRank credit (and possibly caused some of the initial ranking drops, such as the ones from position 1 to position 7).

- When Google was alerted to the issue, they took a closer look and on manual inspection found not only additional problematic links but also other spammy issues (if you follow the link in my story to the blog post by the guy who helped the NYT with the investigation, you'll see that the SEO firm set up doorway pages and that the JCP pages themselves have keyword stuffing and hidden links on them). Based on that manual review, Google added a manual penalty to the site.

That's why my conclusion is that once they fix the issue, the manual penalty will be removed and they'll rise a bit in ranking position. But since the algorithmic penalty simply (I'm speculating) caused some of the paid links to be devalued, there would be no "lifting" of this penalty.
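To make the distinction concrete: devaluing a link isn't a flag on the destination site; it just drops that edge out of the link graph before scoring. A toy sketch (my own illustration, obviously nothing like Google's actual implementation):

    def pagerank(links, devalued, d=0.85, iters=50):
        # links: {page: [pages it links to]}
        # devalued: set of (src, dst) edges flagged as paid links;
        # they're simply removed from the graph before scoring.
        graph = {p: [q for q in out if (p, q) not in devalued]
                 for p, out in links.items()}
        n = len(graph)
        rank = {p: 1.0 / n for p in graph}
        for _ in range(iters):
            new = {p: (1 - d) / n for p in graph}
            for p, out in graph.items():
                for q in out:
                    if q in new:
                        new[q] += d * rank[p] / len(out)
            rank = new
        return rank

    links = {"paidblog1": ["jcp"], "paidblog2": ["jcp"],
             "nyt": ["jcp"], "jcp": []}
    print(pagerank(links, devalued=set())["jcp"])
    print(pagerank(links, devalued={("paidblog1", "jcp"),
                                    ("paidblog2", "jcp")})["jcp"])

The score drops because the bought edges vanish, but there's nothing to "lift" afterward, which is exactly why only the manual penalty would be removed.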

It is very disheartening that something so vital to business success (understanding how to operate online: building a web site with good site architecture, engaging with searchers, solving their problems) is so often equated with these types of tactics.


You get what you pay for. Google 'pays' for links, and that's what they get: lots of links. Substituting an algorithm for human judgment in determining a page's value works until it's used long enough to affect and skew people's behavior. In the short run, we desperately need competition in the search and ranking field; in the long run, AI that can directly recognize what we actually want to measure: content value. When an AI can only be fooled by spam pages so good that humans like the pages too, we're there.


To the best of my knowledge, the approach JCP (or their agency) took was to create a background/shadow site full of pages behind the main site. These pages are promoted via links and cross-linked to be used as doorway pages.

JCP's server/hostname strategy seems broken, but if you look at the robots.txt and the site index they promote, it's really confusing compared to the UI and links they expose to traffic entering their site through the homepage.

I also see similar (but not exact) patterns on other big-name retailers that might share the same agency.
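One quick way to spot that kind of shadow structure is to diff the URLs a site advertises to crawlers against the URLs actually reachable from the homepage. A rough sketch (hypothetical; assumes a flat sitemap.xml and checks only one hop from the homepage):

    import xml.etree.ElementTree as ET
    from urllib.parse import urljoin

    import requests
    from bs4 import BeautifulSoup

    def shadow_pages(base_url):
        # URLs the site advertises to crawlers via sitemap.xml.
        xml = requests.get(urljoin(base_url, "/sitemap.xml"), timeout=10).content
        ns = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
        sitemap = {loc.text.strip() for loc in ET.fromstring(xml).iter(ns + "loc")}
        # URLs a visitor can reach directly from the homepage.
        html = requests.get(base_url, timeout=10).text
        homepage = {urljoin(base_url, a["href"])
                    for a in BeautifulSoup(html, "html.parser").find_all("a", href=True)}
        # Pages pitched at crawlers but not exposed to visitors.
        return sitemap - homepage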


Kudos to NYT. I wonder if they've seen their last dollar of advertising from JCP though...


"We're so pissed off with NYT for showing up our use of a sleazy SEO outfit that we're going to stop using a non-sleazy advertising venue." It's be interesting to see if they do try such a thing; I think it's higher risk than blaming their SEO for violating their ethics policy.


I feel bad for JC Penney, who got conned by a shady SEO firm. Google's reaction of completely delisting them sounds like a kill-switch action, not an algorithmic tweak to ignore spam. It almost seems like Google just doesn't have the technology to detect and remove spam pages. I'm pretty sure I could do this over a weekend [1]. I wonder if it's because the problem is hard technically, or because Google doesn't have the incentive to do so.

[1] For those who don't remember, this is a tongue-in-cheek reference to an infamous comment on HN where somebody claimed he could clone Stack Overflow in a weekend.
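(In that weekend-project spirit, a naive keyword-stuffing detector really is about this simple; the hard part, presumably, is running it at web scale without nuking legitimate pages:)

    import re
    from collections import Counter

    def stuffing_score(text, top_n=5):
        # Share of the page taken up by its few most-repeated
        # terms; crudely drops short words so stopwords don't
        # dominate. Stuffed doorway pages score far above prose.
        words = [w for w in re.findall(r"[a-z']+", text.lower())
                 if len(w) > 3]
        if not words:
            return 0.0
        top = sum(c for _, c in Counter(words).most_common(top_n))
        return top / len(words)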


They actually did exactly that. They didn't do a hard de-list for JCPenney as in the bmw.de case, only a tweak to the link weighting to remove the type of bogus link JCPenney was benefiting from. So they've dropped 30-50+ positions in a lot of cases, as have others who were benefiting from similar links, but they are still being fairly listed.


You feel sorry for them because they didn't properly vet their SEO firm? They're paying someone (probably a crap-ton of money) to perform a service for them. They should be keeping tabs on what that firm is actually doing and have someone in-house or a third party checking to make sure they're being legitimate.


> It almost seems like Google just doesn't have the technology to detect and remove spam pages

this


SEO - legal spam



