One of the sites mentioned, "wtnnews.com" ("West Texas News"), is a straightforward content farm running Google Adsense ads. Here's their home site, "wtnmedia.com", "A global business-to-business media company where you can forge high level client relationships through the web, print, and events." Their advertising kit shows a base ad rate of $1,295 for 7,500 impressions of a banner ad, for a Cost per Thousand of about $173. Nobody would pay that; typical CPTs today are around $1. So that's not their revenue source.
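The cost-per-thousand figure can be checked directly from the quoted rate. A minimal sketch; the function name `cost_per_thousand` is made up for illustration, and the $1 comparison figure is the rough "typical" rate mentioned above, not measured data:

```python
def cost_per_thousand(price_usd: float, impressions: int) -> float:
    """Price paid per 1,000 ad impressions."""
    return price_usd / impressions * 1000


# Figures quoted from the advertising kit.
kit_rate = cost_per_thousand(1295, 7500)

# Rough "typical" rate from the comment above (an estimate, not measured data).
typical_rate = 1.0

print(f"Ad kit rate: ${kit_rate:.2f} per thousand impressions")
print(f"Roughly {kit_rate / typical_rate:.0f}x the typical rate")
```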
The use of the business addresses of other businesses is unusual. That's identity theft. Our SiteTruth system checks out the address of the business found on the web site, and it brings up StreetView pictures of auto repair shops for some of those sites. Lots of sites don't give an address, or use some mail drop, but use of fake street addresses is rare. It tends to draw tax collector and law enforcement attention.
Google's anti-spam efforts have never been very aggressive. They filter out some of the worst offenders, but don't try hard to get rid of bottom-feeder sites like these. If Google didn't put spam sites in news results, how would they get revenue? Google News itself has no ads, and many of the top news outlets have their own advertising systems that don't involve Google.
I used to work on a news site and the process to get listed on Google News was very manual, at least compared to how you think the rest of Google operates. You had to have a physical address and phone number (though I don't think the two attributes are actually tested, e.g. by sending a postcard via snailmail). You also had to configure your pages for the spider so that if your site delivers wire content (e.g. Reuters, AP) or does aggregation, the spider would ignore those pages, and only pick up your original news. The first time we submitted, the response we got seemed to indicate that someone had looked over our sitemap and saw aggregated news/wire reports, and so we had to resubmit. The process took a few weeks, IIRC.
That said, bigger news sites were probably whitelisted...it seems like the whitelist could be done algorithmically...but the number of new authoritative news sites, in theory, changes so slowly, that there's probably not really a need to do that.
When I worked at CNET, we were (and still are) indexed by Google News despite not having three-digit numbers in URLs.
That's because your excerpt from Google News' guidelines left off a very important addendum: "Please note that this rule is waived with News sitemaps."
Actually the big reason why it may have worked back in the day for CNET was because it was CNET.
Though these days a news sitemap suffices.
I would personally still add the 3+ digit unique number to the permalinks and eventually if necessary remove it once rankings improve overall on Google News.
Some Back Story
==================
I consulted for a Movie News and Review website for about 6 years. We had news sitemaps and would show up on Google News once in a while, but one of the small yet significant changes that we made that increased ranking on Google News was when we implemented adding a unique number to urls.
That was implemented back around 2009-2010.
We stopped using those unique numbers in the URL around 2013, mostly because the numbers were becoming pretty darn big. We noticed a small downtrend in Google News traffic which stabilized in a couple of months.
For anyone using WordPress, unique numbers can be added using a permalink structure such as:
/%postname%-%post_id%/
If you do decide to use it, don't forget to set up 301 redirects.
which would not fulfill (the basically deprecated) three digit spec for the first 99 posts (maybe even the first 100).
also it makes the URL longer, less user friendly and you have to deal with an URL migration that you might have to revert at some point (i.e.: the example you mentioned)
additionally: as it is mentioned that the 3 digit requirement is "waived" with a google news sitemap it is a strong indicator that this is a crawling, not a ranking directive.
my 2 cents: don't do it, as it would be a clear violation of the golden URL rule a.k.a. "Don't overdo the f###### URLs!"
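To make the point about the first 99 posts concrete, here is a rough sketch. It assumes the rule can be read simply as "the URL contains a run of at least three digits", which is a simplification of the guideline, and the helper names and the `some-review` slug are made up for illustration:

```python
import re

# Assumption: the rule is read as "a run of three or more digits somewhere
# in the URL path". The actual guideline may carry further conditions.
THREE_DIGITS = re.compile(r"\d{3,}")


def permalink(postname: str, post_id: int) -> str:
    """Mimic the /%postname%-%post_id%/ permalink structure."""
    return f"/{postname}-{post_id}/"


def has_three_digit_number(path: str) -> bool:
    return THREE_DIGITS.search(path) is not None


for post_id in (7, 99, 100, 48213):
    url = permalink("some-review", post_id)
    print(url, has_three_digit_number(url))
```

Post IDs 1 through 99 fail this check; anything from 100 upward passes.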
yeah, and from the paragraph you just cited you left out the last sentence.
"Please note that this rule is waived with News sitemaps."
so basically the >= three digits rule doesn't apply any more (for years now), as a matter of fact i don't know why google still has it in there as it always leads to a lot of confusion.
i once consulted a big newspaper which was in an ongoing project to change their URLs just because one of their (middle) managers had read this spec...
There is a lot of good information in the post submitted here, which should be actionable for Google if Google cares about the quality of news results. I'll note for the record that I have been using Google News as a news aggregator since the beginning of its existence, and since I almost always use Google News in a logged-in condition, Google News responds to the way I have trained it about my news interests by mostly showing me stories from established news organizations all around the world, and not from spammy linkfarms.
My current gripe about Google News is that it pushes far too many low-quality or too local news outlets into the Editor's Picks section of my view of Google News, and even if I click repeatedly as those sources display "Personalize this news source" to display it rarely, the same podunk local TV station or trade magazine website will keep displaying in that section over and over and over. (I have already complained to Google through Google News feedback channels about this behavior. It should be possible to mark a source as NEVER appearing in the "Editor's Picks" section and have that selection be implemented until the user affirmatively turns it off.)
On the whole, I like Google News. But for sure if a site has eyeballs, spammers will try to grab those eyeballs, so the price of freedom from spam is eternal vigilance.
We need alternatives to Google. We can't have one company deciding what gets seen on the WWW.
It is a great search engine, Google changed the world. But now we need alternatives. We can't have their editorial staff dictate what should and should not be seen on the WWW.
If you look closely at the top search results, they're mostly big spenders in Adwords. Ebay, Amazon, Expedia, TripAdvisor, Yelp, Answers.com and several others. It's blatant conflict of interests. There is no way those search results are "organic". Basically the WWW has become controlled and curated by Google and every site that gets seen must conform or be destroyed.
We need alternatives to Google. We need them urgently.
> There are a handful of good alternatives. Use one. Use multiple.
For search at least, unfortunately, the alternatives are pretty questionable. I've been using DuckDuckGo experimentally for a few months now, which is mostly Bing as the backend, with some DDG-specific add-ons and tweaks. I find myself having to use the !g command to rerun the search in Google somewhere around 40% of the time. Some of this is site owners causing the problem: a surprising number of sites block all crawlers in robots.txt, but then whitelist Googlebot and only Googlebot. But even leaving aside those sites, DDG seems to miss a lot of results, and the quality of the first-page results, at least for how I search, is consistently lower.
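The "block everyone, whitelist Googlebot" pattern looks something like the robots.txt below. This is a sketch using Python's standard `urllib.robotparser`; the file contents and `example.com` URL are illustrative, not taken from any particular site:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: Googlebot may crawl everything, everyone else nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str) -> bool:
    """True if the given crawler may fetch a page on this (hypothetical) site."""
    return parser.can_fetch(agent, "https://example.com/article")


for agent in ("Googlebot", "bingbot", "DuckDuckBot"):
    print(agent, allowed(agent))
```

Any crawler that honors robots.txt and isn't Googlebot never gets to index such a site, no matter how good its ranking is.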
If you try searching in a language other than English the differences are even larger. I suspect that's because some of the infrastructure and datasets used for Google Translate are also used by search in some form, while Bing doesn't have a similarly solid multilingual stack.
I'd like to see more competition in search, but the barrier to entry to produce a good full-web search engine seems quite high.
(Disclaimer: I'm generally biased in favour of Google)
This logic absolutely doesn't work. The whole "vote with your wallet" attitude only works when a majority of consumers have a concrete choice in the matter. In the current case, even if the entire tech sector decided to "vote with their wallet" and stopped using Google, a phenomenally large amount of users would still be on Google due to partnerships with internet browsers, manufacturers, OEMs and so on.
You need to be honest with yourself. When you say "If you don't like x, don't use it", it doesn't give the person a vote/voice. It's to get them off your back (as if they were on your back in the first place).
For a very simple reason - I use Tor browser a lot, Google blocks it and constantly shows me ugly captchas (not even ReCaptcha) and sometimes blocks me entirely, Bing just works. (Startpage - which is a Google skin basically - shows captchas too, and doesn't even redirect me to the actual search after the captcha. DuckDuckGo has unusable results.)
So I am using a Microsoft service to be more anonymous. Which is kind of funny.
It isn't just the tor browser bundle that causes this. I believe it has more to do with either bots using tor to access google or google forcing captcha on access from known tor exit nodes. I have seen the same behaviour from vanilla firefox while trying to use google from tor. I would still rather input captcha than use bing!
> I believe it has more to do with either bots using tor to access google or google forcing captcha on access from known tor exit nodes.
Yes of course, but I don't care. I want to search while using Tor.
> I would still rather input captcha than use bing!
Google uses really, really terrible captchas for this. Usually more in a row, with no indication if you did the previous one right or not. And sometimes they just refuse to show the results, period.
It's quite possible that the results are "organic" in the naive definition by following this process:
1. {site} starts at rank 1 due to low visitors/connectivity
2. {site} pays for adwords driving new visitors
3. visitors link to the site
4. rank of the site increases, allowing increased adwords spend
5. goto 2
The suggestion that Google artificially inflates the rank of big spenders is maybe conflating correlation and causation. On the other hand, it's not impossible or even that difficult to believe.
> We need alternatives to Google. We need them urgently.
Thanks to the 80/20 rule, we have them. Google is still unmatched for edge cases in maps and search, but something else works 80% of the time, and Google is spending 80% of the effort for that last 20%.
Isn't it the opposite? Google maps works 80% of the time, but for the 20% remaining, you need to look at more specialized maps. For example look at https://www.google.com/maps?q=lamma+island : almost nothing, just a few paths. Now look at http://www.map.gov.hk/ : all paths and houses in Lamma are precisely described!
The funny thing is that as a large publisher, it's not links on news.google.com that we really care about, it's landing in universal Google search one-boxes as a result of being indexed in Google News - that's where the traffic really comes from.
Yea, there are a lot of sites that take advantage of this, like The Christian Post, who target high-volume search keywords for their "news" that has no relevance to the actual term.
Search for "Stream NFL games" and "Watch NFL football online free" and other variants and see them pop up for everything with no mention of actually streaming games online (obviously).
Google News has always given off this vibe like it's being administered by someone as their side project. Very curious for one of the most popular news websites in the world.
Without going into too much detail, this is essentially true. The site is built on top of algorithms that were developed years ago, and there are some runbooks to help SREs keep the site running. However, there isn't much investment going on. Occasionally they'll have one engineer or an intern add a feature, but it's not permanently staffed.
One of the reasons is that Google doesn't seem to want to hire the "b" team.
And the "a" team I'm guessing doesn't want to work in a backwater place with no glory like "google news".
You know they only hire the best and the brightest and all of that. I'm sure even the person who gets hired as a janitor is a cut well above the average janitor. [1]
The funny part is there are plenty of people who would die to work at Google on anything. But they aren't the type (total conjecture here btw) that would ever pass the google interview process.
[1] I've always thought this was an interesting paradox. That is someone gets a job as a maid in the White House and works near the President let's say. So she/he must have something going for them to get that type of job. But yet they are still a maid in the white house. You would think if you are able to land that job you would have risen above that job.
>You would think if you are able to land that job you would have risen above that job.
What an awful thing to say. Some people like doing that type of work. There is nothing wrong with that. I don't think our society is destined for greatness when we devalue important work.
As an aside, my aunt did a lot of jobs and always went back to being a cleaner because that's what she likes to do.
> a backwater place with no glory like "google news".
I know they have lots of people working in their "social" wasteland (G+, hangouts, etc.). I can only assume they already gather most of the "news" tracking information by other means, so GN isn't worth supporting.
I guess I don't get why it's a backwater. They have insanely high traffic numbers. They have interesting problems of surfacing relevant news and grouping stories around topics. Seems like there are tons of cool things you could do with the site, if you were so inclined.
They can't run ads against it unless they manage to work out some arrangement with the News Publishers.
Technically they might have the legal right to do it, but the Google/News publisher relationship is already strained enough, and Google wouldn't want to strain it more.
I don't see why that would be true. Google News is just links to other sites like any other google search, except for some wire articles that they are actually paying for.
Yea, I understand what Google is trying to do with their news site, but it's not working--for me--in the Bay Area.
They have money to burn/waste/spoil Geniuses? Why not hire the best reporters and put together news sites for every country on the planet? Old-fashioned reporting? Stop leeching off struggling newspapers in order to obtain your content? The internet has destroyed so many good newspapers. Why not reverse the trend and bring back good journalism? Or just give some of the struggling newspapers cold hard cash--in the form of a grant, before we lose them all? Or, they just become pet/vanity projects for rich tech guys?
I too have found some serious quality control issues with Google News. Some news sites are redirecting their mobile viewers (knowingly or not) to the app store instead of giving them the article that they clicked on. I wrote about this experience here: http://jmarbach.com/google-news-growth-hack-exposed
I reached out to Emily Linnert, one of the real journalists whose photo is being used on one of the sites, and she's replied. She (or her station) is taking legal action.
I noticed that one of the scam sites is currently listed for sale on Flippa. On their listing, they even openly boast that the best way to use the site is to use its Google News status to scam Google's search results: