Ask HN: Please help me with SEO. 600k player statistics are not getting indexed.
6 points by mcorrientes on Aug 2, 2012 | 14 comments
Hi everyone, I'm having problems getting my site indexed. I thought a Hacker News member might be able to help me.

My site holds about 600k player statistics and almost none of them are getting indexed. I'm not exactly sure what the reason is. About a week ago Google crawled and indexed 300k pages in about 2-3 days, but a few days later they dropped everything again.

We have more content than our competitor, and the player statistics pages are also liked and shared a lot by visitors.

Our competitor is doing quite well: he has about 4.6m of his player statistics indexed.

Cleaning up the HTML a bit, using meta titles and descriptions, and properly redirecting old site structures to the right place (301) didn't help; Google still doesn't bother indexing the player statistics.

They're still crawling some of them, but they choose not to show them.

I thought it might be because of duplicate content, so I moved the languages from the URL structure (e.g. /kr/) to a subdomain (e.g. kr.riot5.com).

I'm not quite sure if my site's under a penalty or if there's something wrong with my content.

I feel a bit overwhelmed by all the possible reasons that might cause Google to stop indexing my pages, and I don't understand why they once indexed so many.

I would be really grateful if someone could help me find the problem.

The site's at http://www.riot5.com/

You could try a re-inclusion request. State your mysterious problem and just frankly ask why, whether there are any penalties, and what you need to do to start ranking. However, Google does take forever to do anything. Even if you are already doing the right things, you may not see results for several weeks.

Re-inclusion requests here (pretty much your only way to ask Google anything easily, but responses aren't guaranteed) https://www.google.com/webmasters/tools/reconsideration?pli=...

Or try the Webmaster forums, where Google folk apparently post every once in a while, but it's mainly a clusterfuck of people hijacking your problem with their own questions, and solutions being unreliably crowdsourced.

Remember: Google moves in mysterious ways. We cannot understand their arcane techniques for we are not worthy.

Thanks for the advice, I'll try the re-inclusion request.

Do you have a Google Webmaster Tools account? If you were being penalized for some reason (which I doubt), Webmaster Tools would inform you.

Thanks for mentioning it. I wasn't quite sure, but we're probably not penalized, so I haven't done a reconsideration request. Apparently it's now indexing slowly again.

Don't do it. You're not banned.

First of all, you should only submit a reconsideration request if your site is banned from Google (and you haven't used other methods to get it back in).

Your site is not banned in Google - you can check this by searching for "site:riot5.com"

As for 301'ing your dead links, this is a good thing to do. However, don't expect them to show in the same place in the SERPs as they did previously, at least not immediately.

I didn't find a canonical tag on your pages, nor did I find a robots.txt or a sitemap on your site. You should definitely add those (and also submit the robots.txt and sitemap to Google Webmaster Tools).
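
To make that concrete, here is a minimal Python sketch of the two smallest pieces: a canonical link tag per page and a robots.txt that points at a sitemap. The `/player/...` path and the `/sitemap.xml` location are assumptions for illustration only; I don't know the site's real URL layout.

```python
from html import escape
from urllib.parse import quote

BASE_URL = "http://www.riot5.com"  # site from the post


def canonical_tag(path: str) -> str:
    """Return a <link rel="canonical"> element for a site-relative path."""
    # Percent-encode the path so names with spaces or non-ASCII characters stay valid.
    url = BASE_URL + quote(path, safe="/")
    return '<link rel="canonical" href="%s">' % escape(url, quote=True)


def robots_txt(sitemap_path: str = "/sitemap.xml") -> str:
    """Return a robots.txt body that allows all crawling and advertises the sitemap."""
    lines = [
        "User-agent: *",
        "Disallow:",  # an empty Disallow blocks nothing
        "Sitemap: " + BASE_URL + sitemap_path,  # hypothetical sitemap location
    ]
    return "\n".join(lines) + "\n"
```

The tag from `canonical_tag` would go in each page's `<head>`, and the output of `robots_txt()` would be served at `/robots.txt`.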

Similarly, as you have created subdomains for languages, you should really be using Google Webmaster Tools to tell Google which region each one targets. Ideally, you should also use .htaccess rules so that visitors from another country are redirected to the matching geo-IP subdomain.

There are other things you can do as well, but once you have done those, along with naturally building links to your site, you should notice your SERP positions returning to where they once were and improving.

Think about it from Google's perspective. They're indexing billions of pages, with millions of new pages every single day. There's a lot of noise and spam to filter through. LOTS.

Why should they index everything on your site if it's not valuable content? It seems like it's just profile data, but what exactly should it rank for anyway? Google considers this "thin content", and since the Panda update they've punished sites that have too much of it (of course, every site has some thin content, but for you, it's the majority).

It looks like you are using links which use onclick events instead of a standard href. The 30+ javascript files included in the <head> section suggest you might be using it in other places that you shouldn't be if your goal is SEO.
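
If you want to check this yourself, a rough Python sketch like the one below counts `<a>` tags that have an `onclick` but no usable `href`. The rule for what counts as "not crawlable" (missing, empty, `#`, or `javascript:` hrefs) is my own assumption, not anything Google has published.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect crawlable hrefs and count <a> tags that rely only on onclick."""

    def __init__(self):
        super().__init__()
        self.crawlable = []    # href values a crawler could follow
        self.onclick_only = 0  # anchors with onclick but no usable href

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        # Assumption: missing, empty, "#", and javascript: hrefs are not crawlable.
        if href and href != "#" and not href.lower().startswith("javascript:"):
            self.crawlable.append(href)
        elif "onclick" in attr:
            self.onclick_only += 1


def audit_links(html: str) -> LinkAudit:
    """Parse an HTML string and return the filled-in audit object."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser
```

Run it over the saved HTML of a stats page: if `onclick_only` is large compared with `len(crawlable)`, most of your navigation is probably invisible to a crawler that doesn't execute JavaScript.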

That, the bad text/HTML ratio, and the non-descriptive URLs are what I see as the likely culprits. It looks like you have a good number of backlinks, and that number is growing quickly.
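
By text/HTML ratio I mean roughly the following. This is a minimal Python sketch under my own definition (visible characters divided by total markup characters, ignoring script and style contents); SEO tools differ in the details, and there is no official threshold.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Gather text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def text_to_html_ratio(html: str) -> float:
    """Return visible-text length divided by total HTML length (0.0 for empty input)."""
    if not html:
        return 0.0
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join("".join(extractor.parts).split())
    return len(text) / len(html)
```

Comparing the number for one of your stats pages against the same kind of page on the competitor's site would tell you more than any absolute value.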

How are you linking to these stat pages? If you want Google to index something, you have to show them that it's important by linking to it on your site. Moreover, you don't seem to have a lot of Domain Authority, so internal links may not be able to get all of those pages indexed.

Good external links are the best way to get something indexed. Good internal links are probably the second best way.

Your site seems to have quite a bit of duplicate content. In fact, every search I did showed other sites when I searched for long strings of text copied from your site.

Search Google for "Whenever Jarvan III, the king of Demacia, delivers one of his rallying speeches"
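
If you'd rather measure the overlap than eyeball search results, one common approach is word shingling with Jaccard similarity. The Python sketch below is only an illustration of that general technique; it is not how Google detects duplicates, and the shingle size `k=5` is an arbitrary choice.

```python
def shingles(text: str, k: int = 5) -> set:
    """Return the set of k-word shingles (as tuples) in a text, case-insensitively."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle; empty texts have none.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A score near 1.0 between your champion description and the text on another site means the two are effectively the same copy.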

Submit a sitemap to Google and update it every time you add pages. This helped me get all of my pages indexed.
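
One catch at your scale: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so you need several files plus a sitemap index. Here is a minimal Python sketch of that split; the file names (`sitemap-1.xml`, `sitemap-index.xml`) and the idea that you can produce a flat list of page URLs are assumptions on my part.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemaps.org protocol


def build_sitemaps(urls, base_url, chunk_size=MAX_URLS_PER_FILE):
    """Return {filename: xml_string} for the chunked sitemaps plus an index file."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, start in enumerate(range(0, len(urls), chunk_size), start=1):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + chunk_size]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        name = "sitemap-%d.xml" % n  # hypothetical naming scheme
        files[name] = ET.tostring(urlset, encoding="unicode")
        # Register this chunk in the index so one submission covers all files.
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = "%s/%s" % (base_url, name)
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files
```

With about 600k URLs and the default chunk size this yields 12 sitemap files and one index; you would write each entry of the returned dict to disk and submit only the index.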

Is there an email that I can reach you at? I wasn't able to find one in your profile.

My email is mcorrientes@googlemail.com

You probably grew too fast.

Slow down your growth next time and limit how many profiles Google can crawl per week. With 300k at once, they probably thought you were a spammer.

