Crawling and Indexing: Technical SEO Basics That Drive Revenue (rozhon.com)
84 points by vinnyglennon on March 6, 2018 | 8 comments



I really hope this is targeted at legacy sites. A well-patterned, modern site developed in the past five years shouldn't have these issues.

Edit: The claim that Google ignores canonical refs should have a solid reference. I'm dubious but happy to be enlightened!


> A well-patterned, modern site developed in the past five years shouldn't have these issues.

You would think so, but...

- One of my clients is a Fortune 10 company. One of their e-commerce sites wasn't serving any server-rendered HTML (I think it was built with React), which completely prevented search engine indexing. Organic traffic jumped hugely once I caught it and we got it fixed (see the sketch after this list).

- Another client lets its users have publicly accessible project pages (think GitHub). This resulted in Google indexing X0,000 of these URLs, many of which had 1) duplicate content, 2) minimal organic traffic, and 3) minimal value to the business. We de-indexed those pages and organic traffic is way up.

- And so on...
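For the first case, a quick diagnostic is to compare what a plain HTTP fetch returns against what a browser would render: if the raw HTML carries almost no visible text, any crawler that doesn't execute JavaScript sees an empty page. A minimal sketch, assuming the requests and beautifulsoup4 packages (the URL and threshold are made up):

    import requests
    from bs4 import BeautifulSoup

    def visible_text_length(url):
        """Fetch raw HTML (no JavaScript execution) and measure its visible text."""
        resp = requests.get(url, timeout=10)
        soup = BeautifulSoup(resp.text, "html.parser")
        # Script/style/noscript nodes aren't indexable content.
        for tag in soup(["script", "style", "noscript"]):
            tag.decompose()
        return len(soup.get_text(strip=True))

    # Hypothetical URL and threshold.
    if visible_text_length("https://example.com/product/123") < 200:
        print("Almost no server-rendered content; check SSR or pre-rendering.")

For the second case, the usual fix is a noindex robots meta tag or an X-Robots-Tag response header on the low-value page templates.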

This is why people like the author can make a living doing nothing but SEO work.


I see so many people who create sites with React (and the like) because it's the cool new framework that offers all the glitter, but who totally ignore SEO and accessibility.


> A well-patterned, modern site developed in the past five years shouldn't have these issues.

It all depends on the site at hand. A great example is a content-aggregation site with lots of ways to slice the data... there are MANY ways to skin this cat, and fine-tuning how a site gets crawled is a case-by-case exercise.
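One concrete lever for that fine-tuning is robots.txt rules for internal-search and facet URLs. Here is a minimal sketch of testing such rules before shipping them, using Python's standard-library robotparser; the paths and URLs are hypothetical, and note that Google also honors wildcard patterns (like Disallow: /*?sort=) that the stdlib parser doesn't understand:

    from urllib.robotparser import RobotFileParser

    # Hypothetical rules that block internal search and facet pages
    # while leaving plain category pages crawlable.
    rules = [
        "User-agent: *",
        "Disallow: /search",
        "Disallow: /filter/",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    for url in [
        "https://example.com/shoes",               # category page: crawlable
        "https://example.com/search?q=red+shoes",  # internal search: blocked
        "https://example.com/filter/color-red",    # facet page: blocked
    ]:
        ok = parser.can_fetch("*", url)
        print(url, "->", "allowed" if ok else "blocked")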

Modern sites are wonderful because you can build in modern best practices, but complexity is still a PITA once you get a product out there in the wild =)

Anyone should be really skeptical of SEO advice/content, since there is no silver bullet, but I found this to be an enlightening read too!


In my experience doing SEO work on a large data-driven website, Google ignores canonicals if they don't actually point to the same content.
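A rough way to audit this at scale, as a sketch assuming requests and beautifulsoup4 (the URL list is hypothetical), is to fetch each page and compare its canonical target against the URL itself, then manually verify that the flagged pairs really are duplicates:

    import requests
    from bs4 import BeautifulSoup

    def canonical_of(url):
        """Return the page's rel=canonical href, or None if absent."""
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        link = soup.find("link", rel="canonical")
        return link.get("href") if link else None

    # Hypothetical URL list; in practice it comes from a crawl or a sitemap.
    for url in ["https://example.com/widgets?page=2"]:
        target = canonical_of(url)
        if target and target != url:
            # A canonical pointing elsewhere is only honored if the content
            # really duplicates the target, so check the flagged pairs by hand.
            print(url, "canonicalizes to", target, "- verify contents match")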


It still wastes crawl budget, though, and in the past I saw a large UK site lose >£500k in less than a week from a simple canonical mistake on 20% or so of their pages.


It is a good exercise to simply open the new Google Search Console Index Coverage[1] report on any new, modern website. You will often find important pages that don't get indexed for one reason or another.

Proper SEO indexing is a huge opportunity for most big sites. It doesn't happen magically on its own.
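Outside of Search Console, a crude self-check is to walk your own sitemap and flag URLs that can't be indexed for mechanical reasons (errors, noindex). A minimal sketch, assuming requests and beautifulsoup4 and a hypothetical sitemap location:

    import xml.etree.ElementTree as ET
    import requests
    from bs4 import BeautifulSoup

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    def audit_sitemap(sitemap_url):
        """Flag sitemap URLs that return errors or carry a noindex directive."""
        root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
        for loc in root.findall("sm:url/sm:loc", NS):
            url = loc.text.strip()
            resp = requests.get(url, timeout=10)
            meta = BeautifulSoup(resp.text, "html.parser").find(
                "meta", attrs={"name": "robots"})
            noindex = ("noindex" in resp.headers.get("X-Robots-Tag", "").lower()
                       or (meta is not None
                           and "noindex" in meta.get("content", "").lower()))
            if resp.status_code != 200 or noindex:
                print(url, "status:", resp.status_code, "noindex:", noindex)

    # Hypothetical sitemap location.
    audit_sitemap("https://example.com/sitemap.xml")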

1. https://support.google.com/webmasters/answer/7440203?hl=en


I've worked as an SEO consultant for 10 years, and almost every e-commerce site I audit still doesn't get the basics straight. Sometimes it's pure over-indexation/duplication caused by facets, filters, tracking params...; other times it's just a wrong implementation of canonicals, robots tags, hreflangs...

So even today in 2018, fixing your SEO issues can be a game changer for some businesses.
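The tracking-parameter flavor of over-indexation is often fixable at the application layer by canonicalizing URLs before they are linked or served. A minimal sketch of stripping common tracking parameters (the parameter list is illustrative, not exhaustive):

    from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

    # Illustrative list; extend with whatever your analytics stack appends.
    TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}

    def canonical_url(url):
        """Drop tracking params so duplicate URL variants collapse into one."""
        parts = urlsplit(url)
        kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
                if k not in TRACKING_PARAMS]
        return urlunsplit(parts._replace(query=urlencode(kept)))

    print(canonical_url("https://example.com/p/42?utm_source=news&color=red"))
    # -> https://example.com/p/42?color=red

Emitting this normalized form in the rel=canonical tag and in internal links keeps the parameter variants from being indexed as separate pages.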



