Google rewrites many page titles

xg15 · on Jan 26, 2022

> Many site owners find that the titles they carefully craft almost all get rewritten.

Yeah, I'm with Google on this one. I don't see many reasons why a site owner would spend extraordinary amounts of time to "carefully craft" page titles other than SEO and optimizing for clickbaitness. As a user, I'm fine with Google counteracting this.

pavon · on Jan 26, 2022

I think that is the worst reason for them to rewrite titles. If they left the title as-is, then I would be able to see in the search results that it was a spammy site and ignore it. Instead Google is helping to launder their SEO and present it as a more legitimate site. If Google thinks a site is gaming their algorithms they should de-prioritize it, not rewrite it.

stickfigure · on Jan 26, 2022

I think you've got this wrong - they should heavily editorialize the titles.

Honest titles of search results:

* Five pages of flowery text and images before two lines of instruction on how to boil rice

* A bunch of tantalizing pictures of exactly what you're looking for but zero further information about it

* Product reviews machine-generated from public review sources with affiliate links. Top-rated product has the best affiliate revenue.

* You won't care about this solution to a problem you don't have.

etc...

titanomachy · on Jan 26, 2022

Ah, rice. The quintessential Asian grain, now consumed by billions around the world. When I was a child, my mixed-race family used to eat rice every day! Even today, the subtle aroma of rice wafting up from the kitchen brings a sense of nostalgia. It's a sure sign that dinner is approaching, my favorite meal of the day...

[5 pages later]

1. Put rice and water in rice cooker

2. Press "start"

prawn · on Jan 26, 2022

Isn't Google one major reason that recipe sites act like this? They've long favoured sites with a lot of textual content (which authors then break up with images) and also penalised sites that people tend to reverse out of quickly? A long story fits that because the majority of people need to read down for the content rather than get their instant answer and immediately retreat.

I find it annoying too, but it often feels like people ridicule the authors when they wouldn't get any traffic if it weren't for that approach. I don't think I've ever searched for a recipe and come across a barebones Gantt-chart-style engineer-thinking recipe plan.

loa_in_ · on Jan 27, 2022

It's not Google per se but a practical impossibility: it has to rank somehow, hopefully as a human knowing the answer would. They could theoretically hire humans to do it but they won't because it's obviously impossible due to how vast the dataset is, so they use software. Software is still far from human level reasoning so they use metrics. Metrics can and will be discovered and gamed, regardless of what kind they are.

prawn · on Jan 27, 2022

As an AdSense user, if you don't use the maximum allowable number of ads (regardless of content length), Google literally emails you to suggest you add more ads. Their documentation encourages you to maintain a reasonable ratio of ads to content at risk of being shutdown, which pushes out page length. They push for unique content (so writers differentiate with personal stories), they measure time on page (longer details, pictures), etc.

CRConrad · on Jan 27, 2022

> I don't think I've ever searched for a recipe and come across a barebones Gantt-chart-style engineer-thinking recipe plan.

https://clovegarden.com/recipes/index.html

prawn · on Jan 27, 2022

Sorry, I might not have been clear enough. I know they exist. I'm saying that I've never searched for a recipe for something and a leading result has been in that sort of format. Google has created the environment in which the maligned 'epic story and photo album finished by actual recipe' formula wins through, yet the recipe creators get the ridicule.

CRConrad · on Jan 28, 2022

> Google has created the environment in which the maligned 'epic story and photo album finished by actual recipe' formula wins through, yet the recipe creators get the ridicule.

They still deserve it, IMO. Willingly making a clown, a pawn of an ad-spamming corporation, out of oneself by doing one's darnedest to "win through" at some perverse game rigged by the aforementioned scourge of the Internet, is neither a natural human right nor a divine command. Not playing that game is still a valid move, and AFAICS the only honourable -- i.e. the only non-ridicule-worthy -- one.

prawn · on Jan 28, 2022

So if you're super-keen on food and trying to establish a career as a recipe creator or food photographer, it's dishonourable to: put in a lot of effort custom-writing supporting material and taking quality photos of the dish you're pitching to people? Sounds like these things traumatise you! :)

I'd agree that misleading made-for-AdSense sites that purport to, but then don't, answer a question, farm out writing to $5/page content squads and intersperse stock photos - that's shoddy. But all the recipe sites I find in my searches and have to scroll down for the content, they always seem like genuine, personal efforts. If I'm getting the content for free, scrolling a little bit as a price isn't too laborious and a stretch to think of it as dishonourable work IMO.

CRConrad · on Jan 28, 2022

> So if you're super-keen on food and trying to establish a career

IOW, you're trying to earn money. Sure, go ahead -- but then you get to pay the price. If the method you're trying to earn money by is going to involve playing along in the game of clickbait, then the price you get to pay is going to be, to be seen as a purveyor of clickbait. Which I, and I suspect quite a few others with me, see as distinctly less than honourable.

It's a free choice: Nobody is forcing anybody to "establish a career as a recipe creator or food photographer" on the ad-financed Internet. If they choose to play the clickbait clown/scum game, they're making themselves into -- so, in the end, are -- clickbait clown/scum. I sure didn't tell them to do that, so I'm perfectly free to see them as such for doing it.

They, OTOH, are perfectly free to try it some other way: publishing printed cookbooks in stead of Internet clickbait; or something adjacent, like run cooking classes, start a restaurant or catering business... Or to do something else altogether.

They could always go into the deeply honourable (/s) business of software engineering, which nowadays seems to consist to about 45% of running ad-spam networks, to about 45% of writing SEO crap to get your ads onto those networks, and about 10% other development... :-( What, me cynic? Bah, geroffmylawn!

prawn · on Jan 29, 2022

I have dozens of cookbooks. Almost every single one is absolutely packed with personal details about the chef and background information on the recipe (who taught them the recipe, their beloved Nanna's method, the history of nut x in remote tribal desert y, etc).

I've been to cooking schools in multiple countries. All have gone into detail about the background of the chef and each recipe.

Same with restaurants. Many restaurants and certainly almost every fine-dining restaurant pushes the profile of the head chef.

CRConrad · on Jan 29, 2022

I can't help that you paid God knows how much extra for this unnecessary fluff. If I were to get any of those, I'd look around for the least extraneous-fluff-y offer I could find. :-)

More seriously: At least the classes and restaurants already push that stuff in their marketing, don't they? So I get all that already doing my comparison shopping, and therefore would probably actually (at least to some extent) resent the time wasted on repeating it. And the few cookboks I (or we, my wife and I) have are also of the matter-of-fact, recipes and nothing more, kind... I am probably just much less of a "foodie" than you. I think my preference pattern is the overwhelming majority.

Note that Clovegarden has "the history of nut x in remote tribal desert y, etc" too -- but on pages separate from the recipes. (As I recall Mr Grygus started the site in preparation for starting a business of selling foodie stuff online after winding up his computing and automation consultancy business -- but that still seems to linger on, and he is nearing [or, probably, well past?] normal retiring age, so I don't know if that new business will ever materialise. But as long as he is up to updating Clovegarden every now and then it remains my favourite site for food-related stuff.)

[Edit: Ttypo.]

sidewndr46 · on Jan 26, 2022

Woah now, shouldn't step 1 be broken up into 2 steps. Each with their own heading and a paragraph explaining how to do that?

AceJohnny2 · on Jan 26, 2022

What kind of rice? Do you rinse the rice first? How much rice?

How much water? Do you salt the water?

edeion · on Jan 27, 2022

Reminds me of Plain Old Recipe, a website that strips out fluff from big recipe websites. You provide a link to a recipe, it makes it to the point. I thought the site had closed but it's apparently still live!

https://plainoldrecipe.com/ https://news.ycombinator.com/item?id=23648864 (Thank you HN :))

rhizome · on Jan 26, 2022

and "rice cooker" is an affiliate link

codeisawesome · on Jan 26, 2022

Let’s also not forget the 55 auto-playing video ads that I need to vault over to get to Step 1. Each one determined to hijack my mouse as I scroll/hurry past and cause a click! It’s like the world’s least fun platformer game.

tdeck · on Jan 27, 2022

You forgot the part where there's a pseudo-recipe after the story that catches your eye but doesn't have any measured amounts, and then the actual recipe later.

codeisawesome · on Jan 26, 2022

I got instantly annoyed by the first few words of this comment, thinking you’d gone off on some tangent about rice… until I saw the last part. Well played!

sosuke · on Jan 26, 2022

This sounds like it would make a very entertaining Chrome extension.

dheera · on Jan 26, 2022

Would also be nice if they edited things to actually be true, e.g.

* e-bike with 10 miles of actual range even though they advertise 30 miles

* laptop with 2 hours of battery life at 100% CPU usage even though they advertise 10 hours

* median $450 flight even though they advertise it as $199

Spivak · on Jan 26, 2022

> laptop with 2 hours of battery life at 100% CPU usage

Is there any laptop on the market that lives up to this. Even top specced MBPs I've gotten from work fall down when you actually use the CPU with compilers and VMs.

hetspookjee · on Jan 26, 2022

My simple M1 mpb 16gb seems to work for almost 2 hours when hammering the cpu. Haven’t timed it actually but I find it astonishing compared to the Dell mess I’ve had to deal with before.

dheera · on Jan 27, 2022

Oh just an example. Hammer it at 100% CPU usage and report battery life based on that.

Or a (min,max) based on idle and 100% CPU.

vel0city · on Jan 26, 2022

You're never going to guarantee some kind of range on an e-bike. What's the temperature of the battery? Is it mostly uphill or down hill? How much are you going to brake?

And advertising laptop battery life based on the CPU getting pegged to 100% gives meaningless information as its rare for people to actually have their device running at 100% load anyways.

dheera · on Jan 27, 2022

> You're never going to guarantee some kind of range on an e-bike. What's the temperature of the battery? Is it mostly uphill or down hill? How much are you going to brake?

Yeah but testing the e-bike on a track and telling the public it has 30 miles of range based on that is disingenuous.

Instead, go to a city with an average amount of hills, stop lights, and cold weather and give in a go, and tell that number to the public. If it beats that, in their actual city they'll only be pleasantly surprised. Right now you strand a shitton of people because they think they have 30 miles.

autoexec · on Jan 27, 2022

That depends on Google being both honest and accurate. Perhaps they have been so far, but my concern would be that a re-written title would cause quality content to get passed over by many viewers as undesirable/irrelevant because some algorithm misunderstood/misinterpreted what it was looking at, or because google wanted to subtly discourage people from content that competes or disagrees with whatever Google is attempting to promote.

In a better world, algorithms would be perfect and there would be a lot of healthy competition in search engines and google would be incentivized to provide users with the best possible results. In our current world Google's algorithm can't identify obvious spam well enough to keep it out of their results and there are no major search engines that haven't been lifting results from Google directly or indirectly and repackaging them as their own, so google has no pressure to do anything but promote whatever is in their own best interests or keep their results accurate and free of spam.

lolsal · on Jan 26, 2022

Imagine if your CLI tools did this.

lelandfe · on Jan 26, 2022

"Gaming their algorithm" sounds like a fancy way of saying SEO. If Google can produce for me a more accurate (or concise) title, it should only help me find what I'm looking for.

Forcing folks to trudge through inaccurate titles – or hoping people know the tells of a "spammy site" title – does not seem a better alternative.

addingnumbers · on Jan 26, 2022

> "Gaming their algorithm" sounds like a fancy way of saying SEO

It's quite the opposite, "Search Engine Optimization" is the fancy euphemism for gaming the algorithm.

withinboredom · on Jan 26, 2022

My favorite is when the title sounds like what you’re looking for only to discover it’s a page full of ads and keywords. The original title doesn’t even match.

That causes me to lose faith in google not a better experience.

lelandfe · on Jan 26, 2022

If that actually happens, I'm surprised the article doesn't cover it. I've never experienced that.

withinboredom · on Jan 26, 2022

I’ve found it most on the 2nd or 3rd page when googling specific but not common error messages.

mostdataisnice · on Jan 26, 2022

I think what HN and the SWE community at large has just missed about Google over the last 10 years is that the product is being built for the masses. Most people would prefer if you just rewrote the title to what it actually was rather than having to take on the cognitive load of understanding what SEO even is.

maxrebich · on Jan 26, 2022

AMEN to that

ashudeep · on Jan 27, 2022

Loughla · on Jan 26, 2022

>I don't see many reasons why a site owner would spend extraordinary amounts of time to "carefully craft" page titles

Because I want the title to be concise, but still help people explicitly understand what my writing is about? Because I've already spent a lot of time on the content, to then just slap 'Lou's Wednesday Website Update' as a title? Because, historically, a title is an introduction to my writing?

Any of those.

judge2020 · on Jan 26, 2022

Regarding one of these examples:

  How to Fix a Broken iPhone Screen [Tested by Experts] - Phone Fixer

->

  How to Fix a Broken iPhone Screen - Phone Fixer

Tested by Experts is obviously clickbait; nobody's going to say [Tested by novices].

Same for things like [Updated 2022] - there are tons of websites that superimpose [updated <currentyear>] even if the article content wasn't updated.

fma · on Jan 26, 2022

If Google believes the site is being disingenuous by writing a click bait headline, then they should punish the site by decreasing their ranking, not reward it by keeping it high and rewriting a more fitting headline.

karaterobot · on Jan 26, 2022

But if the title is spam, and the content is good (this is a big 'if'), the best solution would be to rewrite the title so that it's useful and keep the page at its original rank, based on the content. Ideally, Google would be able to handle all these different cases and just give me the best search results. Now, we all know that's increasingly less true, but in theory that's how it should work.

jtbayly · on Jan 26, 2022

But “for 2022” is a guarantee that the content is bad if it hasn’t changed in 2022.

And yet, I don’t see how Google can automate checking this. It’s possible to add a couple of sentences about how you’ve not seen anything to change your mind about last year’s recommendations. That may well be true. Or false. How can Google know? It just sees content that has changed. So it has been updated in 2022.

The bigger issue is brand trust (as a reviewer brand). The NYT bought Wirecutter, I think, because it had established itself as a trustworthy brand. That’s in direct line with the reputation the NYT wants to have as a whole.

ziml77 · on Jan 26, 2022

I hate how true your second paragraph is. Google should punish sites that change the date without updating the content, but all the SEO spam is just going to automate changing content when it changes the date. And then what does Google do? Figure out how to make an AI that can understand all the indexed content and accurately determine if it's truthful?

That seems fundamentally impossible without defining trusted sources. But then that means that you're trusting that Google's trusted sources are good. And if you do think they're good, then why not just check those sources directly?

The only answer I have is to find your own sources that you trust and go to them first.

bbarnett · on Jan 26, 2022

But if the title is spam, and the content is good

Then the content would not need to be spam, to be high ranking.

Not if google just cared about content quality.

So in this scenario, where only quality counts for rankings, all a spammy title shows, is the desire to bypass legitimate rankings.

Thus, it should be downranked.

Again, this was if Google legitimately wanted to rank good content high.

zepearl · on Jan 26, 2022

I'm not convinced, in general I don't like this additional layer of "fiddling around" with the original contents.

What about the opposite, the title being great but the contents not really? Shall Google serve its own "improved"/"summarized"/whatever version?

Meh... - this reminds me of the snippets of text extracted by some websites that are sometimes shown directly in Google's results, which in my case were sometimes wrong because they didn't take into account the context of what was written in the original contents.

mike_hock · on Jan 27, 2022

It should do both.

josefx · on Jan 26, 2022

Wouldn't it be better for the users to penalize the sites ranking instead hiding the fact that the result is your usual click bait drivel? Rewriting the titles just hides that the results Google found are low quality garbage.

slater · on Jan 26, 2022

Maybe Google does both?

logifail · on Jan 26, 2022

> Tested by Experts is obviously clickbait

If we're going to start filtering all "obvious clickbait" then the search results are going to change fairly dramatically...

JumpCrisscross · on Jan 26, 2022

> If we're going to start filtering all "obvious clickbait" then the search results are going to change fairly dramatically

Isn’t this the intended effect?

logifail · on Jan 26, 2022

> Isn’t this the intended effect?

I hate clickbait as much as the next user, but using that technique to get users to click appears to have even become part of the core business model of previously prestigious outlets.

Picking on the WaPo for no real reason:

How the Washington Post pulled off the hardest trick in journalism https://www.cjr.org/public_editor/washington-post-fluff-news...

An Open Letter to the Washington Post: Please Stop Doing Clickbait https://thedailybanter.com/2016/05/letter-to-the-washington-...

reaperducer · on Jan 26, 2022

As a subscriber to several newspapers, it's always interesting to see how different the headlines are between the dead tree editions, and the online versions — even for the same story.

The dead tree headlines are almost always very factual and to the point. I don't think I've ever seen anything close to something like "Here's four awesome tricks to get China to admit to the Tiananmen Square massacre" as a headline in actual print.

dpark · on Jan 26, 2022

The easiest fix for clickbait would be to penalize them for it.

foxfluff · on Jan 26, 2022

More importantly, if the content is actually relevant to the user's search, does it matter whether the title is clickbait or not?

Clickbait pisses me off when it's used to waste my time, but a good search engine wouldn't give me results that waste my time.

In other words, it could give me a relevant result with a clickbait title.. I guess that'd be a little annoying but I don't know if I would want Google to be the judge on what's clickbait or not, and even then I don't feel like it's their place to override titles. I wouldn't want useful pages be downranked just for having a poor title.

dpark · on Jan 26, 2022

A poor title reduces the quality of the resource, though. I think it’s reasonable that there is some penalty imposed for poor titles, and that could include clickbait. If the result is the best one for the search, sure, surface it. But if it’s not clear, though, “clickbait title” is a signal that the result is not the best.

I do agree it’s not really Google’s place to be rewriting titles, though. That feels very suspect.

logifail · on Jan 26, 2022

> A poor title reduces the quality of the resource, though

Is there an objective way to assess quality?

A click-bait title on a page full of ads and text that keep the visitor's attention but don't deliver on the title... ?

Then having held the visitor on your site for a minute or two, but managing to leave them unsatisfied, how about ending the page with a big fat block of even more visual click-bait content at the bottom (Taboola, I'm looking at you).

Don't advertisers and publishers love this stuff? Great metrics.

Tempest1981 · on Jan 26, 2022

It would be a great feature if they tracked the date when the content actually changed... significantly. I guess that could still be gamed.

LightG · on Jan 26, 2022

Not obviously. If true, adds credibility.

"Phone Fixer" sounds more scammy to me, lol

eulenteufel · on Jan 26, 2022

> Tested by Experts is obviously clickbait; nobody's going to say [Tested by novices].

Nobody would write [Tested by novices] into their headline, but leaving out the part in the brackets would leave it open if it was tested by experts or novices. So in this case the removed bit does provide some information.

phkahler · on Jan 26, 2022

>>> Because I want the title to be concise, but still help people explicitly understand what my writing is about?

And yet from TFA:

>> In fact, we found that matching your H1 to your title dropped typically dropped the degree of rewriting across the board, often dramatically.

Users don't look much at titles - they end up in the browser tab or somewhere like that. If a title doesn't match the H1 heading it's often to get more stuff in for SEO. OTOH short titles might be useful when they show up in a tab where there is limited space. Maybe they shouldn't lengthen them for that reason.

ricardo81 · on Jan 26, 2022

Can't say I agree.

Google should be a neutral middle man providing the results as they are found. If they feel the title is not of their version of quality they should rank it lower.

I'd prefer the version of title of several hundred million individuals rather than Google's aggregated version.

They used to 'borrow' DMOZ titles before DMOZ became defunct. At least in that case it's another point of view on top of their own (and the site author)

edave64 · on Jan 26, 2022

Google can't be a neutral middleman because everybody is trying to manipulate the search results. If everybody is clickbaiting their page titles, and Google just displays them as is, it makes their product worse.

worldofmatthew · on Jan 26, 2022

The solution is not to re-title, the solution is to de-rank clickbait.

azurezyq · on Jan 26, 2022

Well nowadays a lot of well known websites use clickbaits regularly, e.g., wsj and NYTimes. Many times, they are willing to summarize the news in the title when the news itself is not that complicated.

crtasm · on Jan 27, 2022

I'm sure they'd change to better headlines to avoid getting downranked.

bduerst · on Jan 26, 2022

That's assuming there aren't click bait false-positives based on page title.

TheCowboy · on Jan 26, 2022

You step away from neutral as soon as you introduce "version of quality". There will always be an introduction of bias and judgement calls that need to be made to get useful results, especially because bad actors on the web are part of the geography that aren't going away. Just like the press trying to force a neutral "view from nowhere" leads to confused and problematic journalism that can be exploited by bad actors.

https://pressthink.org/2010/11/the-view-from-nowhere-questio...

ricardo81 · on Jan 26, 2022

Indeed, quality/bias/judgement - I wouldn't argue about it wrt 'going away from neutral', I just meant that if a decision is to be made, either de-value it or show it in the top results, either way don't tinker with the information as it was laid out.

avivo · on Jan 26, 2022

I agree in theory for SEO mills... but it can apparently go a bit overboard!

Concrete personal example:

- Title shown by Google: "Policymaking Beyond Corporate CEOs and Partisan Pressure"

- Original title: "Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure"

Rather large difference!

More details in another comment: https://news.ycombinator.com/item?id=30087485 , but search term is just "platform democracy" (2nd result)

unglaublich · on Jan 26, 2022

For the same reason they extraordinary amounts of time to "carefully craft" the content of the page? And the images, and the citations, and the links, etc. For the sake of quality.

tweetle_beetle · on Jan 26, 2022

I think I see where you're coming from, but come to a different conclusion.

If you are, rightly, disappointed about low quality results in SERPs, then why not direct your frustration at Google's search algorithm? But ultimately once the algorithm has decided what to return, I don't want any of it to be tampered with. Maybe there's an argument that once you're using a black box, it might as well be the best black box it can be, but I don't agree.

I wonder whether there is a case for legal action here. Google would not have wasted time developing this rewrite engine unless it had an effect on clicks. Whether that is positive or negative, only they truly know. What if it was found that it was, or wasn't, being applied consistently to the results of their competitors, but not their own sites, for example?

ren_engineer · on Jan 26, 2022

Google isn't doing this for the user, they are doing it so Ads are more clickable than organic search, they want people clicking on Ads. I can guarantee they won't rewrite the clickbait ads written by marketers who are paying for space. The result is ads are more likely to be clicked

100% of the above the fold content is now ads on many search terms, Google is doing everything they can to squeeze more ad clicks, not provide the best information to their users

kevin_thibedeau · on Jan 26, 2022

"How to growth hack your old website after reaching market saturation"

bryanrasmussen · on Jan 26, 2022

Some titles of the past before they were optimized for clickbaitiness:

Omelas, bye-bye (The Ones Who Walk Away from Omelas)

Things are looking up (Great Expectations)

A crying cop (Flow, my tears, the policeman said)

The one that got away (The Old Man and The Sea)

on edit: I expect someone will point out those are the names of works of literary fiction not webpages, but obviously if we assume that webpages do not deserve the kind of respect we would give a creative work in book form and not change the title because it suits our needs, then we should not spend all our time complaining that the content of the web is just lousy stuff that nobody would care if you changed with an algorithm.

onion2k · on Jan 26, 2022

As a user, I'm fine with Google counteracting this.

The problem there is that "optimizing for clickbaitness" means "making the titles as appealing to click on as possible when they're displayed in search pages". Google deliberately making them less appealing to click on means Google are reducing the effectiveness of organic search results, and that favors adverts instead.

In other words, what you are saying is that you believe it's valid for Google to rewrite website content to make search page adverts more appealing than the actual search results.

That is very hard to justify. If Google wanted to 'punish' sites for being too clickbaity then they should drop that site's position in the search rankings. Ranking it highly but rewriting the title to be something worse (or 'less clickbaity') is a massive abuse of their search market position to favor their ad business.

ASalazarMX · on Jan 26, 2022

Specially when the article ends with:

> Want to optimize your titles for increased traffic?

> We built a title optimizer to take advantage of the outsized role titles play in SEO. Free to try.

Definitely SEO gaming.

falcolas · on Jan 26, 2022

If this were being done by a person, I might agree with you.

But it's not. It's being done by an algorithm which was carefully crafted to improve someone's chance for getting a promotion. It won't be maintained long term, yet it will continue to punish articles based on wholly arbitrary, biased, and opaque logic.

reaperducer · on Jan 26, 2022

If this were being done by a person, I might agree with you. But it's not. It's being done by an algorithm which was carefully crafted [by a person] to improve someone's chance for getting a promotion.

I made a little change there. Algorithms don't just magically appear like leprechauns and unicorns.

greiskul · on Jan 26, 2022

Google search is one area of Google in which this big company problem actually doesn't happen that much. Changes in the algorithm are never implemented by fiat, Google employs raters, and performs blind experiments to test if a change to search actually improves user satisfaction before rolling it out to everyone. So at least they must have some data that it increases user satisfaction, both with the metrics of the signal they measure, and with subjective raters satisfaction.

libertine · on Jan 26, 2022

I don't get this, why are you ok with bots changing your content, even if it's to be displayed on Google SERP?

Why stop at the titles?

I have an idea, let's have bots rewrite the content in a compact tl;dr format and have it be directly displayed on Google SERP, as user, the less actions I take the better right? You don't even need to leave the SERP.

Why can't I just choose what title I want in my blog to be indexed, and if Google wants to penalize it, so be it?

NavinF · on Jan 26, 2022

> let's have bots rewrite the content in a compact tl;dr format and have it be directly displayed on Google SERP, as user, the less actions I take the better right?

Fuck yeah! I’d pay monthly for a search engine that does this consistently. Google already does this for the articles that are easy to parse, but I’d love to see what newer methods based on language models can do.

Btw this article is talking about the <title> tag which is mostly used for SEO since users don’t see it on the page. I don’t think search engines have ever cared about it all that much.

duxup · on Jan 26, 2022

And every site has different motivations.

How many times do we bicker about titles that make no sense / are deceptive on HN...

The whole situation is a mess.

ransom1538 · on Jan 26, 2022

"I'm fine with Google counteracting this."

The ministry of truth. Google shall own all truth.

worldofmatthew · on Jan 26, 2022

They should de-rank clickbait websites, as many of them qualify as webspam.

yucky · on Jan 26, 2022

>As a user, I'm fine with Google counteracting this.

Would you be fine with Google changing the work of all authors? Maybe "The Brothers Karamazov" doesn't get enough clicks and Google decides it needs a better title. Or "A Portrait of the Artist as a Young Man" doesn't quite convey what Google thinks it should...

How is that different?

time_to_smile · on Jan 26, 2022

To be fair, The Karamazov Brothers is arguably a more natural English translation.

kevin_thibedeau · on Jan 26, 2022

It's perfectly cromulent English.

yucky · on Jan 26, 2022

Should Google adjust it then?

tiborsaas · on Jan 26, 2022

Google has a guide on this :)

https://developers.google.com/search/docs/advanced/appearanc...

justin_oaks · on Jan 26, 2022

While the Zyppy article is interesting in that it has statistics about the title rewriting, the Google guide on writing proper titles is more relevant to all of us who maintain websites affected by this. Thanks for linking it.

The ideal article would be something like "Google rewrites your titles in search results because your titles suck."

The Google guide does well to explain why some titles are rewritten, such as having duplicate titles across multiple pages, making it impossible to differentiate between pages that show up in the same set of search results.

bo1024 · on Jan 26, 2022

In other words, Google's policy is that the search result isn't showing the page title, it's showing Google's short description of the page. If Google thinks your page title is an adequate description it might use that, otherwise it will write its own.

(edit: and Google has enough self-importance to advise you to write your title as if it were a short description, to make their job easier)

seren1 · on Jan 27, 2022

TIL that indexing and crawling are different, and robots.txt prevents Google from crawling but not indexing.

dahart · on Jan 26, 2022

> Takeaway: to dramatically decrease the chance of Google rewriting your title, matching the H1 to the title tag seems to be an effective strategy.

Of course it should be mentioned this wont last if it becomes popular. Historically every time an SEO trick gets popular, the rules are adjusted. Even having this article on the front page of HN might be enough to see Google react by rethinking how (or whether) tags in titles affect the title rewrites.

nerdponx · on Jan 26, 2022

I wonder if Google is going to try out "AI"-generated titles that are directly summarized from the page content by machine, treating the page title and headings as inputs to the model.

soco · on Jan 26, 2022

Next step, an AI to regenerate the contents according to what the AI thinks I should have said. </s>

mhitza · on Jan 26, 2022

Problem solved WRT copyright issues relating to news articles. If the AI derived content (a la GitHub copilot) is deemed as original "unlicensed" content, no reason to force users to visit the website. (it's been a while since the news media and Google had their legal battles, and I'm unsure what the end resolution was then)

propogandist · on Jan 26, 2022

artificial intelligence (or at least their corporate puppet masters) are fighting for copyright law protection on insights the AI derives from reading copyrighted pages and content on the internet.

a robots.txt can keep you safe

phreack · on Jan 26, 2022

I would be surprised if that's not the case already!

shockeychap · on Jan 26, 2022

One think I don't see getting discussed in the pros and cons is the simple fact that you can't even tell what titles have been rewritten. Google gives no information in the search results to tell you what is original and what they've rewritten. This matches other trends like how it's become ever harder to discern sponsored ads from organic search results.

I used to love Google for how it presented relevant results and made it easy to discern sponsored ads. Today, I avoid Google products like the plague. (I can't escape all of them, but I'm about 90% off.)

avivo · on Jan 26, 2022

I had this problem recently, was hoping there was a reasonable fix but it appears not... (the H1 already contains the title)

I don't think about SEO, and just focus on useful writing / societal impact. However, I recently discovered by accident that I ended up with a top 2 search result for "platform democracy": https://google.com/search?q=platform+democracy .

But the title is missing the first 3 words—including the key words "Platform Democracy" — so that if I was a random person aiming to learn about the concept, I would likely skip over the result! (I almost did even though I wrote the piece!) This seems not ideal for either users or Google, and also an interesting exploration of AI/NLP impacts, so I tried to dig a bit deeper.

I had a brief exchange with Danny Sullivan, Google's public @searchliaison on it on Twitter (https://twitter.com/metaviv/status/1484636387366289413) which linked to two guides from Google on this. Sadly neither were particularly helpful, but will share them here in case they are helpful to others:

- https://developers.google.com/search/docs/advanced/appearanc...

- https://developers.google.com/search/blog/2021/09/more-info-...

(Also plausibly relevant: I have http://platformdemocracy.com/ redirect to the piece. I imagine this might impact search ranking, but I would be surprised if it impacts the title rewriting.)

cyrusshepard · on Jan 27, 2022

Author here. Frustrating situation. As the title is long at 84 characters, we know that Google is definitely going to rewrite it. The simplest way is to break it into parts and get rid of the shortest part that still makes sense.

So maybe take

'Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure'

And 1) condense it and 2) lose the colon

'Platform Democracy is Policymaking Beyond CEOs & Partisanship' (60 characters)

If that is too condensed, you could try a a short title in the <title> and a longer title in the copy.

ttty · on Jan 27, 2022

Your Google result is actually "Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure"

butler14 · on Jan 26, 2022

Zyppy's content marketing efforts aside, this wouldn't be so much of an issue if Google was any good at it

But as with its meta description rewrites, they're often worse than what was there to start with, and in some cases completely change the meaning, to the detriment of searcher experience

hammyhavoc · on Jan 26, 2022

Thinly veiled content marketing for Zyppy, complete with CTA at the bottom, and mentions of themselves throughout, including: "Fortunately, here at Zyppy, we have a large database of titles thanks to our title tag analysis tool. Armed with this data, we set out to determine how often Google rewrites titles and the scenarios which trigger this behavior."

Furthermore, "HTLM" instead of "HTML"? Needs proofreading. Lol.

partiallypro · on Jan 26, 2022

Pieces from Cloudflare and even Google themselves are posted here all the time, and have CTAs at the bottom.

hammyhavoc · on Jan 26, 2022

And you bet I flag them.

nerdawson · on Jan 26, 2022

Your point about proofreading seems fair.

Pretty much any company producing blog content is engaging in content marketing though. I’m not sure I understand the criticism. Perhaps this particular piece was overly self-promotional?

Sure, there’s a balance to be struck, but I thought the article had some decent takeaways.

KerryJones · on Jan 26, 2022

Are you implying that what they're saying isn't important or invalid? Or, what is your point?

hammyhavoc · on Jan 26, 2022

I'm saying that we should expect better from HN, and the people who frequent it. Otherwise, it's just a poor alternative to Reddit. If I wanted content marketing that nobody could even be arsed to have proofread then I'd go elsewhere. The ever-decreasing standard continues.

floatingatoll · on Jan 26, 2022

Quoting the last paragraph of the guidelines, which has a bunch of supporting hyperlinks:

> Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.

I dislike that they’re an SEO company, but I don’t object to self-serving posts, as long as they’re also curious and interesting to me. Seeing a comprehensive list of how Google rewrites page titles is interesting to me, because I’m fascinated with headline writing.

I’m sorry that you find self-serving content and calls-to-action to be problematic, but I would warn you to expect more of them on HN over time (as neither are violations of the site guidelines). There’s no need to claim it’s going to turn us into Worse Than Reddit. It’s been this way for HN’s entire life, or at least the chunk I’ve been present for. HN’s just the same as it always was. I respect your outrage but you chose an invalid expression of it.

hammyhavoc · on Jan 26, 2022

This isn't outrage. This is my opinion. I'm outraged by many things, some bullshit on HackerNews and your opinion of my opinion does not bother me. Have a nice day.

Chris2048 · on Jan 26, 2022

> Otherwise, it's just a poor alternative to Reddit.

There are plenty of great subreddits. Every time I hear reddit used as a put-down it feels very snobbish.

hammyhavoc · on Jan 26, 2022

Sure. Likewise there's far more terribly moderated ones than good ones. OK, "the average badly moderated sub-Reddit". My opinion remains.

floodyberry- · on Jan 27, 2022

"SEO" is pure marketing whose sole purpose is to sell you a worthless product. The article says nothing you will ever need to know and it, and all other "SEO" "content", should be blocked as spam

phreeza · on Jan 26, 2022

It's quite funny to me that people are running these kind of almost scientific experiments on a fully human-generated and in principle knowable system. The reason are understandable of course but it does seem like a waste of human energy.

gibspaulding · on Jan 26, 2022

You see this a lot in gaming too. There are entire sites, devoted to figuring out what various weapon attachments actually do in Call of Duty. If you poke around the Minecraft wiki you'll find the same thing - people working out exactly how fast you can move with different potion effects or how scaling works when leveling up enchantments.

Theoretically, all of this information could be found in the source code, but without it gamers are left to an endless research project.

yjftsjthsd-h · on Jan 26, 2022

I'm not convinced that having the source code is necessarily a perfect shortcut to accurate results. Video games, in particular, seem to be subject to a decent amount of emergent behaviors such that scientifically measuring things is honestly probably a better option than trying to read the source code to find out what the developer thinks should happen.

spijdar · on Jan 26, 2022

At least in the case of gaming, I think (some) people actually enjoy this aspect. It's a waste in a lot of systems, but in an "art", I think it can elevate the experience, at least for certain games and genres.

An interesting inverse of the norm is the Roguelike ADOM. Most similar games from the same time period like Angband, Nethack, and DCSS were open source, while ADOM was a free, but proprietary game. The other games' secrets were open-book, with no real secrets to speak of as the source code is scoured by players. ADOM remains sort of interesting to me as there are red herrings in even the machine code to throw off reverse engineering, and genuine secrets that open source games simply can't have. I've always appreciated that you can't simply look at the source to know everything, anyway.

gibspaulding · on Jan 26, 2022

You are certainly right, there's a certain appeal to the mystery!

I remember reading an article on the Minecraft wiki about how to achieve the slowest possible movement, which is of course a totally useless thing to do in game, but you could see someone had put a ton of thought into working out how to do it! And who's to say that your slow machine is less an expression of artistry than playing the game "right" and building castles!

LegionMammal978 · on Jan 26, 2022

Minecraft at least is effectively open source; however, many of the quantities being measured are indirect consequences of the physics engine which would be difficult to derive from the constants in the code.

capableweb · on Jan 26, 2022

Agree. Then you imagine that the entire SEO industry is basically based around the idea that a company has a algorithm only they know, and the industry is trying to reverse-engineer that algorithm. If they released a whitepaper describing exactly how it works, the entire industry would have to change their ways to consulting already public information instead of experiments like this.

Veen · on Jan 26, 2022

If they released a white paper explaining how it works, the search results would have even more spam than they already do.

shadowgovt · on Jan 26, 2022

Interestingly, they are likely to find things that the developers themselves don't yet know.

These systems are large and complicated and time is finite. When it comes to analysis of a written system, there's a lot more time free-floating in the global network of users than there is in the group of a dozen, maybe a hundred, developers who wrote the engine (many of which have immediately been re-tasked to write something else).

yomansat · on Jan 30, 2022

Here's an example you might like:

https://twitter.com/JoelBurgess/status/1428008041887281157

chillingeffect · on Jan 26, 2022

Great systems view, that's the general basis of cooperation vs competition. we keep some things secret, stimulating other people to expand energy and think creatively instead of doing it for them. It becomes wasteful when the energy required to produce new information and techniques is impossible to obtain. e.g. in massive inequality: a homeless person just can't gain the skills to obtain a job, or an oppressed population can't overcome the excitation energy needed to free themselves. It's also the reason we outlawed monopoly in the U.S., only to reach the local minimum of duopoly.

the_biot · on Jan 26, 2022

OP is an SEO company. Wasting human energy is what they do.

tonmoy · on Jan 26, 2022

Haven’t we been doing similar thing with for example stock market analysis when we analyze a company’s earnings/management etc?

valyagolev · on Jan 26, 2022

it's like calling adversarial ML a waste of energy. we use this approach for the problems where we want to preserve a lot of variety in solutions

zerkten · on Jan 26, 2022

This is tangential, but it could similarly be used to describe many support teams which staffed by non-technical folks, or are cut off from engineering for cultural, political, etc. reasons. It's a complete waste of energy, but for various reasons people get put in these situations and experiment instead of talking to an authority in another department, or getting an expert on their team. It can be sustained for a surprisingly long amount of time as well before someone gets called out on inaccuracies.

picture · on Jan 26, 2022

Similarly, there's also "research" being done to decipher and understand Apple's hardware and software. It does seem like a waste of human brain cycles.

eraserj · on Jan 26, 2022

Ironically, they failed to organize the world's information and make it universally accessible and useful.

bmitc · on Jan 26, 2022

The opaqueness of human systems is a real issue. It basically describes 99% of issues in the workplace, and those are systems in the small.

renewiltord · on Jan 26, 2022

The waste is the point. One might as well wonder why I keep my password secret and force hackers to break it.

_lqaf · on Jan 26, 2022

> but it does seem like a waste of human energy

Legal processes are enormous wastes of human energy on what are usually negative-sum games.

In only humans could cooperate.

JumpCrisscross · on Jan 26, 2022

> Legal processes are enormous wastes of human energy on what are usually negative-sum games

I don’t know if this is true. Private legal disputes can be purely antagonistic.

They can also be a form of short-term adversarial long-term adaptation to new information. Court cases, on the other hand, produce precedent (irrespective of the legal system). That, too, helps guide a society through novel circumstances.

jpswade · on Jan 26, 2022

Google has done this for practically as long as I can remember. If you remember when dmoz was still a thing, Google would favour the title from that, rather than the site's actual title because it perceived it as more useful to the user as it was moderated. By now I would expect that Google has used this and real moderators to train their machine learning model to rewrite titles, perhaps as a way to, you know, hopefully make the product more useful.

johnnyApplePRNG · on Jan 26, 2022

Speaking of rewriting titles... I noticed that HN reworte a self post title a few days ago. [0]

Why is HN editing self post titles?

[0] https://news.ycombinator.com/item?id=30053890

slater · on Jan 26, 2022

Mods frequently rewrite submitted titles, either cos it's not the same title used in the article, or because there's a better wording for the HN crowd ¯\_(ツ)_/¯

Check dang's (HN mod) comments:

https://news.ycombinator.com/threads?id=dang

johnnyApplePRNG · on Jan 26, 2022

This title edit was not for an article, which is what confused me.

A user submit a self post, a rant basically which got popular, and the title was edited hours later.

zerocrates · on Jan 26, 2022

It's totally ordinary for a post to go front-page and attract hundreds of comments on the basis of a title that's later deemed too interesting/editorialized/whatever, and you later see some dry uninformative title like "Google account security" or whatever it was occupying a top slot.

I can't say I've noticed it often before for text posts, but I do generally think this pattern of closing the barn door after the horse has bolted is pretty silly in general.

johnnyApplePRNG · on Jan 26, 2022

It's just wrong to change the title of a self post imho.

Without making it publicly known at least... you're changing what the poster intended to say.

Editing sensationalised headlines back to sanity makes perfect sense though.

pxeger1 · on Jan 26, 2022

This article has been really badly proofread (or probably not at all)

- "HTLM"

- "includ"

(unmatched parentheses

only4here · on Feb 2, 2022

https://ghostarchive.org/archive/54mNm Archived this a few days ago since I knew they'd fix the typos quickly.

omoikane · on Jan 26, 2022

(An unmatched left parenthesis creates an unresolved tension that will stay with you all day.

https://xkcd.com/859/

vba616 · on Jan 26, 2022

\(\\\\

Escaped left parenthesis and two backslashes, or a cross section of a phalanx?

Aardwolf · on Jan 26, 2022

teddyh · on Jan 26, 2022

(⁽₍⟮⦅⸨﴾﹙（｟

pphysch · on Jan 26, 2022

POV: your about to learn the most esoteric LISP yet

saalweachter · on Jan 26, 2022

A LISP-like language that only used various left-brackets sounds even worse than the whitespace-based programming language.

stevelosh · on Jan 26, 2022

Common Lisp isn't that esoteric!

    (defun fun-reader (stream arg)
      (declare (ignore arg))
      (read-delimited-list #\⸨ stream t))
    
    (set-macro-character #\⸨ (get-macro-character #\) nil))
    (set-macro-character #\( #'fun-reader)
    
    (defun square (x⸨
      (* x x⸨⸨
    
    (square 10⸨
    ; => 100

martin_a · on Jan 26, 2022

"Hello World" sample in some esoteric language?! :-)

nerdponx · on Jan 26, 2022

These characteristics all seem like Google attempting to combat SEO-oriented spam titles.

godshatter · on Jan 26, 2022

...by making the titles more enticing for users to click on. Doesn't seem like that great of a solution. I would rather they went the other way by prepending "[possibly spam]" or "[possibly clickbait]" to the title.

There's an arrogance to this whole process that amuses me. I can picture the dev team responsible for this code sitting in a meeting with the lead dev saying "it appears that some idiot web authors are using titles that have extra information placed in brackets! Idiots. Well, let's brainstorm possible solutions to this problem so we can protect our idiot users from this obvious menace..."

Leaving them alone and giving us back pagination of search results would solve this problem for me. Or they could demote these sites that they think need their titles rewritten in their search algorithm.

igouy · on Jan 26, 2022

The issue isn't that Google search results exclude stuff that was in the page title.

The issue is that Google search results insert stuff into the page title that wasn't there.

So the issue isn't that the overly long + pipe

    <title>Which programming language is fastest? | Computer Language Benchmarks Game</title>

is abbreviated. The issue is that the domain of the hosting service is inserted, which gives the misleading impression that this is a project in-some-way approved and promoted by the Debian organisation:

    "Which programming language is fastest? - Debian"

:when it would be better just to snip:

    "Which programming language is fastest?"

the-dude · on Jan 26, 2022

HN rewrites page titles too.

xnx · on Jan 26, 2022

20+ years ago also used the meta description tag from a page instead of page text snippets. We're many decades past blindly accepting page author provided content as being the most useful thing to display. People keep thinking of Google as a search engine greps pages to find matching text. That is old/obsolete thinking, any Google-like services has evolved how directly it can return the information/answer you seek instead of returning a page that may contain the information/answer you seek.

eyelidlessness · on Jan 27, 2022

What I find fascinating is I’ve seen a small, but increasing, subset of results where:

- the result title is clearly not original, usually derived from content on the page

- the original title is known to be generated

- the original generated title is as close to harmless as any web content could be

- the result title is actively harmful and misleading

- the original title is demonstrably better

- this idiosyncrasy is applied to very high trust hosts (eg GitHub)

- it’s not applied to the same content from obvious scraped content/spam/scam sites with obvious tells

xnx · on Jan 26, 2022

Good. Google's interests more closely align with my own than page authors. I'm glad to have Google as an agent working for me to make more useful page titles.

lowbloodsugar · on Jan 26, 2022

Surely the misconception is the belief that what google displays is the page title. Google displays a link to a page, with a short description of what you will find there. Likewise, when I link to a page from my page, I don't use the title of the page: I use some text that I chose. This a non-story, as far as "rewriting titles" goes. What is interesting is that Google has an automated way to briefly summarize a page.

tapland · on Jan 26, 2022

Every day my desire to be able to rate sites relevance after a search increases. And I'd love to be able to choose if the original or Google generated title was the most relevant. (Cmon there is some machine learning training potential in that).

Rather that than ditching google search completely which is getting closer every day.

wodenokoto · on Jan 26, 2022

On modern browser, the page title is almost completely obscured. It’s not a thing users generally see, and the few views where users have access to the page titles that are not some sort of developer tool, the title is more often than not cut short.

I don’t see why google has to use the page title as a headline for a link result.

5Qn8mNbc2FNCiVV · on Jan 27, 2022

Something has to be fishy with this because I get tons of "Untitled" results now which directly lead to spam. This sucks big time because I usually got really good results since I search a lot coding related things and now I cannot use this account anymore for searching

echelon · on Jan 26, 2022

https://web.archive.org/web/20220126145329/https://zyppy.com...

Since people are reporting failure to load.

bobbyi · on Jan 26, 2022

This data is based on what's seen in the wild, right? So if they see text in brackets removed more often than text in parentheses, that could reflect what sort of text people tend to put in brackets vs parentheses rather than (or in addition to) how google treats those characters.

travisgriggs · on Jan 26, 2022

I’m no search engine expert. Is this standard practice at some level across other search engines? Is “retitler” just part of every search engine stack (e.g. DDG, Bing, BraveSearch, etc)?

Or is this unique to the “I’m Feeling Lucky” folks?

Honestly curios.

pictur · on Jan 26, 2022

Google literally turns the internet into a garbage dump. There are so many spam news sites that can come to the fore thanks to their seo nonsense that the sites that provide real news are not even seen lol

Zachsa999 · on Jan 26, 2022

I hate those dropdown things that say "How to change gamma values in gimp" and they lead to a YouTube tutorial.

Please stop serving me YouTube tutorials; they all suck.

Giorgi · on Jan 26, 2022

Ok... owning couple of dozen sites all submitted and fully indexed, I have never seen even single URL re-written. Does this really happen?

saalweachter · on Jan 26, 2022

Google claimed that the original <title> is used "more than 80% of the time" when announcing the change[0].

Combining this rate with the rate seen by the article (rewritten 61% of the time, on the subset of 81,000 URLs they were interested in), I'd guess that some websites see a lot of rewrites and many other websites see none at all.

[0] https://developers.google.com/search/blog/2021/08/update-to-...

cyrusshepard · on Jan 27, 2022

So an interesting distinction here is required! When Google says they use the title 80% of the time, they mean they use the title 80% of the time to create their search result title, which they may or may not modify. The other 20% of the time they use an H1 or other elements on the page.

treycopeland · on Jan 26, 2022

Yes, happens 61% of the time.

friendlydog · on Jan 26, 2022

Unless the website is public domain or license to freely remix, isn't Google violating copywrite law by creating a derivative work?

animal_spirits · on Jan 26, 2022

Hackernews rewrites many post titles

randrews · on Jan 26, 2022

Including this one, ironically.

uptownfunk · on Jan 26, 2022

Oh I missed the memo where we added Google to the list of things Hacker News loves to hate on.

jpswade · on Jan 27, 2022

Has anyone else started seeing results with titles as “Untitled”?

endisneigh · on Jan 26, 2022

tldr: Google sometimes uses headings instead of titles. Match them to prevent title rewrite; stop using long, verbose titles

indigochill · on Jan 26, 2022

From a pure HTTP perspective, isn't the point of page titles to be how a page is referenced? It would be an error if a library reported the title of "A Tale of Two Cities" as "It Was The Best of Times".

> stop using long, verbose titles

This is good advice, but if Google wants to penalize bad titles it should dock their rank, not misreport them.

tiborsaas · on Jan 26, 2022

> isn't the point of page titles to be how a page is referenced

It is, but what would you do if all titles across pages just said "ACME Corp."? That happens often if the developer just displays SITE_NAME in the title.

In those cases it makes sense to present the person searching the web with additional information from a H1 tag which probably has more information like "Contact us"

renewiltord · on Jan 26, 2022

No thanks, as a Google user, I’m happy that Google is descriptive.

Ideally, Google tells me what the page actually contains. I.e. if you title the page “Top TVs of 2022” and you’re reviewing cars, then it titles it appropriately. Google can’t do that right now, but every step closer is a good thing for me.

shadowgovt · on Jan 26, 2022

There's lots of "isn't the point of..." in HTML that actual users have broken. Google (and other crawlers and intermediaries) have to adapt their algorithms to account for that.

indigochill · on Jan 26, 2022

> Google (and other crawlers and intermediaries) have to adapt their algorithms to account for that.

As I see it, Google's in a prime position to algorithmically reward actual users for better HTML discipline by ranking them above users who can't be bothered.

colejohnson66 · on Jan 26, 2022

The Search Console is great at knowing if your site could use some improvements. They could easily[a] add a mark for “bad titles”.

[a]: “easily” because they already have logic to determine something needs rewriting

ldjb · on Jan 26, 2022

HTTP actually has nothing to do with page titles. I think web browsers should probably display the titles verbatim, but there may be use cases where they don't, a common one being where there isn't enough space so the title is truncated in the UI.

As for what search engines should do with page titles, it's really up to the individual search engine, I'd say. Whatever serves their users best.

marginalia_nu · on Jan 26, 2022

As a search engine developer I totally get why. HTML in the wild is not well behaved in the slightest. People use title and heading tags in all manner of weird ways. I've seen <title>-tags in the <body>-tag used as headings. I've seen documents where every line was a <h1>-tag.

You kinda need to make the most of what you're given.

munk-a · on Jan 26, 2022

HTML5 has been around for long enough that we should be able to punish sites that use completely bonkers markup at this point right? Since Google effectively has historical archives of the internet they could pretty trivially grandfather in legitimately old content (things they tracked before some date) and just start down-ranking sites that continue to misbehave with markdown but skate by with browsers running in compatibility mode. Something like abusing <h1> tags is legal, if obnoxious, HTML and so it shouldn't really fall under this... but it's been long enough that we can start punishing completely incorrect syntax right?

marginalia_nu · on Jan 26, 2022

That would be a massive loss, though. A lot of content isn't in HTML5, and a lot of that pre-HTML5 content is precious and valuable.

Google has sadly already tossed a lot of that by the wayside, since it often isn't served with HTTPS. I think something like 80% of the sites my crawler is aware of serve pages over plain HTTP.

In general, attempts at shaping the web through search engine indexing requirements seems to mostly serve to filter out content made by humans and select for search engine marketing.

ncphil · on Jan 26, 2022

Not so sure older content (like the stuff I wrote in the late 90s to mid 00s) would be negatively impacted, so long as search providers pay careful attention to the <!DOCTYPE> tag (or lack thereof). I wouldn't characterize holding people to at least a bare minimum of standards (e.g., title in the head and nowhere else, which has been the rule since at least HTML 2.0 in 1994) as "punishment", any more than dinging them for unclosed parens and other typos. Language is how we communicate understanding, and markup is how we frame presentations on the web (mostly). People need to be prepared for the consequences of making it up as they go along rather than educating themselves on the standard (whether spelling, grammar or markup language).

marginalia_nu · on Jan 26, 2022

That really doesn't seem to be what I'm seeing, having built a search engine specialized in this type of content and finding almost nothing but gems in the refuse.

If anything, it seems like the single best predictor of whether a website is a content mill is strict adherence to modern web standards and other "google rules".

munk-a · on Jan 26, 2022

I think it'd be a pretty good to let in historical stuff on grace - and just start penalizing new content. Google absolutely has the tools to do this the right way and the internet archive could allow most other folks to accomplish the same thing.

Enabling HTTPS is easy on most platforms. Folks that have rolled their own platform or got unlucky and are using a CMS that fell out of favor do tend to get screwed over by this - but I think its fair to de-prioritize content that fails to adhere to good practices. The HTTP vs HTTPS debate in particular can be a real security concern - with tags its more about paying down the tech debt in our browser technology.

dylan604 · on Jan 26, 2022

I really wish browsers would stop shrugging their shoulders at bad markup and display blank pages with errors in the consoles or even visible in the rendered page. It would force devs to clean up their act. But as long as 1 browser vendor doesn't do it, the end users will all just assume the strict browser is broken since there is another browser that does "work".

onion2k · on Jan 26, 2022

On the website of the company I work for, the title is "tagline | company name" but in the search results it shows up as "company name: tagline". That style doesn't appear on the company website anywhere.

I imagine it's Google trying to normalize how things are shown but it's quite annoying. It could potentially break some company's branding.

zbrozek · on Jan 26, 2022

Shrug. The almost-religious belief in the necessity for ultra-consistent branding within some companies is nearly comical so long as you're on the outside.

Jcampuzano2 · on Jan 26, 2022

Sadly agreed. Definitely not comical when I have multiple times had our marketing department blame/throw fits at the dev team for the site not showing up in Google's search results exactly how they want it to.

shadowgovt · on Jan 26, 2022

To be fair to the devs, that's an education gap. The response should be "You want us to develop a solution to a third party's whims? Maybe you should try writing them a nice letter about how their representation of our company affects our image; it'll have as much impact. Possibly more."

In real corporations, of course, that's not how it works because the tech people are "wizards" and Google is "part of the wizard stuff," but this isn't a technical problem (and maybe marketing needs to stop trying to control another company; that's no more likely to succeed than Coke yelling at Amazon that they don't always put Coke products at the top of every search result).

onion2k · on Jan 26, 2022

It's relatively minor for most businesses, but sometimes it isn't. Inconsistent messaging makes it a lot easier for someone to set up a phishing attack against your customers. My bank uses several different URLs, email sending addresses, and taglines for its services. It's not always easy to tell if an email is actually from the bank.

Google adding more permutations into the mix doesn't help.

nightpool · on Jan 26, 2022

Google changing the way title tags are formatted on their SERP is not the reason that your bank's customers are falling for phishing attacks.

onion2k · on Jan 26, 2022

Of course not. I didn't say it is. It's a whole bunch of things. Google changing things is one very small factor.

But it is a factor...

duckmysick · on Jan 26, 2022

Related, but I dislike when I'm bookmarking a page and the title is one word - the name of the product or the company. It makes it hard to search for it later.