
Pardon this off-topic comment but would anyone happen to know why on pages like this I see unusual characters in place of punctuation? Screenshots: http://s3.amazonaws.com/2009/safari.png http://s3.amazonaws.com/2009/firefox.png. It doesn't matter which browser I use or what user account I'm logged in as, my Mac chokes on certain web pages - the linked article being an example.

The page is encoded in ISO-8859-1, but the web server is reporting it as UTF-8. (Probably because the page dates back to 1999, before Unicode became a common concern for webmasters.)
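To see why the punctuation turns into odd characters, here is a minimal Python sketch of the same mismatch. The sample string is made up for illustration and is not taken from the linked page; the point is only that bytes written as ISO-8859-1 are generally not valid UTF-8, so a UTF-8 decoder substitutes a replacement character for them.

```python
# Hypothetical sample text; any non-ASCII Latin-1 characters would do.
original = "café à la carte"

# What the author's editor wrote to disk: one byte per character.
raw = original.encode("iso-8859-1")

# What a browser does when the server claims UTF-8: the bytes 0xE9 and 0xE0
# start multi-byte sequences that never complete, so each one becomes U+FFFD.
wrong = raw.decode("utf-8", errors="replace")

# What the browser does once the encoding is overridden by hand.
right = raw.decode("iso-8859-1")

print(repr(wrong))
print(repr(right))
```

Switching the encoding in the browser menu corresponds to the second decode: the bytes on the wire never change, only the table used to interpret them.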

In Firefox, select View->Character Encoding->Western (ISO-8859-1). In Safari, select View->Text Encoding->Western European.

(FYI, this shows up this way on Windows systems too. It's not MacRoman or Windows-1252 to blame this time.)

EDIT: Dang, took too long to write this message to be technically accurate; several others posted before I did. People who complain about spelling errors and minor inaccuracies in web posts should be shot...

Aha - that fixed it! Thanks, all of you.

The webserver says the page is encoded using utf-8, but it is encoded using iso-8859-1.

In Firefox, select View -> Character Encoding -> Western (ISO-8859-1). There is a similar option in Safari.

Joel Spolsky wrote a post called "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" and the title is true. Especially for those of us who are unfortunate enough to live in countries with strange squigglies that the Internet doesn't quite think belong there.


Character-encoding problem. The HTTP header has:

  Content-Type: text/html; charset=UTF-8
but presumably the HTML page isn't actually in UTF-8, and is instead in some kind of extended ASCII?
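One rough way to check that suspicion is to compare the charset the header declares with what the body bytes will actually decode as. The sketch below uses only the Python standard library; the helper names `declared_charset` and `decodes_as` are hypothetical, and the sample body is invented rather than fetched from the real page.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decodes_as(body: bytes, charset: str) -> bool:
    """Return True if body is valid in the given charset (strict decoding)."""
    try:
        body.decode(charset)
        return True
    except (UnicodeDecodeError, LookupError):
        return False


header = "text/html; charset=UTF-8"
body = b"<p>caf\xe9</p>"  # invented body containing one ISO-8859-1 byte

charset = declared_charset(header)
print(charset, decodes_as(body, charset))
print("iso-8859-1", decodes_as(body, "iso-8859-1"))
```

A failed strict UTF-8 decode is good evidence that the header is wrong. The reverse is weaker: every byte sequence decodes as ISO-8859-1, so success there only shows the guess is possible, not that it is correct.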

Yeah, if you download the page and serve it locally as text/html it renders OK. It's probably ISO-8859-1 or similar.

For people unable or unwilling to muck about fixing the encoding of the story, I've mirrored the content on my website:


