Oh god please no.

jQueryIsAwesome · on March 12, 2013

explain.

sukuriant · on March 12, 2013

It's a huge hammer and a considerably larger webpage footprint just so that a type of character isn't allowed to run amok, something that's not going to happen anyway in the vast majority of cases.

jQueryIsAwesome · on March 12, 2013

We have very different definitions of "huge hammer". In any decent modern browser (Chrome, Firefox, IE9), it takes less than 10ms to apply the mentioned code to 20 texts (using the 10 linked Google results as an example).

sukuriant · on March 12, 2013

It also puts every single special character into its own span. That may negatively affect programs that trust HTML for copy-paste, for example OpenOffice, Microsoft Word (I think?), some IM clients, some text editors, etc. It doesn't seem worth it in the general case.

That said, if those sorts of things bother an individual, they could run that on the page themselves, I suppose, so it's good for that :)

jQueryIsAwesome · on March 12, 2013

I don't know about OpenOffice, but in most programs, putting spans around a letter or word does not affect pasting.

jamesaguilar · on March 13, 2013

10ms is a huge amount to spend for such an obscure fix. How many obscure corner cases do you think Google webpages have to account for? A lot more than 100, I would bet. So rough methods like these would be horrible for performance.

jQueryIsAwesome · on March 13, 2013

I invite you to look at the ridiculous number of lines of JS Twitter uses in its interface.

jamesaguilar · on March 14, 2013

You're the one who made the 10ms claim, not me. If indeed it is faster than you claimed, then maybe it would be ok. Still probably not worth engineering time for a problem of this magnitude.

sukuriant · on March 13, 2013

And Twitter's website is horridly slow. That's what we're trying to avoid.

windsurfer · on March 12, 2013

It wastes a huge amount of CPU and memory to correct a problem only some of the time, and only for some edge cases.

nickpatch · on March 13, 2013

It corrupts Unicode data because it splits on code points instead of grapheme clusters. When performed on 'Spin̈al Tap', it splits the base character U+006E (LATIN SMALL LETTER N) from the combining character U+0308 (COMBINING DIAERESIS) and results in the string 'Spin<span class="s_char">̈</span>al Tap', which contains the valid Unicode grapheme cluster '>̈'! If you were to split on grapheme clusters instead, the result would be 'Spi<span class="s_char">n̈</span>al Tap'. However, I still wouldn't support that solution because it could negatively affect text segmentation used by search engine indexing and natural language processing tools.
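
To make the difference concrete, here is a rough Python sketch of the two splitting strategies. It is not the code under discussion: the function names are made up for illustration, "special" is assumed to mean any non-ASCII code point, and the cluster grouping is only an approximation (a base character plus any following combining marks), not the full Unicode text segmentation rules.

```python
import unicodedata
from html import escape

SPAN = '<span class="s_char">{}</span>'


def is_special(unit):
    # Assumption for this sketch: anything containing a non-ASCII code point.
    return any(ord(ch) > 127 for ch in unit)


def approx_clusters(text):
    """Group each base character with the combining marks that follow it.

    Only an approximation of grapheme clusters: it ignores Hangul jamo,
    emoji ZWJ sequences, and other cases a real segmenter handles.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch  # attach the mark to its base character
        else:
            clusters.append(ch)
    return clusters


def wrap(units):
    # Wrap each special unit in a span and HTML-escape everything.
    return "".join(
        SPAN.format(escape(u)) if is_special(u) else escape(u) for u in units
    )


def wrap_by_code_point(text):
    # Iterating a str yields code points, so marks get cut off their base.
    return wrap(list(text))


def wrap_by_cluster(text):
    return wrap(approx_clusters(text))


if __name__ == "__main__":
    sample = "Spin\u0308al Tap"  # n followed by U+0308 COMBINING DIAERESIS
    print(wrap_by_code_point(sample))
    print(wrap_by_cluster(sample))
```

With the sample string, the code point version puts only U+0308 inside the span, so the mark ends up attached to the `>` of the opening tag, while the cluster version keeps the n and its diaeresis together. Neither version addresses the text segmentation concern raised above.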