What would that do?

kbrackbill · on March 12, 2013

A bunch of stuff, but the important thing here is that it clips things above and below each line, so things like this can't overlap other lines.

lorettahe · on March 12, 2013

It doesn't actually cap the line. It caps the block. Sometimes the block can extend to multiple lines.

http://i.imgur.com/698dzMo.png

After trying the CSS suggested here on Google in Chrome, the problematic characters no longer extend too high as they did originally, but they get capped at the top of the first line of the paragraph instead of at the line the character is actually on.

That said, I'm not sure how to solve this.

kbrackbill · on March 12, 2013

Doh, you're right. For some reason I had it in my head that the block itself wrapped onto multiple lines but I must have been thinking of something else.

It seems like there isn't really any CSS-only solution for this without wrapping every character in its own element, like jQueryIsAwesome suggested.

jQueryIsAwesome · on March 12, 2013

The point is to avoid overlapping with other elements; if they want to ruin their own description/comment, so be it.

But just for the technical challenge, you could do something like this in JavaScript to chop every individual special character:

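    // `string` holds the text to process. Letters, digits and whitespace pass
    // through; every other character is wrapped in its own span.
    string.split("").map(function (a) {
        return /[a-z0-9\s]/i.test(a) ? a : '<span class="s_char">' + a + '</span>';
    }).join("");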

And apply the mentioned CSS to the "s_char" class.

windsurfer · on March 12, 2013

Oh god please no.

jQueryIsAwesome · on March 12, 2013

Explain.

sukuriant · on March 12, 2013

It's a huge hammer and a considerably larger webpage footprint just so that a type of character isn't allowed to run amok, something that's not going to happen anyway in the vast majority of cases.

jQueryIsAwesome · on March 12, 2013

We have very different definitions of "huge hammer". In any decent modern browser (Chrome, Firefox, IE9) it takes less than 10ms to apply the mentioned code to 20 texts (using the linked 10 Google results as an example).

sukuriant · on March 12, 2013

It also puts every single special character into its own span. That may negatively affect programs that trust HTML for copy-paste, for example OpenOffice, Microsoft Word (I think?), some IM clients, some text editors, etc. It doesn't seem worth it in the general case.

That said, if those sorts of things bother an individual, they could run that on the page themselves, I suppose, so it's good for that :)

jQueryIsAwesome · on March 12, 2013

I don't know about OpenOffice, but in most programs putting spans around a letter or word does not affect pasting.

jamesaguilar · on March 13, 2013

10ms is a huge amount to spend for such an obscure fix. How many obscure corner cases do you think Google webpages have to account for? A lot more than 100, I would bet. So rough methods like these would be horrible for performance.

jQueryIsAwesome · on March 13, 2013

I invite you to look at the ridiculous amount of JS that Twitter uses in its interface.

jamesaguilar · on March 14, 2013

You're the one who made the 10ms claim, not me. If indeed it is faster than you claimed, then maybe it would be ok. Still probably not worth engineering time for a problem of this magnitude.

sukuriant · on March 13, 2013

And Twitter's website is horridly slow. That's what we're trying to avoid.

windsurfer · on March 12, 2013

It is a huge amount of CPU and memory and wasted resources to correct a problem sometimes and only for some edge cases.

nickpatch · on March 13, 2013

It corrupts Unicode data because it splits on code points instead of grapheme clusters. When performed on 'Spin̈al Tap', it splits the base character U+006E (LATIN SMALL LETTER N) from the combining character U+0308 (COMBINING DIAERESIS) and results in the string 'Spin<span class="s_char">̈</span>al Tap', which contains the valid Unicode grapheme cluster '>̈'! If you were to split on grapheme clusters instead, the result would be 'Spi<span class="s_char">n̈</span>al Tap'. However, I still wouldn't support that solution because it could negatively affect text segmentation used by search engine indexing and natural language processing tools.
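
For illustration, here is a minimal sketch of the grapheme-cluster split described above. It assumes a runtime that provides `Intl.Segmenter` (recent browsers, Node.js 16+). The function name `wrapSpecialClusters` is hypothetical and not from the thread, and like the original snippet it does not escape HTML in the input.

```javascript
// Hypothetical helper (name is an assumption, not from the thread).
// Wraps each grapheme cluster that is not purely ASCII letters, digits or
// whitespace in a span, so a base character stays together with its
// combining marks.
function wrapSpecialClusters(text) {
  // granularity "grapheme" iterates over user-perceived characters
  // instead of UTF-16 code units.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  let out = "";
  for (const { segment } of segmenter.segment(text)) {
    // Same character test as the original snippet, applied to the whole
    // cluster. The `+` lets multi-unit whitespace such as "\r\n" pass through.
    out += /^[a-z0-9\s]+$/i.test(segment)
      ? segment
      : '<span class="s_char">' + segment + "</span>";
  }
  return out;
}

// "n" followed by U+0308 (COMBINING DIAERESIS) is kept as one cluster:
console.log(wrapSpecialClusters("Spin\u0308al Tap"));
// Spi<span class="s_char">n̈</span>al Tap
```

This only addresses the splitting problem; the concerns about page weight, copy-paste and text segmentation for indexing still apply.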