Hacker News

What would that do?



A bunch of stuff, but the important thing here is that it clips things above and below each line, so things like this can't overlap other lines.


It doesn't actually cap the line. It caps the block. Sometimes the block can extend to multiple lines.

http://i.imgur.com/698dzMo.png

After trying the CSS suggested here on Google in Chrome, the problematic characters no longer extend too high the way they originally did, but they get clipped at the top of the first line of the paragraph instead of at the line the character is actually on.

That said, I'm not sure how to solve this.


Doh, you're right. For some reason I had it in my head that the block itself wrapped onto multiple lines but I must have been thinking of something else.

It seems like there isn't really any CSS-only solution for this short of wrapping every character in its own element, like jQueryIsAwesome suggested.


The point is to avoid overlapping with other elements; if they want to ruin their own description/comment, so be it.

But just for the technical challenge, you could do something like this in JavaScript to wrap every individual special character:

    // Wrap every character that is not an ASCII letter, digit, or whitespace in its own span.
    string.split("").map(function(a){
        return /[a-z0-9\s]/i.test(a) ? a : '<span class="s_char">' + a + '</span>';
    }).join("");
And apply the mentioned CSS to the "s_char" class.


Oh god please no.


explain.


It's a huge hammer and a considerably larger webpage footprint just so that a type of character isn't allowed to run amok, something that's not going to happen anyway in the vast majority of cases.


We have very different definitions of "huge hammer". In any decent modern browser (Chrome, Firefox, IE9) it takes less than 10ms to apply the mentioned code to 20 texts (using the linked 10 Google results as an example).
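For anyone who wants to check a figure like this on their own machine, here is a rough timing sketch. It wraps the snippet from above in a function (`wrapSpecialChars` and `timeWrap` are names made up for this example) and runs it over 20 invented sample strings, which are placeholders and not the actual Google results. It only measures the string work, not inserting the HTML into the page or the resulting reflow, so treat the number as a lower bound.

```javascript
// Same transformation as the snippet above, wrapped in a function for timing.
function wrapSpecialChars(string) {
  return string.split("").map(function (a) {
    return /[a-z0-9\s]/i.test(a) ? a : '<span class="s_char">' + a + "</span>";
  }).join("");
}

// Hypothetical sample data: 20 short texts standing in for result snippets.
const texts = Array.from({ length: 20 }, function (_, i) {
  return "Result " + i + ": some description text \u2192 with a few symbols!";
});

// Time the string transformation only (no DOM work is measured here).
function timeWrap(samples) {
  const start = performance.now();
  const wrapped = samples.map(wrapSpecialChars);
  const elapsedMs = performance.now() - start;
  return { wrapped: wrapped, elapsedMs: elapsedMs };
}

const result = timeWrap(texts);
console.log(
  "Wrapped " + result.wrapped.length + " texts in " +
  result.elapsedMs.toFixed(3) + " ms"
);
```

The result will vary by browser and hardware, and a fair comparison would also include the cost of writing the output back into the document.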


It also puts every single special character into its own span. That may negatively affect programs that trust HTML for copy-paste, for example OpenOffice, Microsoft Word (I think?), some IM clients, some text editors, etc. Doesn't seem worth it in the general case.

That said, if those sorts of things bother an individual, they could run that on the page themselves, I suppose, so it's good for that :)


I don't know about OpenOffice, but in most programs putting spans around a letter or word does not affect pasting.


10ms is a huge amount to spend for such an obscure fix. How many obscure corner cases do you think Google webpages have to account for? A lot more than 100, I would bet. So rough methods like these would be horrible for performance.


I invite you to look at the ridiculous number of lines of JS Twitter uses in its interface.


You're the one who made the 10ms claim, not me. If indeed it is faster than you claimed, then maybe it would be ok. Still probably not worth engineering time for a problem of this magnitude.


And Twitter is horridly slow on its website. That's what we're trying to avoid.


It is a huge amount of CPU, memory, and wasted resources to correct a problem that only shows up occasionally, in a few edge cases.


It corrupts Unicode data because it splits on UTF-16 code units (here the same as code points) instead of grapheme clusters. When performed on 'Spin̈al Tap', it splits the base character U+006E (LATIN SMALL LETTER N) from the combining character U+0308 (COMBINING DIAERESIS) and results in the string 'Spin<span class="s_char">̈</span>al Tap', which contains the valid Unicode grapheme cluster '>̈'! If you were to split on grapheme clusters instead, the result would be 'Spi<span class="s_char">n̈</span>al Tap'. However, I still wouldn't support that solution because it could negatively affect text segmentation used by search engine indexing and natural language processing tools.
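A minimal sketch of the grapheme-cluster variant, assuming an environment that provides `Intl.Segmenter` (recent browsers and Node.js releases do, the browsers named earlier in this thread do not). The function name `wrapSpecialGraphemes` is made up for this example, and the "plain character" test is carried over from the earlier snippet as an assumption about what should stay unwrapped.

```javascript
// Hypothetical helper: wrap each grapheme cluster that is not a plain
// ASCII letter, digit, or whitespace in its own span.
function wrapSpecialGraphemes(text) {
  // Segment by user-perceived characters instead of UTF-16 code units.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  let out = "";
  for (const { segment } of segmenter.segment(text)) {
    // The whole cluster must be plain; a base letter followed by a
    // combining mark fails this test and gets wrapped as one unit.
    out += /^[a-z0-9\s]+$/i.test(segment)
      ? segment
      : '<span class="s_char">' + segment + "</span>";
  }
  return out;
}

// "n" followed by U+0308 (COMBINING DIAERESIS), written with an escape
// so the combining mark survives copy-paste.
const wrappedTap = wrapSpecialGraphemes("Spin\u0308al Tap");
console.log(wrappedTap);
```

With this input the base letter and its combining mark stay together inside a single span, so the diaeresis never ends up attached to the '>' of the opening tag.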



