English Syntax Highlighting (evanhahn.github.io)
248 points by azdle on March 16, 2016 | 105 comments



It would be interesting to see the major parts of speech (nouns, verbs, adjectives) colored. Instead this is a coloring of fairly random words. A bunch of short words are grey, but they don't belong to any particular part of speech. They include some articles, prepositions, conjunctions and a few verbs...


Those are stop words. You usually remove them from a text before analysing word frequencies because they are so common to all English text that they don't tell you anything specific about the current text.
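
For example, a minimal sketch of that preprocessing step in Python with NLTK (assuming nltk plus its stopwords and punkt data are installed):

    from collections import Counter
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    STOP = set(stopwords.words("english"))

    def content_word_frequencies(text):
        # Drop stop words before counting, so "the"/"of"/"and"
        # don't dominate the frequency table.
        tokens = [t.lower() for t in word_tokenize(text) if t.isalpha()]
        return Counter(t for t in tokens if t not in STOP)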

I don't know what the rationale for greying them out is, but that's the category.


Sure, but "stop words" is a computing concept, as opposed to "parts of speech", which is a linguistic one. I think that's related to his point.

https://en.wikipedia.org/wiki/Stop_words

https://en.wikipedia.org/wiki/Part_of_speech


I think we'd need to solve computational linguistics before a completely accurate parser could tag the words properly. That said, the current state of the science works for something like 80% of cases (the "easy" ones).


Sorry, but POS tagging is pretty much solved. It's already at 97+% [1]. Current papers are now mostly improving it by less than one percent.

1. http://nlp.stanford.edu/pubs/CICLing2011-manning-tagging.pdf


It's true that POS tagging works fairly well. But consider that a sentence involves more than one word. Even at 97% accuracy per word, the probability of correctly tagging every word in a short sentence of only ten words is just 0.97^10 ≈ 0.74. And sentences are generally longer than ten words.

And as POS tagging is usually only done as preprocessing for some other task, like syntactically parsing a text (which itself is usually preprocessing for yet another task), 97% accuracy per word is not as good as it sounds. Parsers have to work with wrong data for every second or third sentence.
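
A quick back-of-the-envelope check of that compounding effect (a sketch, assuming errors are independent across tokens):

    # Probability that an entire sentence is tagged correctly,
    # given a fixed per-token accuracy and independent errors.
    per_token = 0.97
    for length in (10, 20, 30):
        print(length, round(per_token ** length, 2))
    # 10 -> 0.74, 20 -> 0.54, 30 -> 0.4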


Indeed: the first paragraph of the linked paper says "Current good taggers have sentence accuracies around 55–57%".

(This surprises me. I would expect accuracy for different words in a sentence to be correlated: you either make no errors or several.)


Whoops, I didn't even look into the paper. Kind of makes my comment superfluous …


For the record, 97% on canonical test datasets with little recent progress doesn't mean that it's a solved problem. Admittedly, part of the problem is that elementary school POS categories aren't a great model of natural language.

More generally, for true syntax highlighting I think you do need the parse tree, and parsers (as opposed to taggers) definitely aren't at 97%.


They definitely aren't at 97%, but they aren't that bad either. For English they are at around 92% (see for example http://arxiv.org/pdf/1603.04351.pdf) in labelled attachment score (right head + right label).

If you are interested only in the syntactic tag, not in the structure of the tree, the number is somewhat higher.
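
For a feel of what those parsers produce, a minimal sketch with spaCy (assumed installed along with an English model; this is not the parser from the linked paper):

    import spacy

    nlp = spacy.load("en_core_web_sm")  # any installed English model works here
    doc = nlp("Rome destroyed Carthage.")
    for token in doc:
        # Labelled attachment = right head *and* right dependency label.
        print(token.text, token.dep_, token.head.text)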


Yes, you can get 97-98%, but only when evaluating on data from the same corpus as you trained on. If you evaluate on data from a different corpus, you immediately get a pretty big drop in performance. Thus one person in the field I've talked to even went so far as to say that competing in this part of the field (state-of-the-art performance, basically) is fundamentally a question of "who is the best at overfitting".

There's basically no part of NLP that's a solved problem. Even something as superficially simple as segmenting running text into sentences and tokens is decidedly non-trivial.
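
A classic illustration of the tokenization point: naive period-splitting trips over abbreviations, which is why sentence segmenters are themselves trained models. A sketch (assuming NLTK's punkt data is downloaded):

    import re
    import nltk

    text = "Dr. Smith arrived at 5 p.m. on Monday. He left on Tuesday."
    # Naive splitting mangles the abbreviations:
    print(re.split(r"(?<=\.)\s+", text))
    # A trained sentence tokenizer usually gets this right:
    print(nltk.sent_tokenize(text))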


To be fair, I'm not a computational linguist (some of my friends did their PhDs in the field though). From what I remember, one of the most glaring issues in the field is that the most used corpus is a bunch of issues of the Wall Street Journal (which is a very specific data set).

The 80% figure was quoted from some talk I heard a few years ago, so I concede it's almost certainly improved since then.


On the other hand, 97% sounds impressive, but it would also mean, on average, slightly less than one error in your post (and more than one in this one).

Graded as a school exercise, I think 97% wouldn't be that good.


I think a naive approach would do quite well. We're not looking to comprehend the sentence entirely, just to reach fairly good accuracy, so that the reader usually only needs to focus on the highlighted words. For example, serifed, bolded verbs and nouns would retain most of the information, while filler words like "the" and "and" might not be necessary.

The resulting highlighted parts remind me a lot of Chinese, where essentially every word is "important" and there are few filler words, and hence there is often a lot of contextual information.


Looks like http://parts-of-speech.info/ does a pretty good job at detection; too bad the styling is so garish.


One site I use to improve my prose is

http://hemingwayapp.com/

not quite what you're proposing, but useful nonetheless



I thought of that article too, and while this highlighter isn't quite as colourful as that example, it still felt more distracting to read than monochrome text.


You can stop spamming that blog now. The author has a wrong understanding of what syntax highlighting tries to achieve, and because of that he comes to questionable conclusions.

It's really just a way to get people to visit his blog by making a bold statement.


Why is that spamming? Just because you don't share the writer's opinion doesn't mean it doesn't fit here, right?


There's a cool app on iOS that does this: iA Writer. I would love to know how they go about doing it.


In German (and other languages) you capitalize every noun. When I was younger I found it confusing that English didn't do that. It seems that this kind of syntax highlighting allows your brain to read text a little bit faster, since it instantly knows that the word will be a noun. It's also a little annoying when people write German and don't properly capitalize: your brain just doesn't expect the word to be a noun if it's not capitalized.

* http://www.ruediger-weingarten.de/Texte/Capitalization.pdf (pg 4ff, last paragraph)

* https://mindmodeling.org/cogsci2013/papers/0462/paper0462.pd...

* http://linguistics.stackexchange.com/questions/699/does-capi...


English used to do this. You can still see evidence of it in things like the US Constitution, where the capitalization seems random until you remember English is a Germanic-family language and did this as recently as a couple hundred years ago:

We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America

I.

All legislative Powers herein granted shall be vested in a Congress of the United States, which shall consist of a Senate and House of Representatives.

The House of Representatives shall be composed of Members chosen every second Year by the People of the several States, and the Electors in each State shall have the Qualifications requisite for Electors of the most numerous Branch of the State Legislature.

etc.


Why is "defence" not capitalized then? A typo?


"common defense" is being used as an elision of "common defense of the People". The entire phrase is a noun phrase. But " defense" is a verb within the noun phrase.


That was bothering me too.

At least it's not a typo by ubernostrum; it's spelled "defence" in the original [1].

Moreover, that's the British spelling of "defense" (is there a link with the lack of a capital?).

Also note that, instead of "Blessings", it is "Bleſsings" in the original. But they are roughly equivalent if you apply a compatibility decomposition in your Unicode normalization (NFKD/NFKC).

So where is the edit button for that constitution? Or do they only accept pull requests?

[1] http://www.archives.gov/exhibits/charters/charters_downloads...


Someone asked the same question here

https://www.quora.com/Why-is-the-word-defence-the-only-uncap...

but at least in my browser, I can't see the actual answer (it says "2 Answers" but I can only see one, which just confirms that the word is not capitalized).


Maybe defense is considered active in a way that Welfare and Tranquility are not?


In primary school, I was taught to capitalise nouns in titles.

So, an essay title might be "The Man and his Dog" rather than "The man and his dog".


But that rule doesn't apply only to nouns: it applies to all words that aren't pronouns, conjunctions and articles. Hence, "To Kill a Mockingbird" ('kill' is a verb), "Malone Dies," etc.


There isn't one uniform rule for headings: https://en.wikipedia.org/wiki/Letter_case#Headings_and_publi...


That rule (the one I learned in primary school) does apply only to nouns. I'm not suggesting my primary school teacher's rule is the correct, most popular or best rule :)

Personally, though, I find many US newspapers' headlines jarring due to excessive capitalisation.


Other languages also have more complex conjugation and declension rules that add additional structure. (And for writing systems, cf. the use of hiragana in Japanese for things like particles and verb endings.)


Syntax highlighting traditionally chooses a different color for each token in this Java statement:

    final String id = leader(NAMES_AND_SCORES);
If I try to translate this statement into English:

    Given a global list of names and scores, determine the leader's 
    id. (Ensure that id is a string of characters.) I'll use "id" to
    refer to that leader throughout this paragraph.
If our traditional highlighting approach is generally correct for the code, shouldn't I be highlighting each sentence and/or phrase wholly with one color, rather than highlighting per part of speech?

Or in other words, does the analogy being proposed really hold?

Or another take: speed readers take in whole sentences at a time. Colorizing parts of speech this way would only seem to slow them down, whereas syntax highlighting code speeds my reading. I'm sure there's an analysis here; final, String, leader() are not parts of speech; each is a separate semantic statement.


We don't highlight each line of code separately, which would be the analogue to highlighting natural language at the sentence level. We highlight tokens based on their syntactic type. Strings are all colored the same. Operators are colored the same. That's pretty much the same idea as coloring all common parts of speech the same.

Parsing is the act of taking a linear string of tokens and building a tree out of them. That means reading in a string of tokens and applying the parsing rules (which may be encoded as a set of fuzzy correlations when humans learn those rules). When the rules are not solidly codified or slow to apply due to unfamiliarity, it helps to have hints to orient/validate yourself.

You do this parsing routine with your own natural language, too. You're just much more comfortable doing so and do not need hinting on what each word's role is. Just like a lot of old-school unix guys of lore are more comfortable reading code without the spectra of colors we commonly apply today. I could see natural language syntax highlighting being very useful for language learners, though. Color is used in Chinese language learning to indicate tonalities for learners, since most have no native/intuitive way to transcribe the pitch contours. I'm not convinced that the syntax highlighting presented in the article is really what you'd want, but I'm interested in the direction it's headed.

As an aside, speed readers don't take in a whole sentence at a time. An entire sentence simply doesn't fit within your fovea, but they have optimized their eye tracking to boost their speed. I do imagine that having lots of colors would disrupt and distract from the text and harm their speed / comprehension, but it may be possible that a different highlighting scheme could work for them.


I think you make a wonderful comment here, but your analogy doesn't hold, I'd say. While I applaud the effort going into this, I don't think this works for me. I read "code" for humans completely differently than I do code meant to describe computation (NLS text versus programs, respectively). IOW, there can be no analogy for me.

I think it stems from the way we read Sherlock Holmes as an experience whereas we read a program as an explanation. You cannot substitute an explanation for an experience.


> does the analogy being proposed really hold?

It's not an analogy, it IS syntax highlighting. You just encode a lot of semantics into syntax in Java, so syntax highlighting is more useful for determining the semantics.

I don't think the OP was claiming this is useful for English in the way syntax highlighting is useful for Java.


Syntax highlighters for natural language would help against garden path sentences:

For example:

    The old man the boat.
... is not ambiguous if written as:

    {SUBJECT}[The old] {VERB}man {OBJECT}[the boat]
I think the reason syntax highlighting is important in programming is that garden-path-style sentences are more common with the pedantically strict grammars that programming languages require.
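
Out of curiosity, you can check what an off-the-shelf tagger makes of that sentence (a sketch with NLTK, assuming its tagger models are installed; taggers frequently mis-tag this very example):

    import nltk

    tokens = nltk.word_tokenize("The old man the boat.")
    # The correct reading has "old" as a noun ("the old") and
    # "man" as a verb; many taggers produce adjective + noun instead.
    print(nltk.pos_tag(tokens))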


The equivalent of garden-path sentences in formal languages would be shift-reduce conflicts, and most pedantically strict grammars are specifically designed to avoid that kind of problem, so that they can be parsed efficiently.


Of course the compiler has no problem understanding them, but humans aren't that good at parsing.

The purpose of a syntax highlighter is so that the human knows that the computer agrees on what the sentence/code structure is.


So you're saying English would be better as a statically typed language /s


Let's have different colors for dialog spoken by different characters, such as Holmes and Watson. That way I won't have to reread the text to figure out who said what.


That would be nice. With long conversations in novels, it's often really hard to tell who is speaking. There's a tendency to leave the speaker implied rather than explicit in long conversations, and color would be a good solution. I wonder if this is ever used in cinema and/or theater.


It's not common, but some anime fansub groups make the subtitle text match the hair color of the character that is speaking.


This would be fantastic for reading to the kids. At the moment I have to scan ahead to work out which voice to put on.


"Even when reading a novel myself, it bothers me when I don't learn the speaker until after the I have read the quoted phrase, like in this comment" said WillAbides


This is actually one of the big issues with RSVP techniques like Spritz. When you can't see the paragraph structure, it can be very difficult to track who's talking.


I was taught that the protocol is for the speakers to alternate if no other information is given in the lines. In all other cases you need to restart the speech block with something like "Watson continued, "..."".


Sure, but sometimes I get a little lost - my eyes wandered away from the page, I got distracted and started thinking about something else, or there were too many back-and-forths for me to keep it straight.


We generally add periods and commas to emphasize parts of the text, to slow down, to stop the reading... Try reading the text out loud without the syntax highlighting, then with it. See? It doesn't work: you're going to emphasize and stop on highlighted words. What would have worked is to highlight everything and split the highlighting according to punctuation, not words.

As someone else pointed out, different colors for different protagonists talking would be neat as well. But there is not much you can do without deforming the original intent of the writer.

Another thing that could be interesting would be to give these tools to the writer instead of highlighting the text automatically. But then, as someone else pointed out, these tools already exist and are rarely used because of the noise they add (bold, italic, underline). There are also quotes, quads, uppercase... There are many ways to help the reader follow the text.


Sorry, but when we speak of "syntax highlighting" for natural language, what is the actual goal?

Are we trying to color verbs, nouns, adverbs and prepositions differently, so that the "goal" is to properly decide whether "lie" is a verb or a noun?

Or are we trying to colour the subject, verb, object and other elements of the sentence, so that "Rome destroyed Carthago" will have a different color for Rome than the sentence "Hannibal tried to destroy Rome"?

In general, code has "reserved words" and the rest is either a "name" (variables, constants, literals) or a ... Well... "Verb". Like functions, procedures, methods.

In some rare cases (function pointers, closures) you have "verbs" that can be used as "nouns", but you completely lack concepts like dative, accusative and so on.

I think that this really breaks down as an analogy when you try to adapt syntax parsing to natural language.


I can tell you right away that it would be of huge help when reading a new language, especially in a new script. And who is the biggest set of users that might benefit from this? (Clue: they haven't said their first words yet.)


Which one? Sorting out nouns, verbs, adjectives etc., or finding out what "role" each word is playing? (Sorry, English is not my language, so I do not know the proper technical terms for this... In my culture the first is called "grammatical analysis" while the latter is called "logical analysis".)


Syntax highlighting works great for code, because code itself is highly structured, mostly normalized data (hence BNF being a thing). Even though we call them "languages", they're much closer to spreadsheets than natural languages. That's why it's so useful. Code isn't meant to be read from left to right, top to bottom. There's a lot of back-and-forth skimming to understand what the code is doing. Natural languages are meant to be read from beginning to end. We do skim, but it's almost entirely contextual, and only slightly syntactic. They're just not alike enough for things like this to make sense.


As an armchair linguist, I get frustrated by how computer nerd types keep comparing programming languages with natural languages. The two have very little in common other than some superficial similarities. The way each is acquired, used, and evolved is very different from the other. One of the worst examples of this bad analogy is sigils in Perl, which Larry insists are supposed to be good because they mirror plurals in English.


The analysis itself could be IMMENSELY useful in things like typesetting. There's a whole assortment of stylistic rules based on rather subtle things (e.g. the space between initials is usually slightly smaller) and they're rather hard to automate. If it were possible to parse and tag text like that, it may give way to advanced typesetting algorithms.

By the way, syntax highlighting works for code only as long as you see the same color scheme. If the scheme changes, the benefit is lost.


This is interesting. I wonder if any publishing <font color="blue">house</font> would print a book with this idea?


Putting “house” in blue reminds me of House of Leaves, a book that plays with formatting & typography in unusual ways…

very minor spoilers

…as it descends into madness (blank pages, words in spirals, backwards characters, single-character pages, overlaid paragraphs…). The very first unusual formatting in the book, and spit-take surprising to me as I wasn't expecting anything unusual at all, was simply printing the word “house” in blue. A fun read, thanks for reminding me of it!

https://en.wikipedia.org/wiki/House_of_Leaves#Colors


Yes, that was the intent.


My startup [1], which also uses color in order to increase text readability, is launching a pilot program with an on-demand textbook publisher. But your point is well-taken — it has been a long slog to find our first couple publishing partners, even with solid independent research showing benefits for students and other readers.

1: http://www.BeeLineReader.com


It's a cute idea, but it feels distracting. Maybe a nicer color scheme would work better.


To be fair, I've started to feel the same way about syntax highlighting in code. I've slowly moved to only highlighting two things: string literals and comments.


Nowadays I'm using a theme in Emacs where everything is black on white; keywords and comments are bold, whereas code is regular weight. I'm happier with this than with syntax colouring. I've also removed some colour from Org: headlines are bold and black. Again, I guess the fewer the colours the better here. I do use colours with parens: they're pale by default, then highlighted with highlight-parenthesis mode, denoting nesting via tones of red.

I'm even doing some HTML-CSS-JS-m4 work for a relative's business website nowadays, and I have not missed highlighting even with such a complex mess; instead, I'm happier.


I like to have language keywords accented too, since that helps me scan the shape of the code without having to read it closely, but yes - most of the benefit is just in knowing what context you should read the characters in.


Same here. I do not highlight anything though: just white text on black background.


I found it to be very distracting. The reading process stopped on every colour change - especially punctuation and conjunctions, etc.

Larger blocks of cited text might work.


You find that if you scroll down, but yes, darkening the interjections is not helpful.


I saw it. I could not really tell if it was good - as it was now with the dark text.

I can't help thinking that this was done by someone who does not read that much.

Younger people seem to prefer videos to READMEs, and I had not really understood why until I saw somewhere that speed-reading skills have apparently been falling drastically among young people. I mean, why watch a video for 30 minutes to see if a tool or framework is worth trying, when you can scan the equivalent text in 30 seconds?


The first book I read was "The Neverending Story", which used green for text that took place in the "real" world, and red for text that took place in the fantasy world in the book the protagonist was reading. That's not syntax highlighting, but structural story highlighting.

I would actually be interested to read a novel which did something similar: one color for narration (probably black, as it will be most common), and then a different color for each person speaking. That would be useful in a similar way that I find syntax highlighting useful: I could instantly look at text, and without even reading it, know who said it.

Narration text could also take on different colors, similar to in "The Neverending Story". How it is done, and how obvious its meaning, could even be a part of the art. That would be far more interesting to me than English syntax highlighting, to the point that if anyone knows of a book that does this, please tell me, because I would read it just to experience it. The point here is that in fiction, it's not the parts of speech that matter to readers, that's just a means to tell the story. What matters are the elements of the story, and communicating those elements visually could be interesting and useful.


Wow that was hard to read. We don't need to highlight prose. We don't read prose the same way we read code.


I found that difficult to read. Most editors go way over the top doing syntax highlighting and to me it makes the code harder to read.

I switched to a gray scale theme (emacs tao theme) two weeks ago and it is so much better.


Interesting, this follows the approach of sentence blocks, rather than syntax. I've written an Emacs plugin for lisp blocks (https://github.com/istib/rainbow-blocks) and one for English syntax (https://github.com/istib/wordsmith-mode) using NLP tools.


I'm getting this error when trying to install wordsmith-mode using package-install: `http://melpa.org/packages/wordsmith-mode-20140203.427.el: Not found`


I made a little thing for Emacs that does something similar, tagging based on the parsed parts of speech instead of just tokens. It requires you to select the text first instead of highlighting automatically, though. It uses CoreNLP, which was pretty lovely to work with.

https://github.com/cosmicexplorer/speech-tagger


Highlighting the quotes reminds me of a gripe I have with quoting styles in print: When a quote consists of two paragraphs, the first paragraph does not get an ending quote:

    He said, "The first sentence.

    "The second sentence," he continued. 
Somehow my mind gets triggered pretty intensely by these unbalanced quotes.

Does anyone have some background?


I believe the reasoning for this is that if you had two people talking, and there were end quotes after the first sentence, it would be parsed as the second sentence being said by the second person.


It might be. Most of the time, however, it is just a single person getting a longer quote.


This isn't actually mine, just something I stumbled on. Before I go and pull this apart to do it myself, does anyone know of an extension that will do this to arbitrary text in Firefox?

I found this because someone was using it as an argument against syntax highlighting for code, but I actually find that it lets me read significantly faster.


You may be interested in BeeLine Reader [1]. It puts a color gradient on alternating lines of text to hopefully let you read faster. I think it works a little bit. When I really want to read an entire article but don't want to invest too much time, especially if it's somewhat fluffy, I'll use Spritzlet set to 700 wpm [2].

1: http://www.beelinereader.com/

2: http://www.spritzlet.com/


Interestingly, when people see BeeLine Reader for the first time, many of them think (incorrectly) that it's a sentence-based algorithm instead of a line-based algorithm. Their belief often persists even after being told (by me, the creator) that it is in fact line-based. We've thought about doing something that's sentence-based, or syntactically or semantically aware, but as others have pointed out those tasks are much more complex.


I've experimented with this for a while, the goal being easier comprehension of new text. I found highlighting parts of speech and syntactic groups didn't work for me. The only thing that did work was highlighting keywords (that are specific to the text) and maybe named entities that refer to the same thing. Interestingly, there is some research indicating highlighting keywords may help people with dyslexia.

I've written a Chrome extension for highlighting keywords; too bad I don't currently have time to give it the love it deserves:

https://chrome.google.com/webstore/detail/highlit/cooahmcpma...


It'd be interesting to integrate http://www.beelinereader.com/ style gradients. I had a hard time with the quick change from gray to white.


I see a lot of negative feedback here, but I must say I tried reading it and it looks like a noticeable improvement. I can't put my finger on what exactly improved, but somehow I think it reads better than plain text.


For me the point would be to make reading the main point of the text faster and more precise. Why not render based on the meaning of the words? E.g. "love" and "sex" could be red, strong words in bold, weak ones in gray, "emotion" in a tone based on the particular emotion, etc. Grammatical terms are not even that interesting to me, and any syntactic sugar can be made softer.


If you want a Mac app that does this, iA Writer has a feature called Syntax Control: https://ia.net/writer/updates/ia-writer-3-1-comes-in-colors

I use it sometimes; it works pretty well but occasionally gets confused.


Interesting concept, although IMHO poorly executed: I found it harder to read than non-highlighted text, as critical ligature words (like "and") were faded. These should not be faded but emphasized (a bit like ampersands in programming languages).


On a somewhat related note, has anyone on here successfully mastered the art of speed reading? I'm a CS student at uni right now, and compared to non-engineering students, I've noticed I tend to read much slower. Any tips on upping my reading speed?


If you try to read all the words, just faster, you won't get much farther. The trick is to take a more active approach and optimize how you spend your time towards the goal of comprehension.

One thing that worked well for me was shifting from "reading" to "interrogating". Don't just try to read through a text, think carefully (and jot down) what questions you need to answer from the text, and jump around the text as necessary to answer the questions. If you don't have a sense of what questions to ask, do a quick skim of the text and any other relevant material to get the questions first, then dive in. Iterate and refine your questions and answers.


I'd focus on understanding rather than reading speed. If reading faster compromises your understanding, back off. One more thing: the best way to feel the need to read fast is to be overwhelmed by browser tabs. When you are, try to close each one after reading it. With that many tabs, you'll want to get through them quickly.


In the presented example, gray highlights within white phrases negatively affected my reading, but the white highlights within the yellow text helped, I guess. Nonetheless, congratulations on experimenting.


Language is already highlighted. It's called Bold, Italics, Underline.


The problem is, if you use italics/bold/underline remotely as often as necessary to convey intonation or emphasis, you read like you're a loon or crank. Bolding is the green ink of the Internet.


That's a good point. Also, syntax highlighting helps me navigate my code by sight, something I don't need to do when I'm reading.


This is a very interesting idea! How about syntax highlighting for proper nouns, using a preset cycling color table to give proper nouns consistent colors?
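
A minimal sketch of that idea (hypothetical palette; a stable hash keeps the colors consistent across runs, since Python's built-in hash() is salted per process):

    import zlib

    PALETTE = ["#e6194b", "#3cb44b", "#4363d8", "#f58231", "#911eb4"]

    def color_for(proper_noun):
        # Same name -> same palette slot, document-wide.
        return PALETTE[zlib.crc32(proper_noun.encode("utf-8")) % len(PALETTE)]

    for name in ["Holmes", "Watson", "Holmes"]:
        print(name, color_for(name))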


It's an interesting idea to highlight English. However, I believe it needs to be a little more intelligent than highlight.js.


In some early books, the open class words are capitalized. This style lives on in titles.


I would totally install a Sublime plugin or something like that for this.


Similar to Red Letter Bibles, I think this gets in the way of the prose.


This is amazing. Thanks so much, author.


Please make this a Chrome extension that wraps all p tags in my browser.

I'm very curious.


Red periods were a bad idea. Also ew.


I don't know. That was my first reaction too, but I started reading and after a couple of paragraphs I found them to be helpful. The red periods actually let me read faster because they were easier to spot.


In British English, periods are always red. Full stops are usually black when printed.


I'm wondering if the visual pun was deliberate.


This isn't particularly interesting as art, and it's beyond incorrect as far as any formal theory of natural language goes, but you do you, playa



