Thanks, HN: Here are the vocab survey results from all your participation (testyourvocab.com)
118 points by crazygringo on July 25, 2011 | 32 comments



In the 1990s the scoring system changed for the SAT [1]. For example, a 700 verbal for someone in their 40s would be roughly equivalent to a 760 score for someone in their 20s. Did the data used for the graph correct for this? Otherwise, I would have expected to see an uptick in the vocabulary size around the early-30s age range.

[1] http://www.greenes.com/html/convert.htm


Quite true, and we have not corrected for that yet -- the graph is based on raw reported scores, regardless of age. So it would be expected that real vocabulary scores should be slightly lower for ages 20 and 25 than they appear. (Or slightly higher for ages 30+, depending on how you want to look at it.)


In January 2005, the SAT tests also got rid of Analogies. On top of that, the SAT changes every year to a small degree. You may be right, but I'm not sure if it's worth accounting for changes like this. Even accounting for these changes, I don't think the presented trend will change significantly.


Really? That's too bad. The analogies used to be a significant factor in the tests, and I thought they were an important measure of a test taker's ability to reason. A few years ago I read that most U.S. high schools don't teach proofs any more in math. It seems like we're dumbing down the critical thinking requirements for "kids these days".


The problem with analogies is that in practice, what they tested was 10% reasoning, and 90% "has this student seen SAT-style analogy questions before." A student from a mediocre school would be at a severe disadvantage because they would have to spend a non-trivial amount of their limited time figuring out what the question was even asking and getting used to the notation, while someone who went to a good school could skip straight to the reasoning. SAT prep classes also focused heavily on analogy questions, which created the perception (dunno how accurate) that analogies in particular were a better measure of your parents' wallet than your own brain. That the analogy questions were poorly explained on the test is probably a solvable problem (ditch the : :: : notation!), but they still require non-trivial explanation, which will inevitably give an advantage to students familiar with the format.


I think analogies are important too. They require students to consider meanings on multiple levels and establish common connections. Beyond that, they require students to build up a semi-decent vocabulary.

As for the math, that may be a product of the No Child Left Behind policy. "If we can't get kids to pass, we'll just make the work easier!" The natural way towards advancement would be to make education increasingly tougher, to demand higher standards for youth in this ever-advancing world. How are we getting anywhere by moving backwards?


lmkg is correct. The SAT is not a test that tests your knowledge; it's a test that tests how many test prep classes you've taken. It's shaped by money and politics, and not by how to better assess students' knowledge.


I offer my sole data point with no value-laden broad-application. I never took a single prep-class, nor read a prep-book, and went to typical public schools for my education. I scored a 730 on the SAT reading section.


I would be interested to know if the flat spot between 12 and 15 persists with larger sample sizes. That seems the most surprising thing here.


Based on past conversations with my mother (teacher) and a friend (linguistic researcher, can't for the life of me remember what it's actually called) I would actually expect that to be the slowest period of learning in this area - slowest, but not that flat. So my suspicion is that it's 80% down to lack of data.

(But equally possible that I'm proven wrong, as it certainly isn't my area of expertise and this knowledge comes from purely random conversations in the past.)


I'd like to see a per-gender breakdown. It would be interesting to check past research [1] against new data.

[1] http://www.scientificamerican.com/article.cfm?id=are-women-r... [Girl Talk: Are Women Really Better at Language?]


Yes, the data before 15 is still pretty spotty, not too many preteens participating -- I would be very surprised if it persisted.


You could always make another graph that includes confidence intervals! (Please)


I also would love to see some confidence intervals and p values, just out of personal interest.

Either way, really glad you did something with the data beyond collecting it!


We'd definitely like to, and hope to in the future.


Thinking back to that age, school and learning were not really much of a focus for me or my peers; we were more interested in what to do with the hormones flowing through our systems.


For me at that age, the vocabulary drills present in the earlier stage of schooling were dropped on moving to the next level (primary school to secondary school). I presume that junior high and middle school in other systems occupy that space and might be breaking with the teaching patterns, and teacher training, of the elementary school level. But, as always, more data would be better than speculation.


I took the test, and while it's a cool idea, I still have a problem with the fact that it doesn't actually test your functional vocabulary, but rather tests your perception of your own vocabulary.

I do appreciate that doing something more sophisticated would be considerably harder.


Agreed. I'm a native English speaker, read constantly, and get the dictionary.com word of the day via their iPhone app. I also scored below average on this test... so I'm calling "bs" on the data.

I think it would be cool to try out something like the game Balderdash. Have the participants select the correct use of the word by selecting the appropriate usage within a sentence?

Also, who actually remembers what they scored on the verbal part of the SAT??


When I took this test I scored slightly under the median for my age group. Determined to improve, I looked around for a vocabulary-words app for Android. I didn't see anything quite as simple as I was looking for, so I threw this together:

https://market.android.com/details?id=com.millertinkerhess.a...


Oh wow! I didn't realize you were listening here - how did you come up with the word groups? Are there statistically related "levels" of words (e.g. "if you know this word you're 92% likely to know this one, too?") In general, what kind of structure does vocabulary have?

(I immediately had my kids take the test when I discovered it last week - all in the interest of Science!!)


That would be a very interesting question. But without testing people's entire vocabularies (as opposed to our sampling technique), I think it would be difficult to detect.


But how did you come up with the sampling words?


I would imagine that, of all the "outside the class" activities that correlate closely with high working vocabularies, reading is a major one. Those who read for pleasure tend to read more (no surprise), but also more broadly -- thus covering a wider variety of subjects, and thus encountering many more words along the way.


I live in Brazil and I'm not sure we have the homogeneous English teaching system you think we do. We have several hundred schools with several teaching systems, and every school chooses its own book to teach from, and so on.

But it would be nice to know how I stand against other Brazilians.

It would also be nice to get a comparison with native speakers of other languages: how do I compare to other Portuguese-speaking people (Portugal)? How do I compare to Germans (they usually speak very good English)? And so on...

Btw, great project... thanks for sharing.

Took your survey 3 times and got pretty close scores (2,000 words difference between them) so I think it's pretty good.

If you need any help from a native Portuguese speaker, drop me an e-mail or message...


Falaí mermão! I taught English in Brazil for a few years. What I'm talking about is the fact that, for students in outside courses (like Ibeu, Cultura Inglesa, Fisk, Brasas, Wizard, etc.), course levels are defined in a generally consistent manner (basic, intermediate, advanced) -- and many of the courses are nationwide franchises.

And I'm specifically not publishing country-specific data because the per-country numbers are not that big yet, and it says much more about who happened to stumble upon this site, as opposed to actual national differences.

Abraços!


By computing “learning rate” from the distribution of “vocabulary as a function of age” you are implicitly making a number of assumptions: the world is stationary, and your sampling is homogeneous across age. A word of caution could be useful before your conclusions: by looking at the graph one could also come to the conclusion that the quality of education is decreasing.

Nonetheless, very interesting data. Thanks.
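
To make the caveat concrete, here is a minimal sketch of how a "learning rate" might be read off a cross-sectional graph. The function name and the numbers are hypothetical (not the survey's data or method). The slope between two age groups only equals words learned per year if today's 20-year-olds will look like today's 30-year-olds in ten years, and if both groups were sampled the same way.

```python
def learning_rates(avg_vocab_by_age):
    """Words gained per year between adjacent age groups.

    avg_vocab_by_age maps age -> average vocabulary size for people who
    are that age *today* (a cross-section, not the same people over time).
    """
    ages = sorted(avg_vocab_by_age)
    rates = {}
    for younger, older in zip(ages, ages[1:]):
        delta_words = avg_vocab_by_age[older] - avg_vocab_by_age[younger]
        rates[(younger, older)] = delta_words / (older - younger)
    return rates


# Hypothetical averages, for illustration only.
cross_section = {20: 20000, 30: 24000, 40: 26000}
print(learning_rates(cross_section))
```

The same slopes would show up if nobody learned a word after 20 and each generation simply left school with a smaller vocabulary than the one before, which is why the assumptions matter.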


Note that the data is still a big jagged

s/big/bit/

I'd be curious to see confidence intervals and p values (or, heck, even just the number of samples that caused the given data point) for these graphs. "a bit jagged" and "pretty spotty" aren't very informative.
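
For what it's worth, even something as simple as the following would do. It's a minimal sketch assuming the raw per-person scores are available grouped by age; the function name and the sample numbers are made up, and it uses a normal-approximation 95% interval, which is rough for very small groups.

```python
import math
import statistics


def mean_ci(samples, z=1.96):
    """Return (mean, low, high, n) with a normal-approximation interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples for an interval")
    mean = statistics.fmean(samples)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, mean - z * sem, mean + z * sem, n


# Hypothetical vocabulary sizes for two age groups, for illustration only.
scores_by_age = {
    13: [9000, 11000, 10000, 12000],
    25: [21000, 24000, 19000, 26000, 22000, 23000],
}

for age, scores in sorted(scores_by_age.items()):
    mean, low, high, n = mean_ci(scores)
    print(f"age {age}: n={n}, mean={mean:.0f}, 95% CI=({low:.0f}, {high:.0f})")
```

Printing n next to each interval already answers the "how many samples" question for each data point.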


Thank you. This is one of the most exciting data sets I've seen for a long time.



Could you set up an RSS feed for your blog? I'm interested in further developments.


"Sure, I know what that word means!"



