Evidence That Computer Science Grades Are Not Bimodal [pdf] (toronto.edu)
146 points by moyix on Sept 27, 2016 | 140 comments



Having worked with many of my fellow graduate students to try to teach CS concepts to incoming students without any CS background, my conclusion here is that we simply don't know how to teach computer science. My suspicion is that many people who learn CS usually do it in spite of attempts to teach them, not because of instruction. Think about how long it took humans to figure out how to teach reading to _everyone_. We had to break the old whole-language reading method and switch over to phonics, and that was decades after we had the hard scientific evidence that phonics was the missing perspective needed to get that last percentage of students to read. We are probably decades away from understanding how to teach CS to everyone, let alone getting those techniques implemented.


> Having worked with many of my fellow graduate students to try to teach CS concepts to incoming students without any CS background, my conclusion here is that we simply don't know how to teach computer science. My suspicion is that many people who learn CS usually do it in spite of attempts to teach them, not because of instruction.

I have this assumption about almost all organized learning at this time. I think that's also why things like IQ/g/etc. look so important - they make the learning "in spite of" more tractable, because you have enough of a buffer to deal with even really bad information at a very fast pace, and because high IQ students will make one's education methods look good even if they're terrible.


The ironic conclusion is that the most selective institutions could offer the worst environment and still boast about high levels of post-institutional success.


Having gone to a relatively selective institution, that was definitely my impression of the place...


The biggest problem with teaching reading is that not everyone is ready for, or interested in, learning how to read at the same age. Studying why we can't force a random group of 7-year-olds to learn how to read is missing the point. We homeschool. One of my daughters became a fluent reader very early; the other when she was a few years older. You can't force people, or children, to learn something they're not interested in or aren't ready (for various reasons) to learn, which unfortunately is basically what the educational system is built around. It's really less about methods, which make a marginal difference.

The problem with higher education is that we are still forcing people into those programs. If you want to make a good salary, you take CS. Whether you have any interest or passion in that area is not relevant. Your level of mathematical maturity is not relevant. You are of the right age and you want a successful career, so that's where you go. It's no wonder that teaching all those people CS is going to be difficult. It's still the same faulty approach to education: everything can be taught, everyone can be taught, and if it doesn't work we just need to find a better way of force-feeding them.

Compounding that even further is that teaching is difficult under the best of circumstances. In higher education it's often researchers (or students) who are asked to teach, and while they may be brilliant on an intellectual level, that doesn't necessarily make them good teachers.

EDIT: And a thought on the bimodality ... I would expect to be able to see this sort of modality at the output of a CS program, i.e. modality among students who graduated with the same grades. I would be really surprised if students graduating with the same grades performed similarly on software development tasks. It's not clear to me what a CS grade means or how it relates to any natural ability to e.g. write computer programs.


This analogy fails when you consider Chinese and similarly encoded written systems. Sure, there are methods employed that use radicals in order to provide phonetic "hints," but written Chinese is predominantly acquired through rote memorization. Many countries that use written Chinese today have comparable literacy rates to countries that use syllabic and phonetic systems. What have they done right that took phonics for us to do?


I think the difference here is that reading is considered to be so important that people don't just stop if a child doesn't learn it. The child will end up spending an enormous amount of time studying something when another technique might allow them to learn it more quickly.

There is no similar equivalent for learning how to program. No one is going to prevent a student from graduating college until they figure out how to read and write a computer program, and you sure as hell can't prevent them from going out to play.


That "old whole reading method" seems to be a 20th-C. thing: https://en.wikipedia.org/wiki/Reading_education_in_the_Unite... preceded by phonics-ish learning.


I have taught some, and the most successful technique I employed was much the same as what a compiler does: chunking packets of instructions for the assembler.

It's fairly logical, actually very logical, to think that the human mind can process things in some discrete amount, then move that data to the hindbrain and ready the visuo-spatial cortex for more.

Repetition is key for this to happen, yet we don't really reiterate a lot of the things we learn as CS students (well, in my humble experience), other than the bajillion attempts to code properly on our own. We do things maybe twice, three times, then we take some notes or move on to a different part of the subject. The brain needs a 'lot' of repetition to have something down the way we do with things that require muscle memory.

So, how to make something that would, on its face, appear tedious into a feature set?

Well, again, from my humble teaching experience, and to extend the muscle memory analogy: muscle confusion!

Mix up the media used to teach, and be creative about it. Don't just copy-paste a slideshow from a dataset that some UW prof has put together, add a video from a YouTube tut, and call that a multimedia presentation (no offense, I know teaching is fucking hard, I'm being ideal here). Instead, MAKE UP a cool little ditty about what constitutes a class declaration (punny, no?) and then couple that with a dance that requires the two dancers to be logical constructs like XOR and NOT and have a little memory-register hoedown... lol.

Ok, well, like I said, muscle confusion, but with the various lobes of the mind being entertained and energized, booooi (said in the voice of Flavor Flav).

If in doubt, simply go piece by piece, building larger chunks of data for the students to learn, reiterating over each new chunk and aggregating the last, until the entire set of concepts is learned well enough to be considered comprehensive.


>We had to break the old whole reading method and switch over to phonics, and that was decades after we had the hard scientific evidence that phonics was the missing perspective needed to get that last percentage of students to read.

https://en.wikipedia.org/wiki/Phoenician_alphabet#Spread_of_...


I'm not clear on what the link is supposed to be getting me. Phonetics and Phoenician are two different things. Right?


I agree that the post is severely underspecified. I think the point is the last paragraph of the linked section:

> Phoenician had long-term effects on the social structures of the civilizations which came in contact with it. As mentioned above, the script was the first widespread phonetic script. Its simplicity not only allowed it to be used in multiple languages, but it also allowed the common people to learn how to write. This upset the long-standing status of writing systems only being learned and employed by members of the royal and religious hierarchies of society, who used writing as an instrument of power to control access to information by the larger population.[11] The appearance of Phoenician disintegrated many of these class divisions, although many Middle Eastern kingdoms such as Assyria, Babylonia and Adiabene would continue to use cuneiform for legal and liturgical matters well into the Common Era.

That is, Phoenician was the first wide-spread phonetic script. That, in turn, had major effects on civilizations as it was simple enough for commoners to learn. That being the rather tenuous link back to the original idea of teaching reading based on phonetics.

Of course, there is a bit of a gap here, in that English phonetics are notoriously unstandardized across the lexicon.


> Phoenician was the first wide-spread phonetic script. That, in turn, had major effects on civilizations as it was simple enough for commoners to learn.

The cuneiform of the time was, at heart, a syllabic script; while a phonetic script is arguably simpler, it's not much simpler. The real problem with cuneiform is not that it was syllabic instead of alphabetic, it's that you had plenty of options other than syllabic representation, as well.

(And written Egyptian, the parent system of the Phoenician alphabet, was already genuinely phonetic! It didn't have vowels, but neither did the Phoenician script.)

All that said, that sort of shortcut-encrustation seems to develop naturally within all writing systems. If I send someone "have a good day" and get back "u 2", that's exactly the kind of shortcut that we blast ancient writing systems for using. But that shortcut, in modern English, actually represents a very high literacy rate, and use of the writing system by someone who can't be bothered to observe its formal conventions. Under low-literacy conditions, educated people tend to scrupulously observe whatever weird conventions their writing system might have. That's how you know they're educated.


I'm assuming by Egyptian you mean Hieroglyphics. (They also used Hieratic and later Demotic.) Hieroglyphics are a mixed phonetic-logographic system. Even on the phonetic side, there are many ways to spell the same set of consonants, thanks to the existence of digraphs and trigraphs and repetition. The vowels probably differed, but those aren't written. The spelling for a particular word was usually fixed, though.

Wikipedia has an excellent example using nfr in its determinatives section:

https://en.wikipedia.org/wiki/Egyptian_hieroglyphs#Determina...

So, Hieroglyphics had phonetic components, but it was far from being a phonetic abjad like Phoenician. (An abjad is an alphabet without vowel markers.)


Hieroglyphics are a mixed phonetic-logographic system in the same way that English is a mixed phonetic-logographic system, where the second person pronoun can be represented by "you" or by the logograph u. How much is that hurting our literacy rate?


"u" and "you" have exactly the same pronunciation /ju/. "2" and "two" and "to" and "too" all have exactly the same pronunciation /tu/. That is, phonetically there is no difference -- again, English has terribly inconsistent phonetics. These shortcuts arose due to texting applying negative pressure to the length of words, both in maximum message size and ease of input.

Compare this to Hieroglyphics, where the glyph representing a loaf of bread could mean either an actual loaf of bread or the abstract phoneme /t/. Specifically, a logogram represents an idea regardless of its pronunciation. This is not what we see in "text-speak", where words were replaced with shorter methods of achieving the same phonetics. Or, in other words, reading the text "u 2" gives /ju tu/, which is exactly the same as what you would read with "you too". The matched phonetics are an integral part of this replacement.


> Compare this to Hieroglyphics, where the glyph representing a loaf of bread could mean either an actual loaf of bread or the abstract phoneme /t/. Specifically, a logogram represents an idea regardless of its pronunciation. This is not what we see in "text-speak"

"u" does not have the pronunciation /ju/, it has the name /ju/.

And we specifically don't see that Egyptian glyphs "represent an idea regardless of its pronunciation". Look at the story of the decipherment:

> Champollion focussed on a cartouche containing just four hieroglyphs: the first two symbols were unknown, but the repeated pair at the end signified 's-s'. This meant that the cartouche represented ('?-?-s-s').

> Champollion wondered if the first hieroglyph in the cartouche, the disc, might represent the sun, and then he assumed its sound value to be that of the Coptic word for sun, 'ra'. This gave him the sequence ('ra-?-s-s'). Only one pharaonic name seemed to fit. Allowing for the omission of vowels and the unknown letter, surely this was Rameses.

( http://www.bbc.co.uk/history/ancient/egyptians/decipherment_... )

That's the glyph "sun" being used because of the phonetic value of the word "sun" coinciding with part of a name. The idea goes unused (at least, the glyph is not marked with the logograph mark). That's precisely what we see in text-speak.

(Side note: while the vowel of Coptic "sun" might be /a/, this is a rare case where the ancient Egyptian vowel is known, and it is /i/. Fortunately, the general omission of vowels makes this mistake irrelevant.)

> "2" and "two" and "to" and "too" all have exactly the same pronunciation /tu/

This is not correct. "to" is a clitic; its pronunciation differs from the others, which means in particular that "to" and "2" have different pronunciations. That doesn't stop "2" from substituting for "to" on occasion.


When you see an isolated letter "u" in English, you say /ju/. When you see the word "you", you also say /ju/. This fact is what allows "u" to be shorthand or abbreviation for "you". Perhaps for you there is no distinction between a shorthand and a logogram. But, in that case, I don't see how you also couldn't argue that "you" is a logogram for the abstract idea of the second person.

In reality, we distinguish logograms from alphabets by whether the symbols represent ideas or sounds. In the English example, "u" is being used as a non-traditional phonetic digram /ju/. Your example from the Rosetta Stone was a phonetic use of the symbol. There are also non-phonetic uses, where the symbol represents the sun, regardless of its phonetics. This is why it's a mixed system.

The Egyptians marked semivowels /j~i/ and /w~u/. Egyptologists use them as vowels, because they have to use something and that's as good a thing as any. We have no idea how any ancient Egyptian word was realized.

Everyone I know pronounces all the variations of "to" the same. Maybe there is a dialectal difference.

EDIT: I just had a thought. Let's use an English example where the shorthand is not isolated. Let's look at "r8" for "rate". This is obviously not a logographic use of "8", as the "r" is still necessary for the meaning. Instead, "8" is again being used as a non-traditional phonetic gram, the trigram /eɪt/. The use of "u" and "2" is exactly the same; their use is as phonetic grams and not logograms.


Sorry. I just found it amusing that here we are, using the English alphabet (a descendant of the original Phoenician alphabet), talking as if phonetics as a reading aid were a new phenomenon. But the whole reason the Phoenician alphabet became so popular 3,200 years ago was its use of symbols representing spoken sounds.


Maybe the point is that we deemphasized the phonics technique for 3,000 years and focused more on lessons like this:

https://books.google.com/books?id=E-diAAAAcAAJ&dq=reading&pg...


> phonics

Very interesting reading at https://en.wikipedia.org/wiki/Phonics#History_and_controvers... : phonics goes all the way back to the 1830s. I just remember the Hooked on Phonics commercials (if you're too young to remember those, see YouTube).


We don't know how to teach fuckall. We basically shove a bunch of crap and resources in students' faces and hope that they can manage to teach themselves anything of value. Once in a blue moon there's a teacher who truly knows how to get their students to learn well (I've had more than my fair share in my academic career, that's for sure, but even so they were a tiny minority); most "teaching" is basically just cargo-culting and going through the motions to "cover the material". So of course we see a pattern where folks who don't have as much support for learning outside of school tend to be at a disadvantage, except in rare cases. Most education is practically coincidental to the process, a matter of osmosis and expectations as much as anything else. And in truth, if you take a recently minted college graduate off the line and do some truly in-depth tests of the quality of their education, you'll find it sorely lacking in almost all cases, no matter their GPA, alma mater, or major.

Education is a bit like journalism. In journalism, whenever a story comes out on a subject you are intimately familiar with, it's riddled with errors, inaccuracies, half-truths, and sometimes worse. Which makes you wonder about the rest (it's the same). In education, once you acquire a sound and thorough understanding of a topic and look at the way it's taught, you see a lot of busy work, almost no concentration on the fundamentals, and a constant pattern of "keeping pace" even as most of the class gets left behind and typically only retains enough knowledge to pass the tests by sheer force of will (memorization, test-taking techniques, and trial and error).

For me the best STEM classes in college were always the ones where you had to apply your knowledge, but those classes were too rare and not as well attended as the mandatory classes.

Unfortunately, a lot of institutional education in STEM fields suffers from several fundamental difficulties. For one, there is the common perspective that "it's supposed to be hard", and so curricula are tuned to weed out those who can't tough it through. For another, there's a tremendous focus on rote memorization, busy work, and trivia versus practical, applied, core knowledge. Real life is not like an episode of Jeopardy: the vast majority of the time you are better off with a sound understanding of the fundamentals and the ability to look up or pick up any specifics or trivia as needed. This is true whether we're talking about CS, math, chemistry, physics, or what-have-you. Yet the system is consistently set up to reward the folks who are able and willing to suffer through most of that useless memorization and busywork, and to punish those who can't or won't.

This is why you see so many adults who routinely say "ugh, I haaaated math in school" or science, or english, or essay writing, or computer science, etc. Those subjects are vibrant, fascinating, exciting, rich, and interesting areas of study. Studying math and computer science can be playful, fun, and inventive. But when crushed under mountains of drudgery all of that is lost and the result is that people end up cut off from learning those things and they lose their enthusiasm for studying.


This is mostly unrelated to the main topic of the article, but this paragraph caught my eye:

> For example, Padavic et al. [20] found that the "work-family" narrative in business is an example of a social defense: people will say that women leave the workplace because of "family", despite the large amount of evidence that women leave their jobs because of inadequate pay or opportunities for advancement [20], particularly when they see male co-workers promoted ahead of them. The "work-family" narrative is a more palatable explanation rather than to confront sexual discrimination in the workplace, and so the narrative continues.

> [20] I. Padavic and R. J. Ely. The work-family narrative as a social defense, 2013.

I tried to track down the work by Padavic et al. that Patitsas et al. are citing, and found this:

http://www.hbs.edu/faculty/conferences/2013-w50-research-sym...

However, this article by Padavic et al. is actually saying something different from the way that Patitsas et al. summarized it. Padavic et al. appear to actually be arguing that the work-family narrative is a social defense against the uncomfortable truth of excessively long working hours for both genders at the corporation, not against any uncomfortable truth about sexual discrimination. Also, Padavic et al. don't seem to discuss sexual discrimination in wages or promotions at all.

Did Patitsas et al. make an error in citing their sources? Am I missing something?


Mis-citation occurs all the time. When I was a professional scientist, I could find the source for a claim I was interested in less than 25% of the time. I would find some claim, and about 50% of the time the reference cited another reference, and so on until the chain broke. The other 25% of the time, the cited reference said something completely different from what was being claimed. Very frustrating.


During my Ph.D. I tried tracking down the reasoning behind a couple of sets of assumptions made in non-uniform torsion of beams (e.g., twisting a beam with one end fixed). It turns out that they just kept the same assumptions from uniform torsion (just twisting the beam, with no ends fixed) without checking if they still held (they don't). All because nobody thought to check whether the assumptions might be in conflict. The uniform torsion models go back to the mid 1800s, while the non-uniform torsion models are from the early to mid 20th century. So, for over 60 years people have been working with the wrong assumptions.


Thanks for tracking this down, I thought that passage made little sense indeed.


>Did Patitsas et al. make an error in citing their sources?

An error if you are very generous. It appears to be deliberately lying about what the source says to match it to the pre-selected explanation they are pushing.


"Are CS grades bimodal, or unimodal? To test this, we ac- quired the final grades distributions for every undergraduate CS class at the University of British Columbia (UBC), from 1996 to 2013. This represents 778 different lecture sections, containing a total of 30,214 final grades (average class size: 75)."

My understanding of the bimodal situation, if it exists, is that it primarily applies to earlier classes---later classes only include those who did well, or at least passed, the previous classes.

Ah, yes...

"Of the 45 classes which were multimodal, 16 were 100- level classes (35%), 5 were 200-level (11%), 12 were 300-level (27%), and 12 were 400-level (27%). For comparison, in the full set of 778 classes, 171 were 100-level (22%), 165 were 200-level (21%), 243 were 300-level (31%), and 199 were 400-level (26%)."

How about we take a closer look at those 100 level classes, hmmm?


The earlier classes are exactly the place where things like prior experience and level of interest can possibly explain bimodality. In which case bimodal grades exist, but the conclusions you can draw from their existence are pretty mundane (e.g., turns out you do better if you had a programming class in high school... surprise!) and sufficiently varied that it'd be hard to come up with any compelling interpretation of the data.

There's a survivorship bias in later courses. The extent of that bias can only be determined by looking at attrition rates. But it's far easier to predict and measure the major threats to validity for data in later courses.

So, if I were interested in the question "Are some people just naturally better at CS than others", I would either focus on later courses and try to determine the extent of survivor bias, or else I would look holistically at the entire curriculum. But I definitely wouldn't give a lot of weight to 100-level courses.


But if bimodality is really strong in early-level classes for that reason, it sort of suggests that maybe you should split into a fast and a slow track, letting the experienced people do two semesters in one class. Then the students with no experience but raw talent don't just drop off because they're having trouble keeping up with peers who have been programming since they were 11.


That is a form of "mastery-based" learning. Basically, you group students according to ability and separately teach each group each topic to the point of mastery before moving on to the next topic.

The guy from Khan Academy talks about that (https://www.youtube.com/watch?v=LnVzug0mA2w), although he suggests going even further-- individualizing instruction for each student.

BTW, this is how the children of very wealthy parents often get into elite colleges in quite high numbers. The parents send them to private schools where student-teacher ratios can be as low as 5 (FIVE!). Under those circumstances, each student can be tracked very accurately according to their needs and aptitudes. It is a great way to learn, and it works, but it is also very expensive.


My understanding is that some schools are experimenting with this. CS is unusual (though not unique; music is an obvious other example) in that there can be such a wide disparity in experience among otherwise equally bright/talented entering students. By contrast, no one expects an incoming Chemical Engineering freshman to know much in the way of chemistry or engineering beyond their couple semesters in high school.

As someone with a fair bit of programming experience, but who doesn't really do it professionally, I can say that in one of the Intro to CS courses I took on edX, I would have been utterly and completely lost had I had little or no programming experience.


I hope what will eventually happen is that a semester of CS in high school becomes normalized. Then we can just dispense with pretenses and admit that CS 1 is most properly an introduction to Computer Science, not an introduction to computer programming; that some programming experience is required in order to learn the basic data structures and algorithms typically covered in a top department's CS 1 course.

Then there can be a pre-100-level course which assumes zero background/experience, like a CS equivalent of the pre-calculus "college algebra" courses offered by many American mathematics departments. The idea would be that most majors should've already learned basic programming skills in high school, but there's still an option for people who fell through the cracks for whatever reason.

But importantly, this issue is independent of the "geek gene" hypothesis the article is about.


I'd rather teach generalized sequential logic and reasoning in high school. Then the programming will naturally fall from that, if they decide to learn. However, if they don't decide to learn, then at least the class will still have some applicability to their life.

EDIT: Think of teaching at the level of what's in this Atwood post, and the related paper:

https://blog.codinghorror.com/separating-programming-sheep-f...


>I'd rather teach generalized sequential logic and reasoning in high school. //

Isn't teaching programming one way to do that?

Interesting codinghorror link. What struck me was the ineptitude of the question in their example assessment, "intended for students who have never looked at a line of code in their lives" ("int a= ..."). It's primarily testing syntax without having taught the syntax, so it seems really worthless - all you're really going to test for is who has actually seen some programming and knows that "=" is not equality but is being used as an assignment operator. Surely a proper test for ability in programming would be to write it out in pseudocode like:

    integer value a is set to 10
    integer value b is set to 20
    a is assigned the value of b

What are the values now of a and b?

Supremely facile.

Surely they're not really trying to test the ability to guess arbitrary syntax? If they were, then they'd be better off with

    nni a :-# 10
    nni b :-# 20
    a :-# b

    what are the values of a and b?
At least that's not giving a free ride to those who know the previous syntax; they also have to impute meaning to unlearned symbolic strings.

Mind you, how do you decide to do CS at Uni without ever having programmed anything??? Sure, there's not the degree of crossover that people imagine between the two, but it seems like choosing to do maths without ever having proved a theorem.


I'd consider programming to be a specialized topic, as it has domain issues and limitations. As an example, think of things like integer datatypes. Is it really important that we teach people that there are several "flavors" of integers based on how much memory they consume, which is directly related to how many distinct values they can represent? Obviously that's important to programmers, but I think the general populace is safe throughout their lives without that information.

Also, the assignment test is interesting, even with your rewording. The barrier is in the sequential part of the topic. In something like proofs, which is the closest I remember getting to logic in high school, those statements are simply invalid together, because there is no time domain. If `a` is 10, it is 10 throughout the entire construct. Seeing `a` is `b` means that an impossible conclusion has been reached.

Finally, the last point is, well, curious. How many people decide to go into aerospace engineering without ever having engineered an airframe? I'd guess the vast majority. Requiring prior exposure to succeed in a degree is an oddity, so what is it about CS education that causes this?


> The barrier is in the sequential part of the topic

pbhjpbhj's point still stands -- the question is "supremely facile" once you get past matters of syntax. For example, most third graders could solve the problem when you make the time indexing explicit:

    a(1) = 10 and b(1) is undefined. 
    
    a(2) = a(1) and b(2) = 20. 
    
    b(3) = b(2) and a(3) = b(3). 
    
    What is the value of a(3) and b(3)?
We can go a step further for students still struggling by associating to each variable at each time step a fresh variable (bonus: when they see SSA it'll be a breeze):

    Q = 10
    
    W is undefined
    
    R = Q
    
    T = 20
    
    Y = T
    
    U = Y
    
    What are the values of Y and U?
I've had students for whom even this didn't work, so I definitely don't believe "everyone can code". But still, if 30 - 60% of your students can't solve this problem, then there's something very wrong with the way the course is being managed, because I don't even get close to that percentage with sixth graders. Of course, I take the time to explain in excruciating detail what the symbols mean.

So if I had to bet, it means that no one is sitting down and taking the time to help students get up to speed on the meaning of the syntax for this huge engineered system that was just thrown at them. Which is a real shame, because having these sorts of difficulties early in an intro course is, IME, completely uncorrelated with how the student will do throughout the rest of the curriculum.

You see the same thing when tutoring for early mathematics courses -- most students in the bottom of the distribution are having difficulty because some historical-cruft syntax is presented and not properly explained in lecture. Unsurprisingly, if no one bothers to explain what the hell the symbols mean, three more weeks of using the symbols to do more complicated stuff isn't particularly useful. So lack of movement among the groups isn't surprising, but reading off from that anything other than lack of sufficiently differentiated instruction is a fool's errand.

> Requiring prior exposure to succeed in a degree is an oddity

1. This is obviously not true for any of the natural sciences or for mathematics. It's also not true for English -- students have already done a lot of reading and essay writing by the time they get to college. Maybe it's a little more true for engineering fields, but at least there students are expected to have some background in math and physics. Oh, and also, typically take some university-level science and math alongside or prior to their serious engineering courses.

2. Requiring the mastery of large engineered systems during the first week or day of a course is also an extreme oddity.


I'll defer to your experience on the first portion. I only did a bit of CS tutoring while in college, and sequential logic was one of the big issues I saw in the intro classes. Recursion was, of course, usually the next big hump after that. I definitely understand the point about opaque syntax, and do agree that, in both math and CS, the tendency is to simply throw these things at students. Compare to the approach taken by R6RS [0]; it's very explicit about symbols and locations and values and what operations like assignment mean.

On the second point, however, I still have some dispute. In maths or natural sciences, are you really expecting prior exposure to the subject matter? Or are you expecting a foundation upon which to build further? [1] I think the latter is a more appropriate statement. However, in CS, I feel it's more the former. "You must have experience programming, so that we can teach you how to properly program." If someone already knows how to program, what are they doing in a CS program other than getting a piece of paper that certifies their knowledge? Indeed, this is the argument you see regarding drop-outs or "equivalent experience" in lieu of degrees.

This is why I argue that we should instead find those fundamental building blocks and teach those instead. Basically, extract programming from the domain of computers, and figure out how to teach that. Then learning programming is about how those concepts apply to the domain of computers, much like how aerospace engineering is about learning how physics applies to airframes. [2]

[0] http://docs.racket-lang.org/r6rs/r6rs-std/r6rs-Z-H-2.html#no...

[1] It doesn't help that something like "math" is really more of an umbrella term for a large number of fields across different domains.

[2] This is, I'm sure, a huge simplification. Hopefully you can see the intent of my point and overlook the crude analogy.


You're probably from the UK (given your usage of 'Uni'). The high school system in the US is pretty meh. You choose your major after getting to college, not when you apply. So people come to college, realise all the money is in CS, and decide to switch over.


"generalized sequential logic and reasoning" sounds like a pretty specific skill set, actually. I can't imagine wanting to teach that to high schoolers in an alternative history where lisp takes off and becomes the lingua franca of software development. For example.

I could get on board with teaching mathematical logic or combinatorics and discrete mathematics, for example. But "generalized sequential logic and reasoning" is hard for me to get too excited about because it's not a particularly useful concept outside of understanding and using imperative models of computation.

And if the only substantial motivation for teaching something is its application to some particular useful skill (programming in an imperative language), why not just teach that skill?


By the way – the study that the Coding Horror post is based on has been retracted:

http://retractionwatch.com/2014/07/18/the-camel-doesnt-have-...

But I think it's what spurred all the interest into the "bimodal hypothesis" that the current work explores.


The problem with what you're asking is that it's one more course you're trying to fit into a curriculum. My engineering department already gutted their first year calculus course by making most of it (about 3/4) pre-calculus after finding out that a majority of students weren't prepared.

I was asked to tutor a first year engineering student in programming as she was failing. I ended up spending most of the time helping her conceptualize the problem and breaking it down into steps. Once she learnt that, she could manage the syntax aspects of C++. She ended up with a grade in the 60s which isn't bad considering she had a grade in the low 40s after the midterm.

When I was a TA for thermodynamics, I noticed that we would get a bimodal distribution on the midterm exam. Thermodynamics isn't hard once you understand it. We'd get the people who didn't understand with grades in the 40s, but the people who did had grades in the 70s and 80s. The prof said that was pretty common. The objective was to get as many people from the lower curve to the upper curve by the end of the term. You'd typically have around 3 to 5 of a class of 80 fail by the end. Those were the people who just didn't get it. I'm sure that thermo is one of many courses like this.


At the high school level or the uni level?

At the high school level, I remember taking a lot of required BS courses in high school, including public speaking, cooking/child care, a business course, and PE. No one suggested gutting core areas to make room for CS. Or at least, I didn't.

At the uni level I really don't think this would add significant overhead. Take your N sections of intro and switch one or more to intro-intro. Then get other departments to allow either one to fulfill a CS breadth requirement. Students who start with intro-intro can stuff an extra CS course in at some point after the intro-sequence bottleneck.


I had none of those required courses in high school aside from one PE course. In grades 7 and 8 we had combined shop and home economics courses (in each year) which I thought was a good idea since everybody learnt to cook, sew, and do some basic carpentry. I saw so many people who couldn't cook at university. That would have been different if people learnt to cook in school.

This was at the university level. The problem is that my engineering department has a hard time fitting in courses. If you need an additional CS course that's one course covering something else that doesn't get included. We already merged our separate probability and statistics courses into a single course to make room for more intro materials.

For accreditation, all of our courses are broken down into having some of: Math, Intro Science, Engineering Science, and Engineering Design (and possibly none of these). You need a set amount in each category to get accreditation. Other departments have the same requirements, but their programs are designed to meet it. My department lets the student decide most of their upper year courses from across the university. So, if you decide to take a bunch of CS courses, you'll miss out and need to make it up on other courses. By the end, you need more Engineering Science and Engineering Design which isn't hard to do if you have mostly engineering courses, but pure CS courses basically give you nothing on these.

Until you actually participate in these kinds of discussions you don't know how carefully the overall curriculum is planned. Students going through my program have had a number of excellent suggestions, but it's hard to make changes and still meet all the requirements.

My department used to have more courses per term. Originally, there was 7 a term (I think), then there were 6. They've since moved to 5. That means you get 40 courses instead of 48 or 56.


Then we can push the introduction to Computer Science into high school so that we can get directly to what's important in semester 1: databases and web UIs.

:-)


As opposed to trigonometry, which was gifted to us from the Gods without any connection to work-a-day tasks.

:-)

Efficient computing machines are an important invention. Probably one of the most important inventions of the 20th century. The fundamental principles of computation are deep mathematical discoveries. Teaching some programming isn't an unreasonable way to teach about computation.


That would be a step in the right direction. But we should just walk all the way over to the solution. Education in all areas should have n-tracks, where n is the number of students.


Do most colleges really not do this? The University of Melbourne had a split like this back in 2003.


> The earlier classes are exactly the place where things like prior experience and level of interest can possibly explain bimodality.

I saw this a lot at GT. There, many students had to take the introductory programming course at the time (all engineering and sciences? circa 2000, if someone wants to help me remember). There was a great disparity between those of us who had programmed in HS (or earlier) and those who hadn't.

What isn't clear is whether the poor grades were due to a lack of aptitude for the material, or a lack of skill on the faculty's part in presenting and educating, or a lack of interest or focus from those particular students (given it was their first and last CS course and they had to deal with the rest of the hell that was GT's first year).


Some of my professors at GaTech complained that half the students just "couldn't learn" computing. I believed it at the time, but now I think it's that certain styles of coding are prone to mistakes when under stress -- such as when taking a test or on a deadline to submit an assignment.

The classic example is the swap function. One professor asked us to code swap on every exam, even announcing he would in advance. Every exam, a significant portion of the students got it wrong. Because they couldn't memorize 4 lines? That's hard to believe. Easier to believe those 4 lines are so easy to remember that the student wrote them too quickly and moved on to the next question before noticing they'd written them out of order.
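
For what it's worth, here's a minimal sketch of that failure mode (in Python rather than whatever language the course actually used, and with function names I made up; this is not the professor's real exam prompt): the classic temp-variable swap next to the rushed version with the same statements out of order.

    # Hypothetical illustration only: the classic three-statement swap,
    # and the same lines written in the wrong order under time pressure.
    def swap_correct(a, b):
        temp = a   # save a before it gets overwritten
        a = b
        b = temp
        return a, b

    def swap_rushed(a, b):
        a = b      # a's original value is lost here...
        temp = a   # ...so temp now holds b, not the old a
        b = temp
        return a, b

    print(swap_correct(1, 2))  # (2, 1)
    print(swap_rushed(1, 2))   # (2, 2) -- same lines, wrong order, wrong answer

Every line is trivially memorizable; only the ordering carries the logic, and that's exactly what evaporates when you're rushing.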


In my own experience, later courses (in computer science specifically) tend to erase the bimodal distribution because your ability to do science related to computing is almost orthogonal to your ability to actually build something with computers. Some of the people who graduated with honours alongside me could not code their way out of a paper bag if there was an exit_paper_bag() instruction.


>The earlier classes are exactly the place where things like prior experience and level of interest can possibly explain bimodality

Ok? This is claiming bimodality doesn't exist, not trying to explain it.


A more obvious consideration is that upper-level classes have smaller sample sizes. They defaulted to 'not bimodal', which creates a very strong bias in their methods. Even then they found statistically significant results across a range of samples, thus invalidating their conclusion.

The issue seems to be a failure to understand statistical analysis. The possible results are probably Yes or inconclusive.

PS: You can also hide bimodal distributions by averaging several offset distributions together. If class A's average is 70 with two peaks at 60 and 90, and class B's average is 80 with two peaks at 70 and 90, you get four weak peaks at 60, 70, 80, and 90. Now repeat for a range of averages and it's going to look like a bell curve.
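
To make that PS concrete, here's a toy simulation (Python/numpy, with made-up peak locations and class sizes, not the UBC data): every individual section is bimodal, but pooling many sections whose peaks sit at different offsets smears the result into something close to one broad bell curve.

    import numpy as np

    rng = np.random.default_rng(1)

    def bimodal_section(low_peak, high_peak, n=80):
        """One section whose grades cluster around two peaks."""
        low = rng.normal(low_peak, 4, n // 2)
        high = rng.normal(high_peak, 4, n - n // 2)
        return np.clip(np.concatenate([low, high]), 0, 100)

    # Pool 200 sections whose peak locations drift from section to section.
    pooled = np.concatenate([
        bimodal_section(rng.uniform(55, 75), rng.uniform(80, 95))
        for _ in range(200)
    ])

    # Crude text histogram of the pooled grades: no obvious twin peaks survive.
    counts, edges = np.histogram(pooled, bins=25, range=(0, 100))
    for c, lo in zip(counts, edges):
        print(f"{lo:5.1f} | " + "#" * int(c // 40))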


Maybe the group as a whole is a normal distribution and your small samples made it look bimodal.


I remember my real analysis professor explicitly telling us that he ran the numbers on one of our exams and found bimodality. Real analysis is not what I would call an earlier class. I'm not sure if we were an outlier or a representative data point, however.


I can easily believe it was representative.

I would expect ability driven bimodality to show up in courses that present new types of challenges that were not in previous courses. That is where you'll get students who either do or don't master key concepts, and where they go from there depends on their mastery of those concepts.

For computer science, one of those key concepts is reasoning about state manipulated through a layer of indirection. The infamous "getting" pointers issue.

In the case of real analysis, it is an issue of switching from "Can you apply known operations to produce the right answer?" to "Do you understand the structure of the reals well enough to prove it?".


> In the case of real analysis, it is an issue of switching from "Can you apply known operations to produce the right answer?" to "Do you understand the structure of the reals well enough to prove it?".

Is this really the case in general? My math degree curriculum started with 100% proof-driven classes almost immediately, long before most people got around to taking real analysis. It was my impression that that kind of approach was fairly ubiquitous.


In the USA and Canada, it is standard for the professor to stand at the board and present proofs of everything, but students are not expected to understand them. If you can learn the formulas and apply the techniques that are in the homework sets, you can pass the class.

I strongly question the value of having the professor recite proofs that the students do not really understand. However it is traditional to do that.

The first course where students are expected to write their own proofs varies. My personal experience was that the transition happened in third year courses aimed at math majors. So I began writing proofs in abstract algebra and real analysis. But not in my third year differential equations course that was required for engineering and physics majors.

When I was in grad school at Dartmouth, they introduced it earlier, in a linear algebra course. I taught the last linear algebra course that did that; the next quarter they took proofs out of it.


Hm, interesting. My undergrad degree at Berkeley was pretty much solely the professor running through proof after proof, and our homework was more proofs. It actually suited me just fine; it was one of the few curricula in my life that I found presented any challenge.

My impression was that it worked fine for the majority of the other students as well, but I don't have tons of data on that or anything.


I'm not surprised that Berkeley is an outlier in this regard.

Talk to people who went to other universities and I'm confident that you'll find my experience to be more common.


I think the idea is that people in the lower mode of grading would most often self-select out of CS at the 100 level. This reduces the bimodality at the 200 level, where again students self-select out; then at the 300 level the bimodality is even further reduced.

I think the 100 level is the place where such bimodality is most likely to exist, and investigation ought to be focused there.


Your parent's point is that this hypothesis ("bimodality reduces as students progress through the curriculum") is possibly contradicted by the fact that many late-stage mathematics courses are strongly bimodal.

(FWIW my analysis course was also bimodal.)


Based on some limited experience, upper-level math and (I assume) physics probably have a stronger element of "get it" vs. "don't get it" than most majors. Way back when I was an undergrad, a number of the math majors I knew considered it to be a relatively easy major, and there were definitely people who pretty much breezed right through.

At the same time, I was quite convinced that I could no more have successfully graduated as a math major--no matter how hard I worked--than I could have successfully flown through the air by flapping my arms. By contrast, I'm reasonably confident that I could have gotten through pretty much any other majors, whether engineering or something else.


If there are two major topics in a test (say Topic 1 and Topic 2), then bimodality could arise simply because

* some people studied both Topic 1 and Topic 2

* some people studied Topic 1 more than Topic 2

* some people studied Topic 2 more than Topic 1

This would cause bimodality even when inherent "genetic" skill is the same.
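
A quick sketch of that mechanism (Python/numpy, with invented point values and score spreads): every simulated student has the same underlying ability, and the only thing that differs is which topics they studied, yet the total-score histogram comes out with two humps.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000  # students per study pattern

    # Each exam half (Topic 1, Topic 2) is worth 50 points. A studied topic
    # scores around 45/50, an unstudied one around 20/50; "skill" is identical.
    studied_both   = rng.normal(45, 5, (n, 2))
    studied_first  = np.column_stack([rng.normal(45, 5, n), rng.normal(20, 5, n)])
    studied_second = np.column_stack([rng.normal(20, 5, n), rng.normal(45, 5, n)])

    totals = np.concatenate([
        studied_both.sum(axis=1),
        studied_first.sum(axis=1),
        studied_second.sum(axis=1),
    ]).clip(0, 100)

    # Text histogram: one hump near 65 (studied one topic), one near 90 (studied both).
    counts, edges = np.histogram(totals, bins=20, range=(0, 100))
    for c, lo in zip(counts, edges):
        print(f"{lo:5.0f} | " + "#" * int(c // 100))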


If this were turned into a more rigorous MCMC-style simulation and fed basic societal trends of high school and early-stage college (courses taken, interest, aptitude, etc.), you might be able to get an estimate of the size of the effect of these factors. Hmmm...


This seems unlikely across multiple students and years (though it is possible, so more than one school should be studied).

Your comment also presumes there are no students who can study well enough to pass all exams. Considering the ease of 100-level CS courses, acing that and another 100-level course is not exactly hard.


Aside: real analysis classes are rarely large enough to do a good analysis of this in, unless you incorporate many years.


While real analysis alone is probably not enough to do a real representative analysis, if the class were the first real analysis class in the undergrad sequence, then I'd call it an earlier class. Eg, 1st upper division real analysis class, assuming the lower divs were the ones the pre-meds and CS majors took.


100-level: 16/171 = 9.4%

200-level: 5/165 = 3.0%

300-level: 12/243 = 5.0%

400-level: 12/199 = 6.0%

The evidence supports the claim that 100-level classes are relatively more likely to have bimodal grade distributions than upper-division courses, but does not support the claim that they are likely to have bimodal grade distributions (and we would want to compare csci to other 100-level classes if we thought there was something particularly problematic about entry-level csci education). By all means let's get more data, but entry level csci courses do not in general have bimodal grade distributions according to the data.

Having TA'd low-level csci courses, I would conjecture that entry-level courses in general are taught by less talented teachers, tend to be graded more sloppily due to sheer workload, and also have a higher percentage of people who are absolutely whacked out of their minds than upper-division courses do.


100-level classes weren't substantially more often bimodal than the others, though.


9% vs. 3%, 5%, and 6%.


But the 300/400 are almost as high.


The whole premise of the survey part of this paper strikes me as very flawed. The idea was "show some professors histograms we know are normal, and see if they see bimodality that isn't there". The problem is, they didn't show them 6 random histograms of the normal distribution. For one, they didn't actually show them normal distributions at all; they showed them normal distributions capped at 100. That's going to give you a point mass at 100... making it a bimodal distribution! Second, they didn't give 6 random examples of a capped normal histogram; they gave them 6 examples chosen from a random set of histograms chosen in such a way that 4 of the 6 look bimodal. The survey participants didn't see bimodality that wasn't there, they saw bimodality that was purposefully generated by the histogram selection methodology!
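
A rough sketch of the capping effect (Python/numpy; the mean, spread, and sample size are invented here, not the parameters the paper used): drawing grades from a plain normal distribution and then capping at 100 piles extra mass into the top bin, which can read as a second mode.

    import numpy as np

    rng = np.random.default_rng(2)

    # Underlying grades are unimodal normal; capping at 100 creates a pile-up at the top.
    grades = np.clip(rng.normal(75, 18, 500), 0, 100)

    counts, edges = np.histogram(grades, bins=20, range=(0, 100))
    for c, lo in zip(counts, edges):
        print(f"{lo:5.0f} | " + "#" * int(c))

The 95-100 bin comes out visibly taller than its neighbours even though nothing bimodal went into the draw.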


Aren't final grades often "curved", i.e. re-graded according to rank? That would explicitly destroy bimodality, making the final grade distribution consist of non-iid order statistics.
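
To illustrate the first point, a toy sketch (Python/numpy, with made-up raw scores and percentile bands rather than any school's actual policy): once you re-grade purely by rank, the letter-grade distribution is fixed by the bands you chose, no matter how bimodal the raw scores were.

    import numpy as np

    rng = np.random.default_rng(3)

    # Raw scores from an obviously bimodal class: half cluster near 55, half near 90.
    raw = np.concatenate([rng.normal(55, 5, 40), rng.normal(90, 4, 40)]).clip(0, 100)

    # "Curving by rank": replace each raw score by its percentile in the class,
    # then map fixed percentile bands to letter grades.
    ranks = raw.argsort().argsort()            # 0 = lowest raw score
    percentiles = (ranks + 0.5) / len(raw)
    letters = np.select(
        [percentiles >= 0.85, percentiles >= 0.60, percentiles >= 0.30, percentiles >= 0.10],
        ["A", "B", "C", "D"],
        default="F",
    )

    # The letter counts are dictated by the bands (15% A, 25% B, 30% C, 20% D, 10% F),
    # so any bimodality in the raw scores disappears from the recorded grades.
    for g in "ABCDF":
        print(g, int((letters == g).sum()))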

Also, it's commendable that they point out that they expect 5% false positives, but frustrating that they don't go further and explicitly plan for a multiple testing correction procedure. It seems that they don't need it to fail to reject the null "meta-hypothesis", but still.


Not only are they curved, different courses are 'curved' according to vastly different rules. Some curve percentages, some curve only letters, and different distributions are enforced for different classes.

So if this is post-curve data, I would expect it to be basically worthless as a reflection of skill distribution.


Yeah, lots of ways to fudge. On one exam, my friend Brian and I were allowed to skip the final and take our grade, so the rest of the class could go from failing to getting a grade on a curve. There was that much of a gap between us and the rest.


My Operating Systems course had a really odd grade curve.

There was a listed one in the syllabus, but then the exams were so difficult that nobody was scoring above 80-85 out of 100, so there had to be an additional full grade curve after that.

So yeah, post-curve isn't helpful. Unless there's some way to normalize the data.


The curving process should be linear, I think.


The curving process shouldn't exist.

If you make an exam that is difficult enough to need curving, you're getting a poor measure of ability. This is because exams ask only a few questions, and unreasonably difficult exams result in low scores even from high achieving students. Low scores are very susceptible to noise (the delta between 50% and 60% is greater than between 85% and 95%).

If that doesn't convince you, take my argument to the obvious extreme. The Putnam Competition in mathematics is tough. Sometimes over half of the people score 0. Getting 1 question correct (out of 12) at times puts you in the top 20%.

Imagine I gave this as an exam to a math class of 20 students. One person scores a 1. The rest score 0. Is it meaningful to curve this? I could correct by giving partial credit: Some people get 0.25, others 0.5, and others 0.75, so we now have 5 different grades. Should I just give A, B, C, D and F?

The lower the scores, the higher the effect of noise. It's a bad idea.


> Low scores are very susceptible to noise (the delta between 50% and 60% is greater than between 85% and 95%)

This seems untrue from information theory.

The entropy of a question is maximized when its probability of being answered correctly is exactly 50% [1]. If your only goal is to have the least amount of measurement noise given a fixed number of questions, then you'll want each question to be hard enough to filter 50% of the class out, and to minimize the correlation between questions.

For example, 10 independent 50% questions reveal as much information as 16 independent 85% questions.
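
That "10 questions vs. 16 questions" figure is easy to check with the binary entropy function from the linked article (just a back-of-the-envelope script, nothing more):

    from math import log2

    def binary_entropy(p):
        """Bits of information revealed by one question answered correctly with probability p."""
        return -p * log2(p) - (1 - p) * log2(1 - p)

    print(10 * binary_entropy(0.50))  # 10.0 bits
    print(16 * binary_entropy(0.85))  # ~9.76 bits -- roughly the same total information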

> Imagine I gave this as an exam to a math class of 20 students. One person scores a 1. The rest score 0. Is it meaningful to curve this?

You're looking at only 1 tail but ignoring the other. By symmetry, a question that only 1 person gets correct tells you exactly as much as a question that 19 get correct. An exam with a 90% pass rate (before curving) is no better than a 10% exam.

[1] https://en.wikipedia.org/wiki/Binary_entropy_function


Thanks. I'm now having flashbacks to the summer of 1990, when I worked on an Education professor's research project, creating a math test in a HyperCard stack that presented the question that would provide the most information about the testee's abilities, given their previous answers.


The computerized GMAT and LSAT attempt to do this - the difficulty level of questions dynamically adjusts to your current performance level (higher means harder questions).


First, while you may have a point, you misunderstood my percentages. I did not mean them to be the pass rate, but rather the score.

> An exam with a 90% pass rate (before curving) is no better than a 10% exam.

I do not encourage a 90% pass rate. I do not endorse exams where most of the students score 90%. I'm saying that an exam which allows for a large variation in scores (anywhere from 10 to 100) yields more information, whereas an exam where a really brilliant student will get a 50, with the next highest score being a 30 from someone who is only very smart, tends to be less likely to yield useful information about most of the students, who will get 20 or less. A fairly bright student and a fairly average student may both score a 20 on such a test - yet the test has failed to distinguish between them.

(BTW, I had an instructor whose exams were like this - I think I once had the highest score at around 25-30 out of 100).

A less demanding, but not trivial test, will separate out the average from the brighter.

Regardless, why the need for a curve? Your grading system should not depend on which students are present. It can lead to poor students getting a good grade and smarter students getting a poorer grade - in different semesters, for the very same tests.


> Regardless, why the need for a curve? Your grading system should not depend on which students are present. It can lead to poor students getting a good grade and smarter students getting a poorer grade - in different semesters, for the very same tests.

The distribution of skill between one 300-person calc class and the next is going to vary much less than the difference in teaching styles and exam difficulties. This is exactly why a curve is required -- so that your grade reflects how well you perform relative to others doing the same thing instead of some absolute level of competence that would vary from school to school, prof to prof, semester to semester.

Consider an employer or graduate school admissions committee that needs to decide who to interview. Looking at curved grades makes it easy to pick the top X% of students, whereas looking at uncurved grades leaves a lot more to chance (maybe a C was the highest grade in your section, but there was an easier prof the following year where the highest grade was an A).


That works fine when you have classes with 100+ students. In the ones I attended, class sizes ranged from 15 to 40 (the latter being considered high). Smaller classes tend to be impacted more by noise.

>Consider an employer or graduate school admissions committee that needs to decide who to interview. Looking at curved grades makes it easy to pick the top X% of students

As an employer, I'm not interested in the candidate's ranking in the class. I'm interested in their skills. While one is often used as a proxy for the other, I don't use it that way.

As a student, I want feedback on how much knowledge I learned, not how I did in comparison with the class. This was the original purpose of scoring tests.

Having gone through the PhD route, I know that "A" grade students who were always focused on the metric of relative ranking rather than knowledge acquired were eventually more likely to do a poor thesis or drop out, compared to "A" grade students who were focused on acquiring knowledge.

This was more acute for students who came from top undergrad schools: a very competitive background with heavy curving - and they would take their A as a faulty indicator that they were "doing well". In grad school, even though the courses are more challenging, most professors give A's and B's. Only rarely were C's given. The professors want to focus on learning and theses - grades are a distraction. Suddenly these students were getting A's and thinking they were doing well, while not learning much. Their internal barometers were measuring the wrong thing, so their research suffered.


> As a student, I want feedback on how much knowledge I learned, not how I did in comparison with the class. This was the original purpose of scoring tests.

My university automatically attached our grades to all internship applications. It's pretty clear the purpose of grades (other than pass/fail) is employer or grad school evaluation, not for student feedback. For better or for worse.


Which is why in grad school, many professors subvert this by never giving C, and only giving B's if you're fairly poor. They lost the battle for undergrad, but grad school (at least in science/engineering) is still their domain.


I was unclear before -- an exam with an average score of 90 is equally bad as one with an average score of 10.

The variation in score is maximized when the exam has an average score of 50.

Since an uncurved score of 50 is a failing grade, you'd need curving to make that system work.


>I was unclear before -- an exam with an average score of 90 is equally bad as one with an average score of 10.

I agree but it's not what I'm advocating for.

>Since an uncurved score of 50 is a failing grade, you'd need curving to make that system work.

A curve is inherently about grading relative to one's peers, and I see no reason why it is required. If your test will have an average score of 50, make that a B (or whatever), and perhaps 75 an A. Just do it and fix it to those grades. Do not keep changing the thresholds based on your current batch of students.

I think the fundamental differences between the two camps boils down to this question:

Should grades be an absolute measure or a relative measure? I'm strongly in the absolute camp.


> If your test will have an average score of 50, make that a B (or whatever), and perhaps 75 an A. Just do it and _fix_ it to those grades.

That's still a form of curving, since you can't know ahead of time on a new test how students will do. E.g. if you arbitrarily fixed it at 50 / 75 and your students do a lot worse than you expected, do you just fail the lot of them? If not, and you move the score thresholds lower, then that's just what curving is.

In fact, I had a couple of professors who told us ahead of time that curving could only be used to adjust our scores upward, but never downward, e.g. 70% was passing, but so was 60%, if enough of the class did poorly. Made the students feel much better, since they had a fixed bar to reach.

Seems like you'd be satisfied though if the curve used at least ~100 students to calculate so that it didn't change much due to random variation (e.g. if your class is small, aggregate multiple years of data)? Still a form of curving (since you are, at the end of the day, being compared to other students), just with different methodology.


But I've seen a lot of courses where it isn't. Among the many techniques I've seen used:

- Raw average, then set letter cutoffs for a normal distribution across students. Arbitrary on distribution, normal on letter.

- "Skew high" by counting missed points more weakly as you miss more. Inflated letters with a score-accurate relative distribution.

- "Fixed and curved", where each cutoff is based on the more (or less!) generous of standard deviations and 'conventional' grade brackets: the floor for A is either 92%+ or +2 SD from mean, B is 83%+ or +1 SD, etc. Bizarre and distorted everything, since the relevant cap is determined by the local distribution.

- God knows what else.

So honestly, I think post-curve data is a hopelessly course-specific, nonlinear mess.


I've seen a few different curving processes:

1. A formula, like square root of grade times 10, that's designed to bring up grades (sketched below). There are a lot of possibilities here.

2. Top x% get A's, next y% get B's, etc.

3. Look for clusters: the top cluster gets A's, the next B's, etc.

These can also be applied to classes as a whole or to individual assignments and tests.
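For concreteness, that first formula looks like this (a minimal C sketch, nothing more):

  #include <math.h>
  #include <stdio.h>

  /* "Square root of grade times 10": maps raw scores 0..100 back onto 0..100,
     boosting low and middle scores the most (49 -> 70, 81 -> 90, 100 -> 100). */
  double sqrt_curve(double raw) {
      return 10.0 * sqrt(raw);
  }

  int main(void) {
      for (int raw = 0; raw <= 100; raw += 10)
          printf("raw %3d -> curved %5.1f\n", raw, sqrt_curve(raw));
      return 0;
  }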


That sounds funny, but it's true under the common modern curving regime that is really just lowering the bar for raw score to letter grade conversion, not forcing a bell curve.


Square root of grade times 10 :-p


"We theorized that the perception of bimodal grades in CS is a social defense. It is easier for the CS education community to believe that some students “have it” and others do not than it is for the community to come to terms with the shortfalls of our pedagogical approaches and assessment tools."

I didn't read the whole article, but I'm not sure I agree with the conclusion. Even with a standard Gaussian distribution, one may still believe that some students have it and some don't. Let's say there is consistently 25% of students who are unable to pass their exams (for various reasons: lack of interest, motivation, intelligence, discipline...); why would a "social defense" be needed?

While I agree that we always should try to improve our pedagogical tools, I don't believe that anyone can learn anything provided they have a good teacher.


Read section 4 of the paper.

They take data that, as a matter of mathematical fact, is or is not bimodal, and then ask people who do or don't believe in the "geek gene" hypothesis to interpret it.

Conclusion of the section: "We found a statistically significant relationship between seeing-bimodality and participants’ responses to the questions relating to the Geek Gene hypothesis"

So the "social defense mechanism" theory is definitely debatable, but apparently people who believe in the "Geek Gene hypothesis" are more likely to see bimodality where there is none.

> I don't believe that anyone can learn anything provided they have a good teacher.

The point isn't that everyone can learn anything. The point is just that the distribution isn't as bimodal as people seem to think it is. Also worth noting that the researchers posit normal distributions, not uniform distributions, as the alternative to bimodality.


I'm less impressed by "wrong about rigged data" results than most people seem to be. Yes, it implies that people bring assumptions to their answer, but is that really shocking?

People with a high prior for a thing assume that new data is most likely to conform to that prior. If you handed me a questionably-bimodal data set, my judgement of its distribution would absolutely depend on what you told me the data represented. Hopefully I'd answer right if I sat down and analyzed the thing in depth, but if you simply go "what kind of distribution does this look like?" then I'm going to include my outside expectations.

Yeah, there's a risk of bias here, largely from people not conserving expectations (if you already assume Geek Gene is true, unfavorable data should count more than favorable data). But saying "people used experience to interpret new data!" doesn't seem like it's proving much.


Great point. To go further, I'd argue biases are crucial for efficient decision making, as biases are a form of human judgement essentially boiled down to priors, used to make complex decisions quickly from limited data. Without effective biases we'd all probably starve to death arguing. Of course, the fatal flaw (of the modern age?) is an unwillingness to critique biases without outright dismissing them just for being biases.


There's a "risk" of bias here? The point seems to be that the professors don't have a good prior -- they have a bias and are confusing it for a prior, and that's what was demonstrated by the experiment.

(that, and perhaps CS professors aren't particularly good at statistics)


I disagree with the "bias not prior" claim.

The professors were shown constructed data, and primed to apply a prior to it about grades. Prompting someone to use a prior inappropriately isn't the same as revealing a bias.

I see what you're getting at (if they wrongly assessed these curves, why would they be right about real data?), but there's a connective step that seems missing. If you show me a histogram and ask if it's bimodal, I can't tell you with certainty, so I have to guess. That's partly by shape, and partly by my knowledge of what the data is. Misjudging an actively-misleading histogram doesn't prove that I'm bad at assessing real data. (Since they just said 'yes' or 'no', I wouldn't accuse them of being bad at statistics - they didn't compute an answer.)

So even if real CS grades are bimodal, I would have expected the 'priming' result they found here. The professors were wrong, but this doesn't seem to have much predictive value.

More importantly, though, this whole study is bad. Section 3 is based off a single university's final grades, with no discussion of whether assignment grades were being curved before going into the final score (they are in at least some universities). So it's entirely possible that they took normalized data and found that it had normal distributions. Section 4 is unconvincing in several ways. "Assess this constructed data, maybe we lied about the origin" isn't a direct analogue to "are your actual grades bimodal?". "Assess this data for bimodality" threatens a priming effect that's incomparable to unprompted observations about grades. Priming effects are struggling to replicate, and may simply not work the way these researchers expected at all.

The study does an admirable job of acknowledging a lot of this, like the researcher-priming risk, but it doesn't actually remedy any of them. So I'm not sure what to say except "any of these methods could be meaningless".


You're countering (A => B) by claiming they are saying (B => A)

A = Belief that some students "have" it

B = View the distribution as bimodal

> Even with a standard Gaussian distribution, one may still believe that some students have it and some don't.

This is exactly what is happening. Many of the distributions are Gaussian, yet people are seeing them as bimodal.


Programming tests amplify modality because overall success requires success on multiple items at once.

Consider teaching N skills. Success on each skill is binary and uncorrelated, and every student succeeds on each skill with some high probability P. If the score is the number of successes, you get a binomial distribution - a single, roughly Gaussian hump. That's most of education.

Now suppose the score is 1 only if all N skills are learned. You'll get a bimodal distribution. That's programming tested on whether the program runs.
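A quick simulation makes the contrast concrete (a sketch only, with made-up numbers: 10 skills and an 80% per-skill success rate):

  #include <stdio.h>
  #include <stdlib.h>

  #define STUDENTS 10000
  #define SKILLS   10
  #define P_LEARN  0.8   /* assumed per-skill success probability */

  int main(void) {
      int per_skill[SKILLS + 1] = {0};  /* histogram: number of skills learned */
      int all_or_nothing[2]     = {0};  /* histogram: does the whole program "run" */
      srand(42);

      for (int s = 0; s < STUDENTS; s++) {
          int learned = 0;
          for (int k = 0; k < SKILLS; k++)
              if ((double)rand() / RAND_MAX < P_LEARN)
                  learned++;
          per_skill[learned]++;                 /* binomial: one bell-shaped hump */
          all_or_nothing[learned == SKILLS]++;  /* two spikes: zero or full marks */
      }

      printf("skills learned : students\n");
      for (int i = 0; i <= SKILLS; i++)
          printf("%14d : %d\n", i, per_skill[i]);
      printf("all-or-nothing fail/pass: %d / %d\n",
             all_or_nothing[0], all_or_nothing[1]);
      return 0;
  }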


There are two links discussing that in a blog post by the author of this article[1] (second to last paragraph)

1: http://patitsas.blogspot.ca/2016/01/cs-grades-probably-more-...


Sadly, they're both paywalled.


I get especially mad when people suggest that there are some ideas that are just special, that only special people can get. Pointers and recursion are at the top of this list.

Take Joel Spolsky, a person whose ideas I read and respect: http://www.joelonsoftware.com/articles/GuerrillaInterviewing...

> For some reason most people seem to be born without the part of the brain that understands pointers. Pointers require a complex form of doubly-indirected thinking that some people just can’t do

Really? Here's my 20 dollar challenge: give me a person and I can teach them pointers. Pointers, referencing and dereferencing, null pointers, pointers to pointers, the whole deal.

The problem is the pedagogy. Learning a new abstraction and a new way of thinking takes time. It's easy to get it but really not /get/ it. You have to do a lot of examples. One on one with a person, with real time feedback. Until they internalize it. Many many many simple examples at first, and then examples of using it in a real world context. Then more. If you already know this abstraction, consciously or unconsciously, it's impossible to un-know it, and the reaction when someone is struggling is "I explained it as simply as I can and I don't know why they still don't get it". It's not like that. It's like basketball: you can explain how to shoot, but really the student needs to do it, and you comment while they're doing it.
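For what it's worth, the whole list fits in a dozen lines of C - a minimal sketch of the concepts, not the lesson plan:

  #include <stdio.h>

  int main(void) {
      int x = 42;
      int *p = &x;          /* referencing: p holds the address of x */
      int **pp = &p;        /* pointer to a pointer */
      int *nothing = NULL;  /* null pointer: points at no object */

      printf("x = %d, *p = %d, **pp = %d\n", x, *p, **pp);  /* dereferencing: 42, 42, 42 */

      *p = 7;               /* writing through p changes x itself */
      printf("after *p = 7, x = %d\n", x);                  /* 7 */

      if (nothing == NULL)
          printf("dereferencing a null pointer is undefined, so check first\n");
      return 0;
  }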


> I get especially mad when people suggest that there are some ideas that are just special, that only special people can get.

I think this is an important point.

After working with concepts in enough domains, you start to see they are all made up of the same kinds of things; when it seems like they aren't, you're likely missing some implicitly used concepts that you don't even know you're missing, which makes learning the new thing seem fundamentally more difficult than other things.

The way I think about why this would be (or would not be) is: are all concepts reducible to some small set of basic entities—i.e. are they all built up from some shared concept 'atoms'?

Now, I don't have a way of giving a definite answer to that, just my experience asking this question over and over again while picking up new concepts, and my experience so far puts me in favor of these 'atoms.' For instance, are there any concepts that can't be broken down into sets of 'types' and 'relations' between the types? (or something along those lines)

Could there be more significant barriers to people learning new concepts than limits to their intelligence? I've taught programming to a lot of people, and it's easy to see where they get caught up: they're doing fine, making progress, happy, excited, a little nervous; they make minor mistakes but quickly see their errors and it doesn't worry them too much. Then, they run into some difficulty that puts the question in their mind of whether they can really do this or not and you can observe their agitation increase. Typically they want to go do something else pretty soon after. If they are especially persistent they will face this scary situation and go back into it the next day—but more often they'll feel guilty about not going back to it for a long period, maybe do it once or twice after without much enjoyment or progress, and ultimately, quietly drop it.

What I have never seen is someone run into a wall with some concept, persistently revisit it and work earnestly to understand, and yet still fail. That's what they fear will happen (maybe an evolved feature to prevent spending too much time on theories that are going to dead-end?), but it doesn't.


I think statements like these brush off problems of learning difficulty as something to be surmounted with good pedagogy; incremental pain for incremental gain.

But when you run a youth pedagogical program of any sort, you can see that the consequences of speed or intelligence aren't things you simply overcome with a great teacher and applied effort. You can see that the consequences of both fast and slow accumulate for years.

The consequence of slow is that by the time you are 18 or whatever, you have learned fewer things. Sometimes the consequence of slow is that you simply don't even do X because of a rational consideration of opportunity cost.

In the race of money and life, almost all tortoises and hares in our society feel the bite of debt and financial worry. It is a meh strategy for the tortoise to hope that hares are lazy and arrogant. It is a meh strategy to hope that the tortoise will receive a high quality education where the hare will not.


This is likely, but our current methodology for assessing the fast and the slow is so atrocious that it's doing more harm than good.


I don't think the question is whether a human can learn pointers or not, but rather how much effort one has to put in to reach mastery. From various physical pursuits such as swimming, running, bicycling or mountaineering, it is well known that a few people are born with inherent advantages such as metabolism, large lungs, muscle type, carb absorption, heart strength and so on. These people typically put in a lot less effort and advance much faster than others who don't have those advantages. For marathon running, you can find lots of examples where one person has to put in enormous effort over an extended period just to barely succeed, while another who started with the same initial abilities is competitive with little effort.

It is, however, contentious whether these analogies to developing physical abilities apply to mental abilities, primarily because the brain is understood to be far more complex, malleable and elastic. Even if a person doesn't understand one set of abstractions such as pointers, it is likely that they excel at others, such as recognizing a perfect shade of color, musical composition or designing game levels. Instead of everyone trying to understand pointers, I think people should find their sweet spots, especially the ones they enjoy.


> Even if a person doesn't understand one set of abstractions such as pointers, it is likely that they excel at others, such as recognizing a perfect shade of color, musical composition or designing game levels. Instead of everyone trying to understand pointers, I think people should find their sweet spots, especially the ones they enjoy.

This sounds like the assumption that a lack of predisposition to one field is correlated with skill in other fields (e.g., "X is not a math person, so he's likely to have some talent in lit or the arts"). I thought I recalled reading that what evidence there is doesn't support this, and that in fact those who have a knack for, e.g., math are more likely to be good at other fields too. (This of course isn't talking about time spent practicing each field, which is rivalrous, but we're talking about "inherent advantages" here.)


"give me a person and I can teach them pointers" Yeah, I don't think so. I have a 16 year old son that still hasn't mastered multiplication beyond single digits. It's not for lack of effort. We have spent $10,0000+ private tutors in addition to the efforts from his teachers/school, but he just doesn't get it. He isn't mentally retarded or autistic or anything else (we've had him tested many many times both privately and by the school system). He just has a terrible memory and has a hard time putting things in the right order. I guarantee that you could not teach him something as abstract as pointers no matter how good your pedagogy is.


Do people actually learn multiplication beyond single digits? I know up to 12x12 and do math a few hours a day for work.


I am fairly sure that almost everyone with a CS degree can calculate something like 56 * 13 with a pen, paper and some time.

The thing is that you can teach almost anyone to memorize a table, which is why you can teach them to multiply single digits. But once you try to teach them to follow an algorithm, a lot of people just won't get it, which is why multiplying numbers they haven't memorized becomes almost impossible for some.
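To make "follow an algorithm" concrete, here is the pen-and-paper procedure written out as code (a sketch only; the digit columns live in an array the way they live on paper):

  #include <stdio.h>
  #include <string.h>

  /* Schoolbook long multiplication of two non-negative integers given as
     digit strings: multiply every pair of digits into its column, then do
     one carry pass - the same steps as on paper, just reordered slightly. */
  void long_multiply(const char *a, const char *b, char *out) {
      int la = (int)strlen(a), lb = (int)strlen(b);
      int col[64] = {0};                       /* col[0] is the units column */

      for (int i = lb - 1; i >= 0; i--)        /* one row per digit of b */
          for (int j = la - 1; j >= 0; j--)
              col[(lb - 1 - i) + (la - 1 - j)] += (a[j] - '0') * (b[i] - '0');

      for (int k = 0; k + 1 < la + lb; k++) {  /* carry pass */
          col[k + 1] += col[k] / 10;
          col[k] %= 10;
      }

      int k = la + lb - 1;                     /* print, skipping leading zeros */
      while (k > 0 && col[k] == 0) k--;
      int n = 0;
      for (; k >= 0; k--) out[n++] = (char)('0' + col[k]);
      out[n] = '\0';
  }

  int main(void) {
      char buf[64];
      long_multiply("56", "13", buf);
      printf("56 * 13 = %s\n", buf);           /* 728 */
      return 0;
  }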


Well, he has never been able to memorize the multiplication table. But if you try to get him to solve 10 x 13 he just can't do it. But he can solve 3 x 3 with a pencil, paper, and his fingers or some manipulatives.


I think it's meaningless to say that anyone can eventually learn X. Even if it's true, it's not really relevant.

Suppose you're a prospective CS student and you ask your trusted friend/advisor/teacher if they think you have what it takes. Which is a more useful answer?

  1) "Anyone can do it with enough guided practice"
  2) "Yes, but based on your math grades you'll probably spend several times as much time studying as your peers"


That depends a very great deal on what goal you're trying to advance with a useful answer. The former works towards encouraging the student to keep trying. The latter might help guide the student towards a different area where they might excel with lower time costs. Different people may have different opinions on what constitutes "useful" in such a scenario.


1) is more of a hypothesis, whilst 2) [modified to "if you manage, then you'll spend longer ..."] is pretty much a well-tested theory among educators.


Some people get pointers immediately, after one exposure, before puberty. Not every student needs to practice. Some people, who maybe never came to your university at all, will never get linear algebra no matter how much they practice. To say the distribution is unimodal is one claim. It's another matter to claim the distribution is narrow.


To be sure, if you persist long enough, you'll manage to teach more people these things, but I'm quite willing to assert that there are some people (I won't say anything about percentages) who will simply never be able to grok some of these concepts under any circumstances short of substantial neural adjustment. (Even if I ignore the easy target of mentally disabled persons, with whom I would easily win your $20.)

Just like I will never be a marathon runner and never had the slightest chance of doing so, however hard I tried, because of biology.


Absolutely - I'm sure that a bit of innate talent can improve one's rate of learning, but 90% of the willing can do it. The best programmers that I've met just loved the art and science of computing, had respect for their craft and very high standards of quality, were undaunted or even motivated by new challenges, but most importantly, put in the necessary time.

I doubt things have changed much since I went to college in the 90s, but back then, the ones that did well at CS either had been hacking computers since the Apple II or spent enormous amounts of time at the lab (or both). Also, when I was at the Microsoft campus in the 90s, it was not a coincidence that the most expensive cars were the ones that stayed in the parking garage the latest.


> The best programmers that I've met just loved the art and science of computing, had respect for their craft and very high standards of quality, were undaunted or even motivated by new challenges, but most importantly, put in the necessary time.

This description doesn't conjure the image of a person unable to understand pointers.


Perhaps you're putting the cart before the horse. Did you play with computers a lot as a kid? Did it come naturally? For myself, I can answer "yes" to both. I suspect it's common for aptitude to come before motivation.


I firmly believe that every single person can learn any single concept in the world, given sufficient instructional time, resources, and contact.

However... this should not be confused with a belief that every single person can learn any concept in an economically scalable manner and useful period of time. In a world of limited resources, this can be a very important distinction.


I have two questions about your challenge. How many times am I allowed to give you people that you can't teach pointers to? And how rich are you?


Absolute claims about humans are always false [1].

1. Yes I know I just made an absolute claim about humans :)


It's very easy to find people you can't teach <insert concept here> to. All it requires is a trip to the residential care home.


That's if you find the $20 worth more than the opportunity to examine the hypothesis that there is some special ability needed to understand pointers, something that may be missing even among otherwise fairly intelligent people. What he's giving you is a tool to test a hypothesis you evidently find valuable enough to have read this far down a page discussing it, until you found a comment you wanted to reply to.

So, is the $20 worth it? Or would you rather be closer to the truth on the hypothesis?


A class of (30?) students is too small to produce a curve which is unambiguously either normal or bimodal. It's always noisy. So if the CS profs, with their thousands of hours of experience with students, chose the bimodal interpretation, I for one tend to believe it.


I don't know much about this, but I wasn't good at studying.

I'm totally living for development and everything to do with it, but most of the formal stuff completely eluded me.

I failed every math lecture at least once.

Because of my projects and thesis I still finished with a 2.7, which isn't good but also not really bad. Maybe I would have been one of those "college dropouts who followed their dream" if companies in Germany didn't look so much at your degrees.

Also I never had the impression our grades were bimodal.

Yes, about 40% dropped out, and yes, there were a few "eager beavers", but most were simply okay enough to finish their degrees.


I'm about the opposite of you. Not a good programmer, but I excelled at the 'formal stuff.' Automata theory, regular expressions, compilers, operating systems, discrete math... these are all areas where I did well. Labs and programming assignments held me back. Although I will say that I feel infinitely more comfortable working with memory allocation in C than I do with anything involving JavaScript.

I also don't think the overall grades were bimodal. I guess I was in the second mode in programming-intensive courses and the first mode in theory-heavy courses. I also feel that most of my peers were either the same as me or the reverse, like you. Wouldn't that create a single-mode bell curve?

Of course, this is all purely anecdotal


Oh, useful anecdote time!

I was talking to a professor a while back about this, and he reckoned that when the University of Canterbury (in NZ) switched from Java to Python for their 100-level papers, the grades went from bimodal to unimodal. Not sure why that is, and I don't remember him giving me a reason, but I guess it's fair to assume that a lot of students were really struggling with the initial learning curve of Java (boilerplate, static typing, more difficult array and string manipulation) compared to Python, and either just not getting it, or getting frustrated and giving up.

Now I didn't see any numbers to back up his claims, as it was a casual conversation, but I'd be inclined to believe him that they were bimodal.

Anyway, the point I'm making is that the study here references only one university; different universities have different curricula and different teaching styles, which could affect whether the distribution is bimodal or not.


I am not sure we should assume bimodality in programming means we are all special snowflakes. That's a dangerous and self-serving interpretation.

It seems bimodality is correlated with "does it even compile" - the first hump is those who can't get it to compile and the second is those who can but then spread out normally on ability.

I would conjecture that the first hump would be seen in any educational environment where, for example, we took illiterate people and made them write essays. Those who had tried reading and writing in high school would have a better chance of putting ink on paper.

We just are seeing an artifact of software not being taught early enough in everyone.

Edit - how does my spelling corrector turn bimodality into bumps skirt...


They don't consider participation or hours spent studying as explanations for bimodality. This makes their method of determining belief in the "geek gene" questionable, and could invert the moral story of their results: Perhaps lecturers seeing low lecture attendance would like to believe their grade distributions are bimodal, but the data shows the lecturers' impact is at best a tiny factor among many, and grades are determined by factors out of the lecturers' control, such as students' ability.


I've never heard that computer science grades are bimodal; I thought it was that performance in programming is bimodal?

It's always been relatively obvious to me that grades were not bimodal since I could just click and see the grade distribution, which didn't appear to be bimodal.


"Your scores are bimodal!!!!" trying to get the "so what?"

Does that say something (e.g ,bad instructor) or does it support something ( idea that 50% of students should not even be taking course, for example)

Organic Chem class was bimodal which led to dismissal of instructor. Different instructor meant better learning and a different distribution.


Folks may also be interested in the accompanying blog post by the first author:

http://patitsas.blogspot.ca/2016/01/cs-grades-probably-more-...


So... This is a bad study, top to bottom.

Section 3 is based off a single university's final grades.

- There's no discussion of whether assignment grades were being curved before going into the final score (they are in at least some universities). So it's entirely possible that they took normalized data and found that it had normal distributions.

- Even if it's raw data, bimodality would appear per-assignment. Averaging problem sets with papers with tests, and assignments of varying difficulty, threatens to obscure any task-level bimodality.

- Finally, the paper acknowledges the risk of university-specific effects, but can't actually adjust for them. So Section 3 may not generalize at all.

Section 4 is unconvincing in several ways.

- The participant selection was open solicitation from multiple forums, with a high dropout rate after providing tests. That screams selection bias, but isn't acknowledged.

- "Assess this constructed data based on an inaccurate origin" threatens to carry over a prior of "all my real grades are actually bimodal" to the sample data. This would produce the observed results regardless of whether professors are misjudging normal data. This risk would be irrelevant if grades aren't bimodal, but Section 3 didn't sell me on that.

- "Assess this data for bimodality" threatens to prime seeing bimodality for any set of data. This renders it incomparable to unprompted observations about grades.

Supporting Literature is a disaster. It's a hit parade of papers and topics which have or may yet fail to replicate. The male/female candidates paper has a sibling which found the inverse result. The weapon bias study hasn't been found predictive outside the original study. The "brilliance-requiring disciplines" literature is such a disaster that there are multi-thousand-word essays breaking down why it's unconvincing.

I'd say that's not the author's fault, but this is dated September 2016. The male/female candidates result at least should have come with an acknowledgement that other studies have found completely different outcomes.

In summary: The data analysis here is fragile and unconvincing. The human studies misuse unproven effects to draw unsupported conclusions. The supporting literature grounds the preceding mess on other studies which were suspect by the time the paper was written.


I hated trying to read about pointers in C when I was about 15; a few years later a different book explained it more clearly and it all snapped into place for me.


So the assumption being made here is that students either get it or they don't, based on genetic predisposition? I, for one, don't particularly subscribe to the notion that every CS student needs to score an INTP on the Myers-Briggs to be successful.


ENTP makes a better developer anyway. :D

(In case anybody doesn't get the joke, ENTP is the "debater" type, so I am picking a fight about personality type, since that is what the ENTP is supposed to do, so self-referential humor: https://www.16personalities.com/entp-personality)



