They probably can scale up to a point. Thoughtworks has 2100 employees. Google has 45,000.
How would you grade ~1000 code submissions per day, though? You could conceivably do Coursera/Topcoder-style automated grading, but that only gets you so far: it can't distinguish good code from bad code from "copied from Glassdoor" code. It might still be useful as an initial filter. You'd also have to keep writing new questions with associated grading scripts, since the problems would inevitably leak.
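To make the "initial filter" idea concrete, here's a rough sketch of what a per-question grading script might look like. Everything in it is hypothetical (the `grade` function, the test cases, the two sample submissions); it's not how Coursera or Topcoder actually grade, just the simplest pass/fail version I can think of.

```python
from typing import Any, Callable, Sequence, Tuple

# Hypothetical test case format for one question: (arguments, expected result).
TestCase = Tuple[Tuple[Any, ...], Any]


def grade(solution: Callable[..., Any], cases: Sequence[TestCase]) -> float:
    """Return the fraction of test cases the submitted function passes."""
    passed = 0
    for args, expected in cases:
        try:
            if solution(*args) == expected:
                passed += 1
        except Exception:
            # A crashing submission fails that case; the grader keeps going.
            pass
    return passed / len(cases) if cases else 0.0


# Two hypothetical submissions for a "reverse a string" question.
def clean_submission(s: str) -> str:
    return s[::-1]


def clumsy_submission(s: str) -> str:
    out = ""
    for ch in s:
        out = ch + out  # quadratic string building, but still correct
    return out


CASES = [(("abc",), "cba"), (("",), ""), (("a",), "a")]

if __name__ == "__main__":
    print(grade(clean_submission, CASES))
    print(grade(clumsy_submission, CASES))
```

Both submissions pass every case and get the same score, which is exactly the limitation: a script like this checks correctness, not quality or originality.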