I suspect the parent's suggestion is more along the lines of 1 snapshot that captures all Skittles in a pack, then ML object detection to classify and count outcome...processes an entire pack in a single shot, as opposed to 1 Skittle at a time.
Why not? Is the problem really just naive color identification? It looks more like classifying uniformity, shape, size, orientation, character imprint, malformation, anomalous objects, etc. Can the problem be solved with a more traditional image processing toolbox? Sure, why not? But what fun is there if we're not swinging HN's favorite rubber mallet at the problem...and then doing it again and again just because.
Keeping with the context of the GP's remark, a solution is more about achieving process speed while minimizing error and human intervention in the face of uncertainty--i.e. single-shot at the package level, which is at least a 50x speed increase over individual mechanical sorting.
Except one of his criteria for whether or not to count a defect was whether it was uniformly coated. So now you’re back to the sorting machine with a tumbler or multiple cameras.
With that additional requirement, consider the following naive 2-camera solution flow:
1. load packet contents on transparent substrate
2. short vibration to randomly disperse
3. image capture top and bottom
4. ML processing
5. dispose of load
6. goto step (1)
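For what it's worth, the only non-obvious part of step 4 is reconciling the two views so a Skittle seen from both sides is counted once. A rough Python sketch of that merge follows; the `Detection` record, the `max_dist` pixel tolerance, and the assumption that the bottom image is simply the top image flipped left-to-right are all mine for illustration, and the detector itself is left out entirely.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical per-Skittle output of the ML step for one camera view."""
    x: float       # centroid, pixels
    y: float
    color: str
    defect: bool


def mirror_bottom(det, width):
    # Assumption: the bottom camera looks up through the platter, so its
    # image is the top view flipped left-to-right.
    return Detection(width - det.x, det.y, det.color, det.defect)


def merge_views(top, bottom, width, max_dist=10.0):
    """Pair each top detection with the nearest mirrored bottom detection.

    A Skittle is flagged defective if either view flags it.
    """
    unmatched = [mirror_bottom(d, width) for d in bottom]
    merged = []
    for t in top:
        best = min(
            unmatched,
            key=lambda b: (b.x - t.x) ** 2 + (b.y - t.y) ** 2,
            default=None,
        )
        if best is not None and (best.x - t.x) ** 2 + (best.y - t.y) ** 2 <= max_dist ** 2:
            unmatched.remove(best)
            merged.append(Detection(t.x, t.y, t.color, t.defect or best.defect))
        else:
            merged.append(t)
    # Anything seen only from below is kept rather than silently dropped.
    merged.extend(unmatched)
    return merged


def summarize(merged):
    """Return (count per color, number of defective Skittles)."""
    return Counter(d.color for d in merged), sum(d.defect for d in merged)
```

Greedy nearest-neighbor matching is good enough when the vibration step leaves the Skittles well separated; touching or stacked pieces would need something smarter.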
Cameras for this class of application are cheap, but if 1 camera is an explicit design requirement, then a simple motor with 90-degree camera mount to capture top, rotate 180-deg, capture bottom is simple enough as well (albeit less reliable given the introduction of a precision electromotive element to the system).
If alignment is critical, then error budget can be appreciably extended by marking corners of transparent rectangular platter with distinct color/shape on top and bottom to serve as fiducials in computer-aided skew correction/alignment, compensating for inevitable drift over time; it could also supplement human validation of archived images so orientation is easily determined.
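To make the fiducial idea concrete, here is a minimal sketch of the skew correction, assuming the corner markers have already been located in the image. It fits a rotation + scale + translation (a similarity transform) by least squares, treating points as complex numbers; the function names and coordinates are illustrative only, and a real rig might need a full perspective correction instead.

```python
import cmath
import math


def fit_similarity(reference, observed):
    """Least-squares fit of observed ~= a * reference + b.

    Points are (x, y) tuples; a and b are complex. abs(a) is the scale,
    cmath.phase(a) the rotation, b the translation.
    """
    ref = [complex(x, y) for x, y in reference]
    obs = [complex(x, y) for x, y in observed]
    n = len(ref)
    ref_mean = sum(ref) / n
    obs_mean = sum(obs) / n
    num = sum((o - obs_mean) * (r - ref_mean).conjugate() for r, o in zip(ref, obs))
    den = sum(abs(r - ref_mean) ** 2 for r in ref)
    a = num / den
    b = obs_mean - a * ref_mean
    return a, b


def correct_point(point, a, b):
    """Map an observed image point back into the reference (platter) frame."""
    z = (complex(*point) - b) / a
    return z.real, z.imag


# Example: platter corners as designed vs. as seen after 2 degrees of drift.
reference = [(0, 0), (100, 0), (100, 50), (0, 50)]
drift = cmath.rect(1.0, math.radians(2.0))
observed = [
    ((complex(x, y) * drift + complex(3, -1)).real,
     (complex(x, y) * drift + complex(3, -1)).imag)
    for x, y in reference
]
a, b = fit_similarity(reference, observed)
drift_degrees = math.degrees(cmath.phase(a))
```

Logging `drift_degrees` per capture would also give you a cheap early warning that the mount is creeping.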
2 fixed cameras, or 1 fixed motor-mounted camera...take your pick? Suggest one?
Wouldn't really consider a discrete vibrator affixed to a solid platter a "machine" per se, but if you want to call it that, sure. I gave the problem 10 sec of back-of-the-envelope thought. 10 sec more suggests you can skip vibrator integration by selecting a more rigid transparent substrate, e.g. glass...Skittles would naturally rattle on that as a package is loaded. 10 sec more suggests you can constrain camera movement even further by using carefully oriented mirrors. How far down the rabbit hole would you like to go?