One approach that worked well for us on an automated labyrinth project was to operate on a single color channel and template match against an ideal ball, updating the template as lighting conditions changed. Our project was slower moving, though, so the ball didn't deform as much. Our webcam was terrible, so there was probably more visual distortion than I recall.
Edit: we also used a region of interest that moved with the ball position to cut down on computation.
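A rough sketch of what a moving region of interest can look like. The helper names and the `half_size` parameter are made up for this example, so tune the window size to how far the ball can travel between frames.

```python
def roi_around(center_xy, half_size, frame_w, frame_h):
    """Return (x0, y0, x1, y1) of a square window around the ball, clamped to the frame."""
    cx, cy = center_xy
    x0 = max(0, int(cx) - half_size)
    y0 = max(0, int(cy) - half_size)
    x1 = min(frame_w, int(cx) + half_size + 1)
    y1 = min(frame_h, int(cy) + half_size + 1)
    return x0, y0, x1, y1

def to_frame_coords(local_xy, roi):
    """Convert a position found inside the ROI back to full-frame coordinates."""
    return local_xy[0] + roi[0], local_xy[1] + roi[1]
```

With a NumPy image you would crop with `frame[y0:y1, x0:x1]`, search only that crop, and then map the result back with `to_frame_coords`.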
You might have good luck with background removal techniques because your background is stationary. In our case the camera was stationary but the board constantly moved, so that tactic was not very useful for us.
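Since we did not use background removal ourselves, take this as a bare-bones sketch of the simplest variant: differencing against a stored frame of the empty scene. The function names and the threshold of 30 are assumptions. OpenCV also ships ready-made subtractors such as `cv2.createBackgroundSubtractorMOG2` that may cope better with gradual lighting changes.

```python
import numpy as np

def foreground_mask(frame_gray, background_gray, threshold=30):
    """Boolean mask of pixels that differ from the stored background image."""
    # Widen to int16 so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame_gray.astype(np.int16) - background_gray.astype(np.int16))
    return diff > threshold

def mask_centroid(mask):
    """Return the (x, y) centroid of the foreground pixels, or None if there are none."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())
```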
Edit: something else to consider is using a Kalman filter to predict the ball position and smooth out noisy position estimates. We did that and it helped considerably.
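To give an idea of what that involves, here is a minimal constant-velocity Kalman filter in NumPy. It is a sketch rather than what we actually ran: the class name, the state layout `[x, y, vx, vy]`, and the noise values are assumptions you would need to tune. OpenCV's `cv2.KalmanFilter` class is another option.

```python
import numpy as np

class BallKalman:
    """Constant-velocity Kalman filter over the state [x, y, vx, vy]."""

    def __init__(self, x, y, dt=1.0, process_var=1.0, meas_var=25.0):
        self.state = np.array([x, y, 0.0, 0.0], dtype=float)
        self.P = np.eye(4) * 100.0           # large initial uncertainty
        self.F = np.array([[1, 0, dt, 0],    # position += velocity * dt
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],     # only position is measured
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * process_var     # assumed process noise
        self.R = np.eye(2) * meas_var        # assumed measurement noise

    def predict(self):
        """Advance the state one frame and return the predicted (x, y)."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2].copy()

    def update(self, measured_xy):
        """Fold in a detected position and return the smoothed (x, y)."""
        z = np.asarray(measured_xy, dtype=float)
        innovation = z - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.state[:2].copy()
```

Call `predict()` once per frame, then `update()` whenever the detector finds the ball. The predicted position is also a natural place to center the search window for the next frame.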