There is a fundamental problem in general (not just in my particular approach): we don't know how to solve non-convex optimization reliably. For many problems, current techniques simply cannot tell whether a minimum we found is local or global.
Of course, better tuning generally yields better optima (that's what much of the field is working on), and it's possible that I'm not applying the best techniques and would see less randomness with a better model. But as far as I understand, even the best models can converge to different local optima depending on initialization.
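The initialization dependence is easy to see even in one dimension. Here is a minimal sketch (the function f(x) = x⁴ − 3x² + x and all hyperparameters are illustrative choices of mine, not from the text above): plain gradient descent started from two different points lands in two different basins, one of which is only a local minimum.

```python
# Illustrative non-convex function f(x) = x**4 - 3*x**2 + x,
# which has two minima: a global one near x ≈ -1.30 and a
# local one near x ≈ 1.13.

def f(x):
    return x**4 - 3 * x**2 + x

def grad(x):
    # derivative of f
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent from initialization x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Same optimizer, same learning rate -- only the initialization differs:
left = gradient_descent(-2.0)   # ends up near the global minimum
right = gradient_descent(2.0)   # ends up near the local minimum
```

Both runs converge (the gradient is essentially zero at both endpoints), yet `f(left) < f(right)`: nothing in the optimizer itself signals that the second run got stuck in a worse basin, which is exactly the difficulty described above.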