You don't need years to produce a phoneme that a native speaker can recognize. Two things matter: being close enough to the actual sound, and working on minimal pairs where confusion can occur. That means you don't need to sweat over how you pronounce /r/ in French, because however you make it, it won't be confused with something else, but you definitely should work on producing Chinese /t/ and /tʰ/ correctly. As for listening, given the sheer amount of audio and video on the web, it is easy to cram the required hours of listening into a few months.
My point is that speaking early on doesn't help with learning the language, and holding off isn't a barrier. On the contrary, speaking early slows down progress. You shouldn't try to speak, or even attempt to write sentences, until you have a large enough vocabulary in your head, with all the proper pronunciation and spelling associations and patterns: large enough to understand, say, 80% of the words in a random article (or more; it takes many years to get to 99% and higher). Otherwise you will learn wrong things and form associations with wrong pronunciations, because a broken methodology forced you to make assumptions, and those take a long time to get rid of and relearn.