Normally I'd point out that these are a lot less capable than GPT-3. But the article ends up fine-tuning GPT-3 Babbage, and multiple free models can outperform Babbage, so this is very solid advice.
How about fine-tuning/testing with Davinci and then scaling down to the other models or HF once you've proven it works? I believe the OpenAI docs propose this approach (minus HF, of course).
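Roughly like this, as a sketch with the legacy openai-python (pre-1.0) client — filenames and hyperparameters are placeholders, and you'd only swap the model string once the dataset is proven:

    import openai  # legacy pre-1.0 client

    # Upload the training data (JSONL of {"prompt": ..., "completion": ...} pairs)
    training_file = openai.File.create(
        file=open("train.jsonl", "rb"), purpose="fine-tune"
    )

    # Prove the dataset and prompt format out on davinci first...
    openai.FineTune.create(training_file=training_file.id, model="davinci")

    # ...then re-run the same job on a cheaper model once results look good
    openai.FineTune.create(training_file=training_file.id, model="babbage")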
That's up to you. Many don't want to open an account and pay in order to explore what's possible. There are LLMs available on the HF Hub, such as google/flan-t5-xl.
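For anyone who wants to kick the tires, here's a minimal sketch using the transformers library — the prompt is just an illustration:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

    # Flan-T5 is instruction-tuned, so plain-English prompts work zero-shot
    inputs = tokenizer(
        "Summarize: The quick brown fox jumps over the lazy dog.",
        return_tensors="pt",
    )
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

No account, no API key — just the model weights download on first run.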