OP here. I'm not sure about the details in your link, but my understanding basically lines up with [1]: robots.txt isn't guaranteed to be respected, but it generally is.
FWIW, what I specifically have in robots.txt is
    User-agent: *
    Disallow: /
which seems to work well for me so far (i.e., I do not find my house documentation site on any search engine).
If I understand the link correctly, what Google dropped was support for a particular robots.txt directive that was always undocumented/unsupported (the noindex rule the URL mentions).
I think the point of it was that you could let Google crawl some pages (so it could still follow their links) without indexing them?
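For anyone curious, a hypothetical example of how that unofficial directive was used (the path here is made up, and note Google stopped honoring this in 2019):

    User-agent: Googlebot
    # empty Disallow = crawling allowed everywhere
    Disallow:
    # unofficial rule Google used to honor: crawl these pages but don't index them
    Noindex: /private/

IIRC, the supported ways to get the same effect now are a noindex meta tag or an X-Robots-Tag HTTP header.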
1. https://www.searchenginejournal.com/google-robots-txt-noinde...