No. Both of the requirements "to interact" and "based on what it looks like" require unshakable foundations in reality - which current models clearly do not have.
They will inevitably hallucinate interactions and observations, which reduces reliability. Worse, they will inject a pervasive sense of doubt into any tests they interact with.
Yes, you are correct that it lies entirely in the reputation of the AI.
This discussion leads to an interesting question, which is "what is quality?"
Quality is determined by perception. If we can agree that an AI acts like a user and can use your website, we can assume that a user can use your website, and therefore it is "quality".
For more, read "Zen and the Art of Motorcycle Maintenance"
It's completely at odds with the strengths of LLMs (fuzzy associations, rough summaries, naive co-thinking).