rate limiting can be a double-edged sword; you can be better off giving a scraper the highest bandwidth so they are gone sooner. otherwise, something like making a zip or other sort of compilation of the site available may be an option.
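to make that concrete, here's a rough sketch of the idea, not anything a real site necessarily does: a per-client token bucket that, once exhausted, points the scraper at a pre-built archive instead of just throttling it forever. the client id, limits, and archive path are all made up for illustration.

```python
import time
import threading

class TokenBucket:
    """Simple token bucket: `rate` tokens/sec refill, burst up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        with self.lock:
            now = time.monotonic()
            # refill based on elapsed time, capped at capacity
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= cost:
                self.tokens -= cost
                return True
            return False

# per-client buckets; a real setup would key on IP/API key and expire idle entries
buckets: dict[str, TokenBucket] = {}

def handle_request(client_id: str, path: str) -> str:
    bucket = buckets.setdefault(client_id, TokenBucket(rate=5, capacity=20))
    if not bucket.allow():
        # rather than a bare 429, advertise the bulk download so the scraper
        # can grab everything once and leave
        return "429 Too Many Requests - full archive at /site_archive.zip"
    return f"200 OK - served {path}"

if __name__ == "__main__":
    for i in range(25):
        print(handle_request("203.0.113.7", f"/page/{i}"))
```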
just what kind of scraper you have is a concern.
does scraper just want a bunch of stock images;
or does scraper have FOMO on web trinkets;
or does scraper want to mirror/impersonate your site.
the last option is the most concerning, because then either;
scraper is mirroring because your site is cool and the local UI/UX is wanted;
or scraper is phishing, smishing, or otherwise duping your users.
Yeah, good points to consider. I think the sites that would be scraped the most would be those where the data is regularly and reliably up to date, and there's a large volume of it - so not just one scraper but many different parties may try to scrape every page on a daily or weekly basis.
I feel that ruling should have the caveat that if a fairly priced, paid API for getting the publicly listed data is offered, then scrapers must legally use it (say, priced at no more than 5% above the CPU/bandwidth/etc. cost their scraping would impose); ideally there would also be a rule that, at minimum, there be a delay before they republish that data without your permission, so that you, as the platform/source/reason the data is up to date, aren't harmed too - otherwise regular visitors may gradually start going to the competitor publishing the data, which could kill the source platform over time.
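Purely as an illustration of the delay idea (my own sketch, not a real system): a platform could serve live data on its own pages while the bulk/paid API only returns records older than an embargo window, so any republished copy always lags. The in-memory store, record shape, and 24-hour window here are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EMBARGO = timedelta(hours=24)  # hypothetical delay before bulk consumers see updates

@dataclass
class Record:
    key: str
    value: str
    updated_at: datetime

# hypothetical in-memory store; a real site would query its database
STORE: list[Record] = []

def bulk_api_listing() -> list[Record]:
    """What a scraper/bulk API consumer gets: only records older than the embargo."""
    now = datetime.now(timezone.utc)
    return [r for r in STORE if now - r.updated_at >= EMBARGO]

def site_listing() -> list[Record]:
    """What regular visitors to the site see: everything, immediately."""
    return list(STORE)

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    STORE.append(Record("AAPL", "fresh price", now))
    STORE.append(Record("MSFT", "yesterday's price", now - timedelta(hours=30)))
    print("site:", [r.key for r in site_listing()])         # both records
    print("bulk API:", [r.key for r in bulk_api_listing()])  # only the older one
```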