In most cases getting banned is the big issue. The bigger the site, the more advanced its bot detection tends to be. You can use luminati.io to get residential and mobile IPs, but it's pricey.
Some sites will also obfuscate the DOM, i.e. stripping class names and IDs, which complicates data extraction.
http://scrapinghub.com/ has a paid "do it for me" service, which may be an option depending on your budget.
For avoiding bans, having a large IPv6 range can help (e.g. the kind you might get with a VPS at a proper hosting company). As for grabbing the content itself, I've used a lot of frameworks, but I usually end up back at some combination of simple string search and regex.
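A minimal sketch of that string-search-plus-regex combination, using a made-up HTML snippet in place of a fetched page (the markup and field names are purely illustrative):

```python
import re

# Hypothetical HTML snippet standing in for a fetched page.
html = """
<div class="product"><h2>Widget</h2><span class="price">$19.99</span></div>
<div class="product"><h2>Gadget</h2><span class="price">$5.49</span></div>
"""

# Plain string search to find the region of interest,
# then a regex to pull out the individual fields.
start = html.find('<div class="product">')
pattern = re.compile(r'<h2>(.*?)</h2><span class="price">\$([\d.]+)</span>')
products = [(name, float(price)) for name, price in pattern.findall(html, start)]
print(products)  # [('Widget', 19.99), ('Gadget', 5.49)]
```

Fragile if the site changes its markup, but for a stable page it's often less effort than maintaining a full parsing framework.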
Depending on what you're scraping, you might run into a fair few JS-only websites that are a pain to scrape. On top of everything mentioned here, you will need to run those pages through a headless browser like Puppeteer. For these sites you may be able to reverse engineer their APIs and scrape those rather than the pages themselves.
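As a sketch of the reverse-engineered-API approach: JS-only sites typically populate themselves from a JSON endpoint you can spot in dev tools, and parsing that response directly is far easier than scraping rendered HTML. The endpoint path and payload below are made up for illustration:

```python
import json

# Hypothetical JSON response, as a JS-heavy site's XHR endpoint
# (e.g. /api/v1/listings?page=1 -- invented for this example) might return it.
payload = """
{"results": [
  {"title": "Widget", "price": 19.99},
  {"title": "Gadget", "price": 5.49}
], "next_page": 2}
"""

data = json.loads(payload)
titles = [item["title"] for item in data["results"]]
print(titles)  # ['Widget', 'Gadget']
```

In practice you'd fetch the endpoint with an HTTP client (copying the headers the browser sends); no headless browser needed once you know the API shape.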