Of course, if you set up Home Assistant you can firewall them off the internet. That's how I do it too, with an IoT VLAN. It's not how these devices are intended to work, and not how they work if you just follow the manufacturer's instructions for Google/Alexa integration. You're replacing the vendor's cloud service with Home Assistant, effectively.
For example, to get Broadlink devices to stop rebooting every 3 minutes because they can't contact their cloud crap, I had to work out that you have to broadcast a keepalive message on the LAN. It normally comes over their cloud connection, but their message handler also accepts it locally, and thankfully that's enough to reset the watchdog. This involved decompiling their firmware. I think that patch finally got merged into Home Assistant recently.
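Roughly, the workaround is a periodic UDP broadcast. This is only a minimal sketch of the idea, not the actual patch: the port, the payload layout, and the interval below are all assumptions for illustration (the byte layout is modeled on what the python-broadlink ping helper appears to send, so check it against the real firmware or library before relying on it). The names `build_keepalive`, `send_keepalive`, and `run_forever` are made up for this example.

```python
import socket
import time

# All three values are assumptions, not taken from the firmware.
KEEPALIVE_PORT = 80          # assumed UDP port the device listens on
KEEPALIVE_INTERVAL_S = 120   # assumed; must be shorter than the ~3 minute watchdog
BROADCAST_ADDR = "255.255.255.255"


def build_keepalive() -> bytes:
    """Return a placeholder keepalive payload (hypothetical layout)."""
    packet = bytearray(0x30)   # 48 zero bytes
    packet[0x26] = 0x01        # assumed command byte marking a keepalive
    return bytes(packet)


def send_keepalive(address: str = BROADCAST_ADDR, port: int = KEEPALIVE_PORT) -> None:
    """Broadcast one keepalive packet on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_keepalive(), (address, port))


def run_forever() -> None:
    """Keep feeding the watchdog until the process is stopped."""
    while True:
        send_keepalive()
        time.sleep(KEEPALIVE_INTERVAL_S)
```

Calling `run_forever()` from a host on the same VLAN as the devices would send one packet every two minutes. Nothing is read back, since the only goal is to reset the watchdog.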
My point is that this is not the intended use for these devices. Normal people are going to put the gateways on the internet and enable the Google integration; in fact, it's quite likely that the gateways will sign in to some IKEA cloud service as soon as you put them on a network with outgoing internet connectivity, even before you enable the integration.