Indeed. I was quite impressed by his solution for randomly iterating through the IP space. I've had use cases before for randomly iterating through a space while ensuring that every element gets hit exactly once, and my solutions have never been quite as efficient in space/time complexity as his.
Can you elaborate on what/how one would randomly iterate a space? I imagine it's like trying to draw a space-filling curve on the plane (for spaces with 2 components, such as a coordinate).
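One common way to do this, sketched here as an illustration and not necessarily the approach masscan itself takes, is to treat the scan as a keyed permutation of the indices 0..size-1: a small Feistel network gives a bijection on a power-of-two range, and "cycle walking" (re-applying it until the result falls in range) trims that down to an arbitrary size. Every index is visited exactly once, in scrambled order, with O(1) memory. The names `permute` and `scan_order`, the SHA-256 round function, and the round count are all assumptions made for this sketch.

```python
import hashlib


def _round(value: int, key: int, rnd: int, mask: int) -> int:
    """Keyed mixing function for one Feistel round (hypothetical choice)."""
    data = f"{key}:{rnd}:{value}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") & mask


def permute(index: int, size: int, key: int, rounds: int = 4) -> int:
    """Map index in [0, size) to a unique value in [0, size)."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    # Use an even bit width so the value splits into two equal halves.
    bits = max(2, (size - 1).bit_length())
    if bits % 2:
        bits += 1
    half = bits // 2
    mask = (1 << half) - 1

    x = index
    while True:
        left, right = x >> half, x & mask
        for rnd in range(rounds):
            # Balanced Feistel step: invertible no matter what _round does.
            left, right = right, left ^ _round(right, key, rnd, mask)
        x = (left << half) | right
        # Cycle walking: if we landed outside [0, size), permute again.
        if x < size:
            return x


def scan_order(size: int, key: int):
    """Yield every value in [0, size) exactly once, in scrambled order."""
    for i in range(size):
        yield permute(i, size, key)


if __name__ == "__main__":
    # A toy "address space" of 10 targets.
    print(list(scan_order(10, key=1234)))
```

For an IPv4 scan, size would be 2**32 and each output would be read as an address. Because the state is just a counter and a key, a scan can be paused, resumed, or split across machines by handing out ranges of the counter.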
Title should be updated to include that this system scans only via IPv4. Doing such a thing with IPv6 would be a little more surprising. (7.9228163e+28 times more difficult)
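For anyone wondering where that number comes from, it is simply the ratio of the two address-space sizes, 2^128 / 2^32 = 2^96. A quick check (the variable names are only for illustration):

```python
# IPv4 addresses are 32 bits wide, IPv6 addresses are 128 bits wide.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

# How many times larger the IPv6 space is than the IPv4 space.
ratio = ipv6_addresses // ipv4_addresses

print(ratio == 2 ** 96)   # True
print(f"{ratio:.7e}")     # 7.9228163e+28
```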
There is a false assumption that IPv6 will make mass scanning like this impossible. In reality you just need to be more clever about it. (Remember way back when people used "needle in a haystack" security for dial-up systems, because nobody would ever have the resources to call every phone number in an area code?)
Link-local multicast (the replacement for ARP) allows tools like alive6 to very easily enumerate all live v6 addresses on a network. So once a spear phishing attack is successful, you can still scan the entire internal network.
Google hacks like "site:ipv6.*" and passive DNS monitoring allow you to easily separate used vs allocated/announced subnets on remote networks. IPv6 breaks in strange ways when you firewall ICMPv6, so it tends to be left open, which makes ping scanning a subnet much easier.
There was also a great talk (I'll try to dig it up) that covered predictable patterns in DHCPv6 implementations, which let you cut v6 down to a near-v4 search space.
The best part of all is that very few security products on the market really support IPv6 correctly, so I suspect we will see more advanced attacks being possible because of IPv6 in the coming years than things being stopped.
Think of the internet like a tree, where the root is you and all other IPs are the leaves at the end. IPs close together tend to share more of the path through the tree as you attempt to reach them from the root. If you send an overwhelming number of packets in one direction for too long, you have a higher chance of harming nodes (i.e. routers) along that path. Randomizing your end goal on the tree spreads the packet spray evenly across the tree.
This is how they can claim: "... it'll only melt your own network. It randomizes the target IP addresses so that it shouldn't overwhelm any distant network."
Note the "shouldn't"; it was probably added, at least in part, because of real-world use. If I were scanning just a small section instead of the whole Internet, masscan would have less of a space to randomize through, which means the tree is smaller and the shared paths are more frequent.
How likely is this to be used for anti-piracy efforts? I don't hear much about en masse copyright enforcement these days, but it seems like the ability to quickly scan large IP ranges would allow one to periodically (every couple minutes or so) obtain a list of every single seeded file in the US, at least for the people not using a VPN.
That's not how torrents work. You can't connect to a port on a seeder and get a list of torrents it's seeding. Even for trackers, you have to request a specific URL to get a (partial) list of available seeders.
On the other hand, it is easy to crawl DHT. "Crawling BitTorrent DHTs for Fun and Profit" (2010) says "We find that we can establish a search engine with over one million torrents in under two hours using a single desktop PC".
Oh I see, please forgive my ignorance of the torrent protocol. I was under the (apparently mistaken) impression that, like Napster, one could enumerate a user's shared files simply by knowing their IP.
I am not a lawyer, but as long as your local laws do not prohibit you from pinging a given IPv4 address, then I can't imagine any issues. Being technically legal doesn't mean you won't step on some toes though. Everyone who runs full internet scans has reported getting lots of exclusion requests, (baseless) legal threats, and even retaliatory DDoS attacks coming back at their source IPs.
Masscan ships with an exclude list which you would do well to utilize:
If you try to run an Internet-wide masscan without this list, it will stop you and give you a warning about how to use the list. You then either manually override the warning, or use the list.
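To make the idea concrete, here is a rough sketch of what applying an exclude list amounts to. This is not masscan's code or its file format: the one-CIDR-per-line layout with "#" comments and the names `parse_excludes` and `is_excluded` are assumptions for illustration, and the addresses are reserved documentation/private ranges.

```python
import ipaddress


def parse_excludes(lines):
    """Parse CIDR entries, skipping blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_excluded(address, networks):
    """Return True if the address falls inside any excluded network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


if __name__ == "__main__":
    # Hypothetical exclude list contents.
    exclude_lines = [
        "# networks that asked not to be scanned",
        "192.0.2.0/24   # TEST-NET-1",
        "10.0.0.0/8",
        "",
    ]
    excludes = parse_excludes(exclude_lines)

    for target in ["192.0.2.17", "10.20.30.40", "198.51.100.1"]:
        status = "skip" if is_excluded(target, excludes) else "scan"
        print(target, status)
```

A linear check like this is fine for a short list; a scanner working at millions of packets per second would more likely pre-sort the ranges or remove them from the target space up front.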