Remember that even ELBs in AWS have IPs that change all the time, and this is itself a source of vulnerabilities in apps that don't respect DNS TTLs (as has been seen in the forums repeatedly: apps stay connected to the previous IP instead of the new one). It's probably safer to retrieve and verify the IP for each request, and cache the result only if the IP is 'safe'. (And just doing IP subnet calculations is non-trivial in most less-common languages.)
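For what it's worth, here is a rough sketch of the 'is this IP safe' check in Python, where the standard library's ipaddress module does the subnet math. The function name is_public_ip and the exact set of rejected ranges are my own assumptions, not a complete blocklist; you would still need to add any ranges that are reachable only from your own network.

```python
import ipaddress

def is_public_ip(text: str) -> bool:
    """Return True only for addresses that look safe to fetch from.

    Hypothetical helper: the rejected categories below are an assumption
    about what counts as "unsafe", not an exhaustive list.
    """
    ip = ipaddress.ip_address(text)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local   # covers 169.254.0.0/16, e.g. cloud metadata
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )
```

Calling is_public_ip("10.0.0.1") or is_public_ip("169.254.169.254") returns False, while a public address such as "8.8.8.8" returns True.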
Also, keep request throttling and HTTP verb checking in place, so the service can't be turned into a proxy for other attacks.
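As an illustration only, a verb allowlist plus a per-client sliding-window throttle could look something like this. The names (allow_request, ALLOWED_METHODS) and the limits are made up for the example; a real deployment would more likely lean on its framework or a shared store rather than in-process state.

```python
import time
from collections import defaultdict, deque

ALLOWED_METHODS = {"GET", "HEAD"}  # assumption: read-only fetches only
MAX_REQUESTS = 5                   # assumption: per client, per window
WINDOW_SECONDS = 60.0

_recent = defaultdict(deque)       # client id -> timestamps of recent requests

def allow_request(client_id, method, now=None):
    """Return True if this client may issue this outbound request."""
    if method.upper() not in ALLOWED_METHODS:
        return False
    now = time.monotonic() if now is None else now
    stamps = _recent[client_id]
    # Drop timestamps that have fallen out of the window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_REQUESTS:
        return False
    stamps.append(now)
    return True
```

The now parameter exists only so the limiter can be exercised without waiting on a real clock.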
Actually, any decision to accept an arbitrary URL should be carefully examined in light of how hard it is to do safely.
That works as long as you manually get the IP for every domain. I.e. if they ask for "blah.com" you have to get the IP, check it, then turn the request into "curl -H 'Host: blah.com' http://IP". (Otherwise there is a race condition: the DNS server can resolve to a different IP address the second time it is queried. See https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use )
1. Resolve hostname and remember the response
2. Verify that the response does not contain any addresses in a private IP space, or any other IP that is only accessible to you
3. Use the IP from step 1 when establishing a connection
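The three steps above can be sketched in Python roughly as follows. This is a minimal example under several assumptions: it handles plain http:// only (HTTPS would also need SNI and certificate checks against the original hostname), it rejects the request if any returned address is non-public, and the names resolve_and_verify, fetch and UnsafeAddress are invented for the illustration. http.client does not follow redirects, so a redirect target would have to go through the same three steps again.

```python
import http.client
import ipaddress
import socket
from urllib.parse import urlsplit

class UnsafeAddress(Exception):
    """Raised when a hostname resolves to an address we refuse to contact."""

def _is_public(ip_text):
    ip = ipaddress.ip_address(ip_text)
    mapped = getattr(ip, "ipv4_mapped", None)  # unwrap ::ffff:a.b.c.d
    if mapped is not None:
        ip = mapped
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)

def resolve_and_verify(host, port, resolver=socket.getaddrinfo):
    """Steps 1 and 2: resolve once, reject if ANY answer is non-public."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise UnsafeAddress(f"{host} did not resolve")
    for addr in addresses:
        if not _is_public(addr):
            raise UnsafeAddress(f"{host} resolves to blocked address {addr}")
    return addresses[0]

def fetch(url, timeout=5.0):
    """Step 3: connect to the vetted IP, sending the original name as Host."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("this sketch handles plain http:// URLs only")
    port = parts.port or 80
    ip = resolve_and_verify(parts.hostname, port)
    host_header = parts.hostname if port == 80 else f"{parts.hostname}:{port}"
    # Connect to the remembered IP; no second DNS lookup takes place.
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"Host": host_header})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

The resolver argument is there so the check can be tried with canned DNS answers; in normal use fetch("http://blah.com/") goes through socket.getaddrinfo exactly once.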
With other solutions, you might end up being vulnerable to DNS rebinding attacks.
Bonus points for doing all your URL fetching in some sort of sandbox that enforces these access rules.