Routers at this level aren't just scale-ups of your home wifi/NAT box. They aren't even scale-ups of the simple IP routers in a basic IT data closet that manage subnets and whatnot (already much more complex, since they deal with VLANs, subnets, DMZs and VPN issues). At the level of a big networking company they are truly complex beasts.
Just at the IP level they have to deal with (at the edges and across substantial WANs) BGP - a notoriously ugly and fragile protocol. Internal routing protocols such as OSPF are equally ugly and prone to breakage. Many are the tales of some small company misconfiguring their edge routers slightly (say a 1 char typo) and having the entire internet route through their T1, across their LAN, out their backup T1. Other issues are BGP flapping, resulting in scary percentages of lost traffic. This doesn't even cover trickier stuff like routing loops...
Other considerations in big routers are things like AS numbers (ASNs) and peering points. Considerations like traffic cost, SLAs and QoS all go into traffic balancing on such routers. MPLS clouds complicate (and oddly enough simplify) these things as well.
There are also important issues like Anycast, CDN and NAT that largely rely on router tricks and add to the complication.
Finally, on top of all this, is the security concern - you can't just throw a firewall in front of it, as many firewall issues are really routing issues and therefore have to be handled in the router as well.
All these layers interact and affect each other. Any given machine can only handle so much traffic and so many decisions, so something that is drawn as a single router on a networking chart may actually be several boxes cascaded to handle the complexity.
Oh yeah, and switches are getting progressively smarter with other rules and weirdnesses that provide horribly leaky abstractions that shouldn't matter to the upstream router, but turn out to add issues to the configuration and overall complexity on top of it all.
> Many are the tales of some small company misconfiguring their edge routers slightly (say a 1 char typo) and having the entire internet route through their T1, across their lan, out their backup T1.
This is what route filters are for. If you peer with someone and they advertise 0.0.0.0/0 or something equally ridiculous, and you accept this as a valid route, then you deserve to fail (and you deserve a firm stare if you then proceed to advertise it to other peers).
A similar fail on the part of Telstra (http://bgpmon.net/blog/?p=554) was to blame for much of Australia dropping off the map earlier this year.
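To make the route filter idea concrete, here is a minimal Python sketch of an inbound prefix filter. It is not any vendor's syntax or a real router's behaviour; the function name `accept_route`, the length limits and the bogon list are all assumptions picked for illustration, and it only covers IPv4.

```python
import ipaddress

# Hypothetical policy values, chosen for illustration only.
MIN_PREFIX_LEN = 8    # anything broader than a /8 (including 0.0.0.0/0) is rejected
MAX_PREFIX_LEN = 24   # anything more specific than a /24 is rejected

# A deliberately short, non-exhaustive list of ranges that should never
# show up in an advertisement from a peer.
BOGONS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]


def accept_route(prefix, allowed=None):
    """Return True if an advertised IPv4 prefix passes the filter.

    `allowed` is an optional list of prefixes the peer is registered to
    announce (assumed IPv4); if given, the route must fall inside one of them.
    """
    net = ipaddress.ip_network(prefix, strict=True)
    if net.version != 4:
        return False  # this sketch only handles IPv4
    if net.prefixlen < MIN_PREFIX_LEN or net.prefixlen > MAX_PREFIX_LEN:
        return False  # too broad (e.g. a default route) or too specific
    if any(net.subnet_of(bogon) for bogon in BOGONS):
        return False  # private or loopback space
    if allowed is not None:
        allowed_nets = [ipaddress.ip_network(a) for a in allowed]
        return any(net.subnet_of(a) for a in allowed_nets)
    return True


if __name__ == "__main__":
    for candidate in ("0.0.0.0/0", "10.1.0.0/16", "203.0.113.0/24"):
        print(candidate, accept_route(candidate))
```

The per-peer `allowed` list is the part that does most of the work against a leak like the one described: a small customer can then only ever announce its own prefixes, whatever it typos.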
Also:
> Just at the IP level they have to deal with (at the edges and across substantial WANS) BGP - a notoriously ugly and fragile protocol.
I take offence at this. 80% of BGP related issues are due to misconfiguration by a given party, 19% are due to bad or missing route filters and the other 1% are due to bugs in router software. The actual implementation of BGP v4, originally designed back in the early '90s, isn't completely without its issues and behavioural quirks (I'm looking at you, route flaps), but the theory/algorithm behind it is a work of art, and it has coped amazingly with explosive growth, growth that's only going to increase with IPv6.
Without it, there would be no HN.
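For readers wondering how route flaps are usually tamed: one common mitigation is route flap damping, where each flap adds a penalty that decays exponentially, and a route is suppressed while the penalty is too high. The Python sketch below only illustrates that idea. The class name `FlapDamper` and all four constants are illustrative assumptions, not taken from this thread or from any particular router.

```python
# Illustrative parameters (assumptions, not values from any specific device).
PENALTY_PER_FLAP = 1000.0   # added every time the route flaps
SUPPRESS_LIMIT = 2000.0     # above this, stop using/advertising the route
REUSE_LIMIT = 750.0         # below this, a suppressed route is usable again
HALF_LIFE_S = 900.0         # penalty halves every 15 minutes


class FlapDamper:
    """Tracks the damping penalty for a single route."""

    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay of the penalty since the last update.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / HALF_LIFE_S)
        self.last_update = now
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False

    def record_flap(self, now):
        # `now` is a timestamp in seconds; callers pass increasing values.
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty > SUPPRESS_LIMIT:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


if __name__ == "__main__":
    damper = FlapDamper()
    for t in (0.0, 10.0, 20.0):   # three flaps in quick succession
        damper.record_flap(t)
    print(damper.is_suppressed(20.0))
```

The trade-off is that a route which flaps a few times and then stabilises stays unreachable until the penalty has decayed below the reuse limit, which is part of why damping itself has a mixed reputation.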
Thanks for your views on BGP - a lot of my knowledge of it comes from post-incident beers with our network grey-beards when I worked at an ISP, so my views are probably somewhat biased. There is nothing like a network explosion at an ISP to get people ranting about BGP, but I'll admit that the ranting comes from a place of anger and frustration and is largely venting rather than a fair technical assessment.
router tables are stationary woodworking machines in which the vertically oriented spindle of a woodworking router protrudes from the machine table and can be spun at speeds typically between 3000 and 24,000 rpm
excluding static routes (which are then usually advertised to other peers), routING tables are dynamically built and only exist in non-persistent memory.
having up to date backups of router configuration is another matter entirely