FWIW I'm also on CenturyLink FTTH and just a week or two ago noticed latency spikes and packet loss which magically went away after 15 minutes. Good to read this analysis for future reference. I really wish end users had more control over ONT boxes, similar to how we can use our own modems for cable/DSL. DOCSIS-like provisioning by the ISP should be possible.
Off topic, but CenturyLink Fiber still uses PPPoE and 6rd instead of native dual stack in many markets and is unwilling to upgrade to more modern configurations.
AT&T has replaced the ONT+gateway (communicating over RJ45) with an SFP directly integrated in the gateway. According to some people I've talked to, they are no longer issuing the old hardware, and are instead deploying this exclusively. It may make sense, as upgrading would just require swapping the SFP.
I wonder how hard it would be to connect the Nokia 3fe4960ac SFP to a Linux server and initiate DHCP on their 802.1Q VLAN, or PPPoE, or whatever else they may use?
I am not with at&t but with a very large Canadian provider that provides the same setup (Bell).
Yes, you can do just that. You can also use a commercial firewall with an sfp port.
SFP ports can be a little troublesome. There are a few standards (SFP/SFP+) and different link speeds.
Then you need to know your ISP's specific info to configure your hardware. Bell uses PPPoE on VLAN 35. You need your username/password for PPPoE, which is not a given since the technician will usually do the initial config for you.
Another option is to use a media converter. They are very cheap, "dumb" devices that simply convert the fiber to copper Ethernet. You then connect using just about any router that supports VLANs and PPPoE.
If you also get bundled TV or Phone subscription on your fiber link, these will be on another VLAN and connection is much more obscure. Usually the logic is embedded in the all-in-one router they provide.
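For anyone curious what "PPPoE on VLAN 35" looks like on the wire, here is a rough sketch of the first frame a client would send: a PPPoE discovery (PADI) packet inside an 802.1Q tag. The function name and the example MAC address are made up for illustration, the VLAN ID is the Bell value mentioned above, and nothing is actually transmitted. It only builds the bytes.

```python
import struct

ETH_P_8021Q = 0x8100       # 802.1Q VLAN tag protocol identifier
ETH_P_PPPOE_DISC = 0x8863  # EtherType for the PPPoE discovery stage
PPPOE_CODE_PADI = 0x09     # PPPoE Active Discovery Initiation
TAG_SERVICE_NAME = 0x0101  # Service-Name tag type


def build_padi(src_mac: bytes, vlan_id: int) -> bytes:
    """Return a VLAN-tagged PPPoE PADI frame as raw bytes (hypothetical helper)."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 < vlan_id < 4095:
        raise ValueError("vlan_id must be in 1..4094")
    # Empty Service-Name tag: accept whatever service the concentrator offers.
    tags = struct.pack("!HH", TAG_SERVICE_NAME, 0)
    # Version 1 and type 1 share one byte (0x11), then code, session ID 0, payload length.
    pppoe = struct.pack("!BBHH", 0x11, PPPOE_CODE_PADI, 0, len(tags)) + tags
    # Priority 0 and DEI 0, so the tag control field is just the VLAN ID.
    vlan_tag = struct.pack("!HH", ETH_P_8021Q, vlan_id)
    dst_mac = b"\xff" * 6  # PADI is sent to the broadcast address
    return dst_mac + src_mac + vlan_tag + struct.pack("!H", ETH_P_PPPOE_DISC) + pppoe


# Example MAC is a locally administered placeholder, not a real device.
frame = build_padi(bytes.fromhex("020000000001"), vlan_id=35)
```

In practice the router's PPPoE client and VLAN sub-interface do all of this for you. The point is only that the VLAN tag and the PPPoE header are a few fixed fields in front of the payload.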
Good suggestion and question. Another challenge for bring-your-own-ONT is making a clean fiber connection without expensive tools, but I would imagine that's also solvable.
My ONT has a standard single SC connector. The only custom splicework on the install is the run from the street to the service entrance. From there it's an off the shelf single mode SC-SC cable to the ONT.
Knowing little about the GPON protocol, what does the ONT actually contain to authenticate to the network? With some quick web research, it seems like it's a serial number and/or a static password. Would it be possible to replace the ONT with a well documented model that you have flashed with the appropriate identifiers?
You might have to figure out how to take the ISP's provisioning profile and make your own device use those parameters? Then again if the ISP didn't want dodgy devices on their TDM network they should remove the motivation by deploying non-broken gear in the first place.
I tried connecting a Ubiquiti ONT to a Calix OLT once, but couldn't get any combination of settings to make it work. The OLT saw the ONT, but we couldn't get packets to flow. (This was a test network, so no permissions issues or passwords to guess or anything. Just couldn't make a usable profile for the device. I will admit that I really didn't know what I was doing, I just saw one of the ONTs floating around on a Friday afternoon and poked it a bit.)
Wat? If PPPoE is running on the router, then how is the ONT meddling with TCP connections? Is PPPoE being run on the ONT rather than the router? I guess PPPoE isn't encrypted and the ONT could be deencapsulating and reencapsulating frames, but that seems unlikely?
That's weird! I don't know much about PPPoE but I wonder if it would be possible to mess with the framing so that the specific DPI/modification wouldn't work. Like add some nonstandard options to the header, and hope the ONT used fixed offsets for getting addresses.
Given that ONTs probably aren't subject to too much hardware security research, maybe it would be possible to hook up a debugger and NOP out the connection tracking hooks.
I was never happy with the performance of Calix CPE. We used them heavily at my last job, and indeed customers would have all sorts of trouble that we could never reproduce when we sent a tech. My favorite little hack was that I wanted live stats from the OLTs to be in our own database so that it could show up in our support portal and internal CRM and be aggregated for general network health statistics. (i.e. when someone went out to repair a fiber, they could instantly see the customers come back online, or more often... know while they were still out in the field that they didn't fix it) I wrote a program to scrape it (by ssh-ing in, thanks golang.org/x/crypto/ssh!, because their SOAP API returned no useful details), and after running for many days... it caused the OLT to stop routing packets entirely. No Internet routing, no management interface, it just flat out died unrecoverably. Anyway, they blamed my app, so I built them a static binary of the scraper that could run on Windows (they didn't have any Linux boxes) and after much back and forth they traced it down to a race condition between the two redundant processor modules in the OLT. So much whining how it was my fault, when it was their fault.
At the ISP before that we made our own CPE. The leads on that project really understood the Internet and managed to get reasonable latency, even over WiFi. But the incumbents still seem to not know about fq_codel, or how to put more than 4MB of RAM in their devices, and the users suffer as a result. This article reminded me of how mad it makes me, sorry for the rant. (I switched to a different industry where less lasers are involved.)
The CenturyLink CPE for DSL is pretty crap too. I had one that would reboot if you sent a fragmented IPv6 packet, among other problems like the web UI refusing to work if it was on long enough (thankfully EOL and they replaced it without much questions). The replacement didn't reboot, but I was seeing ping times go up to seconds, so I gave up and do PPPoE on my equipment now (I didn't want to ever run PPPoE cause it's stupid, but now I have to)
Yeah. I feel like these products were always designed to be ultra short lifespan devices, but in practice, they live forever. ONTs from 2013 are still being used and sold new, so clearly investing in fixing bugs and testing would have paid dividends. But I guess it's that case where you can't go to Amazon and buy the best ONT; Calix or whomever sells them to an ISP once, everyone has a good round of golf and dinner or whatever, and the customers are stuck being sad (and maybe the ISPs give out token credits; we gave out a lot of token credits). The ISP mostly competes by being available to a customer; if you dig up the street, then they don't really have an option to go elsewhere unless someone else digs up the street.
Hopefully some crazy person will launch an astronomically expensive fleet of satellites that solves these problems once and for all. All you need are a few engineers on the CPE team that give a damn, and you can fix the Internet for everyone in the world.
I remember something about this from a few years back. Can't recall the link now though.
His SSH sessions were constantly timing out. It only happened when he left the SSH session to idle. It turns out his router was dropping the TCP sessions because it considered them dead. He got around it by implementing a "keep alive" packet, of sorts. Very interesting stuff. I don't really work at such a low level in the stack regularly, so it's quite fascinating to see the strange issues people encounter with these tools. Especially when ISPs meddle around with stable protocols.
Also reminds me of how some ISP DNS servers totally ignore TTL values from DNS records[0].
This is a pretty common issue. See https://access.redhat.com/solutions/23874 The keep alive pings can be added on both the TCP and app level. If you ever cross a NAT, you will have some expiry on your connection. It's not really "meddling".
Yes it is. A packet being sent isn't reaching its destination because your ISP is choosing not to forward it? That we've come to expect that broken behavior is the reality that we live in, but a different route would be for the firewall/NAT device to forge an RST to both ends, since it will no longer be forwarding said packets on that TCP connection.
Given all the advances in technology, I don't think that's as bad an idea as it once was.
"choosing not to forward it" is an interesting phrase. NAT needs to have some expiry for each entry, because we don't have unlimited space for that table. Dropping the mapping entry has the same result as the other side becoming unavailable and is an understood state.
You can't just produce RST out of nowhere on the NAT expiry, because that connection may actually be active somewhere else. Consider a replicating pair of NATs - your connection gets moved from one to the other because (network reasons), but the previous one does not get a message about it because (network reasons). If it sent the RST packets it would actively kill live connections which it should not touch anymore.
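To make the expiry argument concrete, here is a toy model of a connection-tracking table with an idle timeout and a size limit. It is only a sketch: the class name, the timeout values, and the rule that a new entry requires a SYN are all assumptions for illustration, not how any particular NAT or ONT is implemented.

```python
from typing import Dict, Tuple

# (inside IP, inside port, remote IP, remote port)
Flow = Tuple[str, int, str, int]


class NatTable:
    """Toy stateful table: entries vanish after idle_timeout seconds of silence."""

    def __init__(self, idle_timeout: float, max_entries: int) -> None:
        self.idle_timeout = idle_timeout
        self.max_entries = max_entries
        self._last_seen: Dict[Flow, float] = {}

    def expire(self, now: float) -> None:
        """Drop every mapping that has been idle longer than the timeout."""
        stale = [f for f, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout]
        for flow in stale:
            del self._last_seen[flow]

    def forward(self, flow: Flow, now: float, is_syn: bool = False) -> bool:
        """Return True if the packet is forwarded, False if it is silently dropped."""
        self.expire(now)
        if flow in self._last_seen:
            self._last_seen[flow] = now  # traffic refreshes the idle timer
            return True
        # Assumption: only a SYN may create a new entry, and only if there is room.
        if is_syn and len(self._last_seen) < self.max_entries:
            self._last_seen[flow] = now
            return True
        return False


table = NatTable(idle_timeout=300.0, max_entries=2)
ssh = ("10.0.0.2", 50000, "203.0.113.5", 22)
table.forward(ssh, now=0.0, is_syn=True)   # mapping created
table.forward(ssh, now=200.0)              # still within the idle window
dropped = not table.forward(ssh, now=800)  # 600 s of silence: mapping is gone
```

Both endpoints still believe the connection exists after the mapping is dropped, which is exactly the idle-SSH symptom described above. The same table also shows why a fixed `max_entries` hurts anyone who legitimately holds many connections open.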
Yeah PuTTY has the keepalives option for exactly this reason. My home router doesn't seem to need them but when I'm out and about on 4G they help. You also have the SO_KEEPALIVE option on TCP connections in general.
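For reference, turning on SO_KEEPALIVE from application code is a one-liner. A minimal sketch in Python follows. The helper name and the timing values are arbitrary choices for illustration, and the per-socket tuning options (TCP_KEEPIDLE and friends) are platform-specific, so the code only sets the ones the local `socket` module exposes.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 15, probes: int = 4) -> None:
    """Ask the kernel to send TCP keepalive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These knobs exist on Linux; other platforms may lack some of them.
    tuning = (("TCP_KEEPIDLE", idle),       # seconds of idle before the first probe
              ("TCP_KEEPINTVL", interval),  # seconds between probes
              ("TCP_KEEPCNT", probes))      # failed probes before giving up
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

The idle value needs to be shorter than whatever timeout the NAT or firewall in the path uses, which you usually have to find by experiment.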
Quoting the article, the cause is identified: "The Calix 716GE-I ONT device is working as designed by activating Denial of Service (DOS) attack prevention when too many connections are established, which includes jumbo or small packets". Sounds like a reasonable feature for residential devices, even if it isn't compatible with the niche use case of running a Tor relay.
Probably the expected market for advanced users who would need this particular feature is tiny. Like, for the Tor relay use case, there are something like 6000 relays worldwide, most of them probably provided by various organizations (where a single operator runs many relays) instead of hobbyists, most of them outside USA, and the vast majority of them using some entirely different network connection not affected by this particular device model in any way. The described scenario ("10000s of concurrent TCP sessions") is literally an edge case for residential use; the article does follow up with "What about BitTorrent or cryptocurrency and Web 3.0 apps?" but none of those have network behavior like that.
Like, perhaps this problem is also affecting other kinds of usage, but the original article does not attempt to claim that, and purely from their example it would be generous to assume that literally dozens of individuals would need this feature and, well, it's not worth building and testing features (even if they're just a configuration option) in this case.
Honestly, for various structural reasons, hobbyists are sort of actively discouraged from running Tor relays. It's less of an issue with middle relays than guard or exit but in practice Tor has a strong reliance on trust in relay operators, so small-bandwidth relays popping up onesy-twosy is much less desirable than institutional operators with significant resources.
Which is all just one reason that, of the set of people running Tor relays on residential internet connections, I'd wager a solid 99% shouldn't be.
A symmetric gigabit with unlimited transit isn't terribly small though. And the super-awesome side tail of residential service continues to march forward. My own home connection is symmetric 10 gbps.
The problem with this logic is that ordinary users don't become the target of a denial of service attack either. If it should exist at all, the default should be off. And if then no one would turn it on, it could just as well not exist.
Ordinary users become a target of DDoS way more often than you would think. These days it tends to be related to competitive multiplayer video games, but I'm sure there's still some IRC drama and small-time Minecraft hosting driving it.
In general it's extremely unlikely unless you are engaging in "high risk behavior," but at the scale of an ISP there are enough users doing that kind of thing (Twitch streaming, etc) that it becomes an appreciable frustration for your network operations.
> These days it tends to be related to competitive multiplayer video games, but I'm sure there's still some IRC drama and small-time Minecraft hosting driving it.
This sounds like the sort of thing with similar prevalence to things like running a Tor node. This might even be an example the other way, when your game server or what have you has thousands of peer connections and this thing breaks it by misinterpreting that as a denial of service too.
I might be misunderstanding but doesn't the feature also help prevent home users' devices becoming part of a DDOS effort (high number of outbound connections)? There are stories here on HN about IoT devices and infected PCs/phones participating in DDOS on command. So I can see an argument that a home gateway device should try and help prevent participation by devices behind it.
In cases like that the correct answer is to detect weird behavior and call the customer on the phone to ask what's going on. If they say they know what it is because they're running Tor or hosting Ubuntu ISOs or playing P2P games or whatever, you don't have to do anything.
If they say they have no idea what you're talking about, you get to tell them they're infected, so they actually fix it instead of typing their bank password into the infected box the next week because you automatically removed the "huh, internet's slow" that might have led them to investigate it otherwise.
I like your idea and agree that implementing it would improve outcomes for customers. However, the ISP would be on the hook for additional customer support; it's a lot more involved to outfit your call center staff with playbooks for explaining exploited devices to an average customer than it is to toss in a semi-autonomous blocker. This does make things worse for "power users", but ISPs may have also found that said users are more willing to pay for special service agreements (a small business account for example).
It isn't at all clear to me that Centurylink sells a separate business-class service to residential addresses. I put in my home address at https://www.centurylink.com/small-business/business-fiber/ and was quoted the ordinary residential price.
Furthermore, you need some kind of ONT for fiber termination and it isn't clear that Centurylink uses a different ONT without this feature for business class customers.
So ISP delivers router that breaks your internet, and they won't replace it with a real ONT?
Then why not simply replace it yourself?
As long as it isn't PON, but just plain AON, that should be relatively straightforward.
There's authentication between the ONT and OLT that you would have to either implement or relay. This is an edge case because of running Tor. The average user isn't going to run into these problems.
An ISP is selling me fiber to transmit bits and an IP address to talk to the rest of the world. How many TCP connections I'm establishing is exactly none of their business unless they start receiving abuse reports (or run CGNAT, but that's not the issue here).
Whoever thought a *stateful ONT* was a good idea should be shot out of a cannon.
Just wait until the connection timers in the ONT don't match your firewall. Then you'll have real fun.
An ISP (I run one) sells a residential connection to you as a user under a number of assumptions that you are like other residential users. That means that your usage these days is roughly 4 Mbps measured at the 95th percentile (in aggregate). When you run a Tor node you cause the following problems:
- your 95th percentile usage is now likely going to be substantially more than 4 Mbps
- your usage is likely to be much more constant (less bursty). This breaks statistical multiplexing amongst residential users. For reference, Netflix with HD video streams tends to burst to 25Mbps for a second and is then idle for 4-5 seconds.
- your usage is now exposing the ISP to DoS attacks and other interesting (read as expensive) problems caused by running a Tor node. This includes legal costs when dealing with investigations into malicious use of the network by nefarious people trying to hide illegal activities via Tor. Yes, your ISP has to bear the cost for legal issues that arise when its users engage in illegal activity over their internet connections.
- your Tor usage is likely to result in the IPs that are used by you getting added to various blacklists. This results in support costs for the ISP when your dynamic IP gets assigned to another user and causes problems for an unrelated customer.
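The bursty-versus-constant point in the list above is easy to see with a little arithmetic. Below is a rough sketch: the helper names are made up, the nearest-rank percentile method is one common convention (billing systems differ), and the sample traces are invented for illustration rather than measured.

```python
from typing import Sequence


def average_rate(burst_mbps: float, burst_s: float, idle_s: float) -> float:
    """Long-run average of a stream that bursts, then sits idle, in a repeating cycle."""
    return burst_mbps * burst_s / (burst_s + idle_s)


def percentile_95(samples_mbps: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of per-interval throughput samples."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers only
    return ordered[rank - 1]


# A stream that bursts at 25 Mbps for 1 s and idles for 4.5 s averages about 4.5 Mbps.
video_avg = average_rate(25.0, 1.0, 4.5)

# Invented traces of 100 measurement intervals each:
bursty_user = [1.0] * 96 + [40.0] * 4  # mostly quiet, a few heavy intervals
steady_relay = [30.0] * 100            # constant load, as a relay might produce

bursty_p95 = percentile_95(bursty_user)   # the heavy intervals fall above the 95th percentile
steady_p95 = percentile_95(steady_relay)  # constant load has nowhere to hide
```

With these made-up numbers the bursty user measures 1 Mbps at the 95th percentile despite peaks of 40 Mbps, while the steady relay measures the full 30 Mbps. That gap is what statistical multiplexing relies on.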
If you really want to do this, colocate a Tor node in a data center. This kind of traffic is perfectly appropriate in commercial circumstances, and the price you pay will reflect the actual cost of the service being delivered. You're not going to cause nearly as much collateral damage with a dedicated internet connection as you will on a residential network.
Yes, Tor has its place, and if you're going to run a Tor node, think long and hard about the impact it will have before doing so. Many smaller ISPs are not at a scale where the company can afford to carry the costs needed to support traffic patterns that are generated by Tor. Small ISPs have to be very careful to balance the line between expanding to serve the needs of our customers and breaking even. Legal budgets only become a thing after an ISP has hundreds of thousands of dollars a month in revenue. Please, don't do something like this to a small ISP that's trying to help bridge the broadband divide. At the very least, run it by them before doing so.
These are all problems that are yours, not mine, unless you've put it in a contract. I don't give a poop about your multiplexing oversubscription. That's a business choice. The bet didn't work out. Data caps are a common way to fix it, but those are in the contract. Notice at no point did CL ever say you can't run a Tor exit node. Of course, a common clause in these types of contracts is that the provider can just drop you at their leisure. That's also an option. But don't implement this hacky nonsense. My actual day job is writing bandwidth/packet rate/connection count limiters, so I'm well aware of how these things work.
Even regardless of the Tor issues, the problem OP is having is related to the quantity of TCP connections, not Tor itself. So the points are irrelevant. He could be connecting to arbitrary HTTP servers and run into the same problem.
And an all-you-can-eat buffet operates on an assumption that their customers won't be large prosumer eaters who stick to the expensive dishes. Even though some are, and they are in fact attracted to the business, it still works out. When it stops working out, or the losses from those customers become significant enough, then they can change their business model and terms.
I say more traffic going to true edge nodes is a good thing. The more vibrant the P2P ecosystem, the harder it becomes for ISPs to discriminate against communications not going to big tech, and the harder it is to monetize user surveillance. The more customers that view their connection as something for publishing and actively participating, rather than merely consuming, the better off we all are.
If you want to implement a bandwidth cap for your users, go right ahead. Just make sure to post it as prominently as your burstable speed. 10TB/month is 32 Mb/sec.
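The cap-to-rate conversion is worth spelling out, since the answer shifts a little with the definition of "TB" and the length of the month. A quick sketch, with a made-up helper name and a 30-day month assumed:

```python
def sustained_mbps(cap_bytes: float, days: float = 30.0) -> float:
    """Constant rate, in megabits per second, that exactly uses up a monthly cap."""
    seconds = days * 24 * 60 * 60
    return cap_bytes * 8 / seconds / 1e6


decimal_cap = sustained_mbps(10e12)        # 10 TB as 10 * 10^12 bytes
binary_cap = sustained_mbps(10 * 2 ** 40)  # 10 TiB as 10 * 2^40 bytes
```

That works out to roughly 31 Mb/sec for decimal terabytes and roughly 34 Mb/sec for binary ones, so the figure quoted above sits in between. Either way it is a small fraction of a gigabit line rate.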
What a wild take. I was going to start a WISP but realized early on I would get potential customers like this. Let’s be real, you’re being cheap and don’t want to shell out extra cash per month for a business line or colocation.
The service that you're describing is usually called dedicated internet access or DIA. It is a distinct service from residential ISPs, and a more costly one for good reasons. Residential and business ISPs operate a shared resource on which they must impose limits to avoid impacts on other customers. This is as true of PON as other last-mile technologies.
Total ballpark, because it depends plenty on your market, proximity to carrier resources, etc, gigabit symmetric DIA tends to be in the neighborhood of $1000-2000 per month. A lot of the variance comes from the fact that it will be delivered by conventional fiber, not PON, in order to avoid resource contention. So trenching is usually involved in the installation, but the price of that is usually amortized into your 3-year contract.
> But what if a large number of TCP connections is intentional?
Sorry, that ship sailed long ago. Carriers have forever put restrictions on how their customers can use their internet connections, such as "no hosting servers" or even not getting a routable IP address. Traffic shaping is part of the deal too.
I think the only means we have to change the situation (in the face of a lack of competition) is to lobby for municipal internet. Or start a company.
It seems like all the real Internet Service Providers have died and all we're left with is web service providers with an incomplete internet implementation. This started with the wireless telcos where it was almost justified; they were late to the game and didn't have enough IPv4. But for established holders of large IP spaces this is exploitation if not outright fraud.
Ace's "Static IP VPN" gives you more IPv4 space and is uncapped, but forces a long-EOL Cisco router and is very slow (meaning 2010 speeds). I had this on-and-off, and pushed my main Internet traffic over it from 2013-2015, but it is less flexible.
I'd love to join Hoppy Network if it had a Seattle PoP, and if I could get an "uncapped" plan or at least a higher bandwidth version (even if it is super expensive). I don't demand it from you, though.
And at the same time I totally understand why you have limits (for real), you don't want people to abuse your service and slow down everything. Maybe I will join, who knows.
I believe technically both UDP and TCP are implemented on top of IP.
But UDP is basically IP + a port number and a checksum, while TCP is IP + a port number and includes a checksum as well, and a bunch of other stuff. So while TCP isn’t exactly implemented on top of UDP, it’s pretty close.
Something very much like TCP can be implemented on top of UDP, with whatever improvements or differences you might want to implement. (It’s just that both sides need to understand the custom TCP-like protocol.)
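To put the "IP + a port number and a checksum" description next to the real layouts, here is a small sketch that packs a UDP header and gives the fixed part of a TCP header for comparison. The function name is hypothetical. The field order follows the standard header definitions, and a zero UDP checksum means "not computed", which IPv4 allows.

```python
import struct

# UDP: source port, destination port, length, checksum (8 bytes total).
UDP_FORMAT = "!HHHH"

# TCP fixed header: ports, sequence number, acknowledgement number,
# data offset, flags, window, checksum, urgent pointer (20 bytes before options).
TCP_FORMAT = "!HHIIBBHHH"


def udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack a UDP header for the given payload (hypothetical helper, checksum not computed)."""
    length = struct.calcsize(UDP_FORMAT) + len(payload)  # length covers header plus data
    return struct.pack(UDP_FORMAT, src_port, dst_port, length, checksum)


header = udp_header(50000, 53, b"hello")
udp_size = len(header)                 # size of the whole UDP header
tcp_size = struct.calcsize(TCP_FORMAT)  # size of the TCP header without options
```

Everything TCP adds beyond the ports and checksum (sequence numbers, acknowledgements, window, flags) is what a TCP-like protocol built over UDP has to carry in its own payload instead.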
FWIW the properties of UDP can be provided by a custom protocol implemented to look like it's TCP. Send acknowledgements for packets that were actually dropped, etc.
I've seen a few pieces of niche Free software that do such a thing to get around super restrictive firewalls.
Glad I didn't pick CenturyLink for fiber when I moved here, but Wave G's incredibly unreliable in its own way which makes me wonder if they're using the same hardware. Kinda wish I picked Google Fiber.
Where is this place that has 3 fiber ISP choices? It is hard enough to find residences with 1 choice of fiber ISP. I have yet to see a single residential location in the US that has more than one option for a fiber ISP.
I can get xfinity, CenturyLink, and USI fibre at my house in Minneapolis. It's great, the competition is such that I pay $50 a month for symmetrical gig fibre.
OP (neelc) and GPP (kevingadd) are both talking about Seattle.
Having any ISP competition at all is relatively new for the Seattle metro. It was essentially only Comcast for a decade. Now we have Centurylink in large parts of the city. I don’t think Google Fiber is widely available here.
Centurylink rates for 940/940 Mbps in Seattle are $65/mo, which isn’t as nice as $50, but not bad. Comcast charged like $80/mo (coax) for ~200/20 as recently as 2017.
Wave G is a crap service. Wave took CondoInternet and threw it down the drain. My "Gigabit" there was actually 10-20 Mbps most of the time. And where I had it, the building had wiring exclusivity to Comcast and Wave, so no Ziply Fiber or Atlas Networks. In fact, if I had to choose between Wave G or AT&T Fiber and its forced router/802.1X, I'd pick AT&T any day.
From what I know, Wave G uses Layer3 Ethernet switches whereas CenturyLink uses GPON.
CenturyLink has a much better service which is sadly held back by the crappy ONT. But if you don't max out your TCP connection count and don't need IPv6 or can live with 6rd/HE.net tunnels, IMHO CenturyLink Fiber beats Wave G.
Google Fiber (actually Webpass) was real solid in my previous place, plus it has native IPv6 which CenturyLink lacks. CenturyLink 6rd sucks, and Hurricane Electric tunnels are getting blocked by sites now.
Most annoyingly, the FreeBSD mirrors are slow via CenturyLink 6rd so I have to route that via Tor since FreeBSD pkg lacks an IPv4-only flag.
CL does have slightly better IPv4 peering and throughput, but Google Fiber/Webpass is totally worth it for $5 more. Yes, Webpass is slightly "slower" on IPv4, due to microwave links, but the real IPv6 and support and even sticky IPs make up for it for me.
Centurylink works great for me; it costs less than Comcast (which enjoyed a decade-long monopoly until recently, and is the only other option where I live), and delivers superior performance. I wish the ONTs didn't do this sort of thing, but I can't say I've noticed it.
I’m not sympathetic to the author at all. You’re essentially using a home ISP for commercial purposes by hosting Tor relays. If you need resilience, then you really ought to colocate at a DC. 10 gbit is not that expensive these days, and you would provide your own switch like mikrotik.
An ISP provides an internet connection. When it doesn't provide an internet connection and only provides a web service with some internet features, it isn't upholding its side of the contract or the advertising. This is far worse than any "speeds up to $x!" lie.
And it's not just tor relays that use a lot of TCP sessions. Pretty much all distributed protocols are going to hold open a lot of TCP connections. This is not a bad thing and it isn't a heavy resource usage. It's normal. What's abnormal are wireless telco style restrictions being applied in contexts where there is no justification for them.
Saying everyone who does more than use a browser should colocate at a DC is disconnected from reality.
I am not sympathetic to you at all. Anyone who wants to run a Tor relay shouldn't need commercial infrastructure to do it.
Also, Tor is not the only service he mentions.
What I am going to do with my internet is my business, not anyone else's. I shouldn't be limited in any way, shape, or form in how I use the internet as long as I stay within the limits of the law.
EDIT: I do not use Tor at all.