AWS endpoints are similarly publicly accessible, protected by single access keys or session keys. That's the nature of the Internet. Your VPN endpoints are also publicly accessible; anyone with credentials or an RCE in the VPN endpoint can get through.
At some point you just trust the common shared public key libraries that protect public APIs and VPNs.
This article is working backwards to solve a problem that most development teams shouldn't have. I will first walk along with the article and try out its backwards solution before getting to my point:
If you're using Kubernetes managed by a cloud provider, e.g., Amazon EKS, authentication is already handled through IAM. This is a service that's designed to be exposed to Internet traffic and that AWS is essentially managing for you. If you're able to get past AWS IAM then the fact that your cluster API is public is moot because gaining access to an AWS account that has significant kubectl privileges is likely going to also expose a number of other privileges to make changes in AWS.
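For concreteness, this is roughly what that flow looks like on EKS (cluster name, region, and account ID below are placeholders). kubectl never holds a long-lived cluster secret; the generated kubeconfig shells out to the AWS CLI for a short-lived, IAM-signed token on every call:

    # Write a kubeconfig entry for the cluster (hypothetical name/region)
    aws eks update-kubeconfig --name my-cluster --region us-east-1

    # The resulting kubeconfig uses an exec credential plugin, roughly:
    #   users:
    #   - name: arn:aws:eks:us-east-1:123456789012:cluster/my-cluster
    #     user:
    #       exec:
    #         apiVersion: client.authentication.k8s.io/v1beta1
    #         command: aws
    #         args: [eks, get-token, --cluster-name, my-cluster]

So the thing standing between the internet and your cluster is IAM plus the apiserver's token validation, which is exactly the trust relationship described above.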
But of course, those suggestions bring us here to this article: Amazon suggests using a bastion or a cloud IDE, but we might find that too clunky.
This is where the article offers "a better alternative": Tailscale. It's basically one of those fancy, easy-to-use VPN replacements that promise to get rid of VPNs and bastion hosts in favor of protocol-aware proxying. And that's fine, but it's not cheap compared to a traditional open source VPN if you're using it for a company: https://tailscale.com/pricing
Besides not being very cost-effective, we're still working backwards to the actual issue here. Who needs kubectl access and why are they using it so much in production that a bastion or VPN is annoying? Why don't we just make it so that developers don't have to use kubectl in production anymore? Isn't their code being deployed with an automated pipeline?
There is a very good paper [1] from colleagues of my uni advisor on attackers using open container and kubelet registries. Funnily enough, this was supposed to be my thesis' focus, but they were faster. Happens to the best of us, I guess :)
I think something doesn't quite add up in OP's claim. Having full access to the control plane effectively lets you do anything you want with the cluster. If there were really millions of completely unsecured Kubernetes instances lying around, we should see all kinds of worms and botnets already taking advantage of them - and we should also see lots of attempted probes/attacks to infect new instances as soon as they come online (dark forest, etc). Yet I haven't heard about anything like this so far. This leads me to believe that the "public" instances actually are protected in some way, just not by a private network.
Not a Kubernetes expert, but doesn't the Kubeconfig contain some sort of secret that the client needs for authentication? You might be able to get some basic health check without that, but I'd be very surprised if you could do much more without authentication.
At least if that's how it works, this would be an example of "zero-trust" networking, where you just assume anything can be reached from the internet anyway and secure your services based on that premise. If done well, it shouldn't matter whether or not your services are actually exposed to the internet.
Of course, in the interest of defense in depth, I'd still try to make sure my services are not actually exposed - bugs happen all the time, and an additional security layer can't do harm.
Of course that's how it works; there are various authentication methods that must be completed [2] to do anything. The author's claim is that, just as you shouldn't expose RDP to the public internet, you shouldn't expose your k8s API.
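(To make the first part concrete: a kubeconfig bundles the API server URL with either a client certificate or a bearer token. A redacted sketch, with a hypothetical server address:)

    kubectl config view --minify
    # apiVersion: v1
    # clusters:
    # - cluster:
    #     certificate-authority-data: DATA+OMITTED
    #     server: https://203.0.113.10:6443
    # users:
    # - name: admin
    #   user:
    #     client-certificate-data: DATA+OMITTED   # mTLS client cert + key...
    #     client-key-data: REDACTED
    #     # ...or, alternatively, a bearer token (often a JWT):
    #     # token: REDACTED

Without one of those credentials, the apiserver only serves the handful of endpoints that are open to anonymous callers.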
In practice the difference is that while RDP (or database or WordPress) credentials are vulnerable to bruteforce, in almost all cases k8s clusters are secured by either mTLS or JWT tokens - both utterly non-bruteforceable. You're not completely right though; unsecured k8s clusters do happen [1]. And there is some defense-in-depth component, as you've mentioned.
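(You can check which kind of cluster you're dealing with from the outside; the address here is a placeholder:)

    # Anonymous request straight at the API server
    curl -k https://203.0.113.10:6443/api/v1/namespaces/default/pods
    # Properly configured cluster: 401 Unauthorized, or 403 Forbidden
    #   for user "system:anonymous"
    # The unsecured clusters from [1]: a full pod listing, no credentials needed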
And yeah, we're one RCE away from a cloud-melting disaster. But then, the same is true for nginx or openssh.
[2] Fun but meaningless fact: the kube-apiserver defaults for both authentication and authorization are to allow anonymous users and to allow everything to everyone (AlwaysAllow) [3]. This is meaningless because every k8s distribution in existence changes the authorization default with the appropriate flags.
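For reference, a sketch of what those flags look like on a self-managed control plane (exact flag sets and file paths vary by distribution; the CA path here is the kubeadm convention):

    # Replace the upstream AlwaysAllow/anonymous defaults
    kube-apiserver \
      --authorization-mode=Node,RBAC \
      --anonymous-auth=false \
      --client-ca-file=/etc/kubernetes/pki/ca.crt

(In practice many distributions keep anonymous auth enabled but scoped down to health/version endpoints via RBAC.)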
In practice, no one's actually bruteforcing your RDP or database or WordPress. They're using leaked (or common) credentials - which is still a threat to any other type of service.
(That, or they exploit a history of vulnerabilities in the software behind things like RDP or databases - and you should assume that all of the software you're using has vulnerabilities... which are most severe in highly trusted systems like a control plane...)
OpenSSH is carefully designed with security in mind, far more widely used than Kubernetes, with a fairly minimized attack surface. Nginx is probably a bit less carefully designed, but also doesn't generally have full access to the entire system.
Nginx needs to be open to the Internet at large to work (assuming it's running a public website or something). And you probably need some way to manage it from the Internet. I'd say SSH is a pretty good choice there, especially over Kubernetes.
This is the way. Even in a zero trust environment, it’s wise to have your cluster endpoints internal to a VPC. Zero trust doesn’t mean no internal network, but rather no internal trusted zone.
It’s still a great idea to minimize the “blast radius” where you can.
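On EKS, for example, that's a one-line change (cluster name is hypothetical); you can also keep the endpoint public but restrict the source CIDRs as a halfway step:

    # Private endpoint only - reachable from inside the VPC / peered networks
    aws eks update-cluster-config --name my-cluster \
      --resources-vpc-config endpointPrivateAccess=true,endpointPublicAccess=false

    # Or: keep it public, but only for known ranges
    aws eks update-cluster-config --name my-cluster \
      --resources-vpc-config endpointPublicAccess=true,publicAccessCidrs="198.51.100.0/24"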
Back in 2020 I accidentally left my personal k8s API open and got hit by a drive-by k8s-in-k8s bitcoin-miner attack. It deployed an operator that failed to run correctly, but it still created a ton of pods in my default namespace.
Thought about it for a few seconds, and yeah, I'm sticking with meh on this one.
I do have a managed K8s cluster running on one of the major hosting services. It seems I can indeed hit the /healthz endpoint without auth. But I don't see any reason to care. First off, the control plane URL is a UUID tacked onto the hosting service's DNS. Good luck figuring out what it is starting from the URLs of the sites hosted on it. Even if you're my ISP or somehow otherwise in a position to observe traffic to the CP, good luck correlating it to what's actually hosted there. Yeah you can see that the cluster is healthy, but not much of anything else without auth AFAIK.
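Concretely, with a placeholder hostname, that looks something like:

    # CLUSTER_HOST is the provider-generated UUID hostname for the control plane
    curl -k "https://$CLUSTER_HOST/healthz"
    # ok
    curl -k "https://$CLUSTER_HOST/version"
    # {"major":"1","minor":"31",...}   (readable anonymously under default RBAC)
    curl -k "https://$CLUSTER_HOST/api/v1/secrets"
    # 403 Forbidden for user "system:anonymous"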
Second, there's a tension between "network restrictions don't help that much, if your control system is actually secure, it should be secure enough to expose to the clearnet" and "defense in depth always, another layer never hurts anything". I'm leaning towards the first. You can always come up with an excuse to add another layer, but how much does it actually protect you from any reasonably plausible attacker? Lots of stuff far more sensitive and valuable than anything I've got has control interfaces exposed to the net like that, relying solely on the security of OpenSSH, K8s TLS cert system, etc for their protection. If somebody figured out how to bypass that, and chose to use the capability against my stuff, who's to say that any extra layers of security I could dream up would actually slow them down? I think it's not likely, and those extra layers are much more likely to trip me up and let me accidentally lock myself out than to stop any plausible attacker.
The most plausible attacker is, IMO, someone scanning the Internet en masse for 0day or 1day vulns that haven't been patched yet. Which not exposing it to the Internet prevents.
Most (initial) attacks are not targeted at you, they're just scanning the whole Internet.
I know it's considered security through obscurity, but it still makes sense for me to run sshd on a different port. By the same token, it might be a good idea to use a different port than 6443 for the Kubernetes API. It doesn't make anything more secure per se, but it helps evade automated scanners and reduces noise in the logs, which helps with security monitoring.
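Both are small changes if you want them (a sketch; the k8s side only applies to self-managed control planes):

    # sshd: move off port 22, then restart the daemon
    echo 'Port 2222' | sudo tee -a /etc/ssh/sshd_config
    sudo systemctl restart sshd

    # kube-apiserver: the serving port is just a flag on self-managed clusters,
    # e.g. --secure-port=8443 (kubeadm: localAPIEndpoint.bindPort); managed
    # offerings generally don't let you change it.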
Yeah, sometimes I think that quip/rule is unhelpful. The King of England's or the US President's driving routes are known only to the convoy and the staff who need to know, for example. That massively raises the bar for anyone hoping to cause some kind of disruption and reduces the number of threat actors they need to worry about; surely it's fair to say it increases security, and that's why they do it.
I think the line should be 'obscurity can never provide total security' [but it can be a valuable component].
The driving routes are not obscurity but actual secrets. Similarly, a private key is not obscurity but an actual secret.
Something that can be trivially observed (a changed port found by nmap, "secret" obfuscated code in your executable, the behavior of your server) is not a secret.
That said, obscurity can provide defense-in-depth (i.e. protection against scanning the default port).
Changing the port can be useful for other reasons too: our pfSense boxes at work run SSH on 2222 simply because otherwise the logs are absolutely full of scanners getting banned (despite password auth being disabled).
True, driving routes wasn't a great example. But I think the point stands if you take something like the way they drive (number of vehicles, how they manage junctions, etc.) or which exit they use.
You can observe stuff like the way police motorcyclists (I think they call them 'outriders') speed ahead to the next junction, stop traffic, convoy passes through, rear motorcyclists meanwhile catch up to the front of the convoy and the ones holding traffic then adopt their position at the rear, swapping each junction. But there's no benefit to documenting/publicising that (and I'm sure there's nuance to it I haven't picked up on). It's probably also easier to get involved in the police and go on that advanced driving course or whatever and learn all about the techniques they use than it is to actually be assigned to that team - it's not super secret, but the more you keep hidden the harder you make it to be subversive.
Those kubernetes controlplane endpoints are presumably secured with key-based authentication, right? Is it really worse than having an ssh server (with key-based authentication only) on port 22?
Why people expose their core infrastructure on the internet directly and unfiltered always baffles me. I guess they must be SWEs who have never touched a server.
All cloud providers provide public APIs for managing infrastructure, how is kubernetes any different?
Your viewpoint operates on the assumption that a private network provides additional security in comparison to the public internet, which is a very dangerous assumption to make.
It is public, so what? Obviously cloud operators consider it secure enough to offer that option by default. What are the reasons to consider it not secure?
It's a good idea to minimize the attack surface you expose on the public Internet. Basically the default assumption should be to consider everything insecure. Cloud vendors by nature take ownership of protecting all their public facing APIs and they're in the business of offering and securing those. That doesn't mean you should open everything to the Internet.
Not only that but cloud providers put their managed Kubernetes behind IAM.
If you've managed to breach that and you've got Amazon or Microsoft convinced that you are a privileged account owner, it would be trivial to modify the infrastructure to make a private API endpoint public or create a bastion host with access.
While I agree it's likely secure enough, remember that cloud operators such as AWS used to be OK with making S3 buckets public by default, which did cause many issues until they finally changed that (and made it harder to make them public even intentionally).
S3 buckets have never been public by default. They were, however, very easy to make public before things like Block Public Access existed, so lazy devs would just click that button rather than doing proper access control.
Nah, it effectively auths you against whatever SAML/OIDC provider you're using to back Tailscale, along with the ACLs you define. It's not just 'network access = auth'.
There is no such thing as a "private" network that is magically more secure than the internet, and if you don't trust it on the internet, you shouldn't trust it on your private network. If you have anything of value, there are going to be bad actors poking around in your private network eventually.
> there are going to be bad actors poking around in your private network eventually
Sure, but will those bad actors be poking around in your network at the same time as you've got an unpatched vulnerability in Kubernetes? Or will it just be the 1000s of Internet scanners and botnets looking to exploit it?
> There is no such thing as a "private" network that is magically more secure than the internet
Obviously there is, although it's not magic. I could equally say "There is no such thing as a 'private' key that is magically more secure than nothing".
> if you don't trust it on the internet, you shouldn't trust it on your private network
This bit is true. But just because you trust it doesn't mean it needs to be exposed.
"...For reasons I can’t quite understand, we’ve sort of collectively decided that it’s okay to put our Kubernetes control planes on the public internet. At the very least, we’ve sort of decided it’s okay to give them a public IP address..."
Is that what Google does with Borg? ;-)
"...Many of the developers at Google working on Kubernetes were formerly developers on the Borg project. We've incorporated the best ideas from Borg in Kubernetes, and have tried to address some pain points that users identified with Borg over the years..." - https://kubernetes.io/blog/2015/04/borg-predecessor-to-kuber...
Google never did that with Borg. In fact, when I worked there, there wasn't even a way to talk to the Borg API from your laptop other than going via your personal workstation in the office.
I worked at a company where everything was exposed to the internet. There was no VPN, except for getting access to key infra like databases and caches. All our reporting, support tools, etc. were freely available - if you had the right cookies.