I'm getting lost in this orchestrator world. Can somebody explain the use case for Minikube? Or Microk8s? All claim to be certified as "perfectly like Kubernetes" and show they can be used in production, but are people actually using them? Why?
Kubernetes is considered too complex to deploy for mere mortals.
Here's a rough list of things that I had to do, because I'm doing exactly that right now: trying to deploy Kubernetes.
1. Prepare the server by installing and configuring containerd. A few simple steps.
2. Load one kernel module. Configure the ip_forward sysctl.
3. Install kubernetes binaries from apt repository.
4. Run kubeadm init
5. Install helm (this part is optional, but makes things simpler): download binary or install it from snap.
6. Install network plugin like flannel or calico.
7. Install nginx ingress plugin.
8. Install a storage provider plugin. You also need some storage service, like an NFS server, that Kubernetes can ask for storage.
If you want single-node Kubernetes, I think that should be enough. Maybe #6 is not even necessary. If you want a real cluster, you'll need to tinker with a load balancer, for which I don't have a clear picture right now. I'm using an external load balancer.
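For reference, steps 2-4 boil down to roughly this on a Debian/Ubuntu box (a sketch, assuming the Kubernetes apt repository is already configured; the exact module list and sysctls are in the kubeadm docs):

    # step 2: kernel module + sysctl so pod traffic can be routed
    sudo modprobe br_netfilter
    echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/k8s.conf
    sudo sysctl --system

    # step 3: kubernetes binaries from the apt repository
    sudo apt-get install -y kubelet kubeadm kubectl

    # step 4: bootstrap the control plane; this pod CIDR is the
    # default that flannel expects
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16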
If you're using docker, my understanding is that containerd is already installed.
I actually spent a few weeks trying to understand those parts, and I have only a shallow understanding so far.
With that in mind, simple Kubernetes solutions probably have their place among those who can't or don't want to use managed Kubernetes from the popular clouds.
I have no idea about those simple kuberneteses though.
My opinion is that vanilla Kubernetes is not that hard, and you should have some understanding of its moving parts anyway. But if you want the easy path, I guess it's worth considering.
Flannel and Calico are responsible for assigning pod IPs, so you need them even on a single node.
Another main reason you'd want to run minikube or kind is that these clusters are easy to reproduce and don't pollute your system's network namespace and sysctls.
For the load balancer, in your case you would probably provision MetalLB in place of the cloud-specific LB solutions that cloud providers deploy. It's somewhat straightforward, though I believe the steps are specific to each network provider (flannel, calico, etc.).
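In case it's useful, MetalLB's L2 mode is roughly this (a sketch; the manifest URL is version-pinned for illustration, and the address range is a placeholder for your own network):

    # install MetalLB (check the project docs for the current version)
    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml

    # give it a pool of addresses to hand out to LoadBalancer Services
    kubectl apply -f - <<EOF
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      name: example-pool
      namespace: metallb-system
    spec:
      addresses:
      - 192.168.1.240-192.168.1.250
    ---
    apiVersion: metallb.io/v1beta1
    kind: L2Advertisement
    metadata:
      name: example
      namespace: metallb-system
    spec:
      ipAddressPools:
      - example-pool
    EOF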
Maybe a bit too hacky, but if you only plan to use nginx-ingress + HTTPS (and don't have spare /24s of IPs around), you can set up nginx on each node and run a script that regenerates the nginx config every few minutes (use the stream module to forward ports 80 and 443 TCP/UDP to the ingress nginx).
Then add the IP addresses of the nodes as a wildcard DNS.
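The stream-forwarding bit would look something like this (a sketch; the NodePorts are assumptions, and note the stream block has to be included at the top level of nginx.conf, not inside the http block):

    # roughly what the generator script would emit
    cat > /etc/nginx/stream-ingress.conf <<'EOF'
    stream {
        server {
            listen 80;
            proxy_pass 127.0.0.1:30080;   # assumed HTTP NodePort of ingress-nginx
        }
        server {
            listen 443;
            proxy_pass 127.0.0.1:30443;   # assumed HTTPS NodePort of ingress-nginx
        }
    }
    EOF
    nginx -s reload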
If you've used docker compose and are getting frustrated with the lack of some important features, these lightweight Kubernetes distributions are actually great. Blue/green deployments, support for a whole bunch of storage volumes, load balancing with automatic Let's Encrypt support, and great secret management (the ability to mount secrets as files/directories inside a pod is a killer feature) are the reasons I use Kubernetes instead of docker compose for my side projects, even though I ignore the rest of Kubernetes' features.
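To illustrate the secrets-as-files point, roughly (names here are made up):

    # create a secret and mount it into a pod as a directory of files
    kubectl create secret generic app-creds --from-literal=db-password=hunter2

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo
    spec:
      containers:
      - name: app
        image: nginx    # stand-in image
        volumeMounts:
        - name: creds
          mountPath: /etc/app-creds
          readOnly: true
      volumes:
      - name: creds
        secret:
          secretName: app-creds
    EOF
    # inside the pod, /etc/app-creds/db-password now contains "hunter2"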
> All claim to be certified as "perfectly like kubernetes"
because they are. They are built directly from the Go sources; all are wrappers around the meat of k8s (the various control loops packaged into services like the api-server, kubelet, scheduler, controller-manager, etc., plus etcd itself).
minikube does a big monolithic build for convenience. (it can do this because all involved components are in pure go.)
microk8s is also a distribution of k8s.
almost all distributions have convenience features to help you with installation/setup, but all they do is what "the hard way" setups do: fetch/copy binaries, generate/sync keys, set up storage (devicemapper, btrfs volumes, whatever), set up wrappers that start the binaries with the long list of correct arguments, and arrange for them to start when the node starts (usually by adding systemd services or something).
In a sense they are distributions. Ubuntu and Fedora can also both do it all, and it's not clear where you'd use one vs the other. You're in the hands of different people.
Well also, minikube is meant to facilitate development and testing. We use minikube for local dev, tests, etc. and everything transfers over well to the production k8s cluster
um, they aren't missing anything (but see below). they are k8s, just as you rarely run the Linux kernel without userspace.
so if you want the genuine original mainline experience, you go to the project's GitHub repo: they have releases, and the detailed changelog has links to the binaries. yeey. (https://github.com/kubernetes/kubernetes/blob/master/CHANGEL... .. the client is the kubectl binary, the server has the control plane components, the node binaries have the worker node stuff.) You then have the option to set those up according to the documentation: generate TLS certs, specify the IP address range for pods (containers), and install dependencies like etcd and a CNI-compatible container network layer provider. (If you have set up overlay networking, e.g. VXLAN or Geneve or something fancy with Open vSwitch's OVN, then the reference CNI plugin is probably sufficient.)
at the end of this process you'll have the REST API (kube-apiserver) up and running, and you can start submitting jobs. (These are persisted into etcd, eventually picked up by the scheduler control loop, which calculates what should run where and persists that back to etcd; then a control loop on a particular worker notices that something new is assigned to it and does the thing: allocates a pod, calls CNI to allocate an IP, etc.)
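you can watch that loop from the outside with nothing but kubectl; a trivial sketch:

    # submit a job; the API server persists it to etcd
    kubectl apply -f - <<EOF
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: hello
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "hello"]
          restartPolicy: Never
    EOF

    # the scheduler assigns the pod to a node; the kubelet there runs it
    kubectl get pods --watch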
of course if you don't want to do all this by hand you can use a distribution that helps you with setup.
microk8s is a low-memory, low-I/O k8s distro by Canonical (the Ubuntu folks); they run dqlite (distributed SQLite) instead of etcd to lower I/O and memory requirements. Many people don't like it because it uses snaps.
k3s was started by the Rancher folks (and is mostly still developed by them?),
there's k0s (for bare metal ... I have no idea what that means though), kind (Kubernetes in Docker), and there's also k3d (k3s in Docker)
Worth mentioning there's a middle path, namely kubeadm. That's the "sanctioned" way to bootstrap clusters without going full from scratch and many other distributions actually use it internally.
I don't see why you would use Minikube in production (nor have I ever heard anyone do this), but Minikube is exceptionally helpful for local development, when you want to test against a real Kubernetes API server (as well as test any of your desired orchestration for your component).
This is a great use case I've found as well. If you have a product that is deployed to K8s, the ability to create clusters on demand for testing, whether local or otherwise, is awesome.
Also edge of the network deployments where you want consistency with datacenter deployments but don't have a lot of local compute resources or are otherwise limited.
I am adding another executor to a workflow engine. Minikube is a huge help for my dev work (I always test against a Local Real Instance, which is what minikube is).
It's helped on more than one occasion to show that a prod k8s instance lacked a feature or was misconfigured.
I've only used Minikube, kind, and k0s as sandboxes for production kubernetes deployments in the cloud (i.e. EKS). Given I'm already using Docker Desktop on my mac laptop though, the easiest thing to do is just use its built-in kubernetes. It works pretty well, and obviates the need for any of these micro-kubernetes distros.
Minikube is in a different category, alongside kind. These clusters are meant to be disposable for development, and at least for kind, can't be updated easily.
I wish distros would stop making "docker" an alias for "podman"; they are not the same thing, and it breaks all light-k8s implementations. (Looking at you, Red Hat.)
I've encountered cases where podman CLI does not match Docker, specifically for network creation with IPv6--the commands are different. What are you experiencing?
The CLI is not my issue: k3s and kind won't work with podman (or any rootless container, for that matter) out of the box. In both you need to do some non-trivial cgroups configuration on the OS to make it work (in k3s this mode is experimental).
Please excuse my ignorance. I'm aware of how Docker generally operates, at a noob level, i.e. containerizing an application and uploading it to a public container registry for general availability.
Given that understanding, could someone please explain what "rootless" means? I want to understand it in simpler terms :)
Docker runs a daemon with root privileges to start all containers. So if you start a container with "docker run -d ..." you're talking to a privileged process. That in turn means all spawned containers can have root privileges ("docker run -v /etc/shadow ..." to change the root password of your host). "Rootless" means running a container process as a normal user (less attack surface because of fewer permissions). So if you ran "podman run -v /etc/shadow ..." as a normal user, you wouldn't have the permissions needed to open the file.
As simple as possible:
Docker ("normally"):
run every command inside container with full root permissions on host
$root-> Docker -> container
Docker/Podman ("rootless"):
run every command as the current user
$user-> container
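A concrete way to see the difference, assuming a stock root Docker daemon and rootless podman side by side:

    # root docker: the daemon mounts the file as root, so container
    # root can read the host's /etc/shadow
    sudo docker run --rm -v /etc/shadow:/host-shadow:ro alpine head -1 /host-shadow

    # rootless podman: the same mount is mediated by your own uid,
    # so the read fails
    podman run --rm -v /etc/shadow:/host-shadow:ro alpine head -1 /host-shadow
    # => head: /host-shadow: Permission denied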
The other big piece is capabilities (specifically CAP_SYS_ADMIN), which as I understand it is related but kind of orthogonal to the question of root/rootless.
For example, buildah (the container-building part of podman) is daemonless and can use the fuse-overlayfs storage driver to build containers rootlessly— you appear as root inside the container, but from the outside, those processes and any files created are owned by the original invoking user or some shim UID/GID based on a mapping table.
But critically, this doesn't mean it's possible to just run buildah inside any Kubernetes pod and build a container there, because buildah needs to be able to start a user namespace and must have the /dev/fuse device mapped in. I believe there continues to be ongoing work in this area (for example, Linux 5.11 allows overlayfs in unprivileged containers), but the issue tracking it [1] is closed without, IMO, really being fully resolved, since the linked article [2] from July 2021 still describes the different scenarios as distinct special cases that each require their own special sets of flags/settings/mounts/whatever.
Yup, and based on that mapping table, the process inside the container is not allowed to create another namespace and/or fuse-overlayfs. That's why you need to mount /dev/fuse into the container (you might also need cap_sys_admin and cap_mknod). There is another link from Red Hat which also explains it:
Typically, the way a normal Docker installation works is that dockerd (the Docker daemon) is an always-on background service running as root. It exposes a socket file, owned by the 'docker' group with group write privileges, that lets non-root users send commands, effectively acting as a privilege-escalation mechanism. There were at least three reasons the daemon needed to run as root: it has to modify the host routing table to set up an overlay network; only root can create overlay filesystems; and at least some containers themselves have to run as root because they contain files that must be manipulated in some way by uid 0 in the container.
podman in rootless mode gets around these by using slirp4netns to create pure-userspace overlay networks, fuse-overlayfs to create pure-userspace overlay filesystems (or a driver that can't deduplicate storage on older kernels), and uid/gid mapping in user namespaces to create the illusion inside of a container that an application is running as root when it isn't really root on the host.
Additionally, podman gets rid of the daemon and just uses normal fork/exec of the ephemeral podman process.
The upsides are:
- podman can run entirely in home directories and doesn't need to globally install config files or the container filesystems, making it easier for many users to share the same server.
- Running a malicious or compromised container won't compromise your host (big caveat here is unless it can exploit a vulnerability in user namespaces).
- Users who don't have root at all can still run containers. Note that while this appeared to be true with Docker, because you could just be part of the 'docker' group to write to dockerd's socket, effectively that was giving you root (see the sketch below).
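That last point is easy to demonstrate; the classic one-liner (don't run this on a machine you don't own):

    # bind-mount the host's root filesystem and chroot into it:
    # a root shell on the host, no sudo involved
    docker run --rm -it -v /:/host alpine chroot /host /bin/sh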
The biggest downside is the userspace networks and filesystems are slow compared to their in-kernel counterparts, which is why you typically won't see it in any kind of production setting, but minikube is meant to be used as a small-scale mock of production kubernetes run by developers, so it can be a good fit there.
Note that rootless minikube was actually already possible, but way more convoluted than just using rootless podman as the container runtime.
I've seen netavark described as a much faster rootless networking stack. Do you know if that is the case? I know that Podman supports it. Does anything like that exist for storage?
Not an expert at all, but here's how I would simplify it. All corrections are welcome!
Docker has two main components. The daemon (you can think of it somewhat like a server) and the client (application you use to run commands).
When you install docker on your machine, it generally installs both. The daemon is a process that runs on your local machine and runs as root.
Rootless refers to the alternative method (used by podman, for instance) of running containers as a standard user, delegating root-level tasks to something else, like systemd.
> Docker has two main components. The daemon (you can think of it somewhat like a server) and the client (application you use to run commands).
Is the daemon what they call the docker-engine? Is this what's available natively on Linux? Rootless makes sense here, because you wouldn't want one container to be able to interfere with another, or with the Linux system that runs the docker runtime/engine.
For Windows/Mac docker solutions, where does the daemon live/exist/run? Inside a virtualized Linux instance?
As I understand it, most of these alternatives to Docker Desktop are just wrappers around a virtualized Linux image running the docker engine/runtime. That's why many of them require a virtualization engine like VirtualBox. So are these non-commercial solutions just wrappers around one or more virtualized Linux runtimes where the docker engine/runtime runs natively?
If all the above is (approx) correct, then "what" is rootless with this announcement? The docker runtime/engine in the virtualized Linux instance?
I thought the docker engine/runtime on Linux was always able to run rootless docker images. So what is the news here if all these non-commercial solutions are just wrappers around the docker engine/runtime running in a virtualized Linux?
Yes, for Windows and Mac it runs a Linux VM. On Windows it can also use WSL2 as the Linux VM.
Docker-engine is the daemon built by Docker. Podman is an open-source work-alike. Docker-engine doesn't support running as a user other than root; Podman does. This announcement says minikube will work with Podman running as a non-root user.
I remember hearing that development of docker-engine was ceasing, but could obviously live on as it was forked. I guess rootless is some of the work that Docker (company) wanted to keep proprietary and out of this open-source project.
Really quite a shame, although understandable from a commercial perspective.
Assuming that these improvements are finding their way back into an open-source project, I'm glad to hear about this work from minikube and Podman.
It means it uses user namespaces to map a non-root user in the top-level user namespace (where e.g. init runs) to the root user inside the container. This allows the container process to run as root inside its user namespace, retaining the full set of capabilities required to call privileged syscalls or access files owned by root.
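You can see the mapping with nothing but util-linux (assuming your uid is 1000):

    # map your own uid to 0 inside a fresh user namespace; you look
    # like root in there, with no extra power outside it
    unshare --user --map-root-user sh -c 'id -u; cat /proc/self/uid_map'
    # 0
    #          0       1000          1   (uid 0 inside = uid 1000 outside)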
Docker does all its work in a central daemon running as root. Any docker command you run is just sending messages to that central daemon.
You can see some downsides to this in the classic developer setup of having a docker image with your tools and mounting your source tree into the container as a volume for building. When you build, the build products in your filesystem are owned by root, because the code was actually running under the daemon. This can cause all sorts of pain.
When you run something like podman, there's no daemon - it's all just processes running as your user (like any other script) so files created end up on your filesystem owned by you.
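The ownership difference is easy to demonstrate (a sketch, using a throwaway alpine image):

    # same bind-mounted volume, same touch, different owners
    sudo docker run --rm -v "$PWD":/src -w /src alpine touch built-by-docker
    podman run --rm -v "$PWD":/src -w /src alpine touch built-by-podman
    ls -l built-by-*
    # built-by-docker is owned by root; built-by-podman is owned by you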
With podman there is no daemon, everything is running as you. The standard setup for docker has a daemon running as root, which means when you start a container it has root privileges.
Just to make sure: "rootless" is really misleading. As far as I've researched, podman relies on suid binaries or privileged capabilities or both to do its magic. You might as well call it the "capabilitiesful podman driver".
You do need an suid binary to e.g. set a new user id map, since this requires comparing the user id range owned by you to what you're mapping, but you only do it once and it's a simple, secure operation.
It's somewhere in between. You definitely need to enable features that are normally out-of-reach of regular users (i.e. user namespaces, network namespace, unprivileged ping, etc.) However it's still a far cry from full root access, and arguably a smaller surface area than regular run-everything-as-root mode.
While rootless is a curious technical trick I don't understand why the implementation ever left someone's laptop, both file and networking performance are utterly abysmal, which is completely at odds with one of the primary benefits of containers (near zero overhead).
On servers, yes, rootless doesn't make much sense.
But on my dev laptop, "sudo docker" is tiring, and adding your user to the docker group is a big security hole (why does everyone seem to think that "docker run" giving root privileges is OK?!).
This indeed. The Docker team should not include the "adding your user to the docker group"-section in the install documentation. It is very unsafe and even though they link to a document on security implications I don't think all users will truly grasp the implications.
Better to hide this feature and promote the rootless docker mode for local use. On servers you won't be adding any unprivileged user to the docker group in any case.
Tangential, but are there any easy ways to run server applications on bare metal in a way that removes the need for an underlying OS, in order to decrease the overall attack surface an attacker can look for exploits in? (Mainly talking about applications written in Go (TinyGo), Rust, and C++ that can be easily compiled to run on bare metal.)
Unikernels are what you're interested in, but it's not as easy as taking some Linux-based server software and spitting out a bootable image for bare metal. If you strip out the kernel and OS, you lose the network stack and all kinds of system services that most software depends on directly.
I think Google's distroless container images are worth checking out as a quasi-alternative: https://github.com/GoogleContainerTools/distroless You use them as a base for a docker image and copy in your server code. These images are tailor made to strip out _everything_ that's not necessary to run the software--there's no shell for example. So you're still running a Linux kernel, libc, etc. but there's nothing there for an attacker to use other than your app code. You yourself can't even get into a shell to debug or examine what the state of your app is (which can actually be kind of aggravating in development).
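A typical use is a multi-stage build where only your compiled binary lands on the distroless base; a sketch for a Go program (image tags are illustrative):

    cat > Dockerfile <<'EOF'
    # build stage: compile a static binary
    FROM golang:1.21 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /app .

    # final stage: no shell, no package manager, just the binary
    FROM gcr.io/distroless/static-debian12
    COPY --from=build /app /app
    ENTRYPOINT ["/app"]
    EOF
    docker build -t myapp .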
"Distroless" containers are pretty cool for making deployment images. I feel like a better name could have been chosen, because ultimately you are relying on a distribution and how they operate unless you're building an image from scratch and copying in your self-compiled dependencies.
I build my own distroless-like images for personal use using Fedora and RHEL, though I do follow the ubi-micro[0] build steps and include a tiny bit of user space components to enable debugging.
As an alternative to the unikernels the other replies are talking about, which require special builds and might not behave the same, you can also do something pretty simple:
Just run your program as the only process.
As a Linux host with no other software.
No /bin/sh, nothing else in the filesystem.
No, I don't think so. In fact, rootless containers can be slower due to user-level networking and overlay storage, but the goal is more isolation and security.