Think of all that is involved in running a single-machine application. You need to load the code, including shared libraries, into memory; resolve symbols to memory addresses; figure out where to allocate memory from, whether static, stack, or heap; and when it's heap, decide when to swap pages to disk. To move data to and from disk, you need to know the sector and offset, and the start and stop points of other files. When loading from and storing to memory, you need to know which memory blocks belong to which processes. The reason this doesn't seem complex to application developers is that the compiler and the kernel do it for you. You just give a file name and a variable name, and those are automagically resolved to memory addresses and disk sectors, and address spaces are kept separate without you needing to worry about it.
The issue here is that we don't have a kernel and compiler for distributed applications. So instead we have no choice but to expose that complexity to developers. They need to specify the IP address and port of a remote service to invoke. We've solved the IP problem in part with DNS, but then what happens when you want to migrate or load balance? Kubernetes solves this with service discovery: you just name a service, and the container orchestration engine worries about resolving that name to a pod at runtime. But you still need to specify a port. We haven't yet figured out how to automate allocating ports the way we automated register allocation on a CPU, especially when application code itself might need to know them. We still require you to know how much storage you need and to specify that in the definition of a persistent volume. Ideally, it would be as easy as asking for an array of integers in a programming language: the compiler would figure out how much storage the application requires and give you that much from some pre-allocated pool you as a developer don't have to worry about.
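To make that contrast concrete, here is roughly what the manual specification looks like in Kubernetes today (the names, ports, and size below are illustrative, not from any real deployment): a Service gives callers a stable name instead of a pod IP, but the ports still have to be chosen by a human, and a PersistentVolumeClaim demands a storage size nothing computes for you.

```yaml
# The Service gives you a stable DNS name ("billing") so callers never
# see pod IPs -- but the port numbers still have to be spelled out.
apiVersion: v1
kind: Service
metadata:
  name: billing            # callers resolve this name at runtime
spec:
  selector:
    app: billing           # routes to pods carrying this label
  ports:
    - port: 8080           # port clients connect to -- picked by a human
      targetPort: 8080     # port the container listens on -- also by a human
---
# The PersistentVolumeClaim: unlike declaring an array in a program,
# you must state the size up front; no compiler derives it for you.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: billing-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi        # a guess by the developer, not a computed requirement
```

The name resolution is automated away; the numbers in `port`, `targetPort`, and `storage` are exactly the kind of manual bookkeeping the paragraph above is describing.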
But again, there is no such pre-allocated storage pool. There is no such compiler. There is no POSIX filesystem or memory standard for multi-machine networked systems. There's a fragmented landscape of vendor-locked services providing storage servers, database servers, cache servers, HTTP servers, and message queues, some more open than others. Kubernetes is an attempt to provide abstractions that make it possible to define an entire network of such individual servers declaratively, and it's a noble effort. But it's complex because the underlying problem space is complex. Distributed computing is at the point in its evolution right now where single-machine computing was around 1950 or so, when you needed to tell the program exactly where in memory to find and store a variable, exactly where on disk to fetch a block of bytes. Will it ever get to where we are now, with device drivers, compilers, and kernel allocators and schedulers doing all the heavy lifting for you? This is what a Kubernetes engine is trying to be, but it's early in the game. Very early. I don't see the point in writing an article implicitly shaming the developers for trying to provide higher-level abstractions that make the simple cases easier, any more than in criticizing a kernel developer in 1960 for inventing virtual memory, as if they were admitting that symbol-to-address resolution in a multi-processing system is too complex. Of course it's too complex! And we're trying to make it less complex.