Sure, but the span of machine languages in actual use occupies a small cluster of the possible design space, and MIX, the language invented by Knuth, is not within this cluster. Maybe computer architectures in the 1960s looked like this; I wouldn't know, I'm not that old.
But it's also a weirdly esoteric, crusty machine language. If it were a Turing tape machine, or an idealized stack machine, that I'd understand. Those are invented architectures that are amenable to analysis, and in which the analysis (e.g. maximum stack size, maximum tape use) reflects fundamental properties of the algorithm.
But MIX is just this weird thing that neither resembles CPUs you are likely to use nor is particularly useful for analysis.
When Knuth wrote MIX, it was indeed a mix of existing machine languages (https://retrocomputing.stackexchange.com/a/18176) and very similar to them — but an improvement over them for pedagogy, e.g. even something as simple as his abstracting away the detail of whether the machine is a binary or a decimal computer.
Of course decimal computers, self-modifying code, and a lot of other things besides went out of fashion, so over 1999–2011 he replaced MIX with MMIX. The MMIX update to TAOCP started going online in 1999 and was published in 2005 (Fascicle 1), and all the individual programs were finally put into book form by Martin Ruckert in 2015 (the MMIX Supplement). The design of MMIX is close enough to many actual CPUs of the present and likely future: it's basically a RISC architecture like MIPS or RISC-V, and in fact Knuth worked closely with John Hennessy and Dick Sites in designing it. Yes, it is “nicer” in some ways to write programs in than real CPUs (e.g. it has a whopping 256 general-purpose registers and handles alignment automatically), but that's a fine choice for pedagogy IMO: you avoid dealing with register spilling and the like, while still being able to look into machine-level concerns like pipelining.
Maybe you read TAOCP at some point before 2005 (or missed the “don't bother learning MIX” part of the preface), in which case your complaint is valid (and of course it's not very convenient to read some parts from a different book), but IMO the machine language being a weird one has not been a problem for over two decades.
I think you’re missing the forest for the trees here. Knuth shouldn’t have been using a machine language for his examples, full stop. Nothing was gained by giving examples in a needlessly complex and obtuse assembly language.
It seems to me the complaint could be one of two things:

(1) TAOCP shouldn't have used an assembly language at all,

or

(2) the specific assembly language used in TAOCP (MIX in the 1960s, MMIX since the 1990s) has a bad design.
In the comment I replied to, it seemed the argument was (2) rather than (1), so that's what I replied to.
As for (1), it has been discussed many times before, but apart from the reasons he himself gives (https://cs.stanford.edu/~knuth/mmix.html#:~:text=Why%20have%... and the preface to TAOCP and to the MMIX supplement — “Indeed, Tony Hoare once told me that I should never even think of condensing these books by removing the machine-language parts, because of their educational value”), in short:
• [quantitative] Most of the book is in English pseudocode, dropping down to assembly language only on occasions when the concreteness is relevant. (See also https://news.ycombinator.com/item?id=38444482 above.)
• [historical] The job Addison-Wesley hired him to do ("you've written a few compilers in machine language, e.g. https://ed-thelen.org/comp-hist/B5000-AlgolRWaychoff.html#7; write a book for us showing others how to do the same") meant a book for working programmers writing real programs, most of whom would be writing in the machine languages of various different machines: no one in the 1960s was writing production compilers in FORTRAN or ALGOL (or COBOL or BASIC!); I may be wrong but I imagine very few compilers were written in “high-level” languages (PL/I notwithstanding) until C took off in the latter half of the 1980s, and Turbo Pascal was written in assembly even in the 1990s. A language that is similar to what serious readers would use, but that abstracts away some of the finicky details and is a bit nicer to teach in, is exactly what MIX and MMIX are. (Yes, MIX was getting seriously out of date by the 1990s; that's valid criticism, and it's why he spent so many years replacing it with MMIX.)
• [cultural] Among people seriously interested in computing/programming/algorithms, on the one hand there are the theorists, most of mainstream academia, who are happy thinking abstractly and asymptotically, not bothering too much with constant factors or machine-level considerations. Knuth, despite being part of academia and having to a great extent spawned this field, is at heart very much on the other side (here not counting the even larger number of programmers who simply do it as a job): that of hackers with “mechanical sympathy”, who simply care about accomplishing a bit more with computers, getting their programs to run faster, and so on. In fact, this subculture is somewhat underground (occasionally surfacing in things like HAKMEM), and Knuth may be one of its few “respected” representatives in academia. For example: his presentation of circular lists has of course included the XOR trick to save one field (https://en.wikipedia.org/wiki/XOR_linked_list) since the very first edition (Exercise 2.2.4–18); see the C sketch after this list. “Packing” tries, so that the children of one node sit among the “holes” of others? Of course (Exercise 6.3–4; it's also how he and his student Liang did hyphenation in TeX, and what he used for the word-count program), unlike CLRS, which barely even mentions tries (https://news.ycombinator.com/item?id=21924265). He has written multiple volumes (still being published) devoted to “backtracking” programs, which many theorists are not interested in because the programs are only applicable in special instances, but Knuth is super excited to be able to count (say) the exact number of knight's tours; there are hundreds of pages each on binary decision diagrams and on “dancing links”, a low-level way of implementing the “undo” operation in backtracking programs. A lot of what he writes is backed by his actually having written the programs and seen how they run; his volume on SAT solvers gives detailed measurements in “mems” (memory accesses). He has spoken multiple times about how skill in programming is being able to zoom in and out across multiple levels of abstraction, knowing how some high-level goal of your program is being achieved by a specific register changing its contents. From this perspective, there is often value in being concrete about what the machine is actually doing.
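To make the first of those concrete, here is a minimal C sketch of the XOR linked-list trick (my own illustration, with invented names; Knuth's presentation is in terms of MIX/MMIX words and fields, not C pointers). The single link field stores prev XOR next, so a traversal only needs to remember which node it came from:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* One link field per node: the XOR of the addresses of the two
       neighbors.  Given a pointer to one neighbor, the other is
       recovered as link ^ neighbor.  (Round-tripping pointers through
       uintptr_t is implementation-defined in ISO C, but works on
       mainstream platforms.) */
    struct node {
        int value;
        uintptr_t link;  /* (uintptr_t)prev ^ (uintptr_t)next */
    };

    static uintptr_t xp(struct node *a, struct node *b) {
        return (uintptr_t)a ^ (uintptr_t)b;
    }

    /* Push a new node onto the front of the list. */
    static void push(struct node **head, int value) {
        struct node *n = malloc(sizeof *n);
        n->value = value;
        n->link = xp(NULL, *head);          /* prev = NULL, next = old head */
        if (*head)
            (*head)->link ^= xp(NULL, n);   /* old head's prev: NULL -> n */
        *head = n;
    }

    int main(void) {
        struct node *head = NULL;
        for (int i = 1; i <= 5; i++)
            push(&head, i);

        /* Traverse forward, remembering the previous node so the next
           one can be recovered from the XOR-ed link field. */
        struct node *prev = NULL, *cur = head;
        while (cur) {
            printf("%d ", cur->value);
            struct node *next = (struct node *)(cur->link ^ (uintptr_t)prev);
            prev = cur;
            cur = next;
        }
        printf("\n");  /* prints: 5 4 3 2 1 (freeing omitted for brevity) */
        return 0;
    }

(The “dancing links” idea mentioned in the same bullet is similarly tiny at the machine level: removing node x from a doubly linked list via x->right->left = x->left; x->left->right = x->right; leaves x's own two link fields intact, so x->right->left = x; x->left->right = x; puts it back, which is exactly the cheap “undo” a backtracking search wants.)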
> I may be wrong but I imagine very few compilers were written in “high-level” languages (PL/I notwithstanding) until C took off in the latter half of the 1980s,
There was an interesting time in the late '70s when many vendors had their own high-level systems-programming languages: CDC's Cybil, Sperry-UNIVAC's PLUS, &c., and these were often used for compiler development as a replacement for assembly language. At Cray, the CFT compiler was written in assembly language, and CFT77 in Pascal.