The Birth of Standard Error (2013) (aueb.gr)
132 points by bcaa7f3a8bbc on Oct 11, 2019 | 35 comments



This might have been a case of parallel invention. When I was using the Michigan Terminal System in 1981, and it wasn't new then, I remember SERCOM/MSINK being distinct from SCARDS/SOURCE and SPRINT/SINK. Yes, that's cards as in punch cards. My impression is that similar concepts and terminology have also existed in the IBM world since $forever. I doubt that the MTS folks and the Bell Labs folks were aware of each other's identical inventions, but if there was a relationship I'd bet on the MTS folks having been first.


I'd posit, though without experience of contemporary systems of the era, that a more accurate title for the article would be 'The Birth of Standard Error in Unix'.


I'm dropping a link [1] for the same post, but on the author's actual webpage; this mirror seems to have stopped updating circa 2016.

He's got some pretty neat stuff in there!

[1] https://www.spinellis.gr/blog/20131211/


This is about Unix stderr, not about Gauss's Central Limit Theorem and related statistical topics.


For those interested in the origins of the standard error in statistics, "The Most Dangerous Equation" by Howard Wainer is a good, short read: https://www.researchgate.net/publication/255612702_The_Most_...


Thanks for posting this, wonderful read. I didn’t really get the example with chromosomes (I would expect some references proving that the X chromosome really does contain genes linked to intelligence), but it was an enjoyable piece of writing.


Interestingly enough, I can’t find mention of the de Moivre equation anywhere except in this article. Does anyone know its more commonly used name?


https://math.stackexchange.com/questions/253486/name-of-de-m...

As far as I can remember, Wainer doesn't actually assert that the equation is known as "de Moivre's equation"; that's just what he calls it in the context of his article.


Thank you!


Its more commonly used name is the (formula for the) standard deviation of the sample mean.
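For reference, a sketch of the relation in question (what Wainer calls de Moivre's equation): for n independent observations with standard deviation σ, the standard deviation of the sample mean is

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```

so quadrupling the sample size only halves the spread of the sample mean.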


Right, thank you!


Thank you!

This is awesome


As someone who is prone to overengineering, I wonder if they also tried adding more standard streams after that (e.g., debug or verbose, different levels you would find in logging libraries today).

I think history has vindicated just out/err, but that couldn't have been obvious to the original designers?


It'd be useful if there were a standardized way for applications to expose additional streams that worked exactly like stdin/stdout/stderr but with custom names.

At first glance, named pipes (both the Unix and the Windows approach) sound like that, but their names are global: I can't have an application expose a "commands" output stream that the shell pipes through awk and into a "monitor" input stream of another application, and then run an arbitrary number of instances of those.

You can duct-tape a solution together using randomly created unique global names and some shell script code, but what I think would be nice is something like:

    /* app1 */
    FILE* commands = stream("commands", "w"); /* write-only stream */
    /* app2 */
    FILE* terminal = stream("terminal", "r"); /* read-only stream */
that could be used via something like (:foo would be used wherever a file descriptor could be used to specify stream foo)

    # redirect commands stream from app1 to terminal stream of app2
    app1 :commands>&:terminal app2

    # redirect commands stream from app1 to stdout (fd 1)
    app1 :commands>&1

    # pipe commands to awk's stdin then grep then app2's terminal stream
    # regular app1 stdout is not affected and written wherever stdout is
    app1 :commands| awk blah | grep bleh 1>&:terminal | app2
The shell syntax would also need to be extended a bit to allow for parallel pipelines (e.g. app1 could also export a "comments" stream that could be piped to a separate file or to a projector application).


See the Directed Graph Shell (dgsh), which does allow multiple streams across commands. https://www.spinellis.gr/sw/dgsh/


Neat, but I think this would be more useful as a standalone utility that did the launching itself, instead of replacing your existing shell.


> I wonder if they also tried adding more standard streams

Common Lisp standardises a whole bunch of different streams (their names begin & end with asterisks, but I don't think that there's a way for HN to escape those):

- standard-input: probably what you think it is

- standard-output: ditto

- error-output: what a Unix user might call standard-error …

- query-io: a bidirectional stream, used for questions & answers asked of the user interactively

- debug-io: another bidirectional stream, used for debugging

- trace-output: used for print function traces & timing information

- terminal-io: a bidirectional stream representing the terminal itself

Yeah, like a lot of Common Lisp it's overengineered, but there are some intriguing possibilities in there, too.


Not quite the same thing, but Windows has the concept of multiple named streams in the same NTFS file - https://docs.microsoft.com/en-us/windows/win32/fileio/file-s...

So you could write a log file with separate streams for different verbosity levels.


Later versions of Research Unix, I believe, reserved another standard file descriptor for /dev/tty so programs wouldn't need magic knowledge of that pathname.


I've daydreamed about something like this for non-textual I/O (e.g. audio/video streams), so that each audio channel or display window or what have you would be a pair of file descriptors (for input and output).


I remember something similar where something would go wrong with lpr, and printing a PostScript document would instead print the (really long) textual PostScript source code.


lol I remember those days. It's amazing how much PostScript is required to render a few pages of text. I remember one print job I sent that used about 200 pieces of paper after it printed the actual PostScript source instead of parsing and rendering it for some reason.


PostScript, as a concatenative programming language, can be used to write very efficient code. By efficient I mean terse: doing a lot with very little code. However, it's usually autogenerated, so most PostScript code is long and ugly (but, obviously, good enough for its purpose).


Well, when just rendering text, the text and the PostScript for it can correspond pretty much one-to-one. On the other hand, once you start embedding the necessary fonts it quickly becomes big.


What became of all that paper?


Yes, I think that, despite the humorous reaction in the movie Office Space, simply printing something like PC LOAD LETTER was a vast improvement over the PostScript dump that would happen when an error occurred.


Still happens with a variety of operating systems, usually when there's an extra driver or somesuch in between the PostScript code and the printer itself that inadvertently interprets that code as plain text to be printed as such.

Similar thing happens with ZPL/EPL printers as well; I've troubleshot many a workstation where someone expected a shipping label and instead got a long string of "^XA^PW812 [...] ^XZ" across dozens of labels.


It'd be nice if the article said the year of this invention, but it seems to have been in the 1970s. I tried finding earlier references; two candidates:

- IBM JCL has DD statements for SYSERR, SYSIN, and SYSOUT, but I can't find the date that SYSERR was introduced.

- Any old Fortran IV programmer knows that I/O unit 0 is STDERR, unit 5 is STDIN, and unit 6 is STDOUT. And Fortran IV is from 1966 (aka "Fortran 66").

However, I found a 1970 manual for Fortran IV, and at that time unit 0 was illegal (see table 123-3 on page 13-4), so unit 0 must have been added later, which lends some support to the claim made in this article.

http://www.bitsavers.org/www.computer.museum.uq.edu.au/pdf/D...


Please note that Dec-10 Fortran (like all Digital Fortrans) had a huge number of extensions. When I worked at Middlesex Polytechnic (later Middlesex University), we replaced our Dec-10 with two IBM 4381 super-minis, and none of the Digital Fortran code that used them could easily be ported (my own programs were portable, because I adhered fairly rigidly to the Fortran-77 standard).

I don't remember the Fortran 77 standard specifying STDERR, STDOUT or STDIN in the sense that the C and C++ standards do (in lower case), but I might be a bit forgetful; this was over 30 years ago.


In the early days, stdin and stdout were a convention of the shell, not the operating system. E.g., Programmer's Workbench, circa 1977: see execute() in https://minnie.tuhs.org/cgi-bin/utree.pl?file=PWB1/sys/sourc...

In Version 7 (1979), it moved to the kernel - see setregs() in https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/sys...

The V7 kernel model makes it generic - on exec, Unix will close() any file that has the close_on_exec flag set.

I believe that is the critical concept: stderr is supported by an OS-level feature and is inherited across processes, through pipes, forks and execs.

To tie back to the initial case in this thread, this is very useful for the Unix print system because it was/is a maze of squirrelly processes.


Right, and the big question is whether Unix was first to have stderr, or whether it was predated on other OSes.


Oh, that standard error. I assumed it was about the statistical concept. Might be helpful to amend the title so it specifies that.


Ah, the beauty of dogfooding, a.k.a. bootstrapping, as Douglas Engelbart called it¹.

1. https://www.youtube.com/watch?v=agdPQuFr0yg


Somewhat related, and an excellent watch:

The Great 202 Jailbreak - Computerphile

https://www.youtube.com/watch?v=CVxeuwlvf8w


I once saw a video of someone describing this, can't for the life of me find it again.



