How to program computers (kOS)

justincormack · on Dec 12, 2014

There was another thread about kOS a while back [1], which is why I invited Geo to speak: there seemed to be a lot of interest.

[1] https://news.ycombinator.com/item?id=8475809

akkartik · on Dec 12, 2014

Looks like what Geo was covering in the first few minutes at least is summarized in this comment: https://news.ycombinator.com/item?id=8476633

101914 · on Dec 13, 2014

Do you think Whitney ever considered using the NetBSD kernel code instead of Linux kernel code? No doubt, small was a priority. Why not BSD?

And why doesn't kx have binaries available for BSD? They have Solaris. Burning questions I have always had. Maybe it's just based on demand from their licensees?

k and kdb are some of the very few things that keep me interested in programming and computers.

geocar · on Dec 13, 2014

kOS doesn't use the Linux kernel code. It supports a few of the same syscalls:

clone, execve(!), epoll_create, epoll_ctl, epoll_wait, dup2, stat, rename, unlink, getcwd, chdir, fstat, getdents, open, close, read, write, ftruncate, mmap2, munmap, ioctl, gettimeofday, socket, bind, connect, listen, accept, socketpair, setsockopt

execve just runs another k interpreter (regardless of what you tell it to do) though, so it can just be used to run more k programs.

101914 · on Dec 13, 2014

I recall seeing a solicitation on the kOS website for people who could get a Linux kernel down to a certain size. I presumed that meant the intent was to use Linux code; at least I was thinking he would end up getting responses from people with a Linux bias.

But I also recall reading the line "no linux", which really piques my interest. I thought that just meant no Linux userland.

Now I like this project even better!

Sounds like you have access to the kOS source. Lucky you. Enjoy.

anfedorov · on Dec 12, 2014

Not to be confused with kOS [1]?

1. http://ksp-kos.github.io/KOS_DOC/

justincormack · on Dec 12, 2014

No, it's from http://kx.com/ see also https://en.wikipedia.org/wiki/K_(programming_language)

akkartik · on Dec 13, 2014

I tried to run K, but the link at http://kparc.com requires a password. Is it not generally available yet? Is there a mailing list or IRC channel somewhere?

https://github.com/kevinlawler/kona doesn't run edit.k successfully.

voidiac · on Dec 13, 2014

The free version of K can be downloaded at http://kx.com/software-download.php

akkartik · on Dec 13, 2014

But it doesn't look like that version can run edit.k from kparc.com.

So for all practical purposes kx.com and kparc.com are distinct languages/dialects.

Edit: I should clarify that I'm trying to make sense of the specific code in the OP, in particular to play with the views idea (the :: operator). I should have probably put this thread under https://news.ycombinator.com/item?id=8743325

geocar · on Dec 13, 2014

kOS is K5; it's as different from K4 as K4 was from K3.

Views are available in K4/Q/KDB+, so the version on kx.com does work with views. If you want to play with the views idea, try to solve a problem and see where a view helps you.
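
For readers without a k interpreter to hand, here is a rough Python sketch of the general idea behind a view: a definition stored as an expression and evaluated when it is read, so it always reflects the current value of whatever it depends on. The class and method names are made up for illustration, and this toy recomputes on every read; it makes no claim about how K4/Q/KDB+ actually implement views.

```python
class Views:
    """Toy model of dependent definitions (hypothetical, for illustration only)."""

    def __init__(self):
        self._values = {}  # plain variables, roughly like  d:...
        self._views = {}   # dependent definitions, roughly like  K::...

    def set(self, name, value):
        self._values[name] = value

    def view(self, name, fn):
        # fn receives the environment and computes the view's value on demand
        self._views[name] = fn

    def get(self, name):
        if name in self._views:
            return self._views[name](self)
        return self._values[name]


env = Views()
env.set("d", ["123", "4567"])
# K depends on d: the distinct lengths of the entries in d
env.view("K", lambda e: sorted({len(x) for x in e.get("d")}))

lengths_before = env.get("K")
env.set("d", ["123", "45678"])   # change the dependency...
lengths_after = env.get("K")     # ...and the view follows without being redefined
```

The point of the sketch is only that `K` is never reassigned, yet reading it after `d` changes gives the new answer.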

Here's one I did earlier this week; I've got this tool that takes a whole bunch of logfiles of userids and counts; they look like this:

    5033703751425413371 1
    5409789758109122623 4
    10102846067816284236 4

I call these columns `s(sessionid) and `c(count); I can load them with this:

    G:{g::select from(+`s`c!("*I";" ")0:"S"$":",x) where (c>0)}

Now I say something like: G"trk/tnl5/20141207/TRK_ITM" to get one of these structures into the g variable.

I actually have a bunch of filenames:

    t:{R:{(x,"/"),/:$:!"S"$":",x};,/R',/R'R"trk"}[]

So I can load the first one with: G t 0

This is convenient. Since these files are big (1GB each or so), I'll want to work with a subset:

    g:10000#g

This isn't part of my program, it's just something I do while developing. If I run "G" again it'll get the entire file again.

Now, I want to match these against another list of users. I'll do a similar trick:

    D:{d::0:"S"$":",x}

and now I can do: D"users.txt" to put this into the d variable.

My program needs to find the rows of g that match d.

Some of the sessionids in the logfiles are corrupt. They look like this instead:

    7241110807448127691320303160517889791474867567241110807448127691320312936057606400732600664532969577042775 4
    62349552899220301013203170219134834182361546234955289922030101320303170219134834182361546234955289922030101 4

These are wrong. I originally tried to figure them out by finding the longest string that was also at the beginning of the sessionid, but after talking it out with Oleg and Pierre I decided to try to simply match them against the lengths of d:

    ?:#:'d

said I only have two lengths (19 and 20), so this is a much smaller search than what I was trying before!

To do this, I used views:

    K::?:#:'d
    k::"S"$"S",/:$:'K
    f::k!+(K#\:/:g.s)

Now f is an index on g; instead of a `s column it has an `Sn column where n is the length of the key; i.e. I have a `S19 and an `S20 column with the first 19 characters of `s and the first 20 characters of `s respectively.

    s::g.s[&|/f[k]in\:d]
    c::g.c[&|/f[k]in\:d]

Now I can look at the `s and `c vars and do what I want to do; get my unique userids and my counts.

More importantly, I can apply this to all my files (I have a lot of them):

    n:0;m:0;{G x;n+:+/c;m+:+/g.c}'t
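
The matching step above can be restated in Python for readers who don't read k. This is a sketch of the logic as described in the prose, not a translation of the k: the function name `match_rows`, the `users` list, and the corrupt id built by appending digits are all made up for illustration. Only the three well-formed session ids come from the sample above.

```python
def match_rows(rows, users):
    """Keep the (sessionid, count) rows whose id starts with a known user id.

    Only prefixes whose length equals the length of some known id are tried,
    which is the "match against the lengths of d" idea.
    """
    known = set(users)
    lengths = sorted({len(u) for u in known})   # e.g. [19, 20]
    matched = []
    for sid, count in rows:
        # True if the prefix of any candidate length is a known id
        if any(sid[:n] in known for n in lengths):
            matched.append((sid, count))
    return matched


rows = [
    ("5033703751425413371", 1),
    ("5409789758109122623", 4),
    ("10102846067816284236", 4),
    ("5033703751425413371" + "999999", 4),   # hypothetical corrupt id
]
users = ["5033703751425413371", "10102846067816284236"]   # hypothetical users.txt

matched = match_rows(rows, users)
n = sum(count for _, count in matched)   # counts for matched rows
m = sum(count for _, count in rows)      # counts for all rows
```

With only two distinct lengths, each row costs two set lookups, which is why this is a much smaller search than hunting for the longest repeated prefix.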

cturner · on Dec 13, 2014

Yeah, I tried the same and came to the same conclusions. Someone who knew what they were doing could implement a new-k impl from the lang def on kparc. That is all you would need. Alas, I am not that guy.

jodrellblank · on Dec 13, 2014

His suggestion is that the cure for code bloat is, basically, codegolf?

cbd1984 · on Dec 13, 2014

So... is bloat extra features, extra code, or extra resource usage? Is it bloated if you have the same features in fewer lines, or fewer lines with more resource usage?

It's almost as if bloat is a meaningless snarl term with a mostly illusory definition beyond "bad".

geocar · on Dec 13, 2014

If they're "extra" then they are unnecessary; I define code bloat strictly as redundant code inside the source tree.

Redundant code has performance and resource ramifications: Programs that are bigger load slower (more bytes to pull from disk) and run slower (more trips in and out of cache memory).

Redundant code also has other ramifications: Two modules in the same source tree may attempt similar (or identical) algorithms and miss an opportunity to consolidate. One of them might hide a bug, where the other code path was executed more frequently and the bug fixed.

icsa · on Dec 15, 2014

I have seen this on a grand scale - 72MM LOC over multiple versions (10 year span) of multiple products by the same well-known software company. The occurrence of nearly identical code was the most surprising find.

One product had minimal redundancy. It used a plug-in architecture for the entire application - not just for extensions.

cbd1984 · on Dec 16, 2014

Your idea of "superfluous" is my idea of "just enough", and vice-versa.

For example, why have Bentley when we already have Ford? Multiple car manufacturers is surplus, waste, and bloat.

geocar · on Dec 22, 2014

I didn't use the word "superfluous". I define code bloat strictly as redundant code inside the source tree.

Car manufacturers do not exist inside the source tree.

They aren't even obviously unnecessary; after all, people have different tastes about whether they like a Bentley or a Ford.

The computer (however) does not have taste, and it doesn't care whether you like your strcmp or my strcmp.

pherocity_ · on Dec 13, 2014

Meaningless snarl? How DRY is your code, and have any refactoring attempts been made to dry it up? Do you have unused functions, methods, or classes that are there because you think you might need them in the future? Does every binary carry code for other architectures that isn't specific to the current platform of execution? Etc. I could go on with several more examples, but it's the exact opposite of illusory.

cbd1984 · on Dec 13, 2014

If everyone has a different definition, the term has no real definition. That's my point.

pherocity_ · on Dec 13, 2014

Those aren't different definitions, they're all examples of artifact waste. If something has multiple examples, it doesn't mean it has multiple definitions.

cbd1984 · on Dec 16, 2014

Except your example of "extra" is my example of "just enough".

pherocity_ · on Dec 19, 2014

I didn't use the word extra.

codemonkeymike · on Dec 13, 2014

I was thinking the same thing myself. Like your code may be 1/10th the length but you need to find someone with a seemingly rare skill set to maintain it.

geocar · on Dec 13, 2014

But what skill is this?

Is it something we can develop in people? Teach? Practice?

It may be rare that people will do it and invent it on their own, but I know there was a point in my life that I did not want to write code this way, so I know at least one person can change.

codemonkeymike · on Dec 18, 2014

Yes, people can change, that is true, and I don't think anyone goes into a CS or IT program thinking this will be the type of code they write. The skills really are to think differently about code, and functional programming: two things that are not taught in American universities (or companies, for that matter). So I feel like if I were to choose K as the language for a project that I expect to hire others to work on, I would have to build a K community in my area just to have the talent to work on it.

geocar · on Dec 22, 2014

I'm not (yet) advocating we all program in K, but that we start with the simpler problem of trying to talk about programming like scientists: to try to define things only in terms of what we can measure.

The speed at which code runs? The size that the binary is? The size of the source file? How long it takes to build software that runs correctly?

These are things we can measure, and while I suspect once we optimise programming languages for these things we will end up with something that looks like K, I do not think that it will be K because I have noticed that K is very bad at some things that I like to do.

geocar · on Dec 13, 2014

That's an interesting way to put it.

Yes. It might be.

agentultra · on Dec 13, 2014

Are there any post-scripts or notes from the ensuing conversations? An interesting point was made about humans not understanding programming, but the talk then dodged any explanation due to time.

geocar · on Dec 13, 2014

I'm sorry: It wasn't an intentional dodge. It really was getting late in the day.

Only one person was hostile. They wanted to point out that there was no point in writing it all on one line "just to be unreadable".

I told them I thought the readability was improved by having all the words together, but maybe it made it difficult to learn. I noted Arthur had recently annotated edit.k[1] and asked them if that layout made it easier for beginner programmers to learn to read.

I don't think they liked being referred to as a beginner programmer, because they mentioned that they lecture at Cambridge, so of course they know how to program.

But most people were very encouraging: A lot of people wanted to share something they didn't like about different programming languages, and wanted some ideas about how they can learn more.

I got asked to read some more snippets of code [2] so that people could see this and get a better feel for it.

I got a couple of questions about syntax highlighting and development tools, and if there might be some middle-of-the-road solutions that were terse but not-as-terse.

Also: Beers.

[1]: http://kparc.com/edit.k

[2]: http://nsl.com/k/sudoku/

icsa · on Dec 14, 2014

> I told them I thought the readability was improved by having all the words together, but maybe it made it difficult to learn. I noted Arthur had recently annotated edit.k[1] and asked them if that layout made it easier for beginner programmers to learn to read.

The issue, for me, is not readability but context. Even a word or two (e.g. cuts, begins, dims) helps greatly to establish the context of the code. Like having a map before hiking.

When I write k/q, I use the same layout as Arthur, with the addition of comments for each definition on the same line beyond column 80 or 132. Works well for wide monitors.

E.g.

  c::a$"\n";b::0,1+c;d::(#c),|/-':b                / (c)uts, (b)egins, (d)ims
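
For anyone who doesn't read k, here is one reading of what those three names hold, as a Python sketch. It is an interpretation of the one-liner rather than a verified translation: it assumes `a` is the text buffer, that `a$"\n"` yields the positions of the newlines, and that `|/-':b` is the largest gap between consecutive line starts. The function name is made up.

```python
def cuts_begins_dims(text):
    """Return (cuts, begins, dims) for a text buffer.

    cuts   - index of every newline
    begins - index where each line starts (0, then one past each newline)
    dims   - (number of newline-terminated lines, widest line incl. its newline)
    """
    cuts = [i for i, ch in enumerate(text) if ch == "\n"]
    begins = [0] + [i + 1 for i in cuts]
    # distance between consecutive line starts = line length including "\n"
    widths = [nxt - cur for cur, nxt in zip(begins, begins[1:])]
    dims = (len(cuts), max(widths, default=0))
    return cuts, begins, dims


cuts, begins, dims = cuts_begins_dims("ab\ncdef\n")
```

For the two-line buffer above this gives newline positions 2 and 7, line starts 0, 3 and 8, and dimensions of 2 lines by 5 columns.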

I've also experimented with folding editors to keep comments above their respective definition while only taking up one line when folded. Some editors (e.g. jEdit) can display the folded text as a tool tip without expanding the fold.

  / Docs
    (c)uts
    (b)egins
    (d)ims
  c::a$"\n";b::0,1+c;d::(#c),|/-':b

I hope this is useful or helpful.

geocar · on Dec 14, 2014

It is interesting.

Can you talk more about context? How exactly does it help you?

I don't think:

   c::a$"\n";b::0,1+c;d::(#c),|/-':b                / (c)uts, (b)egins, (d)ims

is more clear than:

   c::a$"\n";b::0,1+c;d::(#c),|/-':b

Or for that matter:

   ⚁::⚀$"\n";⚂::0,1+⚁;⚃::(#⚁),|/-':⚂

because they are just symbols to me, and one symbol is as good as another (except for the amount of space they take up on the screen and the ease in recognising it; I don't have many variables named lowercase L).

icsa · on Dec 15, 2014

TL;DR - Map symbols or uses of operators to meanings.

Context is semantics/meaning. I.e. - hints, documentation, use case, approach, etc.

It's not about the code but the description of the code. The code above is the same; however, the comments load the basic intentions into my mental cache. I can then read/use the code without having to analyze every definition completely.

This is especially helpful for k operators that can be overloaded by type or meaning. E.g. x!y can mean create dict, table -> keyed table, keyed table -> table, integer -> enumerated value, ...

A hint (outside of the code) helps me out quite a bit.

Btw, the code above can be written as: d::(#c),|/-':b::0,1+c::a$"\n"

Which seems more clear, given the context.

geocar · on Dec 15, 2014

> I can then read/use the code without having to analyze every definition completely.

What does `(c)uts` tell you?

I'm not actually familiar with this term.

> This is especially helpful for k operators that can be overloaded by type or meaning

We know the types of the data when we're tracing the code path.

> Btw, the code above can be written as: d::(#c),|/-':b::0,1+c::a$"\n"

You should let Arthur know.

icsa · on Dec 15, 2014

> What does `(c)uts` tell you?

Cut points (by \n, in this case) if one were to use the cut operator _ to tokenize a by leading \n.

It's a hint/reminder of the purpose of the definition.

N.B. - This feature (i.e. str$chr) is undocumented in k.txt. I would normally use &a="\n". However, as soon as I saw Arthur's comment, it made sense (and uses less memory).

> We know the types of the data when we're tracing the code path.

My goal is to avoid having to trace the code path and to understand the definition in isolation (or as isolated as possible).

E.g. - you were able to add HOME/END functionality for your Mac without understanding the entire code path.

> d::(#c),|/-':b::0,1+c::a$"\n"

> You should let Arthur know.

Will do. He definitely appreciates concision.

Btw, I'm using k5 for daily use. Are you? If so, we could share notes.

geocar · on Dec 16, 2014

> as soon as I saw Arthur's comment, it made sense

Interesting.

I'd like to talk more about this part.

> If so, we could share notes.

Sure. You want to email me your details?

icsa · on Dec 16, 2014

Sure thing. What's your email address?

agentultra · on Dec 14, 2014

I don't believe they teach rhetoric as a standard component of one's education anymore. The default position is not one of charity but of doubt. A symptom of our time.

I felt as though you were about to make a very critical point and I would very much like to know what it was.

geocar · on Dec 14, 2014

It's on the tip of my tongue; I'm not yet articulate enough to describe this thought clearly.

I had a go here:

https://news.ycombinator.com/item?id=8476294

I'm still working on it though...

justincormack · on Dec 12, 2014

This was from the conference I organized a couple of weeks ago. The other videos are also available [1].

Sorry about the screen visibility on this; as it was being live coded, it was a bit hard to film.

[1] https://operatingsystems.io/

anoother · on Dec 12, 2014

Thanks for organising such a great conference. I missed this talk, so have been looking forward to the videos.

rasz_pl · on Dec 13, 2014

Next time organize some microphone stands too :)

lukifer · on Dec 13, 2014

It was honestly difficult not to be distracted by that.

justincormack · on Dec 13, 2014

Sorry, there was some sort of technical issue with the lectern mike.

rco8786 · on Dec 12, 2014

Feel bad for the guy holding the mic

justincormack · on Dec 12, 2014

Hi, that's me. Last minute issues when organizing a conference...