
The free version of K can be downloaded at http://kx.com/software-download.php


But it doesn't look like that version can run edit.k from kparc.com.

So for all practical purposes, the K from kx.com and the K on kparc.com are distinct dialects.

Edit: I should clarify that I'm trying to make sense of the specific code in the OP, in particular to play with the views idea (the :: operator). I probably should have put this thread under https://news.ycombinator.com/item?id=8743325


kOS is K5; it's as different from K4 as K4 was from K3.

Views are available in K4/Q/KDB+, so the version on kx.com does work with views. If you want to play with the views idea, try to solve a problem and see where a view helps you.
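
For readers without a K interpreter, here is a rough Python sketch of what a view does. In K4/q a top-level name::expr defines a view: it is evaluated lazily, cached, and re-evaluated only after something it depends on has changed. (Inside a function body, :: is a global assignment instead.) The Var and View classes and the names a, b and total are made up for illustration and are not how kdb+ implements views.

```python
class Var:
    """A plain variable that views can depend on."""

    def __init__(self, value):
        self.value = value
        self.dependents = []

    def get(self):
        return self.value

    def set(self, value):
        self.value = value
        for view in self.dependents:
            view.invalidate()  # mark dependents stale; do not recompute yet


class View:
    """Lazily evaluated, cached value derived from other Vars or Views."""

    def __init__(self, compute, *deps):
        self.compute = compute
        self.deps = deps
        self.dependents = []
        self.stale = True
        self.cache = None
        self.evaluations = 0  # only here to show when recomputation happens
        for dep in deps:
            dep.dependents.append(self)

    def invalidate(self):
        self.stale = True
        for view in self.dependents:
            view.invalidate()

    def get(self):
        if self.stale:
            self.cache = self.compute(*(dep.get() for dep in self.deps))
            self.evaluations += 1
            self.stale = False
        return self.cache


a = Var([1, 2, 3])
b = View(lambda xs: [2 * x for x in xs], a)  # roughly  b::2*a
total = View(sum, b)                         # roughly  total::+/b

print(total.get())    # 12
a.set([10, 20, 30])   # marks b and total stale, computes nothing
print(total.get())    # 120, recomputed on demand
```

Reassigning a does no work by itself; the cost is paid the next time total is read, which is what makes views pleasant for interactive development.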

Here's one I did earlier this week; I've got this tool that takes a whole bunch of logfiles of userids and counts; they look like this:

    5033703751425413371 1
    5409789758109122623 4
    10102846067816284236 4

I call these columns `s(sessionid) and `c(count); I can load them with this:

    G:{g::select from(+`s`c!("*I";" ")0:"S"$":",x) where (c>0)}

Now I say something like: G"trk/tnl5/20141207/TRK_ITM" to get one of these structures into the g variable.
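
As a rough Python equivalent of what G does to each file: split every line on the space, keep the id as a string and the count as an integer, and drop rows whose count is not positive. The function name parse_log and the zero-count sample row are made up for illustration. Keeping the id as a string matters because a value like 10102846067816284236 does not fit in a signed 64-bit integer.

```python
def parse_log(lines):
    """Parse 'sessionid count' lines into (s, c) rows, keeping only c > 0."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: silently skip lines that are not two fields
        s, c = parts[0], int(parts[1])
        if c > 0:
            rows.append((s, c))
    return rows


sample = [
    "5033703751425413371 1",
    "5409789758109122623 4",
    "10102846067816284236 4",
    "1234567890123456789 0",  # hypothetical row, dropped by the c > 0 filter
]
g = parse_log(sample)
print(len(g))
```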

I actually have a bunch of filenames:

    t:{R:{(x,"/"),/:$:!"S"$":",x};,/R',/R'R"trk"}[]

So I can load the first one with: G t 0

This is convenient. Since these files are big (1 GB each or so), I'll want to work with a subset:

    g:10000#g

This isn't part of my program, it's just something I do while developing. If I run "G" again it'll get the entire file again.

Now, I want to match these against another list of users. I'll do a similar trick:

    D:{d::0:"S"$":",x}

and now I can do: D"users.txt" to put this into the d variable.

My program needs to find the rows of g that match d.

Some of the sessionids in the logfiles are corrupt. They look like this instead:

    7241110807448127691320303160517889791474867567241110807448127691320312936057606400732600664532969577042775 4
    62349552899220301013203170219134834182361546234955289922030101320303170219134834182361546234955289922030101 4

These are wrong. I originally tried to figure them out by finding the longest string that was also at the beginning of the sessionid, but after talking it out with Oleg and Pierre I decided to try to simply match them against the lengths of d:

    ?:#:'d

told me I only have two lengths (19 and 20), so this is a much smaller search than what I was trying before!

To do this, I used views:

    K::?:#:'d
    k::"S"$"S",/:$:'K
    f::k!+(K#\:/:g.s)

Now f is an index on g; instead of a `s column it has an `Sn column where n is the length of the key; i.e. I have a `S19 and an `S20 column with the first 19 characters of `s and the first 20 characters of `s respectively.

    s::g.s[&|/f[k]in\:d]
    c::g.c[&|/f[k]in\:d]

Now I can look at the `s and `c vars and do what I want to do; get my unique userids and my counts.
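
The same matching idea, sketched in Python for comparison: collect the distinct lengths of the known ids, cut every session id down to each of those lengths, and keep the rows where any prefix is a known id. The function name match_rows and the sample data are illustrative; the corrupt id below is a shortened stand-in for the long ones shown above.

```python
def match_rows(g, d):
    """Return the (s, c) rows of g whose s starts with some id in d."""
    users = set(d)
    lengths = sorted({len(u) for u in users})  # plays the role of K
    matched = []
    for s, c in g:
        # s[:n] plays the role of the `Sn columns in f
        if any(s[:n] in users for n in lengths):
            matched.append((s, c))
    return matched


d = ["5033703751425413371", "7241110807448127691", "10102846067816284236"]
corrupt = "7241110807448127691" + "320303160517889791"  # hypothetical corrupt id
g = [
    ("5033703751425413371", 1),
    ("5409789758109122623", 4),  # not in d, so it is dropped
    (corrupt, 4),                # matches on its first 19 characters
]
print(match_rows(g, d))
```

Only as many prefixes are tried per row as there are distinct id lengths (two here), which is why this is so much cheaper than searching for the longest matching prefix.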

More importantly, I can apply this to all my files (I have a lot of them):

    n:0;m:0;{G x;n+:+/c;m+:+/g.c}'t
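
Read as a loop, that last line keeps two running totals over all the files: n is the sum of the counts on matched rows and m is the sum of all counts. A rough Python sketch of the same bookkeeping follows; the name tally, the load callback, and the in-memory stand-in for files are assumptions made so the example runs without touching disk.

```python
def tally(paths, d, load):
    """Return (n, m): total count on rows matching d, and total count overall."""
    users = set(d)
    lengths = sorted({len(u) for u in users})
    n = m = 0
    for path in paths:
        g = load(path)  # list of (s, c) rows, like G x
        for s, c in g:
            m += c
            if any(s[:k] in users for k in lengths):
                n += c
    return n, m


# Hypothetical in-memory "files" standing in for the real logfiles.
fake_files = {
    "a": [("5033703751425413371", 1), ("5409789758109122623", 4)],
    "b": [("10102846067816284236" + "77", 4), ("5033703751425413371", 2)],
}
d = ["5033703751425413371", "10102846067816284236"]
n, m = tally(sorted(fake_files), d, fake_files.__getitem__)
print(n, m)
```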


Yeah, I tried the same and came to the same conclusions. Someone who knew what they were doing could implement a new K from the language definition on kparc. That is all you would need. Alas, I am not that guy.



