Let's do this using the same approach, "find indices and assign over them in a c...

geocar · on May 20, 2021

If this were me, and I needed a function like us, I would have written this:

    us:{$[x~(z;y);,"_";y]}[(*K;,"*")]':

I would be interested in seeing anything that was shorter[1] and faster than that in any language, and I would be very curious to learn from anyone who could also do that faster than me.

But I'm not a fetishist: I didn't learn k because it was cute, and I don't wake up every day looking for ways to rewrite other people's code so that it is slower and bigger. Do you? Or is today special?

[1]: Measured in source-code bytes.

rscho · on May 20, 2021

> Do you? Or is today special?

Today is the usual. He does that in all threads related to array languages.

kazinator · on May 20, 2021

> I would be interested in seeing anything that was shorter

One way to do that would be to pose that as a problem on the Code Golf Stackexchange.

My rewrite-case solution condenses (by removal of all non-essential spaces) to 69 bytes if the symbols are in a hash table K, and the input list is in a variable y, rather than a big literal:

  (rewrite-case x y((@[K @sym]* . @rest)^(,sym _ . ,rest))(@else else))

A one-letter name could be chosen for the macro. Furthermore, one-letter names could be chosen for sym, rest and else:

  20> (r x y((@[K @s] * . @r)^(,s _ . ,r))(@e e))
  (foo bar * select _ fox where _ bravo)

Still working! Now down to 43 bytes.

(It's not because I cannot that I do not do this with all my code.)

Doh, why use . ,r in the backquote if we are golfing? That should be ,*r: using the splice operator, like Common Lisp's or Scheme's ,@, getting us to 42 bytes:

  20> (r x y((@[K @s] * . @r)^(,s _ ,*r))(@e e))
  (foo bar * select _ fox where _ bravo)

I suspect Code Golf Stackechange regulars could get it down to way in one of the dedicated golfing languages like Retina or what have you.*

What else can we do? The @[K @s] predicate syntax could be replaced by a custom operator defined by defmatch, looking like @(K s).

  21> (defmatch k (sym) ^@[K (sys:var ,sym)])
  k
  22> (r x y((@(k s) * . @r)^(,s _ ,*r))(@e e)) ;; 41
  (foo bar * select _ fox where _ bravo)

Just noticed the ) * is not fully golfed:

  23> (r x y((@(k s)* . @r)^(,s _ ,*r))(@e e)) ;; 40
  (foo bar * select _ fox where _ bravo)

The k pattern operator macro could be non-hygienic: it could implicitly bind a variable called s:

  24> (defmatch k () ^@[K @s])
  k
  25> (r x y((@(k)* . @r)^(,s _ ,*r))(@e e)) ;; 38
  (foo bar * select _ fox where _ bravo)

These last few feel like cheating because they are too special purpose. Defining anything you can only possibly use just once isn't making the overall program smaller.

kazinator · on May 21, 2021

Also-rans:

  1> [window-map 1()(do if(and[K @1](eq'* @2))'_ @2)y]
  (foo bar * select _ fox where _ bravo)
  2> (mapcar(do if(and[K @1](eq'* @2))'_ @2)(cons()y)y)
  (foo bar * select _ fox where _ bravo)
  3> (mapcar(lambda(:match)((@[K] *)'_)((@a @b)b))(cons()y)y)
  (foo bar * select _ fox where _ bravo)

rscho · on May 20, 2021

you'll never get J/K brevity though. Because in opposition to Lisp where control flow is explicit in the syntax, most of the control flow in array languages is implicit. I love lisp too, but for some (rare) use cases it's not the best.

kazinator · on May 20, 2021

You will never get J/K brevity by writing a condensed for of normal Lisp because spaces are often required to separate tokens.

You can get implicit control flow via macros.

Furthermore, all control flow is implicit is some way.

Take basic old progn: (progn (foo) (bar)) evaluates (foo) and then control implicitly passes to (bar) because that's the next thing.

The only control flow which is not implicit is that of a state machine described by a table or graph, indicating the next state for every input and current state.

(How can you say you can't get implicit control flow in Lisp, when I'm using pattern matching to express if you see this prefix, replace it with that?)

Speaking of macros, you could always resort to a macro like this:

  (j"... line noise in character string ...")

The j macro expands the string to code at compile time. The code could be a bona fide J implementation, or something like it. That's still not down to J, because of the (j"") wrapping chewing up five characters. In a Lisp with a programmable read table, that could be reduced to two character framing or something.

Avshalom · on May 21, 2021

okay so it's not really smaller because it uses UTF8 also it relies on order of operations to get rid of the parens around ⍵≡'*' but

us←{('_'@(1+⍸2{(⍺≡⊃K)∧⍵≡'*'}/⍵))⍵}

is one(1) character shorter.

rak1507 · on May 21, 2021

⍵≡'*' should be ⍵≡,'*' like in the original k

us←⊃,⍥⊆2{(⍺≡⊃K)∧⍵≡,'*':,'_'⋄⍵}/⊢ is shorter anyway

geocar · on May 21, 2021

    us←{'_'@{¯1⌽⍵⍷⍨(⊂⊃K),'*'}⍵}

rak1507 · on May 21, 2021

Ah, ⍷, clever, don't even need the outer {...⍵},

  us←'_'@{¯1⌽⍵⍷⍨(⊂⊃K),'*'}

geocar · on May 21, 2021

Yes!

You could also do: ⌽'*',⊂⊃K to save another character. I don't know if that's "better" though.

rak1507 · on May 21, 2021

Or alternatively '*',⍨⊂⊃K which I think I prefer over ⌽, but neither seem that 'better' in my mind anyway.

wanderer2323 · on May 22, 2021

Is this the same language or a different one?

rak1507 · on May 22, 2021

This is APL, k is a descendant of APL.

rak1507 · on May 21, 2021

Why do you pass in (*K;,"*") as x instead of hard-coding it in x~(z;y)? Clarity?

geocar · on May 21, 2021

Great question!

     v:("select";,"*";"from";"tacos")

     us:{$[#i:&{(y~*K)&"*"~*x}':x;@[x;i;:[;,"_"]];x]}              
     \t:100000 us v
    233
     us:{$[(*K;,"*")~(y;x);,"_";x]}':
     \t:100000 us v
    206
     w:(*K;,"*");us:{$[w~(y;x);,"_";x]}':
     \t:100000 us v
    179
     us:{$[x~(z;y);,"_";y]}[(*K;,"*")]':
     \t:100000 us v
    129

I think it's important to remember just how simple k is: We the programmer know that (*K;,"*") isn't supposed to change, so we should be explicit so k "knows" this as well. In addition to never changing, we also know the value is only used once, so we really don't want to look anything up in the workspace every time we call us: Again, let us be explicit.

On the other hand, this:

     us:{$[#i:& XXX ;@[x;i;:[; YYY ]];x]}

is an extremely recognisable idiom. It occurs several times in sql.k so the reader is probably used to seeing it at this point. It also has a similar syntactic structure to the APL '@' so it may be more "obvious" to an APL programmer. Maybe the reason I write it this way is that I'm not a very experienced APL programmer :)

rak1507 · on May 21, 2021

Thank you for the extensive answer, that's very interesting that there's such a performance difference. I would have naively assumed if anything keeping it as a constant would be faster.

Another question: Why do

  $[#i:& XXX ; @[x;i;:[; YYY]]; x]

when (if I'm not mistaken, which I could be) you can do

  @[x;i:& XXX;:[; YYY]]

Performance again? A quick test using ngn/k did seem to find the first one faster, but only marginally.

geocar · on May 21, 2021

That's a harder question. Why do other people do the things they do? One reason might be because it was convenient for them to write (and debug) things that way.

But it is possible they are doing it for performance reasons, so maybe it's worth considering why it should be faster? That is to say, should the programmer assume the latter is faster than the former?

To do that, I would suggest putting yourself in the shoes of the implementor: You have a choice of how it should be implemented. Knowing this decision will affect every use of @[x;i;f], should it begin with an if statement like this?

    if(y->n == 0)return x;

But before you answer, remember the next thing @[x;i;f] needs to do, is consider if x has any additional references, because if it does, it has make a copy of x. That means there already needs to be an if statement that looks like this:

    if(x->r > 0) x = copy(x);

So one way to think about this, is to ask if you are implementing @[x;i;f], should you choose one branch or two?

    $[#i:& XXX ; @[x;i;:[; YYY]]; x]

Also: Do you see the part that goes :[;YYY]? That's constructing an object. Do you think it is worth avoiding that allocation if possible?

rak1507 · on May 21, 2021

Ok, that all makes sense, thank you.

And thanks for pointing out the pattern in the first place, it makes sense now how you were able to parse the original expression so quickly.