jackrabbit_'s comments

jackrabbit_ · 2024-10-19T10:23:23

For a while I've wanted to explore new optimization opportunities for ArkScript, so I wrote a very simple IR for my language (a Lisp-like that's made to be easy to embed in C++ apps). I'm now going to improve this IR (to have function calls directly in it, so that I can do some inlining)!

o11c · 2024-10-20T00:36:05

Hm, you seem to be implementing the exact same kind of IR and bytecode that everybody does. There are a lot more possibilities that are rarely mentioned, and some I argue are better. I have an extensive list (/ wannabe tutorial), which I believe all VM writers at least think about, regardless of what actually ends up being implemented: https://gist.github.com/o11c/6b08643335388bbab0228db763f9921...

I will confess to being a strong opponent of stack-based VMs. Admittedly, function calls are the one thing that is notably simpler in them compared to register-based machines.

Instead of 32 bits per instruction, have you thought about 16 bits per instruction then doing something special for the rare instruction that needs more (either a flag for adjacent "extra data", or a function-level table to index, or even a special register)? Especially in a stack machine you're likely to have a lot of instructions with no argument at all.
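A minimal sketch of the "16 bits plus adjacent extra data" idea, for illustration only. The layout is an assumption made up for this example (bit 15 as an extension flag, 7 bits of opcode, 8 bits of inline operand, one optional extra word); it is not ArkScript's actual bytecode format.

```python
import struct

EXT_FLAG = 0x8000  # top bit set: one extra 16-bit word follows


def encode(program):
    """Encode (opcode, operand) pairs into little-endian 16-bit words.

    Assumed layout: bit 15 = extension flag, bits 8-14 = opcode,
    bits 0-7 = low 8 bits of the operand.
    """
    words = []
    for opcode, operand in program:
        if not 0 <= opcode < 0x80:
            raise ValueError("opcode must fit in 7 bits")
        if not 0 <= operand < (1 << 24):
            raise ValueError("operand must fit in 24 bits")
        word = (opcode << 8) | (operand & 0xFF)
        if operand > 0xFF:
            # Rare case: spill the upper 16 bits into an adjacent word.
            words.append(word | EXT_FLAG)
            words.append(operand >> 8)
        else:
            words.append(word)
    return struct.pack(f"<{len(words)}H", *words)


def decode(blob):
    """Inverse of encode(): rebuild the (opcode, operand) pairs."""
    words = struct.unpack(f"<{len(blob) // 2}H", blob)
    program = []
    i = 0
    while i < len(words):
        word = words[i]
        opcode = (word >> 8) & 0x7F
        operand = word & 0xFF
        i += 1
        if word & EXT_FLAG:
            operand |= words[i] << 8
            i += 1
        program.append((opcode, operand))
    return program
```

With this layout, an instruction with no argument or a small argument costs 2 bytes, and only the occasional wide operand costs 4.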

You're doing label-based IR, but block-based IR seems to be more popular these days (though that is mostly with SSA). Regardless, think about the case where optimization leads to empty blocks / no code between adjacent labels (it's probably fine for your current approach, but only because you currently can't have large functions at all). Also, it doesn't make sense to use `unordered_map` when `vector` will do.
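To make the last two points concrete, here is a small sketch assuming labels are numbered densely from 0, so a plain list indexed by label id can replace a hash map. The IR shape (tuples tagged "label" / "jump") is hypothetical and only serves the example.

```python
def resolve_labels(ir, label_count):
    """Replace label ids in jumps with instruction addresses.

    `ir` is a list of tuples: ("label", id), ("jump", id) or any other op.
    Labels are assumed to be dense integers in [0, label_count).
    """
    addresses = [None] * label_count  # list lookup instead of a hash map
    code = []
    for item in ir:
        if item[0] == "label":
            # A label points at the next real instruction, so adjacent
            # labels (an "empty block") simply share the same address.
            addresses[item[1]] = len(code)
        else:
            code.append(item)

    resolved = []
    for op in code:
        if op[0] == "jump":
            resolved.append(("jump", addresses[op[1]]))
        else:
            resolved.append(op)
    return resolved, addresses
```

A label at the very end of the function resolves to `len(code)`, one past the last instruction, which the VM would have to treat as "return" or reject.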

jackrabbit_ · on March 27, 2021

Hi everyone! For those who don't know us, ArkScript is a programming language made in C++ to be easily used in projects, inspired by Lisp. The language aims to stay small, to avoid a collection of 100 keywords with very rare/specific ones. A small set of instructions (imo) is better for productivity and for getting to the point without digging through documentation for hours (even though there are only 10, you don't have to reinvent the wheel).

I'm truly excited about this one: a 2-year-old issue, 3 months of discussion with the dev team, and only a few days to implement it, and it's even better than what I thought it would be capable of.

We planned three kinds of macros: if macros (compile-time if, working only on/with other macros), constant macros (compile-time values) and function macros (code modification at compile time, with arguments). Constants can hold anything, from simple values (numbers, strings) to complete blocks (a complete if, while, whatever you want); functions can take from 0 to a (nearly) infinite number of arguments through a "variadic" notation (I couldn't find a better name): ...args.

The current implementation only allows macros in the scope they were defined in and in its child scopes. Also, macros cannot be nested, except for if macros, which can hold any other macros and be used anywhere. This means that we can do this:

   !{if (= VAR 12) !{foo (bar ...args) (print bar (len args))}}

Where it gets interesting is that we can also do this, and everything is still resolved at compile time:

   !{_> (first ...args) {!{if (> (len args) 0)((_> ...args) first)first}}}

(Yes, I named our version of the threading macro _> for now, because our parser won't allow it otherwise: -> is forbidden.)

I hadn't planned for the implementation or the specification to do that (not that they specifically forbade it either), which is a great surprise!
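To show what the recursive `_>` macro appears to expand to, here is a rough Python model working on nested lists instead of ArkScript nodes. It assumes the semantics suggested by the macro body (`(_> ...args)` re-invokes the macro with the remaining arguments); the function name is made up for this sketch.

```python
def expand_thread(first, *args):
    """Model the compile-time expansion of (_> first ...args)."""
    if len(args) > 0:
        # ((_> ...args) first): expand the rest, then apply it to `first`
        return [expand_thread(*args), first]
    # No arguments left: the macro expands to `first` itself
    return first


# (_> a b c) would expand to ((c b) a)
print(expand_thread("a", "b", "c"))
```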

TL;DR: our language now has macros, and they can do even more than what we designed them for, and that's cool to play with

jackrabbit_ · on Jan 24, 2021

A while ago, I posted about a fun POC I made, hacking the DNS protocol to send messages to a server (creating a communication service relying on DNS requests/replies): https://www.reddit.com/r/Python/comments/jf8zbf/i_hijacked_d...

To summarize the idea, the project uses QNAMEs to encapsulate the client's messages (encoded in base32 as a subdomain, for example: encoded-message.dns.server.com); the server decodes the message and sends a DNS TXT reply whose content is base64-encoded.
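The encoding side of that scheme can be sketched as follows. This is not the project's actual code: the function names, the stripping of `=` padding and the splitting into several labels are assumptions of this example. It only builds and parses names; it does not send any DNS packet, and it does not enforce the overall length limit on a domain name.

```python
import base64

MAX_LABEL = 63  # DNS limits each dot-separated label to 63 bytes


def message_to_qname(message, zone):
    """Encode a client message as base32 labels in front of `zone`."""
    encoded = base64.b32encode(message.encode("utf-8")).decode("ascii")
    # '=' is not usable in a hostname, so drop the padding (assumption)
    encoded = encoded.rstrip("=").lower()
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [zone])


def qname_to_message(qname, zone):
    """Server side: strip the zone, rejoin the labels and decode them."""
    encoded = qname[: -len(zone) - 1].replace(".", "").upper()
    padding = "=" * (-len(encoded) % 8)  # restore the dropped padding
    return base64.b32decode(encoded + padding).decode("utf-8")


def make_txt_payload(reply):
    """Content of the TXT answer: the reply, base64-encoded."""
    return base64.b64encode(reply.encode("utf-8")).decode("ascii")
```

Base32 is a natural fit for the query side because domain names are case-insensitive, while the TXT answer can carry the denser base64 alphabet.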

Well, it only worked on the same machine at the time (or when I got lucky and somehow had a server bound to my port 53). For a port to appear as open|filtered, something must be bound to it, and I struggled for a long time before understanding why my requests were answered with an ICMP type 3 error, port unreachable, when going online.

Now it's fixed, and what's even better, I can send a DNS TXT request to Google (8.8.8.8) about encoded-message.dns.site.com, and since I've registered my server as my own DNS, everything the other big DNS servers don't know about will be forwarded... to my server. Thus I can just use the dig command on Linux to send messages to my server, from anywhere in the world, which is the main point of this project: DNS requests are often unfiltered (that doesn't mean they aren't logged by your ISP! The goal of the project isn't to avoid logs but firewall filters), so when you have a limited connection (no access to the internet), oftentimes DNS requests can still go out on the internet. That is very interesting (but slow) for communicating from an airplane with someone on Earth, if you don't want to pay $50 for 2GB of Wifi on the plane. There are a lot of other possible uses, and that's awesome.

PS: yes, I know about iodine, arecibo and such things. I just wanted to try to do it on my own, to learn more about the protocol

jackrabbit_ · on Oct 12, 2020

A small programming language made for C++ developers, inspired by Lisp, aiming for relatively good speed.

It was created because I wanted to try my hand at a Lisp-like, and it turned out so well that I've kept working on it for nearly two years now!

jackrabbit_ · on Aug 29, 2020

After a lot of delay, it's here, and we plan to use this module to develop small REST APIs, websites and maybe even our future package manager!

It can handle servers with get, post, put and delete routes, with functions to process the received data (be it a route parameter or a query body), and on the client side it handles proxies, basic authentication (login/password or token), timeouts, headers, parameters (for application/x-www-form-urlencoded / writing user=hey&pass=nope more easily) and much more.

jackrabbit_ · on March 27, 2020

This is a very heavy test of non optimizable recursion, and (benchmark not published yet) ArkScript is only 2 times slower than JavaScript. It sounds like a lot, but please note that I'm actively working on improving those numbers, and I'll add other benchmarks (fast allocations of lists and more) to give a better idea of what the language is capable of in terms of speed.

gridlockd · on March 27, 2020

> This is a very heavy test of non optimizable recursion, and (benchmark not published yet) ArkScript is only 2 times slower than JavaScript

That statement does not help. Which implementation of Javascript and what kind of benchmark is only 2x faster?

For starters, here is a good overview of Javascript interpreter performance:

https://bellard.org/quickjs/bench.html

jackrabbit_ · on March 27, 2020

Woops sorry

I used the Ackermann-Péter function, still with the same parameters (3, 6), on SpiderMonkey (the JS engine of Firefox 74.0, Windows 10, i7 8k)

Thanks for the link, I'll definitely dig down into that!
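For readers who don't know the benchmark: the two-argument Ackermann-Péter function is defined by three recursive cases, sketched here in Python (not the ArkScript or JavaScript code that was actually timed). The nested self-call in the last case is what makes it a stress test of plain function calls.

```python
import sys

# The recursion gets a few hundred frames deep for (3, 6); raise the limit
# a bit so the example does not depend on the interpreter's default.
sys.setrecursionlimit(10_000)


def ackermann(m, n):
    """Two-argument Ackermann-Péter function, naive recursive form."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    # The inner call must finish before the outer one can start,
    # so this cannot be turned into a simple loop by tail-call elimination.
    return ackermann(m - 1, ackermann(m, n - 1))


print(ackermann(3, 6))  # 509
```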

jackrabbit_ · on March 26, 2020

Good question actually! I knew and I know that making a scripting language fast enough to make video games is really tough, so I wasn't expecting to be able to make a language able to make games on its own (understand: not as a scripting language, but for the whole project as you suggested). Also, even if I want to throw away a few things from C++, I can't leave behind the static typing and efficiency of C++. I think mixing the pros of C++ and the pros of my language was a good catch: the core in C++ (strong, fast, using optimizations I can't make in ArkScript), and the scripting of actors, entities and all in ArkScript (a dynamic language, easier to script entity behavior with).

jackrabbit_ · on March 26, 2020

A while ago, I was using Lua as a scripting language for my games, but I wanted something between Lisp and Python: a language that would run on a VM, so that I could export only the bytecode, and that I could code in with a Lisp-like syntax.

And ArkScript was born. I wanted to keep it tiny, under a megabyte, with a few keywords to do everything.

I think I can say I achieved what I wanted to do; now we are working as a small team to improve the code, the documentation, and the standard library. The language is indeed small and fast (even if we can do better, and I'm still seeking more speed), with immutability (with 'let'), and closures (note that we can easily read the data captured by a closure, as a small object, through dot notation: object.field).

nonbirithm · on March 27, 2020

Interesting project. By chance have you heard of fennel[0]? It is a compile to Lua language with s-expression syntax. Also I believe you can export LuaJIT's bytecode with string.dump and it would be portable across platforms. Not trying to diminish your work, however. This could be useful for having a "real" Lisp embedded in-game. I don't think I've seen a scripting language with immutability either.

Would it be possible to write a REPL using ArkScript or use the compiler from within the language?

Also, I'm really curious how you got so many people to work on the language. How did you accomplish this?

[0] https://github.com/bakpakin/Fennel

jackrabbit_ · on March 27, 2020

I've never heard of fennel before, looks interesting!

Yes, it's possible to launch a REPL by using the `-r` or `--repl` option when launching the ArkScript executable.

A few of the people contributing to the language are people I know from some programming Discord servers, and the others came to work on the project after discovering it thanks to the hacktoberfest tags I put on the issues.

_630w · on March 27, 2020

What is a scripting language for you?

Because I think (could be wrong) elm, purescript, clojurescript and a few others are immutable.

Reelin · on March 27, 2020

Along the same lines, Embeddable Common Lisp can optionally be configured to use a bytecode interpreter. I have no idea if said bytecode can be exported and moved between platforms though.

eatonphil · on March 26, 2020

Great project. Looked around the repo to understand how the FFI works without annotations. Maybe there is a section I missed. I'm curious how you translate between ArkScript types and C++ types? And what is the error-handling like when the programmer gets it wrong?

Also, if you create objects by making all variables within a function scope accessible outside the function, do you have a means for private local variables that don't get exported?

jackrabbit_ · on March 26, 2020

Thanks! Some people are going to be a bit angry about what I'm going to say, but the language is 100% dynamically typed, so we retrieve an `Ark::Value` from the language, and in C++ we check the data type by using `value.valueType()`; we can then retrieve the content of the value by using the right getter (number, string, etc). When an ArkScript developer gets a program wrong, the virtual machine is designed to throw the code away at runtime, without being able to recover. It could sound strange, but my idea behind that is to throw away the exception design (as used in C++) to have an error code design as in C, or even better, a result design as in Rust.

Everything is exported actually. If you care about users being able to modify values, don't worry: we can only read the values captured / created in a closure, only the closure itself can modify them. Privacy is a convention, by prefixing the variable name with an underscore.

eatonphil · on March 26, 2020

How do you know what types and how many arguments you need to conform to the foreign function? Every language I'm familiar with requires the programmer to declare the types and number of arguments.

jackrabbit_ · on March 26, 2020

That's where it's tricky: the compiler doesn't know. It's checked at runtime by the C++ function, which receives a list of Values, counts them (actually quite convenient if we need to make variadic functions) and decides if it's fine or not, then does type checking on the arguments it needs. I know it sounds crazy because it can cause performance loss, but in a dynamic language like this one that's a cost I must pay.
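The check described above can be modelled in a few lines. This is a Python sketch of the pattern, not ArkScript's C++ API: `Value`, `ValueType`, `NativeCallError` and `native_repeat` are all hypothetical names, and raising an exception is just the simplest way to signal failure here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ValueType(Enum):
    NUMBER = auto()
    STRING = auto()


@dataclass
class Value:
    """A dynamically typed value: a type tag plus its payload."""
    type: ValueType
    data: object


class NativeCallError(Exception):
    """Raised when a native function receives unusable arguments."""


def native_repeat(args):
    """Hypothetical native function: (repeat text count) -> String."""
    # 1. Count the arguments; the compiler did not check this for us.
    if len(args) != 2:
        raise NativeCallError(f"repeat: expected 2 arguments, got {len(args)}")
    text, count = args
    # 2. Check the tag of each argument before touching its payload.
    if text.type is not ValueType.STRING:
        raise NativeCallError("repeat: argument 1 must be a String")
    if count.type is not ValueType.NUMBER:
        raise NativeCallError("repeat: argument 2 must be a Number")
    # 3. Only now use the values.
    return Value(ValueType.STRING, text.data * int(count.data))
```

The cost is one length check and one tag comparison per argument on every call, which is the runtime price mentioned above.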

blattimwind · on March 26, 2020

That's not what you'd typically understand as an FFI; that's just an interpreter / VM API. A foreign function interface would generally allow something running in the VM to call arbitrary external (hence foreign) functions that were not written against the VM's API, i.e. don't have a binding.

For example, this is what FFI could look like in some language:

   lib = ffi.load("kernel32")
   lib.TerminateProcess.args = (ffi.voidptr, ffi.unsigned)
   lib.TerminateProcess.returns = ffi.int
   lib.TerminateProcess.callconv = "stdcall"
   lib.TerminateProcess(1234, 56)

Arguably you neither want nor need FFI for a game scripting language.
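For comparison with the pseudocode above, here is what the same steps (get a library handle, declare the signature by hand, call) look like with a real FFI, Python's ctypes. To stay portable, this sketch calls `Py_GetVersion` from the C API of the running CPython interpreter via `ctypes.pythonapi` rather than a system library; that choice is only for the example and assumes CPython.

```python
import ctypes
import sys

# Library handle: the C API of the running CPython interpreter.
lib = ctypes.pythonapi

# Declare the foreign signature by hand: const char *Py_GetVersion(void)
lib.Py_GetVersion.argtypes = []
lib.Py_GetVersion.restype = ctypes.c_char_p

# Call the C function directly; no binding was written for it.
version = lib.Py_GetVersion().decode("utf-8")
print(version)
```

Nothing checks that the declared types match the real C function: getting `argtypes` or `restype` wrong is undefined behaviour, which is part of why an FFI may be more than a game scripting language needs.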

jackrabbit_ · on March 26, 2020

You're right, I'm using the wrong name for this. At the beginning the goal was to have both what you described as an FFI and what I thought a primitive "FFI" would look like (i.e. what it is today, a system using the VM API), but I never managed to do both and the name stayed, and now I'm here a bit confused, messing up and mixing words.

eatonphil · on March 26, 2020

Right, I'm confused/curious about FFI calls, not regular function calls.

jackrabbit_ · on March 26, 2020

I'm sorry, I used the wrong word for this (see the details in the above comment: https://news.ycombinator.com/item?id=22698450).

It's not an FFI but an interface using the VM API.

eatonphil · on March 26, 2020

Wait but when you `(import "librandom.so")` and call `(random)` in your examples, that's an FFI. But I guess I'm getting the picture anyway. When there's a foreign function call, you count the number of arguments and convert them to their C++ types and call a helper function for making a call against some function pointer with so many arguments? I'm still confused though how you'd pass strings or function pointers for example.

jackrabbit_ · on March 26, 2020

That's the idea yes.

Since an example is worth a thousand words, here is a function from the http module, counting the arguments, checking the types and then using the values to create an http client (to make http requests to web servers): https://github.com/ArkScript-lang/modules/blob/refactoring/h...

eatonphil · on March 26, 2020

That's not exactly an FFI, that's an extension since you're coding against ArkScript's C++ API. An FFI is when you use only ArkScript to make calls against other languages that don't know about ArkScript's API.

saagarjha · on March 27, 2020

The wiki (https://github.com/ArkScript-lang/Ark/wiki/Embedding#creatin...) calls them “plugins” and they’re also called “modules”, which I think is a lot clearer than “FFI”. Essentially it’s a bunch of native functions that are exported with C linkage and compiled into a set of shared libraries, which are then loaded and the functions in them looked up at runtime. On the other (C++) side it’s similar to e.g. a JNI binding or any other language where you can pull values out of the runtime.

jackrabbit_ · on March 27, 2020

> An FFI is when you use only ArkScript to make calls against other languages that don't know about ArkScript's API

I should definitely rename this part, as stated in another comment, since the "FFI" isn't a real one; I messed up the vocabulary.

ngold · on March 26, 2020

Super fun. This is a great project.

jackrabbit_ · on March 27, 2020

Thanks!

BubRoss · on March 26, 2020

That all sounds like lua, where does lua not work?

ijlx · on March 27, 2020

Presumably the lisp-like syntax and immutable variables by default. Lua has some functional features, but it isn't a primarily functional language like this.

BubRoss · on March 27, 2020

What is it missing though? It seems like this could be a different syntax compiled to Lua bytecode.

jackrabbit_ · on March 27, 2020

It could have been compiled to Lua bytecode, but I felt like having both an external VM and ArkScript would have been too much. I always hate it when I download a lib and 100 smaller libs come along because the main lib relies on them.

imtringued · on March 27, 2020

Small libraries are an accountability nightmare. Reducing the number of dependencies on a core library is very important.

BubRoss · on March 27, 2020

This also seems odd to me because Lua can be embedded in a binary easily or made into a shared library since it is just C with no dependencies of its own.

RicardoLuis0 · on March 27, 2020

Interesting, makes me think of Naughty Dog's GOOL/GOAL, also a Lisp-like scripting language that was used internally for many games, including Crash Bandicoot on PSX