I went through this a while ago (in Prolog). It was interesting, but with a lot of room for improvement. My criticisms as I remember them:
- Error messages about the output differing from the expected value use some very explicit printing mode with lots of quotes and escaping. A message saying that one got "\\\'\\\'" where "\'\\\\'" was expected is needlessly hard to parse when the actual difference is \'\' vs. '\\' (or whatever).
- Many parts are marked optional, but the test script complains if you don't implement them, and later sections assume that they are implemented, so you have to treat them as compulsory anyway.
- Some things regarding environment updates were contradictory or at least not explained properly at first, with some very relevant information only coming later. I don't remember the specifics, but the problem was roughly that I set up my data structures thinking that everything could be implemented in a pure way, but at some point it became clear that you either need some other data structure or must use destructive modifications.
- There are several reasons for the weird way that the output and expected messages are printed. One of the main reasons is that in order to enable more flexible test case results, the expected output is a regex rather than a plain string, while the output is just a plain string. There is probably a way this could be made clearer. I'll add it to my TODO list to look at.
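To illustrate the point above, here is a hypothetical sketch (not mal's actual test driver; `matches_expected` is an invented name) of comparing a regex-typed expected value against a plain-string actual output. Because the expected side goes through a regex engine, a literal backslash needs an extra level of escaping, which is where the doubled-up quoting in the error messages comes from:

```python
import re

def matches_expected(expected_pattern: str, actual: str) -> bool:
    """The expected value is a regex; the actual output is a plain string."""
    return re.fullmatch(expected_pattern, actual) is not None

expected = r'"\\\\"'  # regex: a quote, two escaped backslashes, a quote
actual = r'"\\"'      # plain string: a quote, two literal backslashes, a quote

print(matches_expected(expected, actual))  # True
print(matches_expected(expected, r'"\"'))  # False: only one literal backslash
```

Printing both sides verbatim then shows four backslashes on one side and two on the other even though they "match", which is exactly the confusion the parent comment describes.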
- Do you have some specific examples? The test cases are marked as either deferrable or optional. You should never have to implement the optional ones (if you do, that's a bug in the tests or the guide). I think the guide marks pretty clearly where the deferrable items become mandatory in later steps. If it's not clear, then that's a bug.
- This is one of the tensions that exists with trying to make the guide incremental; later features may require re-work of earlier functionality. I do try and minimize that as much as possible although I've found it can really vary depending on the nature of the target language. Note that the primary goal of mal/make-a-lisp is pedagogical (as opposed to say "the easiest way to make your own Lisp"). So sometimes the need to go back and re-work something is in line with that goal.
If you have any concrete guide text or test driver improvements (especially that further the pedagogic goals of mal), I'm always happy to review pull requests! :-)
Thanks for your answer. I probably meant deferrables rather than optionals that were de facto required. Sorry I don't have a concrete example handy, it's been a while.
I went through MAL last year and had a blast! It's an excellent project to help you get a feel for Lisps and tiny languages, and I think it's what finally got me over my "eww parentheses" feeling. After finishing MAL, I modified it to implement John Shutt's Kernel-style f-exprs to really up the power-to-weight ratio, and had a blast doing it. MAL makes a great basis for experimenting with additional language features, highly recommended!
I attempted this a few years back, didn't get far before I got hung up on the regular expression(s) required for parsing. Silly to get stopped there, but there was just enough friction between the specified expression, the language I was using (Erlang), and my understanding of the subtleties that I punted.
I think with Regex101 in my back pocket I'll give it another try.
I also could not get the regex to work, so I ended up writing a custom tokenizer. It ended up being like 30 lines of code, and unlike the regex I could understand how it works.
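That hand-rolled approach is easy to reproduce. Here is a hypothetical sketch (not the commenter's actual code) of a ~30-line loop-based tokenizer in Python that recognizes roughly the same token classes as the guide's regex:

```python
def tokenize(source: str) -> list[str]:
    """Loop-based MAL-style tokenizer: one branch per token class."""
    specials = "[]{}()'`~@"
    tokens = []
    i, n = 0, len(source)
    while i < n:
        c = source[i]
        if c.isspace() or c == ",":          # whitespace and commas separate tokens
            i += 1
        elif source.startswith("~@", i):     # splice-unquote
            tokens.append("~@")
            i += 2
        elif c in specials:                  # single-character specials
            tokens.append(c)
            i += 1
        elif c == '"':                       # string, honoring \" escapes
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1
            tokens.append(source[i:j + 1])
            i = j + 1
        elif c == ";":                       # comment runs to end of line (dropped)
            j = source.find("\n", i)
            i = n if j == -1 else j
        else:                                # symbol, number, anything else
            j = i
            while j < n and not source[j].isspace() and source[j] not in specials + '",;':
                j += 1
            tokens.append(source[i:j])
            i = j
    return tokens

print(tokenize('(+ 1 "a b") ; comment'))     # ['(', '+', '1', '"a b"', ')']
```

Each `elif` branch corresponds to one alternative of the regex, which arguably makes the token classes easier to see at a glance.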
static public List<string> Tokenizer(string source)
{
    // Initialise the token list.
    List<string> tokens = new List<string>();
    // Define a regex pattern whose groups match the MAL syntax.
    string pattern = @"[\s ,]*(~@|[\[\]{}()'`~@]|""(?:[\\].|[^\\""])*""|;.*|[^\s \[\]{}()'""`~@,;]*)";
    // whitespace ( ~@ | specials | double-quoted strings | comments | non-specials )
    // Break the input string into its constituent tokens. Splitting on a
    // capturing group returns the captured tokens interleaved with the
    // (empty) strings between matches, so discard the empty entries.
    foreach (string token in Regex.Split(source, pattern))
    {
        if (token.Length > 0)
            tokens.Add(token);
    }
    return tokens;
}
This took a while to understand and get going but it really improved my understanding of regex.
It's a long regex, but it's just whitespace followed by an alternation of 5 different types of data: splice-unquote, special characters, strings, comments, symbols. The string-tokenizing branch is a bit complicated because it has to allow internal escaping of quotes. Early iterations of the guide didn't explain the regex in detail, but the section now describes each of the regex components.
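For anyone who wants to poke at it, here is a sketch (not code from the guide) transcribing that regex into Python's verbose mode, with each alternative annotated:

```python
import re

# The guide's tokenizer regex, one comment per token class.
PATTERN = re.compile(
    r"""[\s,]*(                     # skip whitespace and commas between tokens
          ~@                        # splice-unquote
        | [\[\]{}()'`~@]            # single special characters
        | "(?:\\.|[^\\"])*"         # double-quoted strings with \" escapes
        | ;.*                       # comments to end of line
        | [^\s\[\]{}()'"`~@,;]*     # symbols, numbers and other atoms
    )""",
    re.VERBOSE,
)

def tokenize(source: str) -> list[str]:
    # findall returns the captured group; drop the empty match at the end.
    return [t for t in PATTERN.findall(source) if t]

print(tokenize('(+ 1 2)'))  # ['(', '+', '1', '2', ')']
```

Seeing the five branches laid out one per line makes the "whitespace, then one of five token types" structure much harder to miss.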
Yeah, a little weird, since regexes can't parse context-free languages. I suppose most so-called regexes aren't actually regular expressions, but it still feels like driving screws with a hammer.
Mal uses a regex for lexing/tokenizing. I didn't want people to get hung up on the lexing step (my university compilers class spent 1/3rd of the semester just on lexing). It's certainly a worthwhile area to study but not the focus of mal/make-a-lisp.
If you are using Erlang you should be using pattern matching, not regex. Erlang/Elixir are among the easiest languages to build parsers in, thanks to their binary string pattern matching.
I agree, pattern matching is what I sorely miss every time I use anything other than Erlang. It was just enough of a hurdle that I set it aside and didn't return.
The Brown University PLT textbook used Lisp/Scheme, and its first paragraph was something like "nobody cares about parsing, let's get down to business in sexps".
I like parsing, I like regexes but I agree it's often a waste of time :)
One interesting thing about this project is the extensive use of Make. Not only is the test suite Make-based, there is also a Mal implementation in Make.
I have had a Dyalog APL implementation on my fork for over a year, but still haven't released it because, for reasons related to input/output handling, it cannot pass the self-hosting tests :(
To make the other tests pass, I wrapped the interpreter in a shell script that merges stdout and stderr, and that did the job for all but the last step (self-hosting).
The self-hosting interpreter (albeit very slow) appears to work correctly when used interactively.
I was a bit surprised nobody's tried a "verified" implementation in something like Coq/Idris/Agda (unless I missed one - I wasn't familiar with all the languages in the list).
I did this a while back. I documented my experience in several blog posts, the last of which is [0]. These describe the process, the gotchas, and how I got round them. I was working in C# on a Windows box, so the result isn't quite the same as the vanilla MAL.
I was massively pleased when I got MAL to self-host.
This is unintentional given the history of the name of the project explained in another comment on this page, but "mal" in French means "evil", tying into the famous aphorism.
I wouldn't say it was entirely unintentional :-). I was definitely aware that "mal" could mean "evil" when I named it. It was a bit more apropos when mal only had a single implementation in GNU Make macro language.