Alpaca – Functional programming inspired by ML for the Erlang VM

salimmadjd · on March 25, 2017

Really hoping this project gets more traction.

I'm learning Elm now and I'm really liking the syntax to the level that other languages feel rather cluttered to me now.

The more I'm playing with types and learning to leverage them, the more I appreciate their power (yes, I'm late to the game) so making this statically typed is very interesting.

However, there seem to be a saturation of new languages and not sure if there is enough eyeballs left for a new language that does not have a large corporate backing (FB, Google, Apple) or happens not to arrive on a perfect time with the right set of answers. Maybe BEAM, ML/Elm syntax and static typing is what everyone else is looking for.

Edit: Video posted today of creator of Alpaca (Jeremy Pierre) giving a talk at Erlang Factory. It gives a nice overview of the state of the language -

https://www.youtube.com/watch?v=cljFpz_cv2E

jerf · on March 25, 2017

There's always room for another language.

What's hard is cracking into the very, very top tier, the C++, C#, Java, etc. tier. I am also increasingly of the opinion that it simply takes massive corporate backing to get to that level, based on the observation that I haven't seen anything get to that level without it. Python's the only one that has arguably gotten there, I think, and it's still debatable.

That said, I do think that if you want to make a new language right now and really see it take off, you do need to find some problem that isn't well-solved, or come up with a reeaaalllly novel combination of things that didn't exist well before. It seems to me that this project is going to be shadowed by Haskell in a lot of ways.

But that's only if you want to see it take off. Not all languages are put out there with that intent.

WhitneyLand · on March 25, 2017

Seems a bit contradictory. Which is it, corporate backing or novel ability?

I would like to see language advantages better quantified. How confident can we be a language is a practical improvement, in what contexts is it true, and what do the improvements buy in cost, quality, innovation, etc.

If we had all this data for a new language it would probably be easier to gain critical mass.

grumpyprole · on March 25, 2017

No one is funding these studies, so I wouldn't hold your breath. It is down to you to decide for yourself. As an anecdote, I switched to Haskell professionally 5 years ago and am both happier and more productive.

WhitneyLand · on March 25, 2017

Interesting. Would you say it's the most productive language you've ever worked in?

What's your best guess as to how well this would apply to developers in general?

Once I used a tool chain with a steep learning curve, but I felt the rewards were clearly worth it. However, with that particular team it was difficult to get buy in. It seems not everyone is interested in a little pain for a lot of gain, especially if the concepts are very different.

grumpyprole · on March 25, 2017

For me personally it is the most productive and expressive language I have worked in.

There is a steep learning curve, which will make one a better programmer, but not without significant buy in. There is no free lunch.

Haskell is very expressive with its types, especially with regard to when effects happen, which makes it excellent as a shared design language. It's interesting for me to see Java/C# programmers struggle to explain some of their more modern stream abstractions to each other:

http://stackoverflow.com/questions/28459498/why-are-java-str...

The answers above are unable to explain succinctly what the APIs are doing, because the authors lack the necessary common language. They have to answer with wordy essays describing various scenarios and use cases.

platz · on March 25, 2017

Not the OP, but I don't think Haskell will ever be drop-in replacement for mainstream langs for cultural reasons, but that doesn't mean that those who do engage w/ it don't get get tremendous value from it or fail to find it their favorite language (I.e. Doesn't contradict what willtim said). And I think that's OK. (No, you're probably not going to convince a bunch of rubyists to use haskell). You can even train entire teams of willing Haskell developers from scratch, if needed. But the desire / buy in has to be there. IMHO anyways

Part of the problem is that dev is so large that it is hard to make "in general" statements anymore.

It's easier to talk about concrete/specific instances or cases

grumpyprole · on March 25, 2017

Yes I agree. Haskell is a powerful principled general-purpose language, but not everyone will need this.

But this does not mean I am happy with the current mainstream status quo :)

spraak · on March 25, 2017

I'd love to have something that takes the best of Go (static typing, fast compiles, binaries, community) and functional paradigm.

jerf · on March 25, 2017

Ocaml is close. Not sure how their concurrency is going, but I'm sure the act of mentioning that will bring someone out to bring us up to date.

toolslive · on March 25, 2017

There are several libraries for concurrency: - Lwt: https://github.com/ocsigen/lwt - Async https://realworldocaml.org/v1/en/html/concurrent-programming...

However pulling these over more than 1 core is still a problem. OCaml 4.05.0 should have infrastructure for that (although OCaml multicore has been somewhat a `duke-nukem forever` story)

Athas · on March 25, 2017

Compared to Go, OCaml is unfortunately a rather large language. It has many non-orthogonal features, some of which are not used widely. The impression I get from Go programmers is that the small size of the language is one of the chief attractors.

jnbiche · on March 25, 2017

> Compared to Go, OCaml is unfortunately a rather large language.

I agree. That said, ML is definitely a small language like Go, without OCaml's extras like the object system.

Alas, ML lacks Go's awesome and very modern standard library, which is a key part of Go's allure.

But yes, I would adore a functional language with Go's best features, particularly the standard library, solid concurrency, simplicity/ease-of-learning, fast compiles, binaries, static, etc.

wtetzner · on March 25, 2017

There's also Standard ML, which smaller, and fully specified, with multiple implementations.

But I think part of the problem with both is tooling. Build and dependency tooling in particular. Opam was a good step in the right direction, but I think OCaml and SML could both benefit from a Cargo-like tool, that made managing projects and their dependencies simpler.

leshow · on March 25, 2017

Haskell?

Some of it's parallelism and concurrency features should look familiar to you (it has an M:N threading model), complete with channels, and some stuff you've probably never heard of like STM. It compiles to a binary, has a type system much more powerful and expressive than Go's, and the community is very helpful.

I will say the compile times aren't very speedy, I assume you want fast compile times in order to type check your code, and for that there is ghc-mod.

bjt · on March 25, 2017

"It compiles to a binary..."

That's one of the things I love best about Go. Compile for your platform, then copy the binary somewhere else and run it. Awesome.

I really want this to be true for Haskell too, but there's a glaring exception with libgmp. Google "haskell libgmp" for many stories of people thinking they could just copy their haskell program to a new system and run it, only to realize they were wrong.

leshow · on March 26, 2017

I'm sorry I don't understand, the results seem to be about people having difficulty installing GHC, not deploying a binary. I can say anecdotally I've never had any problems.

edit: Ah ok, apparently libgmp is dynamically linked in binaries, but you can pass a flag to GHC to statically link all runtime dependencies. Is that what you were talking about?

unrealhoang · on March 25, 2017

How about Rust, no fast compile though.

bluejekyll · on March 25, 2017

With

  cargo check

It's significantly faster than before for type checking etc during development, which is I assume the point at which most people complain about compile speeds.

user1241320 · on March 25, 2017

There was a small attempt to what you describe: https://github.com/oden-lang/oden

platz · on March 25, 2017

this thread is like that time when there was a post about common lisp and then everyone started talking about clojure

Ixiaus · on March 25, 2017

Rust?

jnbiche · on March 25, 2017

Rust doesn't have fast compile, and I think it's hard to argue that a language without TCO is a functional language. Recursion is a critical part of the functional paradigm.

Also, I think a lot of people are attracted to Go because it's very simple to learn and use. Rust with its borrow checker is definitely not simple to learn and use.

But it's true Rust has some functional features.

bluejekyll · on March 25, 2017

That simplicity comes at a cost. The cost is duplicate code, less strict error handling and the billion dollar mistake.

I know that some people downplay the importance of these things; I find that because Rust has strong guarantees in these areas it helps to reduce work, reduce bugs and increase confidence in the software.

And use 'cargo check' during Dev for faster compiles.

jnbiche · on March 25, 2017

> That simplicity comes at a cost. The cost is duplicate code, less strict error handling and the billion dollar mistake.

In Go, I agree. But that's not necessarily the case for all simple languages. Take ML, for example. It's a very small, easy-to-learn language with excellent abstraction features that make it easy to avoid duplicate code, as well as excellent static checking and error handling.

Unfortunately, ML lacks a comprehensive, modern standard library like Go has.

platz · on March 25, 2017

> enough eyeballs left for a new language that does not have a large corporate backing

It's a solid point if the goal is winner-take-all style competitive victory. But I'm not sure software should co-op SV-startup-business exponential growth-or-die mindset. What happened to hacker culture? Are open source developers corporatist now?

/end-speculative-rant

salimmadjd · on March 25, 2017

The point I was making is for something to get enough traction, so that it would get active contributors who help mature the language, tools, etc.

I think there are only handful of people out there who can contribute in a meaningful way for a project like this. If they are consumed working on open source Swift or doing pull request on many things pushed by FB or Google or working contributing to existing projects like GHC, etc. Then the Alpaca project wont get the contributors it needs to show progress. If there is no progress, it falls into a vicious circle of no progress -> no traction -> no contributors -> no progress

wtetzner · on March 25, 2017

The good news is that you don't need that big of a community for a language to do well. It does, of course, need to be big enough, but you don't need to compete too hard with the big corporation-backed languages to have your language community grow enough that it can sustain itself.

Of course, I guess I don't have any real data on it, so this is just my intuition based on observing various languages. So, you know, just my 2 cents.

bbcbasic · on March 25, 2017

Open source is massively corporatist. Developers working for $0 to create tool chains that other developers spend their weekend learning so that the shareholders of their employers can increase their wealth.

alextheparrot · on March 25, 2017

There does seem to be a significant push to contribute to open-source at many large companies. Having money and people contributing as part (or all) of their day job can be quite a boon for projects.

I agree with the premise of your point, though.

leshow · on March 25, 2017

I think Elm is great, but the language designer has really crippled it IMO by not including abstractions for types. Something analogous to typeclasses would be hugely beneficial. As it is the language gets around it by making some built in things magic, I found this frustrating after a while.

psiclops · on March 25, 2017

I haven't used Elm yet, but if you like it, check out the language it is implemented in, Haskell

salimmadjd · on March 25, 2017

I've been trying to crack Haskell for a while, but didn't find it approachable at first. Somehow Elm and building simple web apps made grasping FP easier.

So now that I've learnt a bit of Elm, I find I can grasp Haskell more and spend a bit playing with. The best part is, after reading this [0] I'm finally grasping Monads.

[0] http://adit.io/posts/2013-04-17-functors,_applicatives,_and_...

ryanplant-au · on March 25, 2017

I learnt more about Haskell in ten days of using Elm than I would have in another ten days of studying Haskell, personally.

I think Elm makes a better introduction to FP concepts because there's much less you have to absorb before you reach the point where you can start practicing by doing useful work. Obviously part of that is the fact that Elm removes or hides certain things Haskell has, but an even bigger reason is that you can just say "...and then 'main' returns the HTML element or Html.program that actually gets displayed" and not have to go down the road of IO actions, functors, etc. You can stop at that point and start making working useful applications while getting comfortable with immutability, purity, the type system, and control flow and iteration under those constraints.

Learning Haskell first, you don't have that opportunity to stop and start practicing. You need to move on an understand at least IO actions, functors, applicatives, typeclasses, and other higher-level concepts before you can construct even a simple practice project. Dreaming up a coherent program structure/flow in this weird new immutable and pure world seems hard enough to a beginner without also having to understand how applicative functors fit into the equation. Having that opportunity to stop and stretch your legs by actually doing a project is a major help to a lot of people, that's what makes all the difference. And then 95% of what you've learned transfers directly into Haskell.

mikekchar · on March 25, 2017

I would recommend having a look at Purescript too. One of the very nice things about Purescript is that it generates very readable Javascript. One of the things I was always wondering about is how exactly tail call optimisation works. Looking at the output of Purescript for about 10 seconds answered any questions I had.

Also, though it has been posted here before (and I don't think it has been worked on for a while), I recommend playing with the Monad Challenges[0]. Well, specifically, just do set one (random numbers). You can easily write your own rand function that returns the seed value as a "random" number and then increments the seed. This will generate successive integers (1,2,3,4...). It makes it very easy to test. Then once you've done set one, go back and write map, apply, etc for Gen. One other nice thing you can do is to make a union type/ADT for your random number (i.e., (Int, Seed) ) and then try to see if it is a functor and applicative (also try to understand why or why not). Finally, you can figure out why Gen is structured the way it is.

I've played with that kata over and over and over again. It is simply beautiful.

[0] - http://mightybyte.github.io/monad-challenges/

platz · on March 24, 2017

I continue to be believe even as a static typing fan that static types are fundamentally incompatible with OTP and it's goals.

Distributed systems just seem to too thorny for static types to subjugate/bend to their will.

Sure, you can declare global invariants ahead of time that your cluster must uphold, but it's a bit less "distributed" in a real sense then

jerf · on March 25, 2017

You have to model sending a message across the cluster as marshaling into a binary form and unmarshaling it again. I don't mean that you "should" model it that way... you have to model it that way, because that's what is happening. Therefore, when receiving a message, you really only ever get a Maybe Message or Either Message Error or whatever you want to model it as. The act of marshaling the message back into the local representation is also when you check it for whether it conforms to the type restrictions you think it should have.

Because you must already model this as a process that can fail, I don't think it does break the static typing model at all. In fact I routinely "statically type" messages coming from things that were actually emitted by dynamic languages!

What gets tricky is if you try to model this as a process that can't fail. But the problem there isn't static typing, it's a specific instance of the general principle that you can not build robust systems based on the principle that networks can't fail.

I also think this is an instance of the general misunderstanding about static types, which I understand deeply because I once held it, that static types somehow prevent errors. They don't. What they do is provide a gateway that says "in order to get into this type, you must meet these criteria, and the compiler is going to statically check that you've verified these criteria". A static typing system doesn't force things through that gateway, it forces you to check whether things fit through that gateway, and do something with the things that don't. Then, it also allows you to strictly declare that everything that uses that type is statically checked to be "behind" that gateway, so there are no other ways around it to get in, thus creating a space in which you can count on the fact that the values have been checked for certain properties and you can now write code that counts on those without constantly checking them. A statically typed system faced with the task of, say, parsing a number out of a string, does not prevent a user from sending me a string of "xyz"; it just prevents me from just sending it through the system as-is.

platz · on March 25, 2017

> everything that uses that type is statically checked to be "behind" that gateway

In a distributed system, the largest the "gateway" can reliable be is a single node, because you don't get guarantees about the code that other nodes in the system are running. Even the single node case poses difficulties, because I believe in OTP the upgrade path means you have to transfer state during upgrades. What if the types of the state during the upgrade don't exactly match? Can multiple types of a thing exist simultaneously? How is these types versioned? etc... it gets complicated.

> Therefore, when receiving a message, you really only ever get a Maybe Message or Either Message Error or whatever you want to model it as.

Sure, you can receive messages as "Object" and then cast/parse them inside the node. Does that mesh with the vision of what people have when they want to bring static typing to erlang?

---

The hard part about thinking about OTP is not just the message passing, but also the myriad deployment & upgrade & versioning scenarios.

I am a fan of static typing over dynamic typing in everything else , i.e. normal programs.. just not _OTP-style_ erlang for distributed systems.

Even thinking about something like a gen_server (http://erlang.org/doc/man/gen_server.html) makes my head hurt... though if someone can figure out a way to do it that's faithful, more power to them.

louthy · on March 25, 2017

> Can multiple types of a thing exist simultaneously? How is these types versioned? etc... it gets complicated.

I don't use Erlang, but I have developed an Actor system for C# [1] which is based on its (and Akka's) concepts. Clearly without a static type-checker for the whole distributed system we have to manually get involved and patch the old and new so that we can hot swap processes. Versioning I've found is best done by maintaining the old process that accepts the old message format, maps it to the new one, and then forwards it on to the new process that accepts the new message format. Any other node that is lagging behind will continue to work, and any new one will send to the new address for the process.

This isn't really rocket science, and if you stick to a few basic rules it tends to work out just fine. That doesn't mean that type safety goes out of the window, it just means that in creating a distributed process you must accept that you can't retire the old contract without it causing potential problems.

Apologies if I'm missing your point about OTP, but ultimately it seems that at some point (as the GP says) you are marshalling a message into a text or binary format, and then unmarshalling. At that point if the unmarshalled static type doesn't match the type that the process expects, then it will be off to the dead-letter queue. I don't really see how that's any different to giving the wrong type to a function in a dynamic language, or using an incorrectly typed variable that is picked up by a compiler in a statically typed language. In each case it's type checking at the earliest possible opportunity.

[1] https://github.com/louthy/echo-process

jerf · on March 25, 2017

'Sure, you can receive messages as "Object" and then cast/parse them inside the node. Does that mesh with the vision of what people have when they want to bring static typing to erlang?'

No, that's not how you do it. You marshal things directly into the desired types. Check out either aeson for Haskell or how Go does things via either the json modules or the generic Text/Binary Marshaler/Unmarshaler.

"but also the myriad deployment & upgrade & versioning scenarios."

The answer to all of those things is mostly that even a lot of Erlang shops don't use live upgrading. You really have to have a very particular use case for that to be the best solution vs. a rolling upgrade and server restarts. Even if the language is capable of it, it still requires you to write services that can handle being upgraded, and it's much easier to write services that can handle being restarted, especially since you 100% have to write that anyhow because services get restarted anyhow. Most people don't have that use case. Web services certainly don't have that use case.

Once you drop that, it's a lot simpler.

"Even thinking about something like a gen_server"

gen_server is partially as complicated as it is as a side-effect of other decisions in the language. While the concept of a gen_server is a strength in Erlang, the specific implementation of gen_server as this "behavior" thing is mind-blowingly complicated for what you actually get. (It reminds me of Python's "metaclasses". I spent many hours wrapping my head around what that was, but in the end, all that it amounts to is what is now called a class decorator, which is way more sensible. A metaclass isn't a class decorator in theory, but in practice, class decorators are way easier to understand and cover 99.9% of the use cases, if not 100%.) When I implemented supervisor trees in Go, my solution for gen_server/gen_fsm/gen_* was just to... not. Behaviors are just a very, very weird half-object-ish system with a lot of limitations. They are easily replaced by simply having some sort of "interface" system, be it via conventional classes or interfaces. It's why you don't see "behaviors" as Erlang defines them anywhere else. Erlang has a lot to learn from and copy from, but that part isn't it.

toast0 · on March 25, 2017

After using hot code loading in production for the last 5 years, I don't see why you wouldn't use it, when it's right there. Maybe it's less thought to do a rolling restart, but it's a lot more effort expended by everything in the system to rebuild all the state that was in your processes.

A behavior is simply a list of functions you've declared that your module will export -- and a convention on what they might do. gen_server.erl is going to make lots of callbacks into your code, and rather than pass a huge list of funs, instead we pass the module name, and gen_server calls the exported functions from that module (this style means all the callbacks will hit your new code if you hot load, without you doing anything special; processing type changes is up to you, of course)

platz · on March 25, 2017

I know how aeson works, but the details of how it parses text into a HashMap that you can extract fields from into a data structure is somewhat besides the point, but I'll grant you the point that there are static solutions for message passing, sure.

It seems odd to me that they would include a unique feature like live-updating if it shouldn't be used.

I grant that live-updates and gen_servers may be anti-patterns, but my assumption was to consider the effects of static types on OTP and these are part of it.

If you identify some subset of erlang+OTP that is easier in some ways, great, I'm all for it.

I am just pointing out some complexities without making assumptions about what should be included or discarded. ( I do not know what erlang shops do in the small or in the large).

Perhaps what we want then is static types for "OTP-Lite"

yorwba · on March 25, 2017

I think even live-updating could be statically typed. Basically the live-update is a collection of functions that map every data type in the old process into the corresponding type in the new process. In the dynamically-typed case, these functions are just the identity. In the statically-typed case, if the new type has a new attribute, your mapping function has to define a reasonable default value. If you can't do that, your dynamic live-update would have gone badly anyway.

swhipple · on March 25, 2017

The main problem with pushing a typechecked live-upgrade in one shot is that you'll need to put a big lock around the distributed system (A non-upgraded node messaging an upgraded one would be fine, because the upgraded one knows the conversion function, but what happens in reverse scenario?)

It could be done without a big lock by splitting into three steps:

1) Push an upgrade that changes the types and adds the conversion functions. The valid type is the union of the old type and the new type. Wait until all nodes complete the upgrade.

2) Push an upgrade that instructs the nodes to convert their data and start using the new types by default. Wait until all nodes complete the upgrade.

3) Push an upgrade that removes the old types and conversion functions.

msangi · on March 25, 2017

Why is this a limitation only of type-checked functions?

The problem is whether the function can or cannot handle the content of the new message and that's completely orthogonal

swhipple · on March 25, 2017

The difference is that attempting to add code to a live system that mishandled an input (either from a non-upgraded node or an upgraded one) would result in a type-error and reject the upgrade.

You could still use Erlang's crashing and supervisor technique, but you would have the additional benefit of using static typing across a distributed system (where each node may or may not have received the upgrade yet).

runeks · on March 25, 2017

> [...] etc... it gets complicated

Which is exactly why we want to employ static types: in order to catch the difficulties in implementing it correctly. We describe the complications in the type system, through a model that captures them, to allow compiler errors -- rather than runtime errors -- to guide us in implementing it correctly.

Types only hinder getting an invalid program to compile -- which is exactly what we want.

platz · on March 25, 2017

In general, sure - but this post is about erlang/OTP, and the way you're speaking in generalities makes me think that you're just trying to persuade me about and champion the value of static types in general.

To digress slightly, consider an example from another domain, although I would rather keep this discussion about erlang. Now, Haskell is the only well-known language that has lazy (non-strict) semantics. Over the years many folks have proposed to make Haskell strict by default, alleviating some of the headaches that occur from non-strict evaluation. However appealing that may be, it would be a sad day if that occured, because we'd loose the only language to understand how lazy/non-strict evaluation affects how we design programs while there are countless strict languages, and lazy/non-strict evaluation has some very nice properties indeed.

Now to bring this back to erlang/OTP... sure, it is very nice when we add static types to erlang because we get all the nice things that static types provide, but we also loose some things. There are some features in erlang/OTP that are very dynamic, and forcing a static type system simply kills those features. I think that would be a sad day for the erlang, because you'd loose the ability to design distributed systems utilizing the full range of behaviors what the erlang/OTP system offers. There are already other actor systems in the world that offer static typing. You don't need erlang to build those systems—There is only one erlang/OTP that some some very unique features that none of other have.

Say, if we're talking about javascript, which runs at the level of a program on a single machine, I say bring on the types. If we have some other statically-typed actor system that works well for certain use cases, great. If we're talking about erlang/OTP, which is designed specially for fully distributed systems, I say let it be.

sgrove · on March 25, 2017

Would you mind elaborating, or sharing some papers on the subject? I'm particularly interested in a dialect of ReasonML that would use the BuckleScript compiler + ConcurrentML but target the BEAM VM, and I'd love to know how bad of an idea it might be. Maybe because it lacks e.g. session types it's hopeless, but I'm not sure.

So, would love to hear specifics!

platz · on March 25, 2017

There are parts of OTP patterns that seem inherently dynamic. Message passing is only one aspect. There are also deployment/upgrade concerns with a running system.

Actors can receive messages that change their behavior entirely ( http://erlang.org/doc/man/gen_server.html ). Features like this are not there by accident.

Actors can hot-upgrade code their dynamically while the process is running. For example, if an actor is hot-upgrading I'm not sure how it would work, if the types of the old state and the new state don't exactly match. Sure, you could write functions to do this, but you see the picture is much more complicated.

I don't think I've presented the best arguments off the top of my head here here, but if you think more about the deployment/upgrade scenarios, along with partial updates along in certain nodes of the system, you can think about how complex it could get.

Basically, never assume that you get to take the whole cluster down to do an upgrade. Comprehensive "red/black" deployment strategies used by other non-distributed languages are not really the OTP way of doing deployment/upgrades.

sitkack · on March 25, 2017

I have version N of a struct and then having version N+1 of a struct in-flight at the same time is almost impossible with current statically typed languages. In a dynamically typed language, as long as the contents of version N+k struct are additive and don't change the semantics, old code can read new data.

What needs to happen is both, immutable code, and versioned structs with pure functions that can upgrade and possibly downgrade structs as needed. The larger the distributed system, the versions of a struct (message) will be in-flight at a time. Services need to contain no state, so that they can be micro-rebooted and brought up with the new version.

Joe Armstrong had a comment on globally accessible but immutable code, which I think would go a long way towards the ability to statically type the inputs to a function in a distributed system. Interposition and routing would be the only way to upgrade or deprecate old code paths.

di4na · on March 26, 2017

Well the first question you have to answer is simple : what is the type of self(), your own pid. This is a really hard problem and what stopped SPJ in the late 90s/early 2000s

Then the second problem is that at any given time you can receive a message from another node/process 10 years in the future compared to you, that you know nothing about his code or types. How do you type check it?

Finally, the actor model in general allows unbounded nondeterminism. This is not really something you can build into a static type checker.

The "easy" solution is to make messages an opaque black box that can be anything... but at that point you are leaking static typechecking everywhere.

runeks · on March 25, 2017

> Distributed systems just seem to too thorny for static types to subjugate/bend to their will.

The more I've learned to leverage types, the more I realize that it's my limited knowledge of type systems that prevents me from expressing something in it. Types do not bend to the will of programs; programs bend to the will of types (in statically typed languages).

> Sure, you can declare global invariants ahead of time that your cluster must uphold, but it's a bit less "distributed" in a real sense then

I don't understand. The components of distributed systems communicate via protocols. What prevents the implementation of these protocols from leveraging type safety, thus transforming a runtime error into a compile-time one?

Static typing is about catching programmer mistakes, by communicating your intent to a compiler -- "I expect the type of this to be a Maybe Int, fail if that's not the case". There's no essential difference between a test informing you that a value-level property doesn't hold up at runtime, and a type error, informing you that a type-level property doesn't hold up at compile-time.

platz · on March 25, 2017

> What prevents the implementation of these protocols from leveraging type safety

Global invariants of a running distributed system are different than local invariants in a single program that you can stop, deploy re-compiled binaries to, and then start again.

Now, you can use static types in actor systems, and they are some of these that exist. These typed actor systems don't do all the same things that erlang/OTP does (that may be ok - maybe you don't need them). If your use case fits into what the typed actor systems actor systems provide, by all means, one of those are probably a better fit for you.

nv-vn · on March 25, 2017

I suggest you look into Session Typing, specifically Multiparty Session Types as these provide a refreshing approach to the problem of communicating threads. A lot of it is still kind of experimental but there's some good traction being made for sure and it's probably as expressive as you'd need to get to model 95% of the type information in an Erlang program. Type inference obviously isn't a choice yet, but I think a good language offering some of these features on the BEAM VM is all that is needed to make them hit the mainstream and actually get used for real software so that more work can go into the theory, etc. The problem is being solved on the bleeding edge of things, just not as fast as Erlang itself is progressing.

dalailambda · on March 25, 2017

While I agree that the OTP perhaps is not as easily statically typed, since it was built with Erlang in mind, I do think that static typing adds a layer of robustness to distributed systems, especially if you design it that way upfront. In my experience the problem comes when you try to apply static typing to a dynamic system.

leshow · on March 25, 2017

> In my experience the problem comes when you try to apply static typing to a dynamic system.

As in, because it compiles down to a dynamic system, it's no good?

There are plenty of languages that give us strong static guarantees and compile down to dynamic or untyped languages. Look at Purescript, Elm, etc. They all do quite well compiling down to JS.

Don't forget that assembly isn't strongly typed either, and most languages compile down to that. I don't see anything wrong with a static typed layer that compiles to dynamic code, the interface you're providing is still type safe.

dalailambda · on March 26, 2017

In regards to Purescript/Typescript, they're both statically typed and that results in friction when trying to integrate with the existing JavaScript ecosystem/libraries. Erlang/OTP might be different, but there will probably be situations where the type system is either incompatible with a certain library, or the type system is made less strict (e.g. an any type).

leshow · on March 26, 2017

That wasn't what I got out of the previous post, it seemed to be saying there was something inherently unsafe about compiling down to a dynamic language.

brightball · on March 25, 2017

I tend to agree with you. Message passing and static types don't mesh unless there is some type of contract between the sender and receiver. It would be a nightmare.

msangi · on March 25, 2017

How does dynamic typing help if there is no contract between sender and receiver? Even in that case they must agree on the content of message.

Even if you insist in keeping the message untyped, with a static type system one could always convert (and possibly reject) messages as soon as they are received into a more precise type. That would keep the code that the compiler can't verify to the edges of the system.

platz · on March 25, 2017

Distributed systems are not like traditional programs, because there is not just one "edge of the system".

Every node becomes an "edge" in it's own right, and doesn't necessarily have global coherence with the rest of the system.

louthy · on March 25, 2017

True, but if the sender wants the receiver to do something of value then it will need to meet a contract that the receiver enforces. That doesn't require a central repository of contracts, one node can diverge, but you must understand that parts of your network of services will start to fail. From that point of view it starts to look very much like the linker phase of a compilation, and that the types need to match up to the data structures being instantiated. It's just this 'linking' phase is in the programmers heads, and not particularly useful. A distributed system that can validate itself is a much more valuable concept.

platz · on March 25, 2017

Absolutely, the rubber meets the road at some point, nodes must understand/assume "contracts" about the data they are working with.

There already are static typed actor systems ( e.g. Orleans) which work well, but my point is that I believe OTP is more flexible for better or worse. Whether that flexibility is worth it to you for what you get is another matter.

Also I'm not sure how to think about binary compatibility between upgrades in such a system

louthy · on March 25, 2017

> There already are static typed actor systems (e.g. Orleans)

Yep, I develop one myself. And have gone to the extent of not allowing senders to even post a message if it's of the wrong type (processes in nodes publish the types they accept to a central store). I initially went along with the 'accept anything' approach (which Akka really majors on too), but found that for the large systems I was developing that it became a real headache to deal with.

> but my point is that I believe OTP is more flexible for better or worse. Whether that flexibility is worth it to you for what you get is another matter.

Yep, fair enough, if it works for you, who am I to complain? It's not worth it for me, because I feel quite strongly that the code I write should understand the types it's working with. It feels like this super-late binding can give false positives, appear to work, when in fact it's not. That scares the shit out of me when systems get large.

platz · on March 25, 2017

I gotcha, I am not even an erlang programmer. My two main languages are C# and Haskell and in general I abhor dynamic types.

All I am trying to do here is enumerate the difficulties in trying to Type "OTP" in Erlang (its main selling point) , and am not commenting on all possible actor systems.

louthy · on March 25, 2017

Understood :)

brightball · on March 25, 2017

Think of it this way: you have a struct type with 4 attributes that you want to pass to another function.

Currently, that function declares it will match on the pattern of those 4 attributes rather than a static type. Now, you update the Node and modify the type on the sending Node to have 5 attributes.

With pattern matching on the 4, everything still works. With static types on the struct the contract is now out of sync.

louthy · on March 25, 2017

I find this to be a really poor argument. Essentially you're lucky if your systems continue to work as others go off changing message formats without consideration for the code that will receive it?

On a suitably complex/large system this is a recipe for disaster. Things start to slowly rot. It is far better to maintain the old function, accepting the old struct, map it to the new struct and forward it on to the new function that accepts the new struct. Let the old one consume anything that's already queued, or being sent from other nodes that haven't yet been upgraded whilst the new one takes the new format.

cousin_it · on March 25, 2017

I've worked with systems like that for years, and it's fine. We can have several large binaries with different release schedules, passing around a big struct with 50 fields and many nested structs, with different people making changes to different parts. And nothing breaks. New code accepts old structs, old code accepts new structs, no conversion code required.

To achieve that, we follow the design of Protocol Buffers:

1) Each field in each struct has both a name and a numeric id. Only ids are used for serialization, so field names can be changed at any time.

2) All fields are marked as optional or repeated, never required. Most code is written to handle missing fields gracefully.

3) Changing the type or id of an existing field is forbidden. (Note that changing the contents of a nested struct doesn't count as changing its type.)

4) Adding a new field is okay, as long as you use an id that was never used before. (Each struct definition has a comment indicating the next available id to use.)

5) Removing a field is okay if you've checked that no one is using it anymore.

6) As a small but intentional bonus, you can change an optional field to repeated while preserving binary compatibility.

In the end it works out. You can think of breakages that could theoretically happen, but they don't.

louthy · on March 25, 2017

What you're describing sounds like a manually implemented type system.

> Each field in each struct has both a name and a numeric id. Only ids are used for serialization, so field names can be changed at any time.

Fair enough your field names can be renamed. But the 'contract' is field numbers, not names.

> All fields are marked as optional or repeated, never required. Most code is written to handle missing fields gracefully.

So if all fields are optional, and you provide no fields at all, what happens? I assume the process rejects it, because it's not of the correct type?

> Changing the type or id of an existing field is forbidden.

Forbidden by what?

> Adding a new field is okay, as long as you use an id that was never used before. (Each struct definition has a comment indicating the next available id to use.)

I can understand this being the least problematic change to a type. But it still leads to 'if x has y field' behaviour, as your code tries to manage the full range of possible message types it might receive.

> Removing a field is okay if you've checked that no one is using it anymore.

That sounds super fluffy.

> As a small but intentional bonus, you can change an optional field to repeated while preserving binary compatibility.

Sorry, I don't follow? This bit confuses me 'change an optional field to repeated'.

> In the end it works out. You can think of breakages that could theoretically happen, but they don't.

I can think of many:

* If picking of IDs is done by a human, at some point a human will make a mistake and re-use an existing one

* If 'Changing the type or id of an existing field is forbidden' is a human enforced constraint, then it will fail

* If you think a certain struct pattern can't happen any more (you think you've retired all nodes that send the old format), and then you deprecate the many matches that deal with legacy messaging, and then realise that actually there is an old node that does it after all.

* You may re-add a field to a type which was previously removed and cause unexpected behaviour in parts of the system that match on that old format

* Removing a field that you thought wasn't used any more but actually still is

By the way, I'm not suggesting it's not possible to develop robust systems without a static type system of some sort; but I do think the hoops you're jumping through in items 1-6 indicate the problems of not using static types. Each change in functionality could just use a new struct, with a new function, and the old function maps to the old struct to the new one. It captures precisely the change in logic in one place, has no runtime cost for nodes that are sending the new struct, and can't lead to the edge cases that I listed above.

cousin_it · on March 25, 2017

It helps that we have one big source control repository for the whole company, which runs many custom hooks that can block changes from being committed. For example, if I try to add a field whose id is lower than the next available id specified in the .proto file, my change won't commit. Same if I try to remove a field that's still used by someone else and my change breaks their build.

More generally, I think RPC interfaces need to be forward-compatible by design. If you have two binaries that are released on different schedules, and the API between them is fully rigid, how do you ever change it? Version the whole API, for a change that adds one boolean feature flag to one struct somewhere? Write a converter for fifty existing fields every time you add a new field, leading to O(n^2) programmer work over time? Come on.

Even more generally, I think static types are a great idea, but they work best locally. Communication over longer distances (in space and time) requires a different set of tradeoffs. There's a reason why people design network protocols and file formats with open-ended forward compatibility in mind. RPCs are kind of a middle ground, and I've found that the tradeoffs in Protocol Buffers work pretty well. YMMV.

di4na · on March 26, 2017

False. In erlang, your message passing is "at most once".

If you send a bad message, the receiver will crash or discard it and it is how it is intended to be.

Erlang embrace laws of maths and physics.

jordwest · on March 25, 2017

I would say there should always be a contract between the sender and receiver, whether that's using static types or otherwise. Not having a contract is a nightmare.

For example, say a satellite sends a number to the throttle control in feet/second, but the throttle control thinks its in m/s. To each of those systems, they're just passing a number and don't know any better.

brightball · on March 25, 2017

Every JSON API call currently works without a contract. In theory it should have one, but in reality it doesn't unless the server (hopefully) validates. Either can change at any time without informing each other.

WSDL based APIs on the other hand have clearly defined contracts at both ends but there's more overhead involved.

msangi · on March 25, 2017

It's not an explicit contract, but there must be an implicit one for stuff not to break

jrobn · on March 25, 2017

I also agree with this premise. I favor the gradual typing philosophy more and more. For me at least, productivity wise, being able to write something, play around with it, make changes, etc without worrying to much about satisfying type requirements is great. When the idea and and implementation feels solid go back and gradually add in type requirements.

I would love to see Erlang get a LLVM based JIT compiler backend. I think this http://llvm.org/devmtg/2014-04/PDFs/Talks/drejhammar.pdf is the latest work done in that area.

di4na · on March 26, 2017

There was a talk at the eef17 this week from the OTP team on it. It is on erlang solutions YouTube channel.

Sadly on phone so can hardly link it

di4na · on March 26, 2017

Found it !

https://youtu.be/PtgD5WRzcy4

Tarean · on March 25, 2017

I think it is necessary to go with an erlang like style if one wants to get a remotely acceptable cost model.

I am not sure how much static typing actually hinders and how much that is a matter of tooling, though. Maybe static typing could do things like checking whether the new version will be compatible with other nodes before deploying?

haspok · on March 25, 2017

I think this is great news for the Erlang VM: while you wouldn't want to use static typing for any program you write, there is a very specific use-case where you definitely want to do that: embedding business logic in your application.

I've been there, done that: encoding business rules in Erlang is no fun, hard to test, and definitely hard to read and modify later. In this particular domain the constraint of types does not slow you down, in fact, it speeds up development. A large amount of unit tests can become unnecessary just because of the type checking. And the more expressive your type system, the fewer tests you need - and the code and the remaining tests can concentrate on validating business logic instead of validating programming language logic ("here is a map - do I have a value with key X in it?" - maybe a bad example because of pattern matching, but I hope you get the idea).

You definitely have to be able to interface with OTP, but I don't see it as a huge problem - parts of your application could and should be written in Erlang, there is nothing wrong with that.

im_down_w_otp · on March 25, 2017

I'm not sure I'm following you regarding the difficulty of encoding business logic into your code in Erlang.

Erlang's function-head matching system is extremely close to being a Prolog-style logic programming system when used a certain way.

I've found it extremely easy to take what would normally be a big weird database of rules and values and instead precompile every possible route through the system into a bunch of generated function-head matched function calls + guard clauses. It makes assuring that given inputs will definitely produce correct outputs very easy, and makes processing the rules extremely fast.

Athas · on March 25, 2017

I am intrigued by this snippet from the README:

    type messages 'x = 'x | Fetch pid 'x

This appears to define a sum type where one of the variants is left with an implicit constructor. How do you pattern match on that? How do you do type inference?

j14159 · on March 25, 2017

That's a good question and the short answer is that we're deliberately breaking decidability to allow people to use types that are relatively in line with what we're used to in Erlang (I get into this a bit in the talk linked above). There's a reasonable argument to be made that we should knock this off of course but I'm generally biased in favour of making interoperation with the ecosystem simpler :)

We should actually clean up that example you pointed out, matching on the `Fetch` constructor first. What we're really trying to support is unions like:

    type number = int | float

There's a yet un-had argument about the utility of this as well of course and we may want to remove the ability to do this entirely. The way we type this is by actually using type-checking guard functions like `is_integer` to convey information to the typer at compile time.

jmcdiesel · on March 25, 2017

Can anyone explain why it seems like so many new languages are reinventing things that seemingly have little effect, but they seem to be changing them "just because"

We have comments in code for decades now. // and /* */ are easily the bigest standard, with # coming in second. Why '--' in this language? Why "``" in another language i saw recently?

I cant imagine this gives any real benifit to the coder or the compiler, and it seems to be more difficult because now the IDEs have to be configured for a(nother) new comment type, it has be become muscle memory again, its yet another "common ground" peice of code that requires context switching to change between languages...

I get doing new things with the functional features of a language, im all for trying new things and seeing what works... just seems wierd to have so many ways to comment code... such an insignificant part, why change?

SEMW · on March 25, 2017

> why change?

They didn't. The syntax is explicitly stated to be a mixture of OCaml and Elm. The -- comment syntax is from Elm. Elm in turn got it from Haskell.

lepoetemaudit · on March 25, 2017

It's -- in both Elm and Haskell for comments, so it's not new and part of the heritage of the language.

yawaramin · on March 25, 2017

You may be pleased to hear that ReasonML chose /* */ for its OCaml syntax refresh just for familiarity.

SmellyGeekBoy · on March 25, 2017

SQL uses '--' for comments.

pka · on March 25, 2017

There's also purerl [0], an Erlang backend for PureScript. Would be cool if the two projects could join forces.

[0] - https://github.com/purerl/purescript

sctb · on March 24, 2017

Related discussion from last year: https://news.ycombinator.com/item?id=11992773.

sergiotapia · on March 25, 2017

>Apache License, Version 2.0

Pardon my ignorance, but why not make it MIT and completely avoid any licensing issues?

geofft · on March 25, 2017

The Apache license has an explicit patent grant (the MIT license says "permission to use", which isn't a copyright grant, so it probably has an implicit patent grant), and an explicit statement that patches intentionally submitted for merge are submitted back under the Apache license.

The reason we have licenses at all instead of the Unlicense or similar is to make things unambiguous for courts and lawyers. Explicit is better than implicit. The length of the MIT license isn't actually a feature.

(And the contribution section seems like it avoids licensing issues that MIT doesn't.)

dmm · on March 25, 2017

License bikeshedding is fun, so please indulge me.

People dislike apache because it's complicated and requires annoying notices on distribution of modified versions. Debian and fsf say it's free. OpenBSD believes the patent provisions are non-free and refuses to include apache licensed software.

I think a project is better off having non-trivial contributers sign explicit license grants, even you admit that explicit is better than implicit.

geofft · on March 25, 2017

I don't understand how the OpenBSD project defines "free", so I can't usefully comment on how they consider the Apache license "non-free".

It sounds from https://softwareengineering.stackexchange.com/questions/2632... like they are reading the existence of an explicit patent clause in the Apache license (regardless of what that clause is!) as an "additional restriction". I want to know whether they believe the MIT license has a patent grant, and if not, what they think "Permission to use" means.

liveoneggs · on March 25, 2017

until they are taken to court the licenses are all just assumptions. Did OpenBSD have a lawyer review this decision and post the language somewhere?