
> copy and paste a higher-level construct like an Agent or Genserver and add the 1 line of code to this root supervisor that was just a file auto-generated in your project. But that'll get you a) introduced to the actor model and b) thinking about messaging while c) not ever worrying about or messing up concurrency.

Isn't it well known that GenServers can become severe bottlenecks unless you know the inner workings of everything to the point where you're an expert?

I'm not an Elixir expert and haven't even used a GenServer in practice, but I remember reading warnings about GenServer performance: each one can only handle one message at a time, and it's super easy to bring down your whole system if you don't know what you're doing.

This blog post explains how that happens: https://www.cogini.com/blog/avoiding-genserver-bottlenecks/

And I remember seeing a lot of forum posts around the dangers of using GenServers (unless you know what you're doing).

It's not really as easy as just copy / pasting something, adding 1 line and you're done. You need to put in serious time and effort to understand the intricacies of a very complex system (BEAM, OTP) if you plan to leave the world of only caring about local function execution.

And as that blog post mentions, it recommends using ETS but Google says ETS isn't distributed. So now suddenly you're stuck only being able to work with 1 machine. This is a bit more limiting than using Python or Ruby and deciding to share your state in Redis. That really does typically require adding only a few lines of code, and then your state is saved in an external service and you're free to scale to as many web servers as you want until Redis becomes a bottleneck (which it likely never will). You can also freely restart your web servers without losing what's in Redis.

I know you can do distributed state in Elixir too, but it doesn't seem as easy as it is in other languages. And it's especially more complicated / less pragmatic than other tech stacks because almost every other tech stack uses the same tools to share external state, so it's a super documented and well-thought-out problem.




A GenServer may become a concurrency bottleneck, just as any abstraction for concurrent data access in any other language may become a bottleneck. This is nothing specific to Elixir.
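
To make that concrete, here is a minimal sketch (module name and timings are made up) of how a single slow GenServer serializes its callers: the process handles one message from its mailbox at a time, so every caller queues behind the 100ms handler.

    defmodule SlowServer do
      use GenServer

      def start_link(_opts),
        do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

      def init(state), do: {:ok, state}

      # One message at a time: a 100ms handler caps throughput
      # through this single process at ~10 calls per second.
      def handle_call(:lookup, _from, state) do
        Process.sleep(100)
        {:reply, :ok, state}
      end
    end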

What Erlang/Elixir bring to the table is a good vocabulary and introspection tools for observing these issues. For example, if you have a GenServer as a bottleneck, you can start Observer (or the Phoenix LiveDashboard or similar), order processes by message queue, and find which one is having trouble catching up with requests. So we end up talking about it quite frequently - it is easier to talk about what you see!
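
For instance, a rough equivalent of that "order by message queue" step from an IEx shell (not the exact Observer workflow, just a sketch):

    # Find the 5 processes with the deepest mailboxes on this node.
    Process.list()
    |> Enum.map(&{&1, Process.info(&1, :message_queue_len)})
    |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
    |> Enum.sort_by(fn {_pid, {:message_queue_len, len}} -> len end, :desc)
    |> Enum.take(5)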

If you need distributed data, then by all means, use Redis or PostgreSQL or similar. ETS is not going to replace them. What ETS helps with is sharing data within the same node. For example, if you have a machine with 8 cores, you may start 8 Ruby/Python instances, one for each core. If the cache is stored on Redis, you will do a network roundtrip to Redis every time you need the data. Of course you can also cache inside each instance, but that can lead to large memory usage, given each instance has its own memory space. This may be accentuated depending on the data, such as caching geoip lookups, as mentioned in the post you linked.

In Elixir, if you have 8 cores, it is a single instance. Therefore, you could cache geoip lookups in ETS and it can be shared across all cores. This has three important benefits: lower memory usage, reduced latency, and an increased cache-hit ratio in local storage. At this point, you may choose to not use Redis/DB at all and skip the additional operational complexity. Or, if you prefer, you can still fall back to Redis, which is something I would consider doing if the geoip lookups are expensive (either in terms of time or money).
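
A minimal sketch of that shared cache (module and table name are made up; the fallback function stands in for the expensive lookup):

    defmodule GeoCache do
      @table :geoip_cache

      # Called once at startup; :public lets any process on the
      # node read and write, so all cores share one copy.
      def setup do
        :ets.new(@table, [:named_table, :public, read_concurrency: true])
      end

      def lookup(ip, fallback) do
        case :ets.lookup(@table, ip) do
          [{^ip, geo}] ->
            geo

          [] ->
            geo = fallback.(ip)
            :ets.insert(@table, {ip, geo})
            geo
        end
      end
    end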

In any case, ETS is completely optional. If you just want to go to Redis on every request, you can do that too! And for what it's worth, if I need distributed state, I just use the database too.


> Or, if you prefer, you can still fall back to Redis, which is something I would consider doing if the geoip lookups are expensive (either in terms of time or money).

What if the geoip lookup took 1.5 seconds to look up from a remote API? Is ETS still the right choice?

Based on your statement, it sounds like you wouldn't since that's a long time (relative to a 25ms response). But if ETS is meant to be used as a cache, wouldn't that defeat the purpose of what it's meant to be used for?

Like, if I wanted to cache a PostgreSQL query that took 1 second to finish. Isn't ETS the primary place for such a thing? But 1 second is a long execution time. I know Cachex (the Elixir lib) uses ETS to cache things, so now I'm wondering if I've been using it for the wrong thing (caching database calls, API call results, etc.).

Normally in Python or Ruby I would have cached those things in Redis and lookups are on the order of microseconds when Redis is running on the same $5 / month VPS as my web server. It's also quite speedy over a local network connection too for a multi-server deploy. Even with a distributed data store in Elixir, you'd hit the same network overhead, right?

> if I need distributed state, I just use the database too.

This part throws me off because I remember hearing various things in Phoenix work in a distributed fashion without needing Redis.


> What if the geoip lookup took 1.5 seconds to look up from a remote API? Is ETS still the right choice?

I would use ETS to cache local lookups (for all cores in the same node), then fall back to Redis to populate the ETS cache. But again, feel free to skip either ETS or Redis. The point is that ETS adds a different tool you may (or may not) use.
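
A sketch of that two-tier lookup, assuming an ETS table like the one sketched above and the Redix library for the Redis side (remote_lookup/1 is a hypothetical stand-in for the expensive API call):

    defmodule GeoLookup do
      def geoip(ip) do
        case :ets.lookup(:geoip_cache, ip) do
          [{^ip, geo}] ->
            geo

          [] ->
            # Miss the node-local cache: fall back to Redis,
            # then to the remote API, and backfill ETS.
            geo =
              case Redix.command(:redix, ["GET", "geo:#{ip}"]) do
                {:ok, nil} -> remote_lookup(ip)
                {:ok, cached} -> cached
              end

            :ets.insert(:geoip_cache, {ip, geo})
            geo
        end
      end

      # Placeholder for the slow (1.5s) remote API call.
      defp remote_lookup(_ip), do: raise("hypothetical remote API call")
    end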

> Like, if I wanted to cache a PostgreSQL query that took 1 second to finish. Isn't ETS the primary place for such a thing?

Here is the math you need to consider. Let's say you have M machines with N cores each. Then remember that:

1. ETS is local lookup

2. Redis is distributed lookup

If you cache the data in memory in Ruby/Python, you will have to request this data from PostgreSQL M * N times to fill in all of the caches, one per core per node. Given the number of queries, I will most likely resort to Redis.

In Elixir, if you store the data in ETS, which is shared across all cores, you will have to do only M lookups. If I am running two or three nodes in production, then I am not going to bother to run Redis because having two or three different machines populating their own cache is not an issue I would worry about.
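
To put numbers on it: with M = 3 machines and N = 8 cores, the per-instance caches in Ruby/Python need 3 * 8 = 24 PostgreSQL queries to warm up, while ETS needs only 3, one per node.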

> > if I need distributed state, I just use the database too.

Apologies, I meant to say "persistent distributed state" as not all distributed state is equal. For ephemeral distributed state, like Phoenix Presence and Phoenix PubSub, there is no need for storage, as they are about what is happening on the cluster right now.


My opinion is that this depends entirely on the cost relative to the overall task, and how likely cache hits are to occur. If cache hits are very likely and the task occurs frequently, I'd strongly consider storing it in ETS. If cache hits are unlikely, then it depends purely on how expensive the task is, but generally there isn't a lot of benefit to caching things that are infrequently accessed.

I wouldn't cache database queries unless the query is expensive, or the results rarely change but are frequently accessed.

Generally though, whether to store something in ETS or not is situational - your best bet is actually measuring things and moving stuff into ETS later when you've identified the areas where it will actually make a meaningful difference.

> This part throws me off because I remember hearing various things in Phoenix work in a distributed fashion without needing Redis.

This is true, but it depends on what kind of consistency model you need for that distributed state. The data you are referring to (I believe) is for Phoenix Presence, and is perfectly fine with an eventually consistent model. If you need stronger guarantees than that, you'll need a different solution than the one used by Phoenix - and for most things that require strong consistency, it's better to rely on the database to provide that for you, rather than reinvent the wheel yourself. There are exceptions to that rule, but for most situations, it just doesn't make sense to avoid hitting the database if you already have one. For use cases that would normally use ETS, but can't due to distribution, Mnesia is an option, but it has its own set of caveats (as does any distributed data store), so it's important to evaluate them against the requirements your system has.


Couple of other solid responses here, but going to add my own -

GenServer is immaterial - it's the single process that can be a bottleneck. Just like a single thread can be in other languages. If you need multiple threads in other languages, you'll need multiple processes in Erlang. The nice thing here being the cost of spinning up more (even one per request) is negligible, the syntax is trivial, and the error model in the event something goes wrong is powerful, whereas threads in other languages are so heavyweight you have to size the pool, they can be complicated to work with, and if things go wrong your error handling options tend to be limited.

Use ETS in the places you'd use in-memory caching. That's it. That's what it's meant for. If you need distributed consistency, it's not the right answer. If you need a single source of truth, it's not the right answer. But if you need a local cache that does not require strong consistency (and that's very, very common in distributed systems), it works great.


> The nice thing here being the cost of spinning up more (even one per request) is negligible, the syntax is trivial

Do you have an example of the syntax?

Basically, how would you wire things up so it becomes difficult or maybe even impossible to shoot yourself in the foot with GenServer bottlenecks?

Also, if you had to guess: why would the author of the blog post I linked not make this trivial syntax adjustment instead of rewriting everything to use ETS?

Unfortunately posts like theirs are what comes up first when you Google for things like GenServer vs ETS.


An Elixir/Erlang process may be used to both do work and store state. This is what a GenServer is for. I don't have experience building large systems (or even production code) in these languages but have been doing tons of reading and playing in the past months.

The intuition I've built so far is that holding state in a process (GenServer being an abstraction around a process) is reserved for stuff like "the current state of a game table", where you would have one process per game being played. This state is only being read and manipulated by the process itself and will be thrown away when the game is over (which might make it sound like an object!). If you suddenly had a requirement to show live stats from thousands of games being played, then one option would be to start sending that data to ETS.
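
For example, a hypothetical per-game process might look like this, with one started per active game:

    defmodule GameTable do
      use GenServer

      def start_link(game_id),
        do: GenServer.start_link(__MODULE__, game_id)

      def init(game_id),
        do: {:ok, %{id: game_id, players: [], moves: []}}

      # Only this process touches its own state; when the game
      # ends, the process exits and the state goes with it.
      def handle_call({:move, player, move}, _from, state) do
        {:reply, :ok, %{state | moves: [{player, move} | state.moves]}}
      end
    end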

One big difference to point out is that to store state in a process you need to use the language's data structures (lists, maps, keyword lists) which can get slow when they grow HUGE. ETS is an actual key-value store with incredibly fast lookup (and can easily be read by multiple processes).

I hope that makes sense—I'm also testing my own knowledge here :)


Thanks, that makes sense.

And I think potentially even more sense if Live View is just creating a GenServer under the hood.

This would explain why if you had 1,000 people connected to your site through Live View, you would have 1,000 stateful processes running on your server. That would be 1 GenServer for each connected client, each in their own universe with their own un-shared data / state.


I think it's best to just think about processes instead of specifically saying "GenServer". GenServer is just one way to interact with processes. For example, if you wanted to run something in the background that doesn't hold any state, you could use a task like `Task.start_link(fn -> some_long_running_function() end)` (though technically I do believe Tasks use the GenServer API behind the scenes). You can also create and manage processes yourself with `spawn`, though it's not recommended unless you REALLY know what you're doing. Even then I think there are many use-cases for it (but again, I'm not very experienced here).
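
For example, fanning out one short-lived process per item is tiny (the squaring function here is just a stand-in for real work):

    # Spawns up to 50 concurrent processes, one per item.
    1..1_000
    |> Task.async_stream(fn n -> n * n end, max_concurrency: 50)
    |> Enum.to_list()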

Also yes, LiveView does indeed have one process for each connected client! The GenServer API is available in your LiveView modules.


Process bottlenecks are a design problem, not a language or syntax problem, and are largely mitigated by a few points that can be factored in during design or PR review:

- Be wary of places where you have N:1 process dependencies, where N is large and the number of messages exchanged between each member of N and the single process are frequent/numerous. Since each process can only handle received messages sequentially, there is little point in spawning a lot of tasks in parallel if each process has to talk to the same upstream process to do anything

- Set up telemetry that samples the number of messages sitting in the process mailbox; if a process is becoming a bottleneck, it is going to be frequently overloaded and have a lot of messages in its mailbox. If you have the telemetry, you can see when this starts to happen, and take steps to deal with it before it starts causing problems for you (a sketch follows after this list). Likewise, it's probably useful in general to have telemetry on how long each unit of work takes in server-like processes, so you can get a sense of throughput and factor that data into your design.

- Avoid sending large messages between processes, instead spawn a process to hold the data and then send a function to that process which operates on the data and returns only the result; or store the data in ETS if you have a lot of concurrent consumers. It can also be helpful to denormalize the data when you store it in ETS so you can access specific parts of it without copying the entire object out of ETS on every access. The goal here is to make messaging cheap and avoid copying lots of data around.

- Take steps to ensure process dependencies in your design are structured as trees, i.e. avoid dependency graphs that can contain cycles. It is all too easy for a change to introduce the possibility of deadlock if you play fast and loose with what processes can talk to each other. If your process dependencies mirror your supervisor tree, then you can protect against this by only allowing dependencies between branches in one direction (usually toward the parts of the tree that were started earlier in the supervisor tree)
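
As promised above, a sketch of a mailbox sampler (assumes the :telemetry package; the module, event names, and interval are made up):

    defmodule MailboxSampler do
      use GenServer

      def start_link(name),
        do: GenServer.start_link(__MODULE__, name)

      def init(name) do
        :timer.send_interval(10_000, :sample)
        {:ok, name}
      end

      # Every 10s, emit the watched process's mailbox depth so
      # you can alert on it before it becomes a problem.
      def handle_info(:sample, name) do
        with pid when is_pid(pid) <- Process.whereis(name),
             {:message_queue_len, len} <- Process.info(pid, :message_queue_len) do
          :telemetry.execute([:my_app, :mailbox], %{length: len}, %{name: name})
        end

        {:noreply, name}
      end
    end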

I think the problem is that Elixir is still relatively young, and due to the language's evolution and the lack of established, documented doctrine from the Erlang community, there are a lot of techniques, tips, design patterns, etc., that are being rediscovered; likewise, there are a lot of seemingly good ideas that turn out to be not so great in practice, but are encountered on the road to the truly sound patterns. So you get a lot of people writing about the lessons they are learning, and because of the gaps in knowledge, the result is that the information may be missing things, or providing a more complex solution when there is a simpler one, etc. Ultimately this is an important process, and now that Elixir has largely stabilized, this will only improve (and it is already pretty good, certainly far better than when I first started with the language years ago).


Thanks.

Do you have any code examples or practical applications on how to apply most of those bullet points?

Those are all very daunting things to approach without examples but they sound very important.

In Python or Ruby I would have just thrown things into Redis as needed without thinking about it, and this hasn't failed yet in years of development time with tens of millions of events processed (over time). Send the ID of a DB object to a worker, let the worker library deal with it, look up the ID in the DB when the work is being done, let the worker library deal with the rest, and move on to the next thing.

And for caching, it's just a matter of decorating a function or wrapping some lines of code to say it should be cached and everything works the same with 1 or 10 web servers when the state is saved into Redis (major web frameworks in Python and Ruby support this with 1 line of configuration).


For the use case you are describing, none of my points are important really - an HTTP request that hits a database, then pushes something onto a queue for background processing doesn't exhibit any problems from a process bottleneck point of view on that end of things. You still need to have some logic to deal with backpressure from the queue, but that is a language agnostic concern.

Where you could hit a bottleneck might be in the background processing though, take for example the following scenario:

- A pool of N background job worker processes each pull an item off a queue, and spawn a process to perform the task in isolation

- A singleton process S provides exclusive access to some resource

- Each task calls some code which needs to interact with the resource controlled by S.

The problem with the above is that all of that concurrency/parallelism is nullified by the fact that the tasks are all going to block on S to do their work, the bottleneck of the design.
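
In condensed form (SingletonS is a made-up name), the shape looks like this; all the parallelism funnels through one process and runs serially:

    tasks =
      for job <- 1..100 do
        Task.async(fn ->
          # Every "parallel" task blocks here, one at a time.
          # With enough jobs, callers start timing out.
          GenServer.call(SingletonS, {:with_resource, job})
        end)
      end

    Task.await_many(tasks)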

To be clear, you should always gather telemetry first, but let's assume that you've gathered it and you can clearly see that this bottleneck is an issue (the process mailbox frequently has many messages waiting to be received, the average time to completion for jobs is increasing). How to solve this depends on why the resource is held by S in the first place.

If it's because the resource is not thread safe and requires exclusive access, then unless you can find a way to avoid needing the resource in every task, there isn't much you can do, but this should be fairly uncommon in practice.

If S exists because you needed to store some shared state somewhere, and someone told you that an Agent or GenServer was the way to go, then you could move that data to ETS and make it publicly accessible, so that functions which operate on that data read it from ETS directly rather than call the process. Now you've removed that bottleneck entirely.
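
Concretely, that can be as small as creating the table as :public at startup and letting readers hit it directly (the names here are illustrative):

    # At startup: any process on the node may now read/write.
    :ets.new(:shared_state, [:named_table, :public, read_concurrency: true])
    :ets.insert(:shared_state, {:config, %{rate_limit: 100}})

    # Readers no longer block on any process:
    [{:config, config}] = :ets.lookup(:shared_state, :config)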

If S exists because it needs to protect access to some data, but not all of it, and most tasks don't need to access the protected data, then you can move the parts that do not need to be protected into ETS, and keep the rest in the process. This might reduce the amount of contention on that singleton process by a huge amount, but if even half the processes no longer need to block on accessing it, then you've regained at least that much concurrency in the task processing code.

---

The example above is something I've seen numerous times, but the important pattern to note is that you have some task that you've tried to parallelize by spawning multiple processes, but that task itself depends on something that is not, or cannot be done concurrently/in parallel.

Any time this pattern arises, you need to either find a way to enable concurrency in that dependency, or you should avoid doing the task in parallel in the first place. This is ultimately true of any parallelizable task - it's only parallelizable if all of the task's dependencies are themselves parallelizable; otherwise you end up bottlenecked on those dependencies and you've gained little to no benefit.

Where it becomes a bigger problem is when you consider the system at a higher level. Bottlenecks reduce throughput, which may end up, via backpressure, causing errors on the client due to overload, or depending on the domain, data being dropped because it can't be handled in time (e.g. soft real-time systems).

I don't have any code examples that really encompass all of this in one place, if you are interested in something specific, I can try to throw something together for you. Or if you have specific questions I can point you to some resources I've used to help understand some of these concepts.


> And I remember seeing a lot of forum posts around the dangers of using GenServers (unless you know what you're doing).

The danger is using a single GenServer process instead of multiple, which can create a single-process bottleneck that won't use multiple cores. You don't have to know any intricacies of the BEAM or OTP to design around this by using multiple process instances of the GenServer.
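
One sketch of "multiple instances": on Elixir v1.14+ you can use PartitionSupervisor to start one instance per core and route calls by key (MyApp.Worker here is a hypothetical GenServer module):

    # In your supervision tree: one MyApp.Worker per partition.
    children = [
      {PartitionSupervisor, child_spec: MyApp.Worker, name: MyApp.Workers}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)

    # Calls with different keys land on different instances:
    GenServer.call({:via, PartitionSupervisor, {MyApp.Workers, :user_42}}, :work)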

> I know you can do distributed state in Elixir too, but it doesn't seem as easy as it is in other languages.

You can use Redis in Elixir as well. Saying that Elixir is worse at distribution than Python/Ruby because ETS isn’t distributed is a bit like saying Python is bad at distribution because objects are not distributed. It’s especially strange since Elixir ships with a distribution system (so you can access ETS from other machines) while your other example languages do not.


> You can use Redis in Elixir as well.

Totally, but a lot of folks say "Elixir is good / easier because you don't need tools like Redis". But then when you try to do it without Redis, you need to account for many things yourself and end up re-inventing the wheel. This is time spent developing library-ish code instead of your business logic.

It's sort of like background job processing tools. Sure, you can do everything in Elixir, but when you want uniqueness guarantees, queues, priorities, exponential back-off retries, periodic tasks, etc., you end up using a dedicated background processing tool in the end, because it's very difficult and time consuming to write all of that from scratch.

But in practice almost all of those things end up being a requirement in a production grade system. You end up in the same boat as Python and Ruby.


> But in practice almost all of those things end up being a requirement in a production grade system. You end up in the same boat as Python and Ruby.

Well, there is definitely a spectrum. For example, in some communities you will hear the saying "you can never block the main thread". So if you want to do a file export? You need to move it to a background worker and send the export by e-mail. Sometimes even sending the e-mails themselves is done in the background when it could be done on the go.

Languages like Java, Go, Erlang, and Elixir will be just fine performing those tasks during the request, so you don't need to push those particular actions to a background job. And as you noted in the other comment, Phoenix ships with PubSub and Presence, which work out of the box without the need for an external tool.

But if you need uniqueness, queues, back-off, etc., then do use a background job tool! But in some of the places you thought you needed one, you may not actually need it.


i would take most of these claims about elixir with a grain of salt

distribution, for example, is a much lauded feature of elixir/erlang but if you look into the implementation it's really just a persistent tcp connection with a function that evals code it's sent on the other end. you could easily write the equivalent in ruby or python or java but you probably wouldn't because it's not actually a very good idea. there's no security model, the persistent connections won't scale past a modest cluster size and the whole design ignores 30 years of experience in designing machine to machine protocols

similarly, people will mention mnesia or ets as a replacement for a database or redis. these are both very crude key/value stores with only very limited query capabilities. you should use them where you would use an in process cache in another language and not as a replacement for an out of process shared cache. they were never designed as such. and as in process caches they are really nothing special

in fact, a lot of elixir's marketing comes down to "do more with less" with a lot of focus on how you can do on a single machine what other languages take whole clusters to do. this is (partially) true. elixir/erlang are excellent for software that runs on appliance style hardware where you can't simply add machines to a cluster. it is, in fact, what erlang was designed to do. what this ignores though is that this is a terrible model for a service exposed over the internet that can run on any arbitrary machine in any data center you want. no one will advise you to run a single vm in aws or a single dyno on heroku for anything that matters.

elixir/erlang's features that increase its reliability on a single machine are a cost you pay, not an added benefit. the message passing actor model erlang built its supervision tree features around is a set of restrictions imposed so you can build more reliable stateful services on machines that don't have access to more conventional approaches to reliability (like being stateless and pushing state out to purpose-built reliable stores)

if you're building systems that need to run in isolation or can't avoid being stateful then perhaps elixir/erlang has some features that may be of interest. the idea that these features are appropriate for a totally standard http api running in aws or digital ocean or whatever backed by a postgres database and a memcache/redis cluster is not really borne out by reality however. if it were, surely other languages would have incorporated these features by now? they've been around for 30 years and the complexity (particularly of distribution and ets) is low enough you could probably implement them in a weekend


> distribution, for example, is a much lauded feature of elixir/erlang but if you look into the implementation it's really just a persistent tcp connection with a function that evals code it's sent on the other end...

I mean, this is just straight up incorrect. Yes, the underlying transport is TCP, but using remote evaluation is definitely _not_ the common case. Messages sent between nodes are handled by the virtual machine just like messages sent locally; that is the main benefit of distributed Erlang - location transparency. Yes, you _can_ evaluate code on a remote node, which can come in handy for troubleshooting or orchestration, but it is certainly not the default mode of operation.

> there's no security model

I mean, there is, but it isn't a rich one. If one node in the cluster is compromised, the cluster is compromised, but the distribution channel is very unlikely to be the means by which the initial compromise happens if you've taken even the most basic precautions with its configuration. It would be nice to be able to tightly control what a given node will allow to be sent to it from other nodes (i.e. disallow remote eval, only allow messaging to specific processes), and I don't think there are any fundamental blockers; it's just not been considered a significant enough issue to draw contribution on that front.

> the persistent connections won't scale past a modest cluster size

I mean, there is already at least one alternative in the community for doing distribution with large clusters; Partisan in particular is what I'm thinking of.

> these are both very crude key/value stores with only very limited query capabilities

What? You can literally query ETS with an arbitrary function, you are limited only by your ability to write a function to express what you want to query.

You shouldn't use them in place of a database, but they are hardly crude or primitive.
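
For example, on a table of {name, age} tuples (made-up data), both a match spec and an arbitrary fold work:

    tab = :ets.new(:people, [:set])
    :ets.insert(tab, [{"ann", 34}, {"bob", 17}, {"eve", 52}])

    # Match spec: select the names of everyone over 18.
    :ets.select(tab, [{{:"$1", :"$2"}, [{:>, :"$2", 18}], [:"$1"]}])

    # Arbitrary function via fold over the whole table:
    :ets.foldl(fn {name, age}, acc ->
      if age > 18, do: [name | acc], else: acc
    end, [], tab)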

> elixir/erlang are excellent for software that runs on appliance style hardware where you can't simply add machines to a cluster. it is, in fact, what erlang was designed to do. what this ignores though is that this is a terrible model for a service exposed over the internet that can run on any arbitrary machine in any data center you want

I think you are misconstruing the point of "doing more with less" - the point isn't that you only need to run a single node, but that the _total number of nodes_ you need to run is a fraction of those for other platforms. There are plenty of stories of companies replacing large clusters with a couple of Erlang/Elixir nodes. Scaling them is also trivial, since scaling horizontally past 2 nodes doesn't require any fundamental refactoring. Switching from something designed to run standalone in parallel with a bunch of nodes to something distributed _does_ require different architectural choices, and could require significant refactoring, but making that jump would require significant changes in any language, as it is a fundamentally different approach.

> elixir/erlang's features that increase its reliability on a single machine are a cost you pay, not an added benefit. the message passing actor model erlang built its supervision tree features around is a set of restrictions imposed so you can build more reliable stateful services on machines that don't have access to more conventional approaches to reliability (like being stateless and pushing state out to purpose-built reliable stores)

I'm not sure how you arrived at the idea that you can't build stateless servers with Erlang/Elixir, you obviously can, there are no restrictions in place that prevent that. Supervisors are certainly not imposing any constraints that would make that more difficult.

The benefits of supervision are entirely about _handling failure_, i.e. resiliency and recovery. Supervision allows you to handle failure by restarting the components of the system affected by a fault from a clean slate, while letting the rest of the system continue to do useful work. This applies to stateless systems as much as stateful ones, though the benefits are more significant to stateful systems.

> the idea that these features are appropriate for a totally standard http api running in aws or digital ocean or whatever backed by a postgres database and a memcache/redis cluster is not really borne out by reality however. if it were, surely other languages would have incorporated these features by now? they've been around for 30 years and the complexity (particularly of distribution and ets) is low enough you could probably implement them in a weekend

The reason why these features don't make an appearance in other languages (which they do to a certain extent, e.g. Akka/Quasar for the JVM which provide actors, Pony which features an actor model, libraries like Actix for Rust which try to provide similar functionality as Erlang) is that without the language being built around them from the ground up, they lose their effectiveness. Supervision works best when the entire system is supervised, and supervision without processes/actors/green threads provides no meaningful unit of execution around which to structure the supervision tree. Supervision itself is built on fundamental features provided by the BEAM virtual machine (namely links/monitors, and the fact that exceptions are implemented in such a way that unhandled exceptions get translated into process exits and thus can be handled like any other exit). The entire virtual machine and language is essentially designed around making processes, messaging, and error handling cohesive and efficient. Could other languages provide some of this? Probably, though it certainly isn't something that could be done in a weekend. No language can provide it at the same level of integration and quality without essentially being designed around it from the start though, and ultimately that's why we aren't seeing it added to languages after the fact.


sending a message to a remote node is just a special case of eval. instead of arbitrary code you're evaling `pid ! msg`. and what is spawning a remote process if not remote code eval?

when i say there's no security model i mean there's no internal security model. you can impose network based security (restricting what nodes can connect to epmd/other nodes) or use the cookie based security (a bad idea) or you can even implement your own carrier that uses some other authentication (i believe there are a few examples of this in the wild) but the default is that any node that can successfully make a connection has full privileges

as for ETS, you can query any data structure with arbitrary functions. that's exactly what i mean when i say there's limited query capabilities. all you can really do is read the keys and values and pass them to functions

my experience and the experience of others is that elixir and erlang are not significantly more efficient than other languages and do not lead to a reduction in the total number of nodes you need to run. whatsapp is frequently cited as an example of "doing more with less" but it's compared to bloated and inefficient implementations of the same idea and not with other successful implementations. facebook certainly wasn't using thousands of mq brokers to power facebook chat. no one is replacing hundreds of activemq brokers with a small number of rabbitmq brokers

you can absolutely build stateless servers with erlang/elixir (and you should! stateless is just better for the way we deploy and operate modern networked services). my point is that many of the "advantages" of elixir/erlang are not applicable if you are delivering stateless services

when i said you could deliver erlang/elixir features in a weekend, i did not mean all of them. i meant specifically distribution and ets. you are right that the actor model, supervision trees and immutable copy-on-write data structures are all necessary for the full elixir/erlang experience. i generally like that experience and think it is a nice model for programs. i don't think however it is very applicable to writing http apis. java, rust, python, go, ruby and basically every other language are also great at delivering http apis and they don't have these same features


> sending a message to a remote node is just a special case of eval. instead of arbitrary code you're evaling `pid ! msg`. and what is spawning a remote process if not remote code eval?

They are not equivalent at all, sending a message is sending data, evaluation is execution of arbitrary code. BEAM does not implement send/2 using eval. Spawning a process on a remote node only involves eval if you spawn a fun, but spawning an MFA is not eval, it’s executing code already defined on that node.

> as for ETS, you can query any data structure with arbitrary functions. that's exactly what i mean when i say there's limited query capabilities. all you can really do is read the keys and values and pass them to functions

You misunderstood, you can _query_ with arbitrary functions, not read some data and then traverse it like a regular data structure (obviously you can do that too).

> my experience and the experience of others is that elixir and erlang are not significantly more efficient than other languages and do not lead to a reduction in the total number of nodes you need to run.

I’m not sure what your experience is with Erlang or Elixir, but you seem to have some significant misconceptions about their implementation and capabilities. I’ve been working with both professionally for 5 years and casually for almost double that, and my take is significantly more nuanced than that. Neither are a silver bullet or magic, but they excel in the domains where concurrency and fault tolerance are the dominant priorities, and they are both very productive languages to work in. They have their weak points, as all languages do, language design is fundamentally about trade offs, and these two are no different.

If all you are building are stateless HTTP APIs, then yes, there are loads of equally capable languages for that, but Elixir is certainly pleasant and capable for the task, so it’s not really meaningful to make that statement. Using that as the baseline for evaluating languages isn’t particularly useful either - it’s essentially the bare minimum requirement of any general purpose language.


i was not claiming the distribution code literally calls eval, just that it is functionally equivalent to a system that calls eval. you agree that it is possible to eval arbitrary code across node boundaries, yes?

i used erlang for 4 years professionally and elixir for parts of 5. i think both are good, useful languages. i just take issue with the misrepresentation of their features as something unique to erlang/elixir

advocates should talk up pattern matching, supervision trees and copy-on-write data structures imo. those are where erlang and elixir really shine. instead they overhype the distribution system, ets, the actor model and tools like dialyzer which are all bound to disappoint anyone who seriously investigates them


> i was not claiming the distribution code literally calls eval, just that it is functionally equivalent to a system that calls eval.

Not really. The distribution can only call code that already exists on the other node. So while the system can be used as if it were an evaluator, it is not because of its primitives, but rather due to functionality that was built on top. If you nuke "erl_eval" out of the system, then the evaluation capabilities are gone.

I agree it is a thin line to draw but the point is that any message passing system can become an evaluator if you implement an evaluator alongside the message passing system. :)

> elixir/erlang's features that increase its reliability on a single machine are a cost you pay, not an added benefit

Agreed! Erlang/Elixir features should not be used to increase the reliability on a single machine. Rather, they can be used to make the most use of individual machines, allowing you to reduce operational complexity in some places.


> And as that blog post mentions, it recommends using ETS but Google says ETS isn't distributed. So now suddenly you're stuck only being able to work with 1 machine.

There is a mostly API-compatible distributed version of ETS in OTP, called DETS. And a higher-level distributed database built on top of ETS/DETS called Mnesia, again, in OTP. So, no, you aren't.

> I know you can do distributed state in Elixir too, but it doesn't seem as easy as it is in other languages. And it's especially more complicated / less pragmatic than other tech stacks because almost every other tech stack all use the same tools to share external state so it's a super documented and well thought out problem.

You can use the same external tools in Elixir as on platforms that don't have a full distributed database built in the way OTP does, so I don't see how the fact that those external tools are widely used on other platforms makes Elixir harder.


> There is a mostly API-compatible distributed version of ETS in OTP, called DETS. And a higher-level distributed database built on top of ETS/DETS called Mnesia, again, in OTP. So, no, you aren't.

DETS is the disk-based term storage; it is exactly as distributed as ETS, i.e. not at all.


I believe the distribution layer built on top of ETS and DETS you’re trying to name is mnesia. It supports distribution and allows a variety of interesting topologies. It’s not the only distributed data store available on the BEAM but it’s well tested, mature, and comes as part of the OTP libraries.
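
For reference, a minimal Mnesia sketch (node names and the table are placeholders; a real cluster also needs :mnesia started on every participating node) that creates a table replicated in RAM across two nodes:

    nodes = [node(), :"b@example"]
    :mnesia.create_schema(nodes)
    :mnesia.start()
    :mnesia.create_table(:session, attributes: [:id, :data], ram_copies: nodes)

    # Writes go through transactions and replicate to all copies.
    :mnesia.transaction(fn ->
      :mnesia.write({:session, "abc", %{user_id: 1}})
    end)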





