Deepstream 5.0: Resurrected using MIT license

an_opabinia · on Oct 27, 2019

The biggest pain point I've experienced with these pubsub services is how to negotiate client connections so that all the subscribers to a topic are directly connected to the same machine. In other words, clients' messages do not have to ever be relayed at the networking level. This means providing some reconnecting or connection-info data for the client, and it's why I think most of these frameworks fail to improve on home cooked solutions: you have to touch the message to help squeeze out efficiency.
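As a rough illustration of the "connection-info data" idea, a service could hash the topic name to pick the node that every subscriber of that topic should connect to. This is only a sketch under assumptions: the node URLs and the `nodeForTopic` helper are made up for illustration and are not part of deepstream or any other framework.

```javascript
const crypto = require('crypto');

// Hypothetical list of cluster nodes that clients can reach directly.
const nodes = [
  'wss://node-a.example.com',
  'wss://node-b.example.com',
  'wss://node-c.example.com',
];

// Map a topic to one node by hashing its name, so every client asking
// about the same topic is handed the same connection info.
function nodeForTopic(topic, nodeList = nodes) {
  const digest = crypto.createHash('sha256').update(topic).digest();
  const index = digest.readUInt32BE(0) % nodeList.length;
  return nodeList[index];
}

console.log(nodeForTopic('chat/room-42'));
```

Plain modulo assignment like this reshuffles most topics whenever the node list changes, so a real deployment would likely want something more stable, such as consistent hashing.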

A non-IP based routing mechanism requires some kind of header/parsing/deserialization of the message being passed. In application-level code this is quite slow: you have to query some kind of map that is synchronized across cluster nodes. This is especially shitty if message addresses are ephemeral, especially per RPC call (which is the default behaviour in most cluster-based RPC frameworks), because then your near cache of the cluster map is never populated. Alternatively you can just broadcast to everyone (skip the cluster map) but then why cluster?

It would be great if we just had IPv6 adoption though. Then you could just uh, route. The DNS based routing in stuff like Kubernetes is close but not for external Internet facing clients. The whole point is to avoid a bastion host.

yasserf · on Oct 27, 2019

I totally agree. Having a single point of entry gives us some small benefits like a single connection/simplified deployments, but it isn't efficient to have a provider connected on one node in a cluster and have to forward data to subscribers on another just because of how they were distributed by a load-balancer. There's actually an action in the handshake protocol which allows a node in a cluster to redirect a connection to a different/more optimal node. It was used for multi-tenancy clusters previously. In theory you can run multiple isolated clusters (or individual nodes) and have multiple connections from the client, routed based on which subscription lives where. It's a pattern I have been talking about with a couple of users but it hasn't been required by anyone yet. It isn't as efficient as doing it on a network layer but it at least reduces intercluster traffic dramatically.

saurik · on Oct 28, 2019

I used to be really concerned about this, but one day I sat down and did the math and realized that the worst case scenario is that I'd have no more than two times as many messages, which means this isn't a scalability issue: if a topic has N subscribers, at bare minimum you are going to have N+1 messages to send a given event (one to receive it into the cluster, and N to dispatch it to the subscribers); if everyone is connected to some random machine, then you need to take the incoming event and send it to the right internal machine that is storing that topic (which is one extra message) and then that machine will need to indirectly send the message to the connecting computers of the subscribers (which is one extra message per subscriber), so you end up with 2N+2 messages. The alternative, where users cluster themselves based on topic with their entry connections, is that these individual end users have to maintain a ton of separate connections for inbound messages for each of the topics they are interested in, which is often going to work out to be much worse for everyone involved.
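The arithmetic above can be checked with a few lines of JavaScript. The function names are made up for illustration; they just encode the N+1 and 2N+2 counts from the comment.

```javascript
// Messages needed when all subscribers of a topic sit on the node that
// receives the event: 1 inbound + N outbound.
function directMessages(subscribers) {
  return subscribers + 1;
}

// Messages needed when every hop is relayed through an internal node:
// each of the N+1 messages above gets one extra internal hop.
function relayedMessages(subscribers) {
  return 2 * subscribers + 2;
}

for (const n of [1, 10, 1000]) {
  const direct = directMessages(n);
  const relayed = relayedMessages(n);
  console.log(`N=${n}: direct=${direct}, relayed=${relayed}, ratio=${relayed / direct}`);
}
```

Since 2N+2 = 2(N+1), the ratio is 2 for every N, which is the "no more than two times as many messages" bound.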

threeseed · on Oct 28, 2019

Wouldn't the ingress help here with sticky sessions, e.g. cookies or a header?

I am assuming the RPC is happening over WebSockets though.

Zod666 · on Oct 27, 2019

What is deepstream? The website doesn't really say what it is for.

yasserf · on Oct 27, 2019

https://deepstream.io/tutorials/concepts/what-is-deepstream/

Generally deepstream is an alternative to using firebase, socket.io, featherJS and meteor. However rather than putting the logic within the server you instead run them as microservices and allow deepstream to handle load-balancing for you.

The two main things deepstream offers are:

- Events

You can publish and subscribe to events by simply doing:

deepstream.event.emit('event-name', data)

deepstream.event.subscribe('event-name', data => {})

This is the functionality you expect to see from most pubsub mechanisms.

- Records

Records are events with persistence. They are more heavyweight objects that persist their content in a cache + db and across all connected devices without the user/developer having to actually do anything. They also support offline use if you want to be able to get/set data while offline.

// get a record

const record = client.record.getRecord('unique-data-name')

// wait for record to be loaded from server

await record.whenReady()

// subscribe to any changes that happen in the future

record.subscribe(data => console.log('someone updated this object!', data))

// discard when finished

record.discard()

You can also set data directly without having to subscribe using

client.record.setData('some-record', newData)

There are multiple other aspects deepstream provides like merge-conflicts, permissions, authentication and some useful patterns, but the above code is about 90% of what you would expect a user to use in deepstream.

manigandham · on Oct 27, 2019

It's a real-time pub/sub messaging system (competitive to NATS), with built in messaging patterns like request/response.

Also supports persistence of data records (JSON docs) and syncing changes to those docs (competitive to Firebase, etc), backed by Redis.

Works with different protocols (websocket, MQTT, HTTP) and has client libraries for popular languages.

emmelaich · on Oct 27, 2019

Yeah odd. The github link though has ...

> deepstream is an open source server inspired by concepts behind financial trading technology. It allows clients and backend services to sync data, send messages and make rpcs at very high speed and scale.

dankohn1 · on Oct 28, 2019

I added to the CNCF Cloud Native Interactive Landscape:

https://landscape.cncf.io/category=streaming-messaging&forma...

It's the 24th streaming & messaging project or product we're tracking.

stuaxo · on Oct 27, 2019

Please add some text at the top explaining what it is.

tomcam · on Oct 27, 2019

https://deepstream.io/tutorials/concepts/what-is-deepstream/

yasserf · on Oct 27, 2019

thanks for the feedback! Do you mean in the actual Hero or to elaborate more within the 'what it is' section on the home page?

unixhero · on Oct 27, 2019

I spent 5 minutes trying to figure out what Deepstream is, because I was intrigued. I was unable to figure it out based on the front page, documentation link and the GitHub repo.

yasserf · on Oct 27, 2019

What sort of documentation would make most sense? We went through iterations of example animated apps, code samples and text but can't seem to explain it clearly yet =( Any input would be much appreciated!

gravypod · on Oct 27, 2019

I think Facebook does a fantastic job of demonstrating the value of their tooling that they open source.

Buck: https://buck.build/

Presto: http://prestodb.github.io/

The front page of these contain the project name, one-line elevator pitch, long form description, and an automatically playing demo of the product.

grizzles · on Oct 27, 2019

This is a great comment. I've come across deepstream over the years. I've always come away thinking that I could make something similar that would serve my needs without adding a dependency into my project. You need to demonstrate some compelling feature.

Eg. sync

Naive copy sync: 53ms

DeepStream sync: 6ms

yasserf · on Oct 27, 2019

I would say the compelling feature is it's opensource 'serverless' data-sync with permissions, clustering, monitoring and auth built in. Serverless here meaning there is no server code required, just run the server. The easiest way to market it would just be the OS competitor to firebase.

The issue with that however is taking that approach means we lose the NodeJS ecosystem support which is massive (and required to add custom plugins/maintain the server). We tried it before and it was terrible in terms of metrics and participation.

Thank you for the feedback though! I will definitely be looking at redesigning the home page and will use your feedback when doing so!

grizzles · on Oct 27, 2019

> The issue with that however is taking that approach means we lose the NodeJS ecosystem support which is massive (and required to add custom plugins/maintain the server).

Could you explain what you mean here in greater detail? I'm only talking about changing your marketing copy. I don't understand why you lose the NodeJS ecosystem.

By the way I'd also remove most of the content from the front page. It's too busy. Make every word count.

yasserf · on Oct 28, 2019

Yeah, when we made deepstream sound more like a standalone deployment, similar to nginx or rethinkdb (meaning you can get an executable/install via package managers and so forth), and downplayed the NodeJS part of it, we ended up having a lot fewer people contributing to the project or even using it, as they wouldn't be aware that it could be installed via NPM. I guess my point is that the status quo for node server dependencies seems to be that they run as part of a bigger project (featherjs, meteor, socket.io, socketcluster), which means you npm install and configure it via javascript. The last couple of versions have been me trying to navigate the landscape so I can provide a totally standalone/configuration-based server that can also be extended by end users (hence all the typescript interfaces and support).

Basically I'm not certain how to pitch this exactly. I feel a bit of consolation from the fact that when it was a startup with over ten employees we had the same issue (especially since I was the tech guy).

Anyways, more than happy to hear any suggestions! I've been involved with this project for over 3 years so I definitely have a biased view when trying to see the project for the first time.

grizzles · on Oct 28, 2019

I'd focus on sync as your headline feature. It's technologically complex to do well and it provides real value for mobile apps that want to offer a great offline experience. Our phones are offline way more than most of us realize and we don't notice it because Google / Apple have done such a great job with sync. But many 3rd party apps are not great at that and consequently have crappier user experiences due to connectivity problems.

manigandham · on Oct 28, 2019

That's really not what serverless means (especially when right after you say just run the server). I'd recommend leaving that out. Fewer buzzwords are better.

rumanator · on Oct 27, 2019

> What sort of documentation would make most sense?

A single sentence in the landing page with a clear description of what it is and what it does would suffice.

NotSammyHagar · on Oct 27, 2019

Just describe what deep stream is in one sentence on your landing page. Right at the top. "Deepstream is a pubsub solution that ...".

yasserf · on Oct 28, 2019

It isn't just a pubsub solution though

To compare to other frameworks out there:

featherjs: A framework for real-time applications and REST APIs

meteor: THE FASTEST WAY TO BUILD JAVASCRIPT APPS

deepstream: a fast and secure data-sync realtime server for mobile, web & iot

The problem is that once you provide more functionality than pub/sub you end up in this vague toolkit statement land.

I would assume that the issue is more around the fact that the term data synchronization is still not used as often and so doesn't give the same familiar feeling people get when looking at other pub/sub frameworks. Deepstream supports HTTP/MQTT/Binary and JSON websocket protocols for rpcs, records, events and presence. Maybe that should be the sentence!

A secure scalable realtime server that supports HTTP, MQTT and WebSockets for rpcs, data-sync, pubsub and presence functionality

lovelearning · on Oct 28, 2019

Just my opinion, but the two most useful pieces of information that helped me understand what this does and visualize where it fits were

1) https://deepstream.io/tutorials/concepts/what-is-deepstream/ > "What is it for" section

2) and your comment that it's an alternative to firebase, socket.io, meteor.

Perhaps you can think of using these two pieces of information as the front-page blurb rather than make it abstract.

mkl · on Oct 28, 2019

How well does Deepstream work with offline periods, e.g. stuff being created/modified on a disconnected mobile device that later connects and needs to sync?

Is it possible to make it work in a way that the server only sees encrypted data? I can see it working in a clunky kind of way if each client had two versions of each document, a clear one that is operated on, and one encrypted (possibly a record at a time or something) that is synced.
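A minimal sketch of the "encrypted copy" idea, using only Node's built-in crypto module: the clear object is what the app operates on, and the envelope is what would be handed to whatever sync layer is in use. The `encryptRecord` and `decryptRecord` helpers are hypothetical, key distribution between clients is not addressed, and nothing here uses a deepstream API.

```javascript
const crypto = require('crypto');

// Encrypt a record's JSON with AES-256-GCM; the result is an opaque
// envelope that is safe to hand to a server that must not read the data.
function encryptRecord(key, data) {
  const iv = crypto.randomBytes(12); // fresh nonce for every write
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(data), 'utf8'),
    cipher.final(),
  ]);
  return {
    iv: iv.toString('base64'),
    tag: cipher.getAuthTag().toString('base64'),
    ciphertext: ciphertext.toString('base64'),
  };
}

// Reverse of encryptRecord; throws if the envelope was tampered with.
function decryptRecord(key, envelope) {
  const decipher = crypto.createDecipheriv(
    'aes-256-gcm', key, Buffer.from(envelope.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(envelope.tag, 'base64'));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(envelope.ciphertext, 'base64')),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString('utf8'));
}

const key = crypto.randomBytes(32); // shared between clients out of band
const clear = { title: 'draft', done: false };
const envelope = encryptRecord(key, clear);
const roundTripped = decryptRecord(key, envelope);
```

With this approach the server can only treat each record as an opaque blob, so anything that needs to look inside the data on the server side would not work on it.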

stemuk · on Oct 27, 2019

Looks promising! Does deepstream work well as a standalone server or is it best used in combination with express.js if I am building a real-time web app?

yasserf · on Oct 27, 2019

Deepstream works best as a standalone server!

Ideally you would just configure it using a config file for your permissions (https://deepstream.io/tutorials/core/permission/valve-introd...) and use http auth to authenticate your users (https://deepstream.io/tutorials/core/auth/http-webhook/).

You can see this sort of configuration here https://deepstream.io/guides/live-progress-bar/

If you want you could add a custom plugin that would take the HTTP server within deepstream and enhance it using express for all your non-deepstream routes, if you raise an issue for it I think it's something that can be done in the near future! But generally wouldn't recommend for non-pet projects since they serve two different purposes.

Edit: Also thank you!

harlanji · on Oct 28, 2019

Here is the project description because it took way too much navigation to find:

> deepstream is an open source server inspired by concepts behind financial trading technology. It allows clients and backend services to sync data, send messages and make rpcs at very high speed and scale.

cultofmetatron · on Oct 28, 2019

Phoenix channels does the same thing and can fall back to long polling while relying on the Erlang VM to scale out to many more machines.

jkarneges · on Oct 27, 2019

Why MIT? I recall it was AGPL before.

yasserf · on Oct 27, 2019

Yup! We changed it to MIT as AGPL actually limited a few people from using it in their companies (company wide license regulations) and I would rather promote adoption under a more permissive license.

dodyg · on Oct 28, 2019

Do you have .NET Core driver/client? or at least gRPC endpoint?

superpermutat0r · on Oct 27, 2019

I guess it's time to build my own cryptocoin exchange.

yasserf · on Oct 27, 2019

We actually had a couple of usecases that used deepstream for that sort of thing.

https://vimeo.com/143728632 was what led to deepstream being initially written a few years back

gstar · on Oct 28, 2019

The UI looks incredible - was it closed source, and did it die with the company?

yasserf · on Oct 28, 2019

Thanks! We had an awesome UX designer as a co-founder. It was closed source yup, also based on a pretty old stack of knockout and CSS so it isn't very salvageable