Launch HN: Jitsu (YC S20) – Open-Source Segment Alternative
265 points by vklmn on Nov 4, 2021 | 110 comments
Hey HN! Vlad here with Sergey, Ildar, and Kirill. We are building Jitsu, an open-source Segment alternative (https://github.com/jitsucom/jitsu, https://jitsu.com/). We help companies collect events from their apps, websites, and APIs and send them to databases.

I've been doing data engineering for more than ten years (half of that time, I didn't know that it's called "data engineering"). Before Jitsu, I was a co-founder and CTO of GetIntent, an ad-tech startup. Although it was ad-tech (I'm sorry for that!), we also built quite a fascinating technology platform. We processed up to 1 million events per second at peak, and all those events needed to be stored somewhere.

We churned through a few data warehouse platforms along the way. In 2013, we started with Hadoop's HDFS and a bunch of map-reduce jobs on top of it. Then, when we decided to allow our customers to run ad-hoc reports, we switched to BigQuery. BigQuery was great, but expensive—especially with some customers obsessively clicking the refresh button. Finally, in 2017 we migrated to self-hosted ClickHouse which in my opinion is still the best analytics database in the world.

All that time, we spent a fair amount of effort to get data to the database. When you're dealing with millions of events per minute, running an INSERT statement per event won't work. What if the DB is down for maintenance? How can you be sure that all 50+ edge nodes are aware of recent DB schema changes? Also, did you know streaming data to BigQuery is costly while batching data is free?

We tried different approaches: first, we would write local log files, sync them to HDFS, and load data to BQ (or ClickHouse) with map-reduce jobs. To improve data freshness, we ditched HDFS and started to send data in batches to the DB directly from edge servers. We experimented with Kafka, but it felt too complex for that task at the time.

I always dreamed about a straightforward service, to which I'd throw JSON objects, and it would take care of the rest: queueing, retrying, updating database schema, etc.

Then I discovered Segment. I liked it at first. It seemed very developer-friendly with a nice API and excellent documentation. But the pricing model and data delays (the event gets to DB in 12 hours after it has been sent to Segment) killed the whole idea. And it was not open-sourced. In my opinion, being open-source and self-hostable is a must for such a fundamental part of the architecture as data collection.

I left GetIntent and got accepted to YC with a different idea for the Summer 2020 batch. The idea was to build a churn prevention and BI tool for online retailers. It didn't take off, but in the process we made a component to collect customers' app events and put them into a DB. We tried to hack a solution on top of the ELK stack, but I was frustrated with Elasticsearch's lack of SQL support. Here I was back to square one: there was no good open-source event collection service yet, and we needed to build one, once again.

So we decided to focus solely on that problem. We ditched all the previous code, which was in Java, rewrote the data collection server in Go and hacked together what we called EventNative [1]. It was received very well, and we started to get users.

Over the last 11 months, we've been busy building the UI, adding Connectors (to pull data from external APIs), polishing data warehouse support, adding JavaScript support to transform incoming data, and implementing dozens of other features.

Now we're launching Jitsu, an open-source Segment alternative. With Jitsu, we make it easy to collect data and send it to databases (we support all major players: ClickHouse, Redshift, Snowflake, BigQuery and Postgres). We're deployed in production at a large gaming publisher, an eSignature service, and many other great companies. We're going for an open-core model. So far we don't have paid features, but soon we'll have some, presumably around things like authorization and data masking. We also run Jitsu.Cloud [2], which you can buy if you don't want to self-host.

Give it a spin: https://github.com/jitsucom/jitsu.

Thank you for reading this story - I hope it was interesting. I would love to read your feedback on Jitsu and answer questions!

[1] https://news.ycombinator.com/item?id=24120325 [2] https://cloud.jitsu.com



Great product. I'm a frequent user of Segment from the early days and have been curious to see when an open-source competitor comes around that will match feature-for-feature.

Thoughts:

1. You've got most major ads sources that I care about, but it seems that there is a higher bar to implementation. Segment lets me just plug in Google & FB ads and dump the entire shebang right into my data warehouse. A lot of marketing teams are going to have less time/resources to deal with implementation so smoothing this out is key.

2. Functions are an underrated and highly powerful feature of Segment. The ability to operate on data in transit, create custom connectors that "just run" (akin to CF Workers) and the like is a big selling point for more technically advanced marketing teams. It doesn't seem present here and that would hold a customer such as myself back on bigger scope projects.

3. I'd love to see a "compare us to your segment usage" where I select my data sources and destinations to see what you cover vs. Segment in a specific use case (and possibly pricing advantages on a self-hosted vs. non). This would make it much easier to sell through procurement and devops for new customers that are switching.

4. There are going to be a lot of people like me that are soon to start fresh in terms of marketing stack, so going after people before they select Segment might also be a play.

Looking forward to seeing where you all take this. Good luck!


Thank you!

1. That's exactly the reason we have native connectors for Facebook and Google Ads (we didn't use the ones from Airbyte and Singer). Jitsu can pull any combination from FB/GAds; it's almost like SQL! Airbyte/Singer just can't do that. Later we're going to vet other connectors too and decide if we need to re-implement them.

2. We have functions too! https://jitsu.com/blog/javascript-transform


Ah, ok - I didn't see the transform compared to Functions. Very cool and I like the multiplexing.


Airbyte recently added support for custom GSQL & Facebook Marketing queries FYI ;)


Congrats on your launch, and looks really exciting! I'm curious how this compares to tools like Snowplow [1]? I guess Jitsu comes with more sources and destinations out of the box?

[1] https://github.com/snowplow/snowplow


Snowplow CEO here. We haven't used Jitsu before but are very familiar with Segment. It looks like Jitsu sits in the Segment product family, along with Rudderstack: basically a Customer Data Platform bundle of simple JSON event tracking, Fivetran-style transactional/SaaS data ingest, and then relaying of data out to various SaaS endpoints plus cloud DWs.

Snowplow started at the same time as Segment (2012) but has evolved along a separate tech tree. Micro-service architecture, cloud native, using Kinesis or Cloud Pub-Sub as the data transit, enrichment framework plus a Confluent-style schema registry supporting very rich and versioned JSON Schema-based event payloads. We are built by and for data platform teams; our open-source behavioral data engine doesn't have a UI (our commercial Behavioral Data Platform does). Hosted trial here https://try.snowplowanalytics.com/

Definitely room for both product families in the market! I'm sure Jitsu will do great.


Alex - love your work on Snowplow.

Looking at Jitsu as a Snowplow familiar person I tried to do a quick browse of their marketing site and couldn’t find anything about their back end architecture. Was immediately thinking that wasn’t the focus here which is concerning when thinking about enterprise scalable data patterns.

Also appreciate you taking the high road “room for both” while the founder of Jitsu says “we are better”

I’ll stick with the product with a solid schema strategy, thank you…


Essentially we're doing the same thing. But we built Jitsu to be as simple as possible: you don't need to set up multiple services (just one Docker service!), and data goes to the DWH almost instantly. And we can pull data from more than 100 external APIs.

Think of us as Snowplow 2.0 )


I really like this, congratulations on the launch. And this is such a huge space that there's definitely room for other options (aside from Segment).

I'm a little bit out of the loop in this event processing space. Do you think Jitsu could replace lower-level event processing implementations such as Kafka/Kinesis? Or is this meant for more "high level" marketing stuff?


That's a good question. We're aiming to replace Kafka in some cases. There are many ways people use Kafka, but they can be roughly divided into two buckets:

- Kafka as a company-wide message bus: dozens of (micro)services send data there, and consumers listen to it. Each service doesn't know which other service will consume the data. For that case, we're not looking to replace Kafka; we're going to work alongside it. We have a PR about supporting Kafka as a destination [1] (Jitsu sends data to Kafka), and we will support Kafka as a source at some point (PRs are always welcome :))

- Kafka is used just as a transport between web-app and DB. In that case Jitsu is a perfect replacement

[1] https://github.com/jitsucom/jitsu/pull/537

P. S. The same applies to Kinesis too


> We're aiming to replace Kafka in some cases

How are you handling data between collection agents and storage? With Kafka, I know what I'm getting when it sits between the two and the advantages it offers.


The same way as Kafka. Jitsu nodes (=collection agents) write to a write-ahead log, and then either send data to the destination right away or send it in batches.


Thanks! I take it this file is where I can get started to learn more:

https://github.com/jitsucom/jitsu/blob/0aaa74b59eb9d8c885c80...

I see that it instantiates an "AsyncLogger" - does the service wait until data is written to the log prior to returning success to the client?

Is the WAL the same source used to feed both database storage destinations and other SaaS destinations?


Hi! My name is Sergey, I'm a Jitsu product engineer, and I'll gladly answer your question. AsyncLogger works asynchronously by design: there is a Go channel that feeds JSON objects to the log file. So no, the service doesn't wait until data is written to the log before returning success to the client. The WAL is designed to keep event JSON across Jitsu instance restarts to prevent data loss. When you deploy your Jitsu application, it handles service restart signals (e.g. SIGTERM) and closes database connections as well as other resources. All events that arrive during this time are stored in the WAL. After Jitsu starts again, all events from the WAL are passed to the main event pipeline and stored in the destinations.


Is the WAL only used during restart, or also during normal operations? Trying to create a mental model of how data flows through the system and into destinations.


During normal operations as well. Jitsu supports destinations in two modes: stream and batch. In batch mode, all JSON events are stored in the WAL asynchronously (the client doesn't have to wait), and then the batch destination processes WAL files in the background and stores data in batches. In stream mode, all JSON events are stored in a queue (which is persistent), processed one by one, and stored in the data warehouse with INSERT statements.
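
A toy model of the two modes, sketched in Python. Everything here is hypothetical (the class names, `accept`, `flush`, `drain`), an in-memory list stands in for the WAL files and for the persistent queue, and the "warehouse" just records the insert calls it receives. It only shows the difference in how many inserts each mode issues.

```python
from collections import deque


class FakeWarehouse:
    """Stand-in for a data warehouse; records each insert call it receives."""

    def __init__(self):
        self.insert_calls = []  # one entry per insert, each a list of rows

    def insert(self, rows):
        self.insert_calls.append(list(rows))


class BatchDestination:
    """Batch mode: events are appended to a log and loaded later in bulk."""

    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.wal = []  # assumption: a real WAL would be files on disk

    def accept(self, event):
        # The caller returns immediately; nothing is sent to the warehouse yet.
        self.wal.append(event)

    def flush(self):
        # Background step: load everything accumulated so far in one insert.
        if self.wal:
            self.warehouse.insert(self.wal)
            self.wal = []


class StreamDestination:
    """Stream mode: events go through a queue and are inserted one by one."""

    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.queue = deque()  # assumption: a real queue would be persistent

    def accept(self, event):
        self.queue.append(event)

    def drain(self):
        while self.queue:
            self.warehouse.insert([self.queue.popleft()])
```

With three events, the batch destination issues one insert of three rows after `flush()`, while the stream destination issues three single-row inserts after `drain()`.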


In the second point, presumably you mean only one-way communication, right?


Hey congrats on the launch, clearly a lot of thought and effort went into this. I'm pretty new to this space, and maybe this is a dumb question but how does this differ from Mixpanel? Would I use this for something different?


Mixpanel will store the data for you and do visualization. Jitsu just helps you get your data to your data warehouse.

Downside: you'll need to build all the visualization yourself. Fortunately that's easy with tools such as Looker, Mode, Metabase etc.

Upside: you can do whatever you want with your data: build any reports, join with other datasets etc. You're not limited by the reports the MixPanel team builds.

In reality, Jitsu and MixPanel could co-exist. Jitsu supports MixPanel as a destination (e.g. you send data to Jitsu; Jitsu sends it to MixPanel and the data warehouse).


Feedback on "Jitsu supports MixPanel as a destination"

This is not really clear from the website. Mixpanel is mentioned in https://jitsu.com/sources but NOT in https://jitsu.com/destinations. Also the docs seem very clear about that.

We are currently looking at Segment, Jitsu and others. While we generally liked Jitsu, this was kind of a big deal for us and made us lean towards Segment.

However, no final decision has been made yet. ;-)


Mixpanel employee here, I actually just emailed Jitsu this morning to ask about adding Mixpanel as a destination. We're willing to work with them (or any of you!) to get the PR open. Feel free to reach out directly to josh@mixpanel.com!


How do I scale Jitsu if the load from my app servers becomes too big? Will adding more Jitsu nodes trample the database nodes that Jitsu writes to? How should I plan capacity for a Jitsu deployment in a multi-node scenario, and what should I take into consideration when scaling it?


Yes, just add more Jitsu nodes. It's hard to say how many nodes you need (it depends on transformations, CPU/RAM, etc.), but you can count on at least thousands of requests per second per node.


How is event ordering handled when multiple instances of Jitsu are involved?


This looks really cool. I'm keen to try it.

It looks like it might play well with my current logging system of choice, Seq [0].

Do you support inbound webhooks? I can see webhooks as a destination but not as a source?

[0] https://datalust.co/seq


You can hack almost anything using inbound Event API (https://jitsu.com/docs/sending-data/api) and JavaScript transformations (https://jitsu.com/blog/javascript-transform)


A standard webhook source abstraction would be very useful, that captures the URI, POST payload and HTTP headers.

This way I can setup my source in Jitsu, get a unique URL, and then paste that URL into the tool generating webhook events (e.g. Shopify). A normalized schema based on the JSON payload doesn't need to be created for this to be useful.


Ok cool.

As a bit of feedback, I highly suggest adding Webhooks as a source on your marketing site.

The first thing I did is navigate to the Sources page and searched for "webhook" which brought up no results.

I then searched your docs which only mention Webhooks in the context of being a destination rather than a source.

I realise now that you have quite a flexible ingestion API, but it took quite a while (and your confirmation above) to understand this!

The product looks awesome though! Good luck with the launch.


Thanks for the observation! We will add it. A fresh look at marketing materials is always appreciated!


Are there any examples of what the resultant SQL tables look like in Postgres or ClickHouse for a given event schema? I'd like to know how generic it is per event type (is it something like (event id, blob), or does it try to decompose each event field into a column, and what about nested objects then, etc.). Knowing this would greatly improve my understanding of the reusability of Jitsu for various event-collection tasks I may have.


That's indeed what the website is missing. We have a few words about that in the docs, but it's still not enough: https://jitsu.com/docs/internals/jitsu-server#mapping-step

Overall, Jitsu tries to decompose (aka flatten) JSON as deep as possible. E.g. {a: {b:1, c:2}} will become a_b=1, a_c=2. If a column is missing, it will be created. We don't decompose arrays so far.
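
The flattening rule can be sketched in a few lines of Python. This is not the actual implementation (which is in Go); the function name and the handling of edge cases are assumptions, and it only reproduces the behaviour described above: nested objects become `parent_child` keys, and arrays are left untouched.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single level, joining keys with '_'.

    Lists are kept as-is, mirroring the note that arrays are not decomposed.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


print(flatten({"a": {"b": 1, "c": 2}}))  # {'a_b': 1, 'a_c': 2}
```

Each key of the flattened result would then map to one column of the destination table.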


Can one disable the flattening? BigQuery, for example, supports nested objects just fine, and flattening them for no particular reason seems counter-productive.

I work on an application where we already have a schema in BQ, but we'd like to start moving events through something like Jitsu or Rudderstack. This uses nested objects extensively. Looking at Jitsu, it looks like we wouldn't be able to keep our existing table schema.

PS. Whoever wrote your BigQuery code does not understand Go contexts. Only functions should take a context argument; you should almost never store contexts in structs!


Can we expect a React Native integration any time soon?


It depends on whether we get a PR anytime soon. At the moment we're just 4 engineers, and unfortunately React Native is not at the top of our list. I'd say we'll have it in 2 months unless we get a PR earlier.


Congrats on the launch, much needed! Would love to see if it's possible for you to connect to https://june.so for easy to use product analytics.

Are you following the same tracking convention spec as Segment?


Yes, if Segment compatibility mode is enabled.


> We help companies collect events from their apps, websites, and APIs and send them to databases.

For those who don't know what "Segment" is (like me) - this Jitsu thing seems to only be relevant to web-based/web-oriented apps.


You mean we don't have libs for mobile platforms / backend frameworks? That's true, though we have an iOS SDK[1] and a community-maintained Go client. However, all libs are merely wrappers around the HTTP API[2]. We will implement other client libs soon, but calling the HTTP API directly works very well too.

[1] https://jitsu.com/docs/sending-data/mobile-apps/ios-sdk [2] https://jitsu.com/docs/sending-data/api


Not affiliated but that is not accurate.

Segment is a fancy event router / multiplexer. You emit events to it and it sends them to reporting and storage destinations.

It does have more features for web apps but that is not the only use case.


Jitsu can multiplex events and send them to different destinations too. I admit, we don't have as many destinations as Segment (we support Amplitude, Hubspot, GA and Facebook). But we can send data to any HTTP endpoint (see Webhook destination). Since the body of the HTTP request is a JavaScript expression, with a little hacking you can support almost any service.


There's an http endpoint so it's easy to use from a browser, but of course it's usable from any process that can post to the endpoint.


This is super cool, congrats!

The website is heavy on Segment comparisons (which makes sense). However, you're not the only open source Segment alternative, so how do you view yourselves in comparison to e.g. RudderStack or Snowplow?


Many of your integrations talk about “syncing” rather than event collection, which to me sounds like what Fivetran is doing. Does that distinction make sense and how are you thinking about that?


We call it "push" (you send event to Jitsu) and "pull" (Jitsu pulls data) integrations. Technically speaking, push integrations are outnumbered. But that's because we took advantage of other open-source projects (Airbyte and Singer) to implement them.

Our core is push integrations; that's the most complex part of the system. We see "pull" integrations as an additional feature that helps to enrich the data after events have made it to the DWH.


Adding on to the previous comment, how does this compare to rudderstack?


We're very similar indeed. But we attack the same problem from different angles:

- We truly believe that our product should be accessible for small teams too. That's why we made Jitsu very easy to deploy. I'm not sure you can deploy Rudder on Heroku, or on any service with a single Docker file.

- Our ETL component is open-source (and based on other great OSS projects: Airbyte & Singer). RudderStack hasn't published Cloud Extract (their ETL) to my knowledge.

- RudderStack aims to replace Segment; we go beyond that. We didn't copy the Segment API one-to-one, we just added a Segment compatibility layer. Jitsu can be used for any kind of data. An example: a few companies (including ourselves) use Jitsu to collect open-source telemetry (anonymous usage). I'm not sure Rudder can be used for that use case.


Fellow YC company (hotglue) here - we love the Singer spec so it is cool to see your modeling after that. It is worth giving a shout out to Meltano who is helping grow it (an Airbyte competitor). Love what you all are doing Vladimir! Congrats on the launch :D


Rudderstack user here (and ex Segment). Thanks for sharing this stuff, very cool project.

Just to answer some of this:

- rudderstack has deploy ready helm charts, which I'd argue are significantly better than docker compose or docker setups because they set up all the other niggly parts. Would be cool to see that here :)

- rudderstack has gone quite far away from replacing segment. It's true that their core API is compatible and I think your transformation layer is really cool. However it can be used for those use cases because rudderstack doesn't really care about users or user IDs and can be used for any sort of data generally.

There's a piece in the docs talking about the fact that you don't get caught by adblock - whilst this may be true when someone launches it, that's not true of your platform. That's just the fact that a lot of smaller businesses will not get their URLs added to the ad block lists. I think it's a bit misleading to mention that in such a way because technically we're all tracking users and ad block is a way for users to choose not to be tracked, not be tricked into being tracked because someone has masked the tracking script ;) if a huge client (a la Adidas or something) decided to use your scripts I'm sure someone would eventually add it to the ad block lists that get propagated.

One of the things that would be cool would be some sort of opt in configuration. Segment has some awful consent SDK that is really bad, would be cool to see what you do there. GDPR is a big deal and browser fingerprinting is data processing. It's worth looking at your comments on being GDPR compliant btw https://www.eff.org/deeplinks/2018/06/gdpr-and-browser-finge...


Congrats on launching and thanks for making it easy to deploy using docker! I'd like to suggest that you make it available as a 1-click app on DigitalOcean as well.


Is https://meltano.com/ a more general version of this ?

i am wondering how this compares to it?


We have some overlap. In simple words: Meltano is Singer + DBT, Jitsu is Event Collector + Singer + Airbyte.

Meltano will pull data from Singer connectors and do transformations, but they won't run Airbyte connectors, and you can't push data to Meltano

Jitsu will use Airbyte or Singer to pull data, and you can push the data to Jitsu. But Jitsu won't run DBT transformation. Although we can trigger DBT cloud jobs: https://jitsu.com/blog/dbt-integration

P.S. Meltano has a CLI, and we don't (yet)


that was very clear. thank you!


Congrats on the launch!

Just wondering, do you have any plans to support CrateDB?

It supports SQL and understands PG protocol - perhaps supporting Postgres kinda already makes it close.


Congrats on the launch! We've started using the open source version for one of the tools we are building (CLI anonymized telemetry) and it looks good so far. Thanks for the great product. It was very easy and straightforward to get started, deploy and start collecting things (into BigQuery).

Overall, I like this recent trend a lot - more companies are building open-source, lightweight, GDPR compatible analytics, chats (e.g. Papercups). I hope there will be good ways to monetize and sustain this. Wish you all the best, folks!


CTO at Vertica here. How can we make sure Vertica is one of the DWH destinations in Jitsu?


Hi! We don't have plans to add Vertica yet. But if your team is willing to help or send a PR, we can reconsider! Feel free to reach out directly to me at vladimir@jitsu.com


Good job guys. Amazing work!

It was fun watching you grow Jitsu and love the way you provide support!


Do you support something similar to the ajs_aid URL param ID override in Segment?


Great story. How do you feel Jitsu compares to Rudderstack?


Read the explanations below in the comment thread and the contribution from RudderStack itself.

Conclusion: Jitsu is wide, event-based tracking, whereas RudderStack is focused on ID-specific customer events.


Check the answer in another thread! https://news.ycombinator.com/item?id=29106531


what value does the airbyte / elt integration provide? Surely I could just run etls on airflow or similar on tables that jitsu generates?


Sweet! Great to see an OSS alternative to Segment.


Why would I use this instead of Airbyte?


Two reasons a) you can push events from your apps b) if you want to have more connectors available (Singer, and few native connectors)


Airbyte uses Singer too, why would it have less connectors? That doesn't make sense.


I miss the eventnative name :)


Great product! Lovely team!


The black banners at the top and bottom of your website break scrolling on mobile.


Thanks, we will fix that!


Great job!


Great work! It's funny that YC funded Segment as well as two direct Segment competitors (RudderStack is the other one, though I think they initially started with a different idea and pivoted to that afterwards), though given the size of YC this is probably expected and Segment is probably large enough to "deserve" some good smaller competition.

As someone who also builds an open-core product (though not directly modeled after an existing closed-source product) I really hope this kind of business model will become more accepted.


I don't believe that YC funded RudderStack, but they did indeed fund Segment! It's not uncommon. If I had to guess, every batch or so has at least 2 companies directly competing with each other.


Yes, RudderStack is not YC funded. Though I have learnt a lot from YC (and PG) through the years and attribute our entrepreneurship to them.

Founder of RudderStack here.


Ah sorry, I mixed you up with Freshpaint, which is actually a YC startup and Segment competitor that pivoted to their current business model.


hey! Co-founder of Freshpaint here. Yes, we're YC-backed along with Segment, so you're probably thinking of us. YC does often fund startups working on similar things.


They have been doing that in many areas now...

Look at the alternatives to closed source analytics that they have funded


I'm sure YC is out of Segment by now.


Please consider posting some form of this as a blog post as well, I love to hear about product inceptions.

HN doesn’t allow lesser users with lesser eyesight to read light grey on beige self text, unfortunately.


Sorry—it's on the list to make this more configurable. I know we take a long time but we get there eventually.


Congrats on the launch. Unrelated to the actual software, are any of you actual Jiu-Jitsu grapplers? If not, why'd you go with the name Jitsu?


To be honest, we just grabbed the first short .com domain name that a) was available at a reasonable price and b) we liked.


Congrats on the launch!


Kudos on the work done and I hope you the best of luck.

A little note of advice: I wouldn't start my company description as "the Y of X" or "the Y alternative to X".

It's okay to mention if you are similar to another well known company, but don't use it to describe your company, especially not in the first line.


There's always a tradeoff. Describing our company as an "alternative to X" saves tons of time explaining what the company does. People are overloaded with information, and the average attention span is very short. But I see the disadvantages too! We had an internal debate about the tagline a few months ago, and essentially decided to go with "Open-source alternative to Segment". But it wasn't an easy decision!

However, I think that the product matters the most. You can change the tagline in a few clicks. Can't say the same about the product


It’s okay to use that for an elevator pitch, I’d just reverse the order.

First tell me what you do, then tell me what you are similar to.

>> We are building Jitsu, (https://github.com/jitsucom/jitsu, https://jitsu.com/) We help companies collect events from their apps, websites, and APIs and send them to databases. Think of us as an open-source Segment alternative


I disagree, much easier to understand when it's described like this


I’m just saying this is better:

We are building Jitsu, (https://github.com/jitsucom/jitsu, https://jitsu.com/) We help companies collect events from their apps, websites, and APIs and send them to databases.

Think of us as an open-source Segment alternative.


I agree.

I hadn't heard of Segment and now I'm reading your competitor's website.


That's a chance you have to take when you're making this decision!


> Jitsu solves the AdBlocker problem...

This line alone is enough to infuriate me. So I am unable to block spying and data collection now?

I don't understand why we are still praising spyware tools?


I think calling this spyware is a pretty far stretch. We're talking about SAAS platforms that want to record events based on user behavior for a myriad of purposes; these are platforms that you are choosing to use, not tracking pixels dropped all across the web.

There are dozens of valid uses for this beyond ad-tech. Where we use Segment it barely even touches with marketing. Most of the value for Segment is piping user lifecycle events around to every platform and service you use to help enrich customer experience. Sure, call it sales or marketing or ad-tech, but that's really just an umbrella for trying to maximize revenue-per-customer - and isn't that the point of a SAAS platform?

I think we should be cautious about throwing the terms "spyware" and "malware" around right now; there are lots of very valid cases that should be labeled as such, but if we over-use the word it just makes it harder for us to delineate between powerful tools being used for good/valid purposes or deceptive ones.


Jitsu certainly can be used to build spyware (and so can Linux; probably 90% of spyware is running on Linux servers), but following this logic any OSS project can be accused of being spyware.

Yes, Jitsu can be deployed on a custom domain such as track.yourcompany.com. And while some AdBlockers will block *.segment.com, track.yourcompany.com will remain functional. We don't consider this feature unethical, though. It depends on how data collected by Jitsu is used. If the app owner sells it without telling end-users, that's probably bad. However, I believe most of our users use the data to improve product experience. And Jitsu can be configured to respect do-not-track/GDPR settings.


I mean, that's a common difference between a SaaS and a self-hosted solution. For a SaaS service a block list can include *.somesaas.com, whereas a self-hosted service has its own domain per owner, meaning you'd have to add every single one to the list. You can always find a common pattern in the URL (e.g. the API), and block based on something else. There's also an issue[0] to block based on POST body.

[0] https://github.com/uBlockOrigin/uBlock-issues/issues/1357
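
A tiny illustration of the wildcard point, in Python. Real ad-block filter lists use their own syntax and matching rules; the shell-style pattern, the `BLOCKLIST` name and the `is_blocked` helper below are simplifications invented for this sketch.

```python
from fnmatch import fnmatchcase

# Hypothetical block list with a single wildcard entry for a SaaS vendor.
BLOCKLIST = ["*.somesaas.com"]


def is_blocked(hostname, patterns=BLOCKLIST):
    """Return True if the hostname matches any wildcard pattern in the list."""
    host = hostname.lower()
    return any(fnmatchcase(host, pattern) for pattern in patterns)


print(is_blocked("api.somesaas.com"))       # True
print(is_blocked("track.yourcompany.com"))  # False
```

One wildcard covers every customer of the SaaS, while each self-hosted domain would need its own entry.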


Where is this line from?



The name is 1 letter off of an already existing opensource project: Jitsi


i thought this was a new product line within jitsi at first haha


It's also a Segment-like tool?


Mostly communications tooling IIRC


Very confusing indeed


Congrats on the launch!

Noticed a typo on jitsu.com - DHW should probably be DWH.


Oops... Thank you very much! Vercel is deploying the fix already.


Congrats on your launch. Great to see more innovation in this space. Segment deserves some serious competition.

-RudderStack team.


What is the difference between Jitsu and Rudderstack? Aware of both projects but keen to get your take.


Disclaimer: Haven't looked at Jitsu in depth so my understanding below may be limited.

Reading through their comment below - "Jitsu can be used for any kind of data while Segment compatibility is just a thin layer on top".

I am guessing they have built a generic event API that can be used to send any JSON payload while RudderStack (like Segment) has an opinionated view of events, e.g. there has to be a userID (or anonymousID), that ID is persisted in a cookie (for web), and every event must include that userID. Furthermore, there are certain standards for event tracking for specific verticals which RudderStack supports (e.g. eCommerce https://rudderstack.com/docs/rudderstack-api/api-specificati...)

Having this opinionated view helps us map these events to all the 100s of destinations, otherwise, you cannot send any arbitrary JSON to these destinations. It also lets us build more post processing in the warehouse (e.g identity stitching, user sessions etc https://hub.getdbt.com/rudderlabs/, we are going to build more MLish applications like churn-models and release them too).

On the other hand, it becomes hard to send generic events (e.g. application telemetry) via RudderStack which seems possible via Jitsu. With RudderStack, you would have to create hacky userID to tag on every event which doesn't make sense.

In summary, go deep on one use case (customer-data) or wide as a generic event streaming platform.

Beyond this, there are other feature differences (transformation, reverse-ETL etc) but that's not a fundamental difference imo. They are just getting started and are a much smaller team so that's expected. Impressive to see what they have built.


We have suggested event structure (https://github.com/jitsucom/jitsu/blob/master/javascript-sdk...). If you want to send data to destination, you should either follow suggested structure or write your own JavaScript mapping (https://jitsu.com/blog/javascript-transform).

And we have DBT models too https://hub.getdbt.com/jitsucom/jitsu/latest/ !


Is that dbt project also doing the sessionization? I see this:

    jitsu_sessionization_trailing_window: 3
    jitsu_session_inactivity_cutoff: 30 * 60

EDIT: No idea why valid reply from dev is marked dead, but thanks! Really, really cool that you're using dbt for this process.


Hi! Ildar here, one of Jitsu's core engineers. Yep. That is exactly what it does.
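
For readers unfamiliar with the term, sessionization with an inactivity cutoff can be sketched like this in Python. This is not the dbt package's SQL: the function name is made up, timestamps are assumed to be seconds sorted in ascending order for a single user, and whether the real model treats a gap of exactly the cutoff as a new session is not something this sketch claims to know.

```python
INACTIVITY_CUTOFF = 30 * 60  # seconds, same value as the variable quoted above


def assign_sessions(timestamps, cutoff=INACTIVITY_CUTOFF):
    """Return a session index for each timestamp (seconds, sorted ascending).

    A new session starts when the gap since the previous event exceeds cutoff.
    """
    sessions = []
    current = 0
    previous = None
    for ts in timestamps:
        if previous is not None and ts - previous > cutoff:
            current += 1
        sessions.append(current)
        previous = ts
    return sessions


print(assign_sessions([0, 60, 1800, 5000, 5100]))  # [0, 0, 0, 1, 1]
```

Here the 3200-second gap between the third and fourth events is longer than 30 minutes, so the last two events land in a second session.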


(New accounts are subject to extra restrictions and sometimes software kills their posts. We review those and try to find and unkill all the good ones, though it takes time and we do miss a few. I've marked absorbb's account legit now so this won't happen again.)



