
tejasmanohar · on Nov 12, 2021

Agree RE: focus. Rudderstack is a bundle and there is a place for that.

Hightouch is 100% focused on activating data from the warehouse. Everything we build is Reverse ETL or built on top of Reverse ETL. That means we spend every waking minute thinking about progressing the Reverse ETL space, just like Fivetran is laser-focused on SaaS data ingest or Snowplow on behavioral/event data ingest (other parts of the RudderStack bundle).

We're a big fan of RudderStack's drive. As an ex-Segmenter, I can say that competing with that team is not easy. Best of luck! We will continue watching from the sidelines :)

soumyadeb · on Nov 12, 2021

Thanks Tejas for your kind words :) We too have nothing short of enormous respect for what you guys have achieved. Looking forward to you and the team pushing innovation in this space forward.

tejasmanohar · on Nov 12, 2021

Thanks, Soumya! I agree. Congrats on all your continued success at RudderStack as well.

tejasmanohar · on Nov 11, 2021

This probably isn't the best place for an extended comparison, but since it's our launch post, I'll try to close the thread with a couple corrections for factuality. If anyone is interested in a deep-dive, email hello@hightouch.io, and I'm happy to set one up personally. And, I'm sure the team at Grouparoo would be willing to do the same ("contact us" at bottom of their website).

    * Add tags to contacts in mailchimp, zendesk or make lists of them in customer.io, Pardot, etc based on segmentation. I believe Hightouch Audiences is more like a filter.

With static mappings, audiences can be synced to destinations as tags :). The magic is in the abstractions, not features!

    * Full workflow with branches, PRs, test suite in a repo. I saw Hightouch added git syncing to a known branch yesterday and it looks cool, but it's not the full workflow yet.

Lots more coming soon here. Our git integration is bidirectional so you can totally do that stuff in git, but UI support is on the way. We've found the UI is a much better experience than code for _most_ Reverse ETL workflows... so I see the value in this - I'll check it out.

If I have to be honest, the biggest thing that customers love about our product is that it works and accomplishes their use cases. Platform features are cool, but from time to time, I have to remind myself that Fivetran has proven that integrations, and actually working, come first. It is volume, but not _just_ volume... our philosophy (destinations as a product), design, and progress there are quite differentiated from the rest of the space. You can read more in our Series A announcement from a few months ago at https://hightouch.io/blog/series-a

PS: I haven't tried Grouparoo in a while. I do love the concepts, will give it a swing!

bleonard · on Nov 11, 2021

It's hard to leave the comparisons dangling, for sure. But I'll defer for now. Congrats on the launch :-)

tejasmanohar · on Nov 11, 2021

Good callout. Sometimes, I joke that warehouse ingestion latency is the bane of my existence, but it's improving...

Our average customer runs Hightouch syncs roughly every hour, but we can actually run syncs up to every minute! HT has a lot of optimizations like only sending changes to destinations instead of all data every run.
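To illustrate the "only sending changes" idea, here is a minimal Python sketch of snapshot diffing. It is not Hightouch's implementation: the function name `diff_snapshots`, the in-memory dictionaries, and the primary-key layout are all assumptions made for the example.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def diff_snapshots(previous: Dict[str, Row], current: Dict[str, Row]) -> Dict[str, List[Any]]:
    """Compare two query results keyed by primary key (hypothetical helper).

    Returns only the rows that need to be sent to the destination,
    instead of the full result set.
    """
    # Rows whose key did not exist in the previous run.
    added = [row for key, row in current.items() if key not in previous]
    # Rows whose key existed before but whose values differ now.
    changed = [
        row for key, row in current.items()
        if key in previous and previous[key] != row
    ]
    # Keys that disappeared from the query result since the previous run.
    removed = [key for key in previous if key not in current]
    return {"added": added, "changed": changed, "removed": removed}
```

If nothing changed between two runs, all three lists are empty and there is nothing to send, which is what makes frequent runs cheap on the destination side.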

On the warehouse side, we're seeing a lot of improvements. BigQuery has streaming insert APIs [0] implemented with a parallel database on the backend that's joined at read time. Combined with timestamp partitioned tables (sortable) and our in-warehouse diff'ing, you can actually create a streaming pipeline in Hightouch. Some companies like JetBlue are doing cool stuff with lambda views on top of Snowflake [1]. Our power users at Hightouch are running syncs as fast as every minute.

For wider context, we find 90%+ of business use cases to be just fine in batch. It's amazing to see how many people are still replacing... manual CSV workflows... with Hightouch :)

That said, there are some use cases for truly real-time workflows (e.g. a post-checkout email), and for that, customers either implement outside of Hightouch or lately, we've been fiddling around with letting customers plug directly into streams like Kafka, Kinesis, PubSub - though they lose the power of SQL aggregations _for now_.

Streaming SQL databases like Materialize [2] will fix this fundamentally, and Hightouch can connect to them. Email hello@hightouch.io if you want to try any of the new stuff!

[0]: https://cloud.google.com/bigquery/docs/write-api

[1]: https://discourse.getdbt.com/t/how-to-create-near-real-time-...

[2]: https://materialize.com/

tejasmanohar · on Nov 11, 2021

Haha thanks. Love some friendly competition :). In all seriousness, though we're focusing elsewhere, the OSS angle is cool.

If you're interested in self-hosted though, just reach out at hello@hightouch.io.

That said, IMO one of the coolest parts of our tech is our "hybrid architecture". Out of the box, no data is stored in Hightouch - it's all in your cloud (warehouse, S3 bucket). This is how fintech (Plaid, Blend, Betterment, + some banks now!) and healthcare brands like Headway use us. We've also done a ton of compliance work and have certificates for SOC2 Type II and whatnot.

tejasmanohar · on Nov 11, 2021

Cofounder here! Not quite. In Hightouch, you define your model (SQL) and create a sync (point-and-click or JSON/YAML).

The syncs are declarative, not imperative. They don't map 1:1 to API calls by design. You tell us what you want the destination to look like, and we figure out how :). Kinda like how a database creates the best plan for your SQL query before executing it.

Here's an example - https://i.imgur.com/05T5iKK.png. This sync maps your users table to Salesforce "Contacts" and the mapping interface also encodes the foreign key relationship between Contact:Account in Salesforce. Under the hood, we do all the lookups, caching, batch API calls using the bulk API, automatically handle rate limits, and only send changes from your database.
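A rough Python sketch of the declarative idea: the caller supplies the desired state, and a planner decides which batched operations to issue. This is an illustration only. `plan_sync`, the operation names, and the batch size are hypothetical and do not reflect Hightouch's or Salesforce's actual APIs.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def plan_sync(
    desired: Dict[str, Record],
    existing: Dict[str, Record],
    batch_size: int = 200,
) -> List[Dict[str, Any]]:
    """Turn "what the destination should look like" into batched operations.

    `desired` is the model's query result keyed by an external ID;
    `existing` is what the destination currently holds (both hypothetical).
    """
    # Records missing from the destination must be created.
    creates = [rec for key, rec in desired.items() if key not in existing]
    # Records present but different must be updated; unchanged ones are skipped.
    updates = [
        rec for key, rec in desired.items()
        if key in existing and existing[key] != rec
    ]

    plan: List[Dict[str, Any]] = []
    for op, records in (("create", creates), ("update", updates)):
        # Group records into batches so each step maps to one bulk call.
        for start in range(0, len(records), batch_size):
            plan.append({"op": op, "records": records[start:start + batch_size]})
    return plan
```

The user never writes the individual calls; the number and kind of steps fall out of the difference between the two states, much like a query planner choosing an execution plan.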

This is one of our key design differences compared to iPaaS tools like Tray, Zapier, Workato, Mulesoft, etc., which tend to just map actions to API calls 1:1. Data integration being declarative is something I'm really passionate about personally... wrote a blog with more examples at https://hightouch.io/blog/the-future-of-data-integration-wha...

tomnipotent · on Nov 11, 2021

> You tell us what you want the destination to look like

Implicit mapping between SQL to target is great, but how does the SQL author know what SQL to write in the first place?

I've done no shortage of integrations like this, and there is no avoiding reading the target SaaS documentation to know what their schema looks like so I can shape data accordingly. Without that step, I can't even start writing SQL.

tejasmanohar · on Nov 11, 2021

Not implicit but - declarative! Our goal is to provide enough context in our docs, app (e.g. autocomplete, automatic schema discovery, etc.), and resources to guide users through this and then recipes on top for common workflows!

We do a lot of validation upfront (at both the schema & data layer), and I think it's still early days there... this is a big opportunity IMO. Great callout.

We find people start with a simple SQL model + sync and then bounce back and forth and edit their queries as they explore our columns.

matchagaucho · on Nov 11, 2021

What if the state of IsHireable changes in the system of record (SFDC)? Will Hightouch overwrite OLTP data with stale warehouse data?

joshwget · on Nov 11, 2021

Short answer is yes, but here's why:

By syncing data to a particular field in Salesforce, you're effectively saying that the source of truth for that field is the warehouse, and not Salesforce. If you expect a human to update a field, then Salesforce is the source of truth for that field, and Hightouch shouldn't write to it!

What we typically see is that Salesforce contains data that's expected to be updated and maintained in the tool, and then other "read-only" fields coming from Hightouch.
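The field-ownership rule above can be sketched in Python as follows. This is a simplified illustration, not the product's logic: `apply_sync` and the `LifetimeValue` field are hypothetical, and `IsHireable` is borrowed from the question purely as an example of a human-maintained field.

```python
from typing import Any, Dict, Set

Record = Dict[str, Any]


def apply_sync(destination_record: Record, warehouse_row: Record, synced_fields: Set[str]) -> Record:
    """Overwrite only the fields the sync owns; leave everything else alone."""
    updated = dict(destination_record)  # copy so the input is not mutated
    for field in synced_fields:
        if field in warehouse_row:
            # The warehouse is the source of truth for synced fields,
            # so its value wins even if the destination was edited.
            updated[field] = warehouse_row[field]
    return updated


# A human edited IsHireable in the CRM; the warehouse copy is stale.
crm = {"IsHireable": True, "LifetimeValue": 10}
warehouse = {"IsHireable": False, "LifetimeValue": 25}

# Only LifetimeValue is mapped in the sync, so IsHireable is left untouched.
result = apply_sync(crm, warehouse, {"LifetimeValue"})
```

Mapping `IsHireable` in the sync as well would overwrite the human edit with the warehouse value, which is the behavior described in the short answer.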

tejasmanohar · on Jan 26, 2021

Congrats! We at Hightouch [0] ("reverse ETL") are excited to see Airbyte here on HN. We've been following Michel & John for a while now since the YC days, and from the outside, it seems like they've been consistently shipping incredibly quickly ever since the open-source project launch.

@mtricot -- You mention that a big value prop of Airbyte is providing an interface for building custom connectors. Have there been interesting learnings on designing an ideal "interface" to provide developers? How does the interface you provide compare to that of Fivetran's Functions offering [1]?

[0]: https://hightouch.io

[1]: https://fivetran.com/docs/functions

mtricot · on Jan 26, 2021

Answering your first question: When talking about the interface, we need to separate the data protocol from the developer experience (DX) of creating & maintaining a connector. We believe the data protocol we have in place should address 95% of the use cases and, as we get more sophisticated use cases, we will evolve the protocol (for example, for more scale). Regarding the DX, we are continuously working on it to make it a breeze and ensure super high quality.

Answering your second question: Fivetran functions are a nice escape hatch but none of the users we talked to mentioned those. They always mentioned building inhouse for missing connectors. My interpretation is that this is too much of a vendor lock-in for a cloud-based product.

tejasmanohar · on Sept 14, 2020

Hightouch.io cofounder here! Thanks for the S/O. Our online presence is quite limited so I'll post a summary here...

We've built an e2e marketing automation platform on top of your data warehouse. Marketers can interactively explore their customer base, run targeted campaigns in downstream email/ad/etc tools, and analyze results leveraging all the data they have in their warehouse.

RE: "messy data" -- Totally agree with bleonard's point that overall, the trend is towards data enablement. That said, I don't think any of the solutions in the markets (even Looker) suffice. I've attended dozens of calls with Looker users who first say that Looker offers self-service exploration but then fail to retrieve fairly basic information via a Zoom screen-share. The truth is it's really hard to do self-service data exploration generically. I think what's lacking from the "BI market" are more verticalized solutions on top of your warehouse (think UIs like Amplitude's funnel analysis, Intercom or Kustomer's segmentation interface, etc.).

To make our product work, we've built UIs that are super focused on particular tasks as well as a pretty nifty graph-based "modeling layer" that sits above your warehouse (which ideally, you use DBT/Dataform on) to abstract over complex JOINs and such.

This whole space is super fascinating to me. Always happy to exchange notes and talk shop RE: marketing, warehouses, customer data, etc. If you have thoughts, hit me up at tejas [at] hightouch.io.

tejasmanohar · on Aug 27, 2020

Many of the rules aren't rules. We ran a travel company and used Stripe in the past, and travel is also one of the disallowed industries. We got approval from Stripe after proving that we have a negligible fraud & chargeback rate due to being focused on business users.

dotBen · on Aug 27, 2020

This. One of the biggest blinkers technically-inclined founders have is that they forget or ignore that so much is relationship driven.

Rules like Stripe's (+ Wells Fargo's) are not interpreted like code. Everything is open to negotiation and degrees of freedom depending on the relationship established.

Nasrudith · on Aug 28, 2020

Those blinkers are called "not being utterly insane". The whole model of disruption is seeing a stupid practice, saying "No, we aren't doing that stupid shit," watching practitioners of the existing stupid froth at the mouth, and then either succeeding or failing.

Seriously that is why honor based lending died to banks centuries ago. Relationship driven is a fucking stupid way to do finance.

dotBen · on Aug 28, 2020

Finance is totally relationship driven (and 'honor systems' isn't really a good example of it). My bank waives fees on everything because of my personal friendly relationship with my banker. They gave me preferred terms of my mortgage rate because of my formal relationship - it wasn't just the product of a formula at the end of the banker's computer screen.

I know they'll do all sorts of shit for me because of the hard (account age, $$) and soft (personal) relationship.

But my point is more broad - API agreements come to mind as an example, just because that is a space I've played in prior jobs. "But the API Ts&Cs say you can't do xx but they are doing xx". Yeah, they have a relationship and got a dispensation.

michannne · on Aug 28, 2020

That's an ideal. At the end of the day these are human-run industries, relationships are a big part of that.

lovegoblin · on Aug 27, 2020

Yeah this is something I think a lot of people in this thread are missing: OnlyFans is probably big enough that they can negotiate their contract with Stripe.

tejasmanohar · on March 17, 2020

Even with the same fare class, the tickets sometimes cost more through the change interface on United.com. We've added more detail to our post: https://bookwithcarry.com/blog/united-cheap-flights-coronavi...

I can assure you that there was no intentional fudging here, and if there's something we're missing, we'd be the first that would like to know it. Due to all the discussion about fare classes, we've also included information in the blog post about problems in the change flow related to that.