Launch HN: Hydra (YC W22) – Query any database via Postgres (hydras.io)
326 points by coatue on Feb 23, 2022 | 77 comments
Hi HN, we’re Joe and JD from Hydra (https://hydras.io/). Hydra is a Postgres extension that intelligently routes queries through Postgres to other databases. Engineers query regular Postgres, and Hydra extends a Postgres-compliant SQL layer to non-relational, columnar, and graph DBs. It currently works with Postgres and Snowflake, and we have a roadmap to support MongoDB, Google BigQuery, and ClickHouse.

Different databases are good at different things. For example, Postgres is good at low-latency transactional workloads, but slow when running analytical queries. For the latter, you're better off with a columnar database like Snowflake. The problem is that for each new database added to a system, application complexity increases quickly.

Working at Microsoft Azure, I saw many companies juggle database trade-offs in complex architectures. When organizations adopted new databases, engineers were forced to rewrite application code to support the new database, or use multiple applications to offset database performance trade-offs. All of this is expensive busywork that frustrates engineers. In short, adopting new databases is hard and expensive.

Hydra automatically picks the right DB for the right task and pushes down computation, meaning each query will get routed to where it can be executed the fastest. We've seen results return 100X faster when queries are executed on the right database.

We've chosen to integrate with Snowflake first so that developers can easily gain the analytical performance of Snowflake through a simple Postgres interface. To an application, Hydra looks like a single database that can handle both transactions and analytics. As soon as transactions are committed in Postgres, they are accessible for analytics in real-time. Combining the strengths of Postgres and Snowflake in this way results in what is sometimes called HTAP: Hybrid Transactional-Analytical Processing (https://en.wikipedia.org/wiki/Hybrid_transactional/analytica...), which is the convergence of OLTP and OLAP.

Existing solutions are manual and require communicating with each datastore separately. The common alternative is trying to combine all of your data together into a data warehouse via ETL. That works well for analysts and data scientists, but isn't transactional and can't be used to power responsive applications. With Hydra engineers can write unified applications to cover workloads that had to be separate before.

Hydra runs as a Postgres extension, which gives it the ability to use Postgres internals and modify execution of queries. Hydra intercepts queries in real-time and routes queries based on query type, user settings, and Postgres' cost analysis. Writes and operational reads go to Postgres, analytical workloads go to Snowflake.
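
To make the split concrete, here's a purely illustrative sketch (the orders table and queries are made-up examples, not from a real deployment):

    -- A point write/read like this stays on Postgres:
    INSERT INTO orders (customer_id, total) VALUES (42, 99.50);
    SELECT * FROM orders WHERE id = 1234;

    -- A wide aggregation like this is routed to Snowflake:
    SELECT date_trunc('month', created_at) AS month, sum(total) AS revenue
    FROM orders
    GROUP BY 1
    ORDER BY 1;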

Recently committed transactions are moved from Postgres to Snowflake in near real-time using Hydra Bridge, our built-in data pipeline that links databases from within Postgres. The bridge is an important part of what we do. Without Hydra, workloads are typically isolated between different databases, requiring engineers to implement slow and costly ETL processes. Complex analytics are often run on older data, updated monthly or weekly. The Hydra bridge allows for real-time data movement, enabling analytics to be run on fresh data.

We make money by charging for Hydra Postgres, which is a Postgres managed service, and Hydra Instance, which attaches Hydra to your existing Postgres database. Pricing is listed on the product pages: https://hydras.io/products/postgres and https://hydras.io/products/instance.

A little about our backgrounds: Joseph Sciarrino - Former PM @ MSFT Azure Open-Source Databases team. Heroku (W08) and Citus Data (S11) alum. Jonathan Dance - Director @ Heroku (2011-2021)

Using Hydra you can create a database cluster of your own design. We'd love to know what Hydra clusters you'd be interested in creating. For example, Elasticsearch + Postgres, BigQuery + SingleStore + Postgres, etc. Remember - you can experiment with different combinations without rewriting queries, since Hydra extends Postgres over these other databases. When you think about databases as interoperable parts, you can get super creative!




  > Hydra automatically picks the right DB for the right task and pushes down computation, meaning each query will get routed to where it can be executed the fastest. We've seen results return 100X faster when queries are executed on the right database.

This is really interesting. Could you talk a bit more about query pushdown and planning/optimization?

Is this through FDW's? Would love to hear more about the technical details.

Shameless plug -- I work at Hasura (turn DB's into GraphQL API's) and this seems incredibly synergistic and useful to get access to databases we don't have native drivers for at the moment.

Any chance of an OSS limited version?


Hi! JD here, Hydra's CTO.

Hydra does not use FDWs except for Postgres-to-Postgres communication (for now). What we found was that FDWs do not do pushdown very well, even when Postgres has full information. You can get FDWs to push down aggregations, but complex queries with subqueries, etc. quickly get slow again. In short, our goal is to have your queries take full advantage of the power of each datastore, and we found that FDWs do not accomplish that goal.
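
For comparison, a plain postgres_fdw setup looks roughly like this (a sketch only -- server names and credentials are invented, and this is not how Hydra connects to non-Postgres stores):

    CREATE EXTENSION postgres_fdw;
    CREATE SERVER analytics FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'analytics.internal', dbname 'warehouse');
    CREATE USER MAPPING FOR CURRENT_USER SERVER analytics
        OPTIONS (user 'app', password 'secret');
    CREATE SCHEMA remote;
    IMPORT FOREIGN SCHEMA public FROM SERVER analytics INTO remote;

    -- A simple aggregate like this can be pushed down to the remote server,
    -- but add subqueries or joins and the planner often falls back to
    -- pulling rows locally before finishing the work in local Postgres:
    EXPLAIN VERBOSE SELECT count(*) FROM remote.events;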

We want to support GraphQL at some point, so same goes for us!

We are thinking about an OSS version; how we define "limited" is a big part of what that means. What would you like to see in an OSS version? What would you use it for?


Thanks for the explanation =D

  > What would you like to see in an OSS version? What would you use it for?

I think that's a difficult question to answer because it's hard to do data access partially. How do you gate it so that you don't give everything away for free while still incentivizing people to pay you?

Read-only access might be one way, but I'm unsure how popular that would be.

  > What would you use it for?

Generating GraphQL APIs for other data sources by funneling them through Postgres


Parent comment is absolutely right that FDWs as a general query router are still under heavy development. It's very likely that we'll see further improvement in forthcoming Postgres releases, which will come with additional benefits since FDWs are used for a lot more than just "high-level" query routing in Postgres.


This would be great.

I know that EnterpriseDB is heavily invested in FDW development and core Postgres stuff, so maybe we'll see some more neat stuff come out of that team that makes it upstream.


In that case how does it work? Are the connectors wrappers around the other db drivers that expose a common Postgres wire protocol?


We have a similar product at Splitgraph, where we do use FDWs in the routing layer (along with some PgBouncer magic). We recently blogged about adding aggregation pushdown to our Snowflake FDW. [0]

[0] https://www.splitgraph.com/blog/postgresql-fdw-aggregation-p...


Just a suggestion: your home page's one-liner is super vague and confusing. I had to scroll down to really figure out what it is that Splitgraph does...

"Splitgraph connects numerous, unrelated data sources into a single, unified SQL interface on the Postgres wire protocol."

Just my 2 cents.


Nice 2 cents! We just launched this LP recently so we're still testing it – we've also got a lot of pages to add this month that will hopefully clarify things.

The basic pitch is for a "Unified Data Stack – an integrated and modern solution for working with data without worrying about its infrastructure." Connecting unrelated data sources is one part of the product, but it also includes a data catalog, modeling, integration, warehousing...

By integrating the discovery, access and (optionally) storage layers, we reduce the friction for a lot of common workflows, kind of like GitLab does by bundling CI pipelines and version control. Even if each layer has some tradeoffs, the benefit of integrating them has a multiplicative effect on the platform itself. And if you need a more specialized provider for one layer, that's fine too – the "data middleware" model makes Splitgraph incrementally adoptable.

But yeah... marketing is hard, especially in the "bag of tools" stage of product/market fit when the optimal messaging can differ so much by use case. Thanks for your suggestion!


Big fan of Splitgraph and I know some other folks at Hasura are too


Steampipe [1] is an open source [2] project that uses Postgres FDWs to query 67+ cloud services (e.g. AWS, GitHub, Prometheus). The plugins [3] are written in Go similar to Terraform. We've found this approach very effective for DevOps data and pushdown works well for most (simple) cases. (Disclaimer: I'm a lead on the project.)

1 - https://steampipe.io 2 - https://github.com/turbot/steampipe 3 - https://hub.steampipe.io/plugins
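
For a taste of what that looks like in practice, a query can be as simple as the following (table and column names are from the AWS plugin; double-check them against the docs):

    select name, region
    from aws_s3_bucket
    order by name;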


This is awesome! I just have to ask, what's your monetization plan?

(I see your company has some pretty big customers, but I suppose you're not just selling them a cloud version of Steampipe)


Turbot [1] is a bootstrapped company since 2014. Our namesake product is a cloud governance platform with a real-time CMDB, identity suite, policy engine and thousands of automated operations for tagging, security, deployment, etc.

Steampipe Cloud [2] is in private preview providing a hosted version of Steampipe (and more). We're iterating fast and would love your feedback :-)

1 - https://turbot.com 2 - https://cloud.steampipe.io


Whoa, how have I not heard of this before?

If it's OSS, I am definitely interested in experimenting with this. Maybe I can write a blogpost or something?


Yes, the CLI, FDW, plugins and mods are all open source. Please let us know how you go - we thrive on feedback :-)


Incredible! This is SUPER useful!


I was thinking the same thing - FDWs As A Service


Congrats on the launch! Two questions:

- How does this deal with specifics of the query languages of the different data stores? I'm not an expert with Snowflake, but I suppose it supports specific querying capabilities not found in Postgres' SQL dialect. How are those exposed to Hydra users?

- I'm confused by "As soon as transactions are committed in Postgres, they are accessible for analytics in real-time" vs. "Recently committed transactions are moved from Postgres to Snowflake in near real-time". Is data propagated to Snowflake synchronously or asynchronously? I.e. is it guaranteed that data can be queried from Snowflake the moment after a transaction has been committed (as suggested by the former) or not (as suggested by the latter)?

Disclaimer: I work on Debezium, another solution people use for propagating data from different databases (including Postgres) into different data sinks (including Snowflake)


Hi, JD here, Hydra's CTO. Thanks for the interest and questions!

Today, queries need to be Postgres-compatible to be intelligently routed, but queries with specific syntax or functions beyond Postgres can be routed with our manual router[1]. This is our first solution to this problem, and we plan to iterate in response to customer pain.

Sorry for the confusion! Data moves asynchronously -- we're not trying to implement multi-phase commits -- but we can act on data very quickly once committed. Our solution here uses Postgres logical replication. Using the Data Bridge is optional and a customer's existing solutions are welcome as well.

[1]: https://hydras-io.notion.site/Router-a91f5282f1354c54a9ba894...
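
For context, the plain-Postgres logical replication primitives behind this kind of change capture look roughly like the following (names are illustrative; this is not a literal transcript of what the Bridge runs):

    -- publish changes from the tables you want mirrored downstream
    CREATE PUBLICATION analytics_pub FOR TABLE orders, customers;

    -- a consumer creates a replication slot and streams decoded changes from it
    SELECT pg_create_logical_replication_slot('analytics_slot', 'pgoutput');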


For anyone interested, Apache Calcite[0] is an open source data management framework which seems to do many of the same things that Hydra claims to do, but takes a different approach. Operating as a Java library, Calcite contains "adapters" to many different data sources, from existing JDBC connectors to Elasticsearch to Cassandra. All of these different data sources can be joined together as desired. Calcite also has its own optimizer which is able to push down relevant parts of a query to the different data sources. You also get full SQL on data sources which don't support it, with Calcite executing the remaining bits itself.

Generally, all that is required to connect to multiple data sources, from CSV to Elasticsearch, is writing a JSON configuration file. Then you get SQL access via JDBC, with the ability to join all those sources together.

Unfortunately, I would not be too surprised if the query execution in Calcite were found to be less performance-optimized than Hydra's. There is ongoing work to improve that. That said, there are users of Calcite at Google, Uber, Spotify, and others who have made great use of various parts of the framework.

[0] https://calcite.apache.org/
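
Once the adapters are configured, the kind of query you can issue over JDBC looks something like this (schema and table names are hypothetical):

    SELECT c.name, count(*) AS events
    FROM elastic.user_events AS e
    JOIN pg.customers AS c ON e.customer_id = c.id
    GROUP BY c.name;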


Calcite is actually pretty damn fast; the overhead is surprisingly minimal.

I am using it as the backbone for my hobby project that auto-generates federated GraphQL APIs:

https://github.com/GavinRay97/GraphQLCalcite

The experience has been incredibly positive and the community has been incredibly helpful & supportive. It's one of the coolest technical projects I've ever seen and has sparked my interest in query engines and relational databases.

I posted some JMH benchmarks of an app that parses a GraphQL query, converts it to Calcite relational expressions, and then executes it against an in-memory DB; it ran on the order of milliseconds:

https://lists.apache.org/thread/hofjx628864t0kt4kk8vo4tjfrxb...

Something very similar to Calcite but much lesser-known is the "Teiid" project:

https://github.com/teiid/teiid

Highly recommend checking the code out. It's got a brilliant query optimizer/planner tailored to cross-datasource queries, a cache system, and a translator architecture that can convert a generic SQL dialect to dozens of SQL/NoSQL flavors.

It also integrates sources like REST APIs, S3, flat files, etc. as queryable data sources.


I was referring more to the pieces of queries that Calcite may have to execute itself, often because they are not supported by the underlying data source. That part of Calcite hasn't been very highly optimized. There's definitely ongoing work in that area, though.

For example, I have a former student who's been working on an adapter to Apache Arrow. Beyond having a fast execution engine, this should also eventually enable Calcite to work pretty seamlessly with other systems using Arrow without serialization overhead.


Is this student in contact with Jacques Nadeau?

He's one of the only people I know that's a committer/PMC of both Arrow and Calcite, super nice and brilliant guy.

Maybe he'd have good feedback or ideas around this?

https://twitter.com/intjesus


Indirectly. I know Jacques. I'm also on the Calcite PMC :) Thanks for the pointer though!


If you've responded to any of my emails, no thank YOU!


I'd never heard of Calcite before, so I dug a bit deeper into the project and found it to be quite active. Taking into consideration only changes to Java files, this is the repo activity for the last 12 months:

     month  | authors | commits | files | churn 
   ---------+---------+---------+-------+-------
    2022-02 |      21 |      62 |    84 |  4387
    2022-01 |      26 |     109 |   243 | 32429
    2021-12 |      33 |      99 |   198 | 10461
    2021-11 |      19 |      49 |    77 |  6960
    2021-10 |      26 |      64 |   371 | 13626
    2021-09 |      18 |      41 |    68 |  2258
    2021-08 |      11 |      17 |    25 |  1924
    2021-07 |      17 |      30 |    51 |  2704
    2021-06 |      14 |      31 |    28 |  1708
    2021-05 |       9 |      17 |    35 |  1606     
    2021-04 |      11 |      46 |    99 |  4224
    2021-03 |      16 |      36 |   143 |  8471


The community is also great. Honestly, the main thing preventing things from moving faster is lack of time to review PRs. There's a lot of interesting things in the pipe!


Yeah I can see what you mean. There are currently 20 open pull requests [1] that haven't changed in over 28 days, which accounts for 53% of all open pulls that are less than 4 months old.

1) https://oss.gitsense.com/insights/github?q=pull-age%3A%3C%3D...


What do you use to produce this analysis?


My product (https://gitsense.com) moves most of Git's history into a Postgres database and from there, you can execute the following SQL statement:

    select
        commit_ym AS month,
        count(distinct(author_email)) as authors,
        count(distinct(commit_id)) as commits,
        count(distinct(path_id)) as files,
        sum(total) as churn
    from
        z1_commits_422 as commits,
        z1_changes_422 as changes,
        z1_code_churn_422 as churn
    where
        commits.id=changes.commit_id and
        changes.code_churn_id=churn.id and
        lang='java'
    group by commit_ym
    order by commit_ym desc
    limit 12

By having most of Git's history in SQL, I can slice, dice and cross-reference code history, which is how my product works.


Calcite is definitely some cool tech. I can see why it would be attractive for bigger teams, but it seems like a big lift for smaller teams. Our goal is to make it easy for devs already familiar with Postgres to add databases without learning new tools or adding software... besides adding Hydra, of course!


That's a fair point. Although if you're already using a JVM language, it's incredibly easy to integrate. Just another JDBC data source with some JARs to add to your classpath :)


What's the difference with a regular ORM? (genuine ask)


ORMs map relational databases to objects. Calcite does not do that. Calcite takes different data sources (potentially with different data models) and presents them all in the relational model. You interact with your data entirely in SQL while ORMs typically have some DSL which uses whatever object model the ORM defines. ORMs are also typically designed to connect to a single data source at a time. I'm personally not familiar with any ORMs that allow combining data from multiple sources within the same application.

The primary advantages of Calcite for connecting multiple data sources are 1) easily joining data sources that use completely different APIs (assuming there is a Calcite adapter available) and 2) supporting more complex queries than the original data source supports, without having to write code (other than SQL) to do the processing.


Don't forget having two industrial-grade academic query planner/optimizer implementations that took collective decades of engineering effort!

And the project being founded by someone who has written multiple RDBMS.

Calcite is a wonder.


This is super powerful. While I see the immediate value in this for simplifying applications, I can also see this becoming a powerful tool for data analysts & data engineers in speeding up their "time to insight".

I've had early-career analysts report to me who struggle writing optimal queries across relational, non-relational, & graph DBs (they're usually great at one & mediocre at the others). This will be huge for them & for the stakeholders who rely on them for trustworthy insights.


Wow. This seems like such a staggeringly good idea. Congrats on launch, and kudos for bringing this to life! Curious about the overhead (ie, benchmarks for the simplest scenario: vanilla postgres vs going through hydra for the same queries and load). But unless there's a huge hit there (which seems unlikely), this seems like a really exciting development.


Hi, I'm JD, Hydra's CTO. There's no perceivable overhead to using Hydra on queries being routed to Postgres. I think you would not be able to see Hydra in the noise in a benchmark -- but it's a great idea to demonstrate this! I will do a blog post! :) Of course, if you were to use Hydra Instance (where your Postgres database is remote) then there will be some network latency.


Been thinking about this sort of thing for a while; your vision for how this should work is so much better than mine was.

One of the ideas I kicked around was “materialize-on-read” - when a query comes in but the underlying data is stale, refresh the views first then serve the query.

I’m wondering how much state you plan to put into the Hydra layer or if you plan to keep it mostly a router.


This is really nice! Congrats!

I once started building as a side project something similar but focused on querying cloud resources (like S3 buckets, ec2s, etc... discovering the biggest file from a bucket was trivial with this). I abandoned the project but someone else built a startup on the same concept - even the name was the same: cloudquery.

I built it using the multicorn [1] postgres extension and it is delightful how easy it is to get something simple running.

[1] https://multicorn.org/


This looks great! Couple questions…

1.) Can you talk a bit about how this is better than the existing foreign data wrappers Postgres has available?

2.) Any thoughts on S3 support? More and more I see teams using S3 as a data store for certain specific use cases.


Hi, I wrote a response [https://news.ycombinator.com/item?id=30443033] about FDWs, I hope it answers your questions.

We will definitely think about S3 support! Would love to understand those use cases more and how Hydra could help.


Shame it’s not OSS, but I get that. About the ‘no lock-in’ statement on the site: if we saw speedups in both dev and execution performance by using this, and developed everything on it from that point on to make working with data easier across the enterprise, how are we not locked in when you decide to do something else or sell to Oracle? The latter happened to us, almost exactly, and that’s why no OSS is a no-go for dev infrastructure.

Definitely nice work though and best of luck!


Hi, JD here, Hydra's CTO. It's still early days and we are considering open source; for now, we wanted to leave our options open, and OSS feels like a one-way door. I think you make a great point here - thanks for sharing your past pain / experience. Definitely food for thought.

Our "no lock-in" claim refers to your data, since Hydra is Postgres, you're not stuck using "HydraDB" forever -- it's relatively easy to migrate in or out since you can use well established Postgres tools. We also are open to licensing the product should you wish to self-host, on-prem, etc.


> Hydra automatically picks the right DB for the right task and pushes down computation, meaning each query will get routed to where it can be executed the fastest.

Does this mean the data is duplicated to all the available storage backends?


Hi, JD here, CTO at Hydra. In an HTAP scenario, local transactional data would be replicated, but your data warehouse will likely have a great amount of data that your Postgres database does not. You can still connect that data to Postgres with Hydra. Ultimately, it's up to you if/how you choose to replicate your data -- along with guidance from our team along the way.


Looks neat, but wasn't this the promise of Presto? Presto didn't seem to really work out. From what I've seen it converged to a mostly analytical engine. It's still very useful, but I've never seen it used (successfully) in an OLTP workload. Maybe there's some difference in the intended product trajectory that I'm overlooking here?


AWS's Athena uses Presto to pretty good effect, though I guess you could say those use cases are largely relegated to analytical purposes.

Back in my consulting days, I built a distributed query system based on Presto to integrate some custom/on-prem data sources with more distributed/cloudy ones, Hive and such, and it worked well for that, too. Most of that was ad-hoc, batch, or event-driven analytics as well, but there were plans for supporting production workloads.

I think maybe one reason people shy away from things like Presto (and the above) is the uneven performance guarantees; waiting for an unoptimized Hadoop or Orcfile query by accident because you joined on something or another is fine for one-offs, but might become costly in prod workflows.


> I think maybe one reason people shy away from things like Presto (and the above) is the uneven performance guarantees; waiting for an unoptimized Hadoop or Orcfile query by accident because you joined on something or another is fine for one-offs, but might become costly in prod workflows.

Right, so my question is: how is that solved with Hydra? Seems like you'd arrive at the same issue?


Presto is pretty successful, but its focus is to be a distributed query engine, not a proxy layer for existing query engines. We use Trino (formerly Presto) as our query layer and do something similar to Hydra at Metriql [1], with a fairly different use case. Data people define a semantic layer with their metrics and expose them to 18+ downstream tools.

[1]: https://metriql.com


Sounds like a federated query engine with a cost-based optimizer. I worked for a company that went pretty far down this path using another database.

Definitely a lot of potential, and also a lot of potential gotchas.

Translating from one SQL syntax (e.g. Postgres) to others, while maintaining the full capability of the other system, for example, turned out to be quite complex (but doable).

Will be following your project and wish you all the best. I suspect that if you keep things sharp/focused and don't go too crazy with promises or use cases, this could be quite successful.

Is this going to stay an intelligent router and sort of proxy? Or do you have plans for federated, heterogeneous joins, for example?

That's where things get interesting :) I think there's a lot you can do without even having to go that far.


Awesome. Do you plan to support providing the hosted instance on other providers (specifically, non-US companies like Hetzner)?

Alternatively, do you plan on offering a self-hosted version?

I would be interested in the Clickhouse integration. Specifically, it would allow me to easily add Clickhouse to my Rails-based product analytics tool [0] while still using the Rails ORM (as far as I understand this should be possible?).

0: https://github.com/shafy/fugu


Certainly, I see us expanding to other cloud providers as we follow customer demand, but it will take some time. I think if you wanted to move faster and have a higher level of control, self-hosted would be the way to go. We are offering that, but it's not on our website.

Definitely stay tuned [0] for Clickhouse! And yes, exactly, you can continue to use your ORM of choice.

0: social links are at the bottom of hydras.io :)


Thanks! I'll stay tuned for Clickhouse :-)


I see a lot of software named after beasts, and a lot of 'Hydra' programs/companies all doing different things. Imagine if someone in 300 BC thought about how we would base our future creations off mythological beasts... they would've increased the CIDR range on all available beast name ideas and written a whole bunch of extra stories.


Interesting. I'm curious about how you handle security now, and what the plans are. That is, is there any integration between the roles/rights my Postgres session user has and the roles/rights I have on the downstream database?


Could this also improve either developer experience or query performance when working with something like Redshift, which is a columnar OLAP store that already uses a Postgres dialect?


This looks amazing. Love the strong Snowflake integration -- very forward looking. I just passed this on to our Data Science team.


What does it take to collaborate on a backend? We've investigated building a Postgres extension for querying SpiceDB[0] and Hydra seems like it could help. What kind of consistency guarantees can be made?

[0]: https://github.com/authzed/spicedb


Hi! Definitely reach out to us to discuss further.


Congratulations on the launch - this sounds interesting.

I'm currently using Postgraphile[0], which uses Postgres' introspection API to discover the schema structure.

Would this still work with Hydra?

[0] https://www.graphile.org/postgraphile/


Absolutely! Hydra is 100% Postgres and supports any existing Postgres-compatible tools.


I don't have any experience with Hydra, but I first used FDWs to query external DBs about a decade ago. Also, there was a pretty popular DB sponsored by Facebook that seemed to do a lot of the same things; the name escapes me, though. Facebook used it pretty heavily internally.


I like it, but how much can you truly do without the specifics of, say, BigQuery? Some things are so specific that you will end up requiring BigQuery anyway, which sounds like an ORM with Postgres syntax. I like the idea.


This is very interesting. I’ve been building an internal system that looks a lot like this for the current startup I’m at.

Will be following.


Joe, CEO @ Hydra - Awesome, looking forward to talking! On the top banner on Hydras.io we link to our Discord


Great timing as Spanner just launched Postgres wire protocol support.


Congrats on the launch! Coming from a data science role, this could've been pretty useful for my previous projects. I had to rewrite all of my feature engineering queries when the company I worked at moved to Snowflake.

One question I have is how Hydra balances writing Postgres scripts vs leveraging system-specific features. For example, I remember going through Snowflake's documentation and finding interesting functions for data aggregation. Can I leverage Snowflake-specific features when using Hydra?


Hi, JD here, Hydra's CTO. Thanks for the great question!

You can use our manual router[1] to route queries that use specific syntax or functions. The way this works today is you wrap your query in a SQL function. In the future, we could detect the use of specific features and route those queries appropriately. I think there might be other ways to solve this as well, e.g. by having a 'stub' aggregate function in Postgres for the function you want to call. We are working with customers to iterate on issues like this as they occur.

[1]: see "Manual Routing" at https://hydras-io.notion.site/Router-a91f5282f1354c54a9ba894...


Any plans to offer a self-hosted version of Hydra instance?


Hi, Joe, CEO @ Hydra - Yes. It's not on our website currently, but we can offer Hydra self-hosted today. Ping me and we can get you set up. We have a new Discord too: https://discord.gg/SQrwnAxtDw


Looking forward to the support for MongoDB and other no-sql stores. Interested to hear how you're trying to approach that.


Hi! JD here, Hydra's CTO. NoSQL is certainly a challenge but we have a few ideas/angles on how to solve it. Certainly we plan to start with simple queries and then iterate from there based on what our customers need.

We are really excited about the prospect of bringing SQL and NoSQL together!


This is definitely something I could see myself paying for --- but only if I could somehow get relational performance for nasty Mongo aggregate queries.


Aggregations by their nature are designed to work on a substantial footprint of data. As a result, changing the query model is unlikely to speed up the aggregation itself. In fact, as most of these libraries require the data to be shipped to the client (whereas aggregation queries run on the server), you will likely see substantially reduced performance.


Hydra doesn't ship data to the client in order to then do further work like aggregations -- that's the whole point of Hydra -- but that also means that you won't be able to "workaround" a performance issue with an underlying data store. For that, we'd need to find a way to replicate the data to a data store that can solve the aggregation performance issue.


Congrats on the launch, this is amazing!



