
This looks useful, though I'd prefer self-hosted for some use cases.



Thanks. Yes, in some cases where you're working with regulated data you'd need a self-hosted version. We're working on an enterprise version that allows this.

Would you be able to describe your use case and the kind of data you're using?


Sorry, what I have in mind is mostly just personal productivity stuff, relatively small data, but private.


Our BYODB plan lets you connect to one of your own databases. It's not quite self-hosted, but you are in full control of your DB.


If you're looking for open source, self-hosted ELT, I suggest you check out Meltano (https://meltano.com), which we've been working on at GitLab since 2018!

We've decided to focus primarily on the CLI for the moment, which you can see showcased in the example code on the homepage (https://meltano.com), but we're working on a UI as well, as evidenced by today's release blog post: https://meltano.com/blog/2020/08/17/now-available-meltano-v1...

---

Meltano uses open source Singer taps and targets (https://singer.io) as its extractors and loaders, so to put together something similar to DropBase (which looks amazing, by the way), you could use:

- tap-spreadsheets-anywhere (https://github.com/ets/tap-spreadsheets-anywhere, https://gitlab.com/meltano/meltano/-/merge_requests/1813), which supports CSV and XLS over S3, HTTP(S), (S)FTP, etc.,

along with:

- target-postgres (https://github.com/meltano/target-postgres, https://meltano.com/plugins/loaders/postgres.html),

- target-jsonl (https://github.com/andyh1203/target-jsonl, https://meltano.com/plugins/loaders/jsonl.html), or

- any of the others you find on https://meltano.com/plugins/extractors/, https://meltano.com/plugins/loaders/, or https://www.singer.io/.

---

For transformation, Meltano currently supports only dbt (https://www.getdbt.com/), which means that unlike DropBase, it's built for ELT rather than ETL, since transformation takes place inside the loading database, rather than in between the E and L steps.
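To make the ETL vs. ELT distinction concrete, here's a rough sketch in plain Python. It is not Meltano or dbt code: SQLite stands in for the loading database, a plain SQL statement stands in for a dbt model, and the `orders` rows and the `clean` helper are made up for illustration.

```python
import sqlite3

# Made-up "extracted" records: untidy names, amounts still stored as text.
ROWS = [("  Alice ", "10"), ("BOB", "32")]


def clean(name, amount):
    """Hypothetical transformation: tidy the name, parse the amount."""
    return name.strip().lower(), int(amount)


def etl(rows):
    """ETL: transform in application code, between the E and L steps."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (name TEXT, amount INTEGER)")
    db.executemany("INSERT INTO orders VALUES (?, ?)", [clean(*r) for r in rows])
    return db.execute("SELECT name, amount FROM orders ORDER BY name").fetchall()


def elt(rows):
    """ELT: load the raw rows first, then transform inside the database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_orders (name TEXT, amount TEXT)")
    db.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)
    # This SQL step plays the role a dbt model would play after loading.
    db.execute(
        "CREATE TABLE orders AS "
        "SELECT lower(trim(name)) AS name, CAST(amount AS INTEGER) AS amount "
        "FROM raw_orders"
    )
    return db.execute("SELECT name, amount FROM orders ORDER BY name").fetchall()
```

Both functions end up with the same cleaned `orders` table; the difference is only where the transformation runs, and therefore whether you need to write SQL to express it.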

I'm very interested in exploring the ETL direction more, though, because as DropBase clearly shows, there are still a lot of companies and people who may not be experts in SQL, but would benefit tremendously from sturdy ETL with an accessible interface and flexible integration points.

As I just wrote on our Slack workspace (there's a link on https://meltano.com):

> I’d love to see Meltano UI develop in that direction for simple transformations over Singer tap stream/entity schema and record JSON, so that we can do ETL as well as dbt-backed ELT.

> We’d probably start with a way of specifying transformations in `meltano.yml`, similar in spirit to the `select`, `schema`, and `metadata` extras (https://meltano.com/docs/command-line-interface.html#extract...), and/or by pointing at a Python file that can process each Singer message. Building a DropBase-style UI over that would be on the horizon too, once we’ve brought the Entity Selection interface back (https://gitlab.com/meltano/meltano/-/issues/2002) and added interfaces for metadata rules and schema overrides.
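For a sense of what "a Python file that can process each Singer message" might look like, here's a minimal sketch. It is not an existing Meltano feature or API: the only thing it relies on is the Singer convention of one JSON message per line with a `type` of `SCHEMA`, `RECORD`, or `STATE`. The function names and the `email` field are hypothetical.

```python
import json
import sys
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


def lowercase_email(record: Record) -> Record:
    """Hypothetical per-record transformation: normalize an `email` field."""
    email = record.get("email")
    if isinstance(email, str):
        # Return a copy so the caller's dict is not mutated.
        return {**record, "email": email.strip().lower()}
    return record


def transform_messages(
    lines: Iterable[str], transform: Callable[[Record], Record]
) -> Iterator[str]:
    """Apply `transform` to RECORD messages; pass everything else through."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        if message.get("type") == "RECORD":
            message["record"] = transform(message["record"])
        # SCHEMA and STATE messages are re-emitted with their content unchanged.
        yield json.dumps(message)


def main() -> None:
    """Read Singer messages from stdin and write transformed ones to stdout."""
    for out in transform_messages(sys.stdin, lowercase_email):
        sys.stdout.write(out + "\n")
```

With `main()` called at the bottom of the file, a script like this could sit in a pipe between a tap and a target, which is the "in between the E and L steps" position that ETL needs. A real implementation would probably also have to adjust SCHEMA messages whenever a transformation changes a field's type.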

I'll create some more issues around this potential direction tomorrow :-)

---

If you or anyone else interested in open source, self-hosted ETL/ELT ends up giving it a try, I'd love to hear what you think, so that we can figure out how to build in this direction together!


Thanks for all the info - sounds interesting, I'll check it out.



