Reading the deployment information, there's an interesting tension here with applications that target self-hosting.

Deploying this requires running 5 different open source servers (databases, proxies, etc), and 5 different services that form part of this suite. If I were self-hosting this in a company I now need to be an expert in lots of different systems and potentially how to scale them, back them up, etc. The trade-offs to be made here are very different to when architecting a typical SaaS backend, where this sort of architecture might be fine.

I've been going through this myself with a hobby project. I'm designing it for self-hosting, and it's a radically different way of working to what I'm used to (operating services just for my company). I've been using SQLite and local disk storage so that there's essentially just 2 components to operate and scale – application replicas, and shared disk storage (which is easy to backup too). I'd rather be using Postgres, I'd rather be using numerous other services, background queue processors, etc, but each of those components is something that my users would need to understand, and therefore something to be minimised far more strictly than if it were just me/one team.

Huly looks like a great product, but I'm not sure I'd want to self-host.




Cheap, easy, powerful: choose any two.

- Cheap and easy: embed SQLite, a KV store, a queue, and everything else into one executable file. Trivial to self-host: download and run! But you're severely limited in the number of concurrent users, the ways you can back up the databases, and visibility / monitoring. If a desktop-class solution is good for you, wonderful, but be aware of the limitations.

- Cheap and powerful: all open-source, built from well-known parts, requires several containers to run, e.g. databases, queues, web servers / proxies, build tools, etc. You get all the power and can scale and tweak to your heart's content while self-hosting. If you're not afraid to tackle all this, wonderful, but be aware of the breadth of the technical chops you'll need.

- Easy and powerful: the cloud. AWS / Azure / DO will manage things for you, providing redundancy, scaling, and very simple setup. You may even have some say in tuning specific components (that is, buying a more expensive tier for them). Beautiful, but it will cost you. If the cost is less than the value you get, wonderful. Be aware that you'll store your data on someone else's computers though.

There's no known (to me) way to obtain all three qualities.


> - Easy and powerful: the cloud. AWS / Azure / DO will manage things for you, providing redundancy, scaling, and very simple setup. You may even have some say in tuning specific components (that is, buying a more expensive tier for them). Beautiful, but it will cost you. If the cost is less than the value you get, wonderful. Be aware that you'll store your data on someone else's computers though.

Easy on the magic powder. Cloud vendors manage some stuff, but mostly they abstract away the hardware and package management, and that's about it. Hosting a PostgreSQL DB on RDS instead of on a VM somewhere or on bare metal doesn't change much. Sure, redundancy and scaling are easy to set up. But you still have to stay up to date with best practices, secure it at the network level, choose and set your backup policy, schedule upgrades, plan the potential downtimes, track what is deprecated and what is new, and watch for the price skyrocketing because AWS doesn't want many customers still running that old version. The same applies to many individual technologies sold as "managed" by cloud vendors.

A whole lot of admin overhead is not removed by running some individual tech as a managed service.

You only remove a significant part of the management when the whole stack comprising your user-facing application is managed as a single entity, but that comes with another problem: those managed apps end up being highly coveted targets for black-hat hackers. Customers all over the world usually end up being pwnable at the same time, thanks to the same bug/security hole being shared. It becomes a question of when, not if, your users' accounts will become public data.


> Cheap, easy, powerful: choose any two.

I don't think there's any reason the same codebase can't support different trade-offs here.

Maybe I'm just looking at the past through rose-coloured glasses, but it seems to me that was the norm before we standardized on distributing apps as an entire stack through a docker-compose.yml file.

Your app depends on redis for caching. If no redis instance is configured, go ahead and just instantiate a "null" caching implementation (and print a warning to the log if it makes you feel better) and carry on.

You're using minio for object storage. Is there any reason you couldn't, I don't know, use a local folder on disk instead?

Think of it as "progressive enhancement" for the ops stack. Let your app run simply in a small single node deploy, but support scaling up to something larger.
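
As an illustration of that fallback idea, here's a minimal Go sketch with a hypothetical Cache and BlobStore (not any particular project's API): a no-op cache when Redis isn't configured, and a plain local folder instead of minio:

  package main

  // A minimal sketch, not any real project's API: a Cache with a no-op
  // fallback when Redis isn't configured, and a BlobStore backed by a
  // plain local folder instead of minio/S3.

  import (
    "log"
    "os"
    "path/filepath"
  )

  // Cache is all the application code ever sees; swap implementations freely.
  type Cache interface {
    Get(key string) ([]byte, bool)
    Set(key string, value []byte)
  }

  // nullCache satisfies Cache but stores nothing: every lookup is a miss.
  type nullCache struct{}

  func (nullCache) Get(string) ([]byte, bool) { return nil, false }
  func (nullCache) Set(string, []byte)        {}

  // BlobStore abstracts object storage; a local directory is the simple default.
  type BlobStore interface {
    Put(name string, data []byte) error
    Get(name string) ([]byte, error)
  }

  type diskStore struct{ dir string }

  func (d diskStore) Put(name string, data []byte) error {
    return os.WriteFile(filepath.Join(d.dir, name), data, 0o644)
  }

  func (d diskStore) Get(name string) ([]byte, error) {
    return os.ReadFile(filepath.Join(d.dir, name))
  }

  func main() {
    var cache Cache = nullCache{}
    if os.Getenv("REDIS_URL") == "" {
      log.Println("no REDIS_URL configured, using a no-op cache")
    } // else: construct a real Redis-backed Cache here instead

    if err := os.MkdirAll("data", 0o755); err != nil {
      log.Fatal(err)
    }
    var blobs BlobStore = diskStore{dir: "data"}

    if err := blobs.Put("hello.txt", []byte("stored on local disk")); err != nil {
      log.Fatal(err)
    }
    if _, ok := cache.Get("greeting"); !ok {
      log.Println("cache miss, as expected with the null cache")
    }
  }

The point is that nothing above has to change when a real Redis or S3 backend is wired in behind the same interfaces later.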


IMHO, redis is either used as a key value store (which can easily be replicated in application code) or as a central storage to synchronize tasks (like counters).

For the first case, dev should just build on SQLite or use application code. For the latter case, choose a single storage engine and use it for everything (Postgres?).
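
For the first case, a process-local key/value store really is only a few lines of application code; a tiny Go sketch (all names made up):

  package main

  // A process-local key/value store (a mutex-guarded map with TTLs) standing
  // in for Redis when all you need is ephemeral caching.

  import (
    "fmt"
    "sync"
    "time"
  )

  type entry struct {
    value   string
    expires time.Time
  }

  type localKV struct {
    mu   sync.Mutex
    data map[string]entry
  }

  func newLocalKV() *localKV { return &localKV{data: make(map[string]entry)} }

  func (kv *localKV) Set(key, value string, ttl time.Duration) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    kv.data[key] = entry{value: value, expires: time.Now().Add(ttl)}
  }

  func (kv *localKV) Get(key string) (string, bool) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    e, ok := kv.data[key]
    if !ok || time.Now().After(e.expires) {
      return "", false
    }
    return e.value, true
  }

  func main() {
    kv := newLocalKV()
    kv.Set("session:42", "alice", time.Minute)
    if v, ok := kv.Get("session:42"); ok {
      fmt.Println("hit:", v)
    }
  }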


SQLite is actually substantially more capable than many people think it is. I have served 20k MAUs from a reasonably sized single-node server with some headroom to spare. Yes, it requires some thinking about efficiency and not necessarily going with nodejs + some ORM, but you can take SQLite quite far, even in a small to medium enterprise.


SQLite works well with 2k DAUs on a single node, even with Django's not particularly efficient ORM. You just have to be careful about what you really need to write to DB and what's okay to either not save at all, or just throw into a log file for later offline analysis.


I don't see how these guys can think about MAU/DAU to assess DB load and sizing without talking about the rest of the app/arch details. Wouldn't ops/time be more agnostic?


Agreed, number of active users cannot make sense as a generic unit across systems...

I have 2 systems where in the first (route optimization platform), 1 user would, as part of just a normal 10 minute session:

- read ~100MB from the database
- utilize 100% of a 32-core machine's CPU (and 64GB of RAM)
- resulting in thousands of writes to the database
- and side-effect processing (analytics, webhooks, etc.)

Over a course of a day, it would likely be ~10x for that single user.

In the other system - appointment scheduling - 1 user would, in 1 day, read ~1MB of data from the database, and make 2-3 writes from a single action, with maybe 1 email triggered.


Can make sense if it's about loading and showing not-too-complicated data, but no intensive computations. Then you can compare roughly similar applications


Maybe they mean cases somewhat similar to the software this discussion is about.

I.e. load and show stuff from databases (but nothing compute intensive).


SQLite has excellent read performance.

Insert / update performance is quite another story: https://stackoverflow.com/a/48480012 Even in WAL mode there is only one writer at a time, so you should set a busy timeout (or use BEGIN CONCURRENT, which currently lives on a separate SQLite branch) and be ready to handle rollbacks and retries.
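
A rough Go sketch of the usual mitigations (assuming the github.com/mattn/go-sqlite3 driver; any driver that lets you run these PRAGMAs behaves similarly): WAL mode, a busy timeout, and a single write connection so writers queue instead of failing:

  package main

  import (
    "database/sql"
    "log"

    _ "github.com/mattn/go-sqlite3"
  )

  func open(dsn string, maxConns int) *sql.DB {
    db, err := sql.Open("sqlite3", dsn)
    if err != nil {
      log.Fatal(err)
    }
    // WAL lets readers proceed while a write is in flight; the busy timeout
    // makes a blocked writer wait up to 5s instead of erroring right away.
    for _, pragma := range []string{"PRAGMA journal_mode=WAL", "PRAGMA busy_timeout=5000"} {
      if _, err := db.Exec(pragma); err != nil {
        log.Fatal(err)
      }
    }
    db.SetMaxOpenConns(maxConns)
    return db
  }

  func main() {
    readDB := open("app.db", 8)  // plenty of concurrent readers
    writeDB := open("app.db", 1) // exactly one writer at a time
    defer readDB.Close()
    defer writeDB.Close()

    if _, err := writeDB.Exec(`CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)`); err != nil {
      log.Fatal(err)
    }
    if _, err := writeDB.Exec(`INSERT INTO notes (body) VALUES (?)`, "hello"); err != nil {
      log.Fatal(err)
    }

    var n int
    if err := readDB.QueryRow(`SELECT COUNT(*) FROM notes`).Scan(&n); err != nil {
      log.Fatal(err)
    }
    log.Printf("%d notes", n)
  }

The same idea works as a retry loop around BEGIN IMMEDIATE transactions if you'd rather fail fast and retry in application code.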


You can scale quite far using SQLite. That's what Basecamp is doing with their new self-hosted chat app, named ONCE Campfire. It is designed to scale to hundreds or even thousands of concurrent users with the right hardware: https://once.com/campfire.


I wonder why it needs 2GB of RAM even for a low number of users though.


It ships as a Docker container. Docker recommends a minimum of 2GB RAM to run the Linux version of the Docker Engine, before adding constraints imposed by running apps.


Ruby on Rails is not known for being very RAM efficient, but this is only me speculating.


awesome, thank you for the information.


It still astounds me how hard software distribution is. We've had so many generations of solutions and paradigms that attempt to solve this problem (e.g. C, Unix, Java, Docker, etc) but the best we've come up with is client-server applications and the web. Yet it's still not trivial to host a reasonably "powerful" web application.

In theory, the components for a universal runtime are pretty apparent by now. Wouldn't it be wonderful if truly portable and end-user friendly cloud computing was a thing.


I suspect that a Docker container would in this case solve the issue.


The WebAssembly Component Model is attempting to do this though it seems to be progressing rather slowly.


How big the organization has to be not to be able to deploy it on a big enough machine with SQLite?


How big is not the question. The question is how small: think self-hosting within a small non-profit or other such operation, on someone's aging ProLiant in a basement.

A big organization likely would just buy their hosted solution.


Size likely has nothing to do with it, while reliability is a much bigger concern.

If your business uses said tool for all client work, and it's hosted on a single server (be it physical or a VM), that is a SPoF. Any downtime and your business grinds to a halt because none of your project or customer information is accessible to anyone.


Does everything need 99.99% availability?

It feels like you have to be a very big business for this to be a problem worth the extra engineering effort vs. a single instance with an SQLite file that you back up externally with a cron job.


Not availability but predictability.

Most stock trading systems and even many banking systems have availability like 40%, going offline outside business hours. But during those business hours they are 99.999999% available.

Usually operation without trouble is operation with plenty of resources to spare, and in the absence of critical bugs.


> - Cheap and powerful: All open-source, built from well-known parts, requires several containers to run, e.g. databases, queues, web servers / proxies, build tools, etc. You get all the power, can scale an tweak to your heart's content while self-hosting. If you're not afraid to tackle all this, wonderful, but be aware of the breadth of the technical chops you'll need.

What about lowering the number of dependencies your application uses, like only depending on a database? Running a database isn't that hard, and it greatly reduces the overhead compared to running 5 different services.


This works up to a point. Past a certain scale, running a database becomes hard(er), and you also start to need proxies, caches, load balancers, etc. to maintain performance.

I would agree, though, that many software installations never need to scale to that point. Perhaps most.


At what point do you hit that scale with project management software, though? Maybe you can't get to the point where you're managing all projects across all of Walmart from the same instance, but certainly you can run pretty much anything of reasonable size.


sqlite can scale to thousands of concurrent users.

Personally for me the issue with all these new project management platforms is that the target demographic is either:

- they're so small they can't afford to self-host, or
- they're big enough they can afford the pro plans of more established tools

Small companies can get by perfectly with the free/cheap plans of other tools, like Trello+Google Workspace. Heck if you're a one-man team with occasional collaborators free Trello + Google Workspace (~$6/mo) is enough.

A box to provision the Huly stack might come out at more per month...


> But you're severely limited in the number of concurrent users

You can easily handle 100-200 concurrent active users on a decent CPU with SQLite, if you don't do anything crazy. And if you need a project management solution that needs more than that, you probably are not too concerned about the price.


I think the point they're making is that it's not exactly cheap either, given the amount of upfront knowledge (and time investment to gain that knowledge) required, or the cost of vetting and paying people who do have that knowledge.

So "cheap and powerful" just looks like "powerful", at which point you may as well make it easy, too, and go with a managed or hybrid solution.


It's "cheap" as in "not paying $39/mo per user", or whatever other subscription / licensing fees.


Why is open-source "cheap"?

It seems like you look at the cost of things only through the lens of licensing and not the cost of people to run/maintain them.

I have nothing against OSS per se, but in my experience, the financial analysis of OSS vs paid software is much more subtle.


the cost of the people is represented by it not being "easy"


(Also building a product in the productivity space, with an option for users to self-host the backend)

That's interesting, for us there was actually no trade-off in that sense. Having operated another SaaS with a lot of moving parts (separate DB, queueing, etc), we came to the conclusion rather early on that it would save us a lot of time, $ and hassle if we could just run a single binary on our servers instead. That also happens to be the experience (installation/deployment/maintenance) we would want our users to have if they choose to download our backend and self-host.

Just download the binary, and run it. Another benefit is that it's also super helpful for local development, we can run the actual production server on our own laptop as well.

We're simply using a Go backend with SQLite and local disk storage and it pretty much contains everything we need to scale, from websockets to queues. The only #ifdef cloud_or_self_hosted will probably be that we'll use some S3-like store next to a local cache.


I think that's great if you prefer to operate a service like that. Operating for one customer vs many can often have different requirements, and that's where it can make sense to start splitting things out, but if the service can be built that way then that's the best of both worlds!


> We're simply using a Go backend with SQLite and local disk storage and it pretty much contains everything we need to scale

How do you make it highly available like this? Are you bundling an SQLite replication library with it as well?


This is probably where the real trade-off is, and for that it's helpful to look at what the actual failure modes and requirements are. Highly available is a range and a moving target of course, with every "extra 9" getting a lot more complicated. Simply swapping SQLite for another database is not going to guarantee all the nines of uptime for specific industry requirements, just like running everything on a single server might already provide all the uptime you need.

In our experience a simple architecture like this almost never goes down and is good enough for the vast majority of apps, even when serving a lot of users. Certainly for almost all apps in this space. Servers are super reliable, upgrades are trivial, and for very catastrophic failures recovery is extremely easy: just copy the latest snapshot of your app directory over to a new server and done. For scaling, we can simply shard over N servers, but that will realistically never be needed for the self-hosted version with the number of users within one organization.

In addition, our app is offline-first, so all clients can work fully offline and sync back any changes later. Offline-first moves some of the complexity from the ops layer to the app layer of course but it means that in practice any rare interruption will probably go unnoticed.
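
As an illustration (not necessarily this poster's setup), a consistent snapshot of a live SQLite database can be taken with VACUUM INTO (SQLite 3.27+) rather than copying the raw file; the paths and interval below are made up:

  package main

  import (
    "database/sql"
    "fmt"
    "log"
    "os"
    "time"

    _ "github.com/mattn/go-sqlite3"
  )

  func snapshot(db *sql.DB, dir string) error {
    // dest is generated locally (timestamp only), so inlining it is safe here.
    dest := fmt.Sprintf("%s/app-%s.db", dir, time.Now().Format("20060102-150405"))
    // VACUUM INTO writes a consistent, defragmented copy without blocking readers.
    _, err := db.Exec(fmt.Sprintf("VACUUM INTO '%s'", dest))
    return err
  }

  func main() {
    db, err := sql.Open("sqlite3", "app.db")
    if err != nil {
      log.Fatal(err)
    }
    defer db.Close()

    if err := os.MkdirAll("backups", 0o755); err != nil {
      log.Fatal(err)
    }

    // Snapshot hourly; shipping the newest file off the box is left to cron/rsync.
    for range time.Tick(time.Hour) {
      if err := snapshot(db, "backups"); err != nil {
        log.Println("snapshot failed:", err)
      }
    }
  }

Something like Litestream-style continuous replication is another option if tighter recovery points are needed.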


> just copy the latest snapshot of your app directory over to a new server and done.

In practice, wouldn't you need a load balancer in front of your site for that to be somewhat feasible? I can't imagine you're doing manual DNS updates in that scenario because of propagation time.


I don't know for sure but the explanation given sounds very much like they expect it to be a 100% manual process.


> Servers are super reliable

We have clearly worked in very different places.


What about the user interface? I’m all in on Go + Sqlite, but the user facing parts need a UI.


Sure, this is just about the syncing backend. Our "IDE" frontend is written in vanilla JavaScript, which the Go backend can even serve directly by embedding files in the Go binary, using Go's embed package. Everything's just vanilla JS so there are no additional packages or build steps required.
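
For illustration, serving an embedded frontend with Go's embed package looks roughly like this (the frontend/ directory name and the port are placeholders, not the actual project layout):

  package main

  import (
    "embed"
    "io/fs"
    "log"
    "net/http"
  )

  //go:embed frontend
  var frontendFiles embed.FS

  func main() {
    // Strip the "frontend/" prefix so index.html is served at "/".
    static, err := fs.Sub(frontendFiles, "frontend")
    if err != nil {
      log.Fatal(err)
    }
    http.Handle("/", http.FileServer(http.FS(static)))

    // API / websocket handlers would be registered alongside, e.g.
    // http.HandleFunc("/sync", syncHandler)

    log.Fatal(http.ListenAndServe(":8080", nil))
  }

The same approach works for any static build output, e.g. SvelteKit's adapter-static as mentioned downthread.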


Are you using something like Wails or Electron to wrap this into a desktop app? I’ll have to sign up for Thymer and check it out!

I’ve been prototyping my app with a sveltekit user-facing frontend that could also eventually work inside Tauri or Electron, simply because that was a more familiar approach. I’ve enjoyed writing data acquisition and processing pipelines in Go so much more that a realistic way of building more of the application in Go sounds really appealing, as long as the stack doesn’t get too esoteric and hard to hire for.


Not GP, but someone else who's also building with Go+SvelteKit. I'm embedding and hosting SvelteKit with my Go binary, and that seems to be working well so far for me. Took some figuring out to get Go embedding and hosting to play nicely with SvelteKit's adapter-static. But now that that's in place, it seems to be reliable.


Yeah, I think keeping infra simple is the way to go too. You can micro-optimize a lot of things, but this doesn’t really beat the simplicity of just running SQLite, or Postgres, or maybe Postgres + one specialized DB (like ClickHouse if you need OLAP).

S3 is pretty helpful though even on self-hosted instances (e.g. you can offload storage to a cheap R2/B2 that way, or put user uploads behind a CDN, or make it easier to move stuff from one cloud to another etc). Maybe consider an env variable instead of an #ifdef?
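
A sketch of that env-variable approach: pick the blob backend at startup instead of at build time (the S3 branch is stubbed here, and all the names are hypothetical):

  package main

  import (
    "log"
    "os"
    "path/filepath"
  )

  type BlobStore interface {
    Put(name string, data []byte) error
  }

  type diskStore struct{ dir string }

  func (d diskStore) Put(name string, data []byte) error {
    return os.WriteFile(filepath.Join(d.dir, name), data, 0o644)
  }

  func newBlobStore() BlobStore {
    if endpoint := os.Getenv("S3_ENDPOINT"); endpoint != "" {
      log.Println("S3_ENDPOINT set; an S3-compatible store would be used:", endpoint)
      // return newS3Store(endpoint, os.Getenv("S3_BUCKET")) // hypothetical
    }
    dir := os.Getenv("DATA_DIR")
    if dir == "" {
      dir = "data"
    }
    if err := os.MkdirAll(dir, 0o755); err != nil {
      log.Fatal(err)
    }
    return diskStore{dir: dir}
  }

  func main() {
    store := newBlobStore()
    if err := store.Put("upload.bin", []byte("payload")); err != nil {
      log.Fatal(err)
    }
    log.Println("stored upload.bin")
  }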


> Deploying this requires running 5 different open source servers (databases, proxies, etc)

That is the nature of the beast for most feature rich products these days. The alternative is to pay for cloud service and outsource the maintenance work.

I don't mind a complex installation of such a service, as long as it is properly containerized and I can install and run it with a single docker-compose command.


This reminds me of Sentry. This podcast interview had an interesting discussion of self-hosting: https://www.flagsmith.com/podcast/sentry

> In terms of how much thought and design you put into the self-hosted story, this is one of the things that we've been slowly realizing that every time the product gets more complex and the infrastructure required that run it gets more complex. As soon as you include a time series database, then that's adding another ratchet up of complexity if you're self-hosting it, but more outside of docker name. How much did you have that in your mind when you were designing the platform?

> I would say that's changed a lot over time. Early on, Sentry’s goal was to adapt to infrastructure. We use Django. We’ve got support for different databases out of the box, SQLite, Postgres, MySQL. MySQL is going all this other stuff at some point too. We had that model early on. ... Our whole goal, and this is still the goal of the company, we want everybody to be able to use Sentry. That's why the open source thing is also critical for us. Adapting infrastructure is one way to do that. Now, we changed our mind on that because what that turned into is this mess of, we have to support all these weird edge cases, and nobody is informed about how to do things. We're like, “That's all dead. We only support Postgres.” Key value is still a little bit flexible. It's hard to mess up key values. We're like, “You can write an adapter. We only support Redis for this thing. We use a technology called Cortana. We only support ClickHouse for this thing.” We're over this era of, “Can we adapt it because it looks similar?” and it made our lives so much better.


Venture-backed open-source typically wants the software to run everywhere because some percentage of devs will have the power to get the enterprise they work for to shell out $$$ for the hosted service.

Great for short/medium-term but unsustainable long-term.


Super important point. I work for a very large famous company and deployed an open source project with a bit of customization which became one of the most used internal apps at the company. It was mainly front end code. It gained a lot of traction on GitHub, and the developer decided to create 2.0, which ended up having dependencies on things like supabase. I did all I could to try to deploy supabase internally but it was just too much of an impedance mismatch with our systems, so we ended up punting and going with another provider. If they just went with raw Postgres it would have been fine as we already have a Postgres provider internally, but I wasn't willing to commit to being the maintainer for a supabase and its many moving parts as a frontend engineer.


Every decision for an external dependency that a self-hosted service makes is another chance for it to have that impedance mismatch you mentioned.

Postgres is a relatively uncontroversial one, but I had the benefit of working for a company already operating a production postgres cluster where I could easily spin up a new database for a service. I went with SQLite/on disk storage because for most companies, providing a resilient block storage device, with backups, is likely trivial, regardless of which cloud they're on, being on bare metal, etc.


SQLite is fine and dandy as long as you don't do a lot of updates. SQLite locks the entire database for a write transaction. It may be fine for quite a while, or you may face slowdowns with just a few parallel users, depending on your use case.


Seemingly unpopular opinion, but coming from more ops related background, I always appreciate configurable deployments. If I want to test-deploy the thing on my own workstation I hope the defaults will work, but in a corporate environment I may not only want, but actually require flexibility.

Infra, ops and dev intersect at different points in different orgs. Some may provide a custom Kubernetes operator and abstract the service away, some orgs may provide certain "managed" building blocks, e.g. a postgres instance. Some will give you a VM and an iSCSI volume. Some will allocate credits in a cloud platform.

Having the ability to plug preexisting service into the deployment is an advantage in my book.


It's not an unpopular opinion, but as far as I can tell Huly is not a configurable deployment. It's a strict set of dependencies. If you can't run Mongo, Elastic, and a bunch of other bits, then you can't run it. The more pieces they add the more likely any one potential user can't run it.

The options are either to minimise the dependencies (the approach I advocated for in my parent comment), or to maximise the flexibility of those dependencies, like requiring a generic-SQL database rather than Mongo, requiring an S3 compatible object store rather than whatever they picked. This is however far more work.


This is clearly a case where Huly's primary focus is their complex SaaS option. The self-host option has to follow core or it bitrots. I'm fine with this trade-off personally.


  > there's an interesting tension here with applications that target self-hosting
Being already set up in Docker simplifies this quite a bit for smaller installs. But I notice a second tension - the introduction of some new tool on every project.

I'm reasonably proficient with Docker, been using it for over a decade. But until now I've never encountered "rush". And it did not surprise me to find some new tool - actually I probably would have been surprised not to find some new tool. Every project seems to forgo established, known tools for something novel now. I mean, I'm glad it's not "make", but the overhead of learning completely new tools just to understand what I'm introducing into the company adds real friction.


I don't really see where you are getting that

https://github.com/hcengineering/huly-selfhost


That's actually supporting the poster's argument.

Take a look at all the configs and moving parts checked in this very repo that are needed to run a self-hosted instance. Yes, it is somewhat nicely abstracted away, but that doesn't change the fact that in the kube directory alone [1] there are 10 subfolders with even more config files.

1: https://github.com/hcengineering/huly-selfhost/tree/main/kub...


> Yes, it is somewhat nicely abstracted away, but that doesn't change the fact that in the kube directory alone [1] there are 10 subfolders with even more config files.

That's just what you get with Kubernetes, most of the time. Although powerful and widely utilized, it can be quite... verbose. For a simpler interpretation, you can look at https://github.com/hcengineering/huly-selfhost/blob/main/tem...

There, you have:

  mongodb       supporting service
  minio         supporting service
  elastic       supporting service
  account       their service
  workspace     their service
  front         their service
  collaborator  their service
  transactor    their service
  rekoni        their service
I still would opt for something simpler than that and developing all of the above services would keep multiple teams busy, but the Compose format is actually nice when you want to easily understand what you're looking at.


As someone that develops native Kubernetes platforms: Providing the raw resources / manifests is almost the worst way of providing a user install. That works great as long as you never have a breaking change in your manifests or any kind of more complex upgrade.

Which brings me back to the initial question: Is this complexity and the external dependencies really needed? For a decently decomposed, highly scalable microservice architecture, maybe. For an Open Source (likely) single tenant management platform? Unlikely.

It highlights the problem of clashing requirements of different target user groups.


We can also take a look at the linux kernel that powers the docker instances and faint in terror.

These “moving parts” are implementation details which (iiuc) require no maintenance apart from backing up via some obvious solutions. Didn’t they make docker to stop worrying about exactly this?

And you don’t need multiple roles, specialists or competences for that, it’s a one-time task for a single sysop who can google and read man. These management-spoiled ideas will hire one guy for every explicitly named thing. Tell them you’re using echo and printf and they rush to search for an output-ops team.


These moving parts require active understanding and maintenance, as they will change on each and every upgrade, which also requires manual upgrade steps and potential debugging on breaking changes. OCI images let you worry less about dependencies, but what they don't eliminate is debugging and/or upgrading k8s configuration manifests (which we are looking at here).

> We can also take a look at the linux kernel that powers the docker instances and faint in terror.

Sure, and computers are rocks powered by lightning - very, very frightening. That doesn't invalidate criticism about the usability and design of this very product, my friend.


> These moving parts require active understanding and maintenance, as they will change on each and every upgrade, which also requires manual upgrade steps and potential debugging on breaking changes

Maybe they won’t change or migrations will be backwards-compatible. We don’t know that in general. Pretty sure all the software installed on my PC uses numerous databases. But somehow I never upgraded them manually. I find the root position overdefensive at best.

If it were a specific criticism, fine. But it rests on a lot of assumptions as far as I can tell, because it references no mds, configs, migrations, etc. It only projects a general idea about issues someone had at their org in some situation. This whole “moving parts” idiom is management speak. You either see a specific problem with a specific setup, or have to look inside to see it. Everything else is fortune telling.


I am a firm believer that any self-hosted app should require, at most, a docker-compose.yml and possibly a .env file to get a basic service up and running. Requiring me to clone your repo or run some script from your site doesn't inspire a lot of confidence that it will be easy to maintain and keep up to date, whereas if you provide a prebuilt Docker image, I know that I can always just pull the latest image and be up-to-date.


That doesn't preempt the parent's concerns. If your docker-compose has a mongo container now I have to either self-host mongo or use atlas.


Thanks so much for the analysis. I wish that people would simplify shit more.

I think that self-hosted has two meanings, and not every person that self-hosts wants to use Docker.


If docker is spinning all the services up it’s not as big of a deal.

Unfortunately, JavaScript-based apps can require quite a convoluted setup just to output HTML and JavaScript.


>If docker is spinning all the services up it’s not as big of a deal.

Until something goes wrong, or the business side of the house asks for some kind of customization.


Same for manually hosting it and an update breaking it.

Docker can be a means of packaging an application.

Appwrite is a good example of packaging a complex app nearly flawlessly with docker and making updates a little more seamless.

I continue to have my reservations about docker having used it for a long time but some applications are helpful.

It’s unrealistic to eliminate it on the basis of it not being perfect for any and all scenarios.

It makes software available to more people to be able to run locally, and I’m not sure that’s a bad thing.


The most complex system that I've seen that you could self host is the Sentry APM solution, have a look at how many services there are: https://github.com/getsentry/self-hosted/blob/master/docker-...

That's the main reason why I've gone with Apache SkyWalking instead, even if it's a bit jank and has fewer features.

It's kind of unfortunate, either you just have an RDBMS and use it for everything (key-value/document storage, caches, queues, search etc.), or you fan out and have Valkey, RabbitMQ and so on, increasing the complexity.

That's also why at the moment I use either OpenProject (which is a bit on the slow side but has an okay feature set) or Kanboard (which is really fast, but slim on features) for my self-hosted project management stuff. Neither solution is perfect, but at least I can run them.


We had to roll back to an earlier version of Sentry for this exact reason. It went from using a few GB to 18GB+ of RAM, and several times the number of containers. The older version had every feature we wanted, so there was no need to move forward.


As a counterpoint though, the issue for me is "environment pollution" - not the number of products.

My metric for this is something like Portainer, where the installation instructions tend to be "install this Helm chart on the K8s cluster you already have" or "run this script which will install 10 services via a bunch of bash (because it's basically some developer's dev environment)".

Whereas what I've always done for my own stuff is use whatever suits, and then deploy via all-in-one fat container images using runit to host everything they need. That way the experience (at most scales I operate at) is just "run exactly 1 container". Which works for self-hosting.

Then you just leave the hooks in to use the internal container services outside the service for anyone who wants to "go pro" about it, but don't make a bunch of opinions on how that should work.


> ..."run exactly 1 container". Which works for self-hosting.

On Linux.

The container-everything trend also has the effect of removing non-Linux POSIX-ish options from the table.


It seems like there are different categories of "self-hosting" that are something like "corporate self-hosting" and "personal self-hosting" plus some in-between, such as for the lone wolf admin contractor, etc.


> I've been using SQLite and

Postgres’ inability to be wrapped in a Java jar (i.e. SQLite can be run as a Java jar, but PG needs to be installed separately, because it’s in C) gives it a major, major drawback for shipping. If you ship installable software to customers, you’ll either have to make it a Docker image or publish many guidelines, not to mention that customers will claim they can only install Oracle and then you’ll have to support 4 DBs.

How have you found SQLite’s performance for large-scale deployments, like if you imagine using it for the backend of Jira or any webapp?


SQLite is written in C. And the most commonly used SQLite driver interfaces to that compiled C code directly via JNI. The PostgreSQL client driver is written in pure Java on the other hand. Has nothing to do with language used. You can wrap anything in a jar (it’s just a zip file).

You could certainly embed PostgreSQL in a jar and you can do something similar to this

https://github.com/electric-sql/pglite

I don’t think there’s that much interest, but it is doable.

EDIT: https://github.com/zonkyio/embedded-postgres


If you need a database that you can embed in Java, have a look at:

H2: https://www.h2database.com/html/main.html

HSQLDB: https://hsqldb.org/

Apache Derby: https://db.apache.org/derby/

Though those would certainly be a bit niche approaches, so you'll find that there are fewer examples of using either online.


Can't such issues be solved by having something like a Helm chart with meaningful scaling configs provided? Nextcloud does something like this and it makes scaling the cluster a breeze.


Well, it gives you more projects which you can then use it to manage!


I did the same for my self-hosted product, chose MySQL + PHP, one of the simplest stacks possible that's supported by almost all hosting providers out-of-the-box.


I wrote about exactly this trade-off for Bugsink (self-hosted Error Tracking) here: https://www.bugsink.com/blog/installation-simplification-jou...


I'd like to see more of something like this https://github.com/electric-sql/pglite in combination with a redis/valkey compatible API and then you could still have it run out of the box, but allow folks to point to external servers too for both of those services if they need to scale.

Although you do lack the simple on-disk db format that way.


When I see things like this where you need a package manager to install a different package manager to install the tool, and the only setup information assumes a bunch of stuff in Docker on a single machine, I just default to assuming it's deliberately over-engineered and under-documented specifically to dissuade self hosting.


I have seen open source products that pull 5 redis containers each per type of data/features alone making it all very complicated to operate. That's where the button for "Cloud Pricing" comes into play.


> it's deliberately over-engineered and under-documented specifically to dissuade self hosting.

This right here. Shitty architecture choices as a deliberate moat.


Personal observation: companies use this strategy to redirect users into using their cloud platform, and open source is a go-to-market strategy.

I think this pattern is not that bad if they have a script that guarantees setting up a k8s cluster, or something of that sort.


It always seems like Rails is ahead in understanding what is actually important for most people (before most people get around to understanding it, if they ever do). Looking at Rails 8 in regard to this issue reinforces that once again.


My philosophy is that's fine, as long as I get a terraform module that sets up the infra for me. Knowing how to glue the pieces together is often the vast majority of the effort.


That kind of solves it, doesn't it? Create interfaces for the components (DBs, queues, etc.) and let the users decide.


Are you against using containers?


Not at all, but they don't solve this problem. They help, but just because a database is in a container doesn't mean you don't need to know how it operates. Containerised databases don't necessarily come with maintenance processes automated, or may automate them in a way that isn't suitable for your needs.

Postgres vacuuming comes to mind as an example. Pre-built docker images of it often just ignore the problem and leave it up to the defaults, but the defaults rarely work in production and need some thought. Similarly, you can tune Postgres for the type of storage it's running on, but a pre-built container is unlikely to come with that tuning process automated, or with the right tuning parameters for your underlying storage. Maybe you build these things yourself, but now you need to understand Postgres, and that's the key problem.

Containers do mostly solve running the 5 built-in services, at least at small scale, but may still need tuning to make sure each pod has the right balance of resources, has the right number of replicas, etc.


Yeah, for software I'd use personally, if I don't see a docker-compose.yml I walk away from it. It's painful enough to write those Docker files for the client projects I work on. Huly has docker-compose files hidden away and meant for dev? But a quick look into it shows a lot of environment variables, which is a great thing if you want to use your existing database once the software's use outgrows whatever specific limitations your docker host has. https://github.com/hcengineering/platform/blob/develop/dev/d... Their huly-selfhost project lets you run a setup to create that docker-compose file and it looks decent.


Sorry, I’m not, but I thought customers would prefer a Kubernetes deployment to docker-compose? Isn’t docker-compose for the programmer’s machine, and isn’t K8s the big microservice orchestrator at large companies, requiring sysadmins to rewrite the file? Can K8s use docker-compose.yml files?


Kubernetes cannot ingest compose files as-is, no.

From a users point of view: If I'm interested in a project, I usually try to run it locally for a test drive. If you make me jump through many complex hoops just to get the equivalent of a "Hello World" running, that sucks big time.

From a customers point of view: Ideally you want both, local and cluster deployment options. Personally I prefer a compose file and a Helm chart.

In this specific case I'd argue that if you're interested in running an OSS project management product, you're likely a small/medium business that doesn't want to shell out for Atlassian - so it's also likely you don't have k8s cluster infrastructure, or people that would know how to operate one.



