I’m not accustomed to seeing a Docker image configured with Ansible. Can someone who is familiar with this pattern of using Ansible to configure a Docker image and commit the result [0] explain why this approach might be useful? Preference? Power of Ansible? Uniformity with non-Docker approach?
Hi! I'm one of the contributors to the repo. Primary reason was for uniformity. Initially, we built our EC2 and DO Droplet images with Ansible and our Docker image with a Dockerfile [0]. It became tedious over time as adding a new extension meant double the work since we had to write separate instructions for Ansible and the Dockerfile. Also, the Dockerfile itself started to become long-winded and cluttered with the amount of extensions that we were pumping in.
As there are multiple approaches to address that issue (I use Dockerfile templates with loops, for example, but I am not a fan of them), I was wondering if you considered different approaches too. In my experience, users like to have a regular Dockerfile, since they are familiar with it. Would it be possible, or make sense in your opinion, to run Ansible inside the Dockerfile and keep a more standard approach to Docker image building?
When the transition from using a Dockerfile to Ansible was done, the primary consideration was to be able to reuse as many existing instruction/task files as possible. As such, iirc, this was the first approach taken. It fortunately became the last one as well since it worked.
For context, producing the AWS EC2/DO droplet images is one of the more important objectives of the repository. Hence, Ansible, which is already used to build those images, naturally became the first approach in order to consolidate everything.
As for running Ansible inside the Dockerfile, I have yet to try that out myself but it does sound possible since one can run Ansible on itself locally. Would have definitely considered it as the next approach to attempt if the first one failed.
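For anyone curious, a minimal sketch of that idea, running Ansible against the image being built over the local connection, might look like this (the base image, paths, and playbook name are assumptions, not from the repo):

```dockerfile
# Hypothetical sketch: reuse an existing playbook inside a standard
# `docker build`. Base image, paths, and playbook name are assumptions.
FROM debian:bullseye

# Install Ansible inside the build container
RUN apt-get update && apt-get install -y --no-install-recommends ansible \
    && rm -rf /var/lib/apt/lists/*

# Copy the same playbook used for the EC2/DO images
COPY playbook.yml /tmp/playbook.yml

# Run the playbook against the image itself over the local connection
RUN ansible-playbook -i localhost, -c local /tmp/playbook.yml \
    && rm /tmp/playbook.yml
```

Built with a plain `docker build -t myimage .`, so the familiar Dockerfile workflow and layer caching are preserved while the provisioning logic stays in one place.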
This makes inspecting the image's build process so much harder. Not only is this non-uniform, Ansible really isn't the tool of choice if you want to "repeatedly" perform reproducible steps. So I wonder: why not just a Dockerfile instead of writing a playbook and then using `docker commit` (a command I've barely used or seen used in my 2+ years of using Docker daily)?
> Ansible really isn't the tool of choice if you want to "repeatedly" perform reproducible steps.
Ansible is a tool for provisioning a system in a particular way. What's the difference between running ansible against an ephemeral VM vs running against a container?
So, I wonder why not just a Dockerfile instead of writing a playbook
One of the maintainers commented in a sibling comment to yours that they use the same Ansible playbooks to build VM images, so maintaining a separate Dockerfile was double the work.
> and then using `docker commit` (a command I've barely used/seen being used in my 2+ years of using docker daily).
Well, if you use Dockerfiles then of course you won't see a command used mostly by alternative build processes, but that's hardly a meaningful objection.
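For readers unfamiliar with the flow being discussed, the playbook-plus-commit approach looks roughly like this; the container and image names are made up for illustration:

```shell
# Sketch of the build-by-commit flow; names are hypothetical.
# 1. Start a base container that stays alive
docker run -d --name pg-build debian:bullseye sleep infinity

# 2. Provision it with the same playbook used for the VM images,
#    using Ansible's docker connection plugin
ansible-playbook -i pg-build, -c docker playbook.yml

# 3. Snapshot the provisioned container into an image
docker commit pg-build myorg/postgres:latest
docker rm -f pg-build
```

The key point is that the playbook is the single source of truth; `docker commit` is just the snapshot step at the end.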
To be fair, I can see why anyone would want to avoid Dockerfiles. Nowadays I use Packer with file and shell provisioners to build Docker images and avoid all that nastiness, especially the dreaded `&& \`. It's also quicker and I can do things like automate tagging and the like, but mainly avoiding `RUN` is enough for me.
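For reference, a minimal Packer template along those lines might look like this (image names and the script path are assumptions):

```hcl
# Hypothetical Packer template: build a Docker image with a shell
# provisioner instead of RUN instructions in a Dockerfile.
source "docker" "postgres" {
  image  = "debian:bullseye"
  commit = true
}

build {
  sources = ["source.docker.postgres"]

  provisioner "shell" {
    script = "./install-postgres.sh" # a plain script, no `&& \` chains
  }

  post-processor "docker-tag" {
    repository = "myorg/postgres"
    tags       = ["13", "latest"]
  }
}
```

The install script stays an ordinary shell script you can run and test outside the build, which is most of the appeal.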
Ansible is a configuration management tool and Docker is a container runtime (am I getting the terminology right?). They can complement each other and provide their individual strengths to many types of deployments. There is some overlap in how both tools and their larger ecosystems impact a system, but that's not necessarily a bad thing.
I personally prefer a different setup, but any team that already has a strong skill set in both tools may benefit from building on top of preexisting domain knowledge.
Isn't it a bit problematic to run multiple processes in the same container? Docker can determine when a container is healthy or not, whether it should be restarted and such; having a main process, like systemd, hiding what's happening behind it doesn't seem great.
Wouldn't it be better to run the same image with different commands, or have one image for each service? Logs are usually written to stdout/stderr in containers, so they can be gathered, aggregated, stored etc. by Docker or Kubernetes and handled with the tools you like most, without having to share log volumes across services and maintain multiple ways of handling and rotating logs.
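The one-process-per-container setup described above can be sketched with Compose; service names and the healthcheck command are illustrative, not from the repo:

```yaml
# Hypothetical compose file: each service in its own container,
# each with its own healthcheck and its own stdout/stderr logs.
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    image: postgrest/postgrest
    depends_on:
      db:
        condition: service_healthy
```

Each container gets its own restart policy and health status, so the orchestrator can act on one service without guessing about the others.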
Hi! I'm one of the contributors to the repo. Just to clarify, our Docker image [0] only contains the latest version of Postgres (13) and the common extensions listed out here [1]. All the other features such as this [2] and this [3] are only available in the AWS EC2 or DO droplet images. We're currently updating the README to make that clearer :-)
Hi! I have just seen the commit on README.md, thank you for clarifying that :)
The image does run postgres directly, but in the entrypoint, and then keeps the container alive using `tail -f /dev/null`. That, together with Ansible, makes it a pretty peculiar image :)
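For contrast, the conventional entrypoint pattern hands PID 1 to the server with `exec` rather than backgrounding it and parking on `tail`; a hypothetical sketch:

```shell
#!/bin/sh
# Hypothetical entrypoint: exec replaces the shell with postgres,
# so signals (e.g. SIGTERM from `docker stop`) reach the server
# directly and the container exits when postgres does.
set -e
exec postgres -D /var/lib/postgresql/data "$@"
```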
When you run a "fat" container in an orchestrator that's less of a problem, because orchestrators like Kubernetes and Nomad can have as many health checks as you want to determine the health of all the things running inside.
But yes, generally it's not considered a good practice.
I think it is a problem. For example, if the healthcheck is on the main Postgres process but the side processes crash, you end up with a container running in a degraded state. So yes, it is very problematic. In practice, I have seen this kind of setup create a lot of weird and unexpected behaviors.
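To illustrate the failure mode: a healthcheck like the following only probes Postgres, so a crashed sidecar in the same container goes unnoticed (illustrative fragment, not from the repo):

```dockerfile
# Only checks the database; any other process in the container can
# die without affecting the reported health status.
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD pg_isready -U postgres || exit 1
```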
In 2000, this dev pair I knew wrote an extension for MS-SQL to embed scripting (VBScript), and also a module for IIS that had some magic too (I can't remember all the details).
But the database was doing the rendering, and even caching, and had these wild stored procedures. It was performant compared to the typical ASP that was popular at the time. We used all your RAM :)
We have used Zalando's Spilo image and its Patroni operator for many years.
Not only does it have the extensions, but everything is configured as a "turnkey solution" for both k8s and plain Docker if you like. We compared many of the options listed in this thread and this one came out on top in terms of architecture, operability, popularity, etc. Works on both k8s and OCP.
We also use pure Patroni (non-Docker) with VMs, which is also great.
Lately, one may also want to look at YugabyteDB and CockroachDB: they have wire compatibility with the pg protocol. Depending on how you use pg, one of them might be a much better choice for you. They're cloud native, have enterprise support, etc.
Hi! I'm one of the contributors to the repo. Just to clarify, our Docker image [0] only contains the latest version of Postgres (13) and the common extensions listed out here [1]. All the other features such as this [2] and this [3] are only available in the AWS EC2 or DO droplet images. We've since updated our README to make that clearer :-)
You can still front the DB with a PgBouncer image spun up in another container, however. Unfortunately, I can't really recommend one since there doesn't seem to be an official Docker image for PgBouncer and I myself have never tried any of the existing ones. If you're looking to use PostgREST, however, they do have an official Docker image that you can use over here [4].
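As a sketch, a PgBouncer container in front of the database mostly needs a pgbouncer.ini along these lines; the host, database, and auth details are assumptions:

```ini
; Hypothetical minimal pgbouncer.ini for a sidecar container.
[databases]
; forward connections to the postgres container by hostname
mydb = host=postgres port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
```

Clients then connect to port 6432 on the PgBouncer container instead of Postgres directly.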
I'm curious if you've looked at https://github.com/zalando/postgres-operator and had thoughts on how it compared. The CrunchyData operator seems to use Patroni and links to the Patroni documentation which links to the Zalando Postgres Operator from the same company as Patroni.
[0]: https://github.com/supabase/postgres/blob/de88f3dc1c80fbf7d1...