I’m not accustomed to seeing a Docker image configured with Ansible. Can someone who is familiar with this pattern of using Ansible to configure a Docker image and commit the result [0] explain why this approach might be useful? Preference? Power of Ansible? Uniformity with non-Docker approach?
Hi! I'm one of the contributors to the repo. Primary reason was for uniformity. Initially, we built our EC2 and DO Droplet images with Ansible and our Docker image with a Dockerfile [0]. It became tedious over time as adding a new extension meant double the work since we had to write separate instructions for Ansible and the Dockerfile. Also, the Dockerfile itself started to become long-winded and cluttered with the amount of extensions that we were pumping in.
As there are multiple approaches to address that issue (I use Dockerfile templates with loops, for example, but I am not a fan of them), I was wondering if you considered different approaches too. In my experience, users like to have a regular Dockerfile, since they are familiar with it. Would it be possible, or make sense in your opinion, to run Ansible inside the Dockerfile and keep a more standard approach to Docker image building?
When the transition from using a Dockerfile to Ansible was done, the primary consideration was to be able to reuse as many existing instruction/task files as possible. As such, iirc, this was the first approach taken. It fortunately became the last one as well since it worked.
For context, producing the AWS EC2/DO droplet images is one of the more important objectives of the repository. Hence, Ansible, which is already used to build those images, naturally became the first approach in order to consolidate everything.
As for running Ansible inside the Dockerfile, I have yet to try that out myself but it does sound possible since one can run Ansible on itself locally. Would have definitely considered it as the next approach to attempt if the first one failed.
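For anyone curious, a minimal sketch of that idea, running Ansible against the image being built over the local connection, might look like this (the base image, paths, and playbook name are assumptions, not from the repo):

```dockerfile
# Hypothetical sketch: reuse an existing playbook inside a standard
# `docker build`. Base image, paths, and playbook name are assumptions.
FROM debian:bullseye

# Install Ansible inside the build container
RUN apt-get update && apt-get install -y --no-install-recommends ansible \
    && rm -rf /var/lib/apt/lists/*

# Copy the same playbook used for the EC2/DO images
COPY playbook.yml /tmp/playbook.yml

# Run the playbook against the image itself over the local connection
RUN ansible-playbook -i localhost, -c local /tmp/playbook.yml \
    && rm /tmp/playbook.yml
```

Built with a plain `docker build -t myimage .`, so the familiar Dockerfile workflow and layer caching are preserved while the provisioning logic stays in one place.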
This makes inspecting the image's build process so much harder. Not only is this non-uniform, Ansible really isn't the tool of choice if you want to "repeatedly" perform reproducible steps. So I wonder: why not just a Dockerfile instead of writing a playbook and then using `docker commit` (a command I've barely used or seen used in my 2+ years of using Docker daily)?
> Ansible really isn't the tool of choice if you want to "repeatedly" perform reproducible steps.
Ansible is a tool for provisioning a system in a particular way. What's the difference between running ansible against an ephemeral VM vs running against a container?
So, I wonder why not just a Dockerfile instead of writing a playbook
One of the maintainers commented in a sibling comment to yours that they use the same Ansible playbooks to build VM images, so maintaining a separate Dockerfile was double the work.
> and then using `docker commit` (a command I've barely used/seen being used in my 2+ years of using docker daily).
Well, if you use Dockerfiles then of course you won't see a command used mostly by alternative build processes, but that's hardly a meaningful objection.
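For readers unfamiliar with the flow being discussed, the playbook-plus-commit approach looks roughly like this; the container and image names are made up for illustration:

```shell
# Sketch of the build-by-commit flow; names are hypothetical.
# 1. Start a base container that stays alive
docker run -d --name pg-build debian:bullseye sleep infinity

# 2. Provision it with the same playbook used for the VM images,
#    using Ansible's docker connection plugin
ansible-playbook -i pg-build, -c docker playbook.yml

# 3. Snapshot the provisioned container into an image
docker commit pg-build myorg/postgres:latest
docker rm -f pg-build
```

The key point is that the playbook is the single source of truth; `docker commit` is just the snapshot step at the end.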
To be fair, I can see why anyone would want to avoid Dockerfiles. Nowadays I use Packer with file and shell provisioners to build Docker images and avoid all that nastiness, especially the dreaded `&& \`. It's also quicker and I can do things like automate tagging and the like, but mainly avoiding `RUN` is enough for me.
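For reference, a minimal Packer template along those lines might look like this (image names and the script path are assumptions):

```hcl
# Hypothetical Packer template: build a Docker image with a shell
# provisioner instead of RUN instructions in a Dockerfile.
source "docker" "postgres" {
  image  = "debian:bullseye"
  commit = true
}

build {
  sources = ["source.docker.postgres"]

  provisioner "shell" {
    script = "./install-postgres.sh" # a plain script, no `&& \` chains
  }

  post-processor "docker-tag" {
    repository = "myorg/postgres"
    tags       = ["13", "latest"]
  }
}
```

The install script stays an ordinary shell script you can run and test outside the build, which is most of the appeal.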
Ansible is a configuration management tool and Docker is a container runtime (am I getting the terminology right?). They can complement each other and provide their individual strengths to many types of deployments. There is some overlap in how both tools and their larger ecosystems impact a system, but that's not necessarily a bad thing.
I personally prefer a different setup, but any team that already has a strong skill set in both tools may benefit from building on top of preexisting domain knowledge.
Isn't it a bit problematic to run multiple processes in the same container? Docker can determine when a container is healthy or not, whether it should be restarted and such; having a main process, like systemd, hiding what's happening behind it doesn't seem great.
Wouldn't it be better to run the same image with different commands, or have one image for each service? Logs are usually written to stdout/stderr in containers, so they can be gathered, aggregated, stored etc. by Docker or Kubernetes and handled with the tools you like most, without having to share log volumes across services and maintain multiple ways of handling and rotating logs.
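The one-process-per-container setup described above can be sketched with Compose; service names and the healthcheck command are illustrative, not from the repo:

```yaml
# Hypothetical compose file: each service in its own container,
# each with its own healthcheck and its own stdout/stderr logs.
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    image: postgrest/postgrest
    depends_on:
      db:
        condition: service_healthy
```

Each container gets its own restart policy and health status, so the orchestrator can act on one service without guessing about the others.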
Hi! I'm one of the contributors to the repo. Just to clarify, our Docker image [0] only contains the latest version of Postgres (13) and the common extensions listed out here [1]. All the other features such as this [2] and this [3] are only available in the AWS EC2 or DO droplet images. We're currently updating the README to make that clearer :-)
Hi! I have just seen the commit on README.md, thank you for clarifying that :)
The image does run postgres directly, but in the entrypoint, and then keeps the container alive using `tail -f /dev/null`. That, together with Ansible, makes it a pretty peculiar image :)
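For contrast, the conventional entrypoint pattern hands PID 1 to the server with `exec` rather than backgrounding it and parking on `tail`; a hypothetical sketch:

```shell
#!/bin/sh
# Hypothetical entrypoint: exec replaces the shell with postgres,
# so signals (e.g. SIGTERM from `docker stop`) reach the server
# directly and the container exits when postgres does.
set -e
exec postgres -D /var/lib/postgresql/data "$@"
```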
When you run a "fat" container in an orchestrator that's less of a problem, because orchestrators like Kubernetes and Nomad can have as many health checks as you want to determine the health of all the things running inside.
But yes, generally it's not considered a good practice.
I think it is a problem. For example, if the healthcheck is on the main Postgres process but the side processes crash, you end up with a container running in a degraded state. So yes, it is very problematic. In practice, I have seen this kind of setup create a lot of weird and unexpected behaviors.
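To illustrate the failure mode: a healthcheck like the following only probes Postgres, so a crashed sidecar in the same container goes unnoticed (illustrative fragment, not from the repo):

```dockerfile
# Only checks the database; any other process in the container can
# die without affecting the reported health status.
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD pg_isready -U postgres || exit 1
```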
In 2000, this dev pair I knew wrote an extension for MS-SQL to embed scripting (VBScript), and also a module for IIS that had some magic too (I can't remember all the details).
But the database was doing the rendering, and even caching, and had these wild stored procedures. It was performant compared to the typical ASP that was popular at the time. We used all your RAM :)
We have used Zalando's Spilo image and its Patroni operator for many years.
Not only does it have the extensions, but everything is configured as a "turnkey solution" for both k8s and plain Docker if you like. We compared many of the options listed in this thread and this one came out on top in terms of architecture, operability, popularity, etc. Works on both k8s and OCP.
We also use pure Patroni (non-Docker) with VMs, which is also great.
Lately, one may also want to look at YugabyteDB and CockroachDB: they have wire compatibility with the pg protocol. Depending on how you use pg, one of them might be a much better choice for you. They're cloud native, have enterprise support, etc.
Hi! I'm one of the contributors to the repo. Just to clarify, our Docker image [0] only contains the latest version of Postgres (13) and the common extensions listed out here [1]. All the other features such as this [2] and this [3] are only available in the AWS EC2 or DO droplet images. We've since updated our README to make that clearer :-)
You can still front the DB with a PgBouncer image spun up in another container, however. Unfortunately, I can't really recommend one since there doesn't seem to be an official Docker image for PgBouncer and I myself have never tried any of the existing ones. If you're looking to use PostgREST, however, they do have an official Docker image that you can use over here [4].
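As a sketch, a PgBouncer container in front of the database mostly needs a pgbouncer.ini along these lines; the host, database, and auth details are assumptions:

```ini
; Hypothetical minimal pgbouncer.ini for a sidecar container.
[databases]
; forward connections to the postgres container by hostname
mydb = host=postgres port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
```

Clients then connect to port 6432 on the PgBouncer container instead of Postgres directly.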
I'm curious if you've looked at https://github.com/zalando/postgres-operator and had thoughts on how it compared. The CrunchyData operator seems to use Patroni and links to the Patroni documentation which links to the Zalando Postgres Operator from the same company as Patroni.
[0]: https://github.com/supabase/postgres/blob/de88f3dc1c80fbf7d1...