I think singularity is about keeping program dependencies together, not isolatio...

sarusso · 2024-05-26T11:14:33 1716722073

I agree to a certain extent. However, it's hard to ensure that dependencies work the right way without isolation. These two support tickets showcase the essence of the problem: "Same container, different results" [1] and "python3 script fails in singularity container on one machine, but works in same container on another" [2]. In my experience with Singularity, there were many issues like these.
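To make this concrete, here is a minimal sketch of how one could check for host leakage from inside a container. It assumes the usual default behaviour, where Singularity bind-mounts the host home directory and passes the host environment through, so a Python interpreter in the container can end up importing packages from the host's ~/.local. The function name and the list of variables are my own choices for illustration, not something Singularity provides.

```python
import os
import sys

# Environment variables that change how Python resolves modules.
PYTHON_ENV_VARS = ("PYTHONPATH", "PYTHONHOME", "PYTHONUSERBASE")


def find_host_leaks(paths, env, home):
    """Return import paths located under `home` and Python-related variables set in `env`."""
    home = os.path.abspath(home)
    leaked_paths = [
        p for p in paths
        if p and os.path.abspath(p).startswith(home + os.sep)
    ]
    leaked_vars = {k: env[k] for k in PYTHON_ENV_VARS if k in env}
    return leaked_paths, leaked_vars


if __name__ == "__main__":
    # Run this inside the container on both machines and compare the output.
    paths, variables = find_host_leaks(sys.path, os.environ, os.path.expanduser("~"))
    print("import paths under $HOME:", paths)
    print("Python variables inherited:", variables)
```

If the two machines print different things, the container is not really "the same" at runtime. Running with options such as --cleanenv and --no-home should reduce this kind of leakage, but then you are opting in to isolation rather than getting it by default.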

I am not sure why they had to call it a "containerization" solution. It gets a bit philosophical, but IMO containers are meant to "contain", not to just package. To me, Singularity is more a "virtual environment on steroids", and it works great in that sense. But it doesn't "contain".

The hard truth is that Singularity was designed more to address a cultural problem in the HPC space (adoption friction and pushback against new, "foreign" technologies) than to engineer a proper solution to the dependency hell problem.

HPC clusters still use Linux users and shell access, meaning that it is up to the user to run the container: there is just no container orchestration. This means that the user has to issue a command like "singularity run" or "docker run". And until not long ago, letting users do a "docker run" meant making them part of the docker group, which is a near-root access group. Just not doable.

Singularity also works more or less out of the box with MPI in order to run parallel workloads, either locally or on multiple nodes. However, this comes at a huge price, as it relies on doing an "mpi singularity run" and requires having the same MPI version inside and outside the container. To me, this is more a hacky shortcut than a reliable solution.
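As a rough illustration of how fragile this coupling is, below is a small sketch that compares the MPI version reported on the host with the one reported inside the container. The rule it applies (same major.minor release) is my assumption for the example; the actual compatibility requirements depend on the MPI implementation. Both function names are hypothetical.

```python
import re


def parse_mpi_version(version_output):
    """Extract the first dotted version number from `mpirun --version` style output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found")
    # A missing patch component is treated as 0.
    return tuple(int(g) for g in match.groups(default="0"))


def same_mpi_release(host_output, container_output):
    """True when host and container report the same major.minor release (assumed rule)."""
    return parse_mpi_version(host_output)[:2] == parse_mpi_version(container_output)[:2]
```

The two strings would come from running "mpirun --version" on the host and "singularity exec image.sif mpirun --version" in the container, where image.sif is a placeholder name. The fact that such a check is needed at all is exactly the point: the container depends on what is installed outside of it.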

I believe that the definitive solution in the HPC world will be to let HPC queuing systems run and orchestrate containers on behalf of the users (including running MPI workloads), thus allowing the use of any container engine or runtime, including Docker. I did some trials and it works well, almost completely solving the dependency hell problem and greatly improving scientific reproducibility. A solution like the one presented in the OP contributes to the discussion towards this goal, and I personally welcome it.

With respect to Singularity, I think they should just have named the project "singularity environments" rather than "singularity containers" and everything would have been much clearer.

[1] https://github.com/apptainer/singularity/issues/476 [2] https://github.com/apptainer/singularity/issues/3484