Jesus the amount of remote-exploitable bugs in Gitlab the last months is astonis...

colechristensen · on Feb 26, 2022

>Gitlab is the only open-source alternative that can keep up with Github feature-wise.

I think it's a bit the opposite, GitHub seems to have started adding most of these features as a response to competition with GitLab. Not that GitHub hasn't pulled ahead in some ways now.

saxonww · on Feb 26, 2022

We've talked about this a lot internally. They make a big deal about releasing on the 22nd of every month, but they usually have to turn around and release a patch shortly thereafter. 14.8 is a bit of an extreme: 14.8.0 on the 22nd, 14.8.1 (bug fix) on the 23rd, then 14.8.2 (security) on the 25th.

We'd personally prefer to see a better release when it's done, vs. them keeping this farce of a 'release streak' going and then asking customers to install another upgrade within days.

0xbadcafebee · on Feb 26, 2022

Seems like you could just wait a few weeks before upgrading to a new release?

kokey · on Feb 26, 2022

When I used to maintain a GitLab server I used to do exactly this by upgrading almost a month behind their schedule.

passerby1 · on Feb 27, 2022

The problem with this approach is being vulnerable to major zero-days (like the hack of 30k servers in Nov 2021) every month.

lbotos · on Feb 26, 2022

I've proposed to other customers who want fewer releases that they track one minor version behind to get the experience you are looking for. You are still within security fix range and up to date, and you will jump to the latest patch release (and none should follow unless it's a security release).

blyry · on Feb 27, 2022

I always ran 1 (sometimes 2) point releases behind and never had an issue upgrading. Omnibus on Ubuntu.

EnFinlay · on Feb 26, 2022

I know Gitlab takes security seriously and I think part of why we hear so much about it is because they're so transparent.

registeredcorn · on Feb 27, 2022

That's the rub, isn't it? I feel some of the least secure places are the ones which never mention (or realize) that they have a security problem.

I don't have any particular opinion of Gitlab, but it does seem that acknowledging fault is more valuable than the alternative. If I were to attack a service, I'd probably tend to avoid the one that actively updates its security regularly.

Genbox · on Feb 27, 2022

> I feel some of the least secure places are the ones which never mention (or realize) that they have a security problem.

It is an age old conundrum that people seem to struggle with a lot. As part of my previous profession, I've security reviewed hundreds of open source libraries. They fall into 3 categories:

1. The libraries/applications that have either had a really security conscious developer behind it or a very rocky past such that it was given an extraordinary amount of attention security wise. These account for about 1% of software

2. The "normal" libraries/applications that nobody cares about with regards to security. As long as the code works everyone keeps using it. They are often insecure by default, but the lack of attention from SMEs means they won't be judged. They account for 80% of software.

3. The horrid ones. Built-in code execution as a feature. Authentication systems with more critical bugs than you can possibly imagine. We never talk about these because... well, we could spend months on polishing a turd. Nobody dares to publicly speak up and say "don't ever use X, Y and Z" for fear of the repercussions. They account for the rest of software.

GitLab used to be in category 2, but got moved into 1 about two years ago when security professionals started to give it attention.

As to your feeling, 99% of software has security issues, some worse than others. We rarely talk about them.

bogwog · on Feb 26, 2022

How many people would even want to host a public Gitlab instance? Gitlab.com is free, works well, and even gives you some free CI minutes.

I’ve been running a private instance for about a year and am absolutely in love with it. Gitlab CI is the killer feature IMO, and self hosting it means I never have to worry about usage limits.

But if all you need is basic git hosting with an issue tracker, I don’t see a reason to use Gitlab over something like Gitea.

jancsika · on Feb 26, 2022

I run a Gitlab instance. It wouldn't be a pain except that user spam is nonstop.

How does gitlab.com deal with this? Or do they just put up with rando users signing up and spamming the snippets/issues/etc.?

dnsmichi · on Feb 26, 2022

GitLab employee here.

You can disable signups, or require all users to be approved by an administrator, if that works for your instance. https://docs.gitlab.com/ee/user/admin_area/settings/sign_up_... There are more ways, like limiting signup to specific domains.

Future spam detection ideas shared in https://news.ycombinator.com/item?id=30479511

jancsika · on Feb 26, 2022

Thanks.

> You can disable signups,

I want potential GSoC participants to be able to sign up.

> or require all users to being approved by an administrator, if that works for your instance.

I don't have time to hand separate the Indonesian casino spammers from the potential GSoC participants. And I do mean "by hand"-- the Gitlab UI requires me to click a button to open up a secondary menu, then choose add or delete, then wait for the user screen to reload.

At least when sifting through an email spam folder back in the 90s I could press the delete button multiple times in a row. Even that would be a relatively usable solution.

dnsmichi · on Feb 26, 2022

Thanks for the additional context. Agreed, manually approving and filtering is not efficient here. Spamcheck suggested in https://news.ycombinator.com/item?id=30480296 should be the path.

mathstuf · on Feb 26, 2022

The API is good enough to write some Python code that does this way faster. Some autoclassification based on keywords helps a lot too.

I made some scripts to do this, but would have to extract them from beside the user data in the repo.
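A minimal sketch of the keyword-based autoclassification idea, not the scripts mentioned above. It assumes user records are dictionaries shaped roughly like entries from GitLab's users API; the field names (`username`, `name`, `bio`, `website_url`), the keyword list, and the threshold are all assumptions chosen for illustration. Fetching the users and blocking or deleting the flagged ones through the API is deliberately left out.

```python
from __future__ import annotations

# Hypothetical keyword list; tune it to the spam your own instance sees.
SPAM_KEYWORDS = ("casino", "slot", "betting", "poker")

# Assumed free-text profile fields on each user record.
PROFILE_FIELDS = ("username", "name", "bio", "website_url")


def spam_score(user: dict) -> int:
    """Count how many spam keywords appear in the user's profile text."""
    text = " ".join(str(user.get(field) or "") for field in PROFILE_FIELDS).lower()
    return sum(1 for keyword in SPAM_KEYWORDS if keyword in text)


def triage(users: list[dict], threshold: int = 1) -> tuple[list[dict], list[dict]]:
    """Split users into (likely_spam, needs_human_review)."""
    likely_spam = [u for u in users if spam_score(u) >= threshold]
    needs_review = [u for u in users if spam_score(u) < threshold]
    return likely_spam, needs_review
```

Substring matching is crude and will produce false positives, so it probably makes sense to treat the first list as a queue for quick bulk confirmation rather than automatic deletion.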

boleary-gl · on Feb 26, 2022

GitLab employee here.

We have internal tooling that we're working on incorporating into GitLab itself to help with this: https://about.gitlab.com/blog/2021/08/19/introducing-spamche....

southerntofu · on Feb 26, 2022

Hello, do you have any plans/desire to support ActivityPub federation on Gitlab? It's a killer-feature-to-come for Gitea and certainly would help dealing with spam, as admins could allowlist trustworthy instances on an opt-in basis, enabling easy cooperation across related communities.

boleary-gl · on Feb 26, 2022

I don't think it's currently scheduled: https://gitlab.com/gitlab-org/gitlab/-/issues/30672

southerntofu · on Feb 26, 2022

Yeah i saw that issue two years ago. It's sad nothing has moved on here, whereas the forgefriends project (ex-fedeproxy, not directly related to forgefed) has been super active in the past year (check out their monthly reports) in this area of forging interop.

EDIT: someone on that issue summarized the issue pretty well:

> Its really annoying how fragmented gitlab is rightnow. I have a dozen accounts on a dozen instances. This feature combined with oauth login to other instances, would make it like there is one big gitlab we all use!

jancsika · on Feb 26, 2022

Sorry, I'm not sure I understand. How does that help "dealing with spam?"

southerntofu · on Feb 26, 2022

Because once you have federation you can either use an operator/domain web-of-trust, or you can use allow/denylists on your instance. That's how email or XMPP is kept mostly spam-free (on a selfhosted server most spam - if not all - i receive is from gmail addresses, not from selfhosted servers who are easily denylisted if they start sending spam).

In particular, if an instance or specific repository concerns only people from specific projects/instances, it would be easy to allowlist those specific instances and not have to deal with spam at all.

jancsika · on Feb 26, 2022

> on a selfhosted server most spam - if not all - i receive is from gmail addresses, not from selfhosted servers who are easily denylisted if they start sending spam

And most - if not all - potential GSoC contributors are from gmail addresses. So again, I don't understand how this could be a general solution to spam.

southerntofu · on Feb 27, 2022

I think you don't get my point. I'm not advocating for denylisting gmail.com because it produces spam (although this has tempted me on more than one occasion), i'm saying fighting spam in federated environments has decades of experience of various techniques that work well. Open nodes (eg. remailers) have terrible reputation and are denylisted pretty much everywhere, but specific communities/servers can maintain a decent reputation as long as they have some form of moderation/cooptation. By opting into the federation, Gitlab could support various advanced workflows depending on your threat model:

- a new organization using your project? maybe grant their whole gitlab instance "issues" read/write access to the project

- publishing FLOSS in a "community" setting where random people submitting contributions is not expected? maybe we can check the PGP WoT before deciding whether to accept that PR

- running a federation of organizations, some of whom may run their own instance? allowlist all the instances so they can interact across instances

- running a public forge like gitlab.com, codeberg.org, or chapril.org? maybe maintain an allowlist of servers who ask for it and pledge to fight spam

- feel adventurous? set up an entirely public instance and help catch spam and report it to denylists

All this is already possible at the email level, but pointless, as you pointed out, since the trustworthiness of the mail server is not correlated with the trustworthiness of the forge.
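The instance-level allow/denylist workflows listed above could be sketched roughly as follows. This is purely hypothetical: GitLab has no such federation feature according to this thread, and the function names, the actor URL format, and the example hostnames are all invented for illustration.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hypothetical lists an admin might maintain for a federated forge.
ALLOWLIST = {"codeberg.org", "gitlab.gnome.org"}
DENYLIST = {"spam.example"}


def instance_of(actor_url: str) -> str:
    """Return the lowercased hostname of the instance an actor comes from."""
    return (urlparse(actor_url).hostname or "").lower()


def decide(actor_url: str, open_federation: bool = False) -> str:
    """Return 'accept', 'reject', or 'moderate' for an incoming contribution."""
    host = instance_of(actor_url)
    if not host or host in DENYLIST:
        return "reject"
    if host in ALLOWLIST:
        return "accept"
    # Unknown instances: queue for moderation on an open instance,
    # otherwise reject (allowlist-only mode).
    return "moderate" if open_federation else "reject"
```

With `open_federation=False` this behaves like the opt-in allowlist setup; setting it to `True` corresponds to the "adventurous" public instance that moderates unknown senders.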

jancsika · on Feb 26, 2022

Sounds like a potential solution.

When will it ship?

boleary-gl · on Feb 26, 2022

Looks like it shipped in 14.8 (4 days ago)

https://docs.gitlab.com/ee/user/admin_area/reporting/spamche...

jancsika · on Feb 26, 2022

Wait a sec... this is from the feature request[1]:

> Just because I don't think I said it explicitly anywhere above: Because we are using an obfuscated, non-free component (the preprocessor), we can't include spamcheck in CE (users of CE expect no proprietary code to be included in the pacakge), but only in EE.

So... is it available in the current version of gitlab-ce or not? I don't want to waste time trying to get it running only to find out you've only made it available for enterprise editions and gitlab.com.

1: https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6259

dnsmichi · on Feb 27, 2022

Non-free obfuscated code cannot be included in the community edition unfortunately. https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/LICE... The architecture in https://gitlab.com/gitlab-org/spamcheck#architecture-diagram shows the spam detection, where the ML training models remain obfuscated to not give spammers an advantage.

You can run EE without license, it provides the same features as CE. Maybe that is an option for you: https://docs.gitlab.com/ee/update/package/convert_to_ee.html I've created an MR to help clarify the docs: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/81751

jancsika · on Feb 27, 2022

If I didn't care about the open source license I'd simply use github. (Which, unfortunately, may be the only solution that doesn't continue eating more and more of my time.)

Anyhow, this sounds like a death knell for gitlab-ce. My GSoC use case isn't fringe (there are 100s of GSoC orgs), and Gitlab wouldn't have spent money on the ML approach for EE if it weren't generally important.

jancsika · on Feb 26, 2022

Oh wow, thanks!

I'll have a look.

rvz · on Feb 26, 2022

Ask GNOME [0], Redox OS [1].

[0] https://gitlab.gnome.org/GNOME

[1] https://gitlab.redox-os.org/

ModernMech · on Feb 26, 2022

> only open-source alternative that can keep up with Github feature-wise.

Imo they're still leading Github in some areas. My favorite Gitlab exclusive feature is a very simple one: folders inside of projects. Usually Github has followed a pattern of catching up to Gitlab, so I think it's probably only a matter of time before they add this feature (if they haven't already, it's been a while since I checked).

Actually I take that back, my favorite feature is that I can host my own private instance. Gitlab gets even better if you have admin access to the system. And that's one feature Github will probably take a while to copy, if ever.

Operyl · on Feb 26, 2022

> And that's one feature Github will probably take a while to copy, if ever.

Eh you can already self host GitHub, longer than GitLab has been around, it’s just not free to do so.

ModernMech · on Feb 26, 2022

> it’s just not free to do so.

My general rule of thumb is that if a product lists its price as "Contact our sales team", then it's effectively unavailable to me as an individual. So I guess if we're talking about what exactly the Github "feature" is that MS won't copy, it wouldn't be that there's an enterprise option, but that Gitlab is free and open source and practical for me as an individual to install. Obviously if you pay Microsoft enough money they'll do whatever you want (up to and including I guess buying Github itself as a company from them. Always has been an option, it's just not free to do so.)

Operyl · on Feb 26, 2022

Pricing isn’t hidden: https://github.com/pricing

$21 per user per month.

ModernMech · on Feb 26, 2022

Well that confirms I can't afford it!

Interestingly, this page isn't linked to from https://github.com/enterprise (except in the footer), where the calls to action are to either "start a free trial" or to "contact sales".

granzymes · on Feb 26, 2022

It's also prominently displayed in the header.

ModernMech · on Feb 26, 2022

Only if you're not logged in. If you're logged in it's not there at all.

Why does this feel like people are trying to turn this into some sort of gotcha? It wasn't even my main point. The point is that GitHub enterprise is not an option for individuals.

bastardoperator · on Feb 26, 2022

Hence the title "Enterprise"...

ModernMech · on Feb 26, 2022

Exactly. But the original argument was "well ackshually, you can self host Github, you just have to pay for it". Technically true, practically not.

Operyl · on Feb 26, 2022

While not practical for you, it's not priced out for every single small team or individual user ever. If I recall correctly, there's no minimum seat requirement still. You're changing the position of the goal post on your original comment :).

Original comment from you:

> Actually I take that back, my favorite feature is that I can host my own private instance. Gitlab gets even better if you have admin access to the system. And that's one feature Github will probably take a while to copy, if ever.

ModernMech · on Feb 27, 2022

Why are people being so pedantic about this? What is going on here?

I never claimed it's "priced out for every single small team or individual user ever". You even quoted it right there:

> my favorite feature is that I can host my own private instance.

I have 33 people on my personal Gitlab instance. It would cost me over $8k per year to run a Github Enterprise instance with that many users and I don't have the kind of money to do that. Until I can do that with Github for $0 then my point stands.
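The "over $8k per year" figure can be checked with a quick calculation, assuming the $21 per user per month price quoted earlier in the thread and no volume discount or minimum seat count:

```python
USERS = 33
PRICE_PER_USER_PER_MONTH = 21  # USD, the figure quoted earlier in the thread

annual_cost = USERS * PRICE_PER_USER_PER_MONTH * 12
print(f"${annual_cost:,} per year")  # prints "$8,316 per year"
```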

bastardoperator · on Feb 26, 2022

Yeah, enterprise pricing typically works that way. It's designed for businesses, not individuals. Smaller teams pay 3.33 cents a month based on the pricing page that was pointed out to you and individuals can sign up for free. I also find the copy claim to be entirely ironic.

They're both commercial companies chasing dollars. The OSS sales model is simple. Get companies/people hooked on OSS, have them reach the boundaries of the OSS product and then sell them a commercial license. Most people can and will just move to the better enterprise offering which is probably why I see so little GitLab out in the wild. Based on what I'm seeing/hearing from MSFT, GitHub is basically a free product that now comes with your MSFT enterprise agreement. If you agree to spend enough Azure or Visual Studio dollars MSFT agrees to pony up GitHub licenses. GitLab can't compete on that level, and the offering alone isn't compelling enough for your average fortune 500 when it comes to spending money.

PragmaticPulp · on Feb 26, 2022

I had to upgrade a slightly older, internal-only GitLab instance for a company a while ago. I was shocked by how many upgrade path dependency problems there were. Ended up having to roll back to a backup and do the upgrade in a stepwise fashion through various intermediate steps according to their long document on the process, including a fair number of manual commands and migrations and a lot of Googling for obscure error messages.

I enjoy GitLab and it’s the only real option for open self-hosting, but it made me miss the days of having GitHub Enterprise.

donmcronald · on Feb 26, 2022

Yeah. I just booted up an old VM to update it. It was on v13 and told me I needed to upgrade to 14.0 first, then 14.8. I did that. Now it's totally broken with crappy error messages and I'll get to waste a few hours today trying to fix it.

I switched to Gitea a long time ago and have no regrets.

cpitman · on Feb 27, 2022

Literally going through this right now. Performing the upgrade from 13 to 14, following the required upgrade path.

In the past, you could go from current version -> latest z release -> X.0 release of the next major -> latest minor release of Z. As long as gitlab completed the db migrations and came up, you were (relatively) ok to continue to the next upgrade.

Turns out 14 introduces a brand new upgrade failure case. Starting with 14, upgrades can include async db migrations that must complete before you continue upgrading. But once you start upgrading, it's too late, things are hosed, so now I'm falling back to a backup and starting over.

Schinken_ · on Feb 26, 2022

I am not even sure if most people would need all the features. Nice to have yes, but I have been running Gogs for at least 2 years and never thought I need more. This is for personal usage though.

Faaak · on Feb 26, 2022

CIs + runners + embedded registry is a really nice add-on that I couldn't do without

KronisLV · on Feb 26, 2022

Agreed! However the amount of maintenance that you need to do because of the vulnerabilities is disappointing, especially due to how contrived the upgrade paths are: https://docs.gitlab.com/ee/update/#upgrade-paths

Thus, it might be a better idea to look into a less popular stack that's less likely to be targeted as much due to not being such a juicy target.

For example:

  Code: Gitea/Gogs/GitBucket
  CI: Drone/Jenkins (okay there are probably better options than Jenkins, to be honest)
  Registry: Nexus/Artifactory (not just for containers, they support most formats and have better control over cleanup of old data so you don't have to schedule GitLab cleanup yourself)

Of course, at the end of the day all of those still have an attack surface, so i'm really leaning more and more into the camp of exposing nothing publicly since it's a losing battle.

tenken · on Feb 26, 2022

What maintenance?! I just bump the docker-compose.yml version numbers and Stop/Start the service. It's very painless... My cellphone has more frequent updates than Gitlab does.

KronisLV · on Feb 26, 2022

If you do that with minor versions, you should generally be fine. When you need to upgrade across major versions, you'll most likely be met with the following in case you haven't followed the updates closely:

> It seems you are upgrading from major version X to major version Y.

> It is required to upgrade to the latest Y.0.x version first before proceeding.

> Please follow the upgrade documentation at https://docs.gitlab.com/ee/update/index.html#upgrading-to-a-...

In addition to that, you should NEVER just bump versions without having backups (which you've hopefully considered), so there is probably another step in there, either validating that your latest automatic backups work, or even just manually copying the current GitLab data directory into another folder, in the case of an Omnibus install, or doing the same manually for all components in the more distributed installation type.

Disclaimer: this has little to do with GitLab in particular but is something you should consider with any and all software packages that you upgrade, especially the kind with non-trivial dependencies and data storage mechanisms, like PostgreSQL. Of course, you can always dump the DB but it's easier to back up everything else as well by taking the instances offline and making data copies of all container volumes/bind mounts.
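As a rough sketch of the "copy the data directory into another folder" step, assuming the services have already been stopped so the files are not changing underneath the copy. The `snapshot` function and the paths passed to it are hypothetical and not a GitLab tool; they do not replace GitLab's own backup mechanism or a tested restore.

```python
from __future__ import annotations

import shutil
import time
from pathlib import Path


def snapshot(data_dir: str, backup_root: str) -> Path:
    """Copy data_dir into a new timestamped folder under backup_root."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such data directory: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # copytree creates dest (and missing parents) and fails if dest exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

A call such as `snapshot("/srv/gitlab/data", "/srv/backups")` (both paths are placeholders) would be run once per volume or bind mount before bumping the version.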

tenken · on Feb 26, 2022

* I never skip (minor) versions.

* I have automated backups created every 2 days stored via S3, I've done a full restore twice in 5+ years of uptime.

* I run Gitlab at home and at work.

None of the points touch on a maintenance burden ... Just saying. Skipping versions while updating any software is just being a lazy sysadmin and praying it works. Typically skipping to major versions during upgrades always comes with breaking changes so operator beware.

KronisLV · on Feb 27, 2022

> Skipping versions while updating any software is just being a lazy sysadmin and praying it works. Typically skipping to major versions during upgrades always comes with breaking changes so operator beware.

It's nice that GitLab actually prevents you from doing that, gives you messages that it's unsupported and directs you to their documentation, which describes the supported upgrade paths...

However, at the same time one cannot help but wonder why you can't go from version #1 to version #999 in one go. Most of the software at my dayjob (at least the one that i've written) absolutely can do that - since the DB migrations are fully automated, even if i had to create a lot of pushback against other devs going: "Well, it would be easier just to tell the clients who are running this software to just do X manually when going from version Y to Z."

But GitLab's updates are largely automated (the Chef scripts and PostgreSQL migrations etc.), it's just that for some reason they either don't include all of them or require that you visit certain key points throughout the process, which cannot be skipped (e.g. certain milestone releases, as described in their docs).

Of course, i acknowledge that it's extremely hard to sustain backwards compatibility and i've seen numerous projects start out that way and the devs give up on the idea at first sign of difficulty, since it's not like they care much for that and it doesn't always lead to clear value add - it's a nice to have and they won't earn any less of a salary for making some ops' person's life harder down the line.

I also have automated backups with BackupPC, however i expect software to remain reasonably secure and stable without having to update that often - props to GitLab for disclosing the important releases, but i'm migrating over to Gitea for my personal needs as we speak, even if having someone else manage a GitLab install at work is still like having a superpower (with GitLab CI, GitLab Registry etc.).

I actually wrote an article about how really frequent updates cause problems and lots of churn: https://blog.kronis.dev/articles/never-update-anything (though the title is a bit tongue in cheek, as explained by the disclaimer at the top of the article).

tenken · on Feb 27, 2022

Your db migrations may support updates from #1 to #99 but your OS does not directly support updates of MySQL 5 to MySQL 8 without issues. For example there are plenty of examples of deprecated my.cnf configuration values. Similarly APT on Ubuntu will prompt how to handle a my.cnf that differs from the Distribution release when upgrading Versions. Often times this is more painful than minor version updates.

I think the version milestones in Gitlab are akin to dependency changes for self-hosted Gitlab. An example is the Gitlab v9 (?) Postgres upgrade to Postgres v11 I think: it was opt-in for a prior version of Gitlab, then required at that version milestone. It's difficult to make db migration scripts for Gitlab, as in your example, that may depend on newer Postgres idioms not available in the legacy db version. So you can't simply support gitlab updates from X to Y version due to underlying dependency constraints...

Thanks for the insightful discourse.

KronisLV · on Feb 27, 2022

> Your db migrations may support updates from #1 to #99 but your OS does not directly support updates of MySQL 5 to MySQL 8 without issues.

That's just the thing - more software out there should have a clear separation between the files needed to run it (binaries, other libraries), its configuration (either files, environment variables or a mix of both) and the data that's generated by it.

The binaries and libraries can easily have breaking changes and be incompatible with one another (essentially treat them as a blob that fits together, though dynamic linking muddies this). The configuration can also change, though it should be documented and the binaries should output warnings in the logs in such cases (like GitLab actually already does!). The data should have extra care taken to make it compatible between most versions, with at least forwards only migrations available in all other cases (since backwards compatible migrations are just too hard to do in practice).

Alas, i don't install most software on my servers anymore, merely Docker (or Podman, basically any OCI compatible technology) containers with specific volumes or bind mounts for the persistent data. GitLab is pretty good in this regard with its Omnibus install, though there are certainly a few problems with it if you try to do too many updates or have a non-standard configuration.

I actually wrote more about it and why i just migrated away from GitLab to Gitea, Sonatype Nexus and Drone CI on my blog: https://blog.kronis.dev/articles/goodbye-gitlab-hello-gitea-...

Of course, i'll still use GitLab in my company because there it's someone else's job to keep it running with a hopefully appropriate amount of resources to keep it that way with minimal downtime and all the relevant updates. But at the same time, for certain circumstances (like my memory constrained homelab setup), it makes sense to look into multiple lightweight integrated solutions.

You can actually find more information about what broke for me while doing updates in particular, seemingly something cgroups related with gitaly related stuff not having the write permissions needed inside of the container, which later led to the embedded PostgreSQL failing catastrophically. In comparison, right now i just have Gitea for similar goals which is a single binary that uses an SQLite database, as well as the other aforementioned tools for CI and storage of artefacts, which are similarly decoupled.

It's probably all about constraints, drawbacks and finding what works for you best!

dnsmichi · on Feb 26, 2022

> how contrived the upgrade paths are: https://docs.gitlab.com/ee/update/#upgrade-paths

Thanks for your feedback, agreed. I've created an issue https://gitlab.com/gitlab-org/gitlab/-/issues/353862 - please add additional thoughts and suggestions there as well. Thanks!

KronisLV · on Feb 26, 2022

Well, i don't think that there's actually that much that can be done here, since that page does contain adequate documentation and a linear example migration path:

  8.11.Z -> 8.12.0 -> 8.17.7 -> 9.5.10 -> 10.8.7 -> 11.11.8 -> 12.0.12 -> 12.1.17 -> 12.10.14 -> 13.0.14 -> 13.1.11 -> 13.8.8 -> 13.12.15 -> 14.0.12 -> latest 14.Y.Z

It's just that the process itself is troublesome, in that you can't just go from let's say 11.11.8 to 14.6.5 in one go and let all of the thousands of changes and migrations be applied automatically with no problems, as some software out there attempts to (with varying degrees of success).

Of course, it's probably not viable due to the significant changes that the actual application undergoes and therefore one just needs to bite the bullet and deal with either constant small updates or few and longer updates for private instances.
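To make the stepwise path concrete, here is a small sketch that lists the intermediate stops between two versions. The list of required stops is copied from the example path quoted above and reflects the state of the docs in early 2022; the official upgrade documentation remains the authority, and the `upgrade_plan` helper is hypothetical, not a GitLab tool.

```python
from __future__ import annotations

# Required stops taken from the example path quoted in this thread.
REQUIRED_STOPS = [
    "8.12.0", "8.17.7", "9.5.10", "10.8.7", "11.11.8", "12.0.12", "12.1.17",
    "12.10.14", "13.0.14", "13.1.11", "13.8.8", "13.12.15", "14.0.12",
]


def parse(version: str) -> tuple[int, ...]:
    """Turn '13.12.15' into (13, 12, 15) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def upgrade_plan(current: str, target: str) -> list[str]:
    """List the versions to install, in order, to get from current to target."""
    cur, tgt = parse(current), parse(target)
    if tgt <= cur:
        return []  # nothing to do, and downgrades are out of scope
    stops = [s for s in REQUIRED_STOPS if cur < parse(s) < tgt]
    return stops + [target]


print(" -> ".join(upgrade_plan("11.11.8", "14.6.5")))
# 12.0.12 -> 12.1.17 -> 12.10.14 -> 13.0.14 -> 13.1.11 -> 13.8.8 -> 13.12.15 -> 14.0.12 -> 14.6.5
```

So the 11.11.8 to 14.6.5 jump mentioned above means nine separate installs under this list, each of which has to finish its migrations before the next one starts.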

But hey, thanks for creating the issue and best of luck!

forty · on Feb 26, 2022

Side question: is the gitlab docker registry really that useful?

The fact that any non-protected runner can push to it makes it useless for storing images to be used in other CI pipelines (unless I missed something, in which case I would be really glad to hear it)

mhio · on Feb 27, 2022

The registry is fine for our use case, which is to manage all the dev artefacts on a private repo before CI might push a release to a production registry.

> any non-protected runner can push to it

My understanding is the job token inherits its permission set from the user causing the job to run. If the user has `write_registry` to a project (developer up), then the job does. Do you see more access than that?

The access can be limited per project to specific projects by setting a scope [0], but your description sounds like it might be access within the project that is the issue.

0: https://gitlab.sam3.io/help/ci/jobs/ci_job_token#configure-t...

forty · on Feb 27, 2022

The whole point of having protected runners is to do things that developers are not allowed to do. If any developer can push images to the registry without any review/approval, and those images are used in other CI pipelines, that's a problem for us.

Having a separate production registry is good indeed, but for images to be used for CI itself, having something self-contained within gitlab would have been nice.

0xbadcafebee · on Feb 26, 2022

> it's madness to even consider running a publicly-reachable Gitlab instance...

why would you want it to be a public instance?

thfuran · on Feb 26, 2022

That seems pretty reasonable for any open source project to want.