Does anyone else find this article unreadable? It sounds more like a marketing piece than an explanation of what merge queue is.


Merge queues are, as the name implies, queues for pull requests/merges. They're kinda useless if your commit traffic is low (e.g. <10 per day), but become necessary once it grows roughly past your daily CI time budget (which can happen on large monorepos).

As a very simple example, if your CI takes 10 minutes, your CI time budget is 6 merges per hour.

This is because if you merge two things in parallel without validating CI for the combined changes, your main branch could end up in a broken state.

Merge queues run CI for groups of PRs. If the group passes, all the PRs in the group land simultaneously. If it does not, the group is discarded while other group permutations are still running in parallel.

This way you can run more "sequential" CI validation runs than your CI time budget allows.

In our monorepo, we get a volume of 200-300 commits per day with a CI SLO of 20 minutes.

Without a queue, our best-case scenario would be a cap of ~72 commits per day before seeing regressions on main despite fully green CI (in real life you'd see regressions a lot earlier, though, because PR throughput is spiky in nature).
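
As a sanity check on those numbers, here's the back-of-the-envelope arithmetic (a toy Python snippet using the figures from this comment):

    ci_minutes = 20                  # the CI SLO mentioned above
    merges_per_hour = 60 / ci_minutes
    merges_per_day = 24 * 60 // ci_minutes
    print(merges_per_hour)           # 3.0 (or 6.0 for a 10-minute CI)
    print(merges_per_day)            # 72 -> the ~72 commits/day cap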


> Merge queues run CI for groups of PRs. If the group passes, all the PRs in the group land simultaneously. If it does not, the group is discarded while other group permutations are still running in parallel.

That is a way of handling even higher volumes than GitHub is talking about, at the cost of a system that is a bit harder to think about. From the article:

> With GitHub’s merge queue, a temporary branch is created that contains: the latest changes from the base branch, the changes from other pull requests already in the queue, and the changes from your pull request. CI then starts, with the expectation that all required status checks must pass before the branch (and the pull requests it represents) are merged.


The core principle is the same. How permutations are selected, of course, affects the performance and usability of the system.

Uber's[0] implementation, for example, does some more sophisticated speculation than just picking up whatever is sitting on the queue at the time.

Queues come with quirks: small PRs can get "blocked" behind a giant monorepo-wide codemod, for example. Naturally, one needs to weigh the ROI of implementing techniques for aberrant cases against their overall impact.

[0] https://www.uber.com/blog/research/keeping-master-green-at-s...
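
A rough illustration of the batching idea (a hypothetical Python sketch, not GitHub's or Uber's actual algorithm; a real system would test several sub-permutations in parallel rather than bisecting sequentially):

    def run_ci(prs):
        # Stand-in for running CI on main plus all of `prs` merged together.
        return all(pr["green"] for pr in prs)

    def process_queue(queue, max_batch=8):
        merged = []
        batch_size = max_batch
        while queue:
            batch = queue[:batch_size]
            if run_ci(batch):
                merged.extend(batch)          # the whole batch lands at once
                queue = queue[len(batch):]
                batch_size = max_batch        # reset after a success
            elif len(batch) == 1:
                queue = queue[1:]             # a lone failing PR is kicked out
            else:
                batch_size = len(batch) // 2  # retry a smaller prefix
        return merged

    print(process_queue([{"green": True}, {"green": False}, {"green": True}]))
    # -> [{'green': True}, {'green': True}]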


GitHub's merge queue does support merging multiple PRs in a single merge operation; it's the "Maximum pull requests to merge" setting.


It's awful.

I scrolled down to the "How does it work" section, where the first sentence is:

> Merge queue is designed for high-performance teams where multiple users regularly commit to a single branch

Half of the "How does it work" section is buzzwordy fluff.


They don't need to, since the link posted here is literally the press release announcement. For the inner details, one should look at the documentation at https://docs.github.com/en/repositories/configuring-branches... . It has, for example, a detailed walkthrough of how a pull request failing ahead in the queue is handled: https://docs.github.com/en/repositories/configuring-branches...


> Collaborative coding is powerful. But to be at your team’s most optimized state, you need automated branch management that enables multiple developers to commit code on a daily basis, without frustration. This can happen if your team’s branch is busy with many team members onboarding the same commit onramp. This can be frustrating for your team, but, more importantly, it gets in the way of shipping velocity. We don’t want that journey for you!

> This is why we built merge queue. We’ve reduced the tension between branch stability and velocity. Merge queue takes care of making sure your pull request is compatible with other changes ahead of it and alerting you if something goes wrong. The result: your team can focus on the good stuff—write, submit, and commit. No tool sprawls here. This flow is still in the same place with the enablement of a modified merge button because GitHub remains your one-stop-shop for an integrated, enterprise-ready platform with the industry’s best collaboration tools.


You didn't copy all the emojis! :P


I think this website filters them.


Unbearable corporate buzzword soup. Yikes.


Banal stuff from ten-year-old devops/continuous-delivery material. It’s a good feature; maybe you’re just unfamiliar with some of the theory basics?


The feature may be good, but no theory on earth can make me read stuff like that without getting sick.


I read it yesterday and couldn't figure out exactly what they were talking about and upon re-reading, yeah, it's bad marketing copy.

The problem is probably that whoever wrote the blog post (likely not even the named author, depending on how their marketing team does things) tried to add a lot of high-level framing so it would make sense to them without their really needing to understand the details, and then dolled it up with a bunch of useless, vapid quotes from customers and whatnot, because that is what marketing people think matters. Maybe it does make sense to use mealy-mouthed corporate speak for the overall product, since some executive is probably deciding whether to use GitHub as a whole and might care whether a big company uses it. I don't know that it makes much sense for specific features like this, especially in a fairly technical product like GitHub.


It's totally unreadable. There's 5% meat in it, almost at the very end; the rest is about selling the feature.

The problem is that if you have multiple branches going into the same (mono)repo, they might all pass a localized CI check but fail once they are all merged, because the branches interact with each other. That can lead to a stall in commits, and because everything hinges on the repo, work is going to stall as well.

So you serialize the branches and impose an order on them: [x_1, x_2, x_3, ...]. Now, when running CI on one of these, say x_j, you do so in a temporary branch containing every branch x_i with i < j. This avoids a stall up to branch x_j if you started merging the branches in order. If CI fails on branch x_j, you remove it from the list (queue) of branches to be merged and continue.
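
A minimal sketch of that loop (hypothetical Python; run_ci stands in for a real CI run on the temporary branch):

    def run_ci(branch_contents):
        # Stand-in: pretend branches whose name contains "bad" break CI.
        return not any("bad" in b for b in branch_contents)

    def drain_queue(main, queue):
        accepted = []
        for x_j in queue:
            # CI for x_j runs on main plus every accepted x_i with i < j.
            candidate = main + accepted + [x_j]
            if run_ci(candidate):
                accepted.append(x_j)
            # On failure, x_j is dropped and the queue continues.
        return main + accepted           # what actually lands on main

    print(drain_queue(["main"], ["x_1", "x_2_bad", "x_3"]))
    # -> ['main', 'x_1', 'x_3']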


I feel like since the Microsoft acquisition almost all their communications at all levels have gone from detailed info about features to fluff marketing pieces.

Beyond that, their API docs prior to the acquisition were some of the best in the industry, readable and concise. Now they are just a complicated mess.


> I feel like since the Microsoft acquisition almost all their communications at all levels have gone from detailed info about features to fluff marketing pieces.

Comms teams are really terrible in this regard. They insist on a singular 'voice', which means that every article goes through their review and gets rewritten to their standard, and that standard may involve removing technical content to make the piece more layman- and marketing-friendly.

It's an incredible mistake that I see made everywhere after companies hit a certain size. It then falls to engineers to build their own engineering blog with less oversight and to guard it from the comms teams, which most engineers aren't interested in doing.


For GitHub, the layman is a programmer, no? So why remove technical info?


The layman is, to a comms team, a manager, CISO, or some other mystery person I really couldn't explain to you. Yes, it's ridiculous and incorrect but that's my point.


> Does anyone else find this article unreadable? It sounds more like a marketing piece than an explanation of what merge queue is.

Yes. The first useful line in the article is

"With GitHub’s merge queue, a temporary branch is created that contains"

and to reach it you have to skip fluff paragraphs halfway down the article.


I also have a hard time understanding what it really is.

What I think it is: instead of you trying to merge into the main branch, you try to merge into a branch where all pull requests before you are already merged in.

That way the pull requests before you can't cause any merge conflicts, because they are already taken into account.

At least that's what I deduce from all the marketing fluff. Maybe I'm completely wrong.


Yes, but I find it hardly different from most articles today. Whether it's a blog post or a news story, an article will likely be a mediocre exercise in creative writing (with the intent to persuade) or a bunch of marketing waffle.


It's only readable if you already know what merge queues are. Or reading their actual docs [1].

Merge queues address the problem of how to (1) merge in a lot of changes (2) while guaranteeing no breaking/conflicting changes are merged.

[1] https://docs.github.com/en/repositories/configuring-branches...


https://graphite.dev/blog/what-is-a-merge-queue

This explanation is actually a lot better


Yeah, I read the whole thing, sounds interesting, came here to see if someone could actually explain what it is.


It automates the post-approval coordination stages of a PR for maintainers.

Let's say you're an open source maintainer with 3 pending Pull Requests to merge: [1, 2, 3]. Each of which is based off `main`, has passed CI and has been approved.

If you merge all 3 at the same time, there is a chance to break the build: Your CI is testing `main <- 2`, but you're merging `main <- 1 <- 2`. A common example would be when (1) is a user-supplied change, and (2) is a dependency/localisation change, which don't cause merge conflicts but they do break the build/tests.

To do this safely, you need to re-run CI on (2) after merging (1), which is currently a manual process: you need to know that (2) is next to be merged, then rebase/pull + rerun CI for (2).

(There used to be a manual step of 'merge once CI is passed' here, GitHub has recently improved this workflow to allow automation)

Merge queues fully automate the safe approach: the queue merges (1), runs CI on (2), which fails, then runs CI on (3), which passes and gets merged.
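
To make that concrete, here's a toy trace of exactly this scenario (hypothetical Python; breaks_with marks the semantic conflict between (1) and (2)):

    prs = [
        {"id": 1, "breaks_with": set()},
        {"id": 2, "breaks_with": {1}},   # green alone, broken once (1) lands
        {"id": 3, "breaks_with": set()},
    ]

    merged_ids = set()
    for pr in prs:
        # CI runs against main plus everything merged ahead of this PR.
        if pr["breaks_with"] & merged_ids:
            print("PR", pr["id"], "fails queue CI and is kicked out")
        else:
            merged_ids.add(pr["id"])
            print("PR", pr["id"], "merges")
    # -> PR 1 merges, PR 2 fails, PR 3 merges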


What happens if someone wants to merge when the queue is already running CI? Does it interrupt CI and start over, or does it run CI to the end and then kick CI off again with every new merge added to the queue since the last CI kickoff? Or does it merge on a successful CI and put together a new queue with those new waiting merges right after?


Thanks! Github should hire you to write these posts.


You know when you submit a PR at the same time as someone else, they both pass CI, and then both merge without any merge conflicts, but then it turns out there were semantic conflicts and you accidentally broke `master`?

This fixes that. It removes the race condition that exists because of the gap between testing a branch and merging it.

The solution is very simple - have a queue of PRs and automatically test & merge them one at a time.

There are some optimisations you can do to speed things up a bit, e.g. testing a bundle of PRs all at once, but that's the gist of it.

It is basically essential on any repo that has a high rate of PRs. I'm surprised so many people here haven't heard of it.

GitLab has the same feature, but they annoyingly gave it a worse name, merge trains, and it's only in GitLab Premium.


to me it's the ability to test your merge on a virtual main branch


Yes, this set off all my 'GPT warning bells'. Anyone know the latest on automatic GPT detectors? Feels like it should be easy, but last I checked they had a lot of false positives.


> Anyone know the latest on automatic GPT detectors?

There are many out there, but I don't know about the "latest". GPT-4 itself says it has only a 10% chance of having been generated by an LLM.

These detectors are really unreliable. I've fed them content that I generated from GPT-4 and they never detect it as AI-generated.

I pity the students whose teachers will use them to detect plagiarism.


The problem is that you need to actually do training to detect AI text, and no one wants to spend money on that. The actual implementation is very easy:

1. get a corpus of real text

2. generate a corpus of AI text

3. train a model until it can tell the difference

The problem is step 2 is semi-expensive and step 3 is really expensive, so everyone is trying to shortcut the process, and of course it doesn't work.
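
A toy sketch of those three steps (hypothetical, at laughably small scale; the point above is precisely that doing this properly requires large corpora and real training budgets):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    human_texts = ["placeholder human-written sample"]  # step 1
    ai_texts = ["placeholder model-generated sample"]   # step 2

    texts = human_texts + ai_texts
    labels = [0] * len(human_texts) + [1] * len(ai_texts)

    detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
    detector.fit(texts, labels)                         # step 3
    print(detector.predict(["some new document"]))      # 0 = human, 1 = AI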


I found the images and animations at the bottom extremely descriptive. I've been needing a feature like this, and it's immediately intuitive for me.


Yes. But look at the bottom. There's an image with the PR review screen. There's one change:

* Normally, the big green button says "Merge pull request"

* Now, the big green button says "Merge when ready"

In a large project with lots of activity, a stampede of people pressing "Merge" at the same time will cause trouble. "Merge when ready" is supposed to solve this.

It seems to mean:

> "GH, please merge this, but take it slow. Re-run the tests a few extra times to be sure."


Here are the in-depth details on how it works. [1] Basically, each PR gets put in its own branch with the main branch + all the PRs ahead of it merged in. After tests pass, they are merged in order.

[1] https://docs.github.com/en/repositories/configuring-branches...


Aha, so GitHub merge queue = GitLab merge trains (or at least very similar).


Yes, that’s pretty much what it is. Both are replicas of bors: https://graydon.livejournal.com/186550.html


Bors is also very similar to the Zuul CI system used for OpenStack. It has the equivalent of a merge queue (with additional support for cross-repository dependencies): https://zuul-ci.org/docs/zuul/latest/gating.html You can then have pull requests from different repositories all serialized in the same queue, ensuring you don't break tests in any of the participating repositories.


Also continuous integration best practices advance one funeral at a time, it seems.


So does each new PR start new tests that will supersede the previous PR’s tests? If one PR’s tests fail, does it block all PRs behind it in the queue?

I’ve read the docs several times and never found them very clear about the details.


Each PR on the queue is tested with whatever commits it would have were it merged to the target branch in queue order. So if the target branch already has commit A and commits B and C are in queue, commit D will be tested on its own temporary branch with commits A B C and D. If the tests for C fail, C is removed from the queue, and D is retested with just commits A B and D (because that's what would be on the target branch by the time it merges).
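
In toy Python form (hypothetical, sequentialized for readability; the real queue speculates on these branches in parallel and retests after a failure):

    def ci_passes(branch):
        return "C" not in branch         # pretend commit C is the broken one

    target = ["A"]                       # already on the target branch
    for commit in ["B", "C", "D"]:
        candidate = target + [commit]    # the commit's temporary branch
        if ci_passes(candidate):
            target = candidate           # merges in queue order
        # else: removed from the queue; later entries retested without it

    print(target)                        # ['A', 'B', 'D']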


OK, thank you.


It's an announcement article that I think sells it pretty well. It's not product documentation.


Yeah, the embedded video helps a bit.


It's completely embarrassing; whatever marketing person wrote it needs to be got rid of.

> The result: your team can focus on the good stuff—write, submit, and commit. No tool sprawls here.

The good stuff? Tool sprawls? Is this written for teenagers?

> Merge queue is designed for high-performance teams where multiple users regularly commit to a single branch.

I think you meant "highly active". "High performance" means something else. But I can kind of see it emerging from your awful salesperson brain.


You can be critical without being unnecessarily harsh.


Not everything that is harsh is unnecessary.


In what world is it necessary to rant about PR copy and say someone should lose their job over quite literally doing what they were asked?

Who, exactly, is it necessary for? The original commenter getting their rocks off on insulting someone else’s job? Others coming in and laughing at someone insulting someone else? Critically necessary.

It’s not like the original article author is going to come in, see this comment, and reflect deeply on themselves and their work.


I admit I was harboring a small hope that someone from GitHub / Microsoft might see the criticism here (not just mine) and that it might help reduce the frequency with which that sort of salesperson tries to communicate with their market of software engineers. It was a bit unpleasant to suggest someone should lose their job. While they were presumably asked to write the piece, they were not asked to write it so tastelessly.


> While they were presumably asked to write the piece, they were not asked to write it so tastelessly.

So you sit next to them and therefore know what their assignment was and how well they executed on it?


If the marketing person that wrote this was told to write this, I don't see why he should be got rid of.


For doing such a bad job I guess. Marketing doesn't have to be written like the audience are teenagers. But yes it was a bit unpleasant to say that.


Or maybe "high-contention"


It is worth noting that, according to an Airbnb spokesperson cited in a (Dutch-language) article I read this morning, 90% of the listings that were removed had not been booked in the last year.

Paywalled Dutch language link: https://www.volkskrant.nl/nieuws-achtergrond/airbnb-raakt-dr...


Almost as if there was some global event preventing travel.


Non-paywalled Dutch language link: https://www.ad.nl/amsterdam/verhuurplatforms-door-registrati...

> Het gaat vooral om slapende advertenties van woningen die al een tijd niet verhuurd werden.

> It mainly concerns dormant listings of units that have not been rented out for a while.

The number of listings is also expected to increase again if the number of tourists bounces back up after Covid.

Of course, the linked article also mentions this, albeit without claiming it makes up a large part of the number:

> This may concern ‘dormant’ advertisements of which the tenant is no longer active. For example, because the corona pandemic has shut down tourism in Amsterdam for a long time.


I was confused as well. I think (from some other comments) that the issue is that the page can scroll to text that is specific to you and that others might not have on their page (e.g. 'cancer' on some medical page), and then somehow gather from the requests that you scrolled there. It seems pretty hypothetical, but I can see the issue with forcing a scroll depending on what text is on the page, I guess.


And yet you provide none. What do you base this on?


It isn't; I don't understand where this comment comes from. You can derive the entire thing without writing a line of parsing code.


Haskell type classes by default pick the implementation based on a single type variable in its signature, but it can occur in multiple positions (including the return type). A commonly used language extension (multi parameter type classes) extends this to an arbitrary number of type variables.
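
A small illustration in Haskell itself (hypothetical class and instance names):

    {-# LANGUAGE MultiParamTypeClasses #-}

    -- Single-parameter dispatch: `def`'s instance is picked purely from
    -- the *return* type at the use site.
    class Default a where
      def :: a

    instance Default Int  where def = 0
    instance Default Bool where def = False

    -- With MultiParamTypeClasses, the instance is chosen from the
    -- combination of both type variables.
    class Convert a b where
      convert :: a -> b

    instance Convert Int Bool where
      convert = (/= 0)

    main :: IO ()
    main = print (convert (def :: Int) :: Bool)  -- prints False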


The thing is, you're focusing on the point where you've already detected that there is an issue (a crash). A lot of the issues with NULL stem from the fact that you can't easily detect it beforehand. It's not indicated in the types, or the syntax. That means it's incredibly easy for a NULL issue to sneak into an uncommon branch or scenario, only to be hit in production.


But why was the scenario not tested before production? Should that not be the case anyway?


You can never guarantee you really have 100% test coverage in all scenarios in complex software.


Indeed. It gets asymptotically more expensive. Whereas a typechecker is a system of tests that's able to "cover" 100% of the code.


It’s cute that you think tests find all problems.


Ctrl+V for making and operating on block selections.


Interesting, it doesn't seem to work here because my employer's VPN blocks the torrents.


To add to that, there's also the fact that any user of any ISP that throttles/blocks torrent traffic is probably going to have the same experience. Plus, there are at least two people I know whose ISPs actually cut users' download speed by double their upload speed (in order to "passively" cut down on torrenting without actually restricting it). So PeerTube becomes less a YouTube replacement and more an interesting implementation of streaming video via BitTorrent (which some clients already support).


I aliased 'g' to 'git'. Combined with git aliases ('st' for 'status' etc) this is almost as short but doesn't steal all the short names from your shell. So instead of 'gs' you have 'g st'. Upside is you save two letters even on uncommon git commands :)
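
Concretely, the setup is something like this (the 'co' alias is just an extra illustration beyond the 'st' example above):

    # shell side (e.g. ~/.bashrc):
    alias g=git

    # git side:
    git config --global alias.st status
    git config --global alias.co checkout

    # now `g st` runs `git status`, and `g rebase` still saves two letters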

