I would love to see a chart of traffic to other sites when GitHub goes down. My bet is that HackerNews and Twitter both get significant spikes from all those bored developers.
Bored? Git's a distributed version control system, so no excuses. Get back to work!
But in all seriousness I kind of wish GitHub provided a way to mirror things like issues and PRs so you never have to be fully reliant on one service. Not being able to read these really does make it impossible to get work done offline.
Go has a mirror of our GitHub project via this thing "Maintner" I wrote (running at http://maintner.golang.org/) that syncs GitHub in realtime to a log of mutations. (As well as syncing Gerrit and all its comments etc).
So then we can slurp all of our GitHub & Gerrit history into RAM (takes about 5 seconds and 500 MB) via https://godoc.org/golang.org/x/build/maintner/godata#Get and walk it in-memory and do stuff with it. (It runs our realtime bots, alternate web UIs on planes, alternate search, stats, etc.)
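For anyone curious, here is a rough sketch of what walking that corpus looks like. The method and field names are from memory, so treat the godoc link above as authoritative rather than this snippet:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"golang.org/x/build/maintner"
	"golang.org/x/build/maintner/godata"
)

func main() {
	// Download (or reuse the local cache of) the mutation log and
	// materialize the whole GitHub + Gerrit history in memory.
	corpus, err := godata.Get(context.Background())
	if err != nil {
		log.Fatal(err)
	}

	// Walk every still-open issue in golang/go, entirely from RAM,
	// with no network needed once the log is cached.
	repo := corpus.GitHub().Repo("golang", "go")
	err = repo.ForeachIssue(func(issue *maintner.GitHubIssue) error {
		if !issue.Closed {
			fmt.Printf("#%d %s\n", issue.Number, issue.Title)
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```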
> Maintner is short for "Maintainer"...the name of the daemon that serves the maintner data to other tools is "maintnerd".
Nice work! But as is custom on HN, I'll bikeshed on the name instead of delving into the contents of the tool. Why shorten the word by just 2 letters? Is there something special about the tooling that makes 8-letter projects more desirable than 10-letter projects, or is it linked to the removal of Artificial Intelligence from the process? I'd personally mis-type that name all the time.
An inspiring new word invented by a redditor on March 23, 2014. The redditor was actually trying to spell exaggerated. This word was misspelled so horribly that when it was googled the only thing that came up was a link to the comment itself.
I keep meaning to dig into Fossil (SQLite's VCS/Project Management system), but I have no faith I could convince a team to use it.
Another model to look at is Trac, which had pretty extensive integration with SVN and integrated (ie, cross-linked) issue tracking and wiki, and stored all the data and change history in a svn repository.
> I keep meaning to dig into Fossil, but I have no faith I could convince a team to use it.
Developer of Fossil and SQLite here: I agree. In my experience, you'd have better luck convincing the team to switch from vi to emacs. For all its many and well-documented faults, the Git/GitHub paradigm is what people want to use because it is what they are familiar with.
All the same, I intend to keep right on using Fossil, thank you very much!
So here is the idea I've been thinking of lately: What if Fossil were ported or enhanced to use Git's low-level file-formats so that unmodified git clients could seamlessly push and pull against the (enhanced) Fossil server. Call the new system "Fit" (Fossil+Git). Using Fit, you could stand up a GitHub replacement for an individual project in 5 minutes using nothing more than a 2-line CGI script. Git fan-boys could continue to use their preferred interface, while others who prefer a more rational and user-friendly design could use the Fossil-like "fit" command. Everybody could share code, and everybody would have a nice web-based interface with which to collaborate with tickets and wiki and all the other cool (and to my mind essential) stuff that Fossil provides. And nobody who already knows git would be forced to learn a new command-line interface.
I'd be all over writing the code for "Fit", except that I'm already over-extended. Anybody who thinks this is a good idea and would like to pitch in and collaborate, please contact me privately. Thanks.
That sounds like a great idea. By leveraging Fossil's existing UI and getting it to work with Git, it might really gain a lot of traction, and become a viable alternative to roll-your-own-github services like GitLab Community Edition.
To be attractive to github users it would have to have a better issue tracker and a better patch review system. Given that sqlite uses a mailing list for both I'm guessing it doesn't have that.
My guess is their sales pitch would be for you to run your own instance of Github Enterprise or doing something fancy with Web Hooks. At work we did the former for a while before switching to Gitlab. We had less downtime than Github (not that they have much) but since the ops is on you in that situation, YMMV.
How about storing issues and PRs in the actual git repository? They should index them for the UI, sure, but the source of truth should be a branch of the git repo just like it is with gh-pages. It should be possible to file a new issue by committing a markdown file to the correct branch and pushing it to Github. Their hub command line tool and their own client could facilitate adding all the correct metadata. This would allow people to work while Github.com is offline and synchronize everything once the service outage is over.
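As a sketch of how little client tooling that would take (this is a purely hypothetical convention, not anything GitHub supports today), filing an issue could be nothing more than writing a markdown file and committing it to an "issues" branch:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// git runs one git command in the current repo, failing loudly on error.
func git(args ...string) {
	cmd := exec.Command("git", args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "git", args, "failed:", err)
		os.Exit(1)
	}
}

func main() {
	// Hypothetical convention: issues live on an "issues" branch,
	// one markdown file per issue.
	name := fmt.Sprintf("issues/%d-build-fails-on-freebsd.md", time.Now().Unix())
	body := "# Build fails on FreeBSD\n\nSteps to reproduce: ...\n"

	git("checkout", "issues")
	if err := os.MkdirAll("issues", 0o755); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := os.WriteFile(name, []byte(body), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	git("add", name)
	git("commit", "-m", "issue: build fails on FreeBSD")
	// This push can simply be retried once the outage is over.
	git("push", "origin", "issues")
}
```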
If you have access to a distributed database that is already the best available for manual conflict resolution, why wouldn't you want to use it to store this kind of data? A Github outage is like a partition when you think about it from a distributed db perspective, so treat it like an individual node died and still allow reads/writes to the rest of the cluster that can be merged back once the node comes back online.
> How about storing issues and PRs in the actual git repository?
Access to the git repository is regulated; issues and PRs aren’t. Anybody can file an issue/PR on your repo, but only you (and your team) can modify the repo. You’d need to also store all the comments on all issues and PRs, even closed/rejected ones. In some repositories, that’d be huge.
> And the amount of actual data needed to store issues is peanuts anyway.
It’s not. Take something like github.com/Homebrew/homebrew-core. There are 15k closed PRs there. There have been 20 new ones today. It’s not rare to have 10-20 comments per PRs; some of them even go over 200. Add CI status, actual PR contents (git patches), comments reactions, edits, labels, milestones, assignees, reviews, projects.
IMHO having a tool to fetch issues locally is a good idea; storing them in the repo is not.
From the perspective of someone mid code review during the outage, it's worked as well as it could too. They preserve comment drafts client side that haven't posted to the server yet across page reloads. While it's a little frustrating to have to submit something 3-5x, they're definitely doing something for this use case already.
Also as others pointed out, just because the GitHub app is down doesn't always mean the GitHub git server itself is down.
It would be clever to store issues and PRs on a special branch in the repository. I can't currently think of why this wouldn't work (apart from business-wise, it would reduce lock-in considerably).
Well, it's unlikely to be feasible due to Git's on disk structure.
With git, every commit has an sha checksum of the commit contents & metadata. And each commit's metadata points to the previous commit.
So, if a maintainer wanted to fix a typo or make any other kind of correction/update to an old issue, you'd need to rewrite every commit on disk since that one with the updated chain of checksums, e.g. from the fixed comment to the head of the special issue/PR branch.
That alone would probably make syncing with the repos a real pain for anyone working on medium to larger sized projects. ;)
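To make the hash chaining concrete, this is roughly how a commit ID gets computed (a simplified sketch with placeholder hashes, but the "header + body" hashing is how git actually identifies objects):

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// gitObjectID computes the SHA-1 git assigns to an object: it hashes a
// "<type> <size>\x00" header followed by the object body.
func gitObjectID(objType string, body []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "%s %d\x00", objType, len(body))
	h.Write(body)
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// A commit body names its tree and its parent by hash (both placeholder
	// IDs here), so a commit's own ID depends, transitively, on everything
	// behind it. Rewrite an old commit and every descendant ID changes too.
	body := []byte("tree 0155eb4229851634a0f03eb265b69f5a2d56f341\n" +
		"parent 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n" +
		"author Alice <alice@example.com> 1500000000 +0000\n" +
		"committer Alice <alice@example.com> 1500000000 +0000\n\n" +
		"fix typo in an old issue file\n")
	fmt.Println(gitObjectID("commit", body))
}
```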
Your objection is predicated on the assumption that issues would need to be edited "for all time". That's not true for anything else in git, I'm not sure why that would be true for issues. I mean, if you had an "issues/" directory in your repo, and files named "issue1" "issue2" and so on, you would just edit the issue whenever. If someone made a branch for the issue, then they'd want to merge your edit into their branch ASAP - which is good, because it indicates that they read and understood the change. And this merge would be particularly painless since changes to "issues/" are absolutely not going to conflict with anything in "src/".
I agree... I wish GitLab would do this too. No reason that everything can't be modeled as a git branch.
GitLab I know models their Wiki as a git repo. I think issues and PRs should be their own repo as well.... a branch for each issue or pr? Could tie it all up using submodules (or not, whatever)... just please someone take the jump first and do this.
DISCLAIMER: I work for GitLab.
We have a feature to replicate GitLab to different locations (EEP). We call it "Geo", which stands for "Geographical Replication". Part of the "Geo" effort is Disaster Recovery, which is under heavy development. With Disaster Recovery we want to be able to reliably promote any secondary node to a primary, so if your US datacenter melts down under a nuclear war, you can start working with your copy in the EU, India, China, etc.
Or you could just read disaster porn^W^W post-apocalyptic sci fi.
So, a lot depends on what kinds of info you want to find. But overall, I'd say it's not worth reading, because life's not going to be much fun. Even if you're prepared.
(only partially serious here) I'm not sure recovery from a nuclear event that wipes out US data centers is economically feasible. If an event were to effectively wipe out 20% of the world's economy, I think having my GitHub issues online is the least of my worries.
If it's not Git based then I'm not interested in it.
I see no reason to have a RDBMS for issues, merge requests, pull requests, etc. All of these could be modeled in multiple ways using a Git database. Issues are files in a repo called issue1.md, issue2.md ... or issues are branches on a repo ... or something else.
But they want you to rely on GitHub... if they could find a way to make Git only work with GitHub, they would in a heartbeat.
GitHub is not an open source company; it is not really even a supporter of free software. IMO they are a danger to free software for this very reason. They are following the old-school Microsoft model of Embrace and Extend... I am waiting to see if they can Extinguish.
GitHub clearly contributes a fair bit to open source - not only their own projects, but the free hosting for open source projects. You have no basis for this accusation, and frankly we are all stupider for having read it.
My first claim was that GitHub is not an open source company, and they are not. Their core business is not open source; yes, they have some side projects that are, but so does MS, and I do not believe anyone is going to say MS is an "open source company".
The second claim I made is that they are a threat to free software. I used "free software" here for a reason: free software and open source are different.
GitHub may support some Open Source things, they do NOT support free software.
GitHub's general actions over the years show they do not support free software at all, and have only limited support for Open Source software.
They are classic Free Software leeches, using and abusing free software not for the ethical reasons around free software but to advance their profit center.
See, I support free software for ethical reasons, not monetary reasons.
That is without even getting into the MASSIVE issue of having all of these open source projects on a SINGLE platform. Unless you think GitHub is "too big to fail", which I can assure you it is not.
GitLab Enterprise Edition does not have an open source license. Since GitLab.com runs GitLab EE, their two (presumably) primary sources of revenue (EE licenses and GitLab.com paid accounts) come from non-open source software.
But: it's still fair to call GitLab as a company way more Open Source than GitHub. All of their development happens out in the open, the vast majority of their codebase is Open Source, and the source code for GitLab EE is even made available.
> if they could find a way to make Git only work with GitHub, they would in a heartbeat.
Sorry, but this is unfair :) I remember the early days of GitLab, before they added all the insanely cool features they have now, when they were just an open-source GitHub copycat. My thought at the time was: "wow, GitHub is super cool to let them go. At the very least, the almost exact design copy could have been a legal problem". Clearly, we would have heard about GitHub vs. GitLab back then if GitHub had wanted to lock people in.
Well, if we "heard about GitHub vs. Gitlab" more people outside HN might hear about GitLab. Might be more dangerous, than the chances of a win shutting GitLab down.
It is partly the MS EEE model and partly the Adobe model, where they give away free or low-cost services to get individuals hooked, often at a young age; then, when those individuals are employed at larger firms, they push the firms to adopt that software internally.
Get a bunch of Open Source Developers to do your marketing at their "real jobs" so they can sell the Enterprise Version.
Since GitHub is a private company, it is unknown (at least I cannot find the info) whether they are profitable, or what their revenue numbers even are, so it is unclear whether that plan is successful.
GitHub could very well run into the same problems as SoundCloud.
That's our chart of backend sessions / sec for GitLab.com [1].
There's also a chart for frontend sessions [2], please keep in mind that there's a lot of CI / bot sessions on these dashboards. In addition to that there's much other interesting info on our HAProxy Grafana dashboard [3]. In case anyone's interested in even more metrics, there's a bunch of interesting dashboards on our Grafana instance [4].
There are also plenty of DIY distributed issue-tracker tools that keep issue tracking inside a git repo (often as yaml+markdown). I'm still surprised a good one hasn't stepped up with direct GitHub issue/project tool integration+sync, but that should certainly be a possibility.
Sorry to hear you're having trouble scaling GitLab. We have many organizations running GitLab and successfully scaling to tens of thousands of users, and GitLab.com, which is the largest GitLab installation. As a GitLab Enterprise customer, our support team is happy to help you review your scaling problems and resolve them. Please submit a support request at https://support.gitlab.com.
Oh crap, I meant to say GitHub Enterprise. That's the one having trouble. We also have GitLab Enterprise and that's fine. Sadly, for internal reasons we are mostly stuck with GitHub Enterprise.
I'm glad to hear that GitLab is working fine for you. And I'm sorry that you feel stuck with GitHub Enterprise. Please know that GitLab has a high fidelity importer https://docs.gitlab.com/ee/workflow/importing/import_project... and our support team has a lot of experience with assisting people with the migration if the migration is the problem.
When I break our GitHub webhooks, I joke it's time for people to practice our Disaster Recovery (DR) procedures. In all seriousness, this is a good opportunity to practice work without GitHub. Any service can go down; can you deploy a critical bug fix without it? If not, why not and what can you do to fix it?
To the best of my knowledge, GitHub org and usernames in the URI are case-insensitive for both the website and clone URIs. I haven't tested ssh clone URIs to know if they are also insensitive, but I'd guess they are.
You would only need to change it if the "presentation" format bothers you (again, as far as I know)
Is there, at least theoretically, a way to prevent other people from pushing to my repo? That seems like it would invite griefing for any project that might become even mildly politically sensitive for whatever reason.
That states: "git-ssb's permissionless model has an interesting consequence: anybody can push to anybody else's git repository." The guide doesn't show any key-sharing in order to do that. Are you saying that's incorrect?
Marak, it sounds like you are working with private repos. With git-ssb currently a repo is either public or private. Private repos are encrypted to a fixed set of recipients so only those keyholders can access it. Public repos are unencrypted.
Yes. Basically, in a centralized permission model some authority (the database) decides whether any write is authorized or not, but in a decentralized one, any peer just writes anything, and then the readers decide whether they interpret it as valid or not.
Here is a description of a model that combines anyone-can-edit with degrees of consensus on who is allowed to edit.
http://viewer.scuttlebot.io/%25GKmZNjjB3voORbvg8Jm4Jy2r0tvJj...
But if you decide that someone cannot edit it, from their perspective they still can; they are just excluded from your perspective.
Your comment is entirely nonsensical in this context. I want a way to be able to publish a repo and have the people subscribing to the repo be able to only pay attention to my changes in an automated way. This software currently doesn't implement that, as far as the guide that was linked suggests. It is therefore utterly useless - every time someone decides to grief my repo, it requires manual intervention to resolve.
Once you have that very, very basic ability to replicate what people expect when they subscribe to a person's git repository, you can start playing with automatically merging together people's changes - but in practice, merge conflicts are a thing and there's no good way to resolve them. If you can come up with a way to automatically resolve merge conflicts, you'd be rich, frankly speaking.
you said
> Is there, at least theoretically, a way to prevent other people from pushing to my repo?
so I answered, _yes, theoretically_ we have ideas for how to implement that. You can also unfollow and block griefers, but so far pretty much everyone has been nice and we just haven't needed to implement that yet.
How do you intend to automatically resolve merge conflicts, which is what your document suggests you want to do?
Why is this not resolved by a good permissions model and the ability to fork? Why should my users have to care about blocking griefers when they just want to pull from the repo?
Guessing it's an absolute failure rate previously not encountered by the current graphing tool, or an incorrect scaling factor. Or it was decided at some point that always reporting a 0.000XXX% failure rate, even if correct, didn't offer an intuitive metric, so zeroes were intentionally truncated.
#1) communications (issues, docs, a project's web "landing page"), #2) hosting, and #3) distributed source code merges based on content hashes (SHA1) instead of the centralized lock/unlock (check-in / check-out) model of CVS/SVN.
Git itself only takes care of #3.
Github handles #1 and #2 (and also gets #3 by being built on top of Git).
You can't go back in time and wonder if Linus should have addressed #1 and #2 because he wasn't interested in starting a hosting company. Instead, he focused on the data format (Merkle trees, BLOBs, SHA1) and a sync protocol (git pull, etc) for Git.
If people wonder why we can't just use email for #1 (communications), you have to see that Github has become a "Schelling Point"[1]. Attempting to use email groups & mailing lists will not prevent the emergence of a Schelling point. Email can be a workflow for existing contributors (e.g. contributors of the Linux kernel source) but it's not convenient for discovery of new repositories (e.g. the web's "landing page" of a repo).
As for #2 (hosting), not everybody who wants to share a repository wants to pay $9.99/month VPS or other hosting plan from a web hosting provider. It would also be inconvenient to host it from the home laptop and punch a hole through the ISP router to make it work. Github solves hosting+bandwidth for free for modest non-commercial projects.
To restate, Linus' Git is a distributed _protocol_ but Github is a _service_ acting as a platform for the distributed protocol.
#1 can be done by creating a separate submodule repo that only stores docs+issues files. It's up to the repo's users to agree on a system by which the files should be organized, but it's doable.
I'd propose directories "issues/open" and "issues/closed", with each issue filename being "{created:yyyy-MM-dd} - {subject}.md". Symlinks could be used to track ownership/responsibility if each repo contributor also has their own directory in the repo.
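For example, filing and closing an issue under that convention would be nothing more than writing and renaming files (a sketch only; the layout and names are just the proposal above):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// issuePath follows the proposed convention:
// issues/<state>/<created yyyy-MM-dd> - <subject>.md
func issuePath(state, subject string, created time.Time) string {
	name := fmt.Sprintf("%s - %s.md", created.Format("2006-01-02"), subject)
	return filepath.Join("issues", state, name)
}

func main() {
	created := time.Now()
	subject := "login page 500s on empty password"

	open := issuePath("open", subject, created)
	closed := issuePath("closed", subject, created)

	// Filing an issue is writing a file...
	if err := os.MkdirAll(filepath.Dir(open), 0o755); err != nil {
		panic(err)
	}
	if err := os.WriteFile(open, []byte("Steps to reproduce: ...\n"), 0o644); err != nil {
		panic(err)
	}

	// ...and closing it is a rename, which git records (and merges) like
	// any other ordinary change.
	if err := os.MkdirAll(filepath.Dir(closed), 0o755); err != nil {
		panic(err)
	}
	if err := os.Rename(open, closed); err != nil {
		panic(err)
	}
}
```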
It doesn't have to be a "crappy manual system" - I'm simply suggesting that given that git itself is a damned good distributed versioning database for arbitrary content, then we might as well also use it for distributed issue-tracking. A simple offline-mode browser-based editor that lives in a single HTML file within the repo would provide a nice GUI on top.
Hmm, I think I might be on to something... anyone want to start a project?
>git itself is a damned good distributed versioning database for arbitrary content, then we might as well also use it for distributed issue-tracking.
For what it's worth, it's interesting to see that the Fossil distributed SCM includes an issue tracker but they made a deliberate architecture decision to not propagate the tickets data.[1] They had a chance to make your "distributed-issues-tracking" idea a 1st-class concept in Fossil but decided against it.
Also, the issues/tickets feature is just one example. GitHub will continue to evolve, adding more and more sophisticated SDLC/ALM (application lifecycle management) features, like JIRA and Microsoft Team Foundation Server.
Those features are not easy to implement in a peer-2-peer SCM with practical usability.
Thank you for the link. I read through their justifications and I think using a git-submodule solves their problems of polluting the main project history and permissions issue. Using directories for mutually-exclusive state grouping (e.g. "closed"/"open"/"new") solves the directory problem.
The reason is github did a fantastic job of implementing useful features. The visual design is unmatched and they have done a great job implementing developer oriented integrations and social features.
A more federated approach to this sort of thing might have been nice, but so far nothing I have seen comes close to the value-add offered by github.
You're sidestepping the main reason I believe it worked so well. It benefits from network effect. It is a collaborative tool and people like to have their work on there so others can collaborate with them.
That's important but I think people over-estimate it. It's both. I predict that if you analyzed the github network, you'd find many hubs are based around companies that chose to move their workflows to github based on features other than network effects. Or at least, the existing network was only one of many reasons.
As a business, we (and most other companies I know) chose GitHub for features and performance. It's nice that other open source stuff is there, but it doesn't matter for what we pay for.
Lots of answers, so I'll try to address them all here. It was a rhetorical question. We know what GitHub offers. I'm not a fan of its UI, particularly on mobile, but that's beside the point.
The point is: why did we fail, once more, to have a distributed solution, even when the underlying tech assumes one?
Email was the last widely successful distributed medium. And it's dying, unfortunately.
Of course centralized services are easier to implement and use. Doesn't mean we should settle.
Ok, that's not really the same thing but either way, what's the point of having a decentralized project management system?
Code is one thing which clearly allows for many benefits in having the entire local history but pushes more work towards the merging stage. When it comes to issues and discussions, it's often much easier to have a single source of truth without worrying about merge conflicts.
And the issue with github being down isn't an issue of centralization as much as it is about availability of a service. You're free to use github enterprise or gitlab and host the service yourself if you feel you'll get better reliability and performance, however I'm pretty sure you won't beat github's overall without significant investment of time and resources.
Perhaps having a simple read-only offline cache of the latest project management state is a good middle-ground for most of the problem and it shouldn't be that hard to do - but again that's up to how much demand there really is for it.
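For instance, a bare-bones version of that cache is just a scheduled fetch of GitHub's public REST API (sketch only: no auth, no pagination, first page of issues only):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// cacheIssues grabs the first page of issues for a repo from GitHub's public
// REST API and stores the raw JSON locally, so the latest project state can
// still be read when github.com itself is down.
func cacheIssues(owner, repo string) error {
	url := fmt.Sprintf("https://api.github.com/repos/%s/%s/issues?state=all", owner, repo)
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("GET %s: %s", url, resp.Status)
	}

	out, err := os.Create(fmt.Sprintf("%s-%s-issues.json", owner, repo))
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, resp.Body)
	return err
}

func main() {
	if err := cacheIssues("golang", "go"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```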
Looking at the status graphs, it seems like there was some clearly anomalous data starting around midnight, about 9 hours before the actual outage "began". Maybe a gradual botnet ramp-up, and 9:27 AM is when it got bad enough to overload some critical service? (Or really any other threshold-based failure scenario.)
They seemed to be suffering some external attacks / DDoS but I never saw a post-mortem from them on it, hopefully one is forthcoming (or maybe is out but I missed it)
Pages are static sites served from separate infra from the main github.com from what I've heard, so you should be fine in almost all cases. I've never had one go down unless I pushed a dud build.
... It may surprise some HN readers, but AWS outages aren't the only reason other sites go down. Having multi-AZ just means you're resilient to a localised, single-AZ AWS failure. It doesn't help for the 1000 other ways your service could go down.
An ex-github employee who used to give talks on Github scaling challenges revealed they shifted from Rackspace.
See yourself - https://github.com/holman/ama/issues/553
The problem is not the files, it's the project management stuff (issues and PR tracking). That is a SPOF. If they could decentralize that, then nobody would care if they are down half the time.
They can - it's called Github Enterprise (or if you're a medium to large corporation, JIRA). I thought occasional downtime was just something people factored into free services. You can always file your issues later, or use a chat service (or, heaven forbid, email) if there's an immediate need for feedback.
I don't see why you would expect them to. It's nice and all, but they're under no obligation to tell the world everything that happened every time they have a minor outage. Not every outage is newsworthy or even that interesting, and oftentimes they're little more than a PR piece for the company.
A lot of package manager tools download their packages from Github. If you happened to be refreshing your dependencies at the time, or doing a clean build, then you'd be SOL.
Homebrew I believe goes through GitHub. Many of the Vim plugin managers also do, or at least have the option. I think CocoaPods and Carthage, both for iOS development, do as well.
I think it's somewhat common for a new package manager to use Github as a kind of CDN for a while, until they get big enough they can do their own.
In the face of a lack of information, HN commenters throw around unfounded speculation and tongue-in-cheek jokes run rampant. I suppose that in the absence of information, many stay silent, and the rest see a thread lacking comments.
You're still missing the point, by assuming that the outages are code related when you call them avoidable. Distributed systems at scale are terrifyingly hard to control.
When I said "probably a fairly rare one", I didn't mean that the outages are rare. I meant that new code is probably a rare cause of the outages that happen. They have other causes unrelated to new code.
(I'm also skeptical that GitHub major outages happen "nearly weekly", but I don't have data.)
I'm not "missing" anything. I worked at Google for 7 years much of which was spent working on, you guessed it, distributed systems infrastructure. You guard against this by carefully canarying things and putting robust testing, monitoring, and deployment procedures in place. A release might take a few days, but you can be reasonably certain your users won't be your guinea pigs, and if shit does hit the fan, rollback is easy, and you can reroute traffic elsewhere while you roll back. Most of the time no rollback is needed: you just flip a flag and do a rolling restart on the job in Borg. For some types of outages (most of which users never even see) Google has bots that calculate monetary loss. And the figures can be quite staggering and motivating, so people do postmortems and try their best to make sure the outages don't happen again.
Gmail is several orders of magnitude larger than Github will ever be, and in recent memory I can only recall it being down once, and for a very small subset of users.
I'm not even assuming this outage was caused by a "change". There are DoS attacks, infrastructure/network outages, storage pool problems.. immediately assuming that someone pushed some code and it broke things seems like an extremely short-sighted view on how production systems fail.
It's called continuous delivery, man. Merge to master and it goes out the door automatically, even if it's busted to hell. All the cool kids are doing it.
Shitty tests? Lack of integration tests? Lack of test coverage for a particular scenario? Test environment that does not represent the config that's deployed in production? There are literally hundreds of reasons why things like that could happen. Depending on the system in question you could have an isolated environment on which you replay prod traffic before cutting a new release and then investigate failures. All new features are engineered so that they could be easily turned off using flags. Once that's done you could canary your release to, say, 5% of your user base (at Google it'd be in the least loaded Borg cell), and if something goes wrong, you quickly disable busted features in response to monitoring and alerts. You let that stew for a while, then deploy your release worldwide, and start working on the next one.
Let me get this straight... So the advantage of CI/CD is not automatic testing + rollouts, but rather the removal of a test requirement? Why waste resources on CI/CD in that case? Just remove the test requirements and deploy. In fact, remove canary as well - the more traffic that hits the broken release, the faster it will become obvious that the release is broken.
The above was sarcasm.
If your org has CD and it does not have CI validation step that provides 100% test coverage of your code, then you do not have a CD - you have MSP - Merge->Ship->Pray system in place.
I'm not sure how you got that out of what I wrote. If anything, it's a recognition that unit tests alone are never sufficient, and _drastic increase_ of testing effort, _in addition_ to CI. Humans are in the loop. Testing is near exhaustive, because it happens in an environment nearly identical to prod, with a copy of prod traffic. Users just don't see the results. Compare that to just "push and pray" approach of a typical CD setup.
I'm sorry, if you are adding a new feature, then no old features could break without unit and integration tests indicating brokenness.
What one should do is push the new feature dark, i.e. not enabled for anyone except the automated test-suite user. That's the user that should exercise all the old paths and validate that no old path is broken while the feature is deployed but unused. After that is validated in production, one can enable the feature for either a percentage of users or 100% of users, depending on how one plans to live test and live QA the feature.
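A toy sketch of what that kind of dark launch plus gradual ramp-up could look like (all names here are made up for illustration):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// rolloutPercent is the share of real users who see the new feature.
// 0 means a dark launch: only the synthetic test-suite user gets it.
var rolloutPercent uint32 = 0

// testSuiteUser is a made-up name for the automated user that exercises
// the old code paths while the new feature is deployed but unused.
const testSuiteUser = "release-qualification-bot"

func featureEnabled(userID string) bool {
	if userID == testSuiteUser {
		return true
	}
	// Hash the user ID into a stable 0-99 bucket so the same users stay
	// enrolled as the percentage is ramped from 0 to 100.
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32()%100 < rolloutPercent
}

func main() {
	fmt.Println(featureEnabled(testSuiteUser)) // true even while dark
	fmt.Println(featureEnabled("alice"))       // false until the rollout reaches her bucket
}
```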
The important part is that no new release can break existing usage patterns.
That's CI/CD. Everything else is magic, unicorns and rainbows.
The crucial difference is that CD postulates that if a change passes your automated test suite, it's good enough to immediately go live. I've dealt with many complex systems, some with pretty exhaustive test coverage, and this wasn't true in any of them. Unless your service is completely brain-dead simple (which none of them are once you have to scale), you always need a human looking at the release and driving it, and turning things off if tests missed bugs.
And last I checked their jobs aren't, in fact, automated. They just moved to the various cloud providers and were renamed to "SRE", with a major boost in pay.