GitHub was down (status.github.com)
395 points by bithavoc on July 31, 2017 | 241 comments



I would love to see a chart of traffic to other sites when GitHub goes down. My bet is that HackerNews and Twitter both get significant spikes from all those bored developers.


> bored developers

Bored? Git's a distributed version control system, so no excuses. Get back to work!

But in all seriousness I kind of wish GitHub provided a way to mirror things like issues and PRs so you never have to be fully reliant on one service. Not being able to read these really does make it impossible to get work done offline.


Go has a mirror of our GitHub project via this thing "Maintner" I wrote (running at http://maintner.golang.org/) that syncs GitHub in realtime to a log of mutations. (As well as syncing Gerrit and all its comments etc).

So then we can slurp all of our GitHub & Gerrit history into RAM (takes about 5 seconds and 500 MB) via https://godoc.org/golang.org/x/build/maintner/godata#Get and walk it in-memory and do stuff with it. (runs our realtime bots, alternate web UIs on planes, alternate search, stats, etc.)
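For anyone curious about the mutation-log pattern described above, here is a toy Python illustration (not maintner's actual Go API; all names are made up): in-memory state is rebuilt by replaying an append-only log of mutations.

```python
from dataclasses import dataclass, field

@dataclass
class Issue:
    number: int
    title: str = ""
    closed: bool = False
    comments: list = field(default_factory=list)

class Corpus:
    """Toy in-memory state rebuilt by replaying an append-only mutation log."""
    def __init__(self):
        self.issues = {}

    def apply(self, mutation):
        # Each mutation is a plain dict describing one change. Replaying
        # the whole log from scratch reproduces the current state, which
        # is why a fresh process can "slurp" everything into RAM quickly.
        kind, n = mutation["kind"], mutation["issue"]
        issue = self.issues.setdefault(n, Issue(number=n))
        if kind == "create":
            issue.title = mutation["title"]
        elif kind == "comment":
            issue.comments.append(mutation["body"])
        elif kind == "close":
            issue.closed = True

log = [
    {"kind": "create", "issue": 1, "title": "build broken on plan9"},
    {"kind": "comment", "issue": 1, "body": "cannot reproduce"},
    {"kind": "close", "issue": 1},
]
corpus = Corpus()
for m in log:
    corpus.apply(m)
```

Because the log is append-only, the same replay also works offline against a local copy, which is what makes the "alternate web UIs on planes" use case possible.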


> Maintner is short for "Maintainer"...the name of the daemon that serves the maintner data to other tools is "maintnerd".

Nice work! But as is custom on HN, I'll bikeshed on the name instead of delving into the contents of the tool. Why shorten the word by just 2 letters? Is there something special about the tooling that makes 8-letter projects more desirable than 10-letter projects, or is it linked to the removal of Artificial Intelligence from the process? I'd personally mis-type that name all the time.


I'd prefer it just for searchability. Good luck finding anything online called "maintainer".


I think he was excgarating the issue a bit. Your logic is sound


> excgarating

I commit to re-use this as often as possible!

http://www.urbandictionary.com/define.php?term=Excgarated

An inspiring new word invented by a redditor on March 23, 2014. The redditor was actually trying to spell exaggerated. This word was misspelled so horribly that when it was googled the only thing that came up was a link to the comment itself.


Glad someone caught it :)


ahh, a googlewhack


Should have called it Matenerd, shorter and much better name


I think it's a Go joke. Just be glad it's not called "m"!


Oops, instructions not clear, accidentally created another suite of build tooling called "gb"


> alternate web UIs on planes

Could you expand on this? Sounds interesting. You mean like an offline UI powered by the in-memory mutation list?


I think they just meant lighter. GitHub's pages load quite a bit, and on really high-latency or slow connections they're painful to use.


> have to be fully reliant on one service

My inner cynic says that that's probably the reason why they don't provide this service.


I keep meaning to dig into Fossil (SQLite's VCS/Project Management system), but I have no faith I could convince a team to use it.

Another model to look at is Trac, which had pretty extensive integration with SVN and integrated (ie, cross-linked) issue tracking and wiki, and stored all the data and change history in a svn repository.


> I keep meaning to dig into Fossil, but I have no faith I could convince a team to use it.

Developer of Fossil and SQLite here: I agree. In my experience, you'd have better luck convincing the team to switch from vi to emacs. For all its many and well-documented faults, the Git/GitHub paradigm is what people want to use because it is what they are familiar with.

All the same, I intend to keep right on using Fossil, thank you very much!

So here is the idea I've been thinking of lately: What if Fossil were ported or enhanced to use Git's low-level file-formats so that unmodified git clients could seamlessly push and pull against the (enhanced) Fossil server. Call the new system "Fit" (Fossil+Git). Using Fit, you could stand up a GitHub replacement for an individual project in 5 minutes using nothing more than a 2-line CGI script. Git fan-boys could continue to use their preferred interface, while others who prefer a more rational and user-friendly design could use the Fossil-like "fit" command. Everybody could share code, and everybody would have a nice web-based interface with which to collaborate with tickets and wiki and all the other cool (and to my mind essential) stuff that Fossil provides. And nobody who already knows git would be forced to learn a new command-line interface.

I'd be all over writing the code for "Fit", except that I'm already over-extended. Anybody who thinks this is a good idea and would like to pitch in and collaborate, please contact me privately. Thanks.


The pushback will come in two forms.

First, a backend that works exactly like git is very different from one that works almost like git. You'll be blamed for other people's problems.

Second, you'll have to have a way to deal with commit history the way git does (arbitrarily, and able to be rewritten at any time).

Otherwise the first push -f or rebase breaks the whole thing.

Reading from and writing to a git backend (that is, making fit a client too) might be the safer option. Sort of like git-p4 in reverse.


That sounds like a great idea. By leveraging Fossil's existing UI and getting it to work with Git, it might really gain a lot of traction, and become a viable alternative to roll-your-own-github services like GitLab Community Edition.


To be attractive to github users it would have to have a better issue tracker and a better patch review system. Given that sqlite uses a mailing list for both I'm guessing it doesn't have that.


This'd be a great idea, but I don't have the skillset.


My guess is their sales pitch would be for you to run your own instance of Github Enterprise or do something fancy with webhooks. At work we did the former for a while before switching to Gitlab. We had less downtime than Github (not that they have much), but since the ops is on you in that situation, YMMV.


I'm of the impression that issues and PRs are all accessible via an API. What more should they do?


> What more should they do?

How about storing issues and PRs in the actual git repository? They should index them for the UI, sure, but the source of truth should be a branch of the git repo just like it is with gh-pages. It should be possible to file a new issue by committing a markdown file to the correct branch and pushing it to Github. Their hub command line tool and their own client could facilitate adding all the correct metadata. This would allow people to work while Github.com is offline and synchronize everything once the service outage is over.

If you have access to a distributed database that is already the best available for manual conflict resolution, why wouldn't you want to use it to store this kind of data? A Github outage is like a partition when you think about it from a distributed db perspective, so treat it like an individual node died and still allow reads/writes to the rest of the cluster that can be merged back once the node comes back online.
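To illustrate the partition analogy, here is a hypothetical sketch (the function name and the last-writer-wins policy are my own for illustration, not anything GitHub does) of merging two issue stores once the "dead node" comes back:

```python
def merge_issue_stores(local, remote):
    """Union two issue stores after a partition heals.

    Issues are keyed by id; each entry carries an 'updated' counter so
    the newer revision wins on conflict. Last-writer-wins is a deliberate
    simplification: in practice, issue edits rarely conflict, and the
    remaining cases could be resolved manually like any git merge.
    """
    merged = dict(local)
    for issue_id, entry in remote.items():
        if issue_id not in merged or entry["updated"] > merged[issue_id]["updated"]:
            merged[issue_id] = entry
    return merged

# One side kept working during the outage; the other got a later edit
# plus a brand-new issue.
local = {1: {"title": "crash on start", "updated": 100}}
remote = {1: {"title": "crash on startup", "updated": 120},
          2: {"title": "typo in docs", "updated": 90}}
merged = merge_issue_stores(local, remote)
```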


> How about storing issues and PRs in the actual git repository?

Access to the git repository is regulated; issues and PRs aren't. Anybody can file an issue/PR on your repo, but only you (and your team) can modify the repo. You'd need to also store all the comments on all issues and PRs, even closed/rejected ones. In some repositories, that'd be huge.


Of course the issue data in the repo should be considered read only.

And the amount of actual data needed to store issues is peanuts anyway.


> And the amount of actual data needed to store issues is peanuts anyway.

It's not. Take something like github.com/Homebrew/homebrew-core. There are 15k closed PRs there. There have been 20 new ones today. It's not rare to have 10-20 comments per PR; some of them even go over 200. Add CI status, actual PR contents (git patches), comment reactions, edits, labels, milestones, assignees, reviews, projects.

IMHO having a tool to fetch issues locally is a good idea; storing them in the repo is not.


From the perspective of someone mid code review during the outage, it's worked as well as it could, too. They preserve comment drafts client-side across page reloads, even drafts that haven't posted to the server yet. While it's a little frustrating to have to submit something 3-5x, they're definitely doing something for this use case already.

Also as others pointed out, just because the GitHub app is down doesn't always mean the GitHub git server itself is down.


You might wanna check out Fossil [0], it's a version control system with integrated bug tracking and wiki. It doesn't really work for all projects, though.

[0] http://fossil-scm.org/index.html/doc/trunk/www/index.wiki


GitLab.com has repository mirroring https://docs.gitlab.com/ee/workflow/repository_mirroring.htm... (this will soon be a paid account feature)

It doesn't mirror issues and PRs but it can do a one time import of them https://docs.gitlab.com/ee/workflow/importing/import_project...


Shame that data isn't stored in a side repository where it could be replicated like the repository.


It would be clever to store issues and PRs on a special branch in the repository. I can't currently think of why this wouldn't work (apart from business-wise, it would reduce lock-in considerably).


Well, it's unlikely to be feasible due to Git's on disk structure.

With git, every commit has a SHA checksum of the commit contents & metadata, and each commit's metadata points to the previous commit.

So, if a maintainer wanted to fix a typo or do any other kind of correction/update to an old issue, you'd need to rewrite every commit on disk since that one with an updated chain of checksums, e.g. from the fixed comment to the head of the special issue/PR branch.

That alone would probably make syncing with the repos a real pain for anyone working on medium to larger sized projects. ;)
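The chaining can be sketched with a toy model (not git's real object format, which hashes trees and full commit objects; this only shows the parent-hash dependency):

```python
import hashlib

def commit_id(content, parent):
    # Toy version of git's commit hashing: the id covers both the
    # content and the parent id, so it changes if either changes.
    return hashlib.sha1(f"{parent}:{content}".encode()).hexdigest()

def build_chain(contents):
    """Build a linear chain of commit ids, each linking to its parent."""
    parent, ids = "root", []
    for c in contents:
        parent = commit_id(c, parent)
        ids.append(parent)
    return ids

original = build_chain(["issue: typo in dcos", "issue: crash", "issue: feature"])
fixed    = build_chain(["issue: typo in docs", "issue: crash", "issue: feature"])
# Fixing the typo in the first commit changed every id after it, even
# though the later commits' contents are untouched.
```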


Your objection is predicated on the assumption that issues would need to be edited "for all time". That's not true for anything else in git, I'm not sure why that would be true for issues. I mean, if you had an "issues/" directory in your repo, and files named "issue1" "issue2" and so on, you would just edit the issue whenever. If someone made a branch for the issue, then they'd want to merge your edit into their branch ASAP - which is good, because it indicates that they read and understood the change. And this merge would be particularly painless since changes to "issues/" are absolutely not going to conflict with anything in "src/".


Ahhh, I see what you mean. Yeah, that would probably work. :)


I agree... I wish GitLab would do this too. No reason that everything can't be modeled as a git branch.

GitLab I know models their Wiki as a git repo. I think issues and PRs should be their own repo as well.... a branch for each issue or pr? Could tie it all up using submodules (or not, whatever)... just please someone take the jump first and do this.


DISCLAIMER: I work for GitLab. We have a feature to replicate GitLab to different locations (EEP). We call it "Geo", which stands for "Geographical Replication". Part of the "Geo" effort is Disaster Recovery, which is under heavy development. With Disaster Recovery we want to be able to reliably promote any secondary node to a primary, so if your US datacenter melts down under a nuclear war, you can start working with your copy in the EU, India, China, etc.


If your US datacenter melts down under a nuclear war, I can guarantee that none of your devs will give a damn about contributing to the code base.

You might want to modify your scenarios.


As someone interested in these scenarios, where could I find out more information about what would happen post-nuclear explosion?


A lot of this is out in the open

There are active simulations going on: http://www.sciencerecorder.com/article.php?n=scientists-cond...

There are preparedness drills: http://www.independent.co.uk/news/world/americas/us-politics...

There's plenty of policy analysis: https://www.cato.org/publications/policy-analysis/social-eco...

Older .mil reports: http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA080907

There are entire books taking apart what happens in specific fields: https://www.ncbi.nlm.nih.gov/books/NBK219154/

Or you could just read disaster porn^W^W post-apocalyptic sci fi.

So, a lot depends on what kinds of info you want to find. But overall, I'd say it's not worth reading, because life's not going to be much fun. Even if you're prepared.


In that event, how much would your "guarantee" be worth? b^)


it depends on your SLA ;)


(only partially serious here) I'm not sure recovery from a nuclear event that wipes out US data centers is economically feasible. If an event were to effectively wipe out 20% of the world's economy, I think having my github issues online is the least of my worries.


If it's not Git based then I'm not interested in it.

I see no reason to have an RDBMS for issues, merge requests, pull requests, etc. All of these could be modeled in multiple ways using a Git database. Issues are files in a repo called issue1.md, issue2.md ... or issues are branches on a repo ... or something else.


Yeah I know I'll be on vacation for a long while!


Hope it will be on the moon or in a nuclear shelter. Otherwise all that fallout may ruin your trip.



But they want you to rely on Github... if they could find a way to make Git only work with Github they would in a heartbeat.

Github is not an open source company; it is not really even a supporter of free software. IMO they are a danger to free software for this very reason. They are following the old school Microsoft model of Embrace and Extend... I am waiting to see if they can Extinguish.


Contentless, illogical mince.

GitHub clearly contributes a fair bit to open source - not only their own projects, but the free hosting for open source projects. You have no basis for this accusation, and frankly we are all stupider for having read it.


Well, I made two claims.

The first claim was that Github is not an open source company, and they are not. Their core business is not open source; yes, they have some side projects that are, but so does MS, and I do not believe anyone is going to say MS is an "open source company".

The second claim I made is that they are a threat to free software. I used free software here for a reason: free software and open source are different.

GitHub may support some Open Source things, they do NOT support free software.

Tom Preston-Werner's infamous post on open sourcing only some things (http://tom.preston-werner.com/2011/11/22/open-source-everyth...) is a key piece of evidence to prove this.

GitHub's general actions over the years show they do not support free software at all, and have limited support for Open Source software.

They are classic free software leeches, using and abusing free software not for the ethical reasons around free software but to advance their profit center.

See, I support free software for ethical reasons, not monetary reasons.

That is without even getting into the MASSIVE issue that is having all of these open source projects on a SINGLE platform. Unless you think Github is "too big to fail", which I can assure you it is not.


Terribly sorry for this rubbish reply, but here you go: http://i.imgur.com/ueRIgtq.png

I love it, and it's perfect for when you want to say something witty like FUD or D&D, but can't justify it. Now with CIM, I can!


What? Github employs members of the Rails core team including Aaron Patterson. They literally pay for open source software development.

I'm sure they employ direct maintainers/contributors to other open source projects as well.


Are they paying him to help make an open source alternative to github? If not, then it's not really relevant.

They are actively working against open source tooling for software development by developing features only for their closed platform.

Gitlab is a much better example of a company that fully embraces open source.


GitLab does not fully embrace Open Source.

GitLab Enterprise Edition does not have an open source license. Since GitLab.com runs GitLab EE, their two (presumably) primary sources of revenue (EE licenses and GitLab.com paid accounts) come from non-open source software.

But: it's still fair to call GitLab as a company way more Open Source than GitHub. All of their development happens out in the open, the vast majority of their codebase is Open Source, and the source code for GitLab EE is even made available.

This is a spectrum, not black and white.


And you believe that makes them an "Open Source Company"?

To me, in order to be an Open Source company, your primary product must itself be Open Source.

RedHat as an example... RHEL is open source

GitLab.. GitLab is open source

These are open source companies

Hiring a few devs to work on some side projects that are open source does not make one an Open Source Company.

If they open source GitHub Core, then they can claim to be an open source company.


Aaron Patterson, who I'll continue to use as an example, works full time on Rails while being paid by Github. Rails is not a side project for him.

Also as mentioned elsewhere in this thread, Github has published plenty of open source software. Electron, Atom, resque, and updates to git itself.


It is a side project for the company, as are all the other projects you mentioned.

As I clearly stated, GitHub cannot be an Open Source company simply because they hired some devs to work on Open Source.

Do you consider Microsoft to be an Open Source Company?


The Atom editor is widely used and was a major inspiration for VS Code. Electron, which is a GitHub project, is also used for countless other projects.


"major inspiration for VS Code"

Which is a Microsoft product, and Microsoft is the inventor of EEE. Ergo, GitHub is complicit in an EEE attempt. Q.E.D. Case closed.

/s


So anyone who releases open-source code that Microsoft might use in another open-source project is complicit in EEE? Wow.


My apologies; I should have made the "/s" - which denotes sarcasm - much more apparent.


Yup. GitHub also released/contributed to a lot of other open-source Ruby stuff, like Resque.


> if they could find away to make Git only work with Github they would in a heart bet.

Sorry, but this is unfair :) I remember the early days of Gitlab, before they added all the insanely cool features they have now, when they were just a Github open source copycat. My thought at the time was: "wow, Github is super cool to let them go. At least the almost exact design copy could be a legal problem". Clearly, we would have heard about Github vs Gitlab back then, if Github wanted to lock people in.


Well, if we "heard about GitHub vs. Gitlab", more people outside HN might hear about GitLab. That might be more dangerous than the chance of a win shutting GitLab down.


So you fault them for being evil geniuses by not acting evilly?


More likely they got sound legal advice and realized they could not actually win in court, or their investors killed that idea as a waste of money.

You do not hear about many VC funded startups initiating lawsuits for a reason. That is the realm of established companies.


Just because it's A common motivation and strategy, doesn't mean it's THEIR motivation and strategy.

A more likely strategy/motivation is that the product is you. AKA: farming the users.

Create a desirable pasture, and the animals farm themselves.


It is partly the MS EEE model and partly the Adobe model, where they give away free or low cost services to get individuals hooked, often at a young age; then, when they are employed at larger firms, they push the firms to adopt that software internally.

Get a bunch of Open Source Developers to do your marketing at their "real jobs" so they can sell the Enterprise Version.

Since GitHub is a private company, it is unknown (at least I can not find the info) whether they are profitable or not, or what their revenue numbers even are, so it is unclear if that is a successful plan or not.

GitHub could very well run into the same problems as SoundCloud.


"git on the block chain"


At GitLab.com we're not seeing an increase in usage: https://www.dropbox.com/s/eiz2h7tdnownvfl/Screenshot%202017-...

From http://monitor.gitlab.net/dashboard/db/haproxy-stats?refresh...

But of course it is hard to mirror your repo to GitLab.com when GitHub.com is down, so maybe other sites are a better measure.


That's our chart of backend sessions / sec for GitLab.com [1]. There's also a chart for frontend sessions [2], please keep in mind that there's a lot of CI / bot sessions on these dashboards. In addition to that there's much other interesting info on our HAProxy Grafana dashboard [3]. In case anyone's interested in even more metrics, there's a bunch of interesting dashboards on our Grafana instance [4].

[1] - http://monitor.gitlab.net/dashboard/db/haproxy-stats?refresh...

[2] - http://monitor.gitlab.net/dashboard/db/haproxy-stats?refresh...

[3] - http://monitor.gitlab.net/dashboard/db/haproxy-stats

[4] - http://monitor.gitlab.net/?orgId=1


It's not like your editor is down.


My team was working on merging pull requests when this happened.

We might actually be at fault! That would be kinda cool.


If only you were using a decentralized VCS, you wouldn't have this single point of failure…


The problem isn't the VCS part so much as all the code review/project management features of Github.

I seem to remember there was a decentralized VCS which included issue tracking besides commits, but it never was mainstream.



Yep, that was the one.


There are also plenty of DIY distributed issue-tracker tools that keep issue tracking inside a git repo (often as YAML+Markdown). I'm still surprised a good one hasn't stepped up with direct GitHub issue/project tool integration+sync, but that should certainly be a possibility.


Fossil SCM


The Linux kernel development works just fine w/o Github.


They use a self-hosted git repo and Bugzilla.

It's not exactly less of a point of failure.


I think the SPOF there is Linus' email account, right?


The VCS is decentralized but they made the process centralized.


But Github is honestly so nice. I'll admit that in a very subjective way. What's your favorite decentralized VCS?


The joke is that git is decentralized–it's just GitHub that's a single point of failure.


We have Gitlab Enterprise and it has trouble scaling.


Sorry to hear you're having trouble scaling GitLab. We have many organizations running GitLab and successfully scaling to 10's of thousands of users, and GitLab.com which is the largest GitLab installation. As a GitLab Enterprise customer our support team is happy to help you review your scaling problems and resolve them. Please submit a support request at https://support.gitlab.com.


O crap I meant to say GitHUB Enterprise. That's what has trouble. We also have Gitlab Enterprise and that's fine. Sadly for reasons internal we are mostly stuck with GitHub E.


I'm glad to hear that GitLab is working fine for you. And I'm sorry that you feel stuck with GitHub Enterprise. Please know that GitLab has a high fidelity importer https://docs.gitlab.com/ee/workflow/importing/import_project... and our support team has a lot of experience with assisting people with the migration if the migration is the problem.


I'm sorry to hear that. GitLab Enterprise Edition should scale fine to 100,000's of users. Are you in touch with our support about this?


That's why I asked what his favorite was. Just curious to see what the feature differences would be.


But reviewing, issues, PRs, etc. are. A lot of the planning parts are down.


> implying developers spend most of their time in editors.


Well the parent was implying they spend most of their time in Github.


I don't see that implication. I personally spend most of my time on GitHub.


I wouldn't be surprised if this became an issue in the future, with Atom's popularity on the rise…


Uuuh ... Shouldn't they be, you know, developing? Or has GitHub turned into Reddit when I wasn't looking?


what are you doing on GH all day? you are supposed to _write_ code.


When I break our GitHub webhooks, I joke it's time for people to practice our Disaster Recovery (DR) procedures. In all seriousness, this is a good opportunity to practice work without GitHub. Any service can go down; can you deploy a critical bug fix without it? If not, why not and what can you do to fix it?


I had to change a username from capitalized to uncapitalized and use my updated remote afterwards, apologies if I broke it for everyone.


To the best of my knowledge, GitHub org and usernames in the URI are case-insensitive for both the website and clone URIs. I haven't tested ssh clone URIs to know if they are also insensitive, but I'd guess they are.

You would only need to change it if the "presentation" format bothers you (again, as far as I know).


Did you know that if you type "Google" in the Google search box you can break the internet? :|

(BTW, this was meant as a joke... although I'm not ruling out that kmfrk broke Github just yet!)


It's kmfrk's fault! Get them!


"Git 'em!" FTFY.


If anyone is interested, I've been working with a git host that is actually distributed across a p2p network using SSB.

see:

https://github.com/clehner/git-ssb

https://github.com/noffle/git-ssb-intro

It's been working fairly well so far. We are using git-ssb to manage a few projects instead of putting them into Github.


Hey, that's really cool. Have you considered submitting it for a Show HN?


Is there, at least theoretically, a way to prevent other people from pushing to my repo? That seems like it would suck re griefing for any project that might become even mildly politically sensitive for whatever reason.


Everything is key based, so only key holders push to your repo. It's all based on the SSB protocol.

I'd suggest reading https://github.com/noffle/git-ssb-intro to get an idea.


That states: "git-ssb's permissionless model has an interesting consequence: anybody can push to anybody else's git repository." The guide doesn't show any key-sharing in order to do that. Are you saying that's incorrect?


I don't want to give you the wrong answer, so I've forwarded this question to the SSBC network for one of the core developers to answer better.

I'd be surprised if there was no security for pushes. The repos I've worked on did require an invite from the creator.


Marak, it sounds like you are working with private repos. With git-ssb, currently a repo is either public or private. Private repos are encrypted to a fixed set of recipients, so only those keyholders can access it. Public repos are unencrypted.


Yes, so basically in a centralized permission model some authority (the database) decides if any write is authorized or not, but in a decentralized one, any peer just writes anything, and then the readers decide whether they interpret that as valid or not.

Here is a description of a model that embraces both anyone-can-edit and degrees of consensus on who is allowed to edit: http://viewer.scuttlebot.io/%25GKmZNjjB3voORbvg8Jm4Jy2r0tvJj... But if you decide that someone cannot edit it, from their perspective they still can; they are just excluded from your perspective.
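A toy sketch of that reader-side model (hypothetical field names, not the actual SSB protocol, which uses cryptographic signatures rather than plain author strings): writers append freely, and each reader derives its own view by filtering on the authors it trusts.

```python
def view(log, trusted):
    """Derive one reader's view of a shared append-only log.

    Writers are never blocked at the storage layer; each reader keeps
    only the updates from authors it trusts, so griefers simply vanish
    from that reader's perspective.
    """
    return [entry for entry in log if entry["author"] in trusted]

log = [
    {"author": "alice", "op": "push", "ref": "refs/heads/master"},
    {"author": "mallory", "op": "push", "ref": "refs/heads/master"},
    {"author": "bob", "op": "push", "ref": "refs/heads/fix-1"},
]
my_view = view(log, trusted={"alice", "bob"})
# mallory's push still exists in the log, but not in my view of it.
```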


Your comment is entirely nonsensical in this context. I want a way to be able to publish a repo and have the people subscribing to the repo be able to only pay attention to my changes in an automated way. This software currently doesn't implement that, as far as the guide that was linked suggests. It is therefore utterly useless - every time someone decides to grief my repo, it requires manual intervention to resolve.

Once you have that very, very basic ability to replicate what people expect when they subscribe to a person's git repository, you can start playing with automatically merging together people's changes - but in practice, merge conflicts are a thing and there's no good way to resolve them. If you can come up with a way to automatically resolve merge conflicts, you'd be rich, frankly speaking.


you said > Is there, at least theoretically, a way to prevent other people from pushing to my repo?

so I answered, _yes, theoretically_ we have ideas for how to implement that. You can also unfollow and block griefers, but so far pretty much everyone has been nice and we just haven't needed to implement that yet.


How do you intend to automatically resolve merge conflicts, which is what your document suggests you want to do?

Why is this not resolved by a good permissions model and the ability to fork? Why should my users have to care about blocking griefers when they just want to pull from repo?


Status now shows Major Service Outage:

12:32 EDT: Major service outage.

https://status.github.com/


unicorns all around


Pages Builds Failure Rate spiked to over 2000%. I don't know how that's possible, but it seems pretty bad.


Maybe 20+ failing retries for any single page build?


Guessing it's an absolute failure rate previously not encountered by the current graphing tool, plus an incorrect scaling factor. Or, it was decided at some point that always reporting a 0.000XXX% failure rate, even if correct, didn't offer an intuitive metric, so zeroes were intentionally truncated.
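To make the scaling-factor guess concrete (purely hypothetical; nothing here reflects GitHub's actual dashboard): if a chart normalizes current failures against a historical baseline rather than against total builds, the "rate" is unbounded and can read far above 100%.

```python
def failure_rate_vs_baseline(failures, baseline_failures):
    # A "rate" computed against a historical baseline instead of against
    # total builds is unbounded: 21x the normal failure count reads as
    # 2100%, even though failures-per-build can never exceed 100%.
    return failures / baseline_failures * 100

rate = failure_rate_vs_baseline(failures=210, baseline_failures=10)
```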


It'd be nice to have an intuitive explanation on the status page for what PBFR means if it can go over 100%.


Bad Metric is Bad.


Insert remark on why we use a centralized service for a distributed source control system, etc. No one seems to care, unfortunately.


>Insert remark on why we use a centralized service for a distributed source control system,

Because Linus didn't build into Git the features that Github provides.

You must break apart the different features of Github:

#1) communication (issue tracking, bug reports, pull requests, README.md landing page, etc.)

#2) hosting disk storage & bandwidth

#3) distributed source code merges based on content hashes (SHA1) instead of using centralized locks/unlocks (check in / check out) model of CVS/SVN.

Git itself only takes care of #3.

Github handles #1 and #2 (and also gets #3 by being built on top of Git).

You can't go back in time and wonder if Linus should have addressed #1 and #2 because he wasn't interested in starting a hosting company. Instead, he focused on the data format (Merkle trees, BLOBs, SHA1) and a sync protocol (git pull, etc) for Git.

If people wonder why we can't just use email for #1 (communications), you have to see that Github has become a "Schelling Point"[1]. Attempting to use email groups & mailing lists will not prevent the emergence of a Schelling point. Email can be a workflow for existing contributors (e.g. contributors of the Linux kernel source) but it's not convenient for discovery of new repositories (e.g. the web's "landing page" of a repo).

As for #2 (hosting), not everybody who wants to share a repository wants to pay $9.99/month VPS or other hosting plan from a web hosting provider. It would also be inconvenient to host it from the home laptop and punch a hole through the ISP router to make it work. Github solves hosting+bandwidth for free for modest non-commercial projects.

To restate, Linus' Git is a distributed _protocol_ but Github is a _service_ acting as a platform for the distributed protocol.

[1] https://en.wikipedia.org/wiki/Focal_point_(game_theory)
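The content-addressed design in #3 is easy to see in miniature: git names a blob by the SHA-1 of a small header ("blob <size>\0") followed by the raw bytes, so identical content gets the same id on any machine with no central coordination. A minimal Python sketch:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    # Git addresses objects by content: SHA-1 over the object header
    # followed by the raw bytes. This is why two clones can agree on
    # object identity without ever talking to a central server.
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

blob = git_blob_id(b"hello\n")
# Matches `echo hello | git hash-object --stdin`
```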


#1 can be done by creating a separate submodule repo that only stores docs+issues files. It's up to the repo's users to agree on a system by which the files should be organized, but it's doable.

I'd propose directories "issues/open", "issues/closed", with each issue filename being "{created:yyyy-MM-dd} - {subject}.md". Symlinks could be used to track ownership/responsibility if each repo contributor has their own directory in the repo too.
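A minimal sketch of that layout (the function names are made up for illustration): filing an issue creates a markdown file under issues/open/ using the naming scheme above, and closing it is just a rename into issues/closed/, which git would record as part of the project's history.

```python
import datetime
import pathlib
import tempfile

def file_issue(repo_root, subject, created=None):
    """Create an issue file under issues/open/ named
    '{yyyy-MM-dd} - {subject}.md', per the proposed scheme."""
    created = created or datetime.date.today()
    path = (pathlib.Path(repo_root) / "issues" / "open"
            / f"{created.isoformat()} - {subject}.md")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"# {subject}\n\nStatus: open\n")
    return path

def close_issue(path):
    # Moving between directories is the state transition; in a real
    # repo, git records it as a rename, keeping the issue's history.
    path = pathlib.Path(path)
    closed = path.parent.parent / "closed" / path.name
    closed.parent.mkdir(parents=True, exist_ok=True)
    path.rename(closed)
    return closed

root = tempfile.mkdtemp()
p = file_issue(root, "build fails on ARM", created=datetime.date(2017, 7, 31))
c = close_issue(p)
```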


That's an awful lot of extra work to insure against maybe 1 hour per year of downtime.


Just because present state is 99.7579% uptime (for the month) doesn't mean it will always be so.

You back up your data, why shouldn't you backup your github data?


Backup is one thing, choosing to run a crappy manual system just in case a vendor goes down is entirely different.


It doesn't have to be a "crappy manual system" - I'm simply suggesting that given that git itself is a damned good distributed versioning database for arbitrary content, then we might as well also use it for distributed issue-tracking. A simple offline-mode browser-based editor that lives in a single HTML file within the repo would provide a nice GUI on top.

Hmm, I think I might be on to something... anyone want to start a project?


>git itself is a damned good distributed versioning database for arbitrary content, then we might as well also use it for distributed issue-tracking.

For what it's worth, it's interesting to see that the Fossil distributed SCM includes an issue tracker but they made a deliberate architecture decision to not propagate the tickets data.[1] They had a chance to make your "distributed-issues-tracking" idea a 1st-class concept in Fossil but decided against it.

Also, issues/tickets are just one example feature. Github will continue to evolve, adding more and more sophisticated SDLC/ALM (application lifecycle management) features like JIRA and Microsoft Team Foundation Server. Those features are not easy to implement in a peer-to-peer SCM with practical usability.

[1] https://fossil-scm.org/xfer/doc/trunk/www/qandc.wiki


Thank you for the link. I read through their justifications and I think using a git-submodule solves their problems of polluting the main project history and permissions issue. Using directories for mutually-exclusive state grouping (e.g. "closed"/"open"/"new") solves the directory problem.


The reason is github did a fantastic job of implementing useful features. The visual design is unmatched and they have done a great job implementing developer oriented integrations and social features.

A more federated approach to this sort of thing might have been nice, but so far nothing I have seen comes close to the value-add offered by github.


You're sidestepping the main reason I believe it worked so well. It benefits from network effect. It is a collaborative tool and people like to have their work on there so others can collaborate with them.


That's important but I think people over-estimate it. It's both. I predict that if you analyzed the github network, you'd find many hubs are based around companies that chose to move their workflows to github based on features other than network effects. Or at least, the existing network was only one of many reasons.


As a business, we (and most other companies I know) chose GitHub for features and performance. It's nice that other open source stuff is there, but that doesn't matter for what we pay for.


Totally agree - though I wish they would show files > 2 MB in the web editor.

I develop directly on GitHub - I even make all my commits to the master branch, as this allows me to code on a Nexus 7 tablet if necessary.

So this outage was a PITA for me. However I have plan B and started up tomcat...

Saved the day!

P.S. Thanks github for the free hosting! I can't really complain .


Lots of people care, but we also recognize that the advantages of using a centralized system outweigh the disadvantages for many use cases.


Lots of answers, so I'll try to address them all here. It was a rhetorical question. We know what GitHub offers. I'm not a fan of its UI, particularly on mobile, but that's beside the point.

The point is: why have we failed, once more, to build a distributed solution, even when the underlying tech assumes it?

Email was the last widely successful distributed medium. And it's dying, unfortunately.

Of course centralized services are easier to implement and use. Doesn't mean we should settle.


We didn't fail; nobody built it (probably from lack of demand).


Nobody built it = fail to provide a solution


Ok, that's not really the same thing but either way, what's the point of having a decentralized project management system?

Code is one thing which clearly allows for many benefits in having the entire local history but pushes more work towards the merging stage. When it comes to issues and discussions, it's often much easier to have a single source of truth without worrying about merge conflicts.

And the issue with GitHub being down isn't an issue of centralization as much as it is about availability of a service. You're free to use GitHub Enterprise or GitLab and host the service yourself if you feel you'll get better reliability and performance; however, I'm pretty sure you won't beat GitHub's overall uptime without significant investment of time and resources.

Perhaps having a simple read-only offline cache of the latest project management state is a good middle-ground for most of the problem and it shouldn't be that hard to do - but again that's up to how much demand there really is for it.


I'll quote myself for emphasis: Of course centralized services are easier to implement and use. Doesn't mean we should settle.


That's it. I'm starting a github on blockchain.


Count me in when you launch an ICO.


I wish you weren't kidding.


Because most things are easier when you can have one canonical source of truth.


Looking at the status graphs, it seems like there was some clearly anomalous data starting around midnight, about 9 hours before the actual outage "began". Maybe a gradual botnet ramp-up, and 9:27 AM is when it got bad enough to overload some critical service? (Or really any other threshold-based failure scenario.)


or a bad commit being deployed on their fleet.


What was happening to Github for a week or so in late June - early July? I see "The status is still red at the beginning of the day" for a whole week.

https://status.github.com/messages/2017-07-03


They seemed to be suffering some external attacks / DDoS but I never saw a post-mortem from them on it, hopefully one is forthcoming (or maybe is out but I missed it)


They don't do post mortems.



Maybe not public ones, but I'm pretty sure they perform internal post mortems when they have outages.


Do these general Github outages affect GH Pages as well, or is that service portion segmented to some degree?


Pages are static sites served from separate infra from the main github.com from what I've heard, so you should be fine in almost all cases. I've never had one go down unless I pushed a dud build.


I think it started as minor, as I was getting a unicorn on about one in ten page loads. It's currently happening on almost all of them.

Of course, I'm trying to dig into a WebKit issue and need the issues to load!


Where is github hosted?

Do they use AWS or another commercial cloud provider, or do they have their own servers in data centers (hopefully scattered around the globe)?

If AWS, are their services spread among multiple availability groups? I'm just wondering how this could happen.


... It may surprise some HN readers, but AWS outages aren't the only reason other sites go down. Having multi-AZ just means you're resilient to a localised, single-AZ AWS failure. It doesn't help for the 1000 other ways your service could go down.


I like how this comment has 3 replies saying different things.

I am now feeling less informed than I was after the first reply.


An ex-GitHub employee who used to give talks on GitHub scaling challenges revealed they shifted from Rackspace. See for yourself: https://github.com/holman/ama/issues/553


They ran out of rackspace now they are on real hardware. https://github.com/holman/ama/issues/553


I believe they manage their own hardware in a datacenter


Here's this from 2009. (https://github.com/blog/530-how-we-made-github-fast)

It says Rackspace, but 2009 is a long time ago so it could have changed.


They're hosted at Rackspace. I'm not sure if they're hosted on the Public Cloud or whether they have a dedicated hardware or what though.


No they shifted from Rackspace. https://github.com/holman/ama/issues/553


Github is back online.


It has leveled up to a major outage!


Dang. It's too bad their customers' source control files aren't distributed and decentralized, or they could keep working and ignore this.


The problem is not the files, it's the project management stuff (issues and PR tracking). That is a SPOF. If they could decentralize that, then nobody would care if they are down half the time.


They can - it's called Github Enterprise (or if you're a medium to large corporation, JIRA). I thought occasional downtime was just something people factored into free services. You can always file your issues later, or use a chat service (or, heaven forbid, email) if there's an immediate need for feedback.


I wonder why you can't export them as a read-only git repository. (Read-only to avoid people messing around with the history)


You can actually export them, but the format is undocumented, and is meant only for import into GitHub Enterprise:

https://developer.github.com/v3/migration/migrations/
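For a plain read-only backup outside of Enterprise, the public REST API works too. A rough sketch (the repo names are placeholders; real code would also need authentication and rate-limit handling, and note that this endpoint returns pull requests alongside issues):

```python
import json
import urllib.request

API = "https://api.github.com"

def issues_url(owner: str, repo: str, page: int = 1) -> str:
    # state=all captures both open and closed issues; 100 is the max page size.
    # NB: this endpoint also returns PRs (they carry a "pull_request" key).
    return f"{API}/repos/{owner}/{repo}/issues?state=all&per_page=100&page={page}"

def backup_issues(owner: str, repo: str, out_path: str) -> int:
    """Page through the issues endpoint and dump everything to one JSON file."""
    all_issues, page = [], 1
    while True:
        with urllib.request.urlopen(issues_url(owner, repo, page)) as resp:
            batch = json.load(resp)
        if not batch:  # an empty page means we've paged past the end
            break
        all_issues.extend(batch)
        page += 1
    with open(out_path, "w") as f:
        json.dump(all_issues, f, indent=2)
    return len(all_issues)
```

Commit the resulting JSON to a side repo on a schedule and you effectively get the read-only mirror the parent comment asks for.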


Because that would make it easier to migrate away from Github, which Github doesn't want you to do.


Anyone have any knowledge of what specifically happened?


I saw a comment earlier mentioning that GitHub allegedly doesn't release post mortems publicly? If this is true, that's upsetting.


They have published post-mortems before, but the previous bout of downtime went unacknowledged. Maybe it's a new policy.

Here's a postmortem they did several years ago: https://github.com/blog/1261-github-availability-this-week


I don't see why you would expect them to. It's nice and all, but they're under no obligation to tell the world everything that happened every time they have a minor outage. Not every outage is newsworthy or even that interesting, and oftentimes they're little more than a PR piece for the company.


My apologies. I knew my Perl 6 wrapper for GLFW was bad, but never realized it'd be so bad that GitHub would choke to death on it.


Are there any other major sites that are down?


All software depending on GitHub release downloads is broken, e.g. the Rancher CLI.


Release downloads are affected?


yep, my Circle-CI deploy fails, the following URL is not available: https://github.com/rancher/cli/releases/download/v0.6.0/ranc...


It just became a major service outage.


This is happening too frequently now.


It's starting to work again for me. I was able to approve a PR and merge it.


Looks like I am still able to push to/pull from my repos without issue.


I've had a few CI jobs fail when attempting to pull, but I've also had some succeed. Seems that it isn't an all-or-nothing outage.


Whatever happened to gittorrent?


Thoughts on the cause?


Definitely Bitcoin-related.


SegWit's first victim.


Not enough mongodb. clearly they are not webscale yet.



Bug in their Ethereum-backed MongoDB instance.


while (! github.works) { this.add(new Buzzword()); }


AI started backing itself up on github.


Defcon let out on Sunday and there's a lot of bored hackers with leftover energy.


Ludum Dare? :-)


How does this affect all your dependencies?


Why would it?


A lot of package manager tools download their packages from Github. If you happened to be refreshing your dependencies at the time, or doing a clean build, then you'd be SOL.


Really? I was under the impression that most language package managers downloaded from a CDN. I know that pip, npm, yarn, cargo, and hex do, at least.


Homebrew I believe goes through GitHub. Many of the Vim plugin managers also do, or at least have the option. I think CocoaPods and Carthage, both for iOS development, do as well.

I think it's somewhat common for a new package manager to use Github as a kind of CDN for a while, until they get big enough they can do their own.


Thanks for the explanation.


>GitHub is having a minor service outage

It's definitely not minor.


I knew I shouldn't have released the new version of my project yesterday. :p

Sorry everyone


GitHub's uptime is pretty bad. Isn't it under 95% for the year now?


For the 2017 calendar year:

good: ~99.50%, major: ~0.04%, minor: ~0.46%

There's 7 years of history on their status API. https://status.github.com/api/daily-summary.json


That would be over 18 days out of a year. I'd be amazed if it was that low. I'd expect closer to 98-99% (roughly 3.7-7.3 days per year).


95% would mean it would have over 18 days offline!

I wouldn't even put it in the 99% (3.7 days) uptime category either.
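The arithmetic behind those figures, as a quick sketch:

```python
def downtime_days(uptime_pct: float, days_in_year: float = 365.0) -> float:
    """Days per year a service may be down at a given uptime percentage."""
    return days_in_year * (1 - uptime_pct / 100)

# 95% allows 18.25 days/year; 99% allows 3.65; 99.9% only 0.37.
for pct in (95.0, 99.0, 99.5, 99.9):
    print(f"{pct}% uptime allows {downtime_days(pct):.2f} days/year of downtime")
```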


Not really.

It is just very public when they go out.

Steam, Blizzard, and Activision are down at least 6 hours a week, while Bank of America is offline 24 hours/month. Scheduled downtime is still downtime.


> Scheduled downtime is still downtime.

Except in the SLA.

I've argued more than once when watching a vendor announce "emergency scheduled maintenance"(!) as an outage is looming.


The stat you refer to is for the day only. The status page shows 99.5159% for the past month; I'd guess it would be in that range for the year as well.


no



It may feel that way when the outages are during your business hours.


In the face of a lack of information, HN comments throw around unfounded speculation, and tongue-in-cheek jokes run rampant. I suppose that in the absence of information many stay silent, and the rest see a thread lacking comments

& now we've got this meta one in the mix


How many more can we expect before they develop appreciation for testing _before_ they push to prod?


Newly introduced errors in site code are only one of many sources of failure for a site like GitHub, and probably a fairly rare one.


It's a nearly weekly occurrence. An avoidable one, I might add. When was the last time a major Google service had a major outage?


You're still missing the point, by assuming that the outages are code related when you call them avoidable. Distributed systems at scale are terrifyingly hard to control.

When I said "probably a fairly rare one", I didn't mean that the outages are rare. I meant that new code is probably a rare cause of the outages that happen. They have other causes unrelated to new code.

(I'm also skeptical that GitHub major outages happen "nearly weekly", but I don't have data.)


I'm not "missing" anything. I worked at Google for 7 years much of which was spent working on, you guessed it, distributed systems infrastructure. You guard against this by carefully canarying things and putting robust testing, monitoring, and deployment procedures in place. A release might take a few days, but you can be reasonably certain your users won't be your guinea pigs, and if shit does hit the fan, rollback is easy, and you can reroute traffic elsewhere while you roll back. Most of the time no rollback is needed: you just flip a flag and do a rolling restart on the job in Borg. For some types of outages (most of which users never even see) Google has bots that calculate monetary loss. And the figures can be quite staggering and motivating, so people do postmortems and try their best to make sure the outages don't happen again.


So no Google service has ever experienced an outage? I distinctly remember Gmail being down on several occasions.


Gmail is several orders of magnitude larger than Github will ever be, and in recent memory I can only recall it being down once, and for a very small subset of users.


You're too advanced for the typical reader.

It's a startup site with half the people not having a test environment.


Everyone has a test environment; some people happen to send prod traffic to it.


Why are you continuing to assume that this outage was caused by a release of some kind?


Every change is a release if you squint right.


I'm not even assuming this outage was caused by a "change". There are DoS attacks, infrastructure/network outages, storage pool problems.. immediately assuming that someone pushed some code and it broke things seems like an extremely short-sighted view on how production systems fail.


They don't do any testing before push to prod?


It's called continuous delivery, man. Merge to master and it goes out the door automatically, even if it's busted to hell. All the cool kids are doing it.


I know what it's called. So do they have tests or not?


How was it merged to master if it did not pass the tests?


Shitty tests? Lack of integration tests? Lack of test coverage for a particular scenario? Test environment that does not represent the config that's deployed in production? There are literally hundreds of reasons why things like that could happen. Depending on the system in question you could have an isolated environment on which you replay prod traffic before cutting a new release and then investigate failures. All new features are engineered so that they could be easily turned off using flags. Once that's done you could canary your release to, say, 5% of your user base (at Google it'd be in the least loaded Borg cell), and if something goes wrong, you quickly disable busted features in response to monitoring and alerts. You let that stew for a while, then deploy your release worldwide, and start working on the next one.
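The percentage-based canary described above is often implemented as a deterministic hash bucket, so a given user stays in (or out of) the rollout as the percentage ramps up. A hedged sketch (the flag name and the in-memory flag store are made up; a production system would use a config service so flags flip without a redeploy):

```python
import hashlib

# Hypothetical flag store: feature name -> rollout percentage.
FLAGS = {"new_merge_ui": 5}

def bucket(user_id: str, feature: str) -> int:
    """Deterministically map (user, feature) to a bucket in [0, 100)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def enabled(user_id: str, feature: str) -> bool:
    # Stable hashing means a user doesn't flap in and out of the canary
    # between requests; raising the percentage only adds users.
    return bucket(user_id, feature) < FLAGS.get(feature, 0)

# Roughly 5% of a simulated user base lands in the canary.
rollout = sum(enabled(f"user-{i}", "new_merge_ui") for i in range(10_000))
print(f"{rollout / 100:.1f}% of simulated users see the feature")
```

If monitoring shows the canary misbehaving, dropping the percentage to 0 disables the feature immediately, with no rollback deploy needed.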


Let me get this straight... So the advantage of CI/CD is not automated testing + rollouts, but rather the removal of the test requirement? Why waste resources on CI/CD in that case? Just remove the test requirements and deploy. In fact, remove the canary as well - the more traffic hits the broken release, the faster it will become obvious that the release is broken.

The above was sarcasm.

If your org has CD and it does not have a CI validation step that provides 100% test coverage of your code, then you do not have CD - you have MSP: a Merge->Ship->Pray system.


I'm not sure how you got that out of what I wrote. If anything, it's a recognition that unit tests alone are never sufficient, and _drastic increase_ of testing effort, _in addition_ to CI. Humans are in the loop. Testing is near exhaustive, because it happens in an environment nearly identical to prod, with a copy of prod traffic. Users just don't see the results. Compare that to just "push and pray" approach of a typical CD setup.


I'm sorry, if you are adding a new feature, then no old features could break without unit and integration tests indicating brokenness.

What one should do is push the new feature dark, i.e. not enabled for anyone except the automated test suite user. That user should exercise all the old paths and validate that no old path is broken while the feature is present but unused. After that is validated in production, one can enable the feature for either a percentage of users or 100% of users, depending on how one plans to live-test and live-QA the feature.

The important part is that no new release can break existing usage patterns.

That's CI/CD. Everything else is magic, unicorns and rainbows.


The crucial difference is that CD postulates that if a change passes your automated test suite, it's good enough to immediately go live. I've dealt with many complex systems, some with pretty exhaustive test coverage, and this wasn't true in any of them. Unless your service is completely brain-dead simple (which none of them are once you have to scale), you always need a human looking at the release and driving it, and turning things off if tests missed bugs.


That's the exact argument that was used by sysadmins to explain why their jobs could not be automated.


And last I checked their jobs aren't, in fact, automated. They just moved to the various cloud providers and were renamed to "SRE", with a major boost in pay.


Pushing changes on Monday morning... way to ruin everyone's week!



