I would love to see a chart of traffic to other sites when GitHub goes down. My bet is that HackerNews and Twitter both get significant spikes from all those bored developers.
Bored? Git's a distributed version control system, so no excuses. Get back to work!
But in all seriousness I kind of wish GitHub provided a way to mirror things like issues and PRs so you never have to be fully reliant on one service. Not being able to read these really does make it impossible to get work done offline.
Go has a mirror of our GitHub project via this thing "Maintner" I wrote (running at http://maintner.golang.org/) that syncs GitHub in realtime to a log of mutations. (As well as syncing Gerrit and all its comments etc).
So then we can slurp all of our GitHub & Gerrit history into RAM (takes about 5 seconds and 500 MB) via https://godoc.org/golang.org/x/build/maintner/godata#Get and walk it in-memory and do stuff with it. (It runs our realtime bots, alternate web UIs on planes, alternate search, stats, etc.)
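For anyone wondering what "a log of mutations" replayed into RAM looks like in general, here is a rough sketch of the idea. It is not Maintner's actual code or data format (that is Go, behind the godata package linked above); the `Mutation`, `Issue`, and `replay` names and the field layout are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mutation:
    """One entry in the append-only log (hypothetical shape)."""
    issue_id: int
    field_name: str   # e.g. "title", "state", or "comment"
    value: str


@dataclass
class Issue:
    title: str = ""
    state: str = "open"
    comments: list = field(default_factory=list)


def replay(log):
    """Rebuild the full in-memory state by applying mutations in order."""
    issues = {}
    for m in log:
        issue = issues.setdefault(m.issue_id, Issue())
        if m.field_name == "comment":
            issue.comments.append(m.value)         # comments accumulate
        else:
            setattr(issue, m.field_name, m.value)  # later values win
    return issues


log = [
    Mutation(1, "title", "Build fails on arm64"),
    Mutation(1, "comment", "Reproduced on tip."),
    Mutation(2, "title", "Typo in docs"),
    Mutation(1, "state", "closed"),
]
issues = replay(log)
print(issues[1].state, len(issues[1].comments), issues[2].title)
```

The appeal of this shape is that a mirror only ever has to append to the log, and any tool can rebuild the whole state locally by replaying it.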
> Maintner is short for "Maintainer"...the name of the daemon that serves the maintner data to other tools is "maintnerd".
Nice work! But as is custom on HN, I'll bikeshed on the name instead of delving into the contents of the tool. Why shorten the word by just 2 letters? Is there something special about the tooling that makes 8-letter projects more desirable than 10-letter projects, or is it linked to the removal of Artificial Intelligence from the process? I'd personally mis-type that name all the time.
An inspiring new word invented by a redditor on March 23, 2014. The redditor was actually trying to spell exaggerated. This word was misspelled so horribly that when it was googled the only thing that came up was a link to the comment itself.
I keep meaning to dig into Fossil (SQLite's VCS/Project Management system), but I have no faith I could convince a team to use it.
Another model to look at is Trac, which had pretty extensive integration with SVN and integrated (ie, cross-linked) issue tracking and wiki, and stored all the data and change history in a svn repository.
> I keep meaning to dig into Fossil, but I have no faith I could convince a team to use it.
Developer of Fossil and SQLite here: I agree. In my experience, you'd have better luck convincing the team to switch from vi to emacs. For all its many and well-documented faults, the Git/GitHub paradigm is what people want to use because it is what they are familiar with.
All the same, I intend to keep right on using Fossil, thank you very much!
So here is the idea I've been thinking of lately: What if Fossil were ported or enhanced to use Git's low-level file-formats so that unmodified git clients could seamlessly push and pull against the (enhanced) Fossil server. Call the new system "Fit" (Fossil+Git). Using Fit, you could stand up a GitHub replacement for an individual project in 5 minutes using nothing more than a 2-line CGI script. Git fan-boys could continue to use their preferred interface, while others who prefer a more rational and user-friendly design could use the Fossil-like "fit" command. Everybody could share code, and everybody would have a nice web-based interface with which to collaborate with tickets and wiki and all the other cool (and to my mind essential) stuff that Fossil provides. And nobody who already knows git would be forced to learn a new command-line interface.
I'd be all over writing the code for "Fit", except that I'm already over-extended. Anybody who thinks this is a good idea and would like to pitch in and collaborate, please contact me privately. Thanks.
That sounds like a great idea. By leveraging Fossil's existing UI and getting it to work with Git, it might really gain a lot of traction, and become a viable alternative to roll-your-own-github services like GitLab Community Edition.
To be attractive to github users it would have to have a better issue tracker and a better patch review system. Given that sqlite uses a mailing list for both I'm guessing it doesn't have that.
My guess is their sales pitch would be for you to run your own instance of Github Enterprise or doing something fancy with Web Hooks. At work we did the former for a while before switching to Gitlab. We had less downtime than Github (not that they have much) but since the ops is on you in that situation, YMMV.
How about storing issues and PRs in the actual git repository? They should index them for the UI, sure, but the source of truth should be a branch of the git repo just like it is with gh-pages. It should be possible to file a new issue by committing a markdown file to the correct branch and pushing it to Github. Their hub command line tool and their own client could facilitate adding all the correct metadata. This would allow people to work while Github.com is offline and synchronize everything once the service outage is over.
If you have access to a distributed database that is already the best available for manual conflict resolution, why wouldn't you want to use it to store this kind of data? A Github outage is like a partition when you think about it from a distributed db perspective, so treat it like an individual node died and still allow reads/writes to the rest of the cluster that can be merged back once the node comes back online.
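To make the "issue as a markdown file on a branch" idea concrete, here is a minimal sketch. The `issues/issueN.md` layout and the metadata header fields are assumptions for illustration only; GitHub and the hub tool define no such format.

```python
import tempfile
from pathlib import Path


def render_issue(number, title, author, body, labels=()):
    """Render an issue as Markdown with a small metadata header."""
    header = [
        "---",
        f"number: {number}",
        f"title: {title}",
        f"author: {author}",
        f"labels: {', '.join(labels)}",
        "---",
    ]
    return "\n".join(header) + "\n\n" + body.rstrip() + "\n"


def write_issue(repo_root, number, text):
    """Write the issue where a (hypothetical) issues branch would keep it."""
    path = Path(repo_root) / "issues" / f"issue{number}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


text = render_issue(42, "Crash on empty input", "alice",
                    "Steps to reproduce:\n1. Run with no arguments.",
                    labels=("bug",))
with tempfile.TemporaryDirectory() as repo_root:
    path = write_issue(repo_root, 42, text)
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```

In a real checkout you would then commit the file and push it whenever the remote is reachable again; the file itself is usable offline in the meantime.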
> How about storing issues and PRs in the actual git repository?
Access to the git repository is regulated; issues and PRs aren’t. Anybody can file an issue/PR on your repo, but only you (and your team) can modify the repo. You’d need to also store all the comments on all issues and PRs, even closed/rejected ones. In some repositories, that’d be huge.
> And the amount of actual data needed to store issues is peanuts anyway.
It’s not. Take something like github.com/Homebrew/homebrew-core. There are 15k closed PRs there. There have been 20 new ones today. It’s not rare to have 10-20 comments per PR; some of them even go over 200. Add CI status, actual PR contents (git patches), comment reactions, edits, labels, milestones, assignees, reviews, projects.
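A back-of-the-envelope version of those numbers, as a sketch. The PR count and comments-per-PR range come from the comment above; the bytes-per-comment figure is purely an assumption for illustration, and the estimate covers comment text only (no patches, CI status, reactions, or edit history).

```python
closed_prs = 15_000                 # from the comment above
comments_per_pr = (10, 20)          # the "not rare" range from the comment above
assumed_bytes_per_comment = 1_024   # ASSUMPTION, purely illustrative

# Total comment count at the low and high end of the range.
low, high = (closed_prs * n for n in comments_per_pr)
print(f"comments: {low:,} to {high:,}")

# Rough text-only size under the assumed per-comment size.
low_mib = low * assumed_bytes_per_comment / 2**20
high_mib = high * assumed_bytes_per_comment / 2**20
print(f"text only: {low_mib:.0f} to {high_mib:.0f} MiB")
```

Swap in your own per-comment size; the point is only that the comment count alone is already in the hundreds of thousands before anything else is added.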
IMHO having a tool to fetch issues locally is a good idea; storing them in the repo is not.
From the perspective of someone mid code review during the outage, it's worked as well as it could too. They preserve comment drafts client side that haven't posted to the server yet across page reloads. While it's a little frustrating to have to submit something 3-5x, they're definitely doing something for this use case already.
Also as others pointed out, just because the GitHub app is down doesn't always mean the GitHub git server itself is down.
It would be clever to store issues and PRs on a special branch in the repository. I can't currently think of why this wouldn't work (apart from business-wise, it would reduce lock-in considerably).
Well, it's unlikely to be feasible due to Git's on-disk structure.
With git, every commit has a SHA checksum of the commit contents & metadata. And each commit's metadata points to the previous commit.
So, if a maintainer wanted to fix a typo or do any other kind of correction/update to an old issue, you'd need to rewrite every commit on disk since that one with the updated chain of checksums, i.e. from the fixed comment to the head of the special issue/PR branch.
That alone would probably make syncing with the repos a real pain for anyone working on medium to larger sized projects. ;)
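A toy model of that checksum chain, to show the effect being described. This is a simplified sketch, not git's real object format: `commit_id` and `build_chain` are made-up helpers that just hash each commit's content together with its parent's id.

```python
import hashlib


def commit_id(parent_id, content):
    """Hash the content together with the parent's id, roughly as a commit does."""
    data = f"parent {parent_id}\n\n{content}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def build_chain(contents):
    """Return the list of ids for a linear history of commits."""
    ids, parent = [], ""
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


history = ["issue 1: teh build fails", "issue 2: docs typo", "issue 3: crash"]
original = build_chain(history)

# Rewriting the first commit in place changes its id and every id after it.
rewritten = build_chain(["issue 1: the build fails"] + history[1:])
changed = [a != b for a, b in zip(original, rewritten)]
print(changed)  # [True, True, True]
```

Because each id depends on its parent's id, a change to an old commit propagates all the way to the head of the branch.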
Your objection is predicated on the assumption that issues would need to be edited "for all time". That's not true for anything else in git, I'm not sure why that would be true for issues. I mean, if you had an "issues/" directory in your repo, and files named "issue1" "issue2" and so on, you would just edit the issue whenever. If someone made a branch for the issue, then they'd want to merge your edit into their branch ASAP - which is good, because it indicates that they read and understood the change. And this merge would be particularly painless since changes to "issues/" are absolutely not going to conflict with anything in "src/".
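A small sketch of why an edit under "issues/" merges cleanly with work under "src/". It models a merge at whole-file granularity with plain dicts of path to content; real git also merges line by line within a file, so treat this as a simplification. The `changed_paths` and `merge` helpers are made up for illustration.

```python
def changed_paths(base, side):
    """Paths added, removed, or modified on one side relative to base."""
    return {p for p in base.keys() | side.keys() if base.get(p) != side.get(p)}


def merge(base, ours, theirs):
    """Path-level three-way merge; returns (merged_tree, conflicting_paths)."""
    ours_changed = changed_paths(base, ours)
    theirs_changed = changed_paths(base, theirs)
    # A path conflicts only if both sides changed it to different results.
    conflicts = {p for p in ours_changed & theirs_changed
                 if ours.get(p) != theirs.get(p)}
    merged = dict(ours)
    for p in theirs_changed - conflicts:
        if p in theirs:
            merged[p] = theirs[p]
        else:
            merged.pop(p, None)   # deleted on their side
    return merged, conflicts


base = {"src/main.c": "v1", "issues/issue1": "teh build fails"}
feature = {**base, "src/main.c": "v2"}                   # code change on a branch
mainline = {**base, "issues/issue1": "the build fails"}  # issue edited on main

merged, conflicts = merge(base, feature, mainline)
print(merged)
print(conflicts)  # set()
```

The two sides touch disjoint paths, so nothing conflicts; a conflict only shows up if both sides edit the same issue file differently.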
I agree... I wish GitLab would do this too. No reason that everything can't be modeled as a git branch.
GitLab I know models their Wiki as a git repo. I think issues and PRs should be their own repo as well.... a branch for each issue or pr? Could tie it all up using submodules (or not, whatever)... just please someone take the jump first and do this.
DISCLAIMER: I work for GitLab.
We have a feature to replicate GitLab to different locations (EEP). We call it "Geo", which stands for "Geographical Replication". Part of the "Geo" effort is Disaster Recovery, which is under heavy development. With Disaster Recovery, we want to be able to reliably promote any Secondary node to a Primary, so if your US datacenter melts down in a nuclear war, you can start working with your copy in EU, India, China, etc.
Or you could just read disaster porn^W^W post-apocalyptic sci fi.
So, a lot depends on what kinds of info you want to find. But overall, I'd say it's not worth reading, because life's not going to be much fun. Even if you're prepared.
(only partially serious here) I'm not sure recovery from a nuclear event that wipes out US data centers is economically feasible. If an event were to effectively wipe out 20% of the world's economy, I think having my github issues online is the least of my worries.
If it's not Git based then I'm not interested in it.
I see no reason to have a RDBMS for issues, merge requests, pull requests, etc. All of these could be modeled in multiple ways using a Git database. Issues are files in a repo called issue1.md, issue2.md ... or issues are branches on a repo ... or something else.
But they want you to rely on Github... if they could find a way to make Git only work with Github they would in a heartbeat.
Github is not an open source company, and it is not really even a supporter of free software. IMO they are a danger to free software for this very reason. They are following the old school Microsoft model of Embrace and Extend... I am waiting to see if they can extinguish.
GitHub clearly contributes a fair bit to open source - not only their own projects, but the free hosting for open source projects. You have no basis for this accusation, and frankly we are all stupider for having read it.
The first claim was that Github is not an open source company, and they are not. Their core business is not open source. Yes, they have some side projects that are, but so does MS, and I do not believe anyone is going to say MS is an "open source company".
The second claim I made is that they are a threat to free software. I used "free software" here for a reason: free software and open source are different.
GitHub may support some Open Source things, they do NOT support free software.
GitHub's general actions over the years show they do not support free software at all, and have limited support for Open Source software.
They are classic Free Software leeches, using and abusing free software not for the ethical reasons around free software but to advance their profit center.
See, I support free software for ethical reasons, not monetary reasons.
That is without even getting into the MASSIVE issue that is having all of these open source projects on a SINGLE platform. Unless you think github is "too big to fail", which I can assure you it is not.
GitLab Enterprise Edition does not have an open source license. Since GitLab.com runs GitLab EE, their two (presumably) primary sources of revenue (EE licenses and GitLab.com paid accounts) come from non-open source software.
But: it's still fair to call GitLab as a company way more Open Source than GitHub. All of their development happens out in the open, the vast majority of their codebase is Open Source, and the source code for GitLab EE is even made available.
> if they could find a way to make Git only work with Github they would in a heartbeat.
Sorry, but this is unfair :) I remember the early days of Gitlab, before they added all the insanely cool features they have now, when they were just a Github opensource copycat. My thought at the time was: "wow, github is super cool to let them go. At least the almost exact design copy could be a legal problem". Clearly, we would have heard about Github vs Gitlab back then, if Github wanted to lock people in.
Well, if we "heard about GitHub vs. Gitlab", more people outside HN might hear about GitLab. That publicity might be more dangerous to GitHub than the chance of a win shutting GitLab down would be worth.
It is partly the MS EEE model and partly the Adobe model, where they give away free or low cost services to get individuals hooked, often at a young age, then when they are employed at larger firms they push the firms to adopt that software internally.
Get a bunch of Open Source Developers to do your marketing at their "real jobs" so they can sell the Enterprise Version.
Since GitHub is a private company, it is unknown (at least I cannot find the info) whether they are profitable or not, or what their revenue numbers even are, so it is unclear if that is a successful plan or not.
GitHub could very well run into the same problems as SoundCloud.
That's our chart of backend sessions / sec for GitLab.com [1].
There's also a chart for frontend sessions [2]. Please keep in mind that there are a lot of CI / bot sessions on these dashboards. In addition to that, there's much other interesting info on our HAProxy Grafana dashboard [3]. In case anyone's interested in even more metrics, there's a bunch of interesting dashboards on our Grafana instance [4].
There's also plenty of DIY distributed issue tracker tools to keep issue tracking inside a git repo (often as yaml+markdown). I'm still surprised a good one hasn't stepped up with direct GitHub issue/project tool integration+sync, but that should certainly be a possibility.
Sorry to hear you're having trouble scaling GitLab. We have many organizations running GitLab and successfully scaling to tens of thousands of users, and GitLab.com, which is the largest GitLab installation. Since you are a GitLab Enterprise customer, our support team is happy to help you review your scaling problems and resolve them. Please submit a support request at https://support.gitlab.com.
O crap I meant to say GitHUB Enterprise. That's what has trouble. We also have Gitlab Enterprise and that's fine. Sadly for reasons internal we are mostly stuck with GitHub E.
I'm glad to hear that GitLab is working fine for you. And I'm sorry that you feel stuck with GitHub Enterprise. Please know that GitLab has a high fidelity importer https://docs.gitlab.com/ee/workflow/importing/import_project... and our support team has a lot of experience with assisting people with the migration if the migration is the problem.