How Git servers work, and how to keep yours secure (nytpu.com)
167 points by danielg0 on March 12, 2021 | 50 comments



Gitea[0] is a pleasant, easy-to-set-up 'GitHub lite', if you are looking for something more turnkey and less custom:

  - users/organizations
  - issues
  - PRs
  - milestones
  - releases
  - wikis
  - activity/contrib graphs
I have it running on a $5/mo Linode instance for some of my personal projects.

[0] https://gitea.io/


It supports LFS too. I use it for my Unity hobby projects, because I'm already paying for my VPS (which has tens of gigabytes of space), so paying GitHub for extra space would be silly.


Is the code review process roughly on par with the github experience? I can't find many examples of a PR with inline comments, etc.


Is there a rock-solid git server that I can use on a home server for versioned immutable backups of misc. files on personal devices (e.g., account config), as well as private software development git repos?

(I've done a cheaper version of this -- except for the immutable part, and the separation of accounts between devices -- in the past using SSH+SVN to a home server, and it was great.)

I was thinking immutable from the perspective of a device. A given device can pull branches of certain repos, and make commits to the branches. But a device's user account on the git server doesn't have permission to affect past commits. So, for example, if my dodgy Linux smartphone is compromised, a hypothetical person who isn't being nice can't do anything to my backups, other than make bogus additional commits.

Maybe each device has its own branch (e.g., `big-laptop`, `little-laptop`, `smartphone`, `media-server`), where they can commit their changes, and maybe they can pull from main/trunk. And then the physical console for the git server lets me inspect and merge changes from the different devices, so that other devices can pick up those changes.

I thought about starting with Gitlab CE, but that's pretty big, so, even if the features could be made to do what I want, I don't know whether I'd always be running too many vulnerabilities that defeat some of my purposes.


"Is there a rock-solid git server that I can use on a home server for versioned immutable backups ..."

A few things ...

First, 'git' is built into the rsync.net platform and you can do anything you like with it, remotely, over ssh:

  ssh user@rsync.net "git clone git://github.com/freebsd/freebsd.git freebsd"
I personally track a number of repos I consider important and keep my own source trees up to date without running git locally.
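
Keeping such a clone current afterwards is just another remote command, roughly (reusing the freebsd path from the clone above):

  ssh user@rsync.net "git -C freebsd remote update"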

Second, the ZFS snapshots that are taken, nightly, of your entire rsync.net account are immutable (read-only), so if you clone/update your git repos into your account, they are protected from ransomware/Mallory.

Third, we finally have LFS / git-lfs support which pleases me greatly.


A question I had for a long time: is rsync.net affiliated in some way with the authors of the rsync utility?


No, there is no affiliation at all.

However, in late 2005 / early 2006, when we spun it out[1] as a standalone corporation and registered the domain name, etc., I did request, and receive, explicit permission from the authors/maintainers of rsync to adopt, and use, the rsync.net name.

[1] rsync.net began operation in 2001 as an add-on feature to JohnCompanies which was the first provider of the VPS as we now know it.


Thanks for elaborating. I sort of wish this was mentioned somewhere in a FAQ section but maybe that’s just me.


Maybe this git-config setting will do what you need?

       receive.denyNonFastForwards

"If set to true, git-receive-pack will deny a ref update which is not a fast-forward. Use this to prevent such an update via a push, even if that push is forced. This configuration variable is set when initializing a shared repository."


Depends on what you are after, but why not Git + SSH and regular snapshots of the repo? That way, anything a dodgy person did could ultimately be undone relatively quickly.

Any sort of backup of the git repo would also achieve the same thing.

If you are super paranoid, you could also do git over email, à la the Linux kernel, on sensitive repos and only apply trusted patches yourself.


Gitea is super easy to set up and self-host. It has branch permissions, so it's relatively straightforward to, e.g., allow devices to push only to their own branch but not allow force pushes that overwrite old commits.


I recently asked myself the same question, found no satisfactory answers, and wrote a solution. It uses the backup software "restic" to provide secure hosting on a variety of cloud providers. Restic is immutable by design, and my software basically "backs up" the .git folder to a restic repository. I use S3, but you could easily use any cloud storage provider or local NAS or anything else restic supports.

Shameless plug: https://github.com/CGamesPlay/git-remote-restic


> A given device can pull branches of certain repos, and make commits to the branches. But a device's user account on the git server doesn't have permission to affect past commits. So, for example, if my dodgy Linux smartphone is compromised...

Yes, Gitolite can do this.

R is read access only

RW is read and write access

RW+ is read, write and the ability to overwrite history (rebasing)
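
A sketch of the per-device-branch idea from upthread in gitolite.conf terms (the repo, group, and user names are hypothetical; USER is gitolite's keyword for the name of the pushing user):

  @devices = big-laptop little-laptop smartphone media-server

  repo backups
      # each device may create/update branches only under dev/<its own name>/
      R                 = @devices
      RW  dev/USER/     = @devices
      RW+               = admin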


I was going to say, I've got my git server on my NAS, which is itself on a ZFS filesystem, through which snapshots provide immutable backups.

But considering the case of a malicious git contributor: access to any of my devices and SSH keys is already a wayyy bigger issue to begin with, and likely entails restoring the rest of the system to a known-secure state due to the sheer number of files an intruder could have tampered with outside of version-controlled directories.


Is there any reason your example would ever happen? That seems like a bit of a far-fetched security concern to me.


It absolutely happens for some organizations. There are industries around it.

And anyway, seems like good practice, and shouldn't be hard to do, and should fit with workflows already familiar from work.

What would be bad practice is to give less-trusted devices (e.g., Linux development phones, or some disposable PC on which I had to install some sketchy software) access to all my files and backups.

Using git this way might be a simple way (given we have to know git anyway) to get the goodness of backups while selectively syncing various kinds of files both ways with the less-trusted devices.


Gitolite/Gogs/Gitea should all be capable of enforcing policies like the ones you described.

If your concern is data loss/malware, anything at the git level is going to be insufficient (but can still be useful, of course, as you said).

I’d echo the suggestion of zfs snapshots replicated on a separate mirror of disks. I can recommend zrepl to set up the snapshotting/replication/pruning part. Syncoid is another popular one.
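
The manual version of that is roughly the following (pool/dataset and host names are hypothetical; zrepl or syncoid just put this on a schedule and handle pruning):

  # snapshot the dataset holding the repos, then replicate it elsewhere
  zfs snapshot tank/git@$(date +%Y-%m-%d)
  syncoid tank/git backuphost:backuppool/git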


Is there a reason you can’t continue to use Subversion for this? Sounds like its feature set (immutable commits) is closer to your requirements? Subversion still exists and is still easy to set up on a home server. (It’s what I still use for this purpose, even in 2021.)


If interested in self hosting your git repos have a look at gitolite: https://gitolite.com/gitolite/index.html


Gitolite is good if you want a locked-down git server with sophisticated access control.

Gitolite provides a command-line UI only; you can use it with gitweb or cgit to allow people to view repositories in a web browser.

No issues or pull requests or fancy stuff like that!


Gitolite is pretty fancy but depends on a text-based configuration. It's also easy to make your own git-shell, which is basically a shell that will allow certain commands, and assign each ssh key to a git-shell with arguments such as a username if you want to bind to a DB or something.


> basically a shell that will allow certain commands, and assign each ssh key to a git-shell with arguments such as a username

Well, to be fair, you don’t need gitolite for that. You can just do that with any account and the right .ssh/authorized_keys file. prgmr.com uses/used(?) this for out of band access to VMs for example.
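
A hedged sketch of such an authorized_keys entry (the wrapper path, username argument, and key are all hypothetical; the wrapper would inspect SSH_ORIGINAL_COMMAND and allow only the git operations it wants):

  # every use of this key runs the wrapper instead of a login shell
  command="/usr/local/bin/git-wrapper alice",no-port-forwarding,no-pty,no-X11-forwarding ssh-ed25519 AAAA... alice@laptop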


Oh man, I remember 10 years ago when Gitlab was a very modest UI on top of this.


The article presumes that you're running Arch on the server and that you're using nginx as the web server.

The latter half of that is particularly striking to me, given that he immediately dives into a shortcoming of nginx... and rather than reaching for Apache, he works around it.


Arch also doesn't make for a particularly good server OS.


Why? (Asking as someone running Arch on many servers for the last 5 years, and debian server for much longer)

I can think of a few things I dislike, but all of the failures so far were self-inflicted, like forgetting to update packages that I compiled myself outside of the supported repositories.

It's not completely autonomous when it comes to upgrades; you have to think for a while before hitting "yes" after seeing the list of packages to update. But neither is Debian in the long run. Some of the servers I have to manage are 13 years old or so, and going through major dist-upgrades is never that pleasant either; it's a bit more of a hassle, because I don't trust major version upgrades, so I have to run them on a backup VM first just to see whether any issues crop up.

But having the latest versions of the programs is great, I don't have to second guess myself when writing new programs (will it be compatible?), can use the latest kernel APIs, etc.


This is all personal experience, so obvious salt is required with it.

I've had bizarre networking bugs pop up on arch that I've not had elsewhere. I also just don't like rolling releases as much as I used to. Since the majority of what I do can be containerized, I prefer a much slower release cadence for my hosts, and anything that requires more up to date packages just gets thrown in a container.

Basically, I think my server workflow just doesn't line up with how arch works. On desktops it's great, since I'll always have up to date video drivers, desktop environments, etc, but on a server I don't usually use things that require the latest and greatest software. I figure as long as it's still getting bug fixes, I'm probably fine.


Arch changes too much, and change means risk. In my opinion, for production, boring is always best. If the latest flavor of X is absolutely essential for the business, there are typically ways to get it for most LTS distros (backports, third-party repos, etc.).


It also changes in ways that can make it hard to automate deployments. For instance, the base package group used to include a kernel, but no longer does.
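
For example, a scripted install that used to get a kernel via the base group now has to ask for it explicitly, something like:

  pacstrap /mnt base linux linux-firmware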


I like self hosting git but these tutorials set you up with only a one-machine solution. I'd like to be able to self-host a git service that's robust in the face of network/hardware/OS maintenance.

I know git is distributed by design. So if I want to push code to a pair of servers for better availability, I can do it explicitly:

  git push <remote1> <branch>
  git push <remote2> <branch>
But what if I wanted to make this transparent but still highly available, such that the remote URL in

  git push <remote> <branch>
is actually backed by a HA cluster?

Some of the software and ops to make this happen is GitHub's secret sauce. I'm not looking to compete with them, but would love an open source solution with better uptime than a single DigitalOcean droplet running Debian. Ideally, I could get there without green-fielding raft consensus shims into a modified git binary.


Plenty of alternative approaches that can get you there. Here’s one:

  * distributed file system like gluster or ceph for repos
  * clustered db (eg replicated Postgres)
  * redundant instances of gitea
  * load balancing

Am I missing something?


Load balancing redundant gitea over clustered postgres and clustered fs provides a resilient read-only stack.

The trouble comes when the system receives two simultaneous pushes to the same branch. When ceph goes to merge them, which one wins? There has to be a distributed write mutex. Perhaps this mutex could be acquired in a pre-commit hook on the gitea nodes, but it's absolutely necessary (in addition to the other clustered services) to prevent silent data loss or corruption.


post-receive hook pushing everything to other instances. How the push happens could be git remotes, rsync, or whatever else you want.
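
A minimal post-receive sketch of the git-remote variant (the mirror remote's name is hypothetical):

  #!/bin/sh
  # post-receive: forward every accepted update to the other instance
  git push --mirror other-instance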


  git remote set-url --add origin $second_url
Not quite HA cluster levels of redundancy, but it's also way simpler to set up.


I actually have this at work for a repo where we want to push both to the server where it'll be deployed (so this server doesn't need any sort of access to another system; trust relations go only one way) and to the source repo where we actually collaborate, have an issue tracker, etc. Works great for us. Yes, of course you can set up some limited access for this server to pull from the original source, but... small team with high security requirements, so this low-tech solution is quite perfect.


I push to a couple of places that have a main git server everyone pushes to/through, which then pushes everywhere else.

A post-receive hook [1] can be used to automate that.

[1] https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#_po...


For those trying to host a git server on their home network like I have, and who ran into network issues even after port forwarding: you might be behind carrier-grade NAT.

You'll need some forwarding solution, or use an SSH reverse tunnel to punch through the CGNAT.

Use something like ngrok or localhost.run. For example, if you're using Gitea, host it on localhost:8080, then run this:

  ssh -R 80:localhost:8080 localhost.run

This will make localhost:8080 reachable at <subdomain>.localhost.run.

Now you just pass <subdomain>.localhost.run to colleagues and they'd be able to connect to your self-hosted git instance.
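
They'd then add it as a remote in the usual way, e.g. (the owner/repo path is hypothetical, assuming Gitea's default URL layout):

  git remote add origin https://<subdomain>.localhost.run/youruser/yourrepo.git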

Now do the same for the SSH port.

The caveat is that your traffic routes through localhost.run, so it's best not to use this for anything serious, or alternatively host your own reverse SSH tunnel on a VPS somewhere.


In addition to this:

Carrier-grade NAT ISPs sometimes assign globally scoped IPv6 addresses, and if the other endpoint has IPv6 support too, you can break out easily using the assigned IPv6 address.
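
You can check whether your ISP hands you one with something like:

  # addresses listed here (other than ULAs in fd00::/8) are reachable candidates
  ip -6 addr show scope global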

Rather than that, I would recommend reading up on DNS exfiltration techniques [1] and things like pwnat [2], which uses faked ICMP reply packets that make routers think they forgot to let a data packet through for a hop trace.

And if you have the time, I'd recommend using WebSockets as a tunneling protocol, because it's very flexible in its payload size and allows compression via WebSocket extensions and the srv flags. I wrote a detailed article that explains the WS13 protocol and all its quirks [3].

In addition to that, it's good to know the limitations of a SOCKS proxy, since that's what most "easy to use" implementations provide. Spoiler: forget IPv6 via SOCKS5 proxies. I also wrote a detailed article about its quirks [4].

I'm currently experimenting with the idea of a DNS protocol implementation that uses multicast DNS service discovery to find local peers and DNS exfiltration techniques to break out of a CGNAT, but I'm not far enough along yet to write a detailed article about it. It's current research for my stealth browser project.

[1] https://blogs.akamai.com/2017/09/introduction-to-dns-data-ex...

[2] https://github.com/samyk/pwnat

[3] https://cookie.engineer/weblog/articles/implementers-guide-t...

[4] https://cookie.engineer/weblog/articles/implementers-guide-t...


Please excuse the off-topic post here, but I found no other way of contacting you: how did you manage to put 32 GB of RAM in the T440p you're describing in that old post of yours?

" Using a t440p base as my laptop, best laptop for the buck. bought it as a 4300m model with a dual core. now it has an IPS display, better coreboot+bios update, 32gb ram, i7-4712, 2x 512gb ssds plus a 4tb hdd. all together cost me less than 600eur. hackintosh compatible if necessary, though it's running Arch these days. "

If it's via a modded coreboot revision, please do mail me the file when possible: delio_man@abv.bg

Thanks in advance, and sorry about the spam!


How could one self-host a localhost.run-style setup on one's own wildcard domain?


If you're talking about,

> alternatively host your own reverse SSH tunnel on a VPS somewhere.

To make a quick version: on a VPS or somewhere, install the OpenSSH server. Modify your sshd_config file, adding:

    GatewayPorts yes
Then you can use something like this,

    ssh -R 8080:localhost:22 user@server.example.com
After that, you can use,

    ssh user@server.example.com -p 8080
from any other computer and it will connect you to the machine you ran the ssh -R from.


You can also use git-shell, which is a login shell that can be given to SSH-accessible accounts and restricts them to Git commands.
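
Typically you'd give a dedicated account git-shell as its login shell, roughly like this (the binary's path varies by distro, and it may need adding to /etc/shells):

  sudo chsh -s /usr/bin/git-shell git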


I enjoyed reading the source for https://github.com/c4milo/gitd, also in Go, as a way to simply wrap the Go utilities, add your own auth, and get Heroku-like deployments.


What would be the best-practice way of allowing git pushes similar to how Heroku handles them? I want to implement a feature that allows users to push to deploy new themes.


Push to an intermediary location. Use a post-receive hook to trigger whatever actions you need. At the end, push that to the server and reload whatever you need.

https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#_po...
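
A rough post-receive sketch of that flow (the remote, branch, host, and service names are all hypothetical):

  #!/bin/sh
  # post-receive on the intermediary repo: run checks, forward, reload
  # ...whatever validation/build of the theme you need goes here...
  git push production main
  ssh deploy@appserver "systemctl reload theme-service"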


Does anyone host something like Gogs in the cloud for themselves?


We use Gitea (a Gogs fork) for our company and it works great :)


What CI runner?


Not OP, but we use Drone[https://www.drone.io/] with Gitea[https://gitea.io/en-us/]


We actually use drone, too! It has its quirks (for example, it would be nice to be able to start jobs via the web interface), but it works well for us.



