Perhaps for those familiar with "functional data structures" such an analogy is ...

pdonis · on Dec 9, 2017

> git doesn't actually create new copies of the content for each commit

More precisely, it doesn't create new copies of content that you didn't change. For example, if you have 100 files in your repo and you change one of them and then commit, git creates a new copy of the content of the file you changed--a new blob storing the new file content--and a new tree object that references the new blob instead of the old one, plus the other 99 blobs that store the contents of files you didn't change; the new commit object then references the new tree object (plus the message and metadata). But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.
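To make the 100-file example concrete, here is a minimal sketch of a content-addressed store in Python. The `store` dict, the helper names, and the text-based tree and commit layout are assumptions made for illustration; real git uses a binary tree format with file modes and records author and committer metadata. Only the `"<type> <size>\0<body>"` hashing scheme is meant to mirror git itself.

```python
import hashlib

store = {}  # object id -> raw bytes; a stand-in for .git/objects


def put(kind, body):
    """Store an object under the SHA-1 of its header plus body."""
    data = f"{kind} {len(body)}".encode() + b"\0" + body
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = data  # identical content maps to the same key, so nothing is duplicated
    return oid


def write_tree(files):
    """Simplified tree: one 'name oid' line per file (assumed layout, not git's)."""
    entries = [f"{name} {put('blob', content)}" for name, content in sorted(files.items())]
    return put("tree", "\n".join(entries).encode())


def commit(tree, parent, message):
    """Simplified commit: references a tree and, optionally, a parent commit."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    return put("commit", ("\n".join(lines) + "\n\n" + message).encode())


files = {f"file{i}.txt": f"content {i}".encode() for i in range(100)}
c1 = commit(write_tree(files), None, "initial")
count_after_first = len(store)  # 100 blobs + 1 tree + 1 commit

files["file0.txt"] = b"changed"
c2 = commit(write_tree(files), c1, "change one file")
new_objects = len(store) - count_after_first  # 1 blob + 1 tree + 1 commit
```

In this sketch the second commit adds only three objects; the 99 unchanged blobs are referenced again by the new tree rather than copied.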

jdmichal · on Dec 9, 2017

> But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.

Git pack files compress objects by storing them as diff files going backwards. That is, it stores the most recent state in full, then uses patches to go backwards. Because you're more likely to need a recent version in full than an older one.

https://git-scm.com/book/en/v2/Git-Internals-Packfiles
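A rough sketch of that idea, keeping the newest version in full and each older version as a copy/insert delta against the next newer one. This only illustrates the description above and is not git's actual pack format or its delta-selection heuristics; `make_delta`, `apply_delta`, and `restore` are hypothetical names, and Python's `difflib` stands in for git's own delta algorithm.

```python
from difflib import SequenceMatcher


def make_delta(base, target):
    """Describe target as copy-from-base and insert-literal instructions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse a slice of the base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # literal text only in target
        # "delete" needs no instruction: that part of base is simply skipped
    return ops


def apply_delta(base, ops):
    """Rebuild the target from the base and the instructions."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(base[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


versions = [
    "line one\n",
    "line one\nline two\n",
    "line one\nline two changed\nline three\n",
]
newest = versions[-1]  # stored in full
# deltas[i] turns versions[i + 1] back into versions[i]
deltas = [make_delta(newer, older) for older, newer in zip(versions, versions[1:])]


def restore(index):
    """Walk backwards from the newest version to the requested one."""
    text = newest
    for i in range(len(versions) - 2, index - 1, -1):
        text = apply_delta(text, deltas[i])
    return text
```

Reading the newest version costs nothing extra here, while older versions cost one delta application per step back, which is the tradeoff the comment describes.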

emmelaich · on Dec 10, 2017

This is true but packfiles are an implementation detail.

It's still useful and more accurate conceptually to consider every commit as a complete snapshot of the state of the code at that point.

jdmichal · on Dec 11, 2017

That can be said of every version control system. Restoration of state to any given version is their defining feature. How they achieve that is always an implementation detail, but those details can still be important and interesting.

bbatha · on Dec 11, 2017

Git commits are composed of all of the files in the commit, its parent, and the commit message. This is an important guarantee that each checkout is valid without the rest of the repo. It lets a lot of exotic implementations guarantee consistency between them. Meaning if you're GitHub you can distribute commits across many servers. Or you're Microsoft and you build partial checkouts for GVFS. It’s what allows Git LFS to keep many of git’s core guarantees while making tradeoffs to improve areas where git is traditionally weak.

emmelaich · on Dec 12, 2017

Sorta true but see what bbatha said.

There are people who distinguish changeset-oriented from snapshot-oriented systems and will hotly debate which one is better.

But as you say, restoration of state is a necessary and defining feature.

nine_k · on Dec 9, 2017

Exactly. Those familiar with functional data structures would thus point you at git and say: here's a structure from the Okasaki book that you use every day.

One more step is to point out that futures / promises, and even lists, are monads that e.g. a JS programmer uses every day, too. It reminds me of the old literary character who did not know that he'd been speaking prose all his life.

rst · on Dec 9, 2017

Particularly since a git repo, as a whole, isn't a functional data structure. The commits (and the graph in which they're embedded) are immutable, but the mapping between branch names and commits gets mutated all the time. (To say nothing of slightly deeper esoterica like the index and stashes.)

masklinn · on Dec 9, 2017

That's pretty functional still: the "branches" are just (in Clojure terms) atoms, aka atomically mutable references to a structure (namely a commit, which is some metadata + a reference to a tree).
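A small Python sketch of that split between immutable commits and an atom-like branch reference. `Commit`, `Ref`, and `compare_and_set` are hypothetical names loosely modelled on Clojure's atoms, not git's implementation, and the tree ids are placeholder strings.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Immutable: fields cannot be reassigned after construction."""
    tree: str
    parent: Optional["Commit"]
    message: str


class Ref:
    """Atom-like cell; the only mutable thing in this sketch."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()

    def get(self):
        return self._target

    def compare_and_set(self, expected, new):
        """Move the ref only if it still points at the expected commit."""
        with self._lock:
            if self._target is not expected:
                return False
            self._target = new
            return True


root = Commit("tree-1", None, "initial")
master = Ref(root)

tip = Commit("tree-2", master.get(), "second")
moved = master.compare_and_set(root, tip)  # succeeds: master still pointed at root

# A writer that still believes master points at root loses the race.
stale = master.compare_and_set(root, Commit("tree-3", root, "racing update"))
```

Moving the branch never touches `root` or `tip`; only the pointer inside `Ref` changes, which is the sense in which the commit graph stays functional while the branch name is mutable.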

kazinator · on Dec 9, 2017

> Git lets you do version control via full snapshots as opposed to just tracking diffs.

That's completely orthogonal to whether the version history is immutable, with each branch being like a singly linked list where we "cons" new things onto the front.
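For anyone who hasn't met "cons" before, a minimal Python sketch of the linked-list picture. `Cons` and `history` are hypothetical names and the commit labels are placeholders; real git history is a DAG, since merge commits have more than one parent.

```python
from typing import NamedTuple, Optional


class Cons(NamedTuple):
    """One immutable cell: a commit label plus a link to the older history."""
    head: str
    tail: Optional["Cons"]


def history(cell):
    """Walk from the newest commit back to the root."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


base = Cons("B", Cons("A", None))

# Two branches "cons" new commits onto the same shared tail.
feature = Cons("D", Cons("C", base))
main = Cons("E", base)
```

Here `history(feature)` walks D, C, B, A and `history(main)` walks E, B, A. Both branches share the very same `base` cells, and adding to one never modifies what the other sees.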

misnome · on Dec 9, 2017

That’s the point; because sentences like

> the version history is immutable with each branch being like a singly linked list where we “cons” new things onto the front

just contribute to the general confusion around git, where people decide it’s too complicated to learn.

(Apologies if my sarcasm detector is just broken today)