"However, there is something else we can do. It is more radical, but also more useful. Rather than letting people only evaluate papers, why not give them a chance to participate and improve them as well? Put all your papers on github and let others discuss them, open issues, fork them, improve them, and send you corrections. Does it sound crazy? Of course it does, open source also sounded crazy when Richard Stallman announced his manifesto.
Let us be honest, who is going to steal your LaTeX source code? There are much more valuable things to be stolen. If you are a tenured professor you can afford to lead the way. Have your grad student teach you git and put your stuff somewhere publicly. Do not be afraid, they tenured you to do such things."
This is a fascinating idea. I'm curious. If we did this (say, with an upcoming technology paper), would anyone want to contribute?
Richard Stallman still sounds crazy. But open source software has gotten credibility because it's very good. Linus Torvalds had more to do with the acceptance of open source than Richard Stallman.
PS: I hate the stupid fights over words: "Gnu/Linux", "Free software"... It's called "Linux" and it's called "open source software". Now get off my lawn.
He doesn't sound crazy; he is just very eccentric as an individual. And he has an "extremist" position, some would say. IMHO we need someone like that to keep us honest. Remember all the years people like Microsoft put actual money into discrediting FOSS? Would a wishy-washy attitude have done much good then? Diplomacy doesn't work worth a damn when your enemy wants to destroy you by any means possible, be it by embracing you or extinguishing you.
His extreme positions sure sounded crazy to me. Like "I refuse to have a cell phone because they are tracking and surveillance devices... many can be remotely converted into listening devices."
After this NSA scandal broke, I feel I was a little quick to judge him. I'm not going to give up my cellphone, but I've already quit using FB and have started using more digital privacy tools like OTR in pidgin.
I get that "Free software" means more than "open source" and I appreciate that the difference is meaningful and significant.
I haven't heard an argument for "GNU/Linux" that's any more than "hey a bunch of GNU code is part of linux so GNU should be in the name." which seems a much less principled objection. If I'm wrong though, I'd love to be corrected.
Well, I suppose that I still cling to a mentality from a decade or so ago, when it seemed worthwhile using "GNU/Linux" to refer to the user-space system (e.g. "Debian is a GNU/Linux"), as opposed to just the kernel. But these days "Linux" is almost universally used to mean the former, and "the kernel" is used to refer to the latter.
So I suppose you are right, it is a much less principled objection :)
I like this blog post better. It goes more into the radical innovation (for academia) of the authors' process:
"But more importantly, the spirit of collaboration that pervaded our group at the Institute for Advanced Study was truly amazing. We did not fragment. We talked, shared ideas, explained things to each other, and completely forgot who did what (so much in fact that we had to put some effort into reconstruction of history lest it be forgotten forever). The result was a substantial increase in productivity. There is a lesson to be learned here (...), namely that mathematicians benefit from being a little less possessive about their ideas and results. I know, I know, academic careers depend on proper credit being given and so on, but really those are just the idiosyncrasies of our time. If we can get mathematicians to share half-baked ideas, not to worry who contributed what to a paper, or even who the authors are, then we will reach a new and unimagined level of productivity. Progress is not made by those who break rules."
"Truly open research habitats cannot be obstructed by copyright, profit-grabbing publishers, patents, commercial secrets, and funding schemes that are based on faulty achievement metrics. Unfortunately we are all caught up in a system which suffers from all of these evils. But we made a small step in the right direction by making the book source code freely available under a permissive Creative Commons license. Anyone can take the book and modify it, send us improvements and corrections, translate it, or even sell it without giving us any money. (If you twitched a little bit when you read that sentence then the system has gotten to you.)"
and racist. And if my subjective impression is true, there are some other biases as well. Basically this picture looks like our University math department 25 years ago, maybe with a few more very old professors.
to the parent - by now, man, you probably know that a lot of people with downvoting karma here do lack a sense of humor and/or have very low tolerance for sarcasm. The main impact of XML innovation on human civilization is the ability to use <sarcasm> in an environment like this to at least minimize the wrath of the former group. Nothing though can help with the latter.
I have a sense of humor and high tolerance for sarcasm, and didn't downvote. But look at the long thread of sibling & niece/nephew comments to yours. Do you really think that the parent's comment led to any further useful discussion?
Is there some sort of evidence behind this statement or is it just the usual "x is sexist/racist/ageist/etc." based on a photo and the assumption that there's some sort of agenda behind selection when there's any sort of clear majority that's male and/or white? I'm asking honestly because I can't tell any more.
Apparently, I didn't know what reactionary meant. In case I'm not the only one, I'll link to the wikipedia page and hope it does a better job explaining the term than the 15,000 word article posted by Jach.
We've been watching subtle differences unfold for decades. The PC movement was aghast that anyone would dare think that men and women had any differences at all. That's been chipped away at for decades now. The burden of proof is now on those who think that they are exactly the same, despite mounds of evidence suggesting that there are indeed some notable differences. We don't know the full extent of the differences, but this is not one of those cases where you get to push up your glasses and say "Well, without 100% proof you shouldn't believe it."
Every time someone mentions that they think there's a difference between males and females, someone pushes up their glasses and asks "source?", as if the 6th layer of comments on a news aggregator is the place to rehash decades of complicated studies and biology.
Accept that the position that there are some differences is a reasonable one and move on, even if you disagree. We don't know how much of it is mental, but we don't know much about "mental" at all right now. We're guessing based on what we do know.
Just as long as we all don't let our guesses about the unknowns of science cloud our judgment in specific, concrete situations, we'll be perfectly fine. And merely believing that some differences do exist won't itself cause that.
> Accept that the position that there are some differences is a reasonable one and move on, even if you disagree.
Accepting that there are some differences does not lift the burden of persuasion off those who assert the existence of a particular difference, in the same way that accepting the idea that there are some murderers in the world doesn't lift the burden of persuasion off those who insist that a particular person is a murderer.
There are a lot of innate differences. But sometimes we observe a difference (say, gender distribution in graduate math) and we don't know why we observe it. It is valid to conjecture that perhaps the observed difference is due to one of the innate differences. It isn't proof, but it's valid conjecture in the absence of anything better.
There are also a lot of differences in socialization and social treatment that are not innate differences.
> It is valid to conjecture that perhaps the observed difference is due to one of the innate differences.
It is also valid to conjecture that perhaps the observed difference is due to cultural/environmental forces and not due to one of the innate differences.
There's a big gap between conjecturing that something may perhaps be true (which is a very weak position that doesn't require much support) and asserting that something is true, or probably true, or the best explanation given the available evidence.
> It is also valid to conjecture that perhaps the observed difference is due to cultural/environmental forces and not due to one of the innate differences.
It is. But we originally started with someone conjecturing that it wasn't, and getting told off for it. Don't change it from "believing X is reasonable" to "believing not X is also reasonable".
> There are a lot of innate differences. But sometimes we observe a difference (say, gender distribution in graduate math) and we don't know why we observe it.
The assertion that there are innate differences? Of course. They're flooded with different hormones from birth. Throughout life, the genders will experience vastly different hormones. Their development process is different from very early on. It's not drastically different, but it's a key difference very early on. Males and females then go on to have notable differences in physical prowess like strength, coordination, and in other areas like chemically-motivated aggression, etc.
We know there are differences. We've chipped away at it for decades, and no one in science disagrees. The problem is that the obvious differences that are manifested are all macro physical, we don't know how the differences extend to the mental area. But then, in these kinds of arguments we don't really even know what we mean by "mental" because, technically, it's all physical. (At least, the part science cares about.)
Some people think that it doesn't extend to "mental". Some do. Considering the basic fact that our mental states are very obviously modified chemically, and that males and females get different chemical influences throughout life, I can't think anything but we have to leave that possibility open that there are consistent developmental differences. This is not one of those situations where it's probably most accurate to assume there is no difference until one is proven.
Also, stonemetal posted a link elsewhere in this comment thread. No one wanted to reply to that, they just wanted to post "are you sure?" in this thread.
And I can't help but wonder if this is supposed to be satire, considering above I posted:
> Every time someone mentions that they think there's a difference between males and females, someone pushes up their glasses and asks "source?", as if the 6th layer of comments on a news aggregator is the place to rehash decades of complicated studies and biology.
Quibble: I downloaded the pdf, which defaults to a file called 'hott-online.pdf'. Why not 'Homotopy Type Theory - 1st Edition.pdf'? I download a lot of pdfs (mainly because of being a law nerd), and I am sick to the back teeth of having to rename virtually everything that I download so that I'll be able to find the file again later. I do use Mendeley to keep my documents more-or-less organized, but what have people got against human-readable filenames? Really I ought to be able to get semantic metadata for everything I download.
As for the work itself, great stuff, I look forward to reading it, or at least dipping into it (not a mathematician).
I doubt I'll ever willingly include spaces in file names, and I avoid punctuation too if I can.
If you've got a directory system that goes many levels deep, though, those long file/folder names won't fit on your screen, and so you can't see at a glance where things are.
I do sympathize with your plight though, I guess I've just accepted that if I want to find something I download again I have to rename or move it to where it ought to belong.
A similar application exists for Gnome, called Referencer (https://launchpad.net/referencer). It keeps a library of PDF files and can extract metadata such as title and DOI.
Follow-up: now that I've had time to check, at least there's proper metadata in the .pdf properties. Meantime, my search for the Netscape Navigator of the Semantic Web continues :)
Someone should make an effort to market Git to mathematicians (and academics and scientists in fields where Git exists).
This article is a start, but it would probably be stronger if the author had a nuts-and-bolts tutorial about how to use Git from the point of view of someone whose use case involves LaTeX markup rather than source code.
Honest question: what is the advantage of Git over something like Dropbox or remote centralized storage accessed via ssh? The only possible answer I can think of is you get a revision history, but this doesn't seem terribly useful.
If you could provide a use case to demonstrate how Git+latex is a winner, I'd genuinely appreciate it!
Edit: As soon as I submitted this it occurred to me that for collaborative editing Git is the obvious choice.
I think the most important difference is conflict resolution. If you modify a file on Dropbox and save it while someone else still has it open, you're going to end up with two files, one named yourFileName (conflicted copy of someone 06/20/13).tex and the original. Not only does Git let you see and fix the conflicts when you merge, it resolves them automatically most of the time, which is a huge advantage.
I haven't found git to work much differently in practice, at least with papers (I haven't written a book in it). You will end up with way too many conflicts unless you basically "lock" sections through either a software mechanism or agreement, because edits often touch or reorder many paragraphs. A copyediting/wording pass will typically touch every paragraph, for example, and these are more frequent in writing than refactoring or variable-renaming passes are in coding.
You can somewhat improve the situation if you adopt a somewhat weird source-formatting policy, where you write each sentence on one separate long-ish line, rather than flowing by paragraph. Then you will be able to automatically merge as long as nobody has made an edit or move that touches the same sentence, giving you better merge granularity. But even then conflicts happen pretty often, and few people like this style of formatting (maybe it'd work better with tool support).
For a book this might work better, though, because I assume you'd be making less frequent edits, and people would less often be working at the same time.
Sure, but in practice, starting each sentence on a new line isn't difficult and works great with standard 'diff'.
I actually quite like it for proofreading, since fragments and run-on sentences jump out at you.
> You will end up with way too many conflicts unless you basically "lock" sections through either a software mechanism or agreement, because edits often touch or reorder many paragraphs. A copyediting/wording pass will typically touch every paragraph, for example, and these are more frequent in writing than refactoring or variable-renaming passes are in coding.
As stated above, Dropbox doesn't even try to merge files, but Git does, which is definitely an advantage (at the very least, it tells you where the conflicts are, which is far more reliable than trying to work out the conflicts by hand, or with a separate diffing tool).
Review: If you get into the habit of git diff --cached before every commit, it's a good way to review new changes before they go in, and you don't have to re-read the whole section -- you can just read your changes.
Merges: Git does all the bookkeeping for collaboration -- if you have two people making changes, it's much easier to merge them with a VCS. Or an independent contributor can use rebase to, as the manual page succinctly puts it, "forward-port local commits to the updated upstream head."
Discussion: If you use a Web-based product like GitHub or GitLab (an open-source, self-hostable GitHub clone), you get an issue tracker and a dedicated discussion thread for individual changes.
Even when there is only a single user, branching can be quite useful, too. When I was writing my thesis, I usually worked on different parts at the same time. Having each changeset in a separate branch made it easier, especially when I procrastinated by fiddling with various settings like margins, colors and page styles.
> what is the advantage of Git over something like Dropbox
Merging. People can work on the manuscript at the same time, and as long as they edited non-overlapping regions, git will merge both contributions silently and painlessly.
Editing a word processor document stored in a shared Dropbox folder, I guess you'd need to inspect both new versions and do the merge manually?
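To make the "non-overlapping regions" point concrete, here is a minimal Python sketch of a line-wise three-way merge. It is not Git's actual algorithm: it assumes all three versions have the same number of lines (in-place edits only, no insertions or deletions), and `merge_lines` and the sample sentences are made-up names for illustration.

```python
def merge_lines(base, ours, theirs):
    """Simplified three-way merge; assumes equal line counts in all versions."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (untouched, or the same edit)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: needs a human
            conflicts.append(i)
            merged.append(f"<<<<<<< {o} ======= {t} >>>>>>>")
    return merged, conflicts


base = ["Intro sentence.", "Main claim.", "Closing remark."]
alice = ["Intro sentence, reworded.", "Main claim.", "Closing remark."]
bob = ["Intro sentence.", "Main claim.", "Closing remark, extended."]

merged, conflicts = merge_lines(base, alice, bob)
print(merged)     # both edits survive
print(conflicts)  # no conflicting line indices
```

With a shared Dropbox file there is no common `base` to compare against, which is roughly why the merge has to be done by eye there.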
I'll try:
A Git repo can be forked and cloned, and you get to keep a local copy of whatever branches you prefer. Managing different versions from different authors while keeping an "official" repo seems harder with Dropbox than with Git to me.
Well in that case couldn't you just use the native github application? It has a decent UI for normal usage scenarios. (Assuming they're using GitHub like these authors.)
We are also tackling this at http://banyan.co . Our major focus right now is publishing tools, but science/academics need improvements pretty much everywhere
To be fair, this is a community of mostly non-mathematicians, most of whom will probably have no use for this work. But yeah, certainly that's the real "big deal" here.
"Nicolas Bourbaki is the collective pseudonym under which a group of (mainly French) 20th-century mathematicians wrote a series of books presenting an exposition of modern advanced mathematics, beginning in 1935."
I found this new book surprisingly readable, but agree with your general criticism - I hope others will be inspired by this project to do more elementary open-source affordable books aimed at a more general audience. Imagine if kids could afford to buy their own maths books instead of having them dispensed by the school district with all the corruption that entails.
I wonder if GitHub are planning to add LaTeX file compilation any time soon. It would be amazing to be able to see diffs on the generated output rather than the source code.
Shamelessly copied [1] from tactics (go upvote him there) from the Haskell subreddit in case it's of interest to anyone here:
"Homotopy Type Theory is a recent advancement in the area of dependent types. Think Agda, Coq, Idris-style languages if you're familiar with them... otherwise think GADTs on supersteroids gone berserk.
Dependent types allow you to be extremely precise with your data types. You can talk about not just lists or lists of strings.... but also lists of strings of length n (for some natural number n). In the far future, it may be the key to getting fast-as-C performance (think removing bounds checking on arrays completely safely) and software verified correctness of a program simultaneously.
This isn't software, though. This is a math book.
There was a realization a few years ago that equality types (the ability to express x = y in the type system) gave rise to a mathematical structure called a weak ω-groupoid which was giving homotopy and category theorists a hard time. Homotopy Type Theory (HoTT) is a typed lambda calculus that makes studying these things easier. In fact, every data type corresponds to (a very boring) weak ω-groupoid.
What this allows mathematicians to do, though, is to create new interesting data types corresponding to more interesting examples of these things. You get a data type for a Circle, or a Sphere, or a Torus. You can define functions between them via recursion the same way you'd define a function on lists or trees. These new fancy data types are called higher inductive types, and while they don't (currently) have any use for programmers, they pay the meager salaries of long beards in the ivory tower.
The other novelty of the theory might be more interesting for programmers some day (at least if you believe dependent types will save the world).
A guy named Voevodsky proposed a new axiom called the Univalence Axiom that makes HoTT a substantial alternative to the former foundations of mathematics. The Univalence Axiom formalizes a practice mathematicians had been using for a long time (despite its technical incompatibility with ZFC). tl;dr, the Univalence Axiom says that if two data types are isomorphic then they are equal.
Eventually, this axiom may allow a programmer to do some neat things. For instance, a programmer could write two versions of a program -- a naive version and a "fast" version. (Currently, all programmers only write the "fast" version). If you want to formally prove your "fast" program doesn't suck, it's nasty. However, it might even be humanly possible to prove some correctness about the naive version. The Univalence Axiom (once given "computational semantics") may be able to let us prove things about the dumb, slow, reference implementation of a program or library, then transfer that proof of correctness to the fast one.
To give a small example for anyone familiar with a dependently-typed language, you may notice that in Coq and Agda and whatever, the first data type you learn (and one you stick with for a long time) are the unary natural numbers. That is, you have 0, and 0+1, and 0+1+1, and 0+1+1+1, etc.. We use unary numbers because they are reaaaally easy to prove stuff about. But as any programmer might guess, actually doing anything with them is suicide. The Univalence Axiom would allow us to keep on working with unary numbers for all of our proofs, but then swap them out for actual, honest-to-God 2's complement representations when it comes time to run the program.
So there's that.
Not everyone cares about software correctness, though. But if you're sold on category theory, here's a neat trick. You probably know that equality becomes a hairy, nasty thing in category theory. Two objects can be equal or isomorphic. If you move onto 2-categories, two categories can be equal, isomorphic, or equivalent! And for higher category theory, you end up with even more notions of equality, isomorphism, equivalence, etc, etc.
In a univalent foundation of category theory (which appears in the later chapters of this book), we see that all of these notions of equality collapse down into just one. If two things are isomorphic, then they are, by definition, equal to each other. You no longer have to worry about that stupid squiggle over your equals signs, because univalence means that every construction must respect the structure of your data. There are no leaky abstractions in your data types!"
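For anyone who wants to see what "lists of length n" and unary naturals look like in practice, here is a small Lean 4 sketch. It is plain dependent type theory, not HoTT: there are no higher inductive types and no univalence here, and the names `N` and `Vec` are my own choices for illustration.

```lean
-- Unary natural numbers: zero, or the successor of another number.
inductive N where
  | zero : N
  | succ : N → N

-- Addition by recursion on the second argument.
def N.add : N → N → N
  | m, .zero   => m
  | m, .succ k => .succ (N.add m k)

-- Easy to prove things about: this holds by unfolding the definition.
theorem N.add_zero (m : N) : N.add m .zero = m := rfl

-- A list whose length is part of its type.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons {n : Nat} : α → Vec α n → Vec α (n + 1)

-- `head` only accepts vectors of length n + 1, so the empty case
-- is ruled out by the type checker rather than by a runtime check.
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons x _ => x
```

The `Vec.head` definition is the small-scale version of the "remove bounds checking completely safely" idea in the quote: the emptiness check happens once, at type-checking time.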
I guess no one on here has heard of the stacks project. It's over 3800 pages and I counted 93 contributors on the first page. Also, the LaTeX source is freely available on github. https://github.com/stacks/stacks-project
So, obviously I don't want to sound ungrateful and I'm impressed that they put this together and made it freely available. I'm not sure that creative commons is the right license though, since it doesn't require the LaTeX source to be redistributed if the pdf is (unlike the GPL or the GNU Free Documentation License). (They're providing the source on GitHub, but if interest in the project trails off and someone else forks it, the person forking is under no obligation to share the modified LaTeX).
I can think of many sets of notes on the MIT OCW site that are CC, which I'd like to be able to modify and share but can't because the source is missing.
Anyone else have thoughts on the right license for this sort of project?
http://math.andrej.com/eff/ (Language Eff, a functional programming language based on algebraic effects and their handlers.) My understanding is that in this language all effects and evaluation order are explicit; effects and pure functions are easily and explicitly composable. Something like monads but more advanced. (My interpretation is probably wrong!)
http://andrej.com/plzoo/ (The Programming Language Zoo). A number of mini languages which demonstrate various techniques in design and implementation of programming languages. (calculator, mini-ML, mini-Haskell, mini-Prolog, etc)
If I were to go back to college, I'd probably try to get all my engineering friends to use git for projects, to the point of annoying them. Even now when I ask them why they don't use github to do something that would make their job easier, I get blank stares… such is life, I suppose.
I guess it's partly resistance to change. If you tried to follow every recommendation a friend gave you, you'd go mad.
But sometimes it's hard to encourage VC simply because it shifts the balance of power. You can't hold anyone hostage with "your" code. Mistakes are traceable. You can't waste as much time in meetings, discussing things like project status and interfaces. Silos exist for a reason, it's just not a good reason.
For anyone interested, with a little background in Coq or Agda it feels like (from what I've read so far) this book is pretty approachable by mathematicians and computer scientists alike.
One of the authors listed, Thorsten Altenkirch, teaches quite a few second year Computer Science modules at Nottingham. He's absolutely mad. Here is a selection of some of his best moments: http://www.cs.nott.ac.uk/~txa/thorsten.html.
Note that this hasn't been updated in 13 years. The gold that has been lost since then truly saddens me.
This is awesome! Great use of git & latex! This is the exact problem we are tackling at my startup, Banyan, with our new latex platform http://cl.ly/image/3u3Z3f40382K
Less opportunity for guessing games and amateur sleuthing though. Seriously Bourbaki was the first thing that came into my head when reading the blog post about the book linked to in these comments.
Off topic -- The title of the hacker news submission made me chuckle. It reminds me of the old software principle that "9 women can't have a baby in one month."
The issue is less that git treats every paragraph as a single line, and more that git treats every line as a single line and they've put the paragraphs onto one line. It turns out that LaTeX actually looks for a double line break between paragraphs, and you can insert single line breaks into paragraphs without having them appear in the output.
So what you actually want to do is put a line break after every sentence and possibly after every clause, so that in the LaTeX source diffs work sanely, while the output looks as normal.
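A quick way to see the difference is to run the same one-sentence edit through a line-based diff under both layouts. The sketch below uses Python's standard `difflib` rather than Git itself, so it only approximates what `git diff` would show; `removed_lines` and the sample sentences are made up for the example.

```python
import difflib


def removed_lines(old, new):
    """Return the lines a line-based diff reports as deleted from `old`."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line[2:] for line in diff if line.startswith("- ")]


sentences_old = [
    "Git merges by line.",
    "LaTeX ignores single newlines.",
    "So layout is free.",
]
sentences_new = [
    "Git merges by line.",
    "LaTeX treats a single newline as a space.",
    "So layout is free.",
]

# Layout A: the whole paragraph on one long line.
para_old = " ".join(sentences_old)
para_new = " ".join(sentences_new)

# Layout B: one sentence per line (same LaTeX output).
split_old = "\n".join(sentences_old)
split_new = "\n".join(sentences_new)

print(removed_lines(para_old, para_new))    # the entire paragraph is "changed"
print(removed_lines(split_old, split_new))  # only the edited sentence
```

In layout A any two edits to the same paragraph collide; in layout B they only collide if they touch the same sentence.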
I'd argue that what you want to do is write English the way you're accustomed to: paragraphs with multiple sentences on a single line.
What you'd like Git (or whatever diffing program you use) to do is treat the individual sentences as individual lines for version control.
To each his own though. I have trained myself to start putting LaTeX sentences on separate lines to accommodate this, but I don't think that this is the best solution.
You also want to write English, as you normally write English.
It'd be nice to have an intermediate step that turned your LaTeX into a canonical form that works well with Git. And then when you are about to edit, converts it back into a nice-looking 80-column text file, or whatever you prefer.
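Something like that intermediate step can be sketched in a few lines of Python. This is a rough illustration, not a real tool: the sentence splitter is naive (it will split after "e.g." and knows nothing about `%` comments, verbatim blocks, or math), and `to_sentence_lines` and `to_wrapped` are hypothetical names.

```python
import re
import textwrap


def _paragraphs(tex):
    """Split on blank lines and collapse whitespace inside each paragraph."""
    return [" ".join(p.split()) for p in re.split(r"\n\s*\n", tex.strip())]


def to_sentence_lines(tex):
    """Canonical form for version control: one sentence per line."""
    out = []
    for flat in _paragraphs(tex):
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        out.append("\n".join(re.split(r"(?<=[.!?])\s+", flat)))
    return "\n\n".join(out) + "\n"


def to_wrapped(tex, width=80):
    """Editing form: paragraphs re-flowed to a fixed column width."""
    wrapped = [
        textwrap.fill(flat, width=width,
                      break_long_words=False, break_on_hyphens=False)
        for flat in _paragraphs(tex)
    ]
    return "\n\n".join(wrapped) + "\n"


src = (
    "First sentence. Second\nsentence here!  Third one?\n\n"
    "New paragraph. It has two sentences.\n"
)
canon = to_sentence_lines(src)
print(canon)
print(to_wrapped(canon, width=30))
```

Since both functions only move whitespace around within a paragraph, converting to the wrapped form and back gives the same canonical text, so in principle this could sit in a pre-commit step. Wiring it into Git is left out here because I have not tested that part.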
"However, there is something else we can do. It is more radical, but also more useful. Rather than letting people only evaluate papers, why not give them a chance to participate and improve them as well? Put all your papers on github and let others discuss them, open issues, fork them, improve them, and send you corrections. Does it sound crazy? Of course it does, open source also sounded crazy when Richard Stallman announced his manifesto.
Let us be honest, who is going to steal your LaTeX source code? There are much more valuable things to be stolen. If you are tenured professor you can afford to lead the way. Have your grad student teach you git and put your stuff somewhere publicly. Do not be afraid, they tenured you to do such things."
This is a fascinating idea. I'm curious. If we did this (say, with an upcoming technology paper), would anyone want to contribute?