
". . . unlike every single horror I've ever witnessed when looking closer at SCM products, git actually has a simple design, with stable and reasonably well-documented data structures. In fact, I'm a huge proponent of designing your code around the data, rather than the other way around, and I think it's one of the reasons git has been fairly successful. . . .

"I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

--- Linus Torvalds, https://lwn.net/Articles/193245/




That last comment is absolutely golden. Once upon a time I had the privilege to spend a few years working in Swansea University's compsci department, which punches above its weight in theoretical computer science. One of the moments that made me the programmer I am today (whatever that's worth) came when I was meeting with the head of the department to discuss a book he was writing, and while we were discussing this very point of data vs code, I said to him, realising the importance of choosing the right structure, "so the data is central to the subject" (meaning computer science in general) — to which he replied emphatically that "the data IS the subject". That was a lightbulb moment for me. From then on I saw computer science as the study of how data is represented, and how those representations are transformed and transported — that's it, that basically covers everything. It's served me well.


That's great. It reminds me of a comment by Rich Hickey, the inventor of Clojure:

" Before we had all this high falutin' opinions of ourselves as programmers and computer scientists and stuff like that, programming used to be called data processing.

How many people actually do data processing in their programs? You can raise your hands. We all do, right? This is what most programs do. You take some information in, somebody typed some stuff, somebody sends you a message, you put it somewhere. Later you try to find it. You put it on the screen. You send it to somebody else.

That is what most programs do most of the time. Sure, there is a computational aspect to programs. There is quality of implementation issues to this, but there is nothing wrong with saying: programs process data. Because data is information. Information systems ... this should be what we are doing, right?

We are the stewards of the world's information. And information is just data. It is not a complex thing. It is not an elaborate thing. It is a simple thing, until we programmers start touching it.

So we have data processing. Most programs do this. There are very few programs that do not.

And data is a fundamentally simple thing. Data is just raw immutable information. So that is the first point. Data is immutable. If you make a data structure, you can start messing with that, but actual data is immutable. So if you have a representation for it that is also immutable, you are capturing its essence better than if you start fiddling around.

And that is what happens. Languages fiddle around. They elaborate on data. They add types. They add methods. They make data active. They make data mutable. They make data movable. They turn it into an agent, or some active thing. And at that point they are ruining it. At least, they are moving it away from what it is."

https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
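
To make that contrast concrete, here is a minimal Python sketch (the Order record is a made-up example) of plain immutable data versus the mutable, "active" style the quote warns about:

    from dataclasses import dataclass, replace
    from datetime import date

    # Plain, immutable data: a recorded fact that never changes.
    @dataclass(frozen=True)
    class Order:
        order_id: int
        item: str
        quantity: int
        placed_on: date

    original = Order(42, "widget", 3, date(2020, 1, 15))
    # "Updating" produces a new value; the original fact is preserved.
    corrected = replace(original, quantity=4)

    # The mutable, "active" style the quote warns about: the object is
    # edited in place and the original information is simply gone.
    class MutableOrder:
        def __init__(self, order_id, item, quantity, placed_on):
            self.order_id = order_id
            self.item = item
            self.quantity = quantity
            self.placed_on = placed_on

        def bump_quantity(self):
            self.quantity += 1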


Many decades ago I was coaxed into signing up for an APL class by my Physics professor. He was a maverick who had managed to negotiate with the school to create an APL class and another HP-41/RPN class with full credit that you could take instead of FORTRAN and COBOL (yeah, it was a while ago).

One of the things he pounded into everyone's heads back then was "The most important decision you have to make is how to represent the problem. Do that well and programming will be easy. Get it wrong and there isn't a force on this world that will help you write a good solution in any programming language."

In APL, data representation is of crucial importance, and you see the effects right away. It turned out he was right on that point regardless of the language one chooses to use. The advice is universal.


I also really like this quote, and it has influenced the way I work a lot. When I started working professionally as a programmer I sometimes ended up with quite clunky data structures, a lot of expensive copying (C++ :-)) and difficult maintenance.

But based on this, I always take the greatest care with the data structures. Especially when designing database tables, I keep all the important aspects in mind (normalization/denormalization, complexity of queries on them, ...). It makes writing code so much more pleasurable, and it's also key to making maintenance a non-issue.
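
For example, here is a small hypothetical sketch of the normalization trade-off (table and column names made up), using Python's built-in sqlite3:

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Normalized: customer data lives in one place, orders reference it.
    # Updates stay cheap and consistent; some queries need a join.
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            city TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total_cents INTEGER NOT NULL
        );
    """)
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                     [(1, "Ada", "London"), (2, "Bob", "Paris")])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(1, 1, 1200), (2, 1, 800), (3, 2, 500)])

    # A denormalized alternative would copy name/city into every order row:
    # reads become a single table scan, but a customer moving city means
    # touching every one of their order rows.
    rows = conn.execute("""
        SELECT c.name, SUM(o.total_cents)
        FROM orders o JOIN customers c ON c.id = o.customer_id
        GROUP BY c.id
    """).fetchall()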

Amazing how far-sighted this is, considering that most web apps are basically I/O-bound - i.e. data-bound.


A mentor often repeated the title of Niklaus Wirth’s 1976 book “Algorithms + Data Structures = Programs”.

This encapsulates it for me and informs my coding everyday. If I find myself having a hard time with complexity, I revisit the data structures.


I think part of the confusion stems from the word “computer” itself. Ted Nelson makes the point that the word is an accident of history, arising because main funding came from large military computation projects.

But computers don’t “compute”, they don’t do math. Computers are simplifying, integrating machines that manipulate symbols.

Data (and its relationships) is the essential concept in the term “symbolic manipulator”.

Code (ie a function) is the essential concept in the term “compute”.


But what is math, if not symbolic manipulation? Numbers are symbols that convey specific ideas of data, no? And once you go past algebra, the numbers are almost incidental to the more abstract concepts and symbols.

Not trying to start a flamewar, I just found the distinction you drew interesting.


Well, the question of whether there's more to math than symbolic manipulation or not was of course one of the key foundational questions of computer science, thrashed out in the early 20th century before anyone had actually built a general computing machine. Leibniz dreamt of building a machine to which you could feed all human knowledge and from which you could thus derive automatically the answer to any question you asked, and the question of whether that was possible occupied some of the great minds in logic a hundred years ago: how far can you go with symbolic manipulation alone? Answering that question led to the invention of the lambda calculus, and Turing machines, and much else besides, and famously to Godel's seminal proof which pretty much put the nail in the coffin of Leibniz' dream: the answer is yes, there is more to math than just symbolic manipulation, because purely symbolic systems, purely formal systems, can't even represent basic arithmetic in a way that would allow any question to be answered automatically.

More basically and fundamentally, I'd suggest that no, numbers aren't symbols: numbers are numbers (i.e. they are themselves abstract concepts as you suggest), and symbols are symbols (which are much more concrete, indeed I'd say they exist precisely because we need something concrete in order to talk about the abstract thing we care about). We can use various symbols to represent a given number (say, the character "5" or the word "five" or a roman numeral "V", or five lines drawn in the sand), but the symbols themselves are not the number, nor vice versa.

This all scales up: a tree is an abstract concept; a stream is an abstract concept, a compiler is an abstract concept — and then our business is finding good concrete representations for those abstractions. Choosing the right representations really matters: I've heard it argued that the Romans, while great engineers, were ultimately limited because their maths just wasn't good enough (their know-how was acquired by trial-and-error, basically), and their maths wasn't good enough because the roman system is a pig for doing multiplication and division in; once you have arabic numerals (and having a symbol for zero really helps too BTW!), powerful easy algorithms for multiplication and division arise naturally, and before too long you've invented the calculus, and then you're really cooking with gas...


It involves symbolic manipulation, but it’s more than that. Math is the science of method. Science requires reason.

If one were to say computers do math, they would be saying computers reason. Reason requires free will. Only man can reason; machines cannot reason. (For a full explanation of the relationship between free will and reason, see the book Introduction to Objectivist Epistemology).

Man does math, then creates a machine as a tool to manipulate symbols.


You make some interesting points. There was a time I was intrigued by Objectivism but ultimately it fell flat for me. I sort of had similar ideas before encountering it in the literature, but these days I'm mostly captivated by what I learned from "Sapiens" to be known as inter-subjective reality, which I also mostly arrived at through my own questioning of Objectivism. I'm not sure we can conceive of any objective reality completely divorced from our own perceptive abilities.

> Reason requires free will

isn't it still kind of an open question whether humans have free will, or what free will even is? How can we be sure our own brains are not simply very complex (hah, sorry, oxymoron) machines that don't "reason" so much as react to or interpret series of inputs, and transform, associate and store information?

I find the answer to this question often moves into metaphysical, mystical or straight up religious territory. I'm interested to know some more philosophical approaches to this.


Your comment reminds me of the first line from Peikoff’s Objectivism: The Philosophy of Ayn Rand (OPAR): “Philosophy is not a bauble of the intellect, but a power from which no man can abstain.” There are many intellectual exercises that feel interesting, but do they provide you with the means—the conceptual tools—to live the best life?

If objective reality doesn’t exist, we can’t even have this conversation. How can you reason—that is, use logic—in relation to the non-objective? That would be a contradiction. Sense perception is our means of grasping (not just barely scratching or touching) reality (that which exists). If a man does not accept objective reality, then further discussion is impossible and improper.

Any system which rejects objective reality cannot be the foundation of a good life. It leaves man subject to the whim of an unknown and unknowable world.

For a full validation of free will, I would refer you to Chapter 2 of OPAR. That man has free will is knowable through direct experience. Science has nothing to say about whether you have free will—free will is a priori required for science to be a valid concept. If you don’t have free will, again this entire conversation is moot. What would it mean to make an argument or convince someone? If I give you evidence and reason, I am relying on your faculty of free will to consider my argument and judge it—that is, to decide about it. You might decide on it, you might decide to drift and not consider it, you might even decide to shut your mind to it on purpose. But you do decide.


Last idea, stated up front: sorry for the wall of text that follows!

It's not that I reject the idea of objective reality–far from it. However I do not accept that we can 1) perfectly understand it as individuals, and 2) perfectly communicate any understanding, perfect or otherwise, to other individuals. Intersubjectivity is a dynamical system with an ever-shifting set of equilibria, but it's the only place we can talk about objective reality–we're forever confined to it. I see objective reality as the precursor to subjective reality: matter must exist in order to be arranged into brains that may have differences of opinion, but matter itself cannot form opinions or conjectures.

I'll assume that book or other studies of objectivity lay out the case for some of the statements you make, but as far as I can tell, you are arguing for objectivity from purely subjective stances: "good life", "improper discussion"... and you're relying on the subjective judgement of others regarding your points on objectivity. Of course, I'm working from the assumption that the products of our minds exist purely in the subjective realm... if we were all objective, why would so much disagreement exist? Is it really just terminological? I'm not sure. Maybe.

Some other statements strike me as non-sequiturs or circular reasoning, like "That man has free will is knowable through direct experience". Is this basically "I think, therefore I am?" But how do you know what you think is _what you think_? How do you know those ideas were not implanted via others' thoughts/advertisements/etc, via e.g. cryptomnesia? Or are we really in a simulation? Then it becomes something like "I think what others thought, therefore I am them," which, translated back to your wording, sounds to me something like "that man has a free will modulo others' free will, is knowable through shared experience." What is free will then?

"free will is a priori required for science to be a valid concept" sounds like affirming the consequent, because as far as we know, the best way to "prove" to each other that free will exists is via scientific methods. Following your quote in my previous paragraph, it sounds like you're saying "science validates free will validates science [validates free will... ad infinitum]." "A implies B implies A", which, unless I'm falling prey to a syllogistic fallacy, reduces to "A implies A," (or "B implies B") which sounds tautological, or at least not convincing (to me).

I apologize if my responses are rife with mistakes or misinterpretations of your statements or logical laws, and I'm happy to have them pointed out to me. I think philosophical understanding of reality is a hard problem that I don't think humanity has solved, and again I question whether it's solvable/decidable. I think reality is like the real number line, we can keep splitting atoms and things we find inside them forever and never arrive at a truly basic unit: we'll never get to zero by subdividing unity, and even if we could, we'd have zero–nothing, nada, nihil. I am skeptical of people who think they have it all figured out. Even then, it all comes back to "if a tree falls..." What difference does it make if you know the truth, if nobody will listen? Maybe the truth has been discovered over and over again, but... we are mortal, we die, and eventually, so do even the memories of us or our ideas. But, I don't think people have ever figured it all out, except for maybe the Socratic notion that after much learning, you might know one thing: that you know nothing.

Maybe humanity is doing something as described in God's Debris by Scott Adams: assembling itself into a higher order being, where instead of individual free will or knowledge, there is a shared version? That again sounds like intersubjectivity. All our argumentation is maybe just that being's self doubt, and we'll gain more confidence as time goes on, or it'll experience an epiphany. I still don't think it could arrive at a "true" "truth", but at least it could think [it's "correct"], and therefore be ["correct"]. Insofar as it'll be stuck in a local minimum of doubt with nobody left to provide an annealing stimulus.

I will definitely check out that book though, thanks for the recommendation and for your thoughts. I did not expect this conversation going into a post about Git, ha. In the very very end (I promise we're almost at the end of this post) I love learning more while I'm here!


One problem is that, at least for certain actions, you can measure that motor neurons fire (somewhere in the order of 100ms) before the part of your brain that thinks it makes executive decisions does.

At least for certain actions and situations, the "direct experience" of free will is measurably incorrect.

Doesn't mean free will doesn't exist (or maybe it does), but it's been established that that feeling of "I'm willing these actions to happen" oftentimes happens well after the action has been set into motion already.


Starting at 1:12:35 in this video, there is a discussion of those experiments with an academic neuroscientist. He explains why he believes they do not disprove free will.

https://youtu.be/X6VtwHpZ1BM


Oh, thank you for this :) Because I won't deny, a friend originally came to me with this theory and it has been bugging me :)


There is a lot here. For now, I will simply assert that morality, which means that which helps or harms man’s survival, is objective and knowable.

I’ve enjoyed this discussion. It has been civil beyond what I normally expect from HN. From our limited interaction, I believe you are grappling with these subjects in earnest.

This is a difficult forum to have an extended discussion. If you like, reach out (email is in my profile) and we can discuss the issues further. I’m not a philosopher or expert, but I’d be happy to share what I know and I enjoy the challenge because it helps clarify my own thinking.


Yeah, I expect we're nearing the reply depth limit. Thanks for the thought provoking discussion! Sent you an email. My email should be in my profile, too, if anyone wants to use that method.


In Spanish the preferred name is "ordenador" which would translate to something like "sorter" or "organizer machine".


That's in Spain. In American Spanish computador/a is most often used: http://lema.rae.es/dpd/srv/search?key=computador

There is also informática/computación; both are Spanish words for the same thing, used in Spain and America respectively.

I guess that literally they'd be something like IT and CS.


Good points from both, indeed it's a country thing not a language thing. My bad!


In French, it's the same; it's about "putting things in order", similar in concept to an ordonnateur:

https://en.wikipedia.org/wiki/Ordonnateur


Similarly for French - "ordinateur"

https://www.dictionnaire-academie.fr/article/A9O0665

A search for "computer" does not find anything; though I suspect many French actually use computer not ordinateur.


No we don't. We use ordinateur.


In Finnish, it's an 'information machine'.

To use one is colloquially 'to data'; as in, a verb form of data :)


More accurately, it is called 'ordenador' in Spain. In Latin America, it's 'computadora'.


Hmm, interesting. In Norwegian the word for computer translates to "data machine" (datamaskin)


As in Swedish.


The Swedish name is “dator”, isn’t it? Its root is certainly “data”, but I like it better than the more cumbersome Norwegian word “datamaskin”.


I’ve always thought that dator was just a short form of datamaskin. But some other comments suggested otherwise, so I had to look it up. Apparently, dator is a made-up word from 1968, derived from "data" and paralleling the words tractor and doctor.


Yes, it's "dator". The word was initially proposed based on the same Latin -tor suffix as in e.g. doctor and tractor, so the word would fit just as well into English as it does in Swedish.


And in Danish we had "datamat", which has a nice ring to it. But everybody says "computer" instead.


In Anathem by Neal Stephenson, computers are called Syntactic Devices ("syndev").


Computer science indeed sounds a lot like you're working with computers.

In German the subject is called "Informatik", translating to information science. I find that quite elegant in contrast.


Yes, I've heard it said that calling it computer science is like calling astronomy "telescope science".


It also helps identify journalists who don't know what they are writing about. They frequently translate "computer science" literally as "Computerwissenschaft".


Interestingly, Computer Science is called "Datalogi" in Danish. I always liked that term better.

Coined by Peter Naur (of BNF-"fame"), by the way.


Same in Swedish. Also the Swedish word for computer is dator. Don't know if this in any way shifts the mental perspective though.


Informatika (Інформатика) in Ukrainian. Probably originated from German or French.


Linus was probably exposed to Wirth's book (from 1976) at some point.

I believe it was the first major CS book that emphasised data structures.

https://en.wikipedia.org/wiki/Algorithms_%2B_Data_Structures...


The Mythical Man-Month, published a year before Wirth's book, provides the best-known quote on the subject (though in now-antiquated language):

"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious."

But I don't think Brooks was trying to suggest it was an original idea to him or his team, either. I imagine there were a decent number of people who reached the same conclusion independently.


There's a good bit about data structure-centric programming in The Art Of Unix Programming: http://www.catb.org/~esr/writings/taoup/html/ch01s06.html#id...

(Apologies for linking to esr but it's a good book)


What's wrong with esr?


He's kind of a nutcase. He's a gun rights advocate, to the point where immediately after 9/11 (within a day or two?) he argued that the solution should be for everyone to carry a gun always, especially on airplanes.

And then he accused women in tech groups of trying to "entrap" prominent male open source leaders to falsely accuse them of rape.

And then he claimed that "gays experimented with unfettered promiscuity in the 1970s and got AIDS as a consequence", and that police who treat "suspicious" black people like lethal threats are being rational, not racist.

Basically, he's a racist, bigoted old man who isn't afraid to spout off conspiracy theories because he thinks the world is against him.


At least half of these "nutcase" claims are plainly true. Thanks for the heads up, I'll be looking into this guy.


Which half?


Maybe someone willing to get into a politically fraught internet argument over plainly true things will jump in for me. I'm already put off by the ease and comfort with which HN seems to disparage someone's character for his ideas and beliefs, actions not even entering the picture.


Public utterances are actions which can have consequences. If you're in favor of free speech, buckle up because criticism of public figures is protected speech.

But in this case the "consequence" to esr was somebody apologizing for linking to him. Methinks the parent protests too much


Every action has consequences, it's either profound or meaningless to point this out. I see it used as a reason to limit speech because this speech that I disagree with is insidious and sinister. Rarely is any direct link provided between this sinister speech and any action that couldn't be better described as being entirely the responsibility of the actor.


Indeed, I point out that actions have consequences because it's a common trope that "free speech" implies a lack of consequence.

> I see it used as a reason to limit speech because this speech that I disagree with is insidious and sinister.

Limiting speech is a very nuanced issue, and there's a lot of common misconceptions surrounding it. For a counterexample, if you're wont to racist diatribes, that can make many folks in your presence uncomfortable; if you do it at work or you do it publicly enough that your coworkers find out about it, that can create a toxic work environment and you might quickly find yourself unemployed. In this case, your right to espouse those viewpoints has not been infringed -- you can still say that stuff, but nobody is obliged to provide audience.

And as a person's publicity increases, so do the ramifications for bad behavior -- as it should. Should esr be banned from the internet by court order? Probably not. Does any and every privately owned platform have the right to ban him or/and anybody who dis/agrees with him? Absolutely: nobody's right to free speech has been infringed by federal or state governments. And that's the only "free speech" right we have.


The reason free speech is called free is that it is supposed to be free of suppression and negative consequence where that speech does not infringe on the interests of others. That it is currently protected only against interference from government does not make this narrower version of free speech the ideal that its supporters (myself included) hold.

> Should esr be banned from the internet by court order? Probably not.

Where's the uncertainty in this?

> Does any and every privately owned platform have the right to ban him or/and anybody who dis/agrees with him?

Those that profess to being a platform and not a publisher should not be able to ban him, nor anybody else, for their views, whether or not those views are expounded via their platform. That's why they get legal protections not afforded to others. Do you think the phone company should be able to cut you off for conversations you have on their system?


[flagged]


> I just explicitly affirmed at least two of four "racist, misogynistic, bigoted" statements of fact.

Well, that's how you're characterizing your actions, okay. But just so you know. Your employer is free to retain you, or fire you, on the basis of opinions that you express in public or private. Wicked tyranny, that freedom of association.

> Presumably now you'd like to...

Well, that's certainly a chain of assumptions you've made. Why would you, say, "respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize" when you're out in public? Oh right, that's a quote from HN guidelines. In any case, you're not changing minds by acting this way.

> because this is how you tyrants prefer we genuflect to avoid guilt by association.

Oh, no, the tyranny of public criticism! Hey did you know something? You're free to disagree with me. And criticize me. In public! And others are free to agree with me, or you, or even both of us, even if that makes zero sense!

> This is profoundly idiotic, but I again refrain from arguing because my audience has proven itself very unthinking and vicious

A personal attack, how droll.

> I hope you're not an American,

I am! And as an American I've got the freedom of association -- that means that I'm not legally obligated to verbally support or denounce anybody; nor is it unlawful for me to verbally support or denounce anybody! Funny thing about freedoms; we've all got 'em and it doesn't mean we need to agree on a damned thing.

> because you don't understand what "free speech" is or why we have it,

Well you're wrong there, but IANAL so here's first amendment attorney, Ken White.

>> Public utterances are actions which can have consequences

https://www.popehat.com/2013/09/10/speech-and-consequences/

>> buckle up because criticism of public figures is protected speech.

https://www.popehat.com/2012/07/31/the-right-not-to-be-criti...

> my friend

That's taking things too far. No thank you.


Next time you're in New York we'll get some boba, on me. I'm friends with everybody.


He's become somewhat controversial due to his worldview and political writings, which include climate change denial


And this has nothing to do with programming.

Imagine Einstein alive and denying climate change. Would you apologize every time when you are referring to the theory of relativity?

P.S. Sorry if you don't actually agree with the apologising comment yourself and were just pointing out possible reasons for it.


Sorry if you're getting downvoted a lot. We as a group need to start learning a little subtlety when it comes to condemning all of a person's contributions because we don't like their opinions or their actions. We are smart enough that we should be able to condemn ESR's idiotic words and actions and still praise his extremely important contribution to technology.


Absolutely agree


I don't know, does it have to be a hard and fast rule?

Sometimes I quote HP Lovecraft and sometimes I feel like apologizing for his being racist (and somewhat stronger than just being a product of his times). But most of the time, also not. But it does usually cross my mind and I think that's okay and important. In a very real "kill your idols" way. Nobody's perfect.

And that's just for being a bigot in the early 20th century, which, as far as I know, is of no consequence today.

However, if Einstein were alive and actively denying climate change today, I would probably add a (btw fuck einstein) to every mention of his theories. But that's just because climate change is a serious problem that would kill billions if we actually listened to the deniers and took them seriously. This hypothetical Einstein, being a public figure, probably even considered an authority by many, would in fact be doing considerable damage spouting such theories in public. And that would piss me off.

What I mean to say is, you don't have to, but it's also not wrong to occasionally point out that even the greatest minds have flaws.

Also, a very different reason to do it, is that some people with both questionable ideas and valuable insights, tend to mix their insightful writings with the occasional remark or controversial poke. In that case, it can be good to head off sidetracking the discussion, and making it clear you realize the controversial opinions, but want to talk specifically about the more valuable insights.

And this IS in fact important to keep in mind both, even if you think it is irrelevant. Because occasionally it turns out, for instance, through the value of a good deep discussion, that the valuable insights in fact fall apart as you take apart the controversial parts. Much of the time it's just unrelated, but you wouldn't want to overlook it if it doesn't.


I disagree.


The theory of relativity is a much bigger contribution to society than TAOUP.

The chapter I linked to was just a summary of ideas put forth by others - though admittedly written well.

My problem with esr is more his arrogance and conceit than politics (which I also find distasteful)


I'd say they are incomparable, but I hope it helped to get my point across :)

I've read and liked his book, btw, but I had to ignore all his stupid Windows-bashing, where he attributes every bad practice to the Windows world and every good one to the Unix world.


This is a good review of the book by Joel Spolsky which also touches on that point:

https://www.joelonsoftware.com/2003/12/14/biculturalism/


Referring to relativity and linking to Einstein's personal web page are surely two different things, no?


Yes, but I don't think this invalidates my analogy


Right, the book stands on its own. Thoughts on the author are irrelevant on the context of the work.


He's kind of crusty about climate change, but other than that he's just a guy with some strong opinions. I guess that scares some folks enough to require an apology.


Telling how this very reasonable, “maybe things aren’t completely black and white” comment got downvoted.


Not saying I agree with either sentiment, but there's a delicious irony in this comment in that you're reading into votes as if they're pure expressions of support or not for an issue that's not black and white... Even though the expressions are just projections of a spectrum of thoughts through a binary voting system!


Ah yes, the old insight. Fred Brooks: "Show me your [code] and conceal your [data structures], and I shall continue to be mystified. Show me your [data structures], and I won't usually need your [code]; it'll be obvious."


Yes, and here's one by Rob Pike, "Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming." --- https://www.lysator.liu.se/c/pikestyle.html

I think I found all these quotes on SQLite's website, https://www.sqlite.org/appfileformat.html


> I'm a huge proponent of designing your code around the data

That same comment was made to a class I was in by a University Professor, only he didn't word it like that. He was discussing design methodologies and tools - I guess things like UML - and his comment was that he "preferred Jackson, because it revolved around the data structures, and they changed less than the external requirements". (No, I have no idea what Jackson is either.)

Over the years I have come to appreciate the core truth in that statement - data structures do indeed evolve slower than APIs - far slower, in fact. I have no doubt the key to git's success was that after years of experience dealing with VCS systems Linus hated, he had an epiphany and came up with a fast and efficient data structure that captured the exact things he cared about, but left him the freedom to change the things that didn't matter (like how to store the diffs). Meanwhile others (hg, I'm looking at you) focused on the use cases and "API" (the command line interface in this case). The end result is git had a bad API, but you could not truly fuck it up because the underlying data structure did a wonderful job of representing a change history. Turns out hg's API wasn't perfect after all, and it's found adapting difficult. Git has had hack upon hack tacked onto the side of its UI, but the data structure still shines through as strong and as simple as ever.

Data structures evolving much more slowly than APIs does indeed give them the big advantage of being a rock-solid base for future design decisions. However, they also have a big downside - if you decide the data structure is wrong, changing it changes everything, including the APIs. Tacking on a new function API, on the other hand, is drop-dead easy, and usually backwards compatible. Linus's git was wildly successful only because he did something remarkably rare - he got it right on the first attempt.
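
As a rough illustration of why that underlying model is so resilient, here is a toy content-addressable store in the spirit of git's object model (a sketch of the idea only, not git's actual object or on-disk format):

    import hashlib, json

    # Toy content-addressable store: blobs, trees and commits addressed
    # by the hash of their content.
    objects = {}

    def put(kind, payload):
        data = json.dumps({"kind": kind, "payload": payload}, sort_keys=True)
        oid = hashlib.sha1(data.encode()).hexdigest()
        objects[oid] = data
        return oid

    # A file's content is a blob; a directory is a tree mapping names to
    # object ids; a commit points at one tree plus its parent commits.
    blob = put("blob", "int main(void) { return 0; }\n")
    tree = put("tree", {"main.c": blob})
    root = put("commit", {"tree": tree, "parents": [], "msg": "initial"})

    blob2 = put("blob", "int main(void) { return 1; }\n")
    tree2 = put("tree", {"main.c": blob2})
    head = put("commit", {"tree": tree2, "parents": [root], "msg": "tweak"})

    # History is just a walk over immutable objects; any front end
    # (porcelain) can be layered on top without touching this model.
    def log(oid):
        while oid:
            commit = json.loads(objects[oid])["payload"]
            print(oid[:7], commit["msg"])
            oid = commit["parents"][0] if commit["parents"] else None

    log(head)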



My memory is a little fuzzy, but I think Jackson was/is an XML serializer/deserializer that operates on POJOs (potentially with annotations). You define your data structures as Java objects, and Jackson converts them to XML for you. As opposed to other approaches where you define some schema in non-Java (maybe an XSD) and have your classes auto-generated for you.


"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." This quote is from Alan Perlis' Epigrams on Programming (1982).


Which is also a base design principle of Clojure. There are few persistent data structures at the core, a sequence abstraction and lots of functions to work on them.
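
A rough Python analogue of that principle (the data here is hypothetical): keep everything as plain maps and sequences, and let one small set of generic functions work on all of them.

    # Plain data: every "entity" is just a dict, every collection a list.
    users  = [{"id": 1, "name": "ada", "active": True},
              {"id": 2, "name": "bob", "active": False}]
    orders = [{"id": 7, "user_id": 1, "total": 250},
              {"id": 8, "user_id": 2, "total": 120}]

    # Generic functions that work on any such sequence of dicts.
    def where(rows, **kv):
        return [r for r in rows if all(r[k] == v for k, v in kv.items())]

    def pluck(rows, key):
        return [r[key] for r in rows]

    def index_by(rows, key):
        return {r[key]: r for r in rows}

    # The same small vocabulary covers users, orders, and whatever comes
    # next, instead of a bespoke class and accessor API per concept.
    active_names = pluck(where(users, active=True), "name")
    orders_by_id = index_by(orders, "id")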


So you would have one data structure with 10 pointers to those 10 data structures you need and 10 times the functions?

I'd rather split up independent structures.


Having a smaller number of data structures makes the whole graph of code more composable. Creating a bespoke data structure for 10 different elements of a problem means writing quite a lot of code just to orchestrate each individual structure, mostly due to creating custom APIs for accessing what is simple data underneath the hood.

There’s a reason why equivalent Clojure code is much much shorter than comparable programs in other languages.


> "I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

It should be noted that the basic premise of Domain-Driven Design is that the basis of any software project is the data structure that models the problem domain, and thus the architecture of any software project starts by identifying that data structure. Once the data structure is identified, the remaining work consists of implementing operations to transform and/or CRUD that data structure.


DDD is about modeling, which covers both data and behaviour.


> DDD is about modeling, which covers both data and behaviour.

It really isn't. DDD is all about the domain model, not only how to synthesize the data structure that represents the problem domain (gather info from domain experts) but also how to design applications around it.


I remember that Richard Hipp (SQLite's creator) once cited a bunch of similar quotes, including the one from Linus.

https://www.percona.com/sites/default/files/hipp%20sqlite%20...

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious." -- Fred Brooks, The Mythical Man-Month, pp. 102-103


>> Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

Tsk. Now I'll never know if I'm a good programmer. I do all my programming in Prolog and, in Prolog, data is code and code is data.


"The more I code the more I observe getting a system right is all about getting the data structures right. And unfortunately that means I spend a lot of time reworking data structures without altering (or improving) functionality..." https://devlog.at/d/dYGHXwDinpu


These are ideas echoed by some of the top people in the game development community as well. There is a nice book about these kinds of ideas:

https://www.amazon.com/dp/1916478700


Can anyone point to some good resources that teach how to code around data and not the other way round?


The Art of Unix Programming, by Eric Raymond, particularly chapter 9: http://www.catb.org/~esr/writings/taoup/html/generationchapt...


Switch to a language that emphasises functional programming (F#, Clojure, OCaml, etc) and it will happen naturally.


Surprisingly, there is no one book AFAIK, but the techniques are spread across many books:

The first thing is to understand FSMs and state transition tables using a simple two-dimensional array. Implementing an FSM using while/if-then-else code vs. transition-table dispatch will really drive home the idea behind data-driven programming. There is a nice explanation in Expert C Programming: Deep C Secrets.
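
For example, a minimal Python sketch of the same toy FSM written both ways (the turnstile states and inputs are made up for illustration):

    # Toy FSM: a turnstile with states "locked"/"unlocked" and
    # inputs "coin"/"push". First the if/then/else version...
    def step_ifelse(state, event):
        if state == "locked":
            if event == "coin":
                return "unlocked"
            return "locked"
        else:  # unlocked
            if event == "push":
                return "locked"
            return "unlocked"

    # ...and the data-driven version: the machine's behaviour lives in a
    # table, and the "code" shrinks to a single lookup.
    TRANSITIONS = {
        ("locked",   "coin"): "unlocked",
        ("locked",   "push"): "locked",
        ("unlocked", "coin"): "unlocked",
        ("unlocked", "push"): "locked",
    }

    def step_table(state, event):
        return TRANSITIONS[(state, event)]

    state = "locked"
    for e in ["coin", "push", "push"]:
        state = step_table(state, e)
    print(state)  # -> locked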

SICP has a detailed chapter on data-driven programming.

An old text by Standish; Data Structure Techniques.

Also, I remember seeing a lot of neat table-based, data-driven code in old data processing books using COBOL. Unfortunately I can't remember their names now. Just browse some of the old COBOL books in the library.


Jonathan Blow is also talking about data-oriented programming as a basis for designing his Jai programming language.


Data-oriented programming in games is a completely different concept, though. It’s about designing your data so it can be operated on very efficiently by modern computer hardware. It speeds up a lot of the number crunching that goes on in extremely fast game loops.

The Linus comment is about designing your programs around a data representation that efficiently models your given problem.
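
To illustrate the difference in layout, here is a rough Python sketch (in Python the cache effects don't really show, but the shape of the data does); the particle fields are made up:

    from array import array

    # Array-of-structures: each entity is an object, fields interleaved
    # in memory. Natural for modelling, poor for bulk number crunching.
    class Particle:
        def __init__(self, x, y, vx, vy):
            self.x, self.y, self.vx, self.vy = x, y, vx, vy

    particles = [Particle(float(i), 0.0, 1.0, 0.5) for i in range(1000)]

    # Structure-of-arrays: one tightly packed array per field, so a pass
    # that only needs positions and velocities streams through contiguous
    # memory (the layout game engines favour for hot loops).
    xs  = array("d", (float(i) for i in range(1000)))
    ys  = array("d", (0.0 for _ in range(1000)))
    vxs = array("d", (1.0 for _ in range(1000)))
    vys = array("d", (0.5 for _ in range(1000)))

    def integrate(xs, ys, vxs, vys, dt):
        for i in range(len(xs)):
            xs[i] += vxs[i] * dt
            ys[i] += vys[i] * dt

    integrate(xs, ys, vxs, vys, 1.0 / 60.0)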


If I could say just one thing about programming to my kids, I would quote this.



