"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)
Guy Steele's extension of that idea sounds contentious, though: that OO encourages "data-first" design because the code itself is what gets encapsulated:
"Smart data structures and dumb code works a lot better than
the other way around."
This is especially true for object-oriented languages,
where data structures can be smart by virtue of the fact
that they can encapsulate the relevant snippets of "dumb
code." Big classes with little methods–that's the way to go!
Or maybe he is just encouraging OO programmers to think more in this vein?
The tricky part is that smartness of data structures is context-sensitive.
One of the most common design errors in OO systems seems to be building systems that beautifully encapsulate a single object's state… and then finding that the access patterns you actually need involve multiple objects, and that it's impossible to implement those algorithms efficiently because the underlying data points are all isolated within separate objects, and often not stored in a cache-friendly way either.
Another common design problem seems to be sticking with a single representation of important data even though it's not a good structure for all of the required access patterns. I'm surprised by how often it does make sense to invest a bit of run time converting even moderately large volumes of data into some alternative or augmented structure, if doing so sets up a more efficient algorithm for the expensive part of whatever you need to do. Again, though, it can be difficult to employ such techniques if all your data is hidden away inside generic containers of single objects, and if the primary tools you have for building your algorithms are generic algorithms over those containers plus methods on each object that operate on its own data in isolation.
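As a rough illustration of that "alternative or augmented structure" idea, here is a minimal C++ sketch with invented names (not taken from any particular codebase): spend O(n) up front building an index so the expensive phase isn't forced through the original layout.

    // Hypothetical sketch: records arrive as a flat vector, but the expensive
    // phase does many lookups by id. Spending O(n) up front on an index turns
    // each lookup from a linear scan into an average O(1) hash probe.
    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    struct Record {
        std::uint64_t id;
        double amount;
    };

    // Augmented structure: an index from id to position in the original vector.
    std::unordered_map<std::uint64_t, std::size_t>
    build_index(const std::vector<Record>& records) {
        std::unordered_map<std::uint64_t, std::size_t> index;
        index.reserve(records.size());
        for (std::size_t i = 0; i < records.size(); ++i)
            index.emplace(records[i].id, i);
        return index;
    }

    // The expensive part of the job now runs against the structure that suits it.
    double total_for(const std::vector<Record>& records,
                     const std::unordered_map<std::uint64_t, std::size_t>& index,
                     const std::vector<std::uint64_t>& wanted) {
        double total = 0.0;
        for (std::uint64_t id : wanted) {
            auto it = index.find(id);
            if (it != index.end()) total += records[it->second].amount;
        }
        return total;
    }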
The more programming experience I gain, the less frequently I seem to find single objects the appropriate granularity for data hiding.
The exercise of compartmentalising data into atomic islands of objects that dutifully encapsulate it becomes difficult during reassembly, simply because we recreate the need for a declarative style of accessing data in an imperative (OO) world. It's ye olde object-relational impedance mismatch.
A (relational) data model is a single unit. It has to be seen this way. Creating imperative sub-structures (like encapsulating data into objects) breaks this paradigm with serious consequences when attempting to rejig the object-covered data into an on-demand style architecture. The whole model (database?) must be seen as a single design construct and all operations against the entire model must be sensitive to this notion - even if we access one table at a time. Yes, at specific times we may be interested in the contents of a single table or a few tables joined together declaratively for a particular use case, but the entire data model is a single atomic structure "at rest".
When paradigmatic lines like this are drawn, I side with the world-view that getting the data model "right" first is the way to go.
Fred Brooks and Linus Torvalds speak from experience in the trenches.
This also comes up in relational databases. There might be a nice, canonical way to represent what the data really is (i.e. what it represents), but then the access patterns for how it is used mean that a different representation is better (usually, somewhat de-normalized). Fortunately, relational algebra enables this (one of Codd's main motivations).
A programming language is even more about data processing than a database is. But it still seems that data structures/objects represent something. I recently came up with a good way to resolve what that is:
In a data processing (i.e. programming) language what you are
modelling/representing is not entities in the world, but computation.
Therefore, choose data structures that model your data processing.
This definition allows for the messy, denormalized-like data structures you get when you optimize for performance. It also accounts for elegant algebra-like operators that can be easily composed to model different computations (like the +, . and * of regular expressions).
Have you noticed yet that programming ideologies go around in circles? Programmers may currently be defending data-first (/data-oriented/...) programming, but it isn't the first time they have done so.
The way I experienced it:
micro-services/small tools "that do one thing well"/... (pro: reasonably generic, con: loads of tools, requires expert users, if one tool out of 30 is not working well things break)
data-first/data-oriented programming (really easy to manipulate data, very, VERY hard to maintain consistency)
database-oriented programming (enforces consistency; otherwise data-oriented. Works well: where data-oriented programming would have let your data go quietly inconsistent, in this paradigm you get errors instead. Needless to say, "every operation errors out" is better than "our data suddenly became crap", but it still blocks entire departments/systems unpredictably)
event-driven programming (really easy to make button X do Y) (to some extent built into data-base oriented programming, also available separately. Works well, but gets extremely confusing when programs get larger)
object-oriented programming (massively simplifies the "I have 200 different message types and forgot how they interact" problems of event-driven programming, also provides the consistency of database-oriented programming)
<web browsers start here>
event-driven programming with UI designers (makes event-driven programming and later object oriented event-driven programming accessible for idiots)
declarative object-oriented programming / aspect-oriented programming / J2EE and cousins / "Generating forms from data" / Django
micro-services/"do one thing well" (same disadvantages as during the 80s)
data-first/data-oriented programming (same disadvantages as end-of-80s) * you are here *
How much do you want to bet that servers that enforce data consistency and store it are next ?
So, so much truth in this. I've also seen batch processing with terminal IDs move to real-time programming and operating systems, and then move back to browsers and session IDs (i.e. HTTP GET/PUT are the new punch cards).
Algorithms + Data Structures = Programs by Niklaus Wirth - does no one read this book anymore? It takes both, and both must be considered throughout the coding process. It's all math / computing is a matter of encoding and logic. 'A' = 41 hex = 01000001 binary. An array of 3 x 3 is also an array of 1 x 9. - A L Turing
Both are good places to start, and both should be given serious consideration early in the project.
Code is usually the worst place to start. If you do need code early on, I think it's fine to write it as quickly as possible, even ignoring edge cases and simplifying algorithms.
If you have good data structure design (and understand the invariants) and you understand the problem space well from a functional standpoint (including functional edge cases), you can get the code into shape later. The bugs you encounter will be mostly the trivial kind.
But if you misunderstand the user experience, or the data structures are a mess, then it's difficult to recover.
That's one of the reasons SQL databases are so great: they help you get the data design up and going quickly, and will enforce a lot of your invariants (through schema normalization, PK/FKs, and CHECK constraints). If anything goes wrong, rollback returns you to a known-good state.
I don't know anything equivalent at the user experience level, though.
This is one reason why prototyping is often so necessary. When you start from the user experience, you usually end up working your way back from the front-end: first you visualize what the user should see, then you mock up screens to give them that, then you figure out what they should be able to interact with and how, then you write code to make that happen, then you define data structures that make the code possible. When you start from data, you identify what data the user will need to manipulate, then you identify relationships between that data, then you write code to manipulate it, then you slap a UI in front, one which usually mimics the relationships you identified a priori.
There's a bit of an impedance mismatch here. Start from the UX, and you end up with a bunch of ad-hoc data structures that are very difficult to rationalize and inefficient to access. Start from the data, and you end up with a UI that mimics the data you were given and not how the user thinks about achieving their task.
The solution is to write a quick & dirty prototype focused on UX, nailing only the happy path that's core to the user experience. Then take careful note of the data structures you ended up with, and throw away the prototype. Then start with a carefully planned data architecture that captures everything you learned in the prototyping phase, but eliminates the redundancies and awkward access paths you wrote into the quick & dirty prototype.
The better solution is to start by designing the API, and to leave the design of the data model (to support the API) and myriad of user interfaces (as clients of the API) for later.
You don't know what should be in the API until you've had a couple clients built on top of it. Users will surprise you; oftentimes that little UI detail (the one that's really hard to fit into your API) is what will make them buy.
A similar argument also applies in the opposite direction. That is, it’s all very well beginning with a beautiful, idealised API, but if there is no efficient way to implement that interface then you’ve backed yourself into a corner before you even start.
Also, UIs that don't have to support actual data manipulation tend to underestimate the edge cases of that manipulation - which I tend to think is why you never get too far from the shell on Unix-likes.
And why so many database-backed webapps look like a frontend over a database, and virtually all mobile messaging apps are frontends over TCP/IP, and web directories before Google were just webpages with a whole lot of links.
It turns out it can be pretty profitable to break with the machine model of the underlying data and instead present things the way your users want to see them, even if it makes the code necessary look complicated and gross. The danger with building the data model or API first is that you'll build the easy, obvious API, which is the same easy, obvious data model & API that all your competitors build. Start with the user instead and you can end up with some really powerful differentiators, at the cost of it being a bitch to transform into working, sane computer code.
> It turns out it can be pretty profitable to break with the machine model of the underlying data and instead present things the way your users want to see them, even if it makes the code necessary look complicated and gross.
It certainly can, though I think you’re still figuring out a data model first on those projects. It’s just that you’re starting with how the information is organised and manipulated from the user’s point of view, then figuring out an internal representation to support it afterwards, rather than the other way around.
Experience (with this style of development) helps out here. For stuff you're completely clueless on though, it still happens. And that's when you revise your design.
It's far better to go in this direction than the other, in terms of code clarity, and it's pretty rare for it to impact performance... at all, as long as you're willing to go back and rewrite the ideal code to be a bit more practical when you notice an issue.
Which API? There are many different APIs within one application. You can't really start at one of them; you first need to decide which parts of your program need APIs between them.
The data constraints in an application will inform the user experience. Until you know what your application has to do with data any UX design will be guesswork.
I agree that the user experience development should come before any code, but it can't come before designing the data structures.
> Or, start from the user experience.
> Both are good places to start
Reminds me of Merise - a modeling method where the data model and the process model co-evolve in parallel. After many years I still have a fondness for it - but maybe that is just the French 80's kid in me...
Ah yes, takes me back to the time when 'Data Analyst' was an actual profession, and organizations planned their workloads in order to place an appropriate order for 'enough' IBM hardware and DASD. Back then, during the first wave of business automation via computer, the processes being automated were so well defined that this level of understanding was achievable; when you have rooms full of clerks calculating monthly payroll, you KNOW what the data is.
In so many domains today, you don't know, and the business is forgiving enough (or amateur enough?) that THEY don't even fully understand their processes.
Everyone has bashed Cobol since the dawn of microcomputing, but the definition of a "Data Division" section, where all data is specified before literally any actual line of code, is something very interesting.
This aspect of the language is largely what made the Y2K transition so successful, and today these Fortune 500 companies are still stuck with their legacy systems, which were written by people "with a clue" back then.
Cobol is interesting, but horrifically impractical in practice. Cobol is my bugbear language... I just couldn't solve problems in it. It somehow separates what should go together and puts together what is unrelated.
> the business is forgiving enough (or amateur enough?) that THEY don't even fully understand their processes.
After the first few waves of automation, the business made the savings to justify automation by sacking everyone who understood the processes, all the RPG programmers who moved the processes into code have retired, and now we're getting by on hope and hacks.
Or agile enough. Most of the time, when business processes are a mess, it comes from learning from their customers and being relentlessly customer-focused.
Code samples or GTFO. No seriously: an article that brings up messy real-world OO code and then handwaves it away with "just use FUNCTIONS, yo!" and some box-and-arrow diagrams is useless and borderline unhelpful. Something like:
is, IMO, more valuable in the discussion. It points out the tradeoffs explicitly: pure functions mean (in this case) immutable data, which means more-complex data structures - which means the OPPOSITE of the micro-optimization stuff this article closes with.
It's also worth noting that if you're fussing about the cache efficiency of code you haven't written yet, STOP. JUST STOP. Write the code and MEASURE IT.
There are some things worth doing the napkin math for before even bothering to write the code. Cache efficiency is one of them. A design mistake discovered long after you've written a few thousand lines of heavily depended-on code is rather difficult to tear out once you realize the design was wrong to begin with.
Since it's mostly game engine programmers presently driving this design philosophy imagine a scene in a game. There are some number of entities in the scene. Is that number infinite? No. Do each of those entities exist in their own bubble? No. You want to process all of the entities that need updates all at once. One transformation. You also know that from frame to frame there may be 0 - N entities in the scene but you don't want to be constantly allocating, deallocating, and resizing data structures. So you sketch out a reasonable number of entities to support in a scene. You know the data each entity needs to track to participate in the scene and you figure the best way to pack all of that into a struct of arrays so that processing that update is fast and doesn't waste a byte of cache.
You haven't even written a line of code yet.
There is a give and take. You have to test out your assumptions and not be afraid to throw away bad code and start again. Once you start down a path you may find that the statistically significant number of accesses for entity data happens for the members related to the physics update. You might choose a different structure to pack that data into apart from the cold data that doesn't change much.
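Concretely, here is a minimal C++ sketch of that layout (names and capacities made up): a struct of arrays sized for a budgeted number of entities, with the per-frame "hot" fields split from the rarely touched "cold" ones, and one pass over the hot arrays to do the update.

    #include <cstddef>
    #include <vector>

    constexpr std::size_t kMaxEntities = 4096;   // budgeted up front, not unbounded

    struct Vec3 { float x, y, z; };

    // Struct of arrays: hot data the per-frame update touches, cold data kept
    // out of the hot loop's cache footprint.
    struct Scene {
        std::size_t count = 0;
        std::vector<Vec3> position;   // hot
        std::vector<Vec3> velocity;   // hot
        std::vector<int>  material;   // cold: rarely read, never in the hot loop
    };

    Scene make_scene() {
        Scene s;
        s.position.resize(kMaxEntities);   // allocate once, up front
        s.velocity.resize(kMaxEntities);
        s.material.resize(kMaxEntities);
        return s;
    }

    // One transformation over every entity that needs the update.
    void integrate(Scene& s, float dt) {
        for (std::size_t i = 0; i < s.count; ++i) {
            s.position[i].x += s.velocity[i].x * dt;
            s.position[i].y += s.velocity[i].y * dt;
            s.position[i].z += s.velocity[i].z * dt;
        }
    }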
Much of data-oriented design is like this: save the modelling for documentation and diagrams and things that humans understand. Write the code for the machine since the machine is the platform. You lose mechanical sympathy when you start designing data structures based on intuition.
It doesn't cost you anything to think about these things. It's just engineering.
I don't think the article was advocating pure functional programming. More of a data-oriented/array-programming style, which has been gaining traction in game programming. There are similarities, but more differences than similarities.
I recommend you watch the Mike Acton talk he linked at the end (even if you aren't a C or C++ developer).
P.s. And yes, I agree you should measure (obviously), however that is only one half of the equation. You make assumptions about the performance issues and verify them by measuring. Measurement on its own is very poor signal-to-noise, but your assumptions can be wrong. Both are needed to optimize code well.
If what you said were true, it wouldn't be possible to design for efficient cache access. And it is.
Huh? I read the article as being about simple code that is easier to understand and maintain because it's not semantically overloaded with abstractions, and picking data structures that fit the need of your process, instead of some patterns because Herb said so or something. Usually you don't need code samples for that because it's such a common bad thing.
AFAIK (I'm no Haskell guru), Haskell doesn't let you specify memory layout explicitly. Without that you won't see the massive gains that come from data-oriented design.
No, the lazy evaluation in Haskell would make it difficult to reason about the runtime performance in his case.
No, the lazy evaluation in Haskell would make it difficult to reason about the runtime performance in his case. The conclusion is to use tightly packed data structures to take advantage of the cache locality of modern CPUs when there are high performance requirements. That can be done in most languages.
It can be done less easily in Haskell, because even if arrays are available (albeit as second-class citizens) and hopefully well packed, the actual structure and memory layout of their elements is an implementation detail (and, in practice, it tends to be unusually pointer-rich because it has to support lazy evaluation).
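For a rough C++ illustration of the layout difference being described (made-up types, not from the article): both functions compute the same thing, but the second is closer to what a pointer-rich, boxed representation looks like in memory.

    #include <memory>
    #include <vector>

    struct Sample { float value; float weight; };

    // Packed: one contiguous block, sequential access, predictable prefetching.
    float sum_packed(const std::vector<Sample>& samples) {
        float total = 0.0f;
        for (const Sample& s : samples) total += s.value * s.weight;
        return total;
    }

    // Boxed: same arithmetic, but every element sits behind its own pointer
    // and may live anywhere on the heap.
    float sum_boxed(const std::vector<std::unique_ptr<Sample>>& samples) {
        float total = 0.0f;
        for (const auto& p : samples) total += p->value * p->weight;
        return total;
    }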
That's interesting, we don't generally consider relativistic effects on computation.
There could be an interesting synthesis between Minkowski spacetime, lambda calculus, and Lamport clocks that will either a) show that the fundamental unifying element in the universe is functions (and really annoy at least one side in the disagreements over the Black Hole Information Paradox), or b) make for a terrific Greg Egan novel.
My first programming (Fortran!) teacher listed input, storage, decisions, and output as the four concepts needed. Later I learned about Turing machines, same deal.
Point to where the decisions are in SKI calculus. SKI calculus is as strong as Turing machines and lambda calculus.
(If you squint right, you can see the decision making bit. But it's a tight squint.)
--- (from Wikipedia:)
The evaluation operation is defined as follows:
(x, y, and z represent expressions made from the functions S, K, and I, and set values):
I returns its argument:
Ix = x
K, when applied to any argument x, yields a one-argument constant function Kx , which, when applied to any argument, returns x:
Kxy = x
S is a substitution operator. It takes three arguments and then returns the first argument applied to the third, which is then applied to the result of the second argument applied to the third. More clearly:
Sxyz = xz(yz)
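For what it's worth, here is where the decision-making hides, using the standard encoding of booleans in SKI (not part of the quoted excerpt): take T = K and F = SK. Then
Txy = Kxy = x
Fxy = SKxy = Ky(xy) = y
so applying a "boolean" to two branches, bxy, is the decision: it returns one branch and discards the other.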
Does the conclusion boil down to "use functional programming"? The author doesn't specify the way data or functions are encapsulated in the final figure. But my gut reaction is that this describes FP better than OO or any other paradigm.
Not really, it's basically 'composition over inheritance', and laying out data structures in a cache-friendly way, and only then build the functions that work on those data structures (the common term is 'data-oriented design').
It's the mantra that game developers have been preaching for at least the last 5..10 years and which slowly makes its way into the general programming world, because the CPU/memory performance gap is getting wider, not narrower. The bitsquid engine blog (now: Autodesk Stingray engine) has a lot of really great articles on that topic (for instance http://bitsquid.blogspot.de/2014/08/building-data-oriented-e...).
No. Because the data structures he is referring to need not be immutable. Think "C struct". Structs in C are more closely aligned with components and lend themselves to composition much easier than a tower of inheritance. If you ever try Unity after bashing your head in with C++, you'll know what a godsend components are.
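A minimal sketch of what that looks like (invented names; Unity's actual component API is much richer than this): entities are just ids, behaviours are plain structs stored in flat containers, and "what an entity is" becomes a question of which components it currently has.

    #include <cstdint>
    #include <unordered_map>

    using EntityId = std::uint32_t;

    struct Transform { float x = 0, y = 0, z = 0; };
    struct Health    { int hp = 100; };
    struct Engine    { float rpm = 0; };

    struct World {
        // A real engine would use packed arrays here; maps keep the sketch short.
        std::unordered_map<EntityId, Transform> transforms;
        std::unordered_map<EntityId, Health>    healths;
        std::unordered_map<EntityId, Engine>    engines;
    };

    // A free function over the data it needs, not a method on a deep hierarchy.
    void rev_engines(World& w, float amount) {
        for (auto& entry : w.engines) entry.second.rpm += amount;
    }

    int main() {
        World w;
        EntityId player = 1;
        w.transforms[player] = {};
        w.healths[player]    = {};
        w.engines[player]    = {};   // change what the player is by adding a component
        rev_engines(w, 500.0f);
    }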
I admire the combination of theoretical audacity and sweeping pragmatism taken by the authors of that paper.
They mention Oz a few times. I am big fan of "Concepts, Techniques, and Models of Computer Programming" by Van Roy and Haridi, which shows the reader how different computing paradigms can be constructed from a kernel language in Oz. That makes the text something of a kindred spirit to the paper, insofar as it tries to aid the student in grasping many different paradigms of computation (functional, object-oriented, concurrent, and relational are covered).
The Haskell using folks at Standard Chartered Bank have been implementing some of the "Out of the Tarpit" ideas for a few years now. I can confirm, functional relational programming does work really well in practice.
It's interesting that he's advocating this on games.
I've been annoyed at most OOP codebases, but conceded that games and GUIs were the canonical applications of OOP. Perhaps you could say that highly stateful apps running on a single machine can use state encapsulated in objects.
In contrast, for web UIs and their "big data" back ends, I believe you also want to use a data-oriented approach, rather than a code-oriented approach (or at least this is the style that my code has converged upon). You don't need to wrap everything up in objects all the time when you're really just passing strings through the network and doing light processing. And when you're not even using a type-safe language to begin with.
But he's advocating the same style on games, just with structs and functions rather than buffers and processes. This makes me think OOP is more of a mistake, although I will admit that I use objects as modules now (a longer conversation, but the idea is to instantiate almost all your objects at startup, and not allocate during normal program flow).
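As a rough sketch of that "objects as modules" idea (hypothetical names, obviously simplified): the handful of long-lived objects are built once in main and wired together, and the steady-state path just calls into them without allocating.

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    class Logger {
    public:
        void log(const char* msg) const { std::fputs(msg, stderr); }
    };

    class Store {
    public:
        explicit Store(std::size_t capacity) { data_.reserve(capacity); }  // allocate up front
        void put(int value) {
            if (data_.size() < data_.capacity()) data_.push_back(value);   // never reallocates
        }
    private:
        std::vector<int> data_;
    };

    class App {
    public:
        App(Logger& log, Store& store) : log_(log), store_(store) {}
        // Normal program flow: no new objects, no allocation, just data moving through.
        void handle(int value) {
            store_.put(value);
            log_.log("handled\n");
        }
    private:
        Logger& log_;
        Store&  store_;
    };

    int main() {
        Logger logger;            // all "modules" instantiated at startup
        Store  store(1024);
        App    app(logger, store);
        for (int i = 0; i < 10; ++i) app.handle(i);
    }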
I've long thought there was this interesting historical oddity in that we came up with languages in the 90's and designed them for GUIs, but we're using them in the 2010's for web and server programming. Then you get web apps that are a mess of misleading objects, when really the architecture is stateless (or the state is stored in a database, which is not programmable using your language).
I think our main error was in thinking that objects could be pervasive, in the same way that today one encounters "test-driven" culture.
They work reasonably well at a lower level of granularity - make a fairly complex data model using components under the hood, and then publish the resulting api as a more streamlined mutable object.
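Something like this, perhaps (a sketch with made-up names): the data model underneath is component arrays, and what gets published is a small mutable object that indexes into them.

    #include <cstddef>
    #include <vector>

    struct Positions  { std::vector<float> x, y; };   // component storage under the hood
    struct Velocities { std::vector<float> x, y; };

    // The streamlined, mutable object published as the API.
    class Particle {
    public:
        Particle(Positions& p, Velocities& v, std::size_t i) : p_(p), v_(v), i_(i) {}
        void  set_velocity(float vx, float vy) { v_.x[i_] = vx; v_.y[i_] = vy; }
        float x() const { return p_.x[i_]; }
        float y() const { return p_.y[i_]; }
    private:
        Positions&  p_;
        Velocities& v_;
        std::size_t i_;
    };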
But the diminishing returns are fast once you start needing each piece of data to be fungible and adapt to new requirements, and then wrap it up in an object. The relational database has been standardized for some time in large part because it guides you towards coherent data automatically - systems that resist that model are more likely to experience a catastrophic failure, and "ORM" systems are likewise brittle.
Compare with "The Next Mainstream Programming Language: A Game Developer's Perspective" (https://news.ycombinator.com/item?id=1029820), by the people behind Unreal. They advocate a functional approach for games, too.
When I took my first APL class the prof drilled into our heads to focus on data representation before writing any code. Over the years and across a dozen languages or so this has proven to perhaps have been the most valuable thing I learned in CS. I lost count of how many problems I've seen go from difficult to manageable and even simple to solve by devoting more effort towards finding the best way to represent the data or the problem than simply taking things as presented and writing code right away.
I feel the author has chosen poor examples for illustrating 'bad code'. Does he really think so little of the id software team that they aren't aware of these things? There's no mention of the fact that in the real world, all the patterns and weird tricks there are won't save you from compromising on a perfect design to get stuff done & shipped.
I don't think it's a bad example. I remember John Carmack saying Doom 3 was their first C++ project, that they weren't very familiar with the language, and that they made some poor decisions because of it.
This idea of data-oriented design has been influencing much of how I write code and approach problems these days. In dynamic GC'd languages it's all too easy to trick yourself into believing that all problems are unbounded. However most problems are not. And doing the math ahead of time allows you to write the minimal amount of code to do the transformation required.
I think the author has some great points to begin with, but the finale is a little.. lacklustre.
It's all well and good to say we can separate process from objects, but in reality it's not uncommon to require instantiating new objects/components dynamically or changing structure in ways that don't map so easily to the input->process->output diagram shown.
I also think that maintainable code and testable modules were among the niceties of OOP, ahead of all the inheritance spaghetti that started happening in many poorly-OOPed projects.
Kudos though to the author for showcasing his first few points by commenting on Doom's code- you don't often see that.
Couldn't agree more. Whatever program we write, it is manipulating and transforming data. If our code revolves around data, it couldn't make more sense to formulate the data before actually starting on the code that is supposed to work on that data.
I used to think I can communicate the architecture (including code) of an entire website using just the tables. And a good enough developer+designer could turn the schema into a website.
I so wish C++ would let me code like this without getting in the way with its OO-ishness. Recent gripes:
- got a few bit flags per struct, need to know if there is at least one instance with the flag set (a counter per flag). I cannot bit-pack the flags without duplicating the flag getter/setter/destructor code.
- no way to declare static-linkage functions as friends (the linkage is forced to external), so I need to expose types that are implementation details in a public header.
It would be nice if C++ could get some RAD-slanted design work, instead of trying to get all RUSTy.
(expanding on the above while using non-touchscreen keyboard)
The problem I continually encounter writing data-first in C++ is the inability to group data members that relate to a single implementation detail but are scattered across multiple classes, and to indicate in a compiler-parsable way which functions have the responsibility to manage those data fields. For example, consider maintaining M-to-N relation links between classes A and B, without putting the links in a parent class, while grouping all implementation details of those links in a single file. Or even maintaining a reference counter without using inheritance or the other heavy-OO machinery mentioned in the article.
Even worse if your data is only a bitfield.
C++ assumes a single implementation detail to be a single class, contained in a single allocation. That can be subverted using "public", "friend", or CRTP, but it feels like hammering nails with pliers.
Newer standards go all-out on supporting really powerful data structures in the library, and on writing incredible templates, but there is nothing for low-level, C-style custom data handling.
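For what it's worth, here is roughly what the "M-to-N links as their own implementation detail" version looks like if you give up on access control and just write the C-style data (invented names; the pain point above is that C++ offers no clean way to say "only this file may touch these fields"):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct A { /* domain data */ };
    struct B { /* domain data */ };

    // The whole relation lives in one flat table, maintained only by the functions below.
    struct ABLinks {
        struct Pair { std::size_t a, b; };
        std::vector<Pair> pairs;
    };

    inline void link(ABLinks& links, std::size_t a, std::size_t b) {
        links.pairs.push_back({a, b});
    }

    inline void unlink_all_for_a(ABLinks& links, std::size_t a) {
        links.pairs.erase(
            std::remove_if(links.pairs.begin(), links.pairs.end(),
                           [a](const ABLinks::Pair& p) { return p.a == a; }),
            links.pairs.end());
    }

    inline std::vector<std::size_t> bs_linked_to(const ABLinks& links, std::size_t a) {
        std::vector<std::size_t> out;
        for (const ABLinks::Pair& p : links.pairs)
            if (p.a == a) out.push_back(p.b);
        return out;
    }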
> The boss comes in and says "Hey, change of plans. The player is now a car."
Stupid decision-making is why I will never work in games. With regular coding, you can iterate towards the domain and so long as you keep your wits about you and never make the same mistake twice, never be subjected to such a demand.
In the arts, everything's got to hew to some asshole's vision. The people building it can never really know what's going on in his head and the asshole never understands how much work it is to change things to fit the evolving vision. Killer features on games are rarely evident from day one, game shops seem to always wind up looking like sausage factories.
> Stupid decision-making is why I will never work in games. With regular coding, you can iterate towards the domain and so long as you keep your wits about you and never make the same mistake twice, never be subjected to such a demand.
In the ideal environment, perhaps; all too often in real environments, radical shifts in requirements not discoverable by advance consideration of the domain can happen outside of games (sometimes, because there are radical shifts in the domain -- if your system is automating implementation of corporate or government policy, whims of policymakers can change things as radically as "the player is now a car").
I work for a marketing company, I'm well aware of shifting whims. You can build flexibility into your infrastructure if you really need it. You come up with a model for how things are likely to change in the future and work modularity into your codebase.
This approach doesn't really work for games. Everything is so interdependent that modularity allowing for all the possible ways the system could change would slow it down to a crawl. There's just no way to avoid expensive, painful rewrites.
This is somewhat true, but you structure the code in a way that makes rewriting any one part easy. You keep your modules fairly large, your abstractions thin, and the code very literal. It's much easier to do a total rewrite of code like this than of code that's heavily abstracted.
It's very quick to write code like this, and the maintenance cost of this isn't as large as you might think (honestly, after you get used to it, you often end up preferring it. It's very, uh, blunt, for lack of a better term).
Edit: Worth noting that these rewrites aren't always due to a change in creative direction. There are often ones for technical reasons.
via "Objects have not failed" -- Guy Steele, http://www.dreamsongs.com/ObjectsHaveNotFailedNarr.html