
Code samples or GTFO. No seriously: an article that brings up messy real-world OO code and then handwaves it away with "just use FUNCTIONS, yo!" and some box-and-arrow diagrams is useless and borderline unhelpful. Something like:

http://prog21.dadgum.com/37.html

is, IMO, more valuable in the discussion. It points out the tradeoffs explicitly: pure functions mean (in this case) immutable data, which means more-complex data structures - which means the OPPOSITE of the micro-optimization stuff this article closes with.

It's also worth noting that if you're fussing about the cache efficiency of code you haven't written yet, STOP. JUST STOP. Write the code and MEASURE IT.




There are some things worth doing the napkin math on before even bothering to write the code. That includes cache efficiency. Design mistakes are rather difficult to tear out once you've written a few thousand lines of heavily depended-on code and only then realize the design was wrong to begin with.
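To make the napkin math concrete, here is a rough sketch of the kind of estimate I mean. Every number in it is an assumption for illustration: a 64-byte cache line, a hypothetical 128-byte entity struct that is line-aligned with its hot fields at the front, and an update that reads 16 bytes per entity. The helper names are made up too.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines, common on current x86-64 parts.
constexpr std::size_t kCacheLine = 64;

constexpr std::size_t ceil_div(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}

// Array of structs: the update strides over whole entities.
// Assumes entities are line-aligned and the hot fields sit together at the
// front, so a big entity costs ceil(hot_bytes / line) lines per entity.
// Entities smaller than a line are simply streamed end to end.
constexpr std::size_t aos_lines(std::size_t n, std::size_t entity_bytes,
                                std::size_t hot_bytes) {
    return entity_bytes >= kCacheLine
               ? n * ceil_div(hot_bytes, kCacheLine)
               : ceil_div(n * entity_bytes, kCacheLine);
}

// Struct of arrays: only the hot bytes are streamed, densely packed.
constexpr std::size_t soa_lines(std::size_t n, std::size_t hot_bytes) {
    return ceil_div(n * hot_bytes, kCacheLine);
}
```

With 10,000 entities, `aos_lines(10000, 128, 16)` comes to 10,000 lines (about 640 KB pulled through the cache) against 2,500 lines (160 KB) for `soa_lines(10000, 16)`. It is only an estimate and still has to be checked against a real measurement, but it costs nothing to do before writing the code.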

Since it's mostly game engine programmers presently driving this design philosophy, imagine a scene in a game. There are some number of entities in the scene. Is that number infinite? No. Does each of those entities exist in its own bubble? No. You want to process all of the entities that need updates all at once. One transformation. You also know that from frame to frame there may be 0 to N entities in the scene, but you don't want to be constantly allocating, deallocating, and resizing data structures. So you sketch out a reasonable number of entities to support in a scene. You know the data each entity needs to track to participate in the scene, and you figure out the best way to pack all of that into a struct of arrays so that processing that update is fast and doesn't waste a byte of cache. You haven't even written a line of code yet.
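Roughly what that sketch turns into once you do write it down. This is a minimal illustration, not anyone's actual engine code: the capacity `kMaxEntities`, the field names, and the `spawn`/`despawn`/`integrate` helpers are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical budget decided up front; the right number depends on the game.
constexpr std::size_t kMaxEntities = 4096;

// Struct of arrays: each field gets its own contiguous, preallocated array,
// so a pass over positions and velocities streams through only those bytes.
struct Scene {
    std::array<float, kMaxEntities> pos_x{};
    std::array<float, kMaxEntities> pos_y{};
    std::array<float, kMaxEntities> vel_x{};
    std::array<float, kMaxEntities> vel_y{};
    std::size_t count = 0;  // live entities occupy indices [0, count)
};

// Adds an entity and returns its index. No allocation happens here.
inline std::size_t spawn(Scene& s, float x, float y, float vx, float vy) {
    assert(s.count < kMaxEntities);
    const std::size_t i = s.count++;
    s.pos_x[i] = x;
    s.pos_y[i] = y;
    s.vel_x[i] = vx;
    s.vel_y[i] = vy;
    return i;
}

// Removes entity i by moving the last live entity into its slot, which keeps
// the live range dense. Note this changes the index of the moved entity.
inline void despawn(Scene& s, std::size_t i) {
    assert(i < s.count);
    const std::size_t last = --s.count;
    s.pos_x[i] = s.pos_x[last];
    s.pos_y[i] = s.pos_y[last];
    s.vel_x[i] = s.vel_x[last];
    s.vel_y[i] = s.vel_y[last];
}

// One transformation over every live entity.
inline void integrate(Scene& s, float dt) {
    for (std::size_t i = 0; i < s.count; ++i) {
        s.pos_x[i] += s.vel_x[i] * dt;
        s.pos_y[i] += s.vel_y[i] * dt;
    }
}
```

The per-frame work is a single call to `integrate(scene, dt)`. Entities coming and going only move `count` and copy a few floats, so nothing is allocated or resized between frames.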

There is a give and take. You have to test out your assumptions and not be afraid to throw away bad code and start again. Once you start down a path you may find that the bulk of the accesses to entity data are to the members related to the physics update. You might then choose a different structure to pack that data into, apart from the cold data that doesn't change much.
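A hot/cold split along those lines might look something like this. Again a sketch under assumptions: which fields count as hot is something you would learn from profiling, and the `PhysicsHot`, `EntityCold`, and `World` names and their fields are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hot: read and written every frame by the physics update.
struct PhysicsHot {
    std::vector<float> pos_y;
    std::vector<float> vel_y;
};

// Cold: looked at occasionally (UI, debugging), rarely changed.
struct EntityCold {
    std::string name;
    int team;
};

// The same index identifies one entity in both the hot and the cold storage.
struct World {
    PhysicsHot hot;
    std::vector<EntityCold> cold;
};

// Reserves storage up front so adding entities does not reallocate later.
inline World make_world(std::size_t capacity) {
    World w;
    w.hot.pos_y.reserve(capacity);
    w.hot.vel_y.reserve(capacity);
    w.cold.reserve(capacity);
    return w;
}

inline std::size_t add_entity(World& w, float y, float vy,
                              std::string name, int team) {
    assert(w.hot.pos_y.size() == w.cold.size());
    w.hot.pos_y.push_back(y);
    w.hot.vel_y.push_back(vy);
    w.cold.push_back(EntityCold{std::move(name), team});
    return w.cold.size() - 1;
}

// Takes only the hot data, so names and teams are never pulled into cache.
inline void step_physics(PhysicsHot& hot, float gravity, float dt) {
    for (std::size_t i = 0; i < hot.pos_y.size(); ++i) {
        hot.vel_y[i] += gravity * dt;
        hot.pos_y[i] += hot.vel_y[i] * dt;
    }
}
```

Having `step_physics` take `PhysicsHot&` rather than the whole `World` is deliberate: the signature itself shows that the per-frame loop cannot touch the cold data.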

Much of data-oriented design is like this: save the modelling for documentation and diagrams and things that humans understand. Write the code for the machine since the machine is the platform. You lose mechanical sympathy when you start designing data structures based on intuition.

It doesn't cost you anything to think about these things. It's just engineering.


I don't think the article is advocating pure functional programming. It's more of a data-oriented/array programming style, which has been gaining traction among game programmers. There are similarities, but more differences than similarities.

I recommend you watch the Mike Acton talk he linked at the end (even if you aren't a C or C++ developer).

P.S. And yes, I agree you should measure (obviously), but that is only one half of the equation. You make assumptions about the performance issues and verify them by measuring. Measuring on its own has a very poor signal-to-noise ratio, and your assumptions can be wrong. Both are needed to optimize code well.
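For the measuring half, even something this small beats guessing. It is a minimal sketch, not a proper benchmark harness: `best_of_ns` and `sum` are hypothetical helpers, and taking the fastest of several runs is just one simple way to cut down the noise.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs fn() `repeats` times and returns the fastest wall-clock time in
// nanoseconds. The minimum is less sensitive to scheduler and cache noise
// than a single run.
template <typename Fn>
long long best_of_ns(Fn&& fn, int repeats) {
    assert(repeats > 0);
    long long best = -1;
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        const long long ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0)
                .count();
        if (best < 0 || ns < best) best = ns;
    }
    return best;
}

// Stand-in workload: replace with the loop you actually made a guess about.
inline float sum(const std::vector<float>& v) {
    float total = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
```

Write the result of the workload to something the compiler cannot discard (a `volatile` variable, for instance), otherwise the optimizer may delete the loop you are trying to time. Then compare the number against what your napkin math predicted.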

If what you said were true, it wouldn't be possible to design for efficient cache access. And it is.


Huh? I read the article as being about simple code that is easier to understand and maintain because it's not semantically overloaded with abstractions, and picking data structures that fit the need of your process, instead of some patterns because Herb said so or something. Usually you don't need code samples for that because it's such a common bad thing.

The ultimate data-first man would be Chuck Moore.



