Let's Build NSMutableArray

pacaro · on Dec 11, 2012

"This code omits error checking in the interest of brevity and simplicity."

In the interest of brevity and simplicity, and at the cost of correctness and reliability. I understand the intent, but I believe there is a balance to be struck here for all code, whether example, prototype, or production.

Because sample code so very often chooses an extreme of that spectrum, it is easy for neophyte coders to end up thinking that is how all code should be written.

huxley · on Dec 11, 2012

I'd love to see a widget that would let you display and toggle between two versions of the code on a webpage:

* a first version illustrating the core concepts

* a second, more complex version aimed at showing proper coding practices, including comments that flag where you can have performance or security issues.

It would be a nice addition to the many source code beautifiers/formatters out there.

Firehed · on Dec 11, 2012

Unfortunately, if it's the first semi-working code sample in a Google result, it will still end up in production somewhere, no matter how crappy it is or how many warning labels are put on it.

mikeash · on Dec 11, 2012

This is a really cool idea, and I'm going to have to see if I can incorporate it into future blog posts. I'm not sure if it'll be worth the trouble, but it certainly bears thinking about.

pi18n · on Dec 11, 2012

I guess it depends on the audience and the intent of the tutorial. Other posts on his blog include things like "let's have a look into the Objective-C runtime" and "let's unwind the stack on multiple threads to get a nice crash report", so I assume he's writing for people who understand enough to know that a final implementation needs to be robust and reliable.

mikeash · on Dec 11, 2012

You got it. While stuff for beginners is undoubtedly useful, I leave that to others.

mistercow · on Dec 11, 2012

>For the unfamiliar, memmove is a potentially slower variant of memcpy

It's worth pointing out that the performance hit of memmove vs. memcpy is a single check in memmove that compares the source and destination addresses to decide whether to copy front to back or back to front. You're never going to see a measurable performance gain from replacing a memmove call with a memcpy.
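To make that check concrete, here is a minimal byte-wise sketch of the idea. The name `sketch_memmove` is hypothetical, and real library implementations copy in much larger units and may choose the direction differently; this only illustrates the address comparison described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only: not how any particular libc implements memmove. */
void *sketch_memmove(void *dst, const void *src, size_t n) {
    assert(n == 0 || (dst != NULL && src != NULL));

    unsigned char *d = dst;
    const unsigned char *s = src;

    /* The one extra decision memmove makes: compare the two addresses. */
    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination is lower: copy front to back. */
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];
        }
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination is higher: copy back to front, so overlapping
           source bytes are read before they are overwritten. */
        for (size_t i = n; i > 0; i--) {
            d[i - 1] = s[i - 1];
        }
    }
    /* If the addresses are equal there is nothing to do. */
    return dst;
}
```

Shifting four bytes of "abcdef" two positions to the right with this function yields "ababcd", which a naive front-to-back copy would get wrong because the regions overlap.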

lilyball · on Dec 11, 2012

That single check could potentially be a branch mis-predict.

mistercow · on Dec 11, 2012

The only time it would make a big difference would be if you were doing a lot of small memmoves in a tight loop. In that scenario, you mispredict at the end of the loop within memmove/memcpy anyway. So when the outer loop comes back around, the pipe is already empty and any mispredict on the pointer comparison is basically harmless.

paupino_masano · on Dec 10, 2012

I think this is a great article to refresh your knowledge as to how the underlying data structures actually work.

My only nitpick is ignoring the capacity variable in initWithCapacity. Really, a simple allocation of memory here and setting the _capacity variable is all that's needed. It IS an important implementation to write, especially if you know from the start that you're going to be working with a large array, because it avoids the memory copies.
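As a rough sketch of that point, in plain C rather than Objective-C: the `SketchArray` type, its function names, the doubling policy, and the `copies` counter below are all assumptions made for illustration, not the article's code. Honoring the capacity up front means appends up to that size never trigger a grow-and-copy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of pointers. */
typedef struct {
    void **items;
    size_t count;
    size_t capacity;
    size_t copies;   /* elements copied because of growth (for illustration) */
} SketchArray;

/* Analogue of initWithCapacity: allocate the requested space and record it. */
int sketch_array_init(SketchArray *a, size_t capacity) {
    assert(a != NULL);
    if (capacity > SIZE_MAX / sizeof *a->items) return -1;
    a->items = capacity ? malloc(capacity * sizeof *a->items) : NULL;
    if (capacity && a->items == NULL) return -1;
    a->count = 0;
    a->capacity = capacity;
    a->copies = 0;
    return 0;
}

/* Append, growing when full. Doubling with a floor of 16 is an assumed policy. */
int sketch_array_append(SketchArray *a, void *item) {
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 16;
        if (new_cap > SIZE_MAX / sizeof *a->items) return -1;
        void **new_items = malloc(new_cap * sizeof *new_items);
        if (new_items == NULL) return -1;
        if (a->count > 0) {
            memcpy(new_items, a->items, a->count * sizeof *new_items);
        }
        a->copies += a->count;
        free(a->items);
        a->items = new_items;
        a->capacity = new_cap;
    }
    a->items[a->count++] = item;
    return 0;
}

void sketch_array_free(SketchArray *a) {
    free(a->items);
    a->items = NULL;
    a->count = a->capacity = 0;
}
```

Appending 1000 items to an array initialized with capacity 1000 leaves `copies` at zero, while starting from capacity 0 goes through several grow-and-copy rounds along the way.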

But otherwise - I enjoyed that. I think I may just start following his blog to refresh my knowledge with other data structures!

Someone · on Dec 10, 2012

Make sure you follow the link to http://ridiculousfish.com/blog/posts/array.html. It tells you that, in this case, this is not how the "underlying data structures actually work".

This NSMutableArray subclass has O(1) access, as one expects from an array. NSMutableArray itself, however, only guarantees O(log n).

chc · on Dec 10, 2012

Underlining the fact that NSMutableArray may ignore your capacity hint is also important, though. Many people make false assumptions about how NSMutableArray manages memory.

DrJokepu · on Dec 11, 2012

The Cocoa documentation does not seem to suggest that the numItems parameter of the initWithCapacity message is in any way optionally handled by the runtime. It repeatedly states that the array will be initialised "with enough memory to hold numItems objects" in no uncertain language. It does not imply in any way that it is merely a hint. If the implementation behaves differently, that's either a problem with the documentation or the implementation, not with people's assumptions.

chc · on Dec 11, 2012

Interesting! You're right that it says that now. Thanks for the correction.

I'm nearly positive that the wording used to be much more vague on that point, as I remember thinking in my early days that it was so fuzzy as to be nearly pointless to go out of your way to give an accurate capacity. Mike is an old-time Cocoa guy, so if my memory is serving me well, he was probably thinking of the same old wording.

lilyball · on Dec 11, 2012

I too thought it was fuzzy. In fact, I'm still reasonably sure that the implementation still doesn't actually guarantee it reserves that space, despite how the documentation is now worded, although I am not an authority on this subject. It might be interesting to check out the CFArray implementation on opensource.apple.com and see what that does (although CFMutableArray treats capacity a bit differently than NSMutableArray to begin with).

mikeash · on Dec 11, 2012

I wonder just what such a guarantee means in an API contract, since there's no way whatsoever to actually check on that behavior from the outside.

awolf · on Dec 11, 2012

Nice one Mike! My one improvement would be using realloc in the out-of-space case vs. malloc+memcpy+free.

http://www.cplusplus.com/reference/cstdlib/realloc/

masklinn · on Dec 11, 2012

Although I had the same initial reaction, in the comments he explains that he deliberately used malloc/memcpy/free to demonstrate the underlying behavior.

mikeash · on Dec 11, 2012

Right. The reallocation behavior is so fundamental to the implementation of NSMutableArray that there seemed no point in calling realloc directly. If I did that, might as well just use NSMutableArray instead of writing my own, and then there wouldn't be anything to write about.
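For readers comparing the two approaches in this sub-thread, a small side-by-side sketch may help. The function names `grow_manual` and `grow_realloc` are hypothetical, and this is not the article's code; it only shows that the explicit allocate/copy/free sequence and a single realloc call end up with the same contents in a larger block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Grow by hand: allocate a new block, copy the old bytes, free the old block.
   On failure the old block is left untouched and NULL is returned. */
void *grow_manual(void *old, size_t old_size, size_t new_size) {
    assert(new_size > 0 && new_size >= old_size);
    void *fresh = malloc(new_size);
    if (fresh == NULL) return NULL;
    if (old_size > 0) {
        memcpy(fresh, old, old_size);
    }
    free(old);
    return fresh;
}

/* Grow with realloc: the allocator may extend the block in place or move it.
   On failure it returns NULL and the old block remains valid, so callers
   should not overwrite their only pointer with the result before checking. */
void *grow_realloc(void *old, size_t new_size) {
    assert(new_size > 0);
    return realloc(old, new_size);
}
```

The manual version spells out every step, which suits a walkthrough of how a growable array works; realloc hides the copy and can sometimes skip it entirely when the block can be extended in place.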

nthitz · on Dec 10, 2012

Brings me back to my Data Structures class. Funny how often we use these features yet take for granted the underlying implementation. Classic first year CS student classwork (although we did Java, not ObjC).