Hacker News new | past | comments | ask | show | jobs | submit login

Before using realloc, you have to do this:

  void *sane_realloc(void *ptr, size_t size)
  {
     if (ptr == 0) {
       return malloc(size);
     } else if (size == 0) {
       free(ptr);
       return 0;
     } else {
       return realloc(ptr, size);
     }
  }
ISO C realloc has braindamaged corner cases. Some implementations behave like the above, in which case you can just have #define sane_realloc realloc on those targets.

With the above you can initialize a vector to null, with size zero, and use nothing but realloc for the entire lifetime management: growing it from zero to nonzero size allocates it, shrinking down to zero frees it.

malloc(0) doesn't necessarily return null; it can return some non-null pointer that can be passed to free. We can get such a thing if we call sane_realloc(0, 0), and can avoid that if we change the malloc line to:

   return size ? malloc(size) : 0;





According to the standard `realloc(NULL, size)` should already behave like `malloc(size)`. You shouldn't need that special case unless you're working on a system with a very buggy/non-compliant libc.

I think that you can omit the call to malloc, as realloc(NULL, size) does the same thing as malloc(size), and free(NULL) does nothing.

  void *sane_realloc(void *ptr, size_t size)
  {
     if (size == 0) {
       free(ptr);
       return 0;
     } else {
       return realloc(ptr, size);
     }
  }
Unfortunately, when shrinking an array down to 0, you run into a complication. Detecting allocation failure now requires checking both size > 0 and sane_realloc returning 0. To simplify this further, just always allocate a non-zero size.

  void *saner_realloc(void *ptr, size_t size)
  {
     if (size == 0) {
       size = 1;
     }
     return realloc(ptr, size);
  }

But the second sane_realloc now never frees. That's a problem shared by by the ISO C realloc.

According to ISO C, size zero can behave like this:

  free(old)
  return malloc(0)
and if malloc(0) allocates something, we have not achieved freeing.

There are ways to implement malloc(0) such that it returns unique pointers, without allocating memory. Or at least not very much memory. For instance we can use the 64 bit space to have some range of (unmapped) virtual addresses where we allocate bytes, and use a compact bitmask (actually allocated somewhere) to keep track of them.

Such a scheme was described by Tim Rentsch in the Usenet newsgroup comp.lang.c.

If an implementation does such a thing, adjusting the size to 1 will defeat it; allocations of size 1 need real memory.

(I can't fathom the requirement why we need malloc(0) to be a source of unique values, and why someone would implement that as efficiently as possible, when it's implementation-defined behavior that portable programs cannot rely on. Why wouldn't you use some library module for unique, space-efficient pointers.)

I would never rely malloc(0) to obtain unique pointers at all, let alone pray that it is efficient for that purpose.

I'd be happy with a malloc(0) which returns, for instance, ((void *) -1) which can be hidden behind some #define symbol.

saner_realloc isn't realloc; it is our API, and we can make it do this:

   #define SANE_REALLOC_EMPTY ((void *) -1)

   void *sane_realloc(void *ptr, size_t size)
   {
     if (ptr == 0 || ptr == SANE_REALLOC_EMPTY) {
       return size ? malloc(size) : SANE_REALLOC_EMPTY;
     } else if (size == 0) {
       free(ptr);
       return SANE_REALLOC_EMPTY;
     } else {
       return realloc(ptr, size);
     }
   }
Now, a null return always means failure. The shrink to zero, or allocate zero cases give us SANE_REALLOC_EMPTY which tests unequal to null, and we accept that value for growing or freeing.

The caller can also pass in something returned by malloc(0) that is not equal to null or SANE_REALLOC_EMPTY.


I forgot to mention my own opinion. I think that malloc(0) ought to return a 0-sized object, and likewise realloc(ptr, 0) ought to resize the object to 0. Malloc and realloc always allocating feels more consistent to me. For a practical example, I have some code that reads an entire FILE stream into memory. This requires realloc with doubling the size every time. After finishing, I'd like to resize the buffer down to the total bytes read. If it reads 0 bytes, I'd like it to resize the buffer to 0.

I think my sane_realloc never freeing has much simpler behavior. As much as I hate the needless waste of 1 byte, if my code allocates thousands of 0-sized objects, I'd rather fix that before adding complexity to my sane_realloc.

With yours solving the 1 byte problem, it still interests me. We can simplify your code slightly.

  #define SANE_REALLOC_EMPTY ((void *) -1)

  void *sane_realloc(void *ptr, size_t size)
  {
    if (ptr == SANE_REALLOC_EMPTY) {
      ptr = 0;
    }
    if (size == 0) {
      free(ptr);
      return SANE_REALLOC_EMPTY;
    } else {
      return realloc(ptr, size);
    }
  }

A zero-sized object is immutable, since it has no bits to mutate. The pointer cannot be dereferenced. It may be compared to other pointers. So the question is: do you want to obtain unique zero-sized objects, or are you okay with there being one representative instance of all of them? If you're okay with one, you can just call SANE_REALLOC_EMPTY that object. I called it EMPTY on purpose; it represents an empty object with no bits.

If you want unique zero-sized objects, you can always simulate them with objects of size 1. The allocator API doesn't have to make that choice internally for you.


Like you, I have no idea what use the unique 0-sized pointers would provide. SANE_REALLOC_EMPTY is certainly adequate for me. Thank you!

In a Lisp implementation, I could use unique, zero sized objects to implement symbols. The gensym function would create a new one.

Zero-sized distinguishable objects can have properties attached to them via hashes. At the API level, you cannot tell how a property (e.g. symbol-name) is attached to an object.

Properties, like the symbol name, would be attached to these objects via hashes.

It's not an ideal representation for symbols when all symbols have a certain property, like name.

If you have objects such that they have certain properties, but the properties are rare (only a few objects have a value of the property that requires storing space, and for the rest it is some "undefined" value, you can achieve a sparse representation by making the objects as small as possible, attaching the properties externally.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: