The article linked from this plus page (http://exple.tive.org/blarg/2013/10/22/c...

lilyball · on Oct 23, 2013

There's another reason that I haven't see anyone bring up. If your native unsigned integer type is the same size as a word (which is common), and a word defines the limits of your address space, and if you have a byte pointer initialized to 0, then 0-based indexing allows you to index into every byte in your address space, whereas 1-based indexing prevents you from indexing the final byte in your address space.

Additionally, assuming an unsigned integer type is used for indexing, 0-based makes every index valid and meaningful, whereas 1-based leaves one index value (0) as invalid.

Dylan16807 · on Oct 24, 2013

Having a null value that's invalid is a good thing. Thanks for the novel argument in favor of 1-based indexing!

kenbot · on Oct 24, 2013

Erk, no it isn't. If your semantics require an optional value, then encode it that way explicitly. Don't inflict special cases and bugs on everyone else.

lectrick · on Oct 24, 2013

Please cite an example where this is superior to a 0-based-index example. Just saying "it's a good thing" won't fucking fly around here.

Dylan16807 · on Oct 24, 2013

I thought I explained myself, especially with my second comment, but sure I'll go over it again in detail.

Premise 1: Accessing a null index is a logically invalid operation.

Premise 2: Silent invalid operations are bad.

Premise 3: If a null has a special representation that the hardware understands to be invalid, it can trap or set a flag or otherwise not be silent.

Conclusion: Null indexes that are architecturally invalid to use are a good thing, because you can catch bugs involving their use immediately.

Edit: made terms clearer

lectrick · on Oct 24, 2013

Null is NOT the same as "zero"

Ruby has "nil", and 0. They are not the same, and should not be considered the same.

Would you say that accessing coordinate 0,0 is a "logically invalid operation"? What is an array but a 1 dimensional coordinate system?

Are you suggesting that accessing an array via an uninitialized parameter will show up as a runtime error because of 1-based indexing? That only happens if your language initializes values to 0.

Ruby initializes everything to nil, not 0, and yes, indexing by nil will fail.

Perhaps I'm not seeing your problem or solution because Ruby doesn't even have the kind of problem that is solved by a 1-based array index. See: the Sapir-Whorf Hypothesis http://en.wikipedia.org/wiki/Linguistic_relativity , your thinking may be suffering due to the language you are typically working in.

Dylan16807 · on Oct 24, 2013

>Would you say that accessing coordinate 0,0 is a "logically invalid operation"?

If you have a matrix that starts at cell 1,1 then trying to access 0,0 is absolutely a logically invalid operation. I'm not saying that 0-indexing is illogical, I'm saying that dereferencing null/nil is illogical.

As for the rest of your post, the scenario eridius came up with is pretty weird, at this point I'm just going to shrug and give up on analysis.

IanCal · on Oct 24, 2013

Premise 4: Your type system can't tell the difference between "null" and 0.

Adding 4 and null also makes no sense, yet 4 + 0 is perfectly fine.

> Conclusion: Null indexes that are architecturally invalid to use are a good thing, because you can catch bugs involving their use immediately.

Recommendation: Have a good type system.

    >>> a = [1]
    >>> a[None]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: list indices must be integers, not NoneType

lilyball · on Oct 24, 2013

Passing 0 as the index into an indexing operation is not even remotely "accessing a null value".

Also, believe it or not, on various machines and OS's it's legal to map address 0.

Edit: Your edit makes even less sense. There is no such thing as a "null index", much less an operation to "access a null index".

Dylan16807 · on Oct 24, 2013

>There is no such thing as a "null index",

Would you rather I write "index that is 0 in a world where indexes start at 1" over and over? I am willing to do that if it helps make the conversation smoother.

>much less an operation to "access a null index".

foo_array[0]; //operation to do an access with a null index

>Also, believe it or not, on various machines and OS's it's legal to map address 0.

Which is a bad thing. You get a couple extra variable slots in exchange for thousands of privilege-escalation bugs that could instead be crash bugs.

lectrick · on Oct 24, 2013

How in the hell does "accidentally" accessing the first element of an array (by "accidentally" passing 0 to an array in a language that has 0-based indexing) result in "privilege escalation errors"? That's just "shitty programming".

For that matter, how does a 1-based indexing language NOT result in far more "off by one" errors which would include the kind of errors commonly known as "buffer overflow" errors?

Dylan16807 · on Oct 24, 2013

In most cases indexing doesn't affect safety, the privilege stuff was related to what eridius was talking about with physical addresses.

I don't think either one creates more off-by-one errors in people used to it.

lilyball · on Oct 24, 2013

It is? What possible reason is there to have an invalid value for an index parameter?

Dylan16807 · on Oct 24, 2013

You specifically said the native word size and address space. That's going to be a pointer, where a value that causes the CPU to trap is quite handy.

And there's already plenty of invalid index values. -1 for example. There's no way to make an allocation the size of your entire ram, so 0xff...f is going to be invalid.

lilyball · on Oct 24, 2013

> That's going to be a pointer, where a value that causes the CPU to trap is quite handy.

You're confusing an invalid pointer with an invalid index. There is absolutely no reason for any index to ever be invalid.

> And there's already plenty of invalid index values. -1 for example.

-1 is not representable with unsigned integers. There are no invalid index values in a 0-based indexing system (using unsigned integers, as I initially stated).

> There's no way to make an allocation the size of your entire ram, so 0xff...f is going to be invalid.

This is so completely off the mark I'm surprised you even came up with it. There are several things wrong with this (off the top of my head):

1. I don't need an allocation the size of my address space in order to want to address both the first and last bytes in the address space.

2. You don't know what machine I'm running on. This could be an embedded 16-bit machine without multiprocessing with 64KB of RAM. Every single addressable byte is valid and corresponds to RAM storage.

3. It doesn't even matter whether or not a given address can be loaded. Indexing a pointer does not necessarily require loading it. As a trivial example, I could take the address of the resulting indexed value.

Dylan16807 · on Oct 24, 2013

>You're confusing an invalid pointer with an invalid index. There is absolutely no reason for any index to ever be invalid.

I'm not confusing them. You are very clearly describing a pointer.

>-1 is not representable with unsigned integers.

When I wrote -1 I meant "the value you get when you type -1 in your source code".

>This is so completely off the mark I'm surprised you even came up with it. There are several things wrong with this (off the top of my head):

>1. I don't need an allocation the size of my address space in order to want to address both the first and last bytes in the address space.

That's a pointer issue. If I allocate 20 bytes near the end of ram, my last index is only going to be 19 or 20, so it doesn't matter if I can't go to UINT_MAX.

> 2. You don't know what machine I'm running on. This could be an embedded 16-bit machine without multiprocessing with 64KB of RAM. Every single addressable byte is valid and corresponds to RAM storage.

Again, that's a pointer.

> 3. It doesn't even matter whether or not a given address can be loaded. Indexing a pointer does not necessarily require loading it. As a trivial example, I could take the address of the resulting indexed value.

Taking the address is a pointer issue.

kens · on Oct 24, 2013

The article you link to is interesting with a lot of research, but it is rather hostile and reaches a strange conclusion. The article quotes the creator of BCPL (precursor language to C): "if p is a pointer [...] Naturally p+0 has the same value as p. [... discussion of the indirection operator ...] I can see no sensible reason why the first element of a BCPL array should have subscript one." So the explanation from the creator of BCPL is basically what you'd expect, that 0-based is the natural thing to do if you're working with machine-level pointers.

But then the article goes off into compiler optimization, a total misunderstanding of time-sharing, and a misunderstanding of how indexing works and concludes that in BCPL arrays started at 0 to minimize compile time to avoid job preemption, and that's why many languages today use 0-based arrays. The article is still worth reading, though.

soperj · on Oct 23, 2013

VBScript is 0 based for some stuff, but 1-based for slicing. Drove me nuts for a day until I figured it out inspite of the criptic error message. Hate VBScript for this reason alone.

ams6110 · on Oct 24, 2013

One thing to remember is that Lua arrays are 1-based by convention, not requirement, since they are really just tables indexed by integers. So you can have zero-based arrays in Lua, but you'll be making life difficult for yourself as the convention, the libraries, and array literals will all want to give you 1-based arrays.

SomeCallMeTim · on Oct 24, 2013

...and if you're using LuaJIT, 0-based arrays will even be optimized appropriately (i.e., array[0] will be stored internally in the array rather than in the hash).

But you'll still be fighting with the libraries and array literals, as you say.

justincormack · on Oct 24, 2013

You can fix some libraries by using ipairs. But not all of course.

ams6110 · on Oct 24, 2013

I seem to make off-by-one errors about the same either way. They are just off by one more or less.