Hacker News

Half-open intervals are why I try as much as possible to stay away from languages that use 1-based indexing (Lua, Julia, Matlab, R, ...) - 1-based indexing lends itself to closed intervals because an array of N elements has [1,N] as its index range, whereas 0-based indexing lends itself to half-open intervals because an array of N elements has [0,N) as its index range.
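A minimal Python sketch of the two conventions (Python itself is 0-based with half-open ranges):

```python
xs = ["a", "b", "c", "d"]
n = len(xs)

# 0-based, half-open [0, n): the array length is the exclusive endpoint.
assert [xs[i] for i in range(0, n)] == xs

# Half-open intervals concatenate cleanly: [0, k) + [k, n) = [0, n).
k = 2
assert xs[0:k] + xs[k:n] == xs

# Emulating 1-based, closed [1, n]: both endpoints need an adjustment.
assert [xs[i - 1] for i in range(1, n + 1)] == xs
```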

-------

However, I know of one case where closed intervals really shine. Consider displaying a zoomable map in tiles. On a given zoom level, each tile has some coordinates (x;y), where x and y are integers denoting the virtual column and row. Suppose that we allow zooming out by a factor of 2, so that two-by-two tiles are aggregated into a single tile. Then a natural choice for the coordinates of the zoomed-out tile is (floor(x/2);floor(y/2)), that is, divide by two and round down.

Suppose that a dataset has data on tile coordinates [x1,x2]×[y1,y2], meaning that there's only data on tiles (x;y) where x1≤x≤x2 and y1≤y≤y2. These are closed intervals, but stay with me - the reason they are nice in this case is how you compute the range of valid tile coordinates when you zoom out: the range becomes [floor(x1/2),floor(x2/2)]×[floor(y1/2),floor(y2/2)] - that is, you simply divide the range endpoints by two and round down. If you try to do this with half-open intervals, then you need some +1/-1 shenanigans, which are normally what I try to avoid by going for half-open intervals in the first place.
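A small Python sketch of that endpoint arithmetic (the function names are my own illustration, not from the comment):

```python
def zoom_out_closed(x1, x2):
    """Zoom a closed tile range [x1, x2] out by a factor of 2.

    With closed intervals, both endpoints are simply floor-divided:
    the parent of tile x is floor(x / 2)."""
    return (x1 // 2, x2 // 2)

def zoom_out_half_open(x1, x2):
    """Same operation on a half-open range [x1, x2): the exclusive
    endpoint picks up the +1 adjustment the comment mentions."""
    return (x1 // 2, (x2 + 1) // 2)

# Tiles on columns 2..6, i.e. closed [2, 6] or half-open [2, 7):
assert zoom_out_closed(2, 6) == (1, 3)       # parents are rows 1..3
assert zoom_out_half_open(2, 7) == (1, 4)    # same range as [1, 4)
```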



The example using closed intervals makes sense.

I see why you mention 1-based indexing, but it’s mostly a different beast. I’ll defend 1-based indexing here because, having used Lua a lot, the indexing issue just goes away.

People like what they’re used to. Most coders are used to 0-based, but we all start life 1-indexed. We begin counts with 1: chapter 1, the 1st floor of a building (in the US), 1 AD, the 1st of Dec, etc.

In C/C++, 0-based makes sense because we often care about pointers and pointer arithmetic. But in languages without pointers, that reason goes away. And you can still use half-open intervals with 1-based indexing.
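To illustrate that last point, here is a hedged Python sketch of half-open intervals over a 1-based structure (the dict-as-array is just for illustration):

```python
# Half-open intervals with 1-based indexing: the index range is [1, n+1).
xs = {i: v for i, v in enumerate("abcd", start=1)}  # a 1-based "array"
n = len(xs)

# Iterating [1, n+1) is still half-open, just shifted by one.
assert [xs[i] for i in range(1, n + 1)] == list("abcd")

# Splitting at k still needs no +1/-1: [1, k) and [k, n+1) partition it.
k = 3
left = [xs[i] for i in range(1, k)]
right = [xs[i] for i in range(k, n + 1)]
assert left + right == list("abcd")
```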


I was born 0 years old.


Which shows that you weren't born in Japan: people there are born 1 year old.


They are not, even though they start counting from 1. The child is still zero years old.


It's a matter of (natural language) terminology. It's the difference between asking "which year of your life are you currently in" as opposed to "how many full years have elapsed since you were born?"


Agreed. My previous reply just pointed out that 1 is not "how many full years have elapsed since you were born?". The answer to this question is still 0.


0-based is for offset arithmetic, not pointer arithmetic.


Can you explain what you mean by "having used Lua a lot, the indexing issue just goes away"? Most modern languages have facilities (iterator combinators) that make indexing less common than e.g. in C, but in my experience, you can never completely escape indexes, and for the indexes that I do have to deal with, I really want half-open intervals and 0-based indexing.


> 1 AD

And this makes things really confusing. 1999 is in the 20th century even though the year starts with "19". 2000 is still in the 20th century; the 21st century starts in 2001. "BC" dates feel like negative numbers, but they are not, because there is no year 0: using closed intervals, [10 BC, 10 AD] spans 20 years, while [10 AD, 30 AD] spans 21.
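A quick Python check of that closed-interval year arithmetic (the `century` helper and its convention, century N = years [100(N-1)+1, 100N], are my own illustration):

```python
def century(year):
    # Century N is the closed year range [100*(N-1)+1, 100*N],
    # so 1..100 -> 1st, 1901..2000 -> 20th, 2001 -> 21st.
    return (year + 99) // 100

assert century(1999) == 20
assert century(2000) == 20
assert century(2001) == 21

# No year 0: 10 BC..1 BC is 10 years and 1 AD..10 AD is 10 years,
# so [10 BC, 10 AD] spans 20 years, not the 21 that naive
# "negative number" subtraction would suggest.
years_bc = 10
years_ad = 10
assert years_bc + years_ad == 20
```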


I don't see how your example works. Consider how to implement zoom-out generically.

We can do the 1-D case and lose nothing compared to the 2-D case.

Take a sequence of 2n tiles labelled [k+1, k+2, ..., k+2n] that we wish to map to a sequence of n tiles labelled [l+1, ..., l+n]:

map e = (e - (k+1)) `intdiv` 2 + (l+1)

where intdiv is just the usual truncating division

mapping from 1-based array to 1-based array instantiates k=l=0

so

map e = (e - 1)/2 + 1

Clearly the simplest map comes from instantiating k = l = -1, which is 0-based:

map e = e/2

So now consider the half-open case. It must work the same way, because the 2 tiles beyond the last one map to the 1 tile beyond the sequence being mapped to. It couldn't work any other way.

The win comes from using a 0-based array, not from the choice between closed and half-open.
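The derivation above, sketched in Python (`make_zoom_map` is a made-up name, and `//` plays the role of intdiv for the non-negative indices used here):

```python
def make_zoom_map(k, l):
    """Map tile e in [k+1, k+2n] down to [l+1, l+n]:
    shift to 0-based, halve, shift back."""
    return lambda e: (e - (k + 1)) // 2 + (l + 1)

# 1-based arrays: k = l = 0 gives map e = (e - 1) // 2 + 1.
one_based = make_zoom_map(0, 0)
assert [one_based(e) for e in range(1, 9)] == [1, 1, 2, 2, 3, 3, 4, 4]

# 0-based arrays: k = l = -1 gives the simplest map, e // 2.
zero_based = make_zoom_map(-1, -1)
assert [zero_based(e) for e in range(0, 8)] == [0, 0, 1, 1, 2, 2, 3, 3]
assert all(zero_based(e) == e // 2 for e in range(0, 8))
```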


I don’t think your mapping example works. You’re manipulating a range in a way that is inherently lossy and incorrect, and have chosen a method that is trivially not reversible. Try to go the other direction, which you will want to do in projecting from tile space to the original coordinate space, and your closed range introduces error. So using a closed range only helped in one direction, and gave you a misleading sense that your range was still equivalent. Consider also what happens with the next tile along: you have overlap between adjacent tiles unless your tiles are aligned to a multiple of two. This demonstrates why it’s such a risky proposition: you’re baking in unstated assumptions which will have a habit of coming back to bite you later when you want to relax them.

My observation with this kind of data is that people work in a theoretically-continuous coordinate space (though it’s probably represented in floating-point numbers so that it’s strictly discrete, just at very high precision), and, when forced to deal with coarser resolutions like tile pixel grids, immediately project back so they can work in the preferred coordinate space. Even if your coordinate spaces are all square, you will have margins of error however you do this, due to quantisation error—and does the pixel at (0, 0) show the data from (−0.5, −0.5) to (0.5, 0.5), or from (0, 0) to (1, 1)?


"does the pixel at (0, 0) show the data from (−0.5, −0.5) to (0.5, 0.5), or from (0, 0) to (1, 1)?" - that's an interesting question, but it is independent of tile grids.

You're right that the "ideal datasets" of this world aren't discrete and pixelised. However, real-world pixel-based datasets exist, and real-world pixel-based screens exist, and you want to be able to explore large pixel-based datasets on pixel-based screens, which can be done in a really nice and crisp way if you just put one data pixel on each screen pixel. Suppose that these datasets live in some common Cartesian coordinate system but have different extents. The issue of storing the tile ranges that cover each dataset is a real proposition - it's not "lossy and incorrect" once you've accepted that pixels are what you have to deal with as input.

If I want a system where zooming out aggregates 2x2 tiles into a single tile, then I have to decide what happens if there's an odd number of tile rows on a zoom level: should the number of tile rows on the coarser zoom level be half rounded down, or half rounded up? It seems natural to take half rounded up, and then fill out the missing tile row with blank pixels.

What are the tile row numbers on the coarser zoom level? Suppose we start with tiles on rows 2,3,4,5,6 - that's [2,6] as a closed interval or [2,7) as a half-open interval. On the next zoom level, the tiles are aggregated so that rows 2 and 3 end up on row 1, rows 4 and 5 end up on row 2, and row 6 ends up on row 3. This means we have tiles on rows [1,3] or [1,4).

In general, if you have tiles on rows [a,b], then the next zoom level will have tiles on rows [floor(a/2), floor(b/2)]. Expressed with half-open intervals instead, if you have tiles on rows [a,b), then the next zoom level will have tiles on rows [floor(a/2), floor((b+1)/2)). I don't like the +1 in the half-open formula, so I'll take the closed-interval formulation any day.
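A brute-force Python check (my own, not from the comment) that both endpoint formulas agree with mapping every row individually:

```python
# For every small closed range [a, b], compare the parents obtained by
# mapping each row (row // 2) against the two endpoint formulas.
for a in range(64):
    for b in range(a, 64):
        parents = {row // 2 for row in range(a, b + 1)}
        lo, hi = min(parents), max(parents)
        # Closed formula: [a, b] -> [floor(a/2), floor(b/2)].
        assert (lo, hi) == (a // 2, b // 2)
        # Half-open formula: [a, b+1) -> [floor(a/2), floor((b+1+1)/2)).
        assert (lo, hi + 1) == (a // 2, (b + 2) // 2)

# The comment's concrete case, rows 2..6: parents are 1, 2, 3.
assert sorted({row // 2 for row in range(2, 7)}) == [1, 2, 3]
```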


I wonder if a lot of Lua's 1-indexed pain would be avoided by encouraging for-each style loops. Unless you need a specific sub range within an array, for-each will get you pretty far.


That's a big thing with Julia actually. A lot of people first getting into it criticise it for its use of 1-based indexing (I'm not the biggest fan of it either, tbh), but in reality, there are very few cases in which you actually directly interact with array indices at all. And even if you do, the idiomatic way to iterate over the indices of an array is to use the `eachindex` function, which has the benefit of working with multidimensional arrays, or arrays that start at an index other than 1.


Yeah, but it's hard to argue with the fact that pairs/ipairs in loops are 25% to 2.5x slower than using `for i = 1,#tbl do v = tbl[i] end`. Performance isn't everything, but it's not nothing either.

https://springrts.com/wiki/Lua_Performance#TEST_9:_for-loops


> +1/-1 shenanigans

"There are two hard problems in programming: cache invalidation, naming things, and off-by-one errors."



