My experience is that story points get a bad rep because they don't mean anything unless you use them as an explicit proxy for time. In reality there's no way to say "this task is small" unless you have some idea of how long it takes. Additionally, the concept of velocity makes no sense, because a task that's big for me might be small for someone else on the team. So we either pre-assign tasks and set points based on the assignee (and then have problems if we switch the assignee for any reason), or we assign "generic points", which end up meaning nothing at all unless the team is very uniform (e.g. all seniors with similar skills and ownership of most of the same code, or all juniors).
Also, all methodologies tend to discourage correcting point values after the fact. That makes the process of deriving time estimates from velocity even more error-prone, because it conflates uncertainty with mistakes. That is, you can correctly estimate a task at 8 points and finish it in 4 weeks, or you can incorrectly estimate a task at 3 points and finish it in 4 weeks. That doesn't mean the team has a velocity of about 1.4 points/week (11 points over 8 weeks); it means it has a velocity of about 2 points/week but made a mistake with one task.
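To make that arithmetic concrete, here is a minimal Python sketch. The task list and the `velocity` helper are hypothetical, and treating the second task as "really" an 8-pointer in hindsight is an assumption made purely for illustration.

```python
# Hypothetical tasks: (estimated points, weeks actually taken)
tasks = [(8, 4), (3, 4)]


def velocity(tasks):
    """Points per week: total estimated points divided by total weeks."""
    total_points = sum(points for points, _ in tasks)
    total_weeks = sum(weeks for _, weeks in tasks)
    return total_points / total_weeks


# Uncorrected estimates: 11 points over 8 weeks
naive = velocity(tasks)  # 1.375

# Assumption: after the fact, the 3-point task is re-scored as 8 points
corrected = velocity([(8, 4), (8, 4)])  # 2.0

print(naive, corrected)
```

If point values are never corrected, only the first number is ever recorded, and the estimation mistake gets folded into the team's apparent speed.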
>My experience is that story points get a bad rep because they don't mean anything unless you use them as an explicit proxy for time.
So when you shop for clothes, Small/Medium/Large are useless? You require precise measurements for every item, and they have to be exactly the same size across manufacturers, or else sizes have no utility for you? The reality is that a Large can be large on you in different ways, even if it's a t-shirt. And software complexity has a lot more dimensions than a t-shirt. The utility of story points is that they give a team a rough idea of its capacity over a sprint, so that it doesn't consistently under-commit or, more commonly, over-commit.
If you try to use story points purely as a uniform proxy for time, of course they're going to be useless, because you can always just use time instead.
Of course small/medium/large mean something: they are an approximation of size/dimensions. But story points, adherents claim, are not a measure of time at all! They are not "an approximation of time"; they are, so it is claimed, supposed to be related not to time but to "complexity".
I agree that a task can be large either because you know what must be done and there is a lot of work, or because you're not sure what needs to be done yet. But conflating those two things as "8 points" or whatever is just not helpful.
Story points are also "an approximation of size/dimensions." If my team has consistently deployed 25-35 story points per sprint for the last three sprints, it's reasonable for me to assume that next sprint they will also be able to complete about 30 story points of work. By contrast, knowing that they worked a combined total of 300 hours on average doesn't help me at all. And accounting for uncertainty is important, which is one reason a Fibonacci sequence is commonly used. The general rule is to go up one in the sequence if the team is uncertain. The whole purpose of story points is to avoid having to track things like uncertainty separately. It's like the Markov assumption: the information needed to get to the current point estimate is baked in. It is useful (essential, even) to incorporate fuzzy concepts like perceived complexity or uncertainty without bogging the team down trying to measure them precisely and explicitly.
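As a rough sketch of both ideas, forecasting from recent sprints and going up one step when unsure, something like the following would do. The scale, the `bump_if_uncertain` name, and the three sprint totals are all assumptions for illustration (the totals are made up to fall in the 25-35 range), not a prescribed method.

```python
from statistics import mean

# Assumed point scale: Fibonacci numbers as often used for story points
FIB_SCALE = [1, 2, 3, 5, 8, 13, 21]


def bump_if_uncertain(points, uncertain):
    """Move one step up the scale when the team is unsure about a story."""
    i = FIB_SCALE.index(points)
    if uncertain and i + 1 < len(FIB_SCALE):
        return FIB_SCALE[i + 1]
    return points


# Hypothetical totals for the last three sprints
recent_sprints = [25, 35, 30]
forecast = mean(recent_sprints)  # about 30 points next sprint

print(bump_if_uncertain(5, uncertain=True))   # 8
print(bump_if_uncertain(5, uncertain=False))  # 5
print(forecast)
```

The bumped estimate carries the uncertainty with it, so nothing else has to be tracked once the number is on the card.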
> If my team has consistently deployed 25-35 story points per sprint for the last three sprints, it's reasonable for me to assume that next sprint they will also be able to complete about 30 story points of work.
And if the last few sprints they've completed between 5 and 30 points, do you believe they'll complete around 17.5 points next sprint?
Now, if the team is good at estimating (which they are if they get consistent results between sprints), you can go from them telling you feature X is 8 points, Y is 5 points, and Z is 15 points to concluding that they will finish X, Y, and Z next sprint. But they can just as well tell you that X will take around 3 days, Y around 2 days, and Z around 5 days, and you can reach the same conclusion.
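A small Python sketch of why the unit doesn't change the plan. The `fits` helper is hypothetical, and both capacities (30 points, 10 working days per sprint) are assumptions chosen for illustration.

```python
def fits(estimates, capacity):
    """True if the summed estimates fit within the sprint capacity."""
    return sum(estimates.values()) <= capacity


points = {"X": 8, "Y": 5, "Z": 15}
days = {"X": 3, "Y": 2, "Z": 5}

# Assumed capacities: 30 points per sprint, or 10 working days per sprint
fits_in_points = fits(points, capacity=30)  # 28 <= 30
fits_in_days = fits(days, capacity=10)      # 10 <= 10

print(fits_in_points, fits_in_days)
```

Either way the check is the same sum-and-compare; only the label on the numbers differs.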
>And if the last few sprints they've completed between 5 and 30 points, do you believe they'll complete around 17.5 points next sprint?
I don't know, what did the team figure out in retro? Was the big difference a real underestimation, or was there some kind of unforeseen blocker? I've never seen that big a variation, but anything's possible.
If it makes you feel better to measure team velocity in something you call "days" instead of story points and it works for your team, more power to you. But don't fool yourself that you're talking about actual days. At best you're talking about "probable days", and how many days it actually takes will depend on a lot of things, including unknowns and who takes the story (are "Bob-days" the same as "Carol-days"?). So you'll end up with a measure of days that is very team- and uncertainty-dependent, and at that point it's better to just use story points and admit that it's not a universal measure and doesn't need to be. Not to mention that by using days you'll invite confusion between calendar time and time-on-task.