
> You've just invited the assumption that your compiler can do SROA as the basis for better performance - how is that an abstraction?

"Expands to the exact same code everywhere on every imaginable compiler, even toy compilers nobody uses" is not part of the definition of "abstraction".

> If you really care about performance to the point of register occupancy, you need to look at the context. The "abstraction" of having a Point type with methods is almost certainly far from optimal, because it doesn't fit SIMD well.

What do you think http://www.agner.org/optimize/#vectorclass is then?

> It's then also not "good engineering practice" to use it.

Yes, it is! It makes your code more readable, and if you're compiling on any production-quality C compiler anywhere your Point class will have the same performance as the raw version. Lower maintenance cost, fewer bugs, same performance.
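
A minimal sketch of what that claim amounts to, assuming a hypothetical two-field `Point` (the type and function names are illustrative and do not come from the thread). Both functions compute the same value; the claim is that an optimizer performing inlining and scalar replacement can reduce the first to the second. Whether a given compiler actually does so has to be checked in its output.

```cpp
#include <cassert>

// Hypothetical Point type; all names here are illustrative assumptions.
struct Point {
    float x, y;
    Point operator+(const Point& o) const { return {x + o.x, y + o.y}; }
    float dot(const Point& o) const { return x * o.x + y * o.y; }
};

// "Abstract" version: builds temporary Point aggregates, then reduces them.
inline float sum_length_sq_abstract(float ax, float ay, float bx, float by) {
    Point s = Point{ax, ay} + Point{bx, by};
    return s.dot(s);
}

// "Raw" version: the scalars an optimizer could keep in registers
// once the aggregate has been split into its fields.
inline float sum_length_sq_raw(float ax, float ay, float bx, float by) {
    float sx = ax + bx;
    float sy = ay + by;
    return sx * sx + sy * sy;
}

// Sanity check that the two versions agree on one input.
inline void check_versions_agree() {
    assert(sum_length_sq_abstract(1.0f, 2.0f, 3.0f, 4.0f) ==
           sum_length_sq_raw(1.0f, 2.0f, 3.0f, 4.0f));
}
```

The check only shows that the results match, not that the generated code does; comparing the assembly of the two functions is the way to test the performance claim.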



> "Expands to the exact same code everywhere on every imaginable compiler, even toy compilers nobody uses" is not part of the definition of "abstraction".

MSVC compiler doesn't do SROA, as far as I know.

> What do you think http://www.agner.org/optimize/#vectorclass is then?

Have you actually looked at that thing? It's not a Point struct, I can tell you that. There's nothing abstract about it.

If you want to take advantage of SIMD fully, you need to lay out your data in a very specific way. A Point {x,y,z} struct doesn't naturally fit a SIMD register.
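
To make the layout point concrete, here is a rough sketch contrasting an array of `{x,y,z}` structs with a structure-of-arrays layout. The names `PointAoS`, `PointsSoA`, `to_soa`, and `length_sq` are hypothetical. In the second layout each component is contiguous, which is the shape SIMD loads want; whether the loop is actually vectorized depends on the compiler and flags.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: x, y, z of one point are adjacent in memory.
struct PointAoS {
    float x, y, z;
};

// Structure-of-arrays: all x values are contiguous, then all y, then all z.
struct PointsSoA {
    std::vector<float> x, y, z;
};

// Convert from the struct-per-point layout to the per-component layout.
inline PointsSoA to_soa(const std::vector<PointAoS>& pts) {
    PointsSoA out;
    out.x.reserve(pts.size());
    out.y.reserve(pts.size());
    out.z.reserve(pts.size());
    for (const PointAoS& p : pts) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
    }
    return out;
}

// Squared length of every point. Each iteration reads the same index from
// three contiguous arrays, so several points can be processed per SIMD op
// if the compiler chooses to vectorize the loop.
inline std::vector<float> length_sq(const PointsSoA& p) {
    std::vector<float> out(p.x.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = p.x[i] * p.x[i] + p.y[i] * p.y[i] + p.z[i] * p.z[i];
    }
    return out;
}
```

Note that the per-point `Point` type disappears from the hot loop in this layout, which is the sense in which the struct does not map onto a register by itself.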

Now, if you're willing to make a lot of assumptions on your compiler, you can do something like this: http://www.codersnotes.com/notes/maths-lib-2016/

Still, you need to put in the work and the research. No magic.

> Yes, it is! It makes your code more readable, and if you're compiling on any production-quality C compiler anywhere your Point class will have the same performance as the raw version. Lower maintenance cost, fewer bugs, same performance.

If performance really matters then your abstract solution is almost certainly suboptimal and it's not good engineering practice to use it for the sake of readability.


> MSVC compiler doesn't do SROA, as far as I know.

Yes, it has since at least 2010 (and probably earlier). See "scalar replacement": https://blogs.msdn.microsoft.com/vcblog/2009/11/02/visual-c-...

> Have you actually looked at that thing? It's not a Point struct, I can tell you that. There's nothing abstract about it.

The Vec classes can be used as Point structs.

> If you want to take advantage of SIMD fully, you need to lay out your data in a very specific way. A Point {x,y,z} struct doesn't naturally fit a SIMD register.

So pad it out to 4 fields, using homogeneous coordinates.
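
Roughly what that padding could look like, as a sketch: a hypothetical `Vec4` with a fourth `w` field, aligned to 16 bytes so that it matches the size of a 128-bit register. The names and the `w = 1` for points, `w = 0` for directions convention are the usual homogeneous-coordinate assumptions, not something taken from a specific library.

```cpp
// Hypothetical 4-float vector, 16 bytes wide and 16-byte aligned.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

static_assert(sizeof(Vec4) == 16, "Vec4 should fill one 128-bit register");
static_assert(alignof(Vec4) == 16, "Vec4 should be 16-byte aligned");

// Homogeneous convention: points carry w = 1, directions carry w = 0.
inline Vec4 make_point(float x, float y, float z) {
    return Vec4{x, y, z, 1.0f};
}

inline Vec4 make_direction(float x, float y, float z) {
    return Vec4{x, y, z, 0.0f};
}

// Component-wise subtraction. Subtracting two points gives w = 0,
// which is a direction under the convention above.
inline Vec4 sub(const Vec4& a, const Vec4& b) {
    return Vec4{a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w};
}
```

The fourth lane is carried through every operation whether or not it holds useful data, so this trades some lane utilization for a layout that fits the register.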

> Now, if you're willing to make a lot of assumptions on your compiler, you can do something like this: http://www.codersnotes.com/notes/maths-lib-2016/

That's not making a lot of assumptions about your compiler. The x87 floating point stack, for example, has been obsolete for a long time.

> If performance really matters then your abstract solution is almost certainly suboptimal and it's not good engineering practice to use it for the sake of readability.

I disagree. Let's look at actual examples. "Almost certainly" suboptimal abstractions are not what we've seen in Rust, for example, which leans on abstractions heavily.


> Yes, it has since at least 2010 (and probably earlier).

See my reply to whitequark_; this seems to be loop-specific.

> The Vec classes can be used as Point structs.

Oh, sure. Which one though? How do I abstract this, again?

> So pad it out to 4 fields, using homogeneous coordinates.

In other words, "do something other than what I originally did" and "potentially leave 25% throughput on the table".

I think you're trying to pull a fast one on me.

> That's not making a lot of assumptions about your compiler. The x87 floating point stack, for example, has been obsolete for a long time.

Did you even read beyond the first paragraphs? There are five compiler-specific flags you'll have to get right for any of this to work.

Again, there's no free lunch...

> I disagree. Let's look at actual examples. "Almost certainly" suboptimal abstractions are not what we've seen in Rust, for example, which leans on abstractions heavily.

Could you just scroll up for a moment? We are talking about this because someone wrote the "abstract" version of squaring a number and some optimization didn't kick in, resulting in significantly degraded performance. And that's a trivial example! In a real codebase, you'll have to carefully audit compiler output to see if it does the right thing, potentially requiring you to rewrite code. For Rust, you also have the blessing that there is only one compiler...


> MSVC compiler doesn't do SROA, as far as I know.

In the future, consider verifying your extraordinary claims. Of course it does, and it took me about two minutes to demonstrate that: https://godbolt.org/g/9kT1NP


"As far as I know" implies that I might not know, so that's not an extraordinary claim.

You haven't actually verified SROA. The scalar replacement didn't kick in for the non-inlined method, so this could be a result of loop-specific optimizations. The code is also far from optimal, but that's beside your point of course.

Also note that you're testing an RC of the latest version. There's a feature request for SROA from 2013, so it may be implemented by now: https://connect.microsoft.com/VisualStudio/feedback/details/...



