It's kind of bizarre how it's an accomplishment to get your code closer to what the hardware is capable of. In a sane world, that should be the default environment you're working in. Anything else is wasteful.
There's always been a tradeoff in writing code between developer experience and taking full advantage of what the hardware is capable of. That "waste" in execution efficiency is often worth it for the sake of representing helpful abstractions and generally helping developer productivity.
The real win here is when we can have both because of smart toolchains that can transform those high-level constructs and representations into the most efficient implementation for the hardware.
Posts like this demonstrate what's possible with the right optimizations, so that tools like compilers and assemblers can apply them when given the high-level code. That way we can get what you're hoping for: optimal implementations being the default.
> That "waste" in execution efficiency is often worth it for the sake of representing helpful abstractions and generally helping developer productivity.
That's arguable at best. I for one am sick of 'developer productivity' being the excuse for why my goddamned supercomputer crawls when performing tasks that were trivial even on hardware 15 years older.
> The real win here is when we can have both because of smart toolchains that can transform those high-level constructs and representations into the most efficient implementation for the hardware.
That's been the promise for a long time and it still hasn't been realized. If anything, things seem to be getting less and less optimal.
No, it's really not even arguable. Lots and lots of software is written in business contexts where the cost of developing reliable code is a lot more important than its performance. Not everything is a commercial product aimed at a wide audience.
What you're "sick of" is mostly irrelevant unless you represent a market that is willing to pay more for a more efficient product. I use commercial apps every day that clearly could work a lot better than they do. But... would I pay a lot for that? No. They are too small a factor in my workday.
People have been sick of slow programs and slow computers since literally forever. I think you live in a bubble or are complacent.
No one I know has anything good to say about Microsoft Teams, for instance. And that's just one of the recent "desktop apps" which are actually framed browsers.
> I for one am sick of 'developer productivity' being the excuse for why my goddamned supercomputer crawls when performing tasks that were trivial even on hardware 15 years older.
The problem here is developer salaries. So long as developers are as expensive as they are the incentive will be to optimise for developer productivity over runtime efficiency.
We were making developer-experience optimizations _long_ before developers started demanding high salaries. The whole reason to go from assembly to C was to improve developer experience and efficiency.
It seems fairly reductive to dismiss the legitimate advantages of increased productivity. It's faster to iterate on ideas and products, we gain back time to focus on more complex concepts, and, more broadly, we further open up this field to more and more people. And those folks can then go on to invest in these kinds of performance improvements.
> It seems fairly reductive to dismiss the legitimate advantages of increased productivity.
Certainly there are some, but I think we passed the point of diminishing returns long, long ago and we're now well into the territory of regression. I would argue that we are actually seeing negative returns from a lot of the abstractions we employ, because we've built up giant abstraction stacks where each new abstraction has new leaks to deal with, and everything is much more complicated than it needs to be because of it.
Hmm... I think our standards for application functionality are also a lot higher. For example, how many applications from the 90s dealt flawlessly with Unicode text?
How much added slowness do you think Unicode is responsible for? Because as much of a complex nightmarish standard as it is[0], there are plenty of applications that are fast that handle it just fine as far as I can tell. They're built with native widgets and written in (probably) C.
[0] plenty of slow as fuck modern software doesn't handle it even close to 'flawlessly'
If developers cost one fifth of what they do now, how many projects that let performance languish today would staff up to the point where a perf pass made its way to the top of the backlog queue?
Come on now. Let's be honest here. The answer for >90% of projects is either a faster pace on new features, or to pocket the payroll savings. They'd never prioritize something that they've already determined can be ignored.
Hmm, I don’t know about that. When I made a quarter of what I do now, I worked at companies that would happily let me spend two weeks on a task without expecting much in that time.
As far as I can tell, a slow computer is due to swapping from having a bunch of stuff open, or an inherently heavy background task and an OS that doesn't know how to allocate resources.
Sometimes there's some kind of network access bogging everything down, now that SaaS is everywhere (in which case, we need more abstractions and developer productivity to enable offline work!).
Sometimes things are slow because we are doing a lot of heavy media processing. That's not inefficiency, it's just inherently heavy work. In fact, simplicity might actively slow things even more, because what you really need for that stuff is to use the GPU, and the "bloated mega library" you're avoiding might already do that.
Android is kind of the proof. Android is always pretty fast. Anything 15yo hardware could do trivially, Android can probably do faster. And Android is not lightweight.
There may be some slowness out there caused by abstraction layers, but as far as I can tell, without digging into the code of every slow app I've ever seen, the real problem is the keep-it-simple mindset that discourages even basic optimization tricks like caching, and the NIH mindset that makes people write things themselves and assume they're faster just because they're smaller.
Plus, the UNIXy pipeline mentality did a number on computing. GUI workflows are not pipelines, and there are lots of things that are very messy to do in a pipe-style model, like changing one thing and selectively recomputing only what needs changing.
The "Data processing" way of thinking leads to producing a batch of stuff, then passing it to something that computes with it, instead of working directly with a state.
> There's always been a tradeoff in writing code between developer experience and taking full advantage of what the hardware is capable of. That "waste" in execution efficiency is often worth it for the sake of representing helpful abstractions and generally helping developer productivity.
The GFLOP/s is 1/28th of what you'd get when using the native Accelerate framework on M1 Macs [1]. I am all in for powerful abstractions, but not using native APIs for this (even if it's just the browser calling Accelerate in some way) is just a huge waste of everyone's CPU cycles and electricity.
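In case anyone wants to check a number like that themselves: here's a rough sketch of how you might measure matmul throughput from Python on a machine where NumPy is linked against a fast native BLAS (Accelerate on macOS, for instance). The exact figure depends entirely on the build and the hardware, so treat it as a measurement recipe, not a benchmark claim:

    import time
    import numpy as np

    n = 2048
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)

    a @ b  # warm-up so the BLAS backend is initialized before timing

    start = time.perf_counter()
    c = a @ b
    elapsed = time.perf_counter() - start

    # A dense n x n matrix multiply is roughly 2 * n^3 floating-point ops.
    print(f"{2 * n**3 / elapsed / 1e9:.1f} GFLOP/s")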
Wasteful of computing resources, yes, but for a long time we've been prioritizing developer time. That happens because you can get faster hardware more cheaply than you can get more developer time (and not all developers' time is equal: Carmack can do in a few hours things I couldn't do in months).
I do agree that we'd get fantastic performance out of our systems if we had the important layers optimized like this (or more), but it seems few (if any) have been pushing in that direction.
But we've got far more developers now, and by and large, we aren't producing more capable software than we did 20 years ago. It's gotten very complex, but that is by and large just a failure to keep things simple.
Spotify is slower to launch on my Ryzen 3900X than my operating system, and lacks many of the basic features that WinAmp had in 1998. Now you're thinking "Aha! But WinAmp just played mp3s, it didn't stream over the internet!" Yes, it could. It was also largely developed by one guy, over the course of a couple of months.
I don't know where this superior developer productivity is going, but it sure doesn't seem to be producing particularly spectacular results.
Back when I was younger we had to develop a few simulations in university, and we spent half the semester coding the building blocks in C. I was slightly good at it, and having seen my brother stumble a couple of years before, I knew I had to be careful with the data structures and general code layout to keep things simple and working.
As this was a group project, there were other hands on deck. One weekend I caught a nasty cold and couldn't attend the weekly meeting to work on the project. Monday comes and we have to show our progress. The code was butchered. It took me a day to fix what had been broken (and, egos aside, it would've been easier to just throw everything away and rebuild from my last stable version).
Now I can fire up python, import numpy and scipy, and make far more complex simulations in a couple of minutes and a few lines of code. Sure, back then python and numpy did exist, I just didn't know about them. But you know what didn't exist 10-15 years ago? PyTorch, TensorFlow, JAX and all the deep learning ecosystem (scikit-learn probably did exist, it's been around forever!). Thanks to those, I can program/train deep learning algorithms for which I probably wouldn't be able to code the lower-level abstractions to make them work. JAX comes with a numpy that runs on hardware "for free" (and there was PyCUDA before that if you had a compatible GPU).
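To make the "numpy that runs on hardware for free" bit concrete, here's a toy sketch (the "simulation" itself is made up): jax.numpy mirrors the NumPy API, and jax.jit hands the whole function to XLA, which compiles it for whatever backend is available (CPU, GPU or TPU):

    import jax
    import jax.numpy as jnp

    @jax.jit
    def step(state, dt=0.01):
        # Familiar NumPy-style code; XLA compiles it for the available backend.
        return state + dt * jnp.sin(state)

    state = jnp.linspace(0.0, 1.0, 1_000_000)
    for _ in range(100):
        state = step(state)
    print(float(state.mean()))

The same few lines run unchanged on a laptop CPU or a GPU, which is exactly the kind of lower layer I'd never have written myself.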
That's the programmer productivity you're not seeing. Sure, these are very specific examples, but we have many more building blocks available to make interesting things without having to worry about the lower layers.
You can also complain about Electron being a bloated layer, and that's OK. There's your comparison of how Spotify is slow and Winamp is/was fast.
That's kinda like saying Bob built a go-kart in his garage over a couple months, it moves you from A to B, I don't see why a Toyota Corolla needs a massive factory. Spotify's product isn't just a streaming media player. It's also all the infrastructure to produce the streams, at scale, for millions of users.
How often are you actually launching Spotify? I start it once when I boot and that's it until my next reboot, weeks/months later.
Now you might of course ask, "why isn't the velocity 6554x that of winamp, even when correcting for non-eng staff, management, overhead, etc". Well, they probably simply aren't allocating that much to the client development.
Also, oftentimes one dev who knows exactly what he is doing can be way more effective than a team; no bikeshedding, communication overhead, PRs, etc. What happens when they get hit by a bus?
But you can't get faster hardware cheaper anymore; not naively faster hardware, anyway. You are getting more and more optimization opportunities nowadays, though: vectorize your code, offload some work to the GPU or one of the countless other accelerators present on a modern SoC, change your I/O stack so you can utilize SSDs efficiently, etc. I think it's a matter of time until someone puts an FPGA onto a mainstream SoC, and the gap between efficient and mainstream software will only widen from that point.
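To put a number on the vectorization point, here's a small sketch using NumPy as a stand-in for SIMD-friendly native code (the exact ratio will vary a lot by machine):

    import time
    import numpy as np

    xs = np.random.rand(2_000_000)

    # Naive: one Python-level multiply-add per element.
    start = time.perf_counter()
    total = 0.0
    for x in xs:
        total += x * x
    naive = time.perf_counter() - start

    # Vectorized: the same reduction in one call, which runs over wide
    # SIMD registers in optimized native code.
    start = time.perf_counter()
    total_vec = float(np.dot(xs, xs))
    vectorized = time.perf_counter() - start

    print(f"naive: {naive:.3f}s  vectorized: {vectorized:.5f}s")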
You are precisely telling me the ways in which I can get faster hardware: GPU, accelerators, the I/O stack and SSDs, etc.
I agree that the software layer has become slow, crufty, bloated, etc. But it's still cheaper to get faster hardware (or wait a bit for it; see M1, Alder Lake, Zen 3, to name a few, and those are getting successors later this year) than to get a good programmer to optimize your code.
And I know that we'd get much better performance out of current (and probably future) hardware if we had more optimized software, but it's rare to see companies and projects take on such optimization efforts.
But you can't get all these things in the browser. You don't just increase CPU frequency and get free performance anymore. You need conscious effort to use GPU computing, conscious effort to ditch the current I/O stack for io_uring. Modern hardware gives performance to those who are willing to fight for it. The disparity between the naive approach and the optimized approach grows every year.
I'm not arguing against your point at all. I'm just pointing at the fact that there are many who are happy to wait a year, buy a 10-20% faster CPU and call it a day, or buy more/larger instances on the cloud, or do anything but optimize the software they use. Some couldn't even if they wanted to, because they buy software rather than develop in-house, and aren't effective at demanding better performance from their vendors.
Everything has a cost. If the developer is a slave to the machine architecture, development is slow and error-prone. If the machine is a slave to an abstraction, everything will run slowly. Unsurprisingly, the real trick is finding the appropriate balance for your situation.
Of course you can make things worse, in both directions.
The real issue here is that the hardware isn't capable of sandboxing without introducing tons of side channel attacks. Lots of applications are willing to sacrifice a lot of performance in order to gain the distribution advantages from a safe, sandboxed execution environment.
This is the real issue. The other comments here are talking about dev-productivity, but they have the causal chain backwards. The browser overhead is about running in a sandbox, on arbitrary/unknown hardware. Web development had poorer DX than desktop app development for a long time, until the critical mass of web-devs (driven by the distribution advantages) finally drove the DX investment to heavier on the web side.
On the other hand, in your sane world, productivity would be a fraction of what it currently is, for developers and users. You favor computer time over developer time. While computer time can be a proxy for user time, it isn't always one, since developer time can also be spent speeding up user time. A single-minded focus on computer time sounds like a case of throwing out metrics like developer time and user time because they are harder to measure than computer time. In any case, it sounds like a mistake to me.
In a sane world (which is the world that we live in), it's best to find a well-optimized library for common operations like matrix multiplication. But if you want to do something unusual (multiply large matrices inside a browser, quickly) you've exited the sane world so you'll have to work at it.