Assuming the websites are using images of appropriate dimensions (that is, not serving huge images and relying on browser downscaling, which is bad practice in any case), the math is easy enough. A 1080p screen is about 2 megapixels, a 4K screen is about 8 megapixels. If your images decode at 50 Mpx/s, that's about 25 full screens per second at 1080p (or about 6 at 4K). You'd need to scroll very quickly, on a very fast connection, before decode speed becomes a major issue, whether for UX or for battery life. Much more likely, the main bottleneck will be the transfer time of the images.
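If anyone wants to plug in their own numbers, the back-of-the-envelope version is just decode rate divided by screen size (a sketch only; the 50 Mpx/s is the same example figure as above, not a measurement):

    // Rough screens-per-second math for a given decode throughput.
    const MPX_1080P = (1920 * 1080) / 1e6; // ~2.1 Mpx
    const MPX_4K = (3840 * 2160) / 1e6;    // ~8.3 Mpx

    function screensPerSecond(decodeMpxPerSec: number, screenMpx: number): number {
      return decodeMpxPerSec / screenMpx;
    }

    console.log(screensPerSecond(50, MPX_1080P).toFixed(1)); // ~24.1 full screens/s at 1080p
    console.log(screensPerSecond(50, MPX_4K).toFixed(1));    // ~6.0 full screens/s at 4K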
Aside from all the points you raise, I find the discussion about battery life a little absurd in light of how negligible it is compared to the impact of poorly written JavaScript in web apps. For example, I noticed this morning that my bank somehow pins one CPU thread at 100% whenever I have their internet banking site open, even when nothing is being done. AFAIK there is no cryptocurrency nonsense going on, and the UI latency is pretty good too, so my best guess is that their "log out automatically after ten minutes of inactivity" security feature is implemented through constant polling.
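(Pure speculation on my part; I haven't seen their code, and the logOut stub, the mousemove listener and the ten-minute constant below are placeholders. But the difference between polling for idleness and simply arming a timer looks roughly like this:)

    // Placeholder for whatever the site actually does to end the session.
    function logOut(): void {
      console.log("session expired");
    }

    const IDLE_LIMIT_MS = 10 * 60 * 1000; // the "ten minutes" from their UI
    let lastActivity = Date.now();
    document.addEventListener("mousemove", () => { lastActivity = Date.now(); });

    // Polling approach: wake up dozens of times a second just to compare timestamps,
    // so the tab never goes fully idle even when the user does nothing.
    setInterval(() => {
      if (Date.now() - lastActivity > IDLE_LIMIT_MS) logOut();
    }, 16);

    // Event-driven approach (what you'd use instead): arm one timer and reset it on
    // activity; the CPU does nothing between events.
    let idleTimer: number | undefined;
    function resetIdleTimer(): void {
      clearTimeout(idleTimer);
      idleTimer = window.setTimeout(logOut, IDLE_LIMIT_MS);
    }
    document.addEventListener("mousemove", resetIdleTimer);
    resetIdleTimer();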
And the one you're replying to is also talking about battery life. The energy needed to display an image for a few seconds is probably higher than the energy needed to decode it.
OTOH video decoding is highly likely to be hardware accelerated on both laptops and smartphones.
> For still images, the main thing that drains your battery is the display, not the image decoding :)
I wonder if it becomes noticeable on image-heavy sites like tumblr, 500px, etc.