Pay attention to just how good WebP is at _lossless_ compression though!
I've always thought that one was flying under the radar. Most get stuck on WebP not offering tangible enough benefits (or even doing worse) over MozJPEG encoding, but WebP _lossless_ is absolutely fantastic for performance/speed! PNG or even OptiPNG is far worse. And it's very well supported online now, leaving the horrible lossless AVIF in the dust too, of course.
Lossless WebP is very good indeed. The main problem is that it is not very future-proof since it only supports 8-bit. For SDR images that's fine, but for HDR this is a fundamental limitation that is about as bad as GIF's limitation to 256 colors.
Neither this limit nor the 4 GB limit existed in my first version. These were artificially introduced to "match" the properties of lossy WebP. We could have done better there.
Were you involved in creating WebP? If so that's super cool! Why would they want to match webp's lossy compression though? To make it more uniform? And do you know why lossy WebP had such a limitation in the first place? Thank you!
I designed the WebP lossless format, wrote the spec, and implemented the first encoder.
The constraint was in lossy WebP to facilitate exact compatibility with the VP8 specification, in the hope that it would allow hardware decoding and encoding of WebP images using VP8 hardware.
Hardware encoding and decoding were never used, but the limitation stuck.
There was no serious plan to do hardware lossless, but the constraint was copied for "reducing confusion".
I didn't and don't like it much, since as a result more PNG images couldn't be represented as lossless WebP.
Wow, that really sucks. I appreciate the explanation as well as your frustration with it.
My desktop had a mixture of PNG and WebP files solely because of this limitation. I use the past tense because they've now all been converted to JPEG XL lossless.
Another area of incompatibility with PNG was 16-bit coding. I had a plan to add it by simply sending two 8-bit images, where the second image, containing the 8 least significant bits, would be predicted to be the same as the first. That way it would not be perfect, but it would be 100x better than how PNG deals with it. WebP had another plan for layers and tiles that never materialized, and as a consequence WebP is stuck at 8 bits.
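To illustrate the idea, a rough numpy sketch (my own illustration of the scheme as described, not anything from an actual spec): split each 16-bit channel into its high and low bytes, and code the low byte only as a residual against the high byte.

import numpy as np

def split_16bit(channel16):
    msb = (channel16 >> 8).astype(np.uint8)        # first 8-bit image
    lsb = (channel16 & 0xFF).astype(np.uint8)      # second 8-bit image
    # Predict the low byte to equal the high byte and code only the residual.
    # For 8-bit content stored as 16-bit by bit replication (v = 257 * k),
    # the residual is exactly zero, so it costs almost nothing.
    residual = (lsb.astype(np.int16) - msb.astype(np.int16)) % 256
    return msb, residual.astype(np.uint8)

def join_16bit(msb, residual):
    lsb = (msb.astype(np.int16) + residual.astype(np.int16)) % 256
    return (msb.astype(np.uint16) << 8) | lsb.astype(np.uint16)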
Ah, I didn't know this, and I agree this is a fairly big issue, and increasingly so over time. I think smartphones in particular hastened the demand for HDR quite a bit; it was once a premium/enthusiast feature you had to explicitly buy into.
I haven't run across websites that serve up HDR images, and I am not sure I would notice the difference. WebP seems appropriately named and optimized for image delivery on the web.
Maybe you are thinking of high bit depth for archival use? I can see some use cases there where 8-bit is not sufficient, though personally I store high bit depth images in whatever raw format was produced by my camera (which is usually some variant of TIFF).
8-bit can have banding even without "HDR". Definitely not enough. 10 bit HDR video is becoming more common, and popularity for images will follow. Adoption is hampered by the fact that Windows has bad HDR support, but it all works plenty well on macOS and mobile platforms.
It would be nice if software produced non-tone-mapped HDR all the time, and the windowing system or the monitor did the (local) tone mapping.
No, you don't want that. You want your windowing system & UI/graphics layers working in display native range (see eg, Apple's EDR). The consistency of appearance of the SDR range is too important to leave it up to unknown tone mapping.
Also it's too power expensive to have your display in HDR if you're only showing SDR UIs, which not only is the common reality but will continue to be so for the foreseeable future.
Requiring every game, photo viewing app, drawing program, ... every application to decide how to do its HDR->SDR mapping seems an unnecessary complication due to poor abstractions.
The locality of local tone mapping (the ideal approach to HDR->SDR mapping) would expose the window boundaries. Two photos, or two halves of the same photo, in different windows (as opposed to being in the same window) would show an artificial discontinuity, with the correction fields artificially contained within each window instead of spanning the user's visual field as well as possible.
Every local tone mapping needs to make an assumption about the surrounding colors: whether the window is surrounded by black, gray, colored or bright light should influence how the tone mapping is done at the borders. This information is not available to an app: it can only be done at the windowing-system level or in the monitor.
The higher the quality of the HDR->SDR mapping in a system, the more opportunity there is to limit the maximum brightness, and thus also the opportunity for energy savings.
> Requiring every game [..] to decide how to do its HDR->SDR seems unnecessary complication due to poor abstractions.
Games already do this regardless and it's part of their post processing pipelines that also govern aspects of their environmental look.
What's missing is HDR->Display, which is why HDR games have you go through a clunky calibration process to attempt to reverse out what the display is going to do to the PQ signal, when what the game actually wants is what macOS/iOS and now Android just give them: the exact amount of HDR headroom, which the display doesn't manipulate.
As for your other examples, being able to target the display native range doesn't mean you have to. Every decent os has a compositor API that lets you offload this to the system.
Yes, if local tone mapping is done by the operating system (windowing system or the monitor itself), then there is a chance that borders of windows are done appropriately within their spatial context.
For things where you actually care about lossless you probably also don't care about HDR.
HDR is (or can be) good for video & photography, but it's absolutely ass for UI.
Besides, you can just throw a gainmap approach at it if you really care. Works great with jpeg, and gainmaps are being added to heif & avif as well, no reason jpegxl couldn't get the same treatment. The lack of "true 10-bit" is significantly less impactful at that point
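For readers who haven't run into gainmaps: the reconstruction side is essentially an SDR base image multiplied by a per-pixel gain stored as a small 8-bit map of log2 gain values. A simplified sketch of that step (my own illustration; real gainmap formats add offsets, a gamma on the map, and adaptation to the display's available headroom):

import numpy as np

def apply_gainmap(sdr_linear, gainmap, log2_gain_min, log2_gain_max):
    # sdr_linear: float array in [0, 1] (linear light); gainmap: uint8, same size.
    g = gainmap.astype(np.float32) / 255.0
    log2_gain = log2_gain_min + g * (log2_gain_max - log2_gain_min)
    return sdr_linear * np.exp2(log2_gain)   # HDR linear output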
Gainmaps don't solve 8-bit Mach banding. If anything you get more banding: two bandings, one from each of the two 8-bit fields, multiplied together.
Gainmaps "solve" the problem of computing a local tone mapping by declaring that it needs to be done at server side or at image creation time rather than at viewing time.
My prediction: Gainmaps are going to be too complex of a solution for us as a community and we are going to find something else that is easier. Perhaps we could end up standardizing a small set of local tone mapping algorithms applied at viewing time.
> Gainmaps "solve" the problem of computing a local tone mapping by declaring that it needs to be done at server side or at image creation time rather than at viewing time.
Which was already the case. A huge amount of phone camera quality comes from advanced processing, not sensor improvements. Trying to get that same level of investment in all downstream clients is both unrealistic and significantly harder. A big part of why Dolby Vision looks better is just that Dolby forces everyone to use their tone mapping, and client consistency is critical.
Gainmaps also avoid the proprietary metadata disaster that plagues HLG/PQ video content.
> If anything you get more banding: two bandings, one banding from each of the two 8-bit fields multiplied together.
The math works out such that you get the equivalent of something like 9 bits of depth, but you're also not wasting bits on colors and luminance ranges you aren't using, like you are with BT.2020 HLG or PQ.
I didn't try it out, but I don't see the 9 bit coming. I feel it gives about 7.5 bits.
Mixing two independent quantization sources will lead to more error.
Some decoding systems, such as traditional JPEG, do not specify results exactly, so bit-perfect quantization-aware compensation is not going to be possible.
The context of this thread is lossless webp, there aren't any compression artifacts to deal with.
Go try the math out; the theoretical maximum is higher than 9 bits, but an 8.5-9 bit equivalent is very achievable in practice. With two lossless 8-bit sources this is absolutely adequate for current displays, especially since, again, you're not wasting bits on things like PQ's absurd range.
Will this be added to webp? Probably not, it seems like a dead format regardless. But 8 bit isn't the death knell for HDR support as is otherwise believed.
You just cannot reach the best quality with 8 bits, not in SDR, not in HDR, not with gainmap HDR. Sometimes you don't care for a particular use case, and then 8 bits becomes acceptable. Many use cases remain where degradation by a compression system is unacceptable or creates too many complications.
Find me a single example of a UI in HDR for the UI components, not for photos/videos.
> Even SDR needs 10-bit in a lot of situations to not have banding.
You've probably been looking at 10-bit HDR content on an 8-bit display panel anyway (true 10-bit displays don't exist in mobile yet, for example). 8 bits works fine with a bit of dithering.
Yes, but the place where you want dithering is in the display, not in the image. Dithering doesn't work with compression because it's exactly the type of high frequency detail that compression removes. It's much better to have a 10 bit image which makes the information low frequency (and lets the compression do a better job since a 10 bit image will naturally be more continuous than an 8 bit image since there is less rounding error in the pixels), and let the display do the dithering at the end.
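A minimal sketch of that last step, assuming a 10-bit image being shown on an 8-bit panel (illustrative random dither only; real displays typically use ordered or temporal dithering):

import numpy as np

def dither_10bit_to_8bit(img10, rng=np.random.default_rng(0)):
    # img10: uint16 array holding 10-bit values (0..1023).
    # Add roughly half an output LSB of noise before quantizing so that
    # banding turns into fine-grained noise the eye averages away.
    noise = rng.uniform(-0.5, 0.5, img10.shape)
    return np.clip(np.round(img10 / 4.0 + noise), 0, 255).astype(np.uint8)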
Gainmaps only take a lot of space if implemented poorly in the software creating the gainmap. On Pixel phones they take up 2% of the image size, by being stored in quarter resolution and scaling back up during decoding, which works fine on all gainmap-supported devices.
They're more prone to banding, but it's definitely not a problem with all images.
I guess I should say, take up lots of space unless you're OK with lower quality. In any case, gainmaps are entirely unrelated to the 8 bit vs 10 bit question. The more range you have (gamut or brightness) the worse 8 bit is, regardless of whether you're using a gainmap or not. And you can use gainmaps with 10 bit images.
I would, at first blush, disagree with that characterization. Dialogue means more fine-grained strokes and more individual, independent “zones” to encode.
IIRC the way you encode grayscale in WebP is a SUBTRACT_GREEN transform that makes the red and blue channel 0 everywhere, and then use a 1-element prefix code for R and B, so the R and B for each pixel take zero bits. Same idea with A for opaque images. Do you know why that's not good enough?
If I had just added 128 to the residuals, all remaining prediction arithmetic would have worked better and it would have given 1 % more density.
This is because most of the related arithmetic for predicting pixels is done in unsigned 8-bit arithmetic. Subtract-green makes such predictions often cross the 0 -> 255 boundary, and then averaging, deltas etc. make little sense and add to the entropy.
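A toy example of the wrap-around (illustrative only, not libwebp code): for near-gray pixels R ≈ G, so the subtract-green residuals straddle the 0/255 boundary and an averaging predictor lands far from all of them, while a +128 bias keeps them in one cluster.

import numpy as np

r_minus_g = np.array([-2, -1, 0, 1, 2])        # residuals for near-gray pixels
unbiased = r_minus_g % 256                     # [254, 255, 0, 1, 2]
biased = (r_minus_g + 128) % 256               # [126, 127, 128, 129, 130]

# Averaging two neighbours, as spatial predictors do in unsigned 8-bit arithmetic:
print((int(unbiased[0]) + int(unbiased[3])) // 2)   # 127, far from both 254 and 1
print((int(biased[0]) + int(biased[3])) // 2)       # 127, right inside the cluster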
WebP also has a near-lossless encoding mode based on the lossless WebP specification. It is mostly unadvertised, but should be preferred over real lossless in almost every use case. Often you can halve the size without additional visible loss.
Is this mode picked automatically in "mixed" mode?
Unfortunately, that option doesn't seem to be available in gif2webp (I mostly use WebP for GIF images, as animated AVIF support is poor in browsers and that has an impact on interoperability).
Not in gif2webp, no. It is available in img2webp as a global (not per-frame) option. It looks like when you pass it, it will be used for all lossless encoding, including when "-mixed" tries the lossless mode.
WebP near-lossless is very far behind that kind of visually lossless in compression density, still 2-3x more bits I think, so there is no reason to compare them. The near-lossless (at settings 60 and 80) is closer to pixel perfect no matter how much you zoom, whereas Jon's "visually lossless" is what I'd rather call the usual very high quality lossy, without pixel-precision guarantees.
If we replace those heuristics with a search that tries out which of the values is closest to the original, we should get better quality, especially at the lowest bitrates where smoothing is important.
It is unlikely that there will be any bitstream changes in JPEG XL. There is still a lot of potential for encoder improvements within the current bitstream, both for lossy and for lossless.
Last time I tested cwebp it did not handle (PNG) color spaces correctly so the result of a supposedly lossless conversion actually looked different from the original. What good is better lossless compression if it is not actually lossless visually?
It is ironic that you say this, because when I disabled WebP in my browser after it had a huge security vulnerability, Discord was the only site which broke instead of immediately just serving me more reasonable image formats.
At the very low quality settings, it's kinda remarkable how JPEG manages to keep a sharper approximation of detail that preserves the holistic quality of the image better, in spite of the obvious artifacts making it look like a mess of cubism when examined closely. It's basically converting the image into some kind of abstract art style.
It is because JPEG is given 0.5 bits per pixel, where JPEG XL and AVIF are given around 0.22 and 0.2.
These images attempt to be at equal level of distortion, not at equal compression.
Bpps are reported beside the images.
In practice, use of quality 65 is rare on the internet and only seen at the lowest-quality-tier sites. Quality 75 seems to be the usual poor quality and quality 85 the average. I use quality 94 YUV444 or better when I need to compress.
Bitrates are in the left column, jpg low quality is the same size as jxl/avif med-low quality (0.4bpp), so you should compare the bottom left picture to the top mid and right pictures.
JPEG bitrates are higher, so all it means is that SSIMULACRA2 is the wrong metric for this test. It seems that SSIMULACRA2 heavily penalizes blocking artifacts but doesn't much care about blur. I agree that the JPEG versions look better at the same SSIMULACRA2 score.
Humans generally tend to prefer smoothing over visible blocking artifacts. This is especially true when a direct comparison to the original image is not possible. Of course different humans have different tastes, and some do prefer blocking over blur. SSIMULACRA2 is based on the aggregated opinions of many thousands of people. It does care more about blur than metrics like PSNR, but maybe not as much as you do.
Human ratings are expensive and clumsy so people often use computed aka objective metrics, too.
The best OSS metrics today are Butteraugli, DSSIM and SSIMULACRA. The author is using one of them. None of the codecs was optimized for those metrics, except jpegli partially.
Yes, that was my takeaway from this that JPEG keeps edge sharpness really well (e.g. the eyelashes) while the jxl and avif smooth all detail out of the image.
No, JXL and AVIF keep the same level of edge sharpness but without all the blocking artifacts when given the same amount of bits per pixel as the lowest-quality JPEG.
I do not understand why this article focuses so much on encode speed, but for decode, which I believe represents 99% of usage in this web-connected world, give a cursory...
> Decode speed is not really a significant problem on modern computers, but it is interesting to take a quick look at the numbers.
Anything more than 100 MB/s is considered "enough" for the internet because at that point your bottleneck is no longer decoding. Most modern compression algorithms are asymmetric, that is, you can spend much more time on compression without significantly affecting the decompression performance, so it is indeed less significant once the base performance is achieved.
During the design process of Pik/JPEG XL I experimented on decode speed personally, in order to form an opinion about this. I tried a special version of Chrome that artificially throttled the image decoding. Once the decoding speed got to around 20 megapixels per second, the benefit from additional speed was difficult to notice. I tried 2, 20 and 200 megapixels per second throttlings. This naturally depends on image sizes and uses too.
There was a much more noticeable impact from progressive images, and even from sequential images displayed in a streaming manner during the download. As a rule of thumb, sequential top-to-bottom streaming feels 2x faster than waiting for the full render, and progressive feels 2x faster than sequential streaming.
This matters way more for video (where you are decoding 30 images per second continuously) than it does for still images. For still images, the main thing that drains your battery is the display, not the image decoding :)
But in any case, there are no _major_ differences in decoding speed between the various image formats. The difference caused by reducing the transfer size (network activity) and loading time (user looking at a blank screen while the image loads) is more important for battery life than the decoding speed itself. Also the difference between streaming/progressive decoding and non-streaming decoding probably has more impact than the decode speed itself, at least in the common scenario where the image is being loaded over a network.
Assuming the websites are using images of appropriate dimensions (that is, not using huge images and relying on browser downscaling, which is a bad practice in any case), you can quite easily do the math. A 1080p screen is about 2 megapixels, a 4K screen is about 8 megapixels. If your images decode at 50 Mpx/s, that's 25 full screens (or 6 full screens at 4K) per second. You need to scroll quite quickly and have a quite good internet connection before decode speed will become a major issue, whether for UX or for battery life. Much more likely, the main issue will be the transfer time of the images.
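Spelling that arithmetic out (same rounded figures as above):

screen_1080p_mpx = 1920 * 1080 / 1e6    # ~2.07 Mpx per full 1080p screen
screen_4k_mpx = 3840 * 2160 / 1e6       # ~8.29 Mpx per full 4K screen
decode_mpx_per_s = 50.0
print(decode_mpx_per_s / screen_1080p_mpx)   # ~24 full 1080p screens per second
print(decode_mpx_per_s / screen_4k_mpx)      # ~6 full 4K screens per second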
Aside from all the points you raise, I find the discussion about battery life a little absurd in light of how negligible it is compared to the impact of poorly written JavaScript in web apps. For example, I noticed this morning that my bank somehow pins one CPU thread to 100% usage whenever I have their internet banking site open, even when nothing is being done. AFAIK there is no cryptocurrency nonsense going on, and the UI latency is pretty good too, so my best guess is that their "log out automatically after ten minutes of inactivity" security feature is implemented through constant polling.
And the one you're replying to is also talking about battery life. The energy needed to display an image for a few seconds is probably higher than the energy needed to decode it.
Agreed. For web use they all decode fast enough. Any time difference might be in progression or streaming decoding, vs. waiting for all the data to arrive before starting to decode.
For image gallery use with camera-resolution photographs (12-50 Mpixels) it can be more fun to have 100+ Mpixels/s, even 300 Mpixels/s.
I wasn't able to convince myself of that when approaching the question with back-of-the-envelope calculations, published research and prototypes.
Very few applications are constantly decoding images. Today a single image is often decoded in a few milliseconds, but watched 1000x longer. If you 10x or even 100x the energy consumption of image decoding, it is still not going to compete with the display, the radio and video decoding as a battery drain.
As others pointed out, that's why JPEG XL's excellent support for progressive decoding is important. Other formats either do not support progressive decoding at all or make it optional, so it cannot even be compared at this point. In other words, the table can be regarded as evidence that you can have both progressive decoding and performance at once.
There is no "throughput vs latency" here, there is no "start-up time" for decoding an image that's already in RAM. If a decoder decodes at 100MiB/s, and a picture is 10MiB, it's decoded in 0.1 seconds. If the decoder decodes at 1 MiB/s, the same picture is decoded in 10 seconds.
This isn't true for web use cases. There, being able to start decoding when the first bits arrive, rather than waiting for the last, can make a difference (JPEG XL does this a lot better than other image formats because it sends all the DC coefficients first, which lets the website display a low-resolution version of the image and then fill in the detail as the rest of the image comes through).
That's about progressive decoding, which isn't what this benchmark is focusing on. When benchmarking decode speed, throughput is a perfectly good metric. If the article wanted to talk about progressive decode performance, it would require a complete redesign, not just a change in metric from "throughput" to "latency".
Is it practical to use hardware video decoders to decode the image formats derived from video formats, like AVIF/AV1 and HEIC/HEVC? If so, that could be a compelling reason to prefer them over a format like JPEG XL, which has to be decoded in software on all of today's hardware. Everything has HEVC decode, and AV1 decode is steadily becoming a standard feature as well.
No browser bothers with hardware decode of WebP or AVIF even if it is available. It is not worth the trouble for still images. Software decode is fast enough, and can have advantages over hw decode, such as streaming/progressive decoding. So this is not really a big issue.
No, not really - mostly because setup time and concurrent-decode limitations of HW decoders across platforms tend to undermine any performance or battery gains from that approach. As far as I know, not even mobile platforms bother with it in their native decoders for any format.
In some use cases the company is paying for the encoding, but the client is doing the decoding. As long as the client can decode the handful of images on the page fast enough for the human to not notice, it's fine. Meanwhile any percentage improvement for encoding can save real money.
Companies and economies that optimize their clients' costs as if they were their own can see diffuse benefits from it. Even considering 0.01% of that cost can lead to better decisions. If it makes the website load faster or allows more realistic product pictures, it can lead to faster growth or fewer returns, etc.
But the server doesn't necessarily have unlimited time to encode those images. Each of those 1 million images needs to be encoded before it can be sent to a client.
The inclusion of QOI in the lossless benchmarks made me smile. It's a basically irrelevant format, that isn't supported by default by any general-public software, that aims to be just OK, not even good, yet it has a spot on one of these charts (non-photographic encoding). Neat.
GameMaker Studio has actually rather quickly jumped onto the QOI bandwagon, having 2 years ago replaced PNG textures with QOI (and added BZ2 compression on top) and found a 20% average reduction in size. So GameMaker Studio and all the games produced with it in the past 2 years or so do actually use QOI internally.
Not something a consumer knowingly uses, but also not quite irrelevant either.
That feels a little sad. If QOI had anything going for it (fast single-threaded decoding speed for photographic content), adding bz2 most certainly removed it. They could have just used WebP lossless and it would have been faster and smaller.
bz2 is obsolete. It’s very slow, and not that good at compressing. zstd and lzma beat it on both compression and speed at the same time.
QOI’s only selling point is simplicity of implementation that doesn’t require a complex decompressor. Addition of bz2 completely defeats that. QOI’s poorly compressed data inside another compressor may even make overall compression worse. It could have been a raw bitmap or a PNG with gzip replaced with zstd.
And yet it didn't reach the Pareto frontier! It's quite obvious in hindsight though: QOI decoding is inherently sequential and can't be easily parallelized.
Of course it didn't; it wasn't designed to be either the fastest or the best. Just OK and simple. Yet in some cases it's not completely overtaken by the competition, and I think that's cool.
I don't believe QOI will ever have any sort of real-world practical use, but that's quite OK and I love it for it has made me and plenty of others look into binary file formats and compression and demystify it, and look further into it. I wrote a fully functional streaming codec for QOI, and it has taught me many things, and started me on other projects, either working with more complex file formats or thinking about how to improve upon QOI. I would probably never have gotten to this point if I tried the same thing starting with any other format, as they are at least an order of magnitude more complex, even for the simple ones.
> Of course it didn't; it wasn't designed to be either the fastest or the best. Just OK and simple. Yet in some cases it's not completely overtaken by the competition, and I think that's cool.
Actually, there was a big push to add QOI to stuff a few years ago, specifically due to it being "fast". It was claimed that while it has worse compression, the speed can make it a worthy trade off.
As far as I understand this benchmark, JXL was using 8 CPU cores, while QOI naturally only used one. If you were to plot the graph with compute used (watts?) instead of Mpx/s, QOI would compare much better.
Also, curious that they only benchmarked QOI for "non-photographic images (manga)", where QOI fares quite badly because it doesn't have a paletted mode. QOI does much better with photos.
Actually, they did try QOI for the photographic images:
> Not shown on the chart is QOI, which clocked in at 154 Mpx/s to achieve 17 bpp, which may be “quite OK” but is quite far from Pareto-optimal, considering the lowest effort setting of libjxl compresses down to 11.5 bpp at 427 Mpx/s (so it is 2.7 times as fast and the result is 32.5% smaller).
17 bpp is way outside the area shown in the graph. All the other results would've gotten squished and been harder to read, had QOI been shown.
I just ran qoibench on the photos they used[1] and QOI does indeed fare pretty badly, with a compression ratio of 71.1% vs. 49.3% for PNG.
The photos in the QOI benchmark suite[2] somehow compress a lot better (e.g. photo_kodak/, photo_tecnick/ and photo_wikipedia/). I guess it's the film grain with the high resolution photos used in [1].
One does wonder how much of JXL's awesomeness is the encoder vs. the format. Its ability to make high quality, compact images just with "-d 1.0" is uncanny. With other codecs, I had to pass different quality settings depending on the image type to get similar results.
That's a very good point. At this rate of development I wouldn't be surprised if libjxl becomes x264 of image encoders.
On the other hand, libvpx has always been a mediocre encoder which I think might be the reason for disappointing performance (I mean in general, not just speed) of vp8/vp9 formats, which inevitably also affected performance of lossy WebP. Dark Shikari even did a comparison of still image performances of x264 vs vp8 [0].
While WebP lossy still has image quality issues it has improved a lot over the years. One should not consider a comparison done with 2010-2015 implementations indicative of quality performance today.
I'm sure it's better now than 13 years ago, but the conclusion I got from looking at very recent published benchmark results is that lossy WebP is still only slightly better than MozJPEG at low bitrates and still has a worse maximum perceptual-quality ceiling than JPEG, which in my opinion makes it not worth using over plain old JPEG even in web settings.
That matches my observations. I believe that lossy WebP does not add value when jpegli is an option, and it has a hard time competing even with MozJPEG.
Pik was initially designed without quality options, only to do the best that can be achieved at distance 1.0.
We kept a lot of focus on visually lossless and I didn't want to add format features which would add complexity but not help at high quality settings.
In addition to modeling features, the context modeling and the efficiency of entropy coding are critical at high quality. I consider AVIF's entropy coding ill-suited for high quality or lossless photography.
Dear lord... despite browsing and using GitHub on a daily basis I still miss the releases section sometimes! Before I saw your reply I checked the Scoop repos and sure enough, on Windows this will get you the latest cjpegli version installed and added to PATH in one go:
scoop install main/libjxl
Note: now that I've tried it, that is really next level for an old format!
Jpegli has the possibility of using XYB. By default, acting as a drop-in replacement for MozJPEG or libjpeg-turbo, it doesn't.
I believe Jon has compared jpegli without XYB. If you turn XYB on, you get about 10 % more compression.
Jpegli is great even without XYB. It has many other methods for success (largely copied over from JPEG XL adaptive quantization heuristics, more precise intermediate calculations, as well as the guetzli method for variable dead-zone quantization).
disclaimer: I created the XYB colorspace, most of the JPEG XL VarDCT quality-affecting heuristics, and scoped jpegli. Zoltan (from WOFF2/Brotli fame!) did the actual implementation and made it work so well.
It is worth noting that the JPEG XL effort produced a nice new portable SIMD library called Highway. This library is powering not only JPEG XL but also Google's latest Gemma AI models.
> Today we're sharing open source code that can sort arrays of numbers about ten times as fast as the C++ std::sort, and outperforms state of the art architecture-specific algorithms, while being portable across all modern CPU architectures. Below we discuss how we achieved this.
Without taking into account whether JPEG XL shines on its own or not (which it may or may not), JPEG XL completely rocks for sure because it does this:
.. $ ls -l a.jpg && shasum a.jpg
... 615504 ... a.jpg
716744d950ecf9e5757c565041143775a810e10f a.jpg
.. $ cjxl a.jpg a.jxl
Read JPEG image with 615504 bytes.
Compressed to 537339 bytes including container
.. $ ls -l a.jxl
... 537339 ... a.jxl
Do you realize how many billions of JPEG files there are out there which people want to keep? If you recompress your old JPEG files using a lossy format, you lower their quality.
But with JPEG XL, you can save 15% to 30% and still, if you want, get your original JPG 100% identical, bit for bit.
That's wonderful.
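The reverse direction can be checked in the same way; a sketch, assuming the .jxl was created from a JPEG so that it carries the reconstruction data:

$ djxl a.jxl restored.jpg
$ shasum a.jpg restored.jpg

The two hashes should be identical.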
P.S: I'm sadly on Debian stable (12 / Bookworm) which is on ImageMagick 6.9 and my Emacs uses (AFAIK) ImageMagick to display pictures. And JPEG XL support was only added in ImageMagick 7. I haven't looked more into that yet.
This particular feature might not, but if said screenshots are often compressed with JPEG XL, they will be spared the generation loss that becomes blatantly visible in some other formats: https://invidious.protokolla.fi/watch?v=w7UDJUCMTng
> The new version of libjxl brings a very substantial reduction in memory consumption, by an order of magnitude, for both lossy and lossless compression. Also the speed is improved, especially for multi-threaded lossless encoding where the default effort setting is now an order of magnitude faster.
Very impressive! The article too is well written. Great work all around.
Maybe someone here will know of a website that describes each step of the jpeg xl format in detail? Unlike for traditional jpeg, I have found it hard to find a document providing clear instructions on the relevant steps, which is a shame as there are clearly tons of interesting innovations that have been compiled together to make this happen, and I'm sure the individual components are useful in their own right!
Missing from the article is rav1e, which encodes AV1, and hence AVIF, a lot faster than the reference implementation aom. I've had cases where aom would not finish converting an image within a minute of waiting, while rav1e would do it in less than 10 seconds.
rav1e is generally head to head with libaom on static images, and which one wins on the speed/quality/size frontier depends a lot on the image and settings, as much as +/- 20%. I suspect rav1e has an inefficient block size selection algorithm, so the particular shape of blocks is a make or break for it.
I’ve only compared rav1e to mozjpeg and libwebp, and at fastest speeds it’s only barely ahead.
The article mentions encoding speed as something to consider, alongside compression ratio. I would argue that decoding speed is also important. A lot of the more modern formats (WebP, AVIF etc.) can take significantly more CPU cycles to decode than a plain old JPG. This can slow things down noticeably, especially on mobile.
Note that JPEG XL always supports progressive decoding, because the top-level format is structured in that way. The optional part is a finer-grained adjustment to make the output more suitable for specific cases.
Any computation-intensive media format on mobile is likely using a hardware decoder module anyway, and that most frequently includes JPEG. So that comparison is not adequate.
Seriously, when is the last time mobile phones used hardware decoding for showing images? Flip phones in 2005?
I know camera apps use hardware encoding but I doubt gallery apps or browsers bother with going through the hardware decoding pipeline for hundreds of JPEG images you scroll through in seconds. And when it comes to showing a single image they'll still opt to software decoding because it's more flexible when it comes to integration, implementation, customization and format limits. So not surprisingly I'm not convinced when I repeatedly see this claim that mobile phones commonly use hardware decoding for image formats and software decoding speed doesn't matter.
I don't know the current status of web browsers, but hardware encoding and decoding for image formats is alive and well. Not really relevant for showing a 32x32 GIF arrow like on HN, but very important when browsing high resolution images with any kind of smoothness.
If you don't really care about your users' battery life you can opt to disable hardware acceleration within your applications, but it's usually enabled by default, and for good reason.
> hardware encoding and decoding for image formats is alive and well
I keep hearing and hearing this but nobody has ever yet provided a concrete real world example of smart phones using hw decoding for displaying images.
That was quite contrary to my understanding, so I took some more time to verify both my claim and yours. The reality turned out to be somewhere in the middle: modern mobile SoCs do ship with hardware JPEG decoding among others, but there is no direct API for that hardware decoding module in the mobile platform itself (Android 7 and onwards use libjpeg-turbo by default, for example). But mobile manufacturers can change the implementation details behind those APIs, so it is still true that some mobiles do use hardware JPEG decoding behind the scenes. But it is hard to tell how common it is. So, well, thank you for the counterpoint; that corrected my understanding.
They ship with hardware jpeg decoders because they ship with hardware jpeg encoders for camera capture latency reasons and it turns out you can basically just run that hardware in reverse.
The SoC vendors aren't investing more than a token amount of effort into those JPEG decoders, and from experience some of them claim to exist but produce the shittiest-looking output imaginable, and more slowly than libjpeg-turbo at that.
Also you can trivially find out if your Android phone is doing this or not, just run some perf call sampling while decoding jpegs. If all you see is AOSP libraries & libjpeg-turbo, well then they aren't doing hardware decodes :)
Does JPEG XL have patent issues? I half remember something about that. Regular JPG seems fine to me. Better compression isn't going to help anyone since they will find other ways to waste any bandwidth available.
The main innovation claimed by Microsoft's rANS patent is about the adaptive probability distribution, that is, you should be able to efficiently correct the distribution so that you can use less bits. While that alone is an absurd claim (that's a benefit shared with arithmetic coding and its variants!) and there is a very clear prior art, JPEG XL doesn't dynamically vary the distribution so is thought to be not related to the patent anyway.
And yes, regular JPEG is still a fine format. That's part of the point of the article. But for many use cases, better compression is always welcome. Also having features like alpha transparency, lossless, HDR etc can be quite desirable, and those things are not really possible in JPEG.
I have an existing workflow where I take giant PNGs from designers and re-encode them to JPEG using mozjpeg. However, I can't find a way to invoke the jpegli tool in the same way, especially since it seems to just be part of the libjxl tooling? Is that right? Are there any sample invocations anywhere?
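For reference, cjpegli is built as its own binary alongside cjxl in libjxl, and a basic invocation looks roughly like this (flags from memory, so check cjpegli --help for the exact set):

cjpegli input.png output.jpg -q 85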
The problem with JPEG XL is that it is written in an unsafe language and has already had several memory safety vulnerabilities found in it.
Image codecs are used in a wide range of attacker-controlled scenarios and need to be completely safe.
I know Rust advocates sound like a broken record, but this is the poster child for a library that should never have been even started in C++ in the first place.
It’s absolute insanity that we write codecs — pure functions — in an unsafe language that has a compiler that defaults to “anything goes” as an optimisation technique.
Pretty much every codec in every browser is written in an unsafe language, unfortunately. I don't see why JXL should be singled out. On the other hand, there is a JXL decoder in Rust called jxl-oxide [1] which works quite well, and has been confirmed by JPEG as conformant. Hopefully it will be adopted for decode-only usecases.
> It’s absolute insanity that we write codecs — pure functions — in an unsafe language that has a compiler that defaults to “anything goes” as an optimisation technique.
Rust and C++ are exactly the same in how they optimize, compilers for both assume that your code has zero UB. The difference is that Rust makes it much harder to accidentally have UB.
"We've never had to wear helmets before, why start now?"
There are only a handful of image codecs that are widely accepted. Essentially just GIF, PNG, and JPG. There's a smattering of support for more modern formats, but those three dominate.
Adding a fourth image format is increasing this attack surface by a substantial margin across a huge range of software. Not just web browsers, but chat apps, server software (thumbnail generators), editors, etc...
This is the kind of thing that gets baked into standard libraries, operating systems, and frameworks. It's up there with JSON or XML.
You had better be damned sure what you're doing is not going to cause a long list of CVEs!
JPEG XL is a complex codec, with a lot of code. This increases the chance of bugs and increases the attack surface.
A (surprisingly!) good metric for complexity is the size of the zip file of the code. Libjpeg is something like 360 kB, libpng is 350 kB, and giflib is 90 kB.
The JXL source is 1.4 MB zipped, making it more than twice the size of the above three combined!
The other libraries use C/C++ not because that's a better choice, but because it was the only choice back in the ... checks Wikipedia ... 1980s and 90s!
We live in the future. We have memory-safe languages now. We're allowed to use them. You won't get in trouble from anyone, I promise.
> "We've never had to wear helmets before, why start now?"
> We live in the future. We have memory-safe languages now. We're allowed to use them. You won't get in trouble from anyone, I promise.
That's why I specifically said that it's unfortunate that C++ is still wide spread, and pointed to a fully conformant JXL decoder written in Rust :p
> There are only a handful of image codecs that are widely accepted. Essentially just GIF, PNG, and JPG. There's a smattering of support for more modern formats, but those three dominate.
Every browser ships libwebp and an AVIF decoder. Every reasonably recent Android phone does as well. And every iPhone. Every (regular) install of Windows has libwebp. Every Mac has libwebp and dav1d. That's all C and C++. AVIF in particular is only a couple of years older than JXL, and yet I've never seen opposition to it on the grounds of memory safety. That is what I meant about JXL being singled out.
> JPEG XL is a complex codec, with a lot of code. This increases the chance of bugs and increases the attack surface.
> A (surprisingly!) good metric for complexity is the size of the zip file of the code. Libjpeg is something like 360 kB, libpng is 350 kB, and giflib is 90 kB.
> The JXL source is 1.4 MB zipped, making it more than twice the size of the above three combined.
Which code exactly are you including in that? The libjxl repo has a lot of stuff in it, including an entire brand new JPEG encoder! Though jxl certainly is more complex than those three combined, since JXL is essentially a superset of all their functionality, plus new stuff.
I revised my numbers a bit by filtering out the junk and focusing only on the code that most likely contributes to the runtime components (where the security risks lie). E.g.: Excluded the samples, test suites, doco, changelogs, etc... and kept mostly just the C/C++ and assembly code.
I also recompressed all of the libraries with identical settings to make the numbers more consistent.
I believe JPEG XL binary size is about one third of AVIF binary size. It is relatively compact. It is easy to write a small encoder: libjxl-tiny is just 7000 lines of code.
This is really impressive even compared to WebP. And unlike WebP, it's backwards compatible.
I have forever associated Webp with macroblocky, poor colors, and a general ungraceful degradation that doesn't really happen the same way even with old JPEG.
I am gonna go look at the complexity of the JXL decoder vs WebP. Curious if it's even practical to decode on embedded. JPEG is easily decodable, and you can do it in small pieces at a time to work within memory constraints.
Everyone hates WebP because when you save it, nothing can open it.
That's improved somewhat, but the formats that will have an easy time winning are the ones that people can use, even if that means a browser should "save JPEG XL as JPEG" for a while or something.
Everyone hates webp for a different reason. I hate it because it can only do 4:2:0 chroma, except in lossless mode. Lossless WebP is better than PNG, but I will take the peace of mind of knowing PNG is always lossless over having a WebP and not knowing what was done to it.
Neither of these are really what I'm referring to, as I view these as ~equivalent to converting a jpeg to png. What I mean is within a pipeline, once you have ingested a [png|webp|jpeg] and you need to now render it at various sizes or with various filters for $purposes. If you have a png, you know that you should always maintain losslessness. If you have a jpeg, you know you don't. You don't need to inspect the file or store additional metadata, the extension alone tells you what you need to know. But when you have a webp, the default assumption is that it's lossy but it can sometimes be otherwise.
I've noticed in chrome-based browsers, you can right click on a webp file and "edit image". When you save it, it defaults to png download, which makes a simple conversion.
Mobile browsers seem to default to downloading in png as well.
JPEG XL can be converted to/from JPEG without any loss of quality. See another comment that shows an example where doing JPEG -> JPEG XL -> JPEG generates a binary-exact copy of the original JPEG.
Yeah, this is not what we usually call backwards compatibility, but it allows usage like storing the images as JPEG XL and, on the fly, sending a JPEG to clients that can't use it, without any loss of information. WebP can't do that.
But that only works when the JXL has been losslessly converted from a JPEG in the first place, right? So this wouldn’t work for all JXL in practice. (Unless I’ve missed something and this is not the case.)
You could start with the relatively good jpegli as a codec and then losslessly recompress that with JPEG XL. Naturally some entity (server side, app, content encoding, etc.) needs to unpack the JPEG-XL-wrapped JPEG into a usual JPEG before it can be consumed by a legacy system.
Often with this kind of Pareto analysis it can be argued that even when continuous choices are not available, a compression system could encode every other image at effort 7 and the rest at effort 6 (or any other ratio), leading on average to interpolated results. Naturally such interpolation does not produce straight lines in log space.
Yes, it should, but it looks like they just added a line to the jxl 0.10 series of data on whatever they used to make the graph, and labelled it the Pareto front. Looking closely at the graphs, they actually miss some points where version 0.9 should be included in the frontier.
I think it can be understood as an expected Pareto frontier if enough options are added to make it continuous, which is often implied in this kind of discussions.
I'm not sure that's reasonable - The effort parameters are integers between 1 and 10, with behavior described here: https://github.com/libjxl/libjxl/blob/main/doc/encode_effort..., the intermediate options don't exist as implemented programs. This is a comparison of concrete programs, not an attempt to analyze the best theoretically achievable.
Also, the frontier isn't convex, so it's unlikely that if intermediate options could be added then they would all be at least as good as the lines shown; and the use of log(speed) for the y-axis affects what a straight line on the graph means. It's fine for giving a good view of the dataset, but if you're going to make a guess about intermediate possibilities, 'speed' or 'time' should also be considered.
You are right, but that would make an uglier plot :)
Some of the intermediate options are available though, through various more fine-grained encoder settings than what is exposed via the overall effort setting. Of course they will not fall exactly on the line that was drawn, but as a first approximation, the line is probably closer to the truth than the staircase, which would be an underestimate of what can be done.
Whichever is more pessimistic. So for the axes in this article, the first one. If you have an option on the "bad" side of the Pareto curve, you can always find an option that is better in both axes. If a new option is discovered that falls on the good side of the curve, well, then the curve needs to be updated to pass thru that new option.
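For concreteness, a small sketch of how such a frontier can be computed from (density, speed) points, where lower bits per pixel and higher speed are both better (my own illustration, not the article's plotting code):

def pareto_frontier(points):
    # points: iterable of (bits_per_pixel, megapixels_per_second) tuples.
    pts = sorted(points, key=lambda p: (p[0], -p[1]))  # by density, fastest first
    frontier, best_speed = [], float("-inf")
    for bpp, speed in pts:
        if speed > best_speed:        # not dominated by any denser-or-equal point
            frontier.append((bpp, speed))
            best_speed = speed
    return frontier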
The choice to represent the speed based on multithreaded encoding strikes me as somewhat arbitrary. If your software has a critical path dependent on minimal latency of a single image, then it makes some sense, but you still may have more or fewer than 8 cores. On the other hand if you have another source of parallelism, for example you are encoding a library of images, then it is quite irrelevant. I think the fine data in the article would be even more useful if the single threaded speed and the scalability of the codec were treated separately.
Such a shame arithmetic coding (which is already in the standard) isn't widely supported in the real world. Because converting Huffman coded images losslessly to arithmetic coding provides an easy 5-10% size advantage in my tests.
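For the curious, that lossless conversion is a one-liner with jpegtran, assuming a libjpeg build with arithmetic coding compiled in (and with the caveat above that few viewers can open the result):

$ jpegtran -arithmetic -copy all input.jpg > output.jpg

Leaving out -arithmetic converts back to the Huffman-coded form, since that is jpegtran's default output.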
The benefits of JPEG kind of go away if you start adopting more recent changes to the format, no? JPEG is nice because everything has supported it for 20+ years. JPEG-with-arithmetic-coding is essentially a new, incompatible format, why not use JXL or AVIF instead?
Yes, but this is really a pity in the specific case of Arithmetic Coding, because, unlike "more recent changes to the format", it's been in the standard since the very beginning - but is not supported by a lot of implementations due to software patents (which meanwhile expired, but their damage remains).
Arithmetic coding is old, but around 2010 (when all the patents expired) there was a ton of really good research on how to do table-based ANS and vectorized rANS to make the performance good. Aside from the patent issues, arithmetic coding wasn't pursued much because the CPU cost was too high. Now that multiplication is cheap and the divisions can be avoided, ANS is a lot better than it used to be.
I'm surprised mozjpeg performed worse than libjpeg-turbo at high quality settings. I thought its aim was better perceptual quality than libjpeg-turbo at the expense of speed.
It is nice that 0.10 finally landed those memory and speed optimisations.
But the king remains HALIC. For multithreaded encoding, JPEG XL still uses 3.5x more memory than HALIC, and 6x the encoding time, while offering the same or smaller file sizes in lossless. Hopefully JPEG XL can narrow those gaps some day.
WebP is awesome at lossless and way better than even PNG.
It's because WebP has a special encoding pipeline for lossless pictures (just like PNG) while AVIF is basically just asking a lossy encoder originally designed for video content to stop losing detail. Since it's not designed for that it's terrible for the job, taking lots of time and resources to produce a worse result.
Lossless WebP is actually quite good, especially on text-heavy images; e.g. screenshots of a terminal with `cwebp -z9` are usually smaller than `cjxl -d 0 -e 9` in my experience.
Lossless AVIF is just really quite bad. Notice how for photographic content it is barely better than PNG, and for non-photographic content it is far worse than PNG.
It has lossless just to check a box in terms of supported features. A bit like how JPEG XL supports animation just to have feature parity. But in most cases, you'll be better off using a video codec for animation, and an image format for images.
There are some user-level differences between an animated image and a video, which haven't really been satisfactorily resolved since the abandonment of GIF-the-format. An animated image should pause when clicked, and start again on another click, with a setting separate from video autoplay to control the default. It should not have visible controls of any sort; that's the whole interface. It should save and display on the computer/filesystem as an image, and degrade to the display frame when sent along a channel which supports images but not animated ones. It doesn't need sound, or CC, or subtitles. I should be able to add it to the photo roll on my phone if I want.
There are a lot of little considerations like this, and it would be well if the industry consolidated around an animated-image standard, one which was an image, and not a video embedded in a way which looks like an image.
I believe it is more fundamental. I like to think that AV1 entropy coding just becomes ineffective for large values. Large values are dominantly present in high quality photography and in lossless coding. Large values are repeatedly prefix coded and this makes effective adaptation of the statistics difficult for large integers. This is a fundamental difference and not a minor difference in focus.
Usually the issue is not using the YCgCo-R colorspace. I do not see enough details in the article to know if that is the case here. There are politics around getting the codepoint included: https://github.com/AOMediaCodec/av1-avif/issues/129
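For reference, the reversible YCgCo-R variant is just a few integer adds and shifts (a sketch of the standard lifting form, not code from any particular encoder):

def rgb_to_ycgco_r(r, g, b):
    # Integer, exactly invertible (lifting) version of YCgCo.
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_r_to_rgb(y, cg, co):
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b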
"Pareto" being used outside the context of Brazil's best prank call ever (Telerj Prank) will always confuse me. I keep thinking, "what does the 'thin-voiced lawyer' have to do with statistics?"...
It's so frustrating how the Chromium team is ending up as a gatekeeper of the Internet by picking and choosing what gets developed or not.
I recently came across another issue pertaining to the Chromium team not budging on their decisions, despite pressure from the community and an RFC backing it up - in my case custom headers in WebSocket handshakes, which are supported by other JavaScript runtimes like Node and Bun, but the Chromium maintainer just disagrees with it - https://github.com/whatwg/websockets/issues/16#issuecomment-...
> It's so frustrating how the chromium team is ending up as a gatekeeper of the Internet by pick and choosing what gets developed or not.
https://github.com/niutech/jxl.js is based on Chromium tech (Squoosh from GoogleChromeLabs) and provides an opportunity to use JXL with no practical way for Chromium folks to intervene.
Even if that's a suboptimal solution, JXL's benefits supposedly should outweigh the cost of integrating that, and yet I haven't seen actual JXL users running to that in droves.
So JXL might not be good support for your theory: where people could adopt it, they still don't. Maybe the format isn't actually that important; it's just a popular meme to rehash.
Why do you assume that the benefits would outweigh said costs? That's a weird burden to set on the format. Using JavaScript on the browser to decode it is a huge hurdle, I don't know of any format that ever got popular or got its initial usage from a similar approach. Avif was just added too, even if no one was using a js library to decode it beforehand
Fwiw I agree that there's a weird narrative around jpegxl, at the end of the day it's just a format, and I think it's not very good for lower quality images as proven by the linked article in the OP. Avif looks better in that regard.
I think it would've made more sense than WebP though (which also doesn't look good at all when not lossless), but that was like a decade ago and that ship has sailed. So avif fills a niche that WebP sucks at, while jpegxl doesn't really do that. That alone is reason enough to not bother with including it.
People don't use blurry low quality images in the web. These low qualities don't matter outside of compression research.
Average/median quality of images is between 85 to 90 depending how you calculate it.
There, users' waiting time over the lifetime of these image formats is worth about 3 trillion USD. If we can reduce 20% of it, we create wealth of 600 billion USD, distributed to the users. More savings come from data transfer costs.
> Why do you assume that the benefits would outweigh said costs? That's a weird burden to set on the format.
I'm not assuming that there are those benefits, but that there are people who see them. Those who are _very_ vocal about browsers (and Chrome in particular) not supporting it seem to think so, or they wouldn't bother.
If I propose integrating good old Targa file support into Chrome, I'd also be asked about a cost/benefit analysis. And by building and using a polyfill to add that support, I show that I'm serious about Targa files, which gives credence to my cost/benefit analysis and also lets people play around with the Targa format, hopefully making it self-evident that the format is good, and from there that these benefits based on native support would be even better.
For JXL I see people talking the talk but, by and large, not walking the walk.
I see what you mean. Yeah, I think JPEG XL is the format that I've heard about the most but never really seen in the wild. It's a chicken-and-egg problem, but still, it's basically not used at all compared to the mindshare it seems to have in these discussions.
What hammer? You want US president or supreme court to compel Chrome developers to implement every image format in existence and every JS API proposed by anyone anywhere?
Unless it is some kind of anti-competitive behavior, like intentionally stifling adoption of a standard competing with their proprietary patent-encumbered implementation that they expect to collect royalties for (which doesn't seem to be the case), I don't see the problem.
That's not how this works. Firefox is the closest we have, and realistically the closest we will get to a "better product" than Chromium for the foreseeable future, and it's clearly not enough.
The only hammer at all left is Safari, basically on iPhones only.
That hammer is very close to going away; if the EU does force Apple to really open the browsers on the iPhone, everything will be Chrome as far as the eye can see in short order. And then we fully enter the chromE6 phase.
Firefox is "neutral", which I understand as meaning they'll do whatever Chrome does.
All the code has been written, patches to add JPEG XL support to Firefox and Chromium are available and some of the forks (Waterfox, Pale Moon, Thorium, Cromite) do have JPEG XL support.
They didn't "lose interest", their lawyers pulled the emergency brakes. Blame patent holders, not Google. Like Microsoft: https://www.theregister.com/2022/02/17/microsoft_ans_patent/. Microsoft could probably be convinced to be reasonable. But there may be a few others. Google actually also holds some patents over this but they've done the right thing and license those patents along with their implementation.
To fix this, you'd need to convince Google, and other large companies that would be exposed to lawsuits related to these patents (Apple, Adobe, etc.), that these patent holders are not going to insist on being compensated.
Other formats are less risky, especially the older ones. JPEG is fine because it's been out there for so long that any patents applicable to it have long expired. Same with GIF, which was once held up by patents. PNG is at this point also fine: if any patents applied at all, they will soon have expired, as the PNG standard dates back to 1997 and the work on it depended on research from the seventies and eighties.
There are no royalties to be paid on JPEG XL. Nobody but Cloudinary and Google is claiming to hold relevant patents, and Cloudinary and Google have provided a royalty free license. Of course the way the patent system works, anything less than 20 years old is theoretically risky. But so far, there is nobody claiming royalties need to be paid on JPEG XL, so it is similar to WebP in that regard.
"Patent issues" has become a (sometimes truthful) excuse for not doing something.
When the big boys want to do something, they find a way to get it done, patents or no, especially if there's only "fear of patents" - see Apple and the whole watch fiasco.
Adobe also has an order of magnitude smaller installed base than Chrome or Firefox, which makes patent fees much cheaper. And their software is actually paid for by users.
Not that simple. Maybe they struck a deal with a few of the companies or they made a different risk calculation. And of course they have a pretty fierce patent portfolio themselves so there's the notion of them being able to retaliate in kind to some of these companies.
I don't think that's true (see my other comment for what the patent is really about), but even if it were, Adobe's adoption means that JPEG XL is worth the supposed "risk". And Google does ship plenty of technologies that are clearly patent-encumbered. If the patent were the main concern, they could have said so, because there are enough people wondering about the patent status, but the Chrome team's stated reason against JPEG XL was quite different.
Adobe sells paid products and can carve out a license fee for that, like they do with all the other codecs and libraries they bundle. That's part of the price you are paying.
The same thing can be said of many patent-encumbered video codecs which Chrome supports nevertheless. That alone can't be a major deciding factor, especially given that JPEG XL's rate of adoption has been remarkably fast compared to any recent media format.
Is this not simply a risk vs. reward calculation? Newer video codecs offer very notable bandwidth savings over old ones. JPEG XL offers minor benefits over WebP, AVIF, etc. So while the dangers are the same for both, the calculation is different.
The Microsoft patent doesn't apply to JXL, and in any case, Microsoft has literally already affirmed that they will not use it to go after any open codec.
How exactly is that done? I assume even an offhand comment by an official (like CEO, etc) that is not immediately walked back would at least protect people from damages associated with willful infringement.
I think it would have been much better for everyone involved, and for humanity, if Mr. Duda himself had gotten the patent in the first place instead of praying that no one else would.
And nothing advances your career quite like getting your employer into a multi-year legal battle and spending a few million on legal fees, to make some images 20% smaller and 100% less compatible.
But that doesn't matter. If a patent is granted, choosing to infringe on it is risky, even if you believe you could make a solid argument that it's invalid given enough lawyer hours.
The Microsoft patent is for an "improvement" that I don't believe anyone is using, but Internet commentators seem to think it applies to ANS in general for some reason.
A few years earlier, Google was granted a patent for ANS in general, which made people very angry. Fortunately they never did anything with it.
I believe that Google's patent application dealt with interleaving non-compressed and ANS data in a manner that makes streaming coding easy and fast in software, not with ANS in general. I didn't read it, but I discussed it briefly with a capable engineer who had.
Not only do you have no source backing your claim, there is a glaring counterexample: Chromium's experimental JPEG XL support carried an expiry milestone, which was delayed multiple times and last bumped in June 2022 [1], before the final removal in October, months after the patent was granted!
>To fix this, you'd need to convince Google, and other large companies that would be exposed to lawsuits related to these patents (Apple, Adobe, etc.), that these patent holders are not going to insist on being compensated.
Apple has implemented JPEG XL support in macOS and iOS. Adobe has also implemented support for JPEG XL in their products.
Also, if patents were the reason Google removed JXL from Chrome, why would they make up technical reasons for doing so?
Please don't present unsourced conspiracy theories as if they were confirmed facts.
Mate, you're literally pulling something from your ass. Chrome engineers claim that they don't want JXL because it isn't good enough. Literally no one involved has said that it has anything to do with patents.
> There must be a more rational reason than that.
I've not heard anything better than legal reasons. But do correct me if I'm wrong. I've worked in big companies, and patents can be a showstopper. It seems like a plausible theory (i.e. not a conspiracy theory).
In your first comment, you stated as a fact that "lawyers pulled the emergency brakes", despite literally no one from Google ever saying this, and despite Google giving very different reasons for the removal.
And now you act as if something you made up in your mind is the default theory and the burden of proof is on the people disagreeing with you.
Doesn't make sense when they support GIF or animated WebP as images. Animated WebP in particular is just a purposely gimped WebM that should not exist at all and would not need to exist if we could use video files directly.
If you want a simple conspiracy theory, how about this:
The person responsible for AVIF works on Chrome, and is responsible for choosing which codecs Chrome ships with. He obviously prefers his AVIF to a different team's JPEG-XL.
Helping the web to evolve is challenging, and it requires us to make difficult choices. We've also heard from our browser and device partners that every additional format adds costs (monetary or hardware), and we’re very much aware that these costs are borne by those outside of Google. When we evaluate new media formats, the first question we have to ask is whether the format works best for the web. With respect to new image formats such as JPEG XL, that means we have to look comprehensively at many factors: compression performance across a broad range of images; is the decoder fast, allowing for speedy rendering of smaller images; are there fast encoders, ideally with hardware support, that keep encoding costs reasonable for large users; can we optimize existing formats to meet any new use-cases, rather than adding support for an additional format; do other browsers and OSes support it?
After weighing the data, we’ve decided to stop Chrome’s JPEG XL experiment and remove the code associated with the experiment. [...]
I'll try to make a bullet-point list of the individual concerns; the original statement is written in a style that is a bit confusing for a non-native speaker like me.
* Chrome's browser partners say JPEG XL adds monetary or hardware costs.
* Chrome's device partners say JPEG XL adds monetary or hardware costs.
* Does JPEG XL work best for the web?
* What is JPEG XL compression performance across a broad range of images?
* Is the decoder fast?
* Does it render small images fast?
* Is encoding fast?
* Is there hardware encoding support to keep encoding costs reasonable for large users?
* Do we need it at all or just optimize existing formats to meet new use-cases?
Jpegli is great. JPEG XL allows for about 35 % more savings on top of it. Compared to jpegli, that creates a few hundred billion of wealth in users' waiting time. So, it's a yes.
* Do other browsers and OSes support JPEG XL?
Partially. iOS and Safari support it. DNG supports it. Windows and some Android devices don't.
* Can it be done sufficiently well with WASM?
Wasm creates additional complexity, adds to load times, and possibly to computation times too.
Some more work is needed before all of Chrome's questions can be answered.
It's a real shame, because this is one of those few areas where Firefox could have led the charge instead of following in Chrome's footsteps. I remember when they first added APNG support and it took Chrome years to catch up, but I guess those days are gone.
Oddly enough, Safari is the only major browser that currently supports it despite regularly falling behind on tons of other cutting-edge web standards.
I followed the Mozilla/Firefox integration closely. I was able to observe enthusiasm from their junior- to staff-level engineers (LinkedIn-assisted analysis of the related bugs ;-). However, an engineering director stepped in and locked the discussions because they were at the "no new information" stage; Mozilla's position has been neutral on JPEG XL, and the integration has not progressed from the nightly builds to the next stage.
Ten years ago Mozilla had the most prominent image and video compression effort, called Daala. They posted inspiring blog posts about their experiments. Some of their work was integrated with Cisco's Thor and On2's/Chrome's VP8/9/10, leading to AV1 and AVIF. Today, I believe, Mozilla has moved away from this research and the ex-Daala researchers have found new roles.
Daala's and Thor's features were supposed to be integrated into AV1, but in the end, they wanted to finish AV1 as fast as possible, so very little that wasn't in VP10 made it into AV1. I guess it will be in AV2, though.
> ... very little that wasn't in VP10 made it into AV1.
I am not sure I would say that is true.
The entire entropy coder, used by every tool, came from Daala (with changes in collaboration with others to reduce hardware complexity), as did some major tools like Chroma from Luma and the Constrained Directional Enhancement Filter (a merger of Daala's deringing and Thor's CLPF). There were also plenty of other improvements from the Daala team, such as structural things like pulling the entropy coder and other inter-frame state from reference frames instead of abstract "slots" like VP9 (important in real-time contexts where you can lose frames and not know what slots they would have updated) or better spatial prediction and coding for segment indices (important for block-level quantizer adjustments for better visual tuning). And that does not even touch on all of the contributions from other AOM members (scalable coding, the entire high-level syntax...).
Were there other things I wish we could have gotten in? Absolutely. But "done" is a feature.
Some "didn't make it in" things that looked promising were the perceptual vector quantization[1], and a butterfly transform that Monty was working on, IIRC as an occasional spectator to the process.
Dropping PVQ was a hard choice. We did an initial integration into libaom, but due to substantial differences from the way that Daala was designed, the results were not outstanding [1]. Subsequent changes to the codebase made PVQ regress significantly from there, for reasons that were not entirely clear. When we sat down and detailed all of the work necessary for it to have a chance of being adopted, we concluded we would need to put the whole team on it for the entire remainder of the project. These were not straightforward engineering tasks, but open problems with no known solutions. Additional changes by other experiments getting adopted could have complicated the picture further. So we would have had to drop everything else, and the risk that something would not work out and PVQ would still not have gotten in was very high.
The primary benefit of PVQ is the side-information-free activity masking. That is the sort of thing that cannot be judged via PSNR and requires careful subjective testing with human viewers. Not something you want to be rushing at the last minute. After gauging the rest of AOM's enthusiasm for the work, we decided instead to improve the existing segmentation coding to make it easier for encoders to do visual tuning after standardization. That was a much simpler task with much less risk, and it was adopted relatively easily. I still think it was the right call.
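For readers who haven't seen PVQ before, here is a rough sketch of the gain-shape idea in TypeScript. This is not the Daala or libaom code; the pulse count and companding exponent are made-up illustration values. The point is that coding the gain on a compressive curve yields the activity masking described above without any side information.

```typescript
// Rough sketch of gain-shape quantization in the spirit of PVQ.
// Not the Daala/AV1 implementation; constants are illustrative only.

// Greedily place `k` unit pulses so the integer vector y points as close
// as possible to the direction of x (i.e. maximizes cos^2 of the angle).
function quantizeShape(x: number[], k: number): number[] {
  const y = new Array<number>(x.length).fill(0);
  let corr = 0;    // running <|x|, y>
  let energy = 0;  // running ||y||^2
  for (let p = 0; p < k; p++) {
    let best = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < x.length; i++) {
      const c = corr + Math.abs(x[i]);   // correlation if a pulse is added at i
      const e = energy + 2 * y[i] + 1;   // ||y||^2 grows by 2*y_i + 1
      const score = (c * c) / e;
      if (score > bestScore) { bestScore = score; best = i; }
    }
    corr += Math.abs(x[best]);
    energy += 2 * y[best] + 1;
    y[best] += 1;
  }
  return y.map((v, i) => (x[i] < 0 ? -v : v)); // restore the signs of x
}

// Quantize one band as (companded gain, shape). Because the gain is coded on a
// compressive curve, high-energy (textured) bands get coarser quantization,
// which is the side-information-free activity masking mentioned above.
function pvqQuantizeBand(band: number[], pulses = 16, gainExp = 0.66) {
  const gain = Math.hypot(...band);
  const codedGain = Math.round(Math.pow(gain, gainExp)); // quantized gain index
  const shape = quantizeShape(band, pulses);
  return { codedGain, shape };
}

// Example: a flat band and a "textured" band get the same shape budget,
// but the textured band's gain is coded more coarsely.
console.log(pvqQuantizeBand([1, 0.5, -0.2, 0.1]));
console.log(pvqQuantizeBand([40, -35, 20, -10]));
```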
All those requests to revert the removal are funny: you want Chrome to re-add jxl behind a feature flag? Doesn't seem very useful.
Also, all those Chrome offshoots (Edge, Brave, Opera, etc) could easily add and enable it to distinguish themselves from Chrome ("faster page load", "less network use") and don't. Makes me wonder what's going on...
> you want Chrome to re-add jxl behind a feature flag? Doesn't seem very useful.
Chrome has a neat feature where some flags can be enabled by websites, so that websites can choose to cooperate in testing. They never did this for JXL, but if they re-added JXL behind a flag, they could do so with that kind of testing enabled. Then they could get real data from websites actually using it, without committing to supporting it if it isn't useful.
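If I'm reading this right, that's Chrome's origin trials mechanism. A minimal sketch of how a site opts in, assuming a hypothetical trial token (JXL never had an origin trial, so the value here is purely a placeholder):

```typescript
// Sketch only: a server opting a site into a Chrome origin trial by sending
// an Origin-Trial response header. The token value is a placeholder; JPEG XL
// never had an origin trial, so there is no real token to put here.
import http from 'node:http';

const ORIGIN_TRIAL_TOKEN = '<token from the origin trials console>'; // placeholder

http.createServer((req, res) => {
  res.setHeader('Origin-Trial', ORIGIN_TRIAL_TOKEN);
  res.setHeader('Content-Type', 'text/html');
  res.end('<img src="/photo.jxl" alt="experimental format test">');
}).listen(8080);
```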
> Also, all those Chrome offshoots (Edge, Brave, Opera, etc) could easily add and enable it to distinguish themselves from Chrome ("faster page load", "less network use") and don't. Makes me wonder what's going on...
Edge doesn't use Chrome's own codec support; it uses the Windows media framework. JXL is being added to it next year.
It can; that's why you didn't just say "re-add jxl" but had to mention the flag. "Re-add" has no flag implication; that pedantic attempt to constrain it is something you've made up. That's not what people want; just read those linked issues.
It has a flag implication because jpeg-xl never came without being hidden behind a flag. Nothing was taken away from ordinary users at any point in time.
And I suppose the Chrome folks have the telemetry to know how many people set that damn flag.
> “On display? I eventually had to go down to the cellar to find them.”
> “That’s the display department.”
> “With a flashlight.”
> “Ah, well, the lights had probably gone.”
> “So had the stairs.”
> “But look, you found the notice, didn’t you?”
> “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”
People are asking for the old thing to be re-added, but without the flag that sabotaged it. It is the same as "you took that away from us, undo that!" Removing a flag does not turn it into a magical, mystical new thing that has to be built from scratch. This is silly. The entire point of having flags is to provide a testing platform for code that may one day have the flag removed.
Actual users, perhaps. Or maybe concern trolls paid by a patent holder who's trying to prepare the ground for a patent-based extortion scheme. Or maybe Jon Sneyers with an army of sock puppets. These "actual users" are just as real to me as Chrome's telemetry.
That said: these actual users didn't demonstrate any hacker spirit or interest in using JXL in situations where they could. Where's the widespread use of jxl.js (https://github.com/niutech/jxl.js) to demonstrate that there are actual users desperate for native codec support? (Aside: jxl.js is based on Squoosh, which is a product of GoogleChromeLabs.) If JXL is sooo important, surely people would use whatever workaround they can, whether or not that convinces the Chrome team, simply because they benefit from using it, no?
Instead, all I see is people _not_ exercising their freedom and initiative to support that apparently-best-thing-since-sliced-bread format, but whining that Chrome is oh-so-dominant and forces its choice of codecs upon everybody else.
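The workaround wouldn't even need to be elaborate. Here is a minimal sketch of "use JXL natively where supported, decode in WASM otherwise"; decodeJxlToBlob stands in for whatever the WASM decoder actually exposes (jxl.js's real interface may differ), and the probe URL is a placeholder:

```typescript
// Sketch of "use JXL natively where you can, decode in WASM where you can't".
// decodeJxlToBlob() stands in for a real WASM decoder API (jxl.js's actual
// interface may differ); PROBE_JXL_URL is a tiny JXL image you host yourself.
declare function decodeJxlToBlob(data: Uint8Array): Promise<Blob>;
const PROBE_JXL_URL = '/probe.jxl';

function supportsNativeJxl(): Promise<boolean> {
  return new Promise(resolve => {
    const img = new Image();
    img.onload = () => resolve(true);
    img.onerror = () => resolve(false);
    img.src = PROBE_JXL_URL;
  });
}

async function setJxlImage(el: HTMLImageElement, jxlUrl: string, legacyUrl: string) {
  if (await supportsNativeJxl()) {
    el.src = jxlUrl;                     // the browser decodes it natively
    return;
  }
  try {
    const resp = await fetch(jxlUrl);
    const blob = await decodeJxlToBlob(new Uint8Array(await resp.arrayBuffer()));
    el.src = URL.createObjectURL(blob);  // decoded by the WASM polyfill
  } catch {
    el.src = legacyUrl;                  // last resort: ship a JPEG/PNG/WebP
  }
}
```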
We have been active on WASM implementations of JPEG XL, but it doesn't really work with progressive rendering, HDR canvas was still not supported, thread pools and SIMD had hiccups, etc. The browser wasn't, and still isn't, ready for high-quality codecs as modules. We keep giving gentle guidance on these issues, but at heart our small team is an algorithm and data format research group, not a technology lobbying organization, so we haven't yet been successful there.
In the current scenario, JPEG XL users are most likely to emerge outside of the web, in professional and prosumer photography, and then we will have, unnecessarily, two different format worlds: JPEG XL for photography processing and a variety of web formats, each with their own problems.
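As a small illustration of one of those hurdles: multi-threaded WASM decoding only works when the page is cross-origin isolated, which many sites are not. A minimal browser-side check (a sketch, not part of any libjxl tooling):

```typescript
// Sketch: check whether a page can even run a multi-threaded WASM decoder.
// WASM threads need SharedArrayBuffer, which browsers only expose when the
// page is cross-origin isolated (COOP/COEP headers set by the server).
function wasmThreadsUsable(): boolean {
  return typeof SharedArrayBuffer !== 'undefined' &&
         (globalThis as { crossOriginIsolated?: boolean }).crossOriginIsolated === true;
}

if (!wasmThreadsUsable()) {
  console.warn('Not cross-origin isolated: falling back to single-threaded decoding.');
}
```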
I tried jxl.js; it was very finicky on iPad: out-of-memory errors [0] and blurry images [1]. In the end I switched to a proxy server that re-encoded JXL images into PNG.
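A rough sketch of what such a proxy might look like (not the commenter's actual setup; assumes Node.js 18+ with libjxl's djxl CLI on PATH, and a placeholder upstream origin):

```typescript
// Rough sketch of a JXL-to-PNG re-encoding proxy (not the commenter's setup).
// Assumes Node.js 18+ and libjxl's `djxl` CLI on PATH; UPSTREAM is a placeholder.
import http from 'node:http';
import path from 'node:path';
import { tmpdir } from 'node:os';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { mkdtemp, readFile, writeFile, rm } from 'node:fs/promises';

const run = promisify(execFile);
const UPSTREAM = 'https://images.example.com'; // placeholder image origin

http.createServer(async (req, res) => {
  try {
    // Fetch the original .jxl from the upstream host.
    const upstream = await fetch(UPSTREAM + (req.url ?? '/'));
    const jxl = Buffer.from(await upstream.arrayBuffer());

    // Decode it to PNG with djxl in a temporary directory.
    const dir = await mkdtemp(path.join(tmpdir(), 'jxl-'));
    const inPath = path.join(dir, 'in.jxl');
    const outPath = path.join(dir, 'out.png');
    await writeFile(inPath, jxl);
    await run('djxl', [inPath, outPath]);
    const png = await readFile(outPath);
    await rm(dir, { recursive: true, force: true });

    res.writeHead(200, { 'Content-Type': 'image/png' });
    res.end(png);
  } catch {
    res.writeHead(502);
    res.end('decode failed');
  }
}).listen(8080);
```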
Both issues seem to have known workarounds that could have been integrated to support JXL on iOS properly, sooner than waiting on Apple (who apparently added JXL in Safari 17), so if anything that's a success story for "provide polyfills to support features without relying on the browser vendor."