JPEG XL and the Pareto Front (cloudinary.com)
497 points by botanical 10 months ago | 319 comments



Pay attention to just how good WebP is at _lossless_ comparison though!

I've always thought that one as flying under the radar. Most get stuck on WebP not offering tangible enough benefits (or even worse) over MozJPEG encoding, but WebP _lossless_ is absolutely fantastic for performance/speed! PNG or even OptiPNG is far worse. And very well supported online now, and leaving the horrible lossless AVIF in the dust too of course.


Lossless WebP is very good indeed. The main problem is that it is not very future-proof since it only supports 8-bit. For SDR images that's fine, but for HDR this is a fundamental limitation that is about as bad as GIF's limitation to 256 colors.


Lossless WebP is also stuck with a low axis limit of 16383.

It is a good format when you can use it, but JPEG XL almost always compresses better anyway, and lacks color space and dimension limits.


Neither this limit nor the 4 GB limit existed in my first version. They were artificially introduced to "match" the properties of lossy WebP. We could have done better there.


Were you involved in creating WebP? If so that's super cool! Why would they want to match webp's lossy compression though? To make it more uniform? And do you know why lossy WebP had such a limitation in the first place? Thank you!


I designed the WebP lossless format, wrote the spec, and implemented the first encoder.

The constraint was in WebP lossy to facilitate exact compatibility with VP8 specification and hoping that it would allow hardware decoding and encoding of WebP images using VP8 hardware.

Hardware encoding and decoding were never used, but the limitation stuck.

There was no serious plan to do hardware lossless, but the constraint was copied for "reducing confusion".

I didn't and don't like it that much as more PNG images couldn't be represented as WebP lossless as a result of that.


Wow, that really sucks. I appreciate the explanation as well as your frustration with it.

My desktop had a mixture of PNG and WebP files solely because of this limitation. I use the past tense because they've now all been converted to JPEG XL lossless.


Another area of incompatibility with PNG was 16-bit coding. I had a plan to add it by simply sending two 8-bit images, where the second image, containing the 8 least significant bits, would be predicted to be the same as the first. That way it would not be perfect, but it would be 100x better than how PNG deals with it. WebP had another plan for layers and tiles that was never realized, and as a consequence WebP is stuck at 8 bits.
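A minimal sketch of the two-plane idea described above, in plain Python. The function names are hypothetical and nothing here is part of WebP; it only shows that splitting 16-bit samples into a high-byte and a low-byte plane is lossless, and that predicting the low plane from the high plane leaves a zero residual when the 16-bit value was made by replicating an 8-bit one.

```python
def split_16bit(samples):
    """Split 16-bit samples into a high-byte plane and a low-byte plane."""
    high = [s >> 8 for s in samples]
    low = [s & 0xFF for s in samples]
    return high, low

def low_plane_residual(high, low):
    """Residual left when the low plane is predicted to equal the high plane (mod 256)."""
    return [(l - h) & 0xFF for h, l in zip(high, low)]

def merge_16bit(high, low):
    """Rebuild the original 16-bit samples from the two 8-bit planes."""
    return [(h << 8) | l for h, l in zip(high, low)]

# 8-bit values widened to 16 bits by bit replication (value * 257).
replicated = [v * 257 for v in (0, 17, 128, 255)]

high, low = split_16bit(replicated)
residual = low_plane_residual(high, low)   # all zeros: the prediction is exact here
restored = merge_16bit(high, low)          # identical to the input
```

For genuinely 16-bit data the residual is not zero, which is presumably the "not perfect" part of the plan.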


Ah, I didn't know this and I agree this is a fairly big issue and increasingly so over time. I think smartphones in particular hastened the demand for HDR quite a bit, what was once a premium/enthusiast feature you only had to explicitly buy into.


HDR is also important for medical imaging applications (which have been moving to web)


I haven't run across websites that serve up HDR images, and I am not sure I would notice the difference. WebP seems appropriately named and optimized for image delivery on the web.

Maybe you are thinking of high bit depth for archival use? I can see some use cases there where 8-bit is not sufficient, though personally I store high bit depth images in whatever raw format was produced by my camera (which is usually some variant of TIFF).


8-bit can have banding even without "HDR". Definitely not enough. 10 bit HDR video is becoming more common, and popularity for images will follow. Adoption is hampered by the fact that Windows has bad HDR support, but it all works plenty well on macOS and mobile platforms.


Unfortunately, Linux HDR is pretty much completely absent. That said, Wayland slowly looks like it's getting there.



X11 supports 10-bit SDR… which Wayland doesn’t


It would be nice if everything was non tone-mapped HDR all the time for software, and the windowing system or the monitor would do the (local) tone mapping.


No, you don't want that. You want your windowing system & UI/graphics layers working in display native range (see eg, Apple's EDR). The consistency of appearance of the SDR range is too important to leave it up to unknown tone mapping.

Also it's too power expensive to have your display in HDR if you're only showing SDR UIs, which not only is the common reality but will continue to be so for the foreseeable future.


I think both approaches have advantages.

Requiring every game, photo viewing app, drawing program, ... every application to decide how to do its HDR->SDR seems unnecessary complication due to poor abstractions.

The locality of local tone mapping (the ideal approach to HDR->SDR mapping) would expose the window boundaries. Two photos or two halves of the same photo in different windows (as opposed to being in the same window) would create an artificial discontinuity, because the correction fields would be artificially contained within each window instead of spanning the user's visual field as best as possible.

Every local tone mapping needs to make an assumption of the surrounding colors: is the window surrounded by black, gray, colored or bright light should influence how the tone mapping is done at borders. This information is not available for an app: it can only be done at the windowing system level or in the monitor.

The higher the quality of the HDR->SDR mapping in a system, the more opportunity there is to limit the maximum brightness, and thus also the opportunity for energy savings.


> Requiring every game [..] to decide how to do its HDR->SDR seems unnecessary complication due to poor abstractions.

Games already do this regardless and it's part of their post processing pipelines that also govern aspects of their environmental look.

What's missing is HDR->Display which is why HDR games have you go through a clunky calibration process to attempt to reverse out what the display is doing to the PQ signal, when what the game actually wants is what macOS/iOS and now Android just give them - the exact amount of HDR headroom which the display doesn't manipulate.

As for your other examples, being able to target the display native range doesn't mean you have to. Every decent OS has a compositor API that lets you offload this to the system.


Yes, if local tone mapping is done by the operating system (windowing system or the monitor itself), then there is a chance that borders of windows are done appropriately within their spatial context.


X11 does but many X11 applications and window managers don't.


For things where you actually care about lossless you probably also don't care about HDR.

HDR is (or can be) good for video & photography, but it's absolutely ass for UI.

Besides, you can just throw a gainmap approach at it if you really care. Works great with jpeg, and gainmaps are being added to heif & avif as well, no reason jpegxl couldn't get the same treatment. The lack of "true 10-bit" is significantly less impactful at that point


Gainmaps don't solve 8-bit mach banding. If anything you get more banding: two bandings, one banding from each of the two 8-bit fields multiplied together.

Gainmaps "solve" the problem of computing a local tone mapping by declaring that it needs to be done at server side or at image creation time rather than at viewing time.

My prediction: Gainmaps are going to be too complex of a solution for us as a community and we are going to find something else that is easier. Perhaps we could end up standardizing a small set of local tone mapping algorithms applied at viewing time.


> Gainmaps "solve" the problem of computing a local tone mapping by declaring that it needs to be done at server side or at image creation time rather than at viewing time.

Which was already the case. A huge amount of phone camera quality is from advanced processing, not sensor improvements. Trying to get that same level of investment in all downstream clients is both unrealistic and significantly harder. A big aspect of why DolbyVision looks better is just Dolby forces everyone to use their tone mapping, and client consistency is critical.

Gainmaps also avoid the proprietary metadata disaster that plagues HLG/PQ video content.

> If anything you get more banding: two bandings, one banding from each of the two 8-bit fields multiplied together.

The math works out such that you get the equivalent of something like 9 bits of depth but you're also not wasting bits on colors and luminance ranges you aren't using like you are with bt2020 hlg or PQ


I didn't try it out, but I don't see the 9 bit coming. I feel it gives about 7.5 bits.

Mixing two independent quantization sources will lead to more error.

Some decoding systems, such as traditional JPEG, do not specify results exactly, so bit-perfect quantization-aware compensation is not going to be possible.


The context of this thread is lossless webp, there aren't any compression artifacts to deal with.

Go try the math out, the theoretical is higher than 9 bits but 8.5-9 bit equivalent is very achievable in practice. With two lossless 8 bit sources this is absolutely adequate for current displays, especially since again you're not wasting bits on things like PQ's absurd range.

Will this be added to webp? Probably not, it seems like a dead format regardless. But 8 bit isn't the death knell for HDR support as is otherwise believed.


This depends on the quality you are aiming for.

You just cannot reach the best quality with 8 bits, not in SDR, not in HDR, not with gainmap HDR. Sometimes you don't care for a particular use case, and then 8 bits becomes acceptable. Many use cases remain where degradation by a compression system is unacceptable or creates too many complications.


> but it's absolutely ass for UI.

uhh, no it isn't?

And gainmaps suck, take lots of space, don't reduce banding. Even SDR needs 10-bit in a lot of situations to not have banding.


> > but it's absolutely ass for UI.

> uhh, no it isn't?

Find me a single example of a UI in HDR for the UI components, not for photos/videos.

> Even SDR needs 10-bit in a lot of situations to not have banding.

You've probably been looking at 10-bit HDR content on an 8-bit display panel anyway (true 10-bit displays don't exist in mobile yet, for example). 8-bits works fine with a bit of dithering


> 8-bits works fine with a bit of dithering

Yes, but the place where you want dithering is in the display, not in the image. Dithering doesn't work with compression because it's exactly the type of high frequency detail that compression removes. It's much better to have a 10 bit image which makes the information low frequency (and lets the compression do a better job since a 10 bit image will naturally be more continuous than an 8 bit image since there is less rounding error in the pixels), and let the display do the dithering at the end.
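A rough illustration of the rounding argument above, in plain Python. The ramp and its length are made-up values; the sketch just counts how many distinct code values a gentle gradient lands on at 8 and at 10 bits. Fewer codes over the same span means wider flat bands, which is what shows up as banding.

```python
def quantize(value, bits):
    """Round a value in [0, 1] to the nearest integer code at the given bit depth."""
    max_code = (1 << bits) - 1
    return round(value * max_code)

# A gentle gradient, e.g. a patch of sky: 10% of the full range over 2000 pixels.
ramp = [0.40 + 0.10 * i / 1999 for i in range(2000)]

levels_8 = len({quantize(v, 8) for v in ramp})     # distinct codes at 8 bits
levels_10 = len({quantize(v, 10) for v in ramp})   # distinct codes at 10 bits

# Average width, in pixels, of each flat band.
band_width_8 = len(ramp) / levels_8
band_width_10 = len(ramp) / levels_10
```

At 10 bits the same ramp gets roughly four times as many steps, so each band is roughly a quarter as wide before any dithering is applied.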


Gainmaps only take a lot of space if implemented poorly in the software creating the gainmap. On Pixel phones they take up 2% of the image size, by being stored in quarter resolution and scaling back up during decoding, which works fine on all gainmap-supported devices. They're more prone to banding, but it's definitely not a problem with all images.


I guess I should say, take up lots of space unless you're OK with lower quality. In any case, gainmaps are entirely unrelated to the 8 bit vs 10 bit question. The more range you have (gamut or brightness) the worse 8 bit is, regardless of whether you're using a gainmap or not. And you can use gainmaps with 10 bit images.


An issue with lossless WebP is that it only supports (A)RGB and encodes grayscale via hacks that aren't as good as simply supporting monochrome.

If you compress a whole manga, PNG (via oxipng, optipng is basically deprecated) is still the way to go.

Another something not mentioned in here is that lossless JPEG2000 can be surprisingly good and fast on photographic content.


Just tried it on random manga page

- OxiPNG - 730k

- webp lossless max effort - 702k

- avif lossless max effort - 2.54MB (yay!)

- jpegxl lossless max effort - 506k (winner!)


Probably depends on the manga itself. Action manga probably don't compress as well as more dialogue-heavy works.


I would, at a first blush, disagree with that characterization? Dialogue equals more fine-grained strokes and more individual, independent “zones” to encode.


Now I'm really curious. If anyone has more time than me they could do some sample compressions and I'd be interested in the results.


I wonder if the text would be consistent enough for JXL's "patches" feature to work well.


IIRC the way you encode grayscale in WebP is a SUBTRACT_GREEN transform that makes the red and blue channel 0 everywhere, and then use a 1-element prefix code for R and B, so the R and B for each pixel take zero bits. Same idea with A for opaque images. Do you know why that's not good enough?
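A small sketch of the subtract-green idea as described in this comment, in plain Python with hypothetical function names. It is not the libwebp code, only the arithmetic: green is subtracted from red and blue modulo 256, so a grayscale pixel (R = G = B) ends up with zero in both channels, and adding green back restores it exactly.

```python
def subtract_green(pixel):
    """Forward transform: replace R and B with (R - G) and (B - G), modulo 256."""
    r, g, b = pixel
    return ((r - g) & 0xFF, g, (b - g) & 0xFF)

def add_green(pixel):
    """Inverse transform: add G back to the stored R and B, modulo 256."""
    r, g, b = pixel
    return ((r + g) & 0xFF, g, (b + g) & 0xFF)

gray_pixels = [(10, 10, 10), (200, 200, 200)]
transformed = [subtract_green(p) for p in gray_pixels]   # R and B become 0 everywhere
restored = [add_green(p) for p in transformed]           # back to the original pixels
```

With R and B constant, a one-symbol prefix code for those channels costs no bits per pixel, which is the point made above.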


I made a mistake there with subtract green.

If I had just added 128 to the residuals, all remaining prediction arithmetic would have worked better and it would have given 1 % more density.

This is because most related arithmetic for predicting pixels is done in unsigned 8 bit arithmetic. Subtract green moves such predictions to often cross the 0 -> 255 boundary, and then averaging, deltas etc make little sense and add to the entropy.
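To make the wraparound problem concrete, here is a toy sketch in plain Python. The averaging predictor is a simplification chosen for illustration, not the actual WebP predictor set. Two neighbours have small red-minus-green differences of -1 and +1; stored as unsigned bytes they sit on opposite sides of the 0 -> 255 boundary, so their average is a poor prediction. With a +128 offset the same differences stay next to each other.

```python
def wrap8(x):
    """Reduce an integer to an unsigned 8-bit value."""
    return x & 0xFF

def average(a, b):
    """Average two unsigned 8-bit values, as a simple averaging predictor would."""
    return (a + b) // 2

def signed_error(actual, predicted):
    """Prediction error modulo 256, mapped to the range [-128, 127]."""
    return (actual - predicted + 128) % 256 - 128

deltas = (-1, 1)   # red-minus-green differences on two neighbouring pixels
next_delta = 0     # the pixel being predicted has no difference at all

# Plain subtract green: the differences are stored as 255 and 1.
plain = [wrap8(d) for d in deltas]
plain_pred = average(*plain)                                  # 128
plain_err = signed_error(wrap8(next_delta), plain_pred)       # -128, the worst case

# Same differences with 128 added: stored as 127 and 129.
offset = [wrap8(d + 128) for d in deltas]
offset_pred = average(*offset)                                # 128
offset_err = signed_error(wrap8(next_delta + 128), offset_pred)   # 0
```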


Can you explain why?


I edited the answer into the previous message for better flow.


Thankfully the following comment explains more than I know, I was speaking purely from empiric experience.


Then you can't know that any difference you see is because of how WebP encodes grayscale.


Thank you! <3

WebP also has a near-lossless encoding mode based on the lossless WebP specification that is mostly unadvertised, but should be preferred over real lossless in almost every use case. Often you can halve the size without additional visible loss.


Is this mode picked automatically in "mixed" mode?

Unfortunately, that option doesn't seem to be available in gif2webp (I mostly use WebP for GIF images - as animated AVIF support is poor on browsers and that has an impact in interoperability)


Not in gif2webp, no. It is available in img2webp as a global (not per-frame) option. It looks like when you pass it, it will be used for all lossless encoding, including when "-mixed" tries the lossless mode.


I don't know


do you know why Jon didn't compare near-lossless in the "visually lossless" part?


WebP near-lossless is very far in compression density against that kind of visually lossless. Still 2-3x more bits I think. No reason to compare. The near-lossless (at settings 60 and 80) is closer to pixel perfect no matter how much you zoom, whereas Jon's "visually lossless" is what I'd rather call usual very high quality lossy without pixel precision guarantees.


I'm actually shocked by how poorly the lossless versions of new image formats (AVIF and HEIC) perform compared to the venerable png.


Looks like there are more savings coming on lossless AVIF: https://www.reddit.com/r/AV1/comments/1b3lh08/comment/kstmbr...


Also more savings will come for JPEG XL soon.

Possibly mostly focused on medium and low quality.


will those require a bitstream change too?


No bitstream changes. In JPEG XL we have a 3-bit control field for each 8x8 square to guide which pixels participate in smoothing.

The heuristics (here https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_ar_co...) for choosing that value are quite primitive and produce only two values: no smoothing and some smoothing (values 0 and 4).

If we replace those heuristics with a search that tries out which of the values is closest to the original, we should get better quality, especially at the lowest bitrates where smoothing is important.
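The search described above could look roughly like the toy below, in plain Python. Everything in it is an assumption made for illustration: `toy_smooth` is a made-up box-blur blend standing in for the real JPEG XL smoothing filter, and the 8x8 blocks are synthetic. Only the shape of the idea is taken from the comment: try every value of the 3-bit field and keep the one whose result is closest to the original.

```python
def toy_smooth(block, control):
    """Hypothetical stand-in filter: blend each pixel with its 3x3 neighbourhood mean.
    control is the 3-bit value 0..7; 0 means no smoothing. NOT the real JPEG XL filter."""
    n = len(block)   # assumes a square block
    weight = control / 7.0
    out = []
    for y in range(n):
        row = []
        for x in range(n):
            # Clamp coordinates so border pixels reuse their nearest neighbours.
            window = [
                block[min(max(y + dy, 0), n - 1)][min(max(x + dx, 0), n - 1)]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ]
            mean = sum(window) / 9.0
            row.append((1.0 - weight) * block[y][x] + weight * mean)
        out.append(row)
    return out

def squared_error(a, b):
    """Sum of squared differences between two blocks of equal size."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def best_control(original, decoded):
    """Try all eight values of the 3-bit field; keep the one closest to the original."""
    return min(range(8), key=lambda c: squared_error(toy_smooth(decoded, c), original))

# Synthetic 8x8 test blocks.
flat = [[100.0] * 8 for _ in range(8)]
noisy = [[100.0 + (5 if (x + y) % 2 else -5) for x in range(8)] for y in range(8)]
edge = [[0.0] * 4 + [255.0] * 4 for _ in range(8)]

smooth_choice = best_control(flat, noisy)   # noise on a flat area: some smoothing wins
sharp_choice = best_control(edge, edge)     # clean sharp edge: no smoothing wins
```

An exhaustive search like this costs eight filter evaluations per block, which is the kind of extra encoder effort that does not require any bitstream change.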


It is unlikely that there will be any bitstream changes in JPEG XL. There is still a lot of potential for encoder improvements within the current bitstream, both for lossy and for lossless.


Last time I tested cwebp it did not handle (PNG) color spaces correctly so the result of a supposedly lossless conversion actually looked different from the original. What good is better lossless compression if it is not actually lossless visually?


But all compression in the graph is lossless. The whole article is basically about JPEG XL lossless. Am I missing something?


About 3/4 of the article? There is a lot of analysis on lossy compression too after the lossless part.


I really like webp. Sadly there's still a lot of applications that dont work with it (looking at discord)


It is ironic you said this because when I disabled WebP in my browser because it had a huge security vulnerability, Discord was the only site which broke and didn't immediately just serve me more reasonable image formats.


At the very low quality settings, it's kinda remarkable how jpeg manages to keep a sharper approximation of detail that preserves the holistic quality of the image better in spite of the obvious artifacts making it look like a mess of cubism when examined close. It's basically converting the image into some kind of abstract art style.

Whereas jxl and avif just become blurry.


It is because JPEG is given 0.5 bits per pixel, where JPEG XL and AVIF are given around 0.22 and 0.2.

These images attempt to be at equal level of distortion, not at equal compression.

Bpps are reported beside the images.
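For anyone who wants to translate the bpp figures above into file sizes, a quick sketch in plain Python. The 1000x1000 image size is a made-up example, and the helper names are hypothetical.

```python
def bits_per_pixel(file_size_bytes, width, height):
    """Bits per pixel of a compressed file with the given dimensions."""
    return file_size_bytes * 8 / (width * height)

def size_in_bytes(bpp, width, height):
    """File size in bytes implied by a bits-per-pixel figure."""
    return bpp * width * height / 8

width, height = 1000, 1000   # hypothetical 1-megapixel image

jpeg_bytes = size_in_bytes(0.5, width, height)    # 62,500 bytes
jxl_bytes = size_in_bytes(0.22, width, height)    # about 27,500 bytes
avif_bytes = size_in_bytes(0.2, width, height)    # about 25,000 bytes

jpeg_vs_jxl = jpeg_bytes / jxl_bytes              # JPEG file is over 2x larger
```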

In practice, quality 65 is rare on the internet and only used by the lowest-quality-tier sites. Quality 75 seems to be the usual poor quality and quality 85 the average. I use quality 94 yuv444 or better when I need to compress.


You refer to this? https://res.cloudinary.com/jon/qp-low.png

Bitrates are in the left column, jpg low quality is the same size as jxl/avif med-low quality (0.4bpp), so you should compare the bottom left picture to the top mid and right pictures.


Well, that's because JPEG is still using about twice as many bits per pixel, making the output size significantly larger.

Don't get swept away by false comparisons, JXL and AVIF look significantly better if you give them twice as much filesize to work with as well.


JPEG bitrates are higher, so all it means is that SSIMULACRA2 is the wrong metric for this test. It seems that SSIMULACRA2 heavily penalizes blocking artifacts but doesn't much care about blur. I agree that the JPEG versions look better at the same SSIMULACRA2 score.


Humans generally tend to prefer smoothing over visible blocking artifacts. This is especially true when a direct comparison to the original image is not possible. Of course different humans have different tastes, and some do prefer blocking over blur. SSIMULACRA2 is based on the aggregated opinions of many thousands of people. It does care more about blur than metrics like PSNR, but maybe not as much as you do.


Ideally one would use human ratings.

The author of the blog post did exactly that in a previous blog post:

https://cloudinary.com/labs/cid22/plots

Human ratings are expensive and clumsy so people often use computed aka objective metrics, too.

The best OSS metrics today are butteraugli, dssim and ssimulacra. The author is using one of them. None of the codecs was optimized for these metrics, except jpegli partially.


Yes, that was my takeaway from this that JPEG keeps edge sharpness really well (e.g. the eyelashes) while the jxl and avif smooth all detail out of the image.


No, JXL and AVIF keep the same level of edge sharpness but without all the blocking artifacts when given the same amount of bits per pixel as the lowest-quality jpeg.


I do not understand why this article focuses so much on encode speed, but for decode, which I believe represents 99% of usage in this web-connected world, gives only a cursory...

> Decode speed is not really a significant problem on modern computers, but it is interesting to take a quick look at the numbers.


Anything more than 100 MB/s is considered "enough" for the internet because at that point your bottleneck is no longer decoding. Most modern compression algorithms are asymmetric, that is, you can spend much more time on compression without significantly affecting the decompression performance, so it is indeed less significant once the base performance is achieved.


During the design process of pik/jpeg xl I experimented with decode speed as a personal experiment, to form an opinion about this. I tried a special version of Chrome that artificially throttled the image decoding. Once the decoding speed got to around 20 megapixels per second, the benefit of additional speed was difficult to notice. I tried throttling at 2, 20 and 200 megapixels per second. This naturally depends on image sizes and uses too.

There was a much easier-to-notice impact from progressive images and even sequential images displayed in a streaming manner during the download. As a rule of thumb, sequential top-to-bottom streaming feels 2x faster than waiting for the whole image before rendering, and progressive feels 2x faster than sequential streaming.


Decoding speed is important for battery time.

If a new format drains battery twice as fast, users don't want it.


This matters way more for video (where you are decoding 30 images per second continuously) than it does for still images. For still images, the main thing that drains your battery is the display, not the image decoding :)

But in any case, there are no _major_ differences in decoding speed between the various image formats. The difference caused by reducing the transfer size (network activity) and loading time (user looking at a blank screen while the image loads) is more important for battery life than the decoding speed itself. Also the difference between streaming/progressive decoding and non-streaming decoding probably has more impact than the decode speed itself, at least in the common scenario where the image is being loaded over a network.


> This matters way more for video (where you are decoding 30 images per second continuously) than it does for still images.

OTOH video decoding is highly likely to be hardware accelerated on both laptops and smartphones.

> For still images, the main thing that drains your battery is the display, not the image decoding :)

I wonder if it becomes noticeable on image-heavy sites like tumblr, 500px, etc.


Assuming the websites are using images of appropriate dimensions (that is, not using huge images and relying on browser downscaling, which is a bad practice in any case), you can quite easily do the math. A 1080p screen is about 2 megapixels, a 4K screen is about 8 megapixels. If your images decode at 50 Mpx/s, that's 25 full screens (or 6 full screens at 4K) per second. You need to scroll quite quickly and have a quite good internet connection before decode speed will become a major issue, whether for UX or for battery life. Much more likely, the main issue will be the transfer time of the images.
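The math above is easy to reproduce; a quick sketch in plain Python with a hypothetical helper name. The comment rounds 1080p to 2 megapixels and 4K to 8 megapixels, so the exact pixel counts give slightly different figures.

```python
def screens_per_second(decode_mpx_per_s, width, height):
    """How many full screens of image data can be decoded per second."""
    return decode_mpx_per_s * 1_000_000 / (width * height)

full_hd = screens_per_second(50, 1920, 1080)   # about 24 screens per second
uhd_4k = screens_per_second(50, 3840, 2160)    # about 6 screens per second
```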


Aside from all points you raise I find the discussion about battery life a little absurd in light of how negligible it is compared to the impact of poorly written JavaScript in the context of web apps. For example, I noticed this morning that my bank somehow pins one CPU thread to 100% usage whenever I have their internet banking site open, even when nothing is being done. AFAIK there is no cryptocurrency nonsense going on, and the UI latency is pretty good too, so my best guess is that their "log out automatically after ten minutes of inactivity" security feature is implemented through constant polling.


The parent was talking about battery life.

Mobile phone CPUs can switch between different power states very quickly. If the image decoding is fast, they can sleep more.


And the one you're replying to is also talking about battery life. The energy needed to display an image for a few seconds is probably higher than the energy needed to decode it.


Agreed. For web use they all decode fast enough. Any time difference might be in progression or streaming decoding, vs. waiting for all the data to arrive before starting to decode.

For image gallery use of camera resolution photographs (12-50 Mpixels) it can be more fun to have 100+ Mpixels/s, even 300 Mpixels/s.


I wasn't able to convince myself about that when approaching that question with back-of-the-envelope calculations, published research and prototypes.

Very few applications are constantly decoding images. Today a single image is often decoded in a few milliseconds, but watched 1000x longer. If you 10x or even 100x energy consumption of image decoding, it is still not going to compete with display, radio and video decoding as a battery drain.


When you actually want good latency, using the throughput as a metric is a bit misguided.


As others pointed out, that's why JPEG XL's excellent support for progressive decoding is important. Other formats do not support progressive decoding at all or made it optional, so it cannot even be compared at this point. In other words, the table can be regarded as evidence that you can have both progressive decoding and performance at once.


There is no "throughput vs latency" here, there is no "start-up time" for decoding an image that's already in RAM. If a decoder decodes at 100MiB/s, and a picture is 10MiB, it's decoded in 0.1 seconds. If the decoder decodes at 1 MiB/s, the same picture is decoded in 10 seconds.


This isn't true for web use cases. There, being able to start decoding when the first bits arrive rather than the last can make a difference (which JPEG-XL does a lot better than other image formats because it first sends all the DC coefficients, which lets the website display a low resolution version of the image and then fill in the detail as the rest of the image comes through).


That's about progressive decoding, which isn't what this benchmark is focusing on. When benchmarking decode speed, throughput is a perfectly good metric. If the article wanted to talk about progressive decode performance, it would require a complete redesign, not just a change in metric from "throughput" to "latency".


If you don't have progressive decoding, those metrics are essentially the same.


Is it practical to use hardware video decoders to decode the image formats derived from video formats, like AVIF/AV1 and HEIC/H264? If so that could be a compelling reason to prefer them over a format like JPEG XL which has to be decoded in software on all of today's hardware. Everything has H264 decode and AV1 decode is steadily becoming a standard feature as well.


No browser bothers with hardware decode of WebP or AVIF even if it is available. It is not worth the trouble for still images. Software decode is fast enough, and can have advantages over hw decode, such as streaming/progressive decoding. So this is not really a big issue.


No, not really - mostly because setup time and concurrent decode limitations of HW decoders across platforms tend to undermine any performance or battery gains from that approach. As far as I know, not even mobile platforms bother with it with native decoders for any format.


In some use cases the company is paying for the encoding, but the client is doing the decoding. As long as the client can decode the handful of images on the page fast enough for the human to not notice, it's fine. Meanwhile any percentage improvement for encoding can save real money.


Companies and economies who optimize the cost of clients as if it were their own can see diffuse benefits from it. Even if they consider 0.01 % of that cost it can lead to better decisions. If that will make the website load faster or allow for more realistic product pictures, it can lead to faster growth or fewer returns etc.


Real-time encoding is pretty popular, for which encoding speed is pretty important.


Because that’s Cloudinary’s use case… they literally spend millions of dollars encoding images


My server will encode 1,000,000 images itself, but each client will only decode like 10.


But you may have fifty million clients, so the total "CPU hours" spent on decoding will outlast encoding.

But the person encoding is picking the format, not the decoder.


But the server doesn't necessarily have unlimited time to encode those images. Each of those 1 million images needs to be encoded before it can be sent to a client.


That isn't saying much or anything.


The inclusion of QOI in the lossless benchmarks made me smile. It's a basically irrelevant format, that isn't supported by default by any general-public software, that aims to be just OK, not even good, yet it has a spot on one of these charts (non-photographic encoding). Neat.


GameMaker Studio has actually rather quickly jumped onto the QOI bandwagon, having 2 years ago replaced PNG textures with QOI (and added BZ2 compression on top) and found a 20% average reduction in size. So GameMaker Studio and all the games produced with it in the past 2 years or so do actually use QOI internally.

Not something a consumer knowingly uses, but also not quite irrelevant either.


That feels a little sad. If qoi had anything good (fast single-threaded decoding speed for photographic content) adding bz2 most certainly removed it. They could have just used WebP lossless and it would have been faster and smaller.


Oof that looks like a confused mess of a format.

bz2 is obsolete. It’s very slow, and not that good at compressing. zstd and lzma beat it on both compression and speed at the same time.

QOI’s only selling point is simplicity of implementation that doesn’t require a complex decompressor. Addition of bz2 completely defeats that. QOI’s poorly compressed data inside another compressor may even make overall compression worse. It could have been a raw bitmap or a PNG with gzip replaced with zstd.


And yet didn't reach the Pareto frontier! It's quite obvious in hindsight though---QOI decoding is inherently sequential and can't be easily parallelized.
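For readers unfamiliar with the term, a point is on the Pareto frontier if no other point is at least as good on every axis and strictly better on one. A minimal sketch in plain Python, using the QOI and libjxl figures quoted from the article further down this thread plus one made-up point; the function name is hypothetical.

```python
def pareto_front(points):
    """Return the names of the points not dominated by any other point.
    Each point is (name, speed_mpx_s, bpp): higher speed and lower bpp are better."""
    front = []
    for name, speed, bpp in points:
        dominated = any(
            s >= speed and b <= bpp and (s > speed or b < bpp)
            for _, s, b in points
        )
        if not dominated:
            front.append(name)
    return front

points = [
    ("QOI", 154, 17.0),                    # quoted from the article in this thread
    ("libjxl lowest effort", 427, 11.5),   # quoted from the article in this thread
    ("hypothetical slow codec", 5, 9.0),   # made-up point for illustration
]

front = pareto_front(points)   # QOI drops out: libjxl is both faster and smaller
```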


Of course it didn't, it wasn't designed to be either the fastest nor the best. Just OK and simple. Yet in some cases it's not completely overtaken by competition, and I think that's cool.

I don't believe QOI will ever have any sort of real-world practical use, but that's quite OK and I love it for it has made me and plenty of others look into binary file formats and compression and demystify it, and look further into it. I wrote a fully functional streaming codec for QOI, and it has taught me many things, and started me on other projects, either working with more complex file formats or thinking about how to improve upon QOI. I would probably never have gotten to this point if I tried the same thing starting with any other format, as they are at least an order of magnitude more complex, even for the simple ones.


>I don't believe QOI will ever have any sort of real-world practical use

Prusa (the 3d printer maker) seems to think otherwise! https://github.com/prusa3d/Prusa-Firmware-Buddy/releases/tag...


> Of course it didn't, it wasn't designed to be either the fastest nor the best. Just OK and simple. Yet in some cases it's not completely overtaken by competition, and I think that's cool.

Actually, there was a big push to add QOI to stuff a few years ago, specifically due to it being "fast". It was claimed that while it has worse compression, the speed can make it a worthy trade off.


It can be interesting if you need fast decode on low complexity, and it's an easy to improve format (-20 to -30%). Base QOI isn't that great.


As far as I understand this benchmark, JXL was using 8 CPU cores, while QOI naturally only used one. If you were to plot the graph with compute used (watts?) instead of Mpx/s, QOI would compare much better.

Also, curious that they only benchmarked QOI for "non-photographic images (manga)", where QOI fares quite badly because it doesn't have paletted mode. QOI does much better with photos.


Actually, they did try QOI for the photographic images:

> Not shown on the chart is QOI, which clocked in at 154 Mpx/s to achieve 17 bpp, which may be “quite OK” but is quite far from Pareto-optimal, considering the lowest effort setting of libjxl compresses down to 11.5 bpp at 427 Mpx/s (so it is 2.7 times as fast and the result is 32.5% smaller).

17 bpp is way outside the area shown in the graph. All the other results would've gotten squished and been harder to read, had QOI been shown.
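
A quick sanity check of the quoted figures, as a minimal sketch. The four numbers are taken from the quote above; the variable names are only for illustration.

```python
# Figures quoted from the article: throughput in Mpx/s, size in bits per pixel.
qoi_speed, qoi_bpp = 154.0, 17.0
jxl_speed, jxl_bpp = 427.0, 11.5

speedup = jxl_speed / qoi_speed            # how many times faster libjxl is
size_reduction = 1.0 - jxl_bpp / qoi_bpp   # fraction by which the output shrinks

print(f"speedup: {speedup:.2f}x")
print(f"smaller by: {size_reduction:.1%}")
```

With the rounded inputs this gives a speedup a little under 2.8x and a size reduction of roughly 32%, in line with the article's claim.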


Thanks, I missed that.

I just ran qoibench on the photos they used[1] and QOI does indeed fare pretty badly with a compression ratio of 71.1% vs. 49.3% for PNG.

The photos in the QOI benchmark suite[2] somehow compress a lot better (e.g. photo_kodak/, photo_tecnick/ and photo_wikipedia/). I guess it's the film grain with the high resolution photos used in [1].

[1] https://imagecompression.info/test_images/

[2] https://qoiformat.org/benchmark/


One does wonder how much of JXL's awesomeness is the encoder vs. the format. Its ability to make high quality, compact images just with "-d 1.0" is uncanny. With other codecs, I had to pass different quality settings depending on the image type to get similar results.


That's a very good point. At this rate of development I wouldn't be surprised if libjxl becomes x264 of image encoders.

On the other hand, libvpx has always been a mediocre encoder which I think might be the reason for disappointing performance (I mean in general, not just speed) of vp8/vp9 formats, which inevitably also affected performance of lossy WebP. Dark Shikari even did a comparison of still image performances of x264 vs vp8 [0].

[0] https://web.archive.org/web/20150419071902/http://x264dev.mu...


While WebP lossy still has image quality issues it has improved a lot over the years. One should not consider a comparison done with 2010-2015 implementations indicative of quality performance today.


I'm sure it's better now than 13 years ago, but the conclusion I got from looking at very recent published benchmark results is that lossy webp is still only slightly better than mozjpeg at low bitrates and still has worse max. PQ ceiling compared to JPEG, which in my opinion makes it not worth using over plain old JPEG even in web settings.


That matches my observations. I believe that WebP lossy does not add value when Jpegli is an option and has a hard time competing even with MozJPEG.


Pik was designed initially without quality options only to do the best there is to achieve distance 1.0.

We kept a lot of focus on visually lossless and I didn't want to add format features which would add complexity but not help at high quality settings.

In addition to modeling features, the context modeling and efficiency of entropy coding is critical at high quality. I consider AVIF's entropy coding ill-suited for high quality or lossless photography.


They've also made a JPEG encoder, cjpegli, with the same "-d 1.0" interface.


Excellent run-through of jpegli encoder here: https://giannirosato.com/blog/post/jpegli/ - wish I could find a pre-compiled terminal utility for cjpegli!


It's in the static github release files here: https://github.com/libjxl/libjxl/releases/tag/v0.10.1


They're available at https://github.com/libjxl/libjxl/releases/ for linux and windows.


Dear lord.. despite browsing and using github on a daily basis I still miss releases section sometimes! Before I saw your reply I checked the Scoop repos and sure enough, on Windows this will get you latest cjpegli version installed and added to path in one go:

scoop install main/libjxl

Note.. now that I tried it: that is really next level for an old format..!


> The way jpegli handles XYB color in a JPEG image is by applying an ICC color profile that maps the existing JPEG color channels to XYB.

That sounds like it might negatively affect compatibility with older decoders that do not handle color spaces correctly.


Jpegli can optionally use XYB. By default, as a drop-in replacement for mozjpeg or libjpeg-turbo, it doesn't.

I believe Jon has compared jpegli without XYB. If you turn XYB on, you get about 10 % more compression.

Jpegli is great even without XYB. It has many other methods for success (largely copied over from JPEG XL adaptive quantization heuristics, more precise intermediate calculations, as well as the guetzli method for variable dead-zone quantization).

disclaimer: I created the XYB colorspace, most of the JPEG XL VarDCT quality-affecting heuristics, and scoped jpegli. Zoltan (from WOFF2/Brotli fame!) did the actual implementation and made it work so well.


I have heard that it will see a proper standalone release at some point this year, but I don't know more than that.


It's in the unstable channel of nixpkgs.


It is worth noting that the JPEG XL effort produced a nice new parallelism library called Highway. This library is powering not only JPEG XL but also Google's latest Gemma AI models.


[0] for those interested in Highway.

It's also mentioned in [1], which starts off

> Today we're sharing open source code that can sort arrays of numbers about ten times as fast as the C++ std::sort, and outperforms state of the art architecture-specific algorithms, while being portable across all modern CPU architectures. Below we discuss how we achieved this.

[0] https://github.com/google/highway

[1] https://opensource.googleblog.com/2022/06/Vectorized%20and%2..., which has an associated paper at https://arxiv.org/pdf/2205.05982.pdf.


:) Thanks for the mention. Highway/vqsort TL here, happy to discuss.

PS: I used to work on JPEG XL. It is great to see these outstanding improvements, congrats to the team!


And VIPS! It seems like the best way to get portable SIMD in C++, to me.


Without taking into account whether JPEG XL shines on its own or not (which it may or may not), JPEG XL completely rocks for sure because it does this:

    .. $  ls -l a.jpg && shasum a.jpg
    ... 615504 ...  a.jpg
    716744d950ecf9e5757c565041143775a810e10f  a.jpg

    .. $  cjxl a.jpg a.jxl
    Read JPEG image with 615504 bytes.
    Compressed to 537339 bytes including container

    .. $  ls -l a.jxl
    ... 537339 ... a.jxl
But, wait for it:

    .. $  djxl a.jxl b.jpg
    Read 537339 compressed bytes.
    Reconstructed to JPEG.

    .. $  ls -l b.jpg && shasum b.jpg
    ... 615504 ... b.jpg
    716744d950ecf9e5757c565041143775a810e10f  b.jpg
Do you realize how many billions of JPEG files there are out there which people want to keep? If you recompress your old JPEG files using a lossy format, you lower their quality.

But with JPEG XL, you can save 15% to 30% and still, if you want, get your original JPG 100% identical, bit for bit.
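
If you want to script the same check, here is a minimal sketch. The helper names are made up for illustration, and it assumes you have already produced the reconstructed file with `djxl` as in the transcript above.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the SHA-1 hex digest of a file (what `shasum` prints by default)."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def is_bit_exact(original, reconstructed):
    """True if the two files have identical contents."""
    return sha1_of(original) == sha1_of(reconstructed)


def savings(original_size, compressed_size):
    """Fraction of bytes saved by the recompressed file."""
    return 1.0 - compressed_size / original_size


# Sizes from the transcript above: a.jpg and a.jxl.
print(f"{savings(615504, 537339):.1%}")
```

For this particular file the saving works out to a bit under 13%; other files will differ.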

That's wonderful.

P.S: I'm sadly on Debian stable (12 / Bookworm) which is on ImageMagick 6.9 and my Emacs uses (AFAIK) ImageMagick to display pictures. And JPEG XL support was only added in ImageMagick 7. I haven't looked more into that yet.


I managed to add that requirement to jpeg xl. I think it will be helpful to preserve our digital legacy intact without lossy re-encodings.


I'm sure that will be hugely cherished by users who take screenshots of JPEGs so they can resend them on WhatsApp :P


This particular feature might not, but if said screenshots are often compressed with JPEG XL, they will be spared the generation loss that becomes blatantly visible in some other formats: https://invidious.protokolla.fi/watch?v=w7UDJUCMTng


Maybe. But to know for sure you need to offset the image and change encoder settings.


> The new version of libjxl brings a very substantial reduction in memory consumption, by an order of magnitude, for both lossy and lossless compression. Also the speed is improved, especially for multi-threaded lossless encoding where the default effort setting is now an order of magnitude faster.

Very impressive! The article too is well written. Great work all around.


Maybe someone here will know of a website that describes each step of the jpeg xl format in detail? Unlike for traditional jpeg, I have found it hard to find a document providing clear instructions on the relevant steps, which is a shame as there are clearly tons of interesting innovations that have been compiled together to make this happen, and I'm sure the individual components are useful in their own right!


https://github.com/libjxl/libjxl/blob/main/doc/format_overvi... is a pretty detailed but good overview. The highlights are variable size DCT (up to 256x256), ANS entropy coding, and chroma from luminance prediction. https://github.com/libjxl/libjxl/blob/main/doc/encode_effort... also gives a good breakdown of features by effort level.


Thank you for sharing. This gives me a good idea of where to start looking.


Missing from the article is rav1e, which encodes AV1, and hence AVIF, a lot faster than the reference implementation aom. I've had cases where aom would not finish converting an image in a minute of waiting that rav1e would do in less than 10 seconds.


Is rav1e pareto-curve ahead of libaom pareto-curve?

Does fast rav1e look better than jpegli at high encode speeds?


rav1e is generally head to head with libaom on static images, and which one wins on the speed/quality/size frontier depends a lot on the image and settings, as much as +/- 20%. I suspect rav1e has an inefficient block size selection algorithm, so the particular shape of blocks is a make or break for it.

I’ve only compared rav1e to mozjpeg and libwebp, and at fastest speeds it’s only barely ahead.


Difficult to know without reproduction steps from the article, but I would think it behaves better than libaom for the same quality setting.

Edit: found https://github.com/xiph/rav1e/issues/2759


If Rav1e found better ways of encoding, why wouldn't the aom folks copy it in libaom?


Both rav1e and libaom have a speed setting. At similar speeds, I have not observed huge differences in compression performance between the two.


The article mentions encoding speed as something to consider, alongside compression ratio. I would argue that decoding speed is also important. A lot of the more modern formats (webp, avif etc) can take significantly more CPU cycles to decode than a plain old jpg. This can slow things down noticeably, especially on mobile.


JPEG and JXL have the benefit of (optional) progressive decoding, so even if the image is a little larger than AVIF, you may still see content faster.


Note that JPEG XL always supports progressive decoding, because the top-level format is structured in that way. The optional part is a finer-grained adjustment to make the output more suitable for specific cases.


That's great, are there any comparison graphs and benchmarks showing that in real life (similarly to this article)?


A couple of videos comparing progressive decoding of jpeg, jxl and avif:

https://www.youtube.com/watch?v=UphN1_7nP8U

https://www.youtube.com/watch?v=inQxEBn831w

There's more on the same channel, generation loss ones are really interesting.


Awesome, thanks.



Any computation-intensive media format on mobile is likely using a hardware decoder module anyway, and that most frequently includes JPEG. So that comparison is not adequate.


"computation-intensive media" = videos

Seriously, when is the last time mobile phones used hardware decoding for showing images? Flip phones in 2005?

I know camera apps use hardware encoding but I doubt gallery apps or browsers bother with going through the hardware decoding pipeline for hundreds of JPEG images you scroll through in seconds. And when it comes to showing a single image they'll still opt to software decoding because it's more flexible when it comes to integration, implementation, customization and format limits. So not surprisingly I'm not convinced when I repeatedly see this claim that mobile phones commonly use hardware decoding for image formats and software decoding speed doesn't matter.


I don't know the current status of web browsers, but hardware encoding and decoding for image formats is alive and well. Not really relevant for showing a 32x32 GIF arrow like on HN, but very important when browsing high resolution images with any kind of smoothness.

If you don't really care about your users' battery life you can opt to disable hardware acceleration within your applications, but it's usually enabled by default, and for good reason.


> hardware encoding and decoding for image formats is alive and well

I keep hearing and hearing this but nobody has ever yet provided a concrete real world example of smart phones using hw decoding for displaying images.


Hardware acceleration of image decoding is very uncommon in most consumer applications.


No, not a single mobile platform uses hardware decode modules for still image decoding as of 2024.

At best, the camera processors output encoded JPEG/HEIF for taken pictures, but that's about it.


That was quite contrary to my understanding, so I took some more time to verify both my and your claim. The reality turned out to be somewhere in the middle: modern mobile SoCs do ship with hardware JPEG decoding among others, but there is no direct API for that hardware decoding module in the mobile platform itself (Android 7 and onwards use libjpeg-turbo by default, for example). But mobile manufacturers can change the implementation details behind those APIs, so it is still true that some mobiles do use hardware JPEG decoding behind the scene. But it is hard to tell how common it is. So well, thank you for the counterpoint---that corrected my understanding.


They ship with hardware jpeg decoders because they ship with hardware jpeg encoders for camera capture latency reasons and it turns out you can basically just run that hardware in reverse.

The SoCs aren't investing more than a token amount of effort into those jpeg decoders, and from experience some of them claim to exist but produce the shittiest looking output imaginable and more slowly than jpeg-turbo at that.

Also you can trivially find out if your Android phone is doing this or not, just run some perf call sampling while decoding jpegs. If all you see is AOSP libraries & libjpeg-turbo, well then they aren't doing hardware decodes :)


Does JPEG XL have patent issues? I half remember something about that. Regular JPG seems fine to me. Better compression isn't going to help anyone since they will find other ways to waste any bandwidth available.


The main innovation claimed by Microsoft's rANS patent is about the adaptive probability distribution, that is, you should be able to efficiently correct the distribution so that you can use fewer bits. While that alone is an absurd claim (that's a benefit shared with arithmetic coding and its variants!) and there is very clear prior art, JPEG XL doesn't dynamically vary the distribution, so it is thought to be unrelated to the patent anyway.


No it doesn't.

And yes, regular JPEG is still a fine format. That's part of the point of the article. But for many use cases, better compression is always welcome. Also having features like alpha transparency, lossless, HDR etc can be quite desirable, and those things are not really possible in JPEG.


I really hope this can become a new standard and be available everywhere (image tools, browsers, etc).

While in practice it won't change my life much, I like the elegance of using a modern standard with this level of performance and efficiency.


I have an existing workflow where I take JPEGs (giant PNGs) from designers and reencode them using mozjpeg. However, I can't find a way to invoke jpegli tool in the same way, especially since it seems to just be part of the jpeg-xl tool? Is that right? Are there any sample invocations anywhere?


You should be able to use the `cjpegli` command the same way you'd use cjxl. The simplest invocation would be `cjpegli input.png output.jpg`
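
For a batch workflow, a minimal sketch along these lines might do. The helper names are hypothetical. It assumes `cjpegli` is on your PATH and that it accepts the same `-d` distance option after the two file arguments as `cjxl` does, so check `cjpegli --help` before relying on it.

```python
import subprocess
from pathlib import Path


def cjpegli_command(src, dst, distance=None):
    """Build the argument list for one cjpegli call (hypothetical helper)."""
    cmd = ["cjpegli", str(src), str(dst)]
    if distance is not None:
        # Assumption: "-d" works here the same way it does for cjxl.
        cmd += ["-d", str(distance)]
    return cmd


def convert_folder(src_dir, dst_dir, distance=None):
    """Encode every PNG in src_dir to a JPEG of the same name in dst_dir."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for png in sorted(Path(src_dir).glob("*.png")):
        out = dst_dir / (png.stem + ".jpg")
        # check=True raises if cjpegli exits with an error.
        subprocess.run(cjpegli_command(png, out, distance), check=True)
```

Calling `convert_folder("designs", "out", distance=1.0)` would then run one `cjpegli` process per PNG in `designs`.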


Or replace/ldconfig the jpeg library with the jpegli library. It is API/ABI compatible with libjpeg-turbo and mozjpeg.


Thanks. It seems the macOS release is a bit behind on brew (0.9.1) and doesn't include the cjpegli tool yet.


Looks like there are more savings coming on lossless AVIF: https://www.reddit.com/r/AV1/comments/1b3lh08/comment/kstmbr...


I plan to add 15–25 % more quality in the ugly lowest end quality in JPEG XL in the coming two months.


The problem with JPEG XL is that it is written in an unsafe language and has already had several memory safety vulnerabilities found in it.

Image codecs are used in a wide range of attacker-controlled scenarios and need to be completely safe.

I know Rust advocates sound like a broken record, but this is the poster child for a library that should never have been even started in C++ in the first place.

It’s absolute insanity that we write codecs — pure functions — in an unsafe language that has a compiler that defaults to “anything goes” as an optimisation technique.


Pretty much every codec in every browser is written in an unsafe language, unfortunately. I don't see why JXL should be singled out. On the other hand, there is a JXL decoder in Rust called jxl-oxide [1] which works quite well, and has been confirmed by JPEG as conformant. Hopefully it will be adopted for decode-only usecases.

[1] https://github.com/tirr-c/jxl-oxide/pull/267

> It’s absolute insanity that we write codecs — pure functions — in an unsafe language that has a compiler that defaults to “anything goes” as an optimisation technique.

Rust and C++ are exactly the same in how they optimize, compilers for both assume that your code has zero UB. The difference is that Rust makes it much harder to accidentally have UB.


"We've never had to wear helmets before, why start now?"

There are only a handful of image codecs that are widely accepted. Essentially just GIF, PNG, and JPG. There's a smattering of support for more modern formats, but those three dominate.

Adding a fourth image format is increasing this attack surface by a substantial margin across a huge range of software. Not just web browsers, but chat apps, server software (thumbnail generators), editors, etc...

This is the kind of thing that gets baked into standard libraries, operating systems, and frameworks. It's up there with JSON or XML.

You had better be damned sure what you're doing is not going to cause a long list of CVEs!

JPEG XL is a complex codec, with a lot of code. This increases the chance of bugs and increases the attack surface.

A (surprisingly!) good metric for complexity is the size of the zip file of the code. Libjpeg is something like 360 kB, libpng is 350 kB, and giflib is 90 kB.

The JXL source is 1.4 MB zipped, making it nearly twice the size of the above three combined!

The other libraries use C/C++ not because that's a better choice, but because it was the only choice back in the ... checks Wikipedia ... 1980s and 90s!

We live in the future. We have memory-safe languages now. We're allowed to use them. You won't get in trouble from anyone, I promise.


> "We've never had to wear helmets before, why start now?"

> We live in the future. We have memory-safe languages now. We're allowed to use them. You won't get in trouble from anyone, I promise.

That's why I specifically said that it's unfortunate that C++ is still wide spread, and pointed to a fully conformant JXL decoder written in Rust :p

> There are only a handful of image codecs that are widely accepted. Essentially just GIF, PNG, and JPG. There's a smattering of support for more modern formats, but those three dominate.

Every browser ships libwebp and an AVIF decoder. Every reasonably recent Android phone does as well. And every iPhone. Every (regular) install of Windows has libwebp. Every Mac has libwebp and dav1d. That's all C++. AVIF in particular is only a couple of years older than JXL, and yet I've never seen opposition to it on the grounds of memory safety. That is what I meant about JXL being singled out.

> JPEG XL is a complex codec, with a lot of code. This increases the chance of bugs and increases the attack surface.

> A (surprisingly!) good metric for complexity is the size of the zip file of the code. Libjpeg is something like 360 kB, libpng is 350 kB, and giflib is 90 kB.

> The JXL source is 1.4 MB zipped, making it nearly twice the size than all of the above combined.

Which code exactly are you including in that? The libjxl repo has a lot of stuff in it, including an entire brand new JPEG encoder! Though jxl certainly is more complex than those three combined, since JXL is essentially a superset of all their functionality, plus new stuff.


I revised my numbers a bit by filtering out the junk and focusing only on the code that most likely contributes to the runtime components (where the security risks lie). E.g.: Excluded the samples, test suites, doco, changelogs, etc... and kept mostly just the C/C++ and assembly code.

I also recompressed all of the libraries with identical settings to make the numbers more consistent.


I believe JPEG XL binary size is about one third of AVIF binary size. It is relatively compact. It is easy to write a small encoder: libjxl-tiny is just 7000 lines of code.


> an unsafe language

No such thing.

> a compiler that defaults to “anything goes” as an optimisation technique

That's just FUD.


This is really impressive even compared to WebP. And unlike WebP, it's backwards compatible.

I have forever associated Webp with macroblocky, poor colors, and a general ungraceful degradation that doesn't really happen the same way even with old JPEG.

I am gonna go look at the complexity of the JXL decoder vs WebP. Curious if it's even practical to decode on embedded. JPEG is easily decodable, and you can do it in small pieces at a time to work within memory constraints.


Everyone hates WebP because when you save it, nothing can open it.

That's improved somewhat, but the formats that will have an easy time winning are the ones that people can use, even if that means a browser should "save JPGXL as JPEG" for awhile or something.


Everyone hates webp for a different reason. I hate it because it can only do 4:2:0 chroma, except in lossless mode. Lossless WebP is better than PNG, but I will take the peace of mind of knowing PNG is always lossless over having a WebP and not knowing what was done to it.


> peace of mind of knowing PNG is always lossless

There is pngquant:

> a command-line utility and a library for lossy compression of PNG images.


You also have things like https://tinypng.com which do (basically) lossy PNG for you. Works pretty well.


Neither of these are really what I'm referring to, as I view these as ~equivalent to converting a jpeg to png. What I mean is within a pipeline, once you have ingested a [png|webp|jpeg] and you need to now render it at various sizes or with various filters for $purposes. If you have a png, you know that you should always maintain losslessness. If you have a jpeg, you know you don't. You don't need to inspect the file or store additional metadata, the extension alone tells you what you need to know. But when you have a webp, the default assumption is that it's lossy but it can sometimes be otherwise.
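
For the WebP case, the container does say which codec was used, so a pipeline can at least peek at the header. A minimal sketch, with a made-up function name: it only looks at the first chunk after the RIFF header, so files using the extended layout (`VP8X`) are reported as "extended" and would need their later chunks walked to find out what is inside.

```python
def webp_kind(data):
    """Classify the start of a WebP file by its first chunk FourCC.

    Returns "lossy", "lossless", "extended", or None if this does not look
    like a WebP file (or the first chunk is something else).
    """
    if len(data) < 16 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        return None
    fourcc = data[12:16]
    kinds = {
        b"VP8 ": "lossy",      # note the trailing space in the FourCC
        b"VP8L": "lossless",
        b"VP8X": "extended",   # image data sits in a later chunk
    }
    return kinds.get(fourcc)
```

It only needs the first 16 bytes, e.g. `webp_kind(open(path, "rb").read(16))`. Even a "lossless" answer only describes the final encode, not what happened to the image before that.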


Actually, if you already have loss, you should try as hard as possible to avoid further loss.


I don't disagree, in principle. But if I have a lossy 28MP jpeg, I'm not going to encode it as a lossless thumbnail (or other scaled-down version).


I've noticed in chrome-based browsers, you can right click on a webp file and "edit image". When you save it, it defaults to png download, which makes a simple conversion.

Mobile browsers seem to default to downloading in png as well.


I think JXL has been seeing adoption by apps faster than Webp or AVIF.


Unfortunately it's the other way around for web browsers :|


> And unlike WebP, it's backwards compatible.

No, JPEG XL files can't be viewed/decoded by software or devices that don't have a JPEG XL decoder.


JPEG XL can be converted to/from JPEG without any loss of quality. See another comment, which shows an example where doing JPEG -> JPEG XL -> JPEG generates a binary exact copy of the original JPEG.

Yeah, this is not what we usually call backwards compatibility, but it allows usage like storing the images as JPEG XL and, on the fly, sending a JPEG to clients that can't use it, without any loss of information. WebP can't do that.


But that only works when the JXL has been losslessly converted from a JPEG in the first place, right? So this wouldn’t work for all JXL in practice. (Unless I’ve missed something and this is not the case.)


You could start with relatively good jpegli as a codec and then lossless recompress that with jpeg xl. Naturally some entity (server side, app, content encoding etc.) needs to unpack the jpeg xl jpeg into a usual jpeg before it can be consumed by a legacy system.


Welcome efficiency improvements

And in general, Jon's posts provide a pretty good overview on the topic of codec comparison

Pity such a great format is being held back by the much less rigorous reviews


Should the Pareto front not be drawn with lines perpendicular to the axes rather than diagonal lines?


Often with this kind of Pareto plot it can be argued that even when continuous decisions are not available, a compression system could keep alternating between effort 7 and effort 6 every second (or at any other ratio), leading, on average, to interpolated results. Naturally such interpolation does not produce straight lines in log space.
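
To make the mixing argument concrete, here is a minimal sketch with made-up numbers. It assumes a fraction of the pixels goes through one setting and the rest through another: sizes then average linearly, but speeds combine through time per pixel.

```python
def mix(setting_a, setting_b, share_a):
    """Average (speed in Mpx/s, size in bpp) when a fraction share_a of the
    pixels is encoded with setting_a and the rest with setting_b."""
    speed_a, bpp_a = setting_a
    speed_b, bpp_b = setting_b
    share_b = 1.0 - share_a
    # Time adds up, so speeds combine as a weighted harmonic mean.
    seconds_per_mpx = share_a / speed_a + share_b / speed_b
    # Output size adds up, so bpp combines as a plain weighted mean.
    bpp = share_a * bpp_a + share_b * bpp_b
    return 1.0 / seconds_per_mpx, bpp


# Hypothetical settings, purely for illustration.
fast = (100.0, 12.0)
slow = (25.0, 11.0)
print(mix(fast, slow, 0.5))
```

A 50/50 mix of these two lands at about 40 Mpx/s and 11.5 bpp. The midpoint of a straight line on a log-speed axis would be the geometric mean, 50 Mpx/s, so the mix falls below the straight line.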


Yes, it should, but it looks like they just added a line to the jxl 0.10 series of data on whatever they used to make the graph, and labelled it the Pareto front. Looking closely at the graphs, they actually miss some points where version 0.9 should be included in the frontier.


I think it can be understood as an expected Pareto frontier if enough options are added to make it continuous, which is often implied in these kinds of discussions.


I'm not sure that's reasonable - The effort parameters are integers between 1 and 10, with behavior described here: https://github.com/libjxl/libjxl/blob/main/doc/encode_effort..., the intermediate options don't exist as implemented programs. This is a comparison of concrete programs, not an attempt to analyze the best theoretically achievable.

Also, the frontier isn't convex, so it's unlikely that if intermediate options could be added then they would all be at least as good as the lines shown; and the use of log(speed) for the y-axis affects what a straight line on the graph means. It's fine for giving a good view of the dataset, but if you're going to make a guess about intermediate possibilities, 'speed' or 'time' should also be considered.


You are right, but that would make an uglier plot :)

Some of the intermediate options are available though, through various more fine-grained encoder settings than what is exposed via the overall effort setting. Of course they will not fall exactly on the line that was drawn, but as a first approximation, the line is probably closer to the truth than the staircase, which would be an underestimate of what can be done.


Perpendicular to which axis?


both - staircase style.


Good grief. A poorly phrased question, and an answer that doesn't narrow the possibilities.

        *
        |
        |
        |
  *-----+
or

  +-----*
  |
  |
  |
  *
… and why?


Whichever is more pessimistic. So for the axes in this article, the first one. If you have an option on the "bad" side of the Pareto curve, you can always find an option that is better in both axes. If a new option is discovered that falls on the good side of the curve, well, then the curve needs to be updated to pass thru that new option.
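
For anyone who wants to compute the staircase rather than eyeball it, a minimal sketch. It assumes each encoder setting is one (bpp, speed) point, with lower bpp and higher speed being better; the data points are invented.

```python
def pareto_front(points):
    """Return the non-dominated (bpp, speed) points, sorted by bpp.

    A point is dominated if some other point is at least as small and at
    least as fast, and strictly better in one of the two.
    """
    front = []
    best_speed = float("-inf")
    # Walk from smallest to largest output; among equal sizes, fastest first.
    for bpp, speed in sorted(points, key=lambda p: (p[0], -p[1])):
        # Keep a point only if it is faster than everything smaller than it.
        if speed > best_speed:
            front.append((bpp, speed))
            best_speed = speed
    return front


# Hypothetical measurements: (bits per pixel, Mpx/s).
measurements = [(10, 5), (11, 50), (12, 40), (13, 400), (17, 154)]
print(pareto_front(measurements))
```

Here (12, 40) and (17, 154) drop out because another setting is both smaller and faster. The region each surviving point dominates is a rectangle, which is where the staircase shape comes from.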


The choice to represent the speed based on multithreaded encoding strikes me as somewhat arbitrary. If your software has a critical path dependent on minimal latency of a single image, then it makes some sense, but you still may have more or fewer than 8 cores. On the other hand if you have another source of parallelism, for example you are encoding a library of images, then it is quite irrelevant. I think the fine data in the article would be even more useful if the single threaded speed and the scalability of the codec were treated separately.


Wow, that new jpegli encoder. Just wow. Look at those results. Haha, JPEG has many years left still.


> JPEG has many years left still

Such a shame arithmetic coding (which is already in the standard) isn't widely supported in the real world. Because converting Huffman coded images losslessly to arithmetic coding provides an easy 5-10% size advantage in my tests.

Alien technology from the future indeed.


The benefits of JPEG kind of go away if you start adopting more recent changes to the format, no? JPEG is nice because everything has supported it for 20+ years. JPEG-with-arithmetic-coding is essentially a new, incompatible format, why not use JXL or AVIF instead?


Yes, but this is really a pity in the specific case of Arithmetic Coding, because, unlike "more recent changes to the format", it's been in the standard since the very beginning - but is not supported by a lot of implementations due to software patents (which meanwhile expired, but their damage remains).


Arithmetic encoding is old, but around 2010 (when all the patents expired), there was a ton of really good research on how to do table-based ANS and vectorized rANS to make the performance good. Aside from the patent issues, arithmetic encoding wasn't pursued much because the CPU cost was too high. Now that multiply is cheap and the divisions can be avoided, ANS is a lot better than it used to be.
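
For the curious, the core rANS step fits in a few lines. This is a toy sketch only: it keeps the whole state in one Python big integer, does no renormalization, and uses plain division plus a linear symbol search, which are exactly the costs real implementations avoid with tables. The function names are made up.

```python
def build_tables(freqs):
    """Map each symbol to (frequency, cumulative start); also return the total."""
    table, start = {}, 0
    for sym, f in freqs.items():
        table[sym] = (f, start)
        start += f
    return table, start


def rans_encode(symbols, freqs, state=1):
    """Fold the symbols into one integer. Encoding runs in reverse so that
    decoding yields the symbols in forward order."""
    table, total = build_tables(freqs)
    for sym in reversed(symbols):
        f, c = table[sym]
        state = (state // f) * total + c + (state % f)
    return state


def rans_decode(state, count, freqs):
    """Recover `count` symbols from the integer produced by rans_encode."""
    table, total = build_tables(freqs)
    out = []
    for _ in range(count):
        slot = state % total
        # Linear search for the symbol whose range contains the slot.
        sym = next(s for s, (f, c) in table.items() if c <= slot < c + f)
        f, c = table[sym]
        state = f * (state // total) + slot - c
        out.append(sym)
    return out, state


message = list("abracadabra")
freqs = {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}
encoded = rans_encode(message, freqs)
decoded, final_state = rans_decode(encoded, len(message), freqs)
print(encoded.bit_length(), "".join(decoded))
```

Frequent symbols grow the state by fewer bits than rare ones, which is where the compression comes from. Decoding exactly undoes each encoding step, so the state ends up back at its starting value.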


AVIF, AV1, WebP lossy and Opus use traditional arithmetic coding.

JPEG XL, Brunsli, Draco 3D and ZStd use table-based arithmetic coding (ANS).

JPEG, Deflate, Brotli, MP3, WebP lossless and the fastest mode of JPEG XL lossless use prefix coding.


I'm surprised mozjpeg performed worse than libjpeg-turbo at high quality settings. I thought its aim was having better pq than libjpeg-turbo at the expense of speed.


It is consistent with what I have seen. Both in metrics and in eyeballing. Mozjpeg gives good results around quality 75, but less good at 90+++


Pretty good article, though I would have used oxipng instead of optipng in the lossless comparisons, it's the new standard, there.


Thanks for the suggestion, oxipng is indeed a better choice. Next time I will add it to the plots!


AVIF looks better here: JPEG XL looks very blurred out on the bottom with high compression. AVIF preserves much more detail and sharpness.

https://res.cloudinary.com/jon/qp-low.png


JPEG XL is awesome!

One thing I think would help with its adoption, is if they would work with e.g. the libvips team to better implement it.

For example, streaming encoder and streaming decoder would be the preferred integration method in libvips.


It is nice that 0.10 finally landed those memory and speed optimisations.

But the king remains HALIC. With the multithreaded encoder, JPEG XL still uses 3.5x more memory than HALIC and takes 6x the encoding time, while offering the same or smaller file size in lossless. Hopefully JPEG XL can narrow those gaps some day.


How is lossless webp 0.6 times the size of lossless avif? I find it hard to believe that.


WebP is awesome at lossless and way better than even PNG.

It's because WebP has a special encoding pipeline for lossless pictures (just like PNG) while AVIF is basically just asking a lossy encoder originally designed for video content to stop losing detail. Since it's not designed for that it's terrible for the job, taking lots of time and resources to produce a worse result.


On the other hand, animated webp is surprisingly bad considering that the format comes from a video codec.


> and way better than even PNG.

I mean it's kinda hard to be worse than just shoving a bitmap through zlib...


Lossless webp is actually quite good, especially on text heavy images, e.g. screenshots of a terminal with `cwebp -z 9` are usually smaller than `cjxl -d 0 -e 9` in my experience.


Lossless AVIF is just really quite bad. Notice how for photographic content, it is barely better than PNG, and for non-photographic content, it is far worse than PNG.


It's so bad you wonder why AV1 even has a lossless mode. Maybe lossy mode has some subimages it uses lossless mode on?


It has lossless just to check a box in terms of supported features. A bit like how JPEG XL supports animation just to have feature parity. But in most cases, you'll be better off using a video codec for animation, and an image format for images.


There are some user-level differences between an animated image and a video, which haven't really been satisfactorily resolved since the abandonment of GIF-the-format. An animated image should pause when clicked, and start again on another click, with setting separate from video autoplay to control the default. It should not have visible controls of any sort, that's the whole interface. It should save and display on the computer/filesystem as an image, and degrade to the display frame when sent along a channel which supports images but not animated ones. It doesn't need sound, or CC, or subtitles. I should be able to add it to the photo roll on my phone if I want.

There are a lot of little considerations like this, and it would be well if the industry consolidated around an animated-image standard, one which was an image, and not a video embedded in a way which looks like an image.


Hence why AVIF might come in handy after all!


I believe it is more fundamental. I like to think that AV1 entropy coding just becomes ineffective for large values. Large values are dominantly present in high quality photography and in lossless coding. Large values are repeatedly prefix coded and this makes effective adaptation of the statistics difficult for large integers. This is a fundamental difference and not a minor difference in focus.


> Large values are repeatedly prefix coded

You mean exponential golomb style, where numbers like 0b110100101 would be encoded as 00000000110100110 (essentially using 2 bits per bit)?
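To make the example in that question concrete, here is a small sketch of order-0 exponential Golomb coding as it is usually described: write n + 1 in binary and prefix it with one zero per bit after the first. This only illustrates the scheme named in the question; it is not a claim about how AV1 actually codes large values.

```python
def exp_golomb_encode(n):
    """Order-0 exponential Golomb code of a non-negative integer, as a bit string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    binary = bin(n + 1)[2:]                   # binary digits of n + 1
    return "0" * (len(binary) - 1) + binary   # zero prefix gives the length


def exp_golomb_decode(bits):
    """Decode one code word from the front of a bit string; returns (value, rest)."""
    zeros = len(bits) - len(bits.lstrip("0"))  # count leading zeros
    length = 2 * zeros + 1                     # total bits in this code word
    return int(bits[zeros:length], 2) - 1, bits[length:]


code = exp_golomb_encode(0b110100101)
print(code, len(code))  # 00000000110100110 17
```

The 9-bit input comes out as 17 bits, which is the "2 bits per bit" overhead the question describes.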


Usually the issue is not using the YCgCo-R colorspace. I do not see enough details in the article to know if that is the case here. There are politics around getting the codepoint included: https://github.com/AOMediaCodec/av1-avif/issues/129
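For readers who have not met YCgCo-R: it is a reversible integer colour transform built from lifting steps, so RGB survives a round trip exactly. Below is a sketch of the commonly published form; whether the AVIF encodes in the article used it is not stated, and the function names are mine.

```python
def rgb_to_ycgco_r(r, g, b):
    """Forward YCgCo-R lifting transform (integer, exactly reversible)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def ycgco_r_to_rgb(y, cg, co):
    """Inverse transform: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


print(rgb_to_ycgco_r(255, 0, 0))                    # (63, -127, 255)
print(ycgco_r_to_rgb(*rgb_to_ycgco_r(255, 0, 0)))   # (255, 0, 0)
```

Note that Cg and Co need one more bit than the input channels (for 8-bit RGB they range over -255..255), which is part of why signalling it needs explicit support in the container.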


"Pareto" being used outside the context of Brazil's best prank call ever (Telerj Prank) will always confuse me. I keep thinking, "what does the 'thin-voiced lawyer' have to do with statistics?"...


If only Google could be convinced to adopt this marvelous codec... Not looking super positive at the moment:

https://issues.chromium.org/issues/40270698

https://bugs.chromium.org/p/chromium/issues/detail?id=145180...


It's so frustrating how the chromium team is ending up as a gatekeeper of the Internet by picking and choosing what gets developed or not.

I recently came across another issue pertaining to the chromium team not budging on their decisions, despite pressure from the community and an RFC backing it up - in my case custom headers in WebSocket handshakes, which are supported by other JavaScript runtimes like node and bun, but the chromium maintainer just disagrees with it - https://github.com/whatwg/websockets/issues/16#issuecomment-...


> It's so frustrating how the chromium team is ending up as a gatekeeper of the Internet by picking and choosing what gets developed or not.

https://github.com/niutech/jxl.js is based on Chromium tech (Squoosh from GoogleChromeLabs) and provides an opportunity to use JXL with no practical way for Chromium folks to intervene.

Even if that's a suboptimal solution, JXL's benefits supposedly should outweigh the cost of integrating that, and yet I haven't seen actual JXL users running to that in droves.

So JXL might not be a good support for your theory: where people could, they still don't. Maybe the format isn't actually that important, it's just a popular meme to rehash.


Why do you assume that the benefits would outweigh said costs? That's a weird burden to set on the format. Using JavaScript on the browser to decode it is a huge hurdle, I don't know of any format that ever got popular or got its initial usage from a similar approach. Avif was just added too, even if no one was using a js library to decode it beforehand

Fwiw I agree that there's a weird narrative around jpegxl, at the end of the day it's just a format, and I think it's not very good for lower quality images as proven by the linked article in the OP. Avif looks better in that regard.

I think it would've made more sense than WebP though (which also doesn't look good at all when not lossless), but that was like a decade ago and that ship has sailed. So avif fills a niche that WebP sucks at, while jpegxl doesn't really do that. That alone is reason enough to not bother with including it.


People don't use blurry low quality images on the web. These low qualities don't matter outside of compression research.

Average/median quality of images is between 85 and 90, depending on how you calculate it.

There, users' waiting time over an image format's lifetime is worth about 3 trillion USD. If we can reduce it by 20 %, we create 600 billion USD of wealth, distributed to the users. More savings come from data transfer costs.


I use blurry lo-fi images sometimes, eg to reduce the server pain during a Mastodon preview stampede, and for hero images when Save-Data is set!


> Why do you assume that the benefits would outweigh said costs? That's a weird burden to set on the format.

I'm not assuming that there are those benefits, but that there are people who see them. Those who are _very_ vocal about browsers (and Chrome in particular) not supporting it seem to think so or they wouldn't bother.

If I propose integrating good old Targa file support into Chrome, I'd also be asked about a cost/benefit analysis. And by building and using a polyfill to add that support, I show that I'm serious about Targa files, which gives credence to my cost/benefit analysis and also lets people play around with the Targa format, hopefully making it self-evident that the format is good, and from there that these benefits based on native support would be even better.

For JXL I see people talking the talk but, by and large, not walking the walk.


I see what you mean. Yeah, I think jpegxl is the format that I've heard about the most but never really seen in the wild. It's a chicken and egg problem but still, it's basically not used at all compared to the mindshare it seems to have in these discussions.


Question is for how long. Time to slam the hammer on them.


What hammer? You want the US president or Supreme Court to compel Chrome developers to implement every image format in existence and every JS API proposed by anyone anywhere?

Unless it is some kind of anti-competitive behavior, like intentionally stifling adoption of a standard competing with their proprietary patent-encumbered implementation that they expect to collect royalties for (doesn't seem to be the case), then I don't see the problem.


Why not make a better product rather than slam some metaphorical hammer?


That's not how this works. Firefox is the closest we have, and realistically the closest we will get to a "better product" than Chromium for the foreseeable future, and it's clearly not enough.


The only hammer at all left is Safari, basically on iPhones only.

That hammer is very close to going away; if the EU does force Apple to really open the browsers on the iPhone, everything will be Chrome as far as the eye can see in short order. And then we fully enter the chromE6 phase.


And Firefox does not support the format. Mozilla is the same political company as everyone else.


Because "better" products don't magically win.


Where's Firefox's and Webkit's position on the proposal?


Safari/Webkit has added JPEG XL support already.

Firefox is "neutral", which I understand as meaning they'll do whatever Chrome does.

All the code has been written, patches to add JPEG XL support to Firefox and Chromium are available and some of the forks (Waterfox, Pale Moon, Thorium, Cromite) do have JPEG XL support.


I believe they were referring to that WebSocket issue, not JXL.


They didn't "lose interest", their lawyers pulled the emergency brakes. Blame patent holders, not Google. Like Microsoft: https://www.theregister.com/2022/02/17/microsoft_ans_patent/. Microsoft could probably be convinced to be reasonable. But there may be a few others. Google actually also holds some patents over this but they've done the right thing and licensed those patents along with their implementation.

To fix this, you'd need to convince Google, and other large companies that would be exposed to law suits related to these patents (Apple, Adobe, etc.), that these patent holders are not going to insist on being compensated.

Other formats are less risky, especially the older ones. JPEG is fine because it's been out there for so long that any patents applicable to it have long expired. Same with GIF, which once was held up by patents. PNG is at this point also fine. If any patents applied at all they will soon have expired, as the PNG standard dates back to 1997 and work on it depended on research from the seventies and eighties.


There are no royalties to be paid on JPEG XL. Nobody but Cloudinary and Google is claiming to hold relevant patents, and Cloudinary and Google have provided a royalty free license. Of course the way the patent system works, anything less than 20 years old is theoretically risky. But so far, there is nobody claiming royalties need to be paid on JPEG XL, so it is similar to WebP in that regard.


"Patent issues" has become a (sometimes truthful) excuse for not doing something.

When the big boys want to do something, they find a way to get it done, patents or no, especially if there's only "fear of patents" - see Apple and the whole watch fiasco.


Patents were not the latest excuse I heard from Google. Their explanation was security concerns.


Do you have a link? Or was it a private communication?


> [...] other large companies that would be exposed to law suits related to these patents (Apple, Adobe, etc.) [...]

Adobe added JPEG XL support to their products and also to the DNG specification. So that argument is pretty much dead, no?


Adobe also has an order of magnitude fewer installations than Chrome or Firefox, which makes patent fees much cheaper. And their software is actually paid for by users.


DNG Converter (which includes JPEG XL compression) isn’t paid. You can get it here: https://helpx.adobe.com/camera-raw/using/adobe-dng-converter...


Not that simple. Maybe they struck a deal with a few of the companies or they made a different risk calculation. And of course they have a pretty fierce patent portfolio themselves so there's the notion of them being able to retaliate in kind to some of these companies.


I don't think that's true (see my other comment for what the patent is really about), but even if it is, Adobe's adoption means that JPEG XL is worth the supposed "risk". And Google does ship a lot of technologies that are clearly patent-encumbered. If the patent were the main concern, they could have said so, because there are enough people wondering about the patent status, but the Chrome team's main reason against JPEG XL was quite different.


Adobe sells paid products and can carve out a license fee for that, like they do with all the other codecs and libraries they bundle. That's part of the price you are paying.

Harder to do for users of Chrome.


The same thing can be said of many patent-encumbered video codecs which Chrome does support nevertheless. That alone can't be a major deciding factor, especially given that the rate of JPEG XL adoption has been remarkably faster than that of any recent media format.


Is this not simply a risk vs reward calculation? Newer video codecs present a very notable bandwidth saving over old ones. JPEG XL presents minor benefits over WebP, AVIF, etc. So while the dangers are the same for both the calculation is different.


Video = billions lower costs for Youtube.


You can get Adobe DNG Converter for free and use it to convert your raw files to DNG compressed with JPEG XL.

https://helpx.adobe.com/content/dam/help/en/camera-raw/digit...


The Microsoft patent doesn't apply to JXL, and in any case, Microsoft has literally already affirmed that they will not use it to go after any open codec.


How exactly is that done? I assume even an offhand comment by an official (like CEO, etc) that is not immediately walked back would at least protect people from damages associated with willful infringement.


That ANS patent supposedly relates to refining the coding tables based on symbols being decoded.

That is slower to decode, and JPEG XL does not do it, for decoding speed reasons.

The specification doesn't allow it. All coding tables need to be in final form.


> their lawyers pulled the emergency brakes

Do you have a source for that claim?


Probably this: https://www.theregister.com/2022/02/17/microsoft_ans_patent/

I think it would be much better for everyone involved and humanity if Mr. Duda himself got the patent in the first place instead of praying no one else will.


Duda published his ideas, that’s supposed to be it.


Prior art makes patents invalid anyway.


Absolutely.

And nothing advances your career quite like getting your employer into a multi-year legal battle and spending a few million on legal fees, to make some images 20% smaller and 100% less compatible.


Well, lots of things other than JXL use ANS. If someone starts trying to claim ANS, you'll have Apple, Disney, Facebook, and more, on your side :)


But that doesn't matter. If a patent is granted, choosing to infringe on it is risky, even if you believe you could make a solid argument that it's invalid given enough lawyer hours.


The Microsoft patent is for an "improvement" that I don't believe anyone is using, but Internet commentators seem to think it applies to ANS in general for some reason.

A few years earlier, Google was granted a patent for ANS in general, which made people very angry. Fortunately they never did anything with it.


I believe that Google's patent application dealt with interleaving non-compressed and ANS data in a manner that made streaming coding easy and fast in software, not a general ANS patent. I didn't read it, but I discussed it briefly with a capable engineer who had.


If the patent doesn't apply to JXL then that's a different story, then it doesn't matter whether it's valid or not.

...

The fact that Google does have a patent which covers JXL is worrying though. So JXL is patent encumbered after all.


I misrecalled. While the Google patent is a lot more general than the Microsoft one, it doesn't apply to most uses of ANS.


I'm just inferring from the fact that MS got a patent and then this whole thing ground to a halt.


Not only do you have no source backing your claim, but there is a glaring counterexample. Chromium's experimental JPEG XL support carried an expiry milestone, which was delayed multiple times; it was last bumped in June 2022 [1] before the final removal in October, which was months after the patent was granted!

[1] https://issues.chromium.org/issues/40168998#comment52


In other words, there's no source.


>To fix this, you'd need to convince Google, and other large companies that would be exposed to law suits related to these patents (Apple, Adobe, etc.), that these patent holders are not going to insist on being compensated.

Apple has implemented JPEG XL support in macOS and iOS. Adobe has also implemented support for JPEG XL in their products.

Also, if patents were the reason Google removed JXL from Chrome, why would they make up technical reasons for doing so?

Please don't present unsourced conspiracy theories as if they were confirmed facts.


[flagged]


Mate, you're literally pulling something from your ass. Chrome engineers claim that they don't want JXL because it isn't good enough. Literally no one involved has said that it has anything to do with patents.


>There must be a more rational reason than that. I've not heard anything better than legal reasons. But do correct me if I'm wrong. I've worked in big companies, and patents can be a show stopper. Seems like a plausible theory (i.e. not a conspiracy theory)

In your first comment, you stated as a fact that "lawyers pulled the emergency brakes". Despite literally no one from Google ever saying this, and Google giving very different reasons for the removal.

And now you act as if something you made up in your mind is the default theory and the burden of proof is on the people disagreeing with you.


People who look after Chrome’s media decoding are an awkward bunch, they point blank refuse to support <img src=*.mp4 for example


Seems entirely reasonable, considering that <img> is for images, whereas mp4 is for videos, no?


Doesn't make sense when they support GIF or animated WebP as images. Animated WebP in particular is just a purposely gimped WebM that should not exist at all and would not need to exist if we could use video files directly.


If you want a simple conspiracy theory, how about this:

The person responsible for AVIF works on Chrome, and is responsible for choosing which codecs Chrome ships with. He obviously prefers his AVIF to a different team's JPEG-XL.

It's a case of simple selfish bias.


Why not take Chrome's word for it:

---cut---

Helping the web to evolve is challenging, and it requires us to make difficult choices. We've also heard from our browser and device partners that every additional format adds costs (monetary or hardware), and we’re very much aware that these costs are borne by those outside of Google. When we evaluate new media formats, the first question we have to ask is whether the format works best for the web. With respect to new image formats such as JPEG XL, that means we have to look comprehensively at many factors: compression performance across a broad range of images; is the decoder fast, allowing for speedy rendering of smaller images; are there fast encoders, ideally with hardware support, that keep encoding costs reasonable for large users; can we optimize existing formats to meet any new use-cases, rather than adding support for an additional format; do other browsers and OSes support it?

After weighing the data, we’ve decided to stop Chrome’s JPEG XL experiment and remove the code associated with the experiment. [...]

From: https://groups.google.com/a/chromium.org/g/blink-dev/c/WjCKc...


I'll try to make a bullet point list of the individual concerns; the original statement is written in a style that is a bit confusing for a non-native speaker such as me.

* Chrome's browser partners say JPEG XL adds monetary or hardware costs.

* Chrome's device partners say JPEG XL adds monetary or hardware costs.

* Does JPEG XL work best for the web?

* What is JPEG XL compression performance across a broad range of images?

* Is the decoder fast?

* Does it render small images fast?

* Is encoding fast?

* Hardware support keeping encoding costs reasonable for large users.

* Do we need it at all or just optimize existing formats to meet new use-cases?

* Do other browsers and OSes support JPEG XL?

* Can it be done sufficiently well with WASM?


* [...] monetary or hardware costs.

We could perhaps create a GoFundMe page for making it cost neutral for Chrome's partners. Perhaps some industry partners would chime in.

* Does JPEG XL work best for the web?

Yes.

* What is JPEG XL compression performance across a broad range of images?

All of them. The more difficult it is to compress, the better JPEG XL is. It is at its best at natural images with noisy textures.

* Is the decoder fast?

Yes. See blog post.

* Does it render small images fast?

Yes. I don't have a link, but I tried it.

* Is encoding fast?

Yes. See blog post.

* Hardware support keeping encoding costs reasonable for large users.

https://www.shikino.co.jp/eng/ is building it based on libjxl-tiny.

* Do we need it at all or just optimize existing formats to meet new use-cases?

Jpegli is great. JPEG XL allows for 35 % more. It creates wealth of a few hundred billion in comparison to jpegli, in users' waiting times. So, it's a yes.

* Do other browsers and OSes support JPEG XL?

Possibly. iOS and Safari support it. DNG supports it. Windows and some Android devices don't.

* Can it be done sufficiently well with WASM?

Wasm creates additional complexity, adds to load times, and possibly to computation times too.

Some more work is needed before all of Chrome's questions can be answered.


Safari supports jxl since version 17


Mozilla effectively gave up on it before Google did.

https://bugzilla.mozilla.org/show_bug.cgi?id=1539075

It's a real shame, because this is one of those few areas where Firefox could have led the charge instead of following in Chrome's footsteps. I remember when they first added APNG support and it took Chrome years to catch up, but I guess those days are gone.

Oddly enough, Safari is the only major browser that currently supports it despite regularly falling behind on tons of other cutting-edge web standards.

https://caniuse.com/jpegxl


I followed Mozilla/Firefox integration closely. I was able to observe enthusiasm from their junior to staff level engineers (linkedin-assisted analysis of the related bugs ;-). However, an engineering director stepped in and locked the discussions because they were in "no new information" stage, and their position has been neutral on JPEG XL, and the integration has not progressed from the nightly builds to the next stage.

Ten years ago Mozilla used to have the most prominent image and video compression effort called Daala. They posted inspiring blog posts about their experiments. Some of their work was integrated with Cisco's Thor and On2's/Chrome's VP8/9/10, leading to AV1 and AVIF. Today, I believe, Mozilla has focused away from this research and the ex-Daala researchers have found new roles.


Daala's and Thor's features were supposed to be integrated into AV1, but in the end, they wanted to finish AV1 as fast as possible, so very little that wasn't in VP10 made it into AV1. I guess it will be in AV2, though.


> ... very little that wasn't in VP10 made it into AV1.

I am not sure I would say that is true.

The entire entropy coder, used by every tool, came from Daala (with changes in collaboration with others to reduce hardware complexity), as did some major tools like Chroma from Luma and the Constrained Directional Enhancement Filter (a merger of Daala's deringing and Thor's CLPF). There were also plenty of other improvements from the Daala team, such as structural things like pulling the entropy coder and other inter-frame state from reference frames instead of abstract "slots" like VP9 (important in real-time contexts where you can lose frames and not know what slots they would have updated) or better spatial prediction and coding for segment indices (important for block-level quantizer adjustments for better visual tuning). And that does not even touch on all of the contributions from other AOM members (scalable coding, the entire high-level syntax...).

Were there other things I wish we could have gotten in? Absolutely. But "done" is a feature.


Some "didn't make it in" things that looked promising were the perceptual vector quantization[1], and a butterfly transform that Monty was working on, IIRC as an occasional spectator to the process.

[1] https://jmvalin.ca/daala/pvq_demo/


Dropping PVQ was a hard choice. We did an initial integration into libaom, but due to substantial differences from the way that Daala was designed, the results were not outstanding [1]. Subsequent changes to the codebase made PVQ regress significantly from there, for reasons that were not entirely clear. When we sat down and detailed all of the work necessary for it to have a chance of being adopted, we concluded we would need to put the whole team on it for the entire remainder of the project. These were not straightforward engineering tasks, but open problems with no known solutions. Additional changes by other experiments getting adopted could have complicated the picture further. So we would have had to drop everything else, and the risk that something would not work out and PVQ would still not have gotten in was very high.

The primary benefit of PVQ is the side-information-free activity masking. That is the sort of thing that cannot be judged via PSNR and requires careful subjective testing with human viewers. Not something you want to be rushing at the last minute. After gauging the rest of AOM's enthusiasm for the work, we decided instead to improve the existing segmentation coding to make it easier for encoders to do visual tuning after standardization. That was a much simpler task with much less risk, and it was adopted relatively easily. I still think it was the right call.

[1] https://datatracker.ietf.org/doc/html/draft-cho-netvc-applyp...


I like to think that there might be an easy way to improve AV2 today — drop the whole keyframe coding and replace it with JPEG XL images as keyframes.


It feels like nowadays Mozilla is extremely shorthanded.

They probably gave up because they simply don’t have the money/resources to pursue this.


All those requests to revert the removal are funny: you want Chrome to re-add jxl behind a feature flag? Doesn't seem very useful.

Also, all those Chrome offshoots (Edge, Brave, Opera, etc) could easily add and enable it to distinguish themselves from Chrome ("faster page load", "less network use") and don't. Makes me wonder what's going on...


> you want Chrome to re-add jxl behind a feature flag? Doesn't seem very useful.

Chrome has a neat feature where some flags can be enabled by websites, so that websites can choose to cooperate in testing. They never did this for JXL, but if they re-added JXL behind a flag, they could do so with such testing enabled. Then they could get real data from websites actually using it, without committing to supporting it if it isn't useful.

> Also, all those Chrome offshoots (Edge, Brave, Opera, etc) could easily add and enable it to distinguish themselves from Chrome ("faster page load", "less network use") and don't. Makes me wonder what's going on...

Edge doesn't use Chrome's own codec support. It uses Windows's media framework. JXL is being added to it next year.


> Edge doesn't use Chrome's own codec support. It uses Windows's media framework. JXL is being added to it next year.

Interesting!


Simply put these offshoots don't really seem to do browser code, and realize how expensive it would be for them to diverge at the core.


No, obviously to re-add jxl without a flag


"jxl without a flag" can't be re-added because that was never a thing.


It can. That's why you didn't say "re-add jxl" but had to mention the flag. 're-add' has no flag implication; that pedantic attempt to constrain it is something you've made up. That's not what people want, just read those linked issues.


It has a flag implication because jpeg-xl never came without being hidden behind a flag. Nothing was taken away from ordinary users at any point in time.

And I suppose the Chrome folks have the telemetry to know how many people set that damn flag.


> And I suppose the Chrome folks have the telemetry to know how many people set that damn flag.

How is that relevant? Flags are to allow testing, not to gauge interest from regular users.


>"But the plans were on display…”

> “On display? I eventually had to go down to the cellar to find them.”

> “That’s the display department.”

> “With a flashlight.”

> “Ah, well, the lights had probably gone.”

> “So had the stairs.”

> “But look, you found the notice, didn’t you?”

> “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”


I guess you're referring to the idea that the flag made the previous implementation practically non-existent for users. And I agree!

But "implement something new!" is a very different demand from "you took that away from us, undo that!"


> No, obviously to re-add jxl without a flag

Is asking for the old thing to be re-added, but without the flag that sabotaged it. It is the same as "you took that away from us, undo that!" Removing a flag does not turn it into a magical, mystical new thing that has to be built from scratch. This is silly. The entire point of having flags is to provide a testing platform for code that may one day have the flag removed.


I suppose I'll trust the reality of what actual users are expressly asking for vs. your imagination that something different is implied


Actual users, perhaps. Or maybe concern trolls paid by a patent holder who's trying to prepare the ground for a patent-based extortion scheme. Or maybe Jon Sneyers with an army of sock puppets. These "actual users" are just as real to me as Chrome's telemetry.

That said: these actual users didn't demonstrate any hacker spirit or interest in using JXL in situations where they could. Where's the wide-spread use of jxl.js (https://github.com/niutech/jxl.js) to demonstrate that there are actual users desperate for native codec support? (aside: jxl.js is based on Squoosh, which is a product of GoogleChromeLabs) If JXL is sooo important, surely people would use whatever workaround they can employ, no matter if that convinces the Chrome team or not, simply because they benefit from using it, no?

Instead all I see is people _not_ exercising their freedom and initiative to support that best-thing-since-sliced-bread-apparently format but whining that Chrome is oh-so-dominant and forces their choices of codecs upon everybody else.

Okay then...


We have been active on wasm implementations of jpeg xl, but it doesn't really work with progressive rendering, HDR canvas was still not supported, threadpools and simd had hiccups, etc. The browser wasn't and still isn't ready for high quality codecs as modules. We are continually giving gentle guidance for these, but at heart our small team is an algorithm and data format research group, not a technology lobbyist organization, so we haven't yet been successful there.

In the current scenario jpeg xl users are most likely to emerge outside of the web, in professional and prosumer photography, and then we will have — unnecessarily — two different format worlds. Jpeg xl for photography processing and a variety of web formats, each with their problems.


I tried jxl.js, it was very finicky on iPad, out of memory errors [0] and blurry images [1]. In the end I switched to a proxy server, that reencoded jxl images into png.

[0]: https://github.com/niutech/jxl.js/issues/6

[1]: https://github.com/niutech/jxl.js/issues/7


Both issues seem to have known workarounds that could have been integrated to support JXL on iOS properly earlier than by waiting on Apple (who integrated JXL in Safari 17 apparently), so if anything that's a success story for "provide polyfills to support features without relying on the browser vendor."


The blur issue is an easy fix, yes, but the memory one doesn't help that much.


Or (re-add jxl) (without a flag).



