Perceptual Image Compression at Flickr (flickr.net)
115 points by fahimulhaq on Sept 28, 2015 | 34 comments



What they did was super impressive, and is a fascinating way to figure out the optimal global compression settings. However, I got my hopes up that perceptual image compression might be about computer vision. At scale, analyzing every image would obviously be prohibitively expensive, but it would be amazing to analyze the detail/edges or color usage (e.g. how much red) in a photo, and tune the q & chroma values accordingly. That, or compress with multiple settings and analyze the resulting images for noise introduced by artifacts.


Love the approach. Then they showed their results, and it was only 606 "tests" (showings) of 250 images? On average, just a little more than one same and one different pair-showing for each image. Doesn't strike me as a huge sample.


"Doesn't strike me as a huge sample"

Based on what criteria, exactly? 606 comparisons is more than enough to rule out large differences, especially considering that the testers were heavily primed to look for even the tiniest difference and were making forced choices between "different" and "no difference". Less than 1% difference suggests no real difference.


If it were 606 comparisons of one image I would agree. As it is, it's (on average) just ~2 comparisons of each image


That doesn't matter if they're doing paired comparisons, as they are. 2 comparisons of a few hundred images yields far more reliable results than a few hundred comparisons of 2 images, because there are measurements both within and between images.

An analogous situation: if someone runs a blood pressure clinical trial, whose results will you believe more - a trial which measures one person's blood pressure on and off a drug several hundred times over a year or two, or a trial which measures several hundred people's blood pressure at the beginning and end of the trial? Obviously the latter, because we know that there are big differences between people which must be measured if we want to make reliable predictions about the effect of the drug in the rest of the population, while additional blood pressure measurements of one person reduce variability only a little (because most of the sampling error was removed by the first pair of measurements, and further measurements leave the bulk of the variance unaffected).
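To put rough numbers on that argument, here is a minimal sketch under a simple random-effects model: each image has its own true effect, and each comparison adds independent noise on top. The function name and both sigma values are made up for illustration; nothing here comes from Flickr's data.

```python
def variance_of_mean(n_images, reps_per_image, sigma_between, sigma_within):
    """Variance of the overall mean under a simple random-effects model.

    sigma_between: spread of the true effect from image to image.
    sigma_within:  noise added by each individual comparison.
    """
    between = sigma_between ** 2 / n_images
    within = sigma_within ** 2 / (n_images * reps_per_image)
    return between + within


# Same budget of 606 comparisons, split two ways (sigmas are hypothetical).
many_images = variance_of_mean(303, 2, sigma_between=1.0, sigma_within=1.0)
few_images = variance_of_mean(2, 303, sigma_between=1.0, sigma_within=1.0)
print(many_images, few_images)
```

The between-image term shrinks only with the number of images, so piling more repeats onto a couple of images barely helps once the within-image noise is small.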


This is interesting but would be more interesting if they linked to the software that they are using and gave information about the settings used.


Excellent writeup! I'm amazed people signed up to play their test game. Does anyone know if the points you get translate to any sort of other reward, or is a place on the leaderboard more than enough of an incentive to keep playing?

Personally though, I think the linked article about on-the-fly resizing[0] was a more interesting "technical" read.

[0] http://code.flickr.net/2015/06/25/real-time-resizing-of-flic...


As someone who participated in an earlier iteration of this, I can answer that there was no "prize" (I don't think there was even a leaderboard on the first iteration of this). I can't speak for everyone, but I participated mainly for two reasons: 1) to make sure a crappy compression didn't eventually affect the images I posted there, and 2) the geek challenge of finding which image was compressed.


I read this looking to see if there was anything we could adapt for mod_pagespeed. Making images smaller without looking worse is, of course, what anyone optimizing images wants. Unfortunately, all they say about how they actually did their perceptual compression is:

    Ideally we’d have an algorithm which automatically tuned
    all JPEG parameters to make a file smaller, but which
    would limit perceptible changes to the image.  Technology
    exists that attempts to do this and can decrease image
    file size by 30-50%.
As far as I know, the only practical way to do this is to do a binary search for the quality level that gives a desired SSIM [1], but this is kind of expensive and I was hoping that they had something cool to tell us here.

[1] If you're interested in getting this kind of improvement and your images are viewed enough that it's worth doing more intensive encoding to get things exactly right, you could do this:

    def evaluator(image, candidate_quality):
      compressed = jpeg_compress(image, candidate_quality)
      return dssim(image, compressed)

    best_quality = binary_search(
      image, desired_dssim, candidate_qualities, evaluator)
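A slightly more concrete sketch of that search, in Python. It assumes the error metric never gets worse as quality goes up, which real encoders only approximately satisfy. `lowest_quality_within` and `fake_dssim` are hypothetical names; `fake_dssim` is a stand-in for `dssim(image, jpeg_compress(image, q))`, not a real metric.

```python
def lowest_quality_within(evaluate, max_dssim, lo=1, hi=100):
    """Return the smallest quality in [lo, hi] with evaluate(q) <= max_dssim.

    Assumes evaluate(q) never increases as q increases (an assumption).
    Returns None if even the highest quality misses the target.
    """
    if evaluate(hi) > max_dssim:
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if evaluate(mid) <= max_dssim:
            hi = mid        # mid is good enough, so try lower qualities
        else:
            lo = mid + 1    # mid is too lossy, so go higher
    return lo


def fake_dssim(quality):
    # Hypothetical stand-in for dssim(image, jpeg_compress(image, quality)).
    return 1.0 / quality


best = lowest_quality_within(fake_dssim, max_dssim=0.013)
print(best)
```

Each probe costs one full encode plus one metric evaluation, so over a 1-100 quality range this takes about seven or eight encodes per image.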


You can use interpolated bisection instead of a straight one (I've tried using polynomial interpolation in imgmin and it saves a step or two).

I also wonder whether it would be possible to narrow down the range with only a sample of the image (i.e. pick "representative" subregion(s), bisect on that, then try that quality on the whole image).
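For the interpolation idea, here is a rough sketch using plain linear interpolation between the bracket's endpoints (not the polynomial fit mentioned for imgmin, and not imgmin's actual code). `interpolated_search` and `toy_dssim` are hypothetical, and the metric is again assumed to be non-increasing in quality.

```python
def interpolated_search(evaluate, max_dssim, lo=1, hi=100):
    """Find the smallest quality with evaluate(q) <= max_dssim.

    Instead of probing the midpoint, probe where a straight line through
    the bracket's endpoints predicts the target is crossed.
    Returns (quality, probes), or (None, probes) if `hi` misses the target.
    """
    f_lo, f_hi = evaluate(lo), evaluate(hi)
    probes = 2
    if f_hi > max_dssim:
        return None, probes
    if f_lo <= max_dssim:
        return lo, probes
    # Invariant from here on: evaluate(lo) > max_dssim >= evaluate(hi).
    while hi - lo > 1:
        frac = (f_lo - max_dssim) / (f_lo - f_hi)
        mid = lo + int(round(frac * (hi - lo)))
        mid = min(max(mid, lo + 1), hi - 1)  # stay strictly inside the bracket
        f_mid = evaluate(mid)
        probes += 1
        if f_mid <= max_dssim:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
    return hi, probes


def toy_dssim(quality):
    # Hypothetical, perfectly linear error curve, for illustration only.
    return (100 - quality) / 100


quality, probes = interpolated_search(toy_dssim, max_dssim=0.203)
print(quality, probes)
```

On smooth curves the interpolated probe tends to land near the answer quickly; on badly behaved curves it can take more probes than plain bisection, so a cap on probes may be worth adding.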


There are tools like JpegMini, which try to detect when artifacts become annoying.

There's an open source re-implementation of their published algorithm on GitHub, but they have patents, which might be part of Flickr's reticence:

https://github.com/dwbuiten/smallfry




Nice. I've been using Jpegoptim for years.

Do you know if it uses a similar strategy to imgmin?


imgmin and jpegoptim are orthogonal.

imgmin finds the lowest-good quality, and jpegoptim optimizes compression at that quality.


Okay, but how is this "perceptual image compression" done?


> Technology exists that attempts to do this

Clearly.


It's odd that this is missing. Without it, it's not possible to analyze for biases w.r.t. the stimuli used or even the perceptual features they're adjusting.


There are a few human quality estimate corpora out there. I know a year or two ago they were re-running the quality assessments with RED camera footage. Check out the NYU/UTexas/Waterloo labs working on SSIM; I bet you can download them.


I wonder if JBIG2-style pattern matching and substitution (see the Xerox fail) could be used on photos in JPEG format.

If the 8x8 blocks in a JPEG were deduplicated lossily, the Huffman compression should work better, even if you kept the high frequencies.

http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...


I've been using Flickr for a while now, and am a regular visitor to flickr.com/explore. It has always been so painfully slow to load. Moving to the final photo page is an even bigger task. Sometimes, the thumbnails take two minutes to load, but if I get a direct link to the 4MB image, it downloads in literally 5-10 seconds.

I'm waiting to get home to see if they've done anything to fix this mess.


Totally agree. When I'm on Reddit and I spot the link is to flickr.com (rather than say imgur.com) I'll usually skip the link and move to the next post as I know the Flickr photo will take longer than I'm willing to wait.


It's absurd, and it doesn't function! Fix it, Flickr!


Does anyone know what the equivalent ImageMagick convert command would be for JPEG and WebP? If Firefox supports WebP in the near future, and IE and Safari support can be handled through JavaScript decompression, what size reduction would a similar process on WebP give compared to JPEG?


    If Firefox supports webp in the near future...
Where are you getting this? My understanding is FF still doesn't intend to add WebP support. https://bugzilla.mozilla.org/show_bug.cgi?id=856375


Yes, this was wishful thinking on my part. sp332 is right, there was no decision to implement this. They chose to put their efforts towards mozjpeg instead: http://calendar.perfplanet.com/2014/mozjpeg-3-0/


That thread was effectively closed because the conversation became unproductive. There was no decision on implementation of WebP. See this comment for a summary https://bugzilla.mozilla.org/show_bug.cgi?id=856375#c170


ImageMagick supports WebP, and has for quite a while. I currently use the prepackaged Ubuntu version on my desktop, ImageMagick 6.7.7-10 2014-03-06 Q16, which supports it.


> 15% fewer bytes per image makes for a much more responsive experience.

I wonder how they measured this. Seems to me like perceived speed gains would be sub-linear in file size, particularly since actual transfer time is often sub-linear too once you factor in network latency.
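A back-of-the-envelope sketch of why the gain would be sub-linear, using a deliberately crude model: a fixed number of round trips plus bytes over bandwidth. Every number below (500 KB image, 1 MB/s link, three round trips at 100 ms) is invented for illustration, not measured.

```python
def load_time(size_bytes, bandwidth_bytes_per_s, round_trips, rtt_s):
    """Crude load-time model: fixed latency cost plus transfer time."""
    return round_trips * rtt_s + size_bytes / bandwidth_bytes_per_s


# Hypothetical image and network; the second image is 15% smaller.
before = load_time(500_000, 1_000_000, round_trips=3, rtt_s=0.1)
after = load_time(425_000, 1_000_000, round_trips=3, rtt_s=0.1)
saving = 1 - after / before
print(before, after, saving)
```

Under these made-up numbers, 15% fewer bytes buys less than 10% off the load time, and the share shrinks further as latency grows relative to transfer time.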


The irony is that this additional 15% compression was noticed by users, not because of improved responsiveness or load times but because it degraded image quality enough for them to see it.


I agree that on mobile, latency is usually a far bigger factor than throughput when it comes to perceived speed. I also wonder what the performance cost or benefit of more heavily compressed images is once the image is downloaded. Are they faster due to a smaller footprint? Or are they more costly for the CPU to decompress? (I would assume the former for JPG, while the latter is often the case with video.)


This is all impressive, but Flickr needs to improve their page load times, a lot. I like Flickr a lot, and as a paying customer, I'm tempted by 500px every time I visit them.


Took me far too long to realise the screenshot of the comparison test was not interactive. :(


Does Netflix do this too? Or does Netflix just get, like, insane raw source files?



