Image compression and device-sensitive transcoding are important for mobile and emerging markets, and are somewhat challenging to deploy on your own. For larger sites we acknowledge there is some tuning & thought that must be applied, and we'd be happy to help via the discussion groups (mod-pagespeed-discuss, ngx-pagespeed-discuss). We are also working on a new release that will make this easier. For smaller sites, things should work out of the box. As with all PageSpeed issues, please let us know if they don't, or you think there's something specific to improve.
We've heard user concerns about SEO, and are happy to report that this should be improved with our latest beta release (1.0), which adds 'link canonical' attributes to image response headers. This allows search engines to point back to the origin image.
It's really tricky for people to figure out how to build this and deploy it in production.
I have an nginx build script with pagespeed that I use for my Docker images. I don't think I leverage pagespeed enough - it would be great if you could point out what needs to change (and others could learn from it).
Better yet, an nginx .deb repo with pagespeed already compiled in. Like the one nginx provides[1].
I'm very reluctant to use anything that doesn't come packaged. Compiling your own software only really works when you're constantly maintaining the same project and its stack. In every other situation, it's a hassle, and I always fall behind.
Hey jdmarantz, love ngx_pagespeed, been using it for years now and integrating it into my Centmin Mod LEMP stack installer's Nginx builds: http://centminmod.com/nginx_ngx_pagespeed.html. The benefits of ngx_pagespeed are truly amazing :)
If you think about it, HTTP/2 allows requests to be multiplexed over one TCP connection, but each page element is still around the same size.
ngx_pagespeed can optimise those page elements (CSS/JS) and reduce image sizes, e.g. PNG-to-WebP conversion on the fly, so each page element is smaller.
Ultimately, reducing the size of each page element = smaller data transfers = faster page loads :)
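Roughly, the relevant bits of ngx_pagespeed config look like this (just a sketch; the cache path is a placeholder and the filter list is an example, not a recommendation):

    # enable the module and give it a writable cache directory
    pagespeed on;
    pagespeed FileCachePath /var/ngx_pagespeed_cache;

    # minify CSS/JS and recompress/convert images (e.g. JPEG/PNG -> WebP
    # for browsers that advertise support)
    pagespeed EnableFilters rewrite_css,rewrite_javascript,recompress_images;
    pagespeed EnableFilters convert_png_to_jpeg,convert_jpeg_to_webp;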
Seems as if people have had a few negative experiences of 'pagespeed'. Why is this?
Well, not every Pagespeed feature works with every site, but that tends to get learned the hard way. Those lazy-loading images were working a treat on Friday; come Monday, no Apple users could see any images, and in an instant Pagespeed gets removed.
To take an automotive analogy, Pagespeed is a bit like doing things to a car for improved 'performance'. Let's save weight, let's put some active suspension in there, let's remove a few bits to improve the aero, let's put some 100 octane fuel in the tank and chuck in some twin turbo. All these improvements - let's keep going there are 50 more tweaks we can do in the garage, why not try 38 of them at once? What could possibly go wrong?
So the main problem with Pagespeed is not how it works; it is that it encourages people to keep going, layering on more 'improvements' until the website does the computing equivalent of leaving the road and spontaneously bursting into flames.
That sums up my experience pretty well. I tried installing it for a couple of sites and it seemed really cool at first. But then weird things started to happen with styles and such, and it seemed like it was going to take a lot of research and debugging to get everything dialed in. Not really having time to do that, I disabled it and put it on the list for later. I wouldn't call it a "negative" experience necessarily, just not the easy win that I was hoping for.
I do think it would be awesome for some legacy code, if it were critical to get the performance tuned up and you had a day or so to get everything configured.
You have to enable options and test the impact.
For example, on my cheap VPS, when I enable gzip as Google recommends, load time actually increases. Maybe it would be lower for someone on a pigeon connection, but not for the average case.
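For what it's worth, the gzip cost is tunable; on a weak CPU a low compression level and a minimum size are usually a reasonable middle ground (a sketch, the types and sizes are just examples):

    gzip on;
    gzip_comp_level 1;        # cheapest CPU-wise; higher levels buy little extra
    gzip_min_length 1024;     # don't bother compressing tiny responses
    gzip_types text/css application/javascript application/json image/svg+xml;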
> Recognizes common JavaScript libraries and redirects each to a canonical URL.
> Inserts Google Analytics javascript snippet.
While these are useful (and apparently optional) features, it leaves a bad taste in my mouth to think that a business reason for Google to support this library could be to lower the barrier of entry for developers to allow Google to track their users.
Both are optional. In fact, the canonical-library list is produced by a shell script that you need to run; it contains library signatures (hashes) and their URLs, and you can fully customize it. The GA option is turned off by default, and you need to specify your tracking ID anyway.
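For reference, both knobs are explicit in the config and nothing is injected unless you opt in (a sketch; the tracking ID is a placeholder):

    # Google Analytics insertion: only if you enable the filter and set an ID
    pagespeed EnableFilters insert_ga;
    pagespeed AnalyticsID UA-XXXXXXXX-1;

    # library canonicalization is a separate, optional filter; the script
    # mentioned above emits the 'pagespeed Library <bytes> <hash> <url>;' lines
    pagespeed EnableFilters canonicalize_javascript_libraries;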
I would argue that adding a Google Analytics JS snippet is the easiest thing possible for an average user while adding an Nginx plugin is infinitely more difficult.
In most cases yes. I have a feeling there's a customer somewhere running an old CMS or has a big jungle of HTML documents but not the resources to modify them all accordingly.
You know that inserting a Google Analytics tracking script is a.) one of the most common things a web developer does anyways and b.) a very helpful and common-sense feature coming from a Google product?
> You know that inserting a Google Analytics tracking script is a.) one of the most common things a web developer does anyways and b.) a very helpful and common-sense feature coming from a Google product?
Not at all. I realize that Google, with its dominant search-engine market share and its obfuscation of search terms in the referrer URL, makes visitor analysis literally impossible without joining the Google Analytics dragnet, but that doesn't mean that everyone is willing to do so, nor that they will do it every time. It may in fact be illegal for one to do so.
One example would be doing business with local government agencies and having to comply with EU data-locality rules. Will adding Google Analytics to my site violate those rules? I don't know. But I know I wouldn't leave my compliance up to chance. Automatically adding third-party US-based user tracking to my service certainly doesn't sound very compliant.
Another example would be if you already have a concrete TOS with your customer base which does not include leaking customer data to an unrelated third party, especially if you deliver on-premises software to paying customers. If I bought some software, deployed it on my network, and then discovered it was spying on me and reporting everything I did to Google, I would be pretty pissed off.
Adding it automatically to my product "out of the box" for a completely unrelated service (nginx optimizations) is akin to the drive-by installers of the Windows desktop age.
You know what they say about ducks: If it looks like a duck, quacks like a duck, and walks like a duck, it's probably a duck.
This looks like spyware, deploys like spyware and behaves like spyware. I think it's safe to call it spyware.
Google: Remember that thing about not being evil? Yeah, let's get back to that.
> ... makes visitor analysis literally impossible without joining the Google Analytics dragnet,
Google Analytics does not give you referrer information from Google search. They keep that to themselves regardless of whether you use their analytics software or not.
The only things they do are (a) integrate with Webmaster Tools, which you can use without Google Analytics to get general search-term aggregates, and (b) integrate with Google Ads to show your campaign performance, which any other analytics service can do as well.
Something can be good for a corporation you don't like AND good for the world at the same time, without any implied tradeoffs or dichotomies.
Yes, Google is betting heavily on web-everything-all-the-time because they know how to monetise the web (and their competitors don't). Making the web better furthers this goal, but it also makes the web better, so there's that.
When we start seeing things that will break/degrade the web for apps and users that aren't in the Google (ad/tracking) garden, that's when we start to worry, but there's been very little, if any, of that.
We ran this in production from its initial release over two years ago. Last month, we moved to a new datacenter and decided to forgo ngx_pagespeed entirely.
The results have long been mixed - it does a great job minifying content and reducing requests, but the net results are highly questionable. While we reduced our network transfers and round-trips, the amount of time it took to serve requests was significantly increased to a point where the net gains were minimal. We played around with enabling and disabling the various filters and changing the settings and configurations to a great extent, and filed quite a few issues with the (ngx)pagespeed teams as we came across them.
If you do use this module, don't enable image recompression. It'll use up a lot of CPU, but worse, it'll also add significant latency to your requests. The caches on disk will always be full, and your disk IO will increase across the board. My personal favorite filter was the CSS merger which pulls various CSS resources into a single file. My favorite feature that always let me down was the "prioritize critical CSS" filter which (in theory) inlines CSS needed for above-the-fold content to display immediately. It never worked reliably for us, and broke our CSS more times than I can count. We always went back every few months to try it and always came away disappointed.
ngx_pagespeed does not play well with caching (it must be enabled on the outer-most node) as it relies on realtime metrics collected via JS from served pages to make decisions about content, caching, inlining, and more. It also doesn't work well for SEO without significant manual tuning, as it rewrites content URLs (I hear there is a way around this on Apache, but not on nginx) to pass back information to nginx, appending .pagespeed.xxxxx to your non-HTML URLs, including images that will be picked up by Google's bots. If you disable pagespeed later, all these will be broken. There's also no guarantee that these links will remain valid across pagespeed versions or if you enable/disable certain ngx_pagespeed modules/filters, though I can't say I've been bitten by that (solution: you need to hack your nginx configuration to always include a Link rel=canonical header in your outgoing responses).
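Something along these lines is what I mean by that hack (an untested sketch; the regex and variable name are illustrative, and the idea is just to strip the .pagespeed. suffix back off and advertise the original URL):

    # for rewritten resources, expose the pre-rewrite URL as the canonical one
    location ~ "^(?<original_uri>.+)\.pagespeed\.[a-zA-Z0-9._+-]+$" {
        add_header Link '<$scheme://$host$original_uri>; rel="canonical"';
    }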
There are open issues galore regarding HTTPS and ngx_pagespeed - it's an ongoing battle to make it play nicely with HTTPS upstreams for caching. It also basically proxies all requests made to your site back to your site once more so it can cache and optimize them, adding an entire roundtrip to initial or uncached requests. By default it won't cache HTML, so all HTML pages get proxied an extra time. There's an option to disable this for static resources by loading them from disk instead, but good luck getting that to play nicely with multiple upstreams and overlapping paths taking advantage of rewrite rules.
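For reference, the load-from-disk option mentioned there is the LoadFromFile directive, and the usual workaround for HTTPS origin fetches is MapOriginDomain. A sketch with placeholder hosts and paths:

    # let pagespeed read static assets straight from disk instead of
    # re-fetching them over HTTP from its own vhost
    pagespeed LoadFromFile "https://www.example.com/static/" "/var/www/example/static/";

    # fetch resources over plain HTTP from localhost even though the
    # public site is served over HTTPS
    pagespeed MapOriginDomain "http://localhost" "https://www.example.com";

As noted above, this gets hairy once multiple upstreams and overlapping rewrite rules are involved.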
In the end, it just wasn't worth it for us. We've configured our webapp to pull in, minify, and inline its own (entire) CSS on first page load (about 20 lines of code w/ the Yahoo minify library doing the crunching), then configured nginx to reduce latency (http2, ssl_buffer_size, ssl session caching, smart keepalives, gzip, ssl_cipher optimizations (hint: bigger isn't better), proper expire headers, MTU optimization, etc) and the results have been better than what ngx_pagespeed was giving us.
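The nginx side of that tuning looks roughly like this (a sketch with placeholder values, not our exact production config; the expires bits belong in static-asset locations):

    listen 443 ssl http2;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1h;
    ssl_buffer_size 4k;              # smaller TLS records help time-to-first-byte
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    keepalive_timeout 30s;
    gzip on;
    gzip_comp_level 5;
    expires 30d;
    add_header Cache-Control "public";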
Sorry for how unstructured this comment is, I've been remembering stuff as I go along. By all means, ngx_pagespeed is an awesome effort. As a primarily non-web developer, I think this is what browsers and HTTP servers should do by default: act as a compiler of sorts for your content and find and implement optimizations where possible, because your job isn't to figure out how to minimize network requests but rather to write code that works. However, these are the early days of "web compilers" and the intelligence and logic are far from being any good just yet. Just like 30 years ago no compiler could come close to even an average developer's hand-crafted (not even tuned!) assembly, that's how I see the state of "optimizing" proxies today. Give it another decade or two, I suppose :)
I've been using PageSpeed in production for over 3 years. It's definitely not a replacement of manual optimizations, but in some use cases, especially when you don't have full control over the generated assets (such as third-party WordPress plugins), it's better than trying to fork those, or do transformations in PHP.
It requires you to get how it works, the specifics of filters, and so on; it's not so user-friendly, but once you figure it out, it's performing great. Also, the lead developers are very friendly and responsive!
One of them (Otto van der Schaaf [0]), who's not employed by Google, ported PageSpeed to both IIS [1] and ATS (Apache Traffic Server) [2]. I greatly respect Otto! He's such a hard-working, friendly, and dedicated developer! Jeff Kaufman [3] is also an amazing talent!
These are the kind of comments I'll usually spend time scouring Stack Overflow for, in hopes of finding some realistic review or use case in a business setting. Your writeup hits on points that I was worried about and now know I can't trust this magical module to fix. I appreciate the long post!
For me, this really underlines how an already optimised website won't benefit much from it (especially the image recompression bit), and it may even be harmful.
On the other hand, it could work great, properly configured, for small websites that don't get much hand optimisation and could really benefit from generic optimisations.
You could still benefit, because the optimizations are user-agent-specific, and if you aggressively use caching with Nginx, you don't need to make as many live calls to the backend to segment the cache per user-agent type. PageSpeed is written in C++ and highly optimized, so I think it has performance benefits in your case as well. For example, there are user agents for which PageSpeed may decide to deliver your CSS inline and small images as data URIs. It also automatically (but optionally) does lazy loading of images, etc. It has instrumentation beacons built in as well. It's perfect for CMS websites, like in my use case. It's not a panacea though.
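Concretely, the filters being described here are things like (a sketch; enable selectively and test):

    pagespeed EnableFilters inline_css,inline_images;   # inline small CSS; small images become data: URIs
    pagespeed EnableFilters lazyload_images;            # defer below-the-fold images
    pagespeed EnableFilters prioritize_critical_css;    # the beacon-driven above-the-fold CSS filter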
Why is this upvoted so much? This is old news. IMHO ngx_pagespeed is a good option if you have a lot of old legacy stuff which cannot easily be converted to modern web compression and optimization practices.
But compiling your own nginx is not trivial and I would not recommend it for everyone.
It would be great if there were a well-maintained precompiled nginx with the newest security fixes for my Ubuntu that could enable ngx_pagespeed or the new Brotli compression: this is missing in the open-source world.
It's old news, but underused. This is a lifesaver, especially when you have to deal with software that you don't control, which spits out unoptimized assets. If you have compilophobia, use EasyEngine [0] or Centminmod [1]!
Yup, Centmin Mod LEMP stack installer maintains and integrates ngx_pagespeed with its Nginx installs (http://centminmod.com/nginx_ngx_pagespeed.html), along with handy command-line shortcuts to enable and disable pagespeed via the pscontrol command :)
> compiling your own nginx is not trivial and I would not recommend it for everyone
Yes, this is frustrating as an nginx module developer as well. Apache has a loadable module system and we can ship binaries, but to add a module to nginx you have to recompile.
Nginx's developers have been hinting that they're going to add support for loading .so modules, and I really hope they do!
Also, runtime behavior depends on the order of --add-module compilation flags! Got bitten by this when sub_filters weren't working... and then I realized that the order in which [un]gzip and sub_filters are applied isn't driven by the order of the directives in the config file, but by the compilation string - the substitution filtering was being done on gzipped gibberish!
This was probably a Good Idea (tm) for performance reasons - filtration is just a compile-time linked list of function calls. But it's barely documented at all!
I actually did it live on prod boxes at my last job and then used the in-place binary-upgrade signal to seamlessly restart nginx. A dev who'd never touched prod and I wrote an Ansible playbook for it.
This is something I noticed at my previous job (removing user access to gcc, make, ld, etc. so only root can run them[1]) and never understood.
It reminds me of blocking ping to improve security or worse blocking all ICMP.
A compiler, and especially make, are harmless by themselves. They aren't setuid; all a compiler does is translate files from one format (source code) to another (machine code). One might as well block sed, because it could be used to modify /etc/passwd or /etc/shadow.
If someone wanted to compromise hosts and needed binaries, they would precompile them statically: that way it's a single file, with no need for libraries or extra development packages with header files, and it's more likely to work across a wide range of systems.
[1] Because there were Chef recipes that were compiling things :> Also, it decreases security, because now you need to run the compiler as root, so you could be compromised through things like this: http://securitytracker.com/id/1004374
Imagine an attacker is able to inject small files onto the system via a channel that would not let them transmit arbitrary binary data, and that the system is otherwise sufficiently firewalled to prevent them from just downloading their own tools without first further compromising the system. Having a compiler available can make it substantially easier to bootstrap a toolchain to compromise the system fully.
Another issue is that it presents a privilege-escalation concern. If you compile stuff in a user account on the production machines and the result will be run with root privileges, then someone who compromises that user account can put a compiler wrapper in place to embed their own code. Even if you don't do anything else in that account (e.g. sudo) that would let them capture passwords, the compilation itself presents another risk. (As an extension to this: your dev and build environments are security-critical too, but your production environment is often far more vulnerable, not least because it's far more visible.)
I don't think these are very high on the list of things you should worry about, since your system needs to be very locked down before an attacker able to make use of them wouldn't have other, just-as-good opportunities; but the more stuff you run in your production environment, the more opportunities you give an attacker.
>Imagine an attacker is able to inject small files onto the system via a channel that would not let them transmit arbitrary binary data, and that the system is otherwise sufficiently firewalled to prevent them from just downloading their own tools without first further compromising the system.
I'd just send the binaries base64 encoded. Decoding is trivial, with any number of tools commonly installed in the system.
Limiting access to compilers, alone, is 100% useless. You either go a lot farther down that road, or there is no point in starting.
You can compile local exploits instead of having to download them. If the machine is stripped down enough, it can be a good thing. Most of the time it's just an annoyance.
E.g. Windows boxes rarely have a compiler and get hacked all the time.
My view is that if people on my box can run a compiler, they surely can run Ruby, or Python, or PHP or one of the many many other dynamic languages that I have which will let them do whatever it is they want.
> compiling your own nginx is not trivial and I would not recommend it for everyone.
It's still pretty easy. For those that have never done it, it's a configure-like script, make, make install. That's it. It's almost your standard autoconf package routine of ./configure && make && make install, except the configure script is oddly in a subdir and not from autofoo. But it is all covered in the documentation, IIRC.
Yes, compilation itself is not the problem here; the problem is maintaining the whole stack. Making sure the OpenSSL linked into your nginx stays up to date is a good example (what a nightmare). Also, getting a custom nginx running and configured like a standard Ubuntu one doesn't happen out of the box. These are all extra steps that not everybody wants to deal with.
IIRC, OpenSSL is dynamically linked by default in a custom compiled nginx — the same as it would be if you didn't custom compile it, and you have the same amount of work either way maintaining OpenSSL. I maintain a custom compile of nginx, but the custom ends at the boundary of nginx: it uses Ubuntu's libssl, so OpenSSL patches that make it to Ubuntu apply normally to my custom build. The config for nginx is the same: /etc/nginx/…. I can tell you it's exactly the same, as we transitioned from a config from Ubuntu stock nginx to my custom build (and went up a version) with zero issues.
The only real thing that is different is the binary itself, and the installation location. (But I did the latter on purpose, because I didn't want to collide with apt/dpkg's management of /usr/bin.)
Thanks for the information! I will keep this in mind next time. It's been a while since I last used a custom-compiled nginx; it seems I used a statically linked OpenSSL (which has these downsides). I read up a bit more here: https://www.nginx.com/blog/nginx-and-the-heartbleed-vulnerab...
Compiling nginx would be a lot easier if they would stop hard-coding so many things in the makefile. Enabling libc hardening (ASLR, PIE, full RELRO, for example) requires manually editing generated files before compiling and overriding its LDFLAGS, CFLAGS, etc. I have reported this multiple times, but Maxim didn't agree with me that this is a priority.
Cool to see this featured. If anyone's interested in actually playing around with ngx_pagespeed, I wrote a pretty straightforward tutorial on my blog to get up and running (granted, I last updated it in 2013, so some things may be outdated).
This made me chuckle. I've been doing this type of stream editing for years, on the client side. Of course I strip out Javascript. In addition to blocking google-analytics, doubleclick, etc. client-side, via DNS. Google reminds me of Goldman Sachs: trying to take both sides of a bet. But Google can never deliver the best experience for the user, because the best experience does not include redirects to set cookies, tracking, ads and other nonsense that serves no benefit to the user. Either Google is for advertisers or it is for users. Users do not pay Google. Draw your own conclusions.
ngx_pagespeed and its Apache sibling mod_pagespeed are absolutely incredible tools for improving the performance of any website. It automates a lot of work that used to have to be coded manually in your application and moves the responsibility to the web server. In the end, it lets you minify, shard, optimize, and apply many other performance optimizations with a dead simple configuration file, without ever worrying about implementing those yourself.
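A minimal sketch of what "dead simple" means in practice (the cache path is a placeholder; the location blocks are, as I recall, the ones from the module's install instructions that let it handle its own rewritten-resource and beacon URLs):

    pagespeed on;
    pagespeed FileCachePath /var/ngx_pagespeed_cache;

    location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" { add_header "" ""; }
    location ~ "^/pagespeed_static/" { }
    location ~ "^/ngx_pagespeed_beacon$" { }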
> Either Google is for advertisers or it is for users.
I think the OP is confusing strategic and tactical decisions by Google. From a strategic, board-level view, yes, Google is an ad company, and reducing web page size without stripping ads is a bit disingenuous. Strategically Google will never drop ads, so the next best step for this project probably won't happen.
However, tactically Google benefits from paying clever people to make great products irrespective of their strategic fit, and this seems like a good product.
I had never heard of it, and will build it in my pipeline RSN.
Pagespeed is absolutely one of the worst plugins in the world. mod_pagespeed for apache is garbage, and permanently breaks some websites when removed. I see no reason why the nginx version would be any better.
I'm sorry it didn't work out! Permanently breaking a website when removed is very surprising, since if you uninstall the module it's not there anymore to have an effect. Would you be up for explaining more?
(I work on mod_pagespeed, and if we're breaking sites on removal that's something I want to look into.)
This kind of answer from Jeff (and other people) on the MPS team is part of what made me enjoy using MPS. If you have any issue with MPS and send an email to the mailing list, you will always receive a kind answer, even in those cases where it is obvious you didn't read the docs.
I "worked" for months with Jeff and his team on a really tricky bug (well, they mostly did the work, I sent the bug reports) and they went above and beyond to find and fix it. (Remember the #1048 issue?)
I definitely recommend MPS, it will not "break your website".
I agree, Jeff and Otto have an amazing ethic and truly care for the project. I wish more projects were lucky enough to have such leadership. Otto is not even with Google, but he has spent endless hours trying to help me and never tried to pitch consulting services, although it seems that this is his business.
I have been using it for years on a very high traffic website and it works great. Doesn't break a thing for me. The image optimizations it does are awesome.
Yes, it adds predictable hashes to the URL to help with automatic cache busting; you can easily create a simple Nginx configuration to normalize them. As @jdmarantz said above, the latest version adds canonical URL headers to solve this issue without rewrites.
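For example, if pagespeed is later disabled, something like this can fall back to the original asset for stale rewritten URLs (a hypothetical sketch; the regex and capture name are mine, and it assumes the originals are static files on disk):

    location ~ "^(?<original_uri>.+)\.pagespeed\.[a-zA-Z0-9._+-]+$" {
        try_files $original_uri =404;
    }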
MPS has been an absolute world saver for years for us (until we recently moved to an angular app, which ended up reducing the need to use MPS).
I'm very curious about what could have made your experience that bad. Mind elaborating?
Side note: I never used the nginx plugin, for good reasons: it has never been a real priority for the MPS team, which seems to be focusing on the Apache version.
Are there any precompiled packages for Nginx for this? Looks like an awesome project, but I'm not a huge fan of compiling from source because it makes maintainability difficult in future. Having to compile Nginx from source just so I can use PageSpeed isn't something I look forward to.
Just script the compilation so it's easier over time. That's what I do for my LEMP stack installer and its source-compiled nginx. There's a shell-based menu option I can run to recompile nginx for upgrades, downgrades, or plain recompiles: http://centminmod.com/menu.html :)
Ahh, Kloudsec [0] uses a fork of this in our free CDN service for our users. The issue with something like this is that
a) most website owners who want to get things done will find it hard to recompile, optimize, and deploy in production.
b) most websites on the web are really just WP sites hosted on X hosting provider
And yet we truly believe that something like PageSpeed is a worthwhile investment (built by one, deployed for many). So we built Kloudsec to make optimizations like this available for free in our CDN layer (which is also free).
---
Most importantly, it is super easy to deploy. No programming or devops required. Register an account, add your domain, then update DNS records. (You get to keep your favorite nameservers.)
I've been a heavy user of mod_pagespeed (the Apache version of Google's plugin) for about 5 years now. It's been an absolutely incredible tool.
I am now, however, wondering what the future of MPS is with the advent of HTTP/2. A lot of the work MPS does, and was designed for, is optimizing around HTTP/1.1 shortcomings, and much of it seems irrelevant in the HTTP/2 world, and even detrimental in some cases (take URL sharding, for example).
It can still add tremendous value. For example, I do all my CDN offloading with it, plus domain sharding, making all paths relative, etc. It has a growing number of filters; combining assets is just one of its most basic features. It has a lot. Minification is still gonna be useful with HTTP/2.
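The domain-mapping bits look roughly like this (a sketch; the hostnames are placeholders, and sharding is mainly worth it pre-HTTP/2):

    pagespeed EnableFilters rewrite_domains;
    # point rewritten assets at a CDN hostname
    pagespeed MapRewriteDomain cdn.example.com www.example.com;
    # spread requests across shard hostnames (an HTTP/1.1-era trick)
    pagespeed ShardDomain cdn.example.com cdn1.example.com,cdn2.example.com;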
Beware if you're using this BEHIND a separate caching layer such as Varnish. That's not well supported (and it contributed to an outage for us; we used auto-scaling groups, and that shot us in the foot when first trying out pagespeed).
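There is a downstream-cache integration that you have to wire up explicitly, which is where this tends to go wrong; the knobs look roughly like this (a sketch with a placeholder Varnish address, and definitely read the docs before trusting it):

    pagespeed DownstreamCachePurgeLocationPrefix http://127.0.0.1:6081/;
    pagespeed DownstreamCachePurgeMethod PURGE;
    pagespeed DownstreamCacheRewrittenPercentageThreshold 95;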
In itself this module is pretty cool, but you're actually better off simply internalizing the various practices within your own code & infrastructure, with a CDN in front.
None of this is to knock the Pagespeed plugin. It IS really cool, and in an organization with many silos (like Google) it has many benefits.
We used this a lot when we had an old project to maintain, and it's great! Today we have a new version where everything is optimized up front, so we turned it off.
Anything that impacts the readability of the markup sent to the client is an atrocity - not that Google would care about this since they are notorious abusers of js obfuscation. Don't be evil indeed.
Please give us a shout on our discussion group or bug list and we'll try to help sort through any other issues that you see: https://github.com/pagespeed/ngx_pagespeed/issues?q=is%3Aope...