See also https://github.com/bslatkin/dpxdt , a more featureful/complex implementation of the same idea. It also has some discussion of why you'd want to do this.
I wrote https://github.com/gabrielrotbart/gatling ages ago. It integrates visual testing into normal RSpec specs. Focusing on specific elements rather than full pages makes it more stable.
I'm sorry but… are you an idiot? This is the coolest thing ever. Like, my mind is blown right now. I futzed with something that did the whole page but wasn't really impressed by it. But this is 100x better. Like, you changed my life this morning.
Thanks (I thought that line was going to end differently considering how it started).
I didn't put it out there because, although it's functioning perfectly, I'm not happy with the internals and would like to find time to refactor it before sending it out to the world.
You would just need to run a script that runs phantomjs (with casperjs you can run tests at the same time) and saves screenshots of different pages in your app. Then it checks in the images (possibly in a different repo that is just for comparing screenshots). This could be done as part of continuous integration. Whenever you deploy, you could generate a compare view against the last deployed revision.
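As a rough sketch of the "compare with the last deployed revision" step, assuming the screenshots from each revision have already been saved as PNG files in two directories (the function names and the directory layout here are hypothetical, not part of any of the tools mentioned), something like this could flag which pages need a closer look:

```python
import hashlib
from pathlib import Path


def digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_screenshot_dirs(old_dir, new_dir):
    """Classify PNG screenshots as changed, added, or removed.

    Assumption: both directories use the same file name for the same page.
    """
    old = {p.name: digest(p) for p in Path(old_dir).glob("*.png")}
    new = {p.name: digest(p) for p in Path(new_dir).glob("*.png")}
    return {
        # Present in both revisions but with different bytes.
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
        # Only present in the new revision.
        "added": sorted(new.keys() - old.keys()),
        # Only present in the old revision.
        "removed": sorted(old.keys() - new.keys()),
    }
```

Note that this compares file bytes, which is stricter than visual equality: two PNGs can look identical and still differ on disk. It is only a cheap first pass to decide which pages deserve a real image diff.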
Disclosure: I'm a developer of this tool & work for BBC News.
We use this to catch things that we'd never spot ourselves. A good example of how it has helped us came after a giant Sass refactor: we ran this tool against the site with the old CSS and the site with the new CSS, and it showed us where things were off by a pixel or two (as well as big formatting changes).
We've found this tool rather valuable. We're not saying everyone will, or that others should use it. We open sourced it because it might help someone.
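To illustrate the kind of comparison that surfaces "off by a pixel or two", here is a minimal sketch of a pixel diff. It is not the tool's actual implementation. It assumes both screenshots have already been decoded into equally sized nested lists of RGB tuples, and `diff_pixels` is a hypothetical name:

```python
def diff_pixels(before, after, tolerance=0):
    """Count differing pixels and return (count, bounding_box).

    before/after: lists of rows, each row a list of (r, g, b) tuples.
    bounding_box is (min_x, min_y, max_x, max_y), or None if nothing differs.
    """
    if len(before) != len(after) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(before, after)
    ):
        raise ValueError("images must have the same dimensions")

    count = 0
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(before, after)):
        for x, (pixel_a, pixel_b) in enumerate(zip(row_a, row_b)):
            # A pixel differs if any channel is off by more than the tolerance.
            if any(abs(a - b) > tolerance for a, b in zip(pixel_a, pixel_b)):
                count += 1
                xs.append(x)
                ys.append(y)

    box = (min(xs), min(ys), max(xs), max(ys)) if count else None
    return count, box
```

The bounding box is handy for reporting: it tells a reviewer where on the page to look rather than just that something changed.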
I think the value is in scale. We're currently working on a very large product site (not eCommerce) that has thousands of products and thousands of related content entities. QAing that is a nightmare, but once you get through the base pass, imagine rolling out changes. Being able to run this over thousands of pages, seeing the potential changes over time, will be hugely useful.
This is extremely useful if you are trying to build a responsive site. Imagine you have to test 10 different screen resolutions... even using an extension, the process would probably take up to a minute, whereas this is just running a command. Now imagine the process repeated X times...
Bear in mind that this uses a WebKit-based browser for all shots, so it doesn't give cross-browser comparisons.
It is for regression testing of CSS modifications, whether made by a developer or produced by running some kind of compression tool on source code you know is correct.
I am interested in limited image recognition and pattern matching for an iPad app of mine. For example, I'd like to recognize when there are the red, yellow, and green buttons to the left of a point in an image and the edge of an active window just above, so I can provide instant feedback of some kind (probably based on a filtered+recolored version of the top of the window which is being moved) for a VNC app. Eventually, there will be feedback from the server, but limited feedback generated locally will feel much better.
I'm by no means a computer vision expert, nor did any of my education cover the subject, but I have been dabbling in it long enough that I started a company based on limited CV.
What I do is think of ways I can cheat. Sometimes an algorithm that looks really complex in mathematical notation boils down to a for loop for X, and a for loop for Y, iterating over every pixel in the image looking for some feature.
If you know where your buttons will appear, roughly, you can scan that rectangle in your double for loop until you hit a long enough sequence of red, then continue scanning for a sequence of yellow, then green. If you're looking for the edge of a window, scan for a solid line of the window border color.
Basically, think of the simplest possible way to express what you want to know. Your algorithm doesn't need to have a clue what a button is, or have any real intelligence. It feels like cheating, really, but it works quite well for simple tasks.
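A minimal sketch of that double for loop, assuming the image is already a nested list of RGB tuples. The color values, tolerance, and function names are made up for illustration and would need tuning against real screenshots:

```python
# Hypothetical target colors; real window buttons will have different values.
RED = (255, 0, 0)
YELLOW = (255, 255, 0)
GREEN = (0, 255, 0)


def is_close(pixel, target, tolerance=40):
    """True if every channel of pixel is within tolerance of target."""
    return all(abs(c - t) <= tolerance for c, t in zip(pixel, target))


def find_button_row(image, rect, colors=(RED, YELLOW, GREEN), min_run=3, tolerance=40):
    """Scan rect for a row with a run of each color, in order, left to right.

    rect is (left, top, right, bottom) with right/bottom exclusive.
    Returns (y, [start_x of each run]) or None if no row matches.
    """
    left, top, right, bottom = rect
    for y in range(top, bottom):
        wanted = 0   # index of the color currently being looked for
        run = 0      # length of the current run of that color
        starts = []
        for x in range(left, right):
            if is_close(image[y][x], colors[wanted], tolerance):
                run += 1
                if run == min_run:
                    starts.append(x - min_run + 1)
                    wanted += 1
                    run = 0
                    if wanted == len(colors):
                        return y, starts
            else:
                run = 0
    return None
```

The code has no idea what a button is. It only knows "enough red, then enough yellow, then enough green on one row", which is exactly the kind of cheat described above.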
Sounds good. But I've also heard that there are template based algorithms, which would seem to be advantageous for implementing window corner recognition in a variety of operating systems. Basically, I'd just need some images from screenshots. That plus some edge detection would get me most of the way there.
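For what it's worth, the simplest form of template matching is still just nested loops: slide the template over the image and keep the position with the smallest sum of absolute differences. The sketch below assumes grayscale images as nested lists of integers, and `match_template` is a hypothetical name. Libraries such as OpenCV ship optimized versions of the same idea (`cv2.matchTemplate`).

```python
def match_template(image, template):
    """Return (x, y, score) of the best match of template inside image.

    image/template: lists of rows of grayscale integer values.
    score is the sum of absolute differences; 0 means an exact match.
    """
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    if templ_h > image_h or templ_w > image_w:
        raise ValueError("template must not be larger than the image")

    best = None
    for y in range(image_h - templ_h + 1):
        for x in range(image_w - templ_w + 1):
            score = sum(
                abs(image[y + dy][x + dx] - template[dy][dx])
                for dy in range(templ_h)
                for dx in range(templ_w)
            )
            # Keep the first position with the lowest score seen so far.
            if best is None or score < best[2]:
                best = (x, y, score)
    return best
```

With one small template per operating system's window corner, you would run this for each template and accept the match only if the score is below some threshold you pick by experiment.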
Great to see more visual regression testing tools on the block. https://github.com/Huddle/PhantomCSS has been around a while now and does pretty much the same thing (as far as I can tell).
Good reporting on failed regressions is almost as important as detecting them, which is why I developed this on top of PhantomCSS: https://github.com/Huddle/PhantomFlow
Is there any reason why you couldn't do this non-visually? That is, using PhantomJS to determine that a DOM element is within certain bounds for a position using if statements?
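To make the question concrete, the non-visual check would look roughly like this. It assumes the element's box has already been read from the browser (for example via `getBoundingClientRect()` evaluated in the page) and passed in as a dict; `within_bounds` and the 10px tolerance are hypothetical:

```python
def within_bounds(box, expected, tolerance=10):
    """True if x, y, width and height are each within tolerance of expected."""
    return all(
        abs(box[key] - expected[key]) <= tolerance
        for key in ("x", "y", "width", "height")
    )
```

You would need one such expectation per element you care about, which is the maintenance cost to weigh against a whole-page screenshot diff.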
Simplicity. This automatically tests the whole page "for free".
Robustness of feedback for time spent. A small visual change could cascade into lots of test cases that must be updated (rendering is hard; look how long it took IE to get it right). You don't need to update any tests with this tool; you just give the OK to a visual diff with expected changes.
Dynamic content. The BBC homepage is going to be different every day. Writing tests to test this could be arduous and error-prone. Testing production content on build 779 vs 780 is a simple way to test regressions.
More importantly, you don't need someone technical to tell if it's an important change. This can easily be signed off by QA, designers, PMs, etc. They can see what it was before, after and what changed and decide if that's OK.
Also, the difference between a wanted visual change and a break is dependent on the change you want. That makes this quite a flexible tool (it's not just "any difference is wrong") that can be used simply and is unlikely to have false negatives (while making it easy to ignore false positives).
They don't care about that. They don't care if it's within 10px of the right location, or whether it's one DOM element or multiple, or how it's structured. They care that it looks right, and that certain things look identical between builds. The size and position of the BBC logo, for example, must be exactly the same.
Also, this means it can be verified by a non-technical person. All you need to do is verify that the differences are either 1) wanted or 2) irrelevant, and this is an extremely simple and robust way of achieving that.
Automated layout analysis is pretty tricky. The data reported from the DOM is not always reliable, so it's helpful to couple it with computer vision data. I spoke about an integrated approach at Selenium Conf 2013: http://www.youtube.com/watch?v=DF0QUD0kuiQ