Why Recorded/Playback Tests in Selenium Break (automatedbrowsertesting.com)
75 points by dokko1230 on April 18, 2017 | 45 comments



A core issue is the use of the record/playback approach... something that Selenium itself dropped long ago (Selenium IDE is dead, people, DEAD).

In my experience, once you stop trying to record clicks (often on elements that change dynamically, outside of your control) and instead write proper coded tests that locate UI elements the way Selenium 2+ lets you, it's much easier to end up with tests that both don't break and are easy to maintain.
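For illustration, here's a minimal Java sketch of what a coded test looks like once you locate elements explicitly instead of replaying recorded clicks (the URL and locators are hypothetical, just to show the shape):

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class LoginSmokeTest {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        try {
            // Hypothetical page; explicit, id-based locators instead of recorded clicks.
            driver.get("https://example.com/login");
            driver.findElement(By.id("username")).sendKeys("alice");
            driver.findElement(By.id("password")).sendKeys("s3cret");
            driver.findElement(By.cssSelector("button[type='submit']")).click();
        } finally {
            driver.quit();
        }
    }
}
```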


The thing is, "programming" tests might be the most robust solution if you know how to do it, but it can also be a waste of developer time. In the end, you spend your time debugging the tests themselves...

In my experience, record and playback can work well. Visual, screenshot-based solutions like Sikuli or Kantu in particular are actually more stable than most developers think. And the beauty of them is that they can also be used by non-developers, i.e. "manual testers", so the developers can actually work on the main product, not the test suite.


That kind of attitude is why my entire career path exists. QA Automation is something that I'm constantly surprised people don't know exists. Product devs resent writing tests so we are the ones who do that exclusively.

As a counter to your experience I've spent a great amount of time stamping out issues with record-and-playback tests. They're a passable stop-gap if you desperately need testing and, for whatever reason, haven't written any. Sikuli is possibly worse, though it's been a few years since we last evaluated it and quickly decided against using it in any serious capacity.

Using manual testers as discount automation testers is a waste of everyone's time. They're the true experts in actual use of the product. I've seen them find so many issues that, from a code perspective, shouldn't really be possible that it's given me an implicit trust and respect for their abilities. If someone wants automated testing, it's another thing they're going to have to spend money on. Doing it half-assed is more than likely going to do more harm than good.


Sikuli is great. Especially for automating non-web applications. I specifically remember a project I had where we automated an entire test suite using Sikuli on an SAP frontend.

Kinda led me to create https://sourceforge.net/projects/sikuli4net/

Sikuli4net is pretty much a dead project now due to me having absolutely no time to devote to it though. :(


Those are some good resources. I didn't even know they existed.


I agree that Selenium IDE has long been retired. To put it into perspective, active development started in the mid-2000s (2006 or 2007). If you look at the commit history on GitHub, there are only a few commits sprinkled across the months.

But the focus of the academic article (not the blog post) was record/playback testing tools in general, not just Selenium IDE.


Last I knew, the code generated by Microsoft's CodedUI was absolute garbage, as an example.


We had record/playback automated tests for our main product that we converted to Java code and tried to improve... It's a nightmare.


I work for testingbot.com, where a significant part of our customers still use Selenium IDE scripts.

The biggest reason for this is that they don't have the time or knowledge to rewrite their scripts to use WebDriver.

So even though Selenium dropped support for the IDE a long time ago, it's still being used by testers.


I have been dealing with Selenium tests a lot. I've noticed developers are often skeptical about these tests, saying they are not reliable. I do not agree with that; I have proven in several projects that you can have stable and reliable Selenium tests. Just like any other tests, they require a proper test environment and continuous maintenance. The most common mistake I've noticed is using `browser.sleep` instead of `wait.until`.
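To illustrate the difference (a rough Java sketch; `browser.sleep`/`wait.until` above is Protractor-style, but the idea is the same, and the locator here is hypothetical):

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class WaitExample {
    // Fragile: Thread.sleep(5000) pauses a fixed time whether or not the element is ready,
    // so the test is either slow or flaky.
    // Robust: poll until the condition actually holds, up to a timeout.
    static WebElement waitForResults(WebDriver driver) {
        WebDriverWait wait = new WebDriverWait(driver, 10); // timeout in seconds
        return wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("results")));
    }
}
```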


Any other advice you'd give? I need to get testing up to snuff in a shop I recently joined, and part of that involves frontend tests (something I've not done before). I fear making unstable Selenium tests.

Skipping anything you feel is obvious, do you think there are any noteworthy gotchas I should know?

Appreciate your input!


Not the person you asked (kristiangw), but I thought I'd jump in because I was in a similar situation.

1. The biggest bang for the buck we got was the ability to pinpoint which tests to run to exercise which feature. Use as many tools or strategies as you can to make slicing and dicing combinations of tests easy.

2. Think like React and keep everything relevant to a test close together, with minimal separation into layers. Initially, we split data into a data layer (Excel spreadsheets), UI verification and navigation logic into a package, and a very thin test driver wrapper that pulls it all together. It quickly became very hard to determine what to focus on when tests failed. We have now started migrating to Cucumber, where everything is nearby, so test failures are easy to debug.

3. Write as little code as possible -- this is generally good advice for all s/w projects

4. Distill UI actions into their lowest-level components and expose them as primitives. By this I mean expose things like "open URL", "click X", "put text there", "check CSS attr" as primitives. After this, each test is just a recipe of primitives.
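A rough sketch of what point 4 can look like in Java (the method and locator names are made up; the point is that tests only ever call these primitives):

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

public class UiPrimitives {
    private final WebDriver driver;

    public UiPrimitives(WebDriver driver) { this.driver = driver; }

    public void openUrl(String url)                { driver.get(url); }
    public void click(By locator)                  { driver.findElement(locator).click(); }
    public void putText(By locator, String text)   { driver.findElement(locator).sendKeys(text); }
    public String cssAttr(By locator, String name) { return driver.findElement(locator).getCssValue(name); }
}

// A test is then just a recipe of primitives, e.g.:
//   ui.openUrl("https://example.com");
//   ui.putText(By.id("q"), "selenium");
//   ui.click(By.name("search"));
```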


The reasons for failure this article outlines are all pretty good. Weak selectors will get you into trouble pretty quickly.

Also, I would definitely look up Selenium's implicitWait functionality, as that can fix a lot of "timing" issues in front-end tests. FluentWait is pretty awesome as well (in the past, I actually implemented my own FluentWait, independent of the Selenium libraries, because it was so awesome).
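For reference, implicitWait is a single global setting (`driver.manage().timeouts().implicitlyWait(...)`), while a FluentWait is built per condition. A rough sketch using the Duration-based API from more recent Selenium releases (the locator handling is illustrative, not the commenter's actual code):

```java
import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.FluentWait;
import org.openqa.selenium.support.ui.Wait;

public class Waits {
    // Polls every 500 ms for up to 30 s, ignoring NoSuchElementException between polls.
    public static WebElement waitFor(WebDriver driver, By locator) {
        Wait<WebDriver> wait = new FluentWait<>(driver)
                .withTimeout(Duration.ofSeconds(30))
                .pollingEvery(Duration.ofMillis(500))
                .ignoring(NoSuchElementException.class);
        return wait.until(d -> d.findElement(locator));
    }
}
```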

If your app supports multiple browsers, make sure you run your tests on all of them when developing -- I can't count how many times I have gotten a test to run reliably with Firefox, only to find out it is broken on Chrome.

Also, for the love of god, do NOT use record/playback to generate your tests. The code it creates, while a time-saver in the short run, is an absolute nightmare to maintain (at least it used to be; I haven't used it in the past two years).

Credit: 2+ years writing automated tests, and experience building three separate Selenium test frameworks in two different languages.


A few more thoughts from me:

* Don't treat your test code as a second-class citizen. For tests, follow the same clean-code principles you follow for your app code.

* I've seen many dirty workarounds for Selenium tests that didn't behave as expected. Most of the time this happens due to a lack of understanding of the Selenium / test framework API. So read the docs. If something doesn't work, read the docs again ;)

* Don't test everything with Selenium E2E tests. Test only the most crucial scenarios this way. Leave the more detailed stuff to unit tests.


A car that requires "continuous maintenance" is one I would call "unreliable".


Is a driver unreliable because they have to continuously adjust the steering wheel due to changes in the road? Is testing software unreliable because it has to continually adjust due to changes in the functionality of the software it tests?


> Is a driver unreliable because they have to continuously adjust the steering wheel due to changes in the road?

Does making points and then adjusting them into counterpoints mean a discussion is just "maintenance" of some idea one wants to share, or is that just weird wordplay on your part?


> Automation suites are a way to harden the software, so that even minor changes will be significant and may halt development.

I find this to be the biggest issue. Specifically, the "best" time to write tests is also when the code is least likely to be stable!

It's easy to think "if we had just started writing browser tests when we had fewer features, we could have kept up with a testing suite". But that means your browser tests are constantly breaking and people just give up on them. The app eventually gets big enough, and has enough users, that you have to bolt on the tests later to keep bugs from hurting too many customers.

This problem seems to be way less of an issue with regular unit tests. I guess the lack of changing DOM selectors is a big reason why.


Indeed - automated tests introduce more friction because they are hard to maintain and to write.

So really, the best thing to do would be to make it easier to write and update automated tests.

Recording clicks is of course one way, but the tests are probably more brittle for it.

If you have access to the source templates, you could probably write up some kind of rudimentary compile-time type checking and validation of selectors, so you would know when a new feature breaks something.

The best would probably be some seamless mixture of the two - record tests in terms of the templates, with the ability to detect breaking Ng template changes...


Hey, that's exactly what we're trying to make! I think there is a sweet spot for building a better IDE for automation...


We're making a push for more e2e tests where I work. And we very much have a focus on reducing the friction surrounding these tests.

We've settled on aiming to build a tool that keeps an index of selectors used <=> page/page structure. Then if something changes, you only have to update it once. So when you create a selector, it'll save it for that page, you can name it and reuse it.
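The simplest version of that idea, sketched in plain Java (names and locators are hypothetical; the tool described here presumably does much more, such as validating against the source templates):

```java
import org.openqa.selenium.By;

// All selectors for one page live in one place; when the markup changes,
// you update the locator once and every test that names it keeps working.
public final class LoginPageSelectors {
    public static final By USERNAME = By.id("username");
    public static final By PASSWORD = By.id("password");
    public static final By SUBMIT   = By.cssSelector("button[type='submit']");

    private LoginPageSelectors() {}
}
```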

Given that we have the source templates too, I think we'll try to validate that the selectors are valid. We might even be able to add tasks for the front-end guys that alert them if the changes they're checking in will break selectors.

Combine that with an idea for suggesting selectors based off the source, along with e2e script management, and it'd be a pretty slick tool.

It's a pretty ambitious project, though. We do hope to make it general enough to open source. We'll see if we make it (pun intended).


In terms of weak locators, I once wrote a Selenium-IDE plugin that would generate custom expressions. They were XPath-compatible (why throw away all those nice tools?) but the generation scheme strongly preferred using custom-attributes in the markup as the primary "signposts".

The goal was to make things a bit less fragile by allowing Devs to annotate markup, while avoiding retraining the screen-clicking QA or creating an entirely new locator scheme. This was back in 2010, but I think it generated expressions a bit like:

xpath=(((//.[@data-test-label='foo'])[1]//.[@data-test-label='bar'])[1]//.[@data-test-label='baz'])[1]/a

Messy looking, but it meant only a structural change to your foo/bar/baz hierarchy would break the locator, or a particular change within the baz markup. Switching a div for a span, or adding another nested div for some reason? No problem.
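A hedged sketch of how such label-based locators could be built in Java; this is not the plugin's actual code, and it uses `//*` rather than the `//.` form above, but it produces the same nested shape:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class SignpostLocators {
    // byLabelPath(driver, "foo", "bar", "baz") builds:
    // (((//*[@data-test-label='foo'])[1]//*[@data-test-label='bar'])[1]//*[@data-test-label='baz'])[1]
    public static WebElement byLabelPath(WebDriver driver, String... labels) {
        StringBuilder xpath = new StringBuilder();
        for (String label : labels) {
            xpath.insert(0, "(").append("//*[@data-test-label='").append(label).append("'])[1]");
        }
        return driver.findElement(By.xpath(xpath.toString()));
    }
}
```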

Unfortunately we never got to use it in production; the company was bought a couple of years later and the new management wanted to skimp on and dumb down QA even further. I guess they wanted to buy results rather than processes.


Selenium tests are just generally a pain - even when written manually, I've seen them fail more often from structural changes to the code than from actual breakages.

I might be biased, but QA in Production [1] is my preferred approach now that we have great rollout systems [2][3] and ways to do verification testing in production [4].

[1] https://martinfowler.com/articles/qa-in-production.html

[2] https://split.io

[3] https://launchdarkly.com

[4] https://logrocket.com


But a structural change could be a bug.

The iron rule of software is that changes in the product have a reason behind them. If your UI tests break because of a structural change, that break should be anticipated while the corresponding story/feature is being developed.

In the past, when we faced test failures that seemed annoying and trivial, we discovered that the real reason we never anticipated those failures was that we never knew which tests to run for the feature or page in question. We are hoping to fix that by using robust, semantic tags for tests. So, where we used to have tags like @TC_123456, we now use tags like @Login_UI etc.


I wrote a wrapper around Selenium and Cucumber to achieve much better UI test coverage without the overhead of writing and maintaining chunks of code (PageObjects and so on).

https://github.com/saiscode/8tomation

Just a regular .feature file (with minor restrictions) can seamlessly interact with any webpage without needing brand new glue code.

The framework also supports parallel verifications. E.g., if you want to check multiple items on the target page in parallel, you can do that too.

//I'm using this for my clients right now


Personally, I prefer Protractor (a wrapper for Selenium / WebDriver) to Selenium IDE: despite being officially advertised for Angular, it works just fine for non-Angular apps as well, it takes care of installing and updating webdrivers, and the tests are code instead of a WYSIWYG UI.


There's not much sense in using it on non-Angular projects, as all it adds to Webdriver is stuff to work with Angular's digest cycle. Regular Webdriver should work just fine for writing tests in code as well, and you can use webdriver-manager [1] or selenium-standalone [2] to install Selenium and keep the drivers up-to-date.

[1] https://www.npmjs.com/package/webdriver-manager [2] https://www.npmjs.com/package/selenium-standalone


I think the fact that tests break is totally fine. You should expect them to break if your application changes. What you need however is a short feedback loop.

We use unit tests for the things that are straightforward to unit-test (UIs aren't). But for the lots of things that unit tests can't really cover so well, or where we'd have to mock the entire universe anyway, we rely on UI tests.

I'm not affiliated in any way, but I can honestly say that Ghost Inspector[0] made things WAY better for us. Why:

* No more wrestling with Capybara, and no more constantly switching between the browser, the test runner, and the IDE. Most tests don't take longer than 5 minutes to create from scratch.

* We run UI tests on our staging environment, and we deploy all feature branches to staging before they get merged to master. UI tests are automatically run as soon as we deploy staging. And they run in parallel. So they're quite fast.

* We plug the Ghost Inspector results into slack+email, so as soon as something breaks, we know immediately.

Fixing a broken test is usually very straightforward, and our developers do it (we don't have QA). Even if we need to re-record a test, it's usually much faster than rewriting a Capybara/integration test.

[0] https://ghostinspector.com/


> Attribute in this context would be something like `<input name="FullName"/>` that is changed to `<input name="Name"/>`

This first example is weird. That change has a pretty big impact on the way the form works; browsers serialize forms using the name attribute so that change means the form data that's sent when the form is submitted is going to be different to what it was before. You'd get FullName=<value> in the form encoded data from the first input and Name=<value> in the second. I would expect a change like that to break a test.

If you don't care what the name is, then don't use it in your UI tests. For example, use an ID instead. With `<input name="FullName" id="nameinput"/>` you could locate the element using by.id('nameinput'), and then you could change the name to `<input name="Name" id="nameinput"/>` and your tests would carry on working.


Not necessarily. For instance, in many (most?) SPAs, forms are just a... formality. Something to get your Bootstrap-based (or equivalent) CSS to look proper, but the actual form data is pre-processed in JavaScript and sent via an XMLHttpRequest.


Even if you're not using the name in the form data you're still giving it meaning and importance by using it in a locator - you're saying "find the specific element in the page that has the name 'FullName'". If you don't actually care what the name is then you should use something else, like an ID attribute.


Sure, I don't necessarily disagree with the sentiment you're expressing, I'm just saying that today, HTML forms aren't as tied to the backend as they used to be.

Though personally I wouldn't want to have to start adding ids and names to things in the markup just for Selenium to find them.


I'm the founder/CTO at Unravel, where we're working on using machine learning to solve this problem. Unravel lets devs write tests in our declarative DSL, and we generate the browser tests at runtime, using ML for element classification and expert systems to determine sensible actions. You can either give us explicit inputs or use predefined data profiles (i.e., complete a form with dummy details for a U.S. user with a test Mastercard, etc.).

Generating the tests dynamically at runtime solves the weak-locator problem, and it's smart enough to avoid inputting invalid values into input fields. We deal with page loading well, and we're working on being able to automatically deal with alerts (and other modals).

The result is browser tests that are way faster to write (because you don't have to worry about locators) and way more resilient. We're currently pre-release, but we're looking for companies and devs who'd like to be in our upcoming alpha.


Wait is there actually demand for something that...I guess powerful? Applying ML to Selenium selectors seems like some truly massive overkill.

Not to mention dynamically generating tests at runtime. I'm not really sure how that would solve the problem of a weak locator. If you're given explicit inputs and they're weak enough to break, then building a reliable test from them seems a tremendous challenge.

I love that you guys are having a go at this problem. I do this kind of thing regularly. The claim of being faster to write because you don't have to figure out locators doesn't seem to hold water, though.


Well, instead of having to explicitly fill each field in a form, you can just say "Fill the personal details form" or "Fill the credit card form" and we'll work out what each field is and fill it with sensible inputs - either dummy data or data supplied by the test author. Since you aren't hard-coding each locator, and we determine what fields are present and how to deal with them at runtime, unstable ids, classes, & xpaths, and even adding/removing form fields can be handled gracefully (as long as the user journey doesn't change semantically).

We can also handle other common actions like "Login", "Search for ipsum", "Select the first search result" (or "Select the first product/hotel/flight/etc" - that one's really impressive), "Add to cart", "Checkout". When writing the tests, we give you the list of imperative actions that we think your declarative commands map to, and you can correct us if (when) we get it wrong.

We also have some value-adds. Apart from the usual assertions (checking the page loads, looking for certain text or elements, et cetera), you can write assertions checking for the presence and correctness of 200+ advertising and analytics tags, and you can include computed assertions (assertions that are computed from the context of the test). We also have natural-language date handling, so you can write evergreen tests with commands like "Search for a flight next Monday" and we'll resolve that to the actual date to input into the datepicker. We're also working on malware detection (basically checking all loaded resources and outbound links on every page we load for suspicious markers).


Wow, that's really impressive. Good on you all for putting in the work.


Thanks, but literally all the hard work has been done by our dev team - not me.


I work on a (commercial) library that solves specifically the weak-locator problem. It wraps around Selenium to provide a more high-level API, while still giving you full access to Selenium. It's called Helium [1].

[1]: https://heliumhq.com


I tried that; it seemed very impressive, but for whatever reason it refused to fill in my login box. That was the end of that :(


That's weird. Any chance you remember what login box?


You need to set up a DSL for each feature, which is then used to test that feature, so any change to the feature must necessarily include changes to the DSL. That means tests based on "adjust parameter x" don't break just because the parameter x config element had a revamp.


The assumptions here are bad - what is a click on a screen that executes an action? It's called a command.

Implement a proper pattern, and suddenly I'm locating commands instead of selecting DOM elements which may contain my command.
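A hedged sketch of what that could mean in practice (the names are hypothetical; the locator becomes an internal detail of the command, so tests talk in terms of commands, not DOM elements):

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

interface UiCommand {
    void execute(WebDriver driver);
}

class SubmitOrderCommand implements UiCommand {
    @Override
    public void execute(WebDriver driver) {
        // The locator lives inside the command; if the markup changes,
        // only this class changes, not every test that submits an order.
        driver.findElement(By.cssSelector("[data-command='submit-order']")).click();
    }
}

// In a test: new SubmitOrderCommand().execute(driver);
```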


I find automated UI tests to be a massive waste of time in every shop I've worked that used them.

Just smoke test that all the inputs do something and unit test anything else you care about.

UI testing is an antipattern


I'm on the same page. I used page objects and added IDs to everything, with a retry a couple of times before failing (state-of-the-art interface automation?), but the tests were flaky anyway, so I gave up on that. Just too much hassle for little or no return if the interface changes. Let alone testing multiple browsers...


I was facing exactly the same issues with my automated tests so I wrote a wrapper to help me completely avoid writing page objects.

https://github.com/saiscode/8tomation

With this, your basic cucumber .feature file (with minor restrictions) can be executed without needing page objects. Take a look if you have a second...



