Web automation: Don't use Selenium, use Playwright (pythonforengineers.com)
408 points by rekahrv on Nov 9, 2022 | 171 comments



OP struggles with XPaths in complex DOMs. This is a solved problem. Use Helium [1] to write test code that acts on what the user sees:

  start_chrome('github.com/login')
  write('user', into='Username')
  write('pw', into='Password')
  click('Sign in')
I am the author of Helium. It's a wrapper around Selenium. It's fully open source. Under the hood, it uses Selenium 3, which is old. But boy does it work beautifully.

1: https://github.com/mherrmann/selenium-python-helium


Super cool, but to be fair to the author, clicking does seem easier still.


Until the web site changes and you have to re-record.


Playwright's selectors are very resilient.

Yes, your copy might change, but that is also a change in spec (whereas a change in DOM is not always a change in spec).
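
For a concrete sense of that, here's a minimal Playwright Python sketch (the URL and button text are hypothetical) that targets an element by role and accessible name instead of DOM structure:

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com/login")  # hypothetical URL
        # Survives markup refactors as long as the accessible name stays "Sign in"
        page.get_by_role("button", name="Sign in").click()
        browser.close()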


Yes! Recording XPaths works fine in many cases, but once something on the website changes you are left with this auto-generated XPath that you do not know how to change. Maintaining such a test is a nightmare and a waste of brain-cycles.


I hate XPaths with a passion. They are powerful but a waste of time. The easiest solution is to use visual automation tools like UI Vision (web browser) or Sikuli (desktop). There you write something like "click: login" - and the word login is found not by parsing the DOM, but by using OCR.

Obviously such an automation runs much slower than any Selenium-based automation, but it is sooo much easier to create and maintain, especially on complex websites with tons of Javascript and frameworks. I believe this is the future of casual (test) automation.


> OP struggles with XPaths in complex DOMs.

A lot of the Selenium code I have come across has people trying to get the exact element they want on the first try, repeating the process several times, instead of doing something like searching for a div, form, or some parent element that has all their elements; then all they need is to search relative XPaths from that element.


Does it solve the issues with the virtual DOM? It's a pain to manage.

The readme mentions iframes but not that.


I think a better approach is to take UI tests into account during development and use data attributes, like data-cy="input-username" or data-input-name="username".

Also, the issue with Selenium isn't with finding elements on a page.


Wow this is cool


Do you know if there's a way to integrate Helium with Splinter?


I don't know Splinter, sorry.


You're going to get a lot of responses that say record-replay testing doesn't scale, is unmaintainable, etc. They're totally right.

But the thing is, early on when your app is in flux, neither does writing Selenium code. There's a pretty big truism in UI automation that writing UI tests before a UI freeze is a recipe for a shit-ton of pain. Coding to IDs or XPaths only gets you yay far if the UI flow itself fundamentally changes.

But re-recording might be easy.

Don't use stuff like this for long-standing tests. Unless you architect your app just right so that IDs are always stable, and you never change the interface, it'll break and break in ways that you can only fix by a complete re-record. Plus the tests tend to be extremely timing-fragile because recording uses delays instead of synchronization points, so they just won't work in CI.

But do use stuff like this at your desk during bring-up, when the cost of a re-record is lower than the cost of a test rewrite and it's ok to re-run the thing if the first try craps out due to a timing glitch.

And from there, keep an open mind.

I went to a GTAC (Google testing conference) where a presentation made a very good argument--with numbers and everything--that for smaller projects with simple and more or less static UIs, and where the tests were all fundamentally scripted, there was almost no advantage to coding the tests later. Record-replay was the best way to go.

But I definitely don't think a system like Playwright fully replaces remote stubs like WebDriver and coding tests in the languages that can talk to it.

At some point you hit the issue that the login screen changed and now every single test is invalid. It's awfully nice if you used Page Object Model and you have a single-point-of-truth to fix it.

More to the point, test automation can do more than just execute a script repeatably. Randomized monkey testing is something you can really only do in code, ditto scenarios that need to talk to testing interfaces exposed from the app itself.

Glad you found a tool that resonates with you!


> I went to a GTAC (Google testing conference) where a presentation made a very good argument--with numbers and everything--that for smaller projects with simple and more or less static UIs, and where the tests were all fundamentally scripted, there was almost no advantage to coding the tests later. Record-replay was the best way to go.

I agree with you. Most people here discuss recording vs not recording. I think most people really lack a basic understanding of testing. A tool is still a tool.

From my point of view, most websites (aka apps) can be treated like that, and maybe should. Usually there needs to be a testing strategy in place. Why and what are you testing? What do you want to confirm, what kind of bugs are you looking for, what is your testing strategy? How much time and effort go into testing?

We found out (we're in finance) that one essential ingredient was missing from our tests: the happy path. Without a single test for the end-to-end happy path, testing for anything else becomes useless.

And the happy path can and maybe should in most cases be tested using a recorder, because it comes closer to what a real person does.


We're of a mind, for sure.

I was/am a pretty big fan of the Context-Driven Testing (https://context-driven-testing.com/) concept from James Bach and Cem Kaner, though I think Kaner later backed away from either it or Bach, not sure which. It was basically the testing version of the Agile Manifesto.

The idea in general of "THINK about what you're doing and the specific results you need and do what's OPTIMAL, not what's DOGMATIC" has guided my career for many years, both inside and outside of testing.


> It's awfully nice if you used Page Object Model and you have a single-point-of-truth to fix it.

I like fewer sources of truth, but I dislike trying to force everything into a "page" analogy. Try maintaining a library of commonly-used actions instead, like:

  logIn()
  registerNewUser()
  addItemToCart()
And maybe group them thematically.

The underlying intent is the same... reduce duplication, improve organization. But you won't be stuck wondering "what page am I on now?" in a single page application, or rifling through folder upon folder of nonsensical "page" files.

And even in the worst case, Ctrl-Shift-H still works.


A long time ago I wrote an enterprise automation platform that abstracted things into a business-minded DSL like your example above, wired together with XML. It allowed a tools team to define new DSL, and tests to be composed more simply by intent. A workflow would look something like:

    createUser
    logIn
    addCreditCard
    addItemToCart
    checkoutItem
Was pretty powerful, but ended up being used more as a CD platform for testing and less as automated regression testing.


You can write fixtures for those in Playwright.


I could get into an even longer discussion on this, but basically when I brought up UI automation teams my advice was:

Use granular element-oriented functions (i.e. loginButton.click() or fillForm(name, pw) type stuff) for the very small part of the very few tests that were specifically exercising that portion of the UI.

Those you probably define traditionally for POM, as methods of the page (or functions in the page module depending on the language).

Use result-oriented functions (logIn(), registerNewUser()) whenever it's "travel" i.e. things you do to get to the start of your scenario, or a setup/cleanup task.

Those you do not keep with your pages, they live in modules organized by task or result. Plus they have to work from anywhere. They can leave you somewhere, if that's their defined result, but they should be callable from any UI state. By the same token, tests shouldn't assume how they used the UI to get there, again, unless that was defined as their result.

In other words, they're functions: black boxes. The biggest point there was "you can't change this function without preserving that contract, and you can't assume anything but that contract."

The advantage is that if you could wave a magic wand for setup, travel, or cleanup and get the result, the test would still work and be valid. IOW, you can select the most robust and direct way to accomplish those things, even going completely around the UI with cookie injection or whatever.
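
As a hedged sketch of that idea in Playwright Python: logIn() as a black box whose only contract is "an authenticated session exists", implemented here by injecting a session cookie rather than driving the login UI (the cookie name, domain, and token source are all assumptions):

    from playwright.sync_api import BrowserContext

    def log_in(context: BrowserContext, session_token: str) -> None:
        """Result: an authenticated session. Callers may not assume HOW it happened."""
        # Go around the UI entirely; the contract is only the end state.
        context.add_cookies([{
            "name": "session_id",      # assumed cookie name
            "value": session_token,    # e.g. minted via a test-only backend API
            "domain": "example.com",   # assumed app domain
            "path": "/",
        }])

Swap the body for a UI-driven login later and every caller still works; that's the magic-wand property.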

What most other teams I've had visibility into tend to do differently than I advised is they'll use the granular element-oriented POM functions everywhere except fixture setup/cleanup. They don't have the "travel" concept of things you have to do on the way to your scenario start, and they include that in the test scenario itself with granular element calls.

And travel is really all setup. But for some reason, when it's "set up yourself via login, option selection, loading a file, etc.", people's thought process goes out the window and they think that all needs to be strung together in the UI like a user would do it. But intelligently separating out the very small bit of "specific UI manipulation that causes a state change + verification of that change" that is the test from everything else in the scenario that is setup/travel/cleanup gives you much more maintainable tests.

Or even when they do separate them out, they're not really "result-oriented" functions. Instead they're "flow-oriented" macros that you couldn't replace with a magic wand, because the meat of the test assumes intermediate UI flows they performed rather than just end state, and they're written to be strung together in some coupled (and usually undocumented) way.

Then you have the systems that try to use the same functions for setup/cleanup and testing, caught in between the need for granularity and robustness. Those tend to get extra "doItThisWay" flags on their functions and stuff really goes to hell.

Gotta keep 'em separated!

TL;DR I agree with you, and even a few steps further.


That sounds similar to the Screenplay pattern [1]

This has the concept of actors, abilities, interactions, questions, and tasks. This allows good separation of concerns as well as much more user-focused tests.

[1] https://serenity-js.org/handbook/design/screenplay-pattern.h...


Oh, wow! I had been trying to develop a pattern like this a few years ago, ~2016 (at the time called Action-Flow), but ended up shifting out of test as a primary focus before I could put polish on it and make it cohesive enough to publish something.

I hadn't realized there was prior art to look at or potentially clone by mistake. I wonder how old this pattern is.

Thanks for showing me!


I watched a few-person test team blow itself up (people leaving, going toxic, doing minimum work or other things, ...) because they wanted to get it 'right from the start'. Many, many months in, it was barely functional.


I've been on at least one team effort like that (not as the architect, in my defense--in fact, it was that failure and my somewhat frustrated ask to my manager to give me a couple of weeks head down to start it properly that launched the automation lead portion of my career).

I bet a team like that would be totally transformed by getting some early success under their belts, and giving valuable feedback early enough to actually be included in the whole process.

That's exactly what a QA team can use these tools for, especially early on--it can accelerate manual testing if nothing else, by people record-replaying at their desk for repeatability. Even if that won't work for verification, sometimes just recording the script that does all the setup you have to do for every test gets you really far--then you just take over manually from there.

In general, I think QA's sometimes-bad rap comes from putting in maximal effort and cost, in trade for a very limited and usually unquantifiable increase in actual quality confidence, and that usually comes down to dogma. Most of the "traditional" ways of doing QA come down to writing maximum docs without any single point of truth (and therefore become maintenance hogs and/or wrong) then spending a million bucks or more a year on a team whose whole job is to tell you 99 times out of 100 that things look the same as yesterday.

It makes the sector a thankless grind from the inside, and makes it an expensive spreadsheet generator that sometimes blocks your releases for unclear reasons from the outside. Fun times.


Record-replay doesn't even work when running the same scenario 2 minutes later half the time - never mind in 2 months' time.

I tend to find in most apps that unless you directly change the front-end templates to give the elements you interact with better identifiers, and then deliberately use those identifiers, your browser tests will always end up an unreliable, sucky mess.


Agreed, giving it at least stable identifiers to play with makes a huge difference. With some front-end generators, that may not be trivial, though I'd think Selenium being a thing (POM does rely heavily on stable unique identifiers, since "nth element past m" XPath queries are way more fragile) has probably made it more of a standard ask.

Dynamically-created UIs are usually the hardest thing to deal with--if they get a different ID every time, record-replay is out the door, and even Selenium is a lot harder.

But personally, with a lot of automation architecture experience, I think you're exaggerating a little re: doesn't work 2 minutes later half the time. But it also depends on whether your app is driving some invisible external resource that makes the timing highly variable, etc. Even then you can usually "harden" the recording by putting in worst-case delays. It's just that then your test takes so long to run you can't CI it either.

It's really situational. What I'm arguing against more than anything is the knee-jerk reaction that it's never appropriate. It's just never enough usually. The fact that we tend to skip over the option entirely in testing is probably a blind spot and a mistake. Devs dabbling in testing as a side task definitely shouldn't ignore the option.


I don't write tests based on xpath or ids - it is too fragile. I just stick a special class on elements I want to target, like class="test-save-user-button". This allows changing the markup however you want without breaking tests. Just leave the test-* classes alone and you're fine.


I was probably being a little too literal when I said xpath/id. Something unique like a custom class would be ideal, agreed. I just see that as a different kind of id that I find through an xpath query, which is what I really meant.

To your point, though, the advantage of the class is that other things in your stack probably don't have opinions about what class you tack on, whereas they might want to define your (actual) id for you. IIRC this was how I asked devs to get around dynamic-UI runtime issues where the ids were also auto-generated.


We add data-test-id, data-test-name, or data-test-value attributes (depending on the case) to our elements, and selectors are rock-solid.


Using data-test-name or data-test="name" for e2e automation is the right answer. Data attributes are not going to conflict with your classes or IDs, won't get mangled, and they are tolerant to DOM, style, and backend/form refactoring. Essentially ignored by everything but your test framework.
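
For illustration, Playwright Python has first-class support for this: its test-id locator defaults to data-testid but can be pointed at your own attribute (the attribute name and ids below are examples, not anything from this thread):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Remap the built-in test-id locator to our attribute convention
        p.selectors.set_test_id_attribute("data-test-id")
        page = p.chromium.launch().new_page()
        page.goto("https://example.com")  # hypothetical URL
        page.get_by_test_id("input-username").fill("user")
        page.get_by_test_id("save-user-button").click()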


I disagree completely.

E2E tests should mimic how users interact with the page and they see no data attributes.

You can query by text, label, html semantics, position.

Hell even for clickable icons you should have alt text.
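
In Playwright Python terms, that philosophy looks something like this (a sketch; the labels and names are illustrative):

    from playwright.sync_api import Page

    def sign_in(page: Page) -> None:
        # Locate elements the way a screen reader would, not via the DOM tree
        page.get_by_label("Username").fill("user")
        page.get_by_label("Password").fill("pw")
        page.get_by_role("button", name="Sign in").click()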


Minor copy changes shouldn't break your tests. If you're selecting elements by their content the tests are much more brittle.

Nor should unimportant stylistic changes in class, or changes in element order/positioning, break our functional tests. Thus we give important elements data-attributes which are resilient to all forms of refactoring.


Let's agree to disagree I guess.

HTML has specific semantics, and I find that interacting with applications the same way a blind person would (aria labels, text, etc) leads to much more solid tests.

Copy changes are far from common on stuff like labels and actionable items: you aren't changing "submit" to "send" every other week, or "pay" to "checkout" (moreover, such a button would have a meaningful aria role like "search" or "register").

And if you do, fixing the tests is generally very cheap and quick, so I see it as a non-issue.

Nothing will change my view that abusing data attributes (which are sometimes, but rarely, a necessary evil) leads to the great amount of non-accessible, semantically incorrect, bloated HTML we see everywhere on the net.

Good tests will lead to better websites, and data-attributes do nothing to help in that direction.


Humans are faaar too smart to even try mimicking their behavior. All you can hope for is to make sure that the intended usage works fine.


I agree with all you have written, however I feel it's important to reply to:

> for smaller projects with simple and more or less static UIs, and where the tests were all fundamentally scripted, there was almost no advantage to coding the tests later.

How many times in my career have I been asked to meaningfully design tests for basic static sites? Hardly ever.

Complex, multi-tenanted, many-user hydras with dynamic client-side-rendered content? All the time.

The scenario in which that Google presentation suggests it's most effective is the rarest ever, in my experience, especially if the site is complex enough to have someone working on the team with browser automation coding experience/ability in the first place.


It's been a lot of years, but some of the handwaving there was editorial on my side. I can't remember exactly which projects they propped up as their examples. That said, I wouldn't recommend record/replay for Gmail, for example, so I do believe they meant smaller and simpler ones.

I also think (but am hesitant to assign to them because obviously I built this up in my head over the last ten years too) that part of the argument was that this enabled testing earlier and for smaller projects than people normally bothered testing--for example, internal-facing tooling. Whether or not they said that, I think that is one of the big advantages.

Also, left unsaid, this was for situations where a very small number of critical path UI tests are sufficient for the moment and you're not re-recording a huge suite if something breaks. Of course, if you're familiar with the testing pyramid, you probably know that only having a small handful of critical-path tests running E2E via UI is your ideal, period. Most organizations who do it at all heavily over-test via UI automation.

Your point is well-taken, though. By the time your app is that complex, I'd say you're probably in the position of creating those long-standing tests I wouldn't recommend recording.


> Selenium just couldn't parse the over-engineered Javascript framework we had, and I had to pile on hacks on hacks.

Excuse me, but isn't Selenium actually a browser robot and not a Javascript parser? I assume the issue is just that the DOM mutates so much that they sometimes could and sometimes couldn't find an element, or always had to wait for it, which brings unreliability and such.

I haven't used alternatives, but using Selenium feels like a solid "engine" to build abstractions on.


That's odd - Selenium has been my go-to when I need to scrape and Scrapy won't suffice (I'm not doing anything too huge in scale) and it handles web apps well. I do have a lot of waiting for elements to be visible, admittedly.


Testing, scraping, and automating are not really the same tasks.


I'm a heavy user of Playwright and I find it by far the best tool for all three.


I’ve tried writing e2e tests for web apps using browser drivers on-and-off for the last decade, and it feels like with playwright that the tech has finally matured enough to make the tests reliable enough to be worth the effort of maintaining them.


What does Playwright do that Selenium can't do?


It's just nicer to work with.

I'm pretty sure they have equivalent (or very close to that) functionality, in that anything you can do in Playwright can be expressed in Selenium and vice-versa.

But Playwright, from my experience, is much easier to use. The documentation is more comprehensive, and the API design is much more aligned with the kind of things I want to get done.

This is because Playwright has benefited from decades of experience in the space which started with Selenium. The engineers behind Playwright previously built Puppeteer, which was itself an evolution of ideas from Selenium.


Playwright can run tests in multiple windows as well as tests that cross multiple domain origins. My team builds a collaborative app where we wanted to cover scenarios where two different users collaborated on the same page. This was easy to implement in Playwright, in Cypress it is not possible. We can open two browser windows with Playwright and simulate two or more users simultaneously editing a page.

Also Playwright is a general browser automation tool and can be used for more things than just testing. For example, scraping website data or generating preview thumbnails for URLs.
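
A sketch of that two-user pattern in Playwright Python (the app URL and selectors are made up): each browser context is an isolated session, so two contexts behave as two users.

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        alice = browser.new_context().new_page()  # isolated session #1
        bob = browser.new_context().new_page()    # isolated session #2
        alice.goto("https://example.com/doc/42")  # hypothetical shared document
        bob.goto("https://example.com/doc/42")
        alice.get_by_role("textbox").fill("hello from alice")
        # Assert that bob sees alice's edit propagate
        bob.get_by_text("hello from alice").wait_for()
        browser.close()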


Selenium can definitely run tests in multiple windows.

You can open multiple tabs within the same browser window, or you can create multiple WebDriver instances for multiple windows.


Fair point. Cypress cannot. The initiator of this comment thread asked what Selenium can't do that Playwright can, but my brain misread this as "Cypress". Sorry about that.


Are there any good tutorials out there for using Playwright for scraping? Never tried it but I have a need right now.


The main difference is that Playwright uses custom headless builds of "browser projects" (like chromium) rather than attempting to automate a normal consumer build of the browser via the webdriver api. The webdriver api is kind of slow, browsers don't always implement it well, and it doesn't have a good mechanism for the browser to push events back to the test application, which ends up requiring Selenium to do a lot of polling for changes, so it can miss events that happen quickly.


> The main difference is that Playwright uses custom headless builds of "browser projects" (like chromium) rather than attempting to automate a normal consumer build of the browser via the webdriver api.

I think you're wrong there. See [1].

It doesn't use custom browser builds, but specific versions of official builds of mainstream unbranded browsers (you can also use branded ones if you wish). There are no "custom headless builds". All browsers support headless mode, and you can use it on your main browser by passing a CLI flag.

The reason it officially supports only specific versions is because there may be slight incompatibilities with the Chrome DevTools Protocol in each browser version, so they suggest running supported versions only. But like a sibling comment mentioned, you can also connect it to any existing browser instance, and, assuming there are no CDP incompatibilities, it will likely work.

[1]: https://playwright.dev/docs/browsers


Events are the bane of automation existence.

Back in Windows-heavyweight app land, it made way more sense to interpose a low overhead shim to intercept and copy events.

I can't imagine the pain you'd encounter trying to paper over the other way around.


> Events are the bane of automation existence.

They do make some tasks tricky to handle, like tracking state, or avoiding race conditions, but in reality events are a good fit for interacting with browser DOM events, representing browser activity, and being able to react to it.

Consider the complexity of an average modern web site, that loads dozens of scripts, uses a frontend framework, is highly dynamic and interactive (likely a SPA), contains multiple frames, etc. All of this happens concurrently, and the only abstraction that makes sense to represent it is an event-based system. You can't achieve the same level of responsiveness using a traditional request/response protocol.

This is why modern browser automation protocols are event-based, and built on top of WebSocket. Selenium is pushing for WebDriver BiDi, while the browser standard is the Chrome DevTools Protocol.


That wasn't an argument against events! It was an argument about the futility of not being able to sit in the middle(-ing) of them, for automation.


Ah, apologies, I misread.

Fully agree then. The only sane way to handle browser automation is by exposing all of them.


Does it have a mode where it can drive a real browser? Most real-world UI test matrixes for web apps want the tests run on their real targets for any final acceptance. "Fake" headless browsers tend to be more useful for first-tier CI and just keeping tests warm.

If it can only drive its internal browser engine, that'd possibly be a big argument against it for any test strategy I was architecting. I'd want the tests to be portable and usable cross-browser for acceptance--particularly UI-driven E2E tests, since that's really the only time they are critical.


Playwright does use "real" web browsers (Chromium, Firefox, WebKit). It just supports specific versions, as otherwise it would be difficult to maintain. It runs them in headless mode, which most browsers support, and should behave exactly as a headed version would. You can start a test in headed mode if you want to see the browser window.

The era of custom headless browsers like PhantomJS, or Selenium/WebDriver for that matter, is dead, IMO. The Chrome DevTools Protocol is the modern standard API to interact with a browser programmatically.


Oh, OK! I misunderstood the comment entirely then. Thanks for correcting me there, especially since I was apparently too lazy to look for myself.

This does sound more promising then, if they've found new ways to add maintainability or QoL aspects to record/replay. As I said elsewhere, I think r/r gets a bad rap, and those sorts of improvements would increase the number of situations where it makes sense (at least for testing in incubation).


I think the mistake was in the comment you replied to, that suggested Playwright used custom browser builds.

I do recommend giving it a try. The web browser automation landscape is much more reliable these days for writing E2E tests than back when Selenium was state of the art.


Yep, it’s just a matter of setting headless=false and it can launch a normal firefox/safari/chromium exe. You can even have it attach to an existing instance of your personal browser (via the debugging protocol).
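
For reference, attaching to an already-running browser looks something like this in Playwright Python (a sketch, assuming Chromium was started with --remote-debugging-port=9222):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Attach over the Chrome DevTools Protocol instead of launching a browser
        browser = p.chromium.connect_over_cdp("http://localhost:9222")
        page = browser.contexts[0].pages[0]  # grab an existing tab
        print(page.title())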


Funny this popped up...I've been looking at replacing our slow and flaky Selenium test suite recently.

From what I can tell here are the benefits:

* With Playwright you don't need to put manual waits in your program. It's async. (See the sketch after this list.)

* You also don't need to keep updating the Selenium WebDriver to stay in sync with your browser version. This is a 15-20 minute time sink each time it happens.

* Can run in headless mode, which will run the test suite much faster than Selenium.

* Also appears to have better command line options to be run from a CI/CD environment
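
A minimal sketch of the no-manual-waits point (URL and selectors are hypothetical): locator actions and web-first assertions retry until the element is ready, so there are no explicit sleeps.

    from playwright.sync_api import sync_playwright, expect

    with sync_playwright() as p:
        page = p.chromium.launch().new_page()
        page.goto("https://example.com")  # hypothetical URL
        # click() auto-waits for the element to be attached, visible, and stable
        page.get_by_role("button", name="Load data").click()
        # expect() retries the assertion until it passes or times out
        expect(page.get_by_test_id("results")).to_be_visible()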


Selenium also supports headless mode though, or is Playwright's different?


Selenium supports things like waiting till an element is clickable?

The webdriver issue is definitely annoying though. Not sure if there's a better way, but we're using a binary regex on the electron app in question to find a matching version and download it on the fly as part of test setup.


> With Playwright you don't need to put manual waits in your program. It's async.

That is just choosing between a blocking method or... a blocking method which suspends and resumes a thread.

> headless mode

That is really up to the browser driver to implement, and you can do that perfectly fine in Selenium.


Run inside a container with near zero effort for CI pipelines


The author says the killer feature is:

> Playwright "records" your steps and even gives you a running Python script you can just directly use (or extract parts from). I'm serious about the running part- I can literally record a few steps and I will have a script I can then run with zero changes.

Does Selenium have that?
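
For context, the recorder ships as the "playwright codegen <url>" command, and the Python it emits is roughly this shape (a hypothetical recording, not one from the article):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto("https://example.com/login")
        page.get_by_label("Username").fill("user")
        page.get_by_label("Password").fill("pw")
        page.get_by_role("button", name="Sign in").click()
        browser.close()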


Selenium IDE does, although it is much easier to do in Playwright IMO. I think there is honestly no contest now between Selenium and Playwright for most needs; I wouldn't ever go back to Selenium. If you are starting a brand new project today in Selenium, I honestly think that is foolish given the better alternatives now out there. Selenium was a great project, but it is strangled by its own success - very hard to meaningfully redesign its API now that so much of it is in legacy projects that will want to update their dependencies one day.

However, regardless of the tech used, recorded playback of HTML manipulation is pretty brittle - if you think adopting Playwright because it has a record feature will save any kind of time/effort/money, you are in for a surprise. This is a feature everyone tries once and goes "wow, neat!", and then quickly realizes none of the suggested code is production ready, or even likely to work on a 2nd run for anything other than the most basic of static sites.


Is stable and reliable

Selenium sucks


Auto-waiting.


OP suggests that having a large company behind a software tool (Microsoft in full control of Playwright) is a positive thing and is an advantage over Selenium.

I tend to see that negatively. Community-driven projects (or ones with a non-profit organisation behind them) like Python, Postgres, and Firefox are generally more trustworthy and likely to last longer.


You won't like learning where Guido's been working for many years now...


But Python itself is controlled by an independent foundation[1] and an elected steering council[2], which is very different from being owned by Microsoft.

For example, if Microsoft wanted to change CPython to automatically collect sloppily anonymized user data and send it to Microsoft without asking, as they did with .NET[3], I would expect the steering council to reject that proposal.

[1] https://www.python.org/psf/

[2] https://peps.python.org/pep-0013/

[3] https://github.com/dotnet/sdk/issues/6145


He also stepped back after the whole walrus kerfuffle.


I find it astonishing that you think the current employer of the BDFL isn't public knowledge! But even more astonishing that you're alluding that this can have an impact on the PSF. GvR working for Microsoft isn't the same as Microsoft's acquisition of GitHub.


Well, CPython performance only started to matter after Guido got hired by Microsoft, the wonders of big corps.


Having big corporations contribute to projects is totally great! The incentives are just fucky when they control them.

My favorite example: VSCode is a great piece of software, but almost half of it is actually only available in a proprietary build from Microsoft's website that contains telemetry.


> but almost half of it is actually only available in a proprietary build from Microsoft

Which half?

VSCodium works great, so I'd go as far as to say 99% of VSCode is actually open source (the name and icons being exceptions). You cannot say proprietary extensions are half of VSCode.


I find it so ironic that most people who complain about telemetry are also big Google Docs users, or users of Web apps in general.

Not necessarily your case, but surely true for most complainers.


GitHub is also owned by Microsoft.

Playwright is open source - the worst thing that will happen if Microsoft kills it is that you'll be able to use the latest release until you find some replacement.


The bad scenario isn't Microsoft killing it, it's them slowly forcing everyone using it into Azure, Windows and Visual Studio, which is their usual, I think partially openly admitted, open source strategy nowadays.


That doesn't make any sense.


> GitHub is also owned by Microsoft.

GitHub was bought by Microsoft in 2018.

Recently Microsoft announced they were killing off GitHub Atom.

Being able to use the latest release of a dead project is no big reassurance.

This might be an anecdotal case, but it is indeed real world proof that a project being backed by a company the likes of Microsoft does increase the odds of being killed off.


Well, it does not make sense to have two open source, JS-based text editors/IDEs that are built on Electron.


Sure. [hypothetically] After Microsoft makes new "Azure UI Test", it will surely kill Playwright in favor of the new thing. It wouldn't make sense to maintain two browser automation frameworks. The second one being bound to their cloud offering is just a ... coincidence?


And of the two, better to keep the one that is under Erich Gamma's stewardship; he just happens to know a couple of things about IDEs.


Atom was bad since day one.


And Google Reader was awesome till its last day.


That was a service rather than a code product, though I do wish Google had released its source code for us self-hosters to use rather than just kill it.


And someone will likely fork it.


> So much so, I started looking at Javascript testing frameworks like Chai, Mocha, Cypress etc. The problem I found is that they require a completely different setup, and aren't easy to get started for someone from a Python background.

A shame to see Cypress lumped in and dismissed like that. It really is a fantastic way to test.

> The killer feature of Playwright is: You can automatically generate tests by opening a web browser and manually running through the steps you want. It saves the hassle I faced with Selenium, where you were opening developer tools and finding the Xpath or similar. Yuck

This is absolutely not the primary hassle with testing. Recording steps can help kick start your testing, sure. But very quickly you start to realize that it's only saving you from the easiest work at best, and is creating flaky tests at worst.


> Cypress lumped in and dismissed like that

I think Cypress is really cool when you're on their happy path - testing on Chromium browsers and in JavaScript.

I tried it some time ago, so this might be outdated, but they had little support for browsers other than Chromium (or rather, Electron-based ones), and it was also a little difficult to wire up with Gherkin scenarios.

Keep in mind that the OP refers to Python tools; AFAIK Cypress does not offer Python bindings, while both Selenium and Playwright do.


We have used Cypress for 3 years to test our oss project: https://apisix.apache.org/blog/2021/02/08/stable-product-del...


We added experimental support for WebKit [0]. It uses PW WebKit under the hood (as mentioned in the blog post). Disclosure: I work on Cypress (but I did not work on the WebKit feature). Firefox works great with Cypress.

Cypress definitely has some limitations, like any other tool, such as only supporting JS-based test. Playwright is definitely a great tool, too. I like them both for different reasons :)

[0] https://www.cypress.io/blog/2022/09/13/cypress-10-8-experime...


Cypress has good support for Firefox now and they are working on WebKit (in fact, I think they’re using playwright-webkit). Not as mature as Playwright, but it’s getting there.


99% sure a record tool is available in Selenium as well.


"Selenium IDE" is the recording tool: https://www.selenium.dev/selenium-ide/


Yeah, that was their main product for QA. I'm pretty sure most outsourced companies just use that and add waits to make things green more often lol


Indeed. You need: loggers, data generators, integrations with state probe points, data/result aggregators, etc, etc.

For massive numbers of tests, now it's time to look into parallelism. Really hope you're not sharing your test data between test cases! Or holding open a single db session to your write instance!

How about multimedia artifacts management?

Point being, test automators do a shit ton of work nobody ever wants to dig into.

Tests are not there to only be written. They must be read.


How easy is it to use one of these to do actual automation (not testing)? Say I wanted to login to an SPA, navigate to a page and download a file?

Asking mainly because there's a tool I have to use that has no API, and if I could script something like that it would make my life a lot easier. Just never tried that type of thing before.


It’s pretty easy unless one or more of the following is true: (1) the site invalidates your session often and uses a strong CAPTCHA on login (you can hook up a CAPTCHA solving service for cheap if your usage frequency is low, if it’s a type supported by the solving services). (2) The site employs advanced automation detection and denies access, in which case you may be screwed. I’ve seen sites defeating puppeteer-extra-plugin-stealth before. (3) the site uses a div soup with no discernible structure and random CSS class names, and the information you want doesn’t have uniquely identifying features, e.g. you want to extract some numbers on the page without labels. In which case you might have to resort to CV in the worst case.


Another annoying one is 2FA logins.


TOTP is easily automated. SMS-only is apparently difficult.
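
For example, with the pyotp library (the base32 secret is whatever you captured when enrolling 2FA on the test account; the one below is the pyotp docs example):

    import pyotp

    # Base32 secret saved when 2FA was enrolled for the test account
    totp = pyotp.TOTP("JBSWY3DPEHPK3PXP")
    code = totp.now()  # current 6-digit code, ready to type into the 2FA form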


> Say I wanted to login to an SPA, navigate to a page and download a file?

Most testing tools make this kind of thing pretty easy. Cypress sits in the page's JavaScript and has access to a Node back end, so when you're not clicking on stuff you can be firing off requests or doing basically anything.

What you might find is that you don't need any fancy stuff though. Maybe look at the download request in Chrome Dev Tools and see if maybe you can just execute a POST command in your language of choice?


Sounds like you can use a macro on the OS level or a plugin like greasemonkey/tampermonkey.


Never tried that. I’m basically trying to setup a cron to login to this site every day and download a CSV.


Any of these would work fine for your idea and not be very difficult to do. (Assuming the site doesn't have any anti-bot measures). You'll need to create a cron that runs the automation script or opens a browser with your greasemonkey script installed.

Sometimes you can even do these with pure curl. POST the login form, get back the necessary cookie or token, then request the download URL, if it's easily predictable.
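
A hedged Python sketch of that curl-style approach using requests (the endpoints and form field names are guesses you'd confirm in Dev Tools):

    import requests

    s = requests.Session()  # keeps cookies across requests
    # Field names copied from the login form as observed in Dev Tools (assumed here)
    s.post("https://example.com/login", data={"username": "user", "password": "pw"})
    r = s.get("https://example.com/reports/daily.csv")  # hypothetical download URL
    with open("daily.csv", "wb") as f:
        f.write(r.content)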


In my worst-case scenario doing something similar, I had to go through the login process via Selenium, then download the file (it actually was a CSV for me too) in Python, by grepping the source code with a regex (XPath and CSS selectors were useless). There are ways to share cookies between Firefox and Python, and you could probably save a step by running Selenium from Python. Then make an HTTP request with the proper cookies, UA, and accept-* headers.
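
The Selenium-to-requests cookie handoff can look like this (a sketch; the URLs are hypothetical):

    import requests
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get("https://example.com/login")
    # ... drive the login flow with Selenium here ...

    s = requests.Session()
    for c in driver.get_cookies():  # copy the authenticated session over
        s.cookies.set(c["name"], c["value"], domain=c.get("domain"))
    csv_bytes = s.get("https://example.com/export.csv").content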


Nordpool energy prices perhaps?


Cypress doesn't support webkit.


As a long time test automation dev, I would never want to be tasked with e2e browser automation again. I did that for a brief period in my career. It didn't feel like real coding. The tests always seemed like they were barely less maintenance than manual testing. I'm curious how big the e2e web tests suites are for best in breed web software. My opinion has been to minimize the amount of e2e browser tests and do app integration tests solely against the API (assuming there is one!), but I've never specialized in web app testing.


Web frameworks are big enough that even if the API is successful, the UI may be wrong. It’s extremely useful to have good enough UI tests that the dev himself can see what is failing.

Seen another way: you, as a human, can't scale for a team of 50. e2e tests can.


> Web frameworks are big enough that even if the API is successful, the UI may be wrong. It’s extremely useful to have good enough UI tests that the dev himself can see what is failing.

This is absolutely true. Who cares if the JSON payload is correct if the page is wrong?


> I would never want to be tasked with e2e browser automation again

This is how I felt before I started using Cypress. You need a tool that helps you rather than being a drag on you, and you need to commit some time to it. It's helped me learn a lot of coding I didn't otherwise know. Writing the test itself is usually not "real coding" but writing helper methods and learning how to organize your code certainly is. The trick to making it all work is making your tests not flaky. They need to pass 99% of the time. Now we have a few hundred long end-to-end tests and maybe 2 of them are flaky at any given time. They are easy to fix when they break, and they uncover important bugs on a regular basis.

Granted, I'm in the business of testing. Anything that isn't automated I have to do manually. And I have to represent a real user, so simple API tests are not going to cut it, and I have an incentive to automate the entire end-to-end process with multiple variations. But if you want to do short API tests you can certainly do that, and Cypress is actually a nice tool for that too.


My journey has been Selenium, then Cypress, then Playwright.

This was the first time I was happy writing ui tests, Playwright is great!

(Though I don't use the recording tool this article focuses on at all; I'd rather write the tests manually.)


> Though I don't use the recording tool at all this article focuses on I rather write the tests manually

That's because automatic code generators rarely work (anything more dynamically generated and it's a no-go).


I manually do it, then replace parameters where needed


I miss the Cypress test runner.


The thing about Cypress that annoyed me the most was that it doesn't just use async/await syntax, so generating more complex tests dynamically is super hard. Just try to write a test that follows links on a site dynamically; I couldn't figure it out.


Cypress requires developers to have a mental model of "calling a Cypress function actually adds an action to the back of a queue and immediately returns control flow without waiting for that action to run." And this is fine if you can express your waits and conditions declaratively - you give Cypress a plan, it decodes the plan from your Cypress code in milliseconds, and then will take as long as it needs to execute that plan.

And Cypress gives a great API surface where almost everything can be done with just this declarative API, and it enables them to show the full plan in a UI even while it's executing.

But if you actually need custom control flow, with code you control... you need to "pause" your action-enqueueing to let Cypress catch up. Which means you're going to be nesting callbacks, or sprinkling awaits in all the places that aren't logically awaiting, because you need an explicit yield point out of your JS block. Our team finds this rare enough that it's acceptable. But... yeah, it's far from a perfect solution.


I'm using Cypress now and finally grokked the queue notion. But Cypress is still flaky: very frequently the first character typed into an input element is lost (Firefox). Switched from Chrome to Firefox because Chrome would run out of memory after too many tests.

Both of these issues are well documented in git issues with hundreds of replies and work arounds that just don’t always work.

I'm ready to throw in the towel and rewrite everything in Playwright. Maybe I'll have different issues there…


I had this issue too, are you using React? The problem I was having was to do with React re-rendering out of step with Cypress (which was going *too* fast).

I have only seen this issue with JS based VDOM frameworks.


Yes, React. How did you solve it?


Thanks! This is a great explanation of exactly what I meant. It really depends if this is an issue for your use case / team.


Cypress has the cy.wait() for precisely letting cypress catch up, or am I missing something here?

I really really like the cy.intercept() for intercepting and modifying requests.


If I write the code “cy.wait(…); console.log(“hi”); cy.whatever()” then the “whatever” will indeed be queued to run after the network request completes. But “hi” will have printed immediately before any of this happens! So if that log was, say, me wanting to condition on something that just loaded, I can’t assume it’s even set up at that point when my JS code is executing!


Like this maybe? https://www.cypress.io/blog/2020/12/10/testing-the-anchor-li...

If that's not it keep looking. If there's a thing you want to do, odds are Gleb has a video or blog post about it.


The thing they do with posts.json, except dynamically from links discovered on a page. I did look; I don't think it's possible.

Anyway, it's subjective. I don't like Cypress.


There's no way to search the current page for all elements of an `<a>` type, then through that list search for the one you want and call click on it?

I figure there must be a way to do that, but it's very possible I'm misunderstanding what you're trying to solve!


Basically what I wanted to write was a recursive broken link checker


Yes, Cypress has many killer features, including:

- The Test Runner lets you retrace your actions with a visual UI and interact with the page as you rewind, with access to Dev Tools the whole time. It also has an excellent tool for targeting elements. (Although you'll eventually lean more on Dev Tools once you're fluent with CSS selectors.)

- There's very little boilerplate for anything. Several useful libraries are packed in and available globally.

- You can easily spy on and interact with fetch and XHR on the page

- You can easily execute Node stuff

- They have a great Dashboard (their only pay feature)


I'm always surprised to see posts like this with no one mentioning TestCafe. It's a super solid e2e testing library! I've been using it for quite a while and it runs reliably and consistently. The API design can be a little messy, but it was (and still is!) a top contender when I was evaluating tools in this space. And open source and contributor friendly. Not sure why I so rarely see it mentioned!


Indeed, I evaluated the e2e landscape years ago and TestCafe was the clear winner. It's a solid solution that integrates well with my TypeScript front and backend.


Playwright is the bee's knees. At my last tech job I did maintenance on about 400 WordPress sites, and I used Playwright to automate a bunch of monotonous clicky tasks normally done via the WordPress backend GUI. It was a lot of fun, and toward the end I got Playwright running on Heroku and made a GUI so that my not-very-technical coworkers could upload a CSV of different sites and then select from a dropdown list of different scrapers, tests, etc.


Selenium, Playwright, and Puppeteer all make use of the Chrome DevTools Protocol (CDP) to access the browser, even Firefox. CDP is the current de facto standard for remotely accessing the browser. It actually starts an HTTP server in the browser that you send requests to. The W3C is working on an actual standard for this space that will likely be more mature and durable.

My learning from playing with CDP and the test automation applications mentioned is that this is way harder (and slower) than test automation should be, and it's a huge pain in the ass for authoring tests. The experience is so bad that it only seems to appeal to professional testers whose jobs depend upon it.

To solve for that I wrote my own test automation solution that does not require remote access to the browser. In my personal application where I use this form of testing I am able to execute about 30 user events per second in the browser. That performance is completely dependent upon other performance conditions of your application, hardware, and transmission handling. The test cases are simple data structures defined as TypeScript interfaces in a Node application communicating with the browser over WebSockets. The events are executed by creating a new Event object in the browser and dispatching it with the target element's dispatchEvent method.

When your automation gets really fast (less than 30 seconds for a major test suite) everybody will make use of it including the developers, product owners, and even the business leaders. It becomes a faster way to setup and access various features of a browser application than doing it manually.


How do you access browser contents programmatically in your custom solution?


Good question. Same-origin policy always applies, so if you aren't going to backdoor the browser as a bypass, you need to be on the same domain. The primary limitation then is that you must own the site you wish to automate, or run a replication of the site from either localhost or a domain you own.

I recommend running the site from an aliased subdomain. This allows ownership of HTTPS certificates with a wildcard to your primary domain, and that subdomain can point to the environment that contains your site database or services. You can also have a subdomain that uses production HTTPS certificates but resolves to a loopback IP, like https://localhost.example.com pointing to 127.0.0.1 and/or an AAAA record pointing to ::1.


> To solve for that I wrote my own test automation solution that does not require remote access to the browser.

Any more details? Are you still working with actual chrome?


I cannot provide too many details without violating the anonymity of this anonymous account. I can say that in my primary test suite I have about 85 tests, each of which comprises a variable number of user events and assertions, totaling about 290 assertions. When I was using HTTP that would take about 45 seconds to execute. When I switched to a home-grown WebSockets solution the test time went down to about 6.5-8.5 seconds. Also, if you want things to be lightning fast, try to shy away from using querySelectors.


Seems suspect that HTTP and querySelector are the bottleneck. I remember Selenium being slow, but Playwright and Puppeteer are pretty fast.


85 tests and nearly 300 assertions in under a minute is still orders of magnitude faster than what I see elsewhere. There isn’t a single magic bullet performance killer. Extreme performance comes from absolutely everything: faster approach to testing, better hardware, super fast application to test, faster transmission, faster reporting and assertions, and everything else.

You can’t know what’s faster unless you are measuring things in isolation and making incremental improvements to a bunch of different bottlenecks. For example people love to tell me how fast their framework, their application, or whatever is but it’s clear that these are almost always anecdotal observations that are not measured or compared to anything.

When looking at performance it does not matter how fast something is, which makes anecdotal observations worthless. The only thing that matters is the difference in speed between things compared, the performance gap.


Nice info, thanks for that.


I just spent half an hour wrestling with Selenium for Python and that API has NOT aged well.

    content = driver.find_element(By.CSS_SELECTOR, "p.content")
In Playwright Python:

    content = page.locator("p.content")
Playwright is just nicer to use. Selenium feels very Java-ish.


Yeah, not only that, but that’s the NEWER Selenium Python API. The older style was actually (imo) easier to grapple with, discover, etc. The whole thing has felt like it’s either written by people that don’t work with Python very often, or that it’s too married to the idea of being a close port of an API targeted at…yeah…Java.


> I found weird issues (when using the recorder) where I couldn't scroll down to the bottom of the screen in Playwright's Chromium, but could in a normal Chrome, forcing me to use other tricks to click the button.

You may consider trying this extension - https://chrome.google.com/webstore/detail/devraven-recorder/.... This extension can record the tests using your Chrome browser instead of launching a separate Playwright chromium browser. Full disclosure: I am the developer of this extension.


For those of you who are as confused as I am about Playwright's growth over Selenium: https://star-history.com/#microsoft/playwright&SeleniumHQ/se...

My company uses Selenium for some of our projects; when I evaluated the browser automation space some time ago, I found that it was the best-in-class solution at that time.

It looks like things have since changed.



I’m here just to say how much I love Playwright :)

Best features for me:

- Auto-waiting - it is a BRILLIANT feature!

- Shared authentication;

- Supported browsers list;

- API testing;

- Tests generator (with recognition of data-test-id attributes);

- Flaky tests detection;

- Flexible config for recording failed tests;

- Ease of installation and updates;

- Great performance.

Thank you, Playwright team!


Playwright is good, but stop thinking auto-generated code is worthwhile.

Eventually you'll have to write your own scripts. I will admit Playwright can spit out code that will at least serve as a starting point.


If you use automation attributes on your components it can work quite well.


I’ve used playwright for a while, especially on my day job.

It’s a fantastic and versatile tool. I couldn’t agree more.


Unfortunately, the .NET library doesn't support ARM processors. Thus, I can't use it on the Raspberry Pi. But PuppeteerSharp works great on the Pi.


Weird question. The trouble with web automation tools such as Selenium and Beautiful Soup is that they require specific instructions to parse HTML. So it's difficult to auto-scrape websites if several websites have different HTML layouts, or any website simply decides to change its HTML. Would it be possible to create a neural network model that could be trained to parse HTML as a human would? Or an AI, for that matter, that could break Google's CAPTCHA? I'm surprised that still works given the advances in image recognition. If both are possible, security on the web may quickly become a thing of the past.


https://www.adept.ai/act

You can combine traditional RPA with modern ML and do very interesting things.


None of these solutions are robust wrt highly dynamic web sites or - worse - in-depth changes to the HTML implemented over time server-side.

What all of these tools actually need is to be bundled with some AI-powered recognition tools - computer vision, semantic understanding, etc ... - that can sort of "understand" the web page served before interacting with it.

All people do right now is a combo of XPath-based search for an element in the DOM, or if that fails, some sort of regex-based pattern matching on the HTML.

These are weak and brittle techniques that are pretty much guaranteed to fail at some point.


I've changed from WordPress to Vue to React but can still use my login e2e test, as I depend on DOM element IDs, e.g. #username_field, #password_field, #submit_btn.


Can someone provide an in depth comparison between Cypress and Playwright? It seems that both of them could be good choice compared to Selenium.

For example, I keep seeing auto-wait being talked about for Playwright, but doesn't Cypress have that too? It seems Playwright has better browser support and a better multi-window/concurrency story, while Cypress has better built-in tools (like a test-runner UI).

Basically looking for recommendation for test-automation for an old complex app mostly written using server-side tech but has potential for react and such moving forward.


Playwright supports webkit by default, even in your linux CI.

Playwright is faster.

It's fully free and open source.

It's written by Microsoft to test their own thousands of services, not to sell you anything, unlike Cypress.

The authors of Playwright are the leads from Puppeteer, Lighthouse, the first Node debugger, and Chrome DevTools; the engineering team simply has no comparison in the browser automation field.


I haven't used Playwright, but I can tell just by reading about it that the code has more boilerplate and doesn't read as nicely. Example of a Cypress test (pardon the freehand typing here):

  describe('the google home page', () => {
    it('allows you to search for banana pictures', () => {
      cy.visit('https://google.com')
      cy.get('[aria-label="Search"]').type('banana{enter}')
      cy.contains('Images').click()
      cy.get('[alt="Image result for banana"]').should('be.visible')
    })
  })
Nothing too weird, and a lot of the library functions you'd need are available globally without any importing. It even has jQuery and Lodash.


Using TLS client certificates automatically is still not possible in Playwright [1]. That's our main reason why we keep using Selenium (with pytest-selenium) and Selenium Grid [2].

[1] https://github.com/microsoft/playwright/issues/1799

[2] https://www.selenium.dev/documentation/grid/


Actually, what is stopping you from generating a Selenium automation script by recording browser actions? I recently wrote a tiny browser extension to do just that: intercept events and translate the actions into a Selenium script. Currently the supported features are few, but I would love to learn what to expect down the path.


It must be a game changer, as Selenium is mentioned but never linked in the article.


I find that Hero [https://ulixee.org/docs/hero] has been a good replacement for my Selenium needs.


Thanks. I've found Chromedp [0] to be a good Selenium replacement when programming in Go. Used in conjunction with headless-shell [1] you can deploy a Go app into a container and do the testing all within the same container with low overhead.

[0] https://github.com/chromedp/chromedp [1] https://github.com/chromedp/docker-headless-shell


What's the best way of monitoring suites? The built-in report tool is amazing, but I would love a solution that makes for easier analysis and maybe closer integration with Jira.


Beyond perhaps a smoke/availability test, I don’t know why you’d perform UI testing. Too brittle, too slow, too much of a pain to maintain compared to the value.


I like UI tests as the tests are so thorough. You test the entire stack, from database to webserver to final page render. Say you test a sign-up form: the next test logs into a mail account to check if the welcome email has arrived, exercising the job scheduler, the third-party email service, and whether the link in the email works as intended. I don't mind a bit of brittleness given the large test surface e2e tests cover.
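
As a sketch, that happy path might look like this in Playwright Python (the test_inbox fixture and all selectors are hypothetical stand-ins for whatever your mail service and app provide):

    from playwright.sync_api import Page, expect

    def test_signup_happy_path(page: Page, test_inbox) -> None:
        page.goto("https://example.com/signup")  # hypothetical URL
        page.get_by_label("Email").fill(test_inbox.address)
        page.get_by_role("button", name="Sign up").click()
        expect(page.get_by_text("Check your email")).to_be_visible()

        # Poll the test mailbox via your email service's API (hypothetical helper)
        message = test_inbox.wait_for_message(subject="Welcome")
        page.goto(message.confirmation_link)
        expect(page.get_by_text("Account confirmed")).to_be_visible()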


How do you perform automated testing in your build pipeline? I am genuinely curious and looking for new ways to do things.


You do it so you can go to sleep knowing you are making money and nothing breaks.

There are no tests as valuable as E2E tests.


What happened to Puppeteer? I barely see it mentioned anywhere now.


> The Puppeteer team essentially moved from Google to Microsoft and became the Playwright team.

https://blog.logrocket.com/playwright-vs-puppeteer/


Never knew why people bothered with Puppeteer; you need to test against other browsers too, not just Chrome.


Yeah, but python? eww :)


Also has nodejs binding


It also has C# bindings, though I think the C# documentation is a bit lacking compared to the Node.js and Python docs.



