In case you're wondering, and the linked page doesn't explain:
> Puppeteer is a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium.
Thank you. This industry has a huge failure mode in touting tech used in projects/frameworks without linking out to at least the source so that someone can learn about them. Can't tell you how many things I've bailed on because it was a pile of obscure library references that weren't (to me) worth looking up.
Indeed, there is a skill when doing technical writing of putting yourself in the position of a reader, particularly one who is competent but who doesn't already know the thing that you are trying to explain, or the context. It's not as common as it should be.
This is also an experimental project. Who here thoroughly documents their experiments? Although to be fair we did announce it at I/O so that calls for more documentation.
I can only assume that anyone who submits "Puppeteer for Firefox" or "x for y" to Hacker News is someone who works daily with Puppeteer / x, has the necessary context, and doesn't notice that it's not aimed at a more general audience (1).
Then it gets upvoted to a more general audience, who ask "Puppeteer? what's that?"
Huh. I worked on exactly this idea as a Mozilla intern in 2017. I wrote a collection of Rust crates (mentioned in [0]) comprising auto-generated types for all Chrome DevTools Protocol messages [1], (de)serialization, and a server lib to handle both the initial HTTP handshake and subsequent command execution over WebSockets. I used that to build out an initial CDP server inside Firefox (and Servo) with support for a couple of sample commands.
At the end of the internship I handed over everything I'd built by that point, with the idea that they (Mozilla) would turn it into a finished project. But I never saw any activity on that front after I left, so I assumed the idea'd been scrapped. Feels very weird seeing this pop up out of the blue on HN today.
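For readers unfamiliar with the protocol flow being described, here's a rough client-side sketch (assumes a browser already started with --remote-debugging-port=9222, Node 18+ for global fetch, and the `ws` package; the target URL is just an example):

```ts
// Rough sketch of driving a page over raw CDP.
import WebSocket from "ws";

async function navigateOverCdp(url: string) {
  // 1. HTTP handshake: list the browser's debuggable targets.
  const targets = await (await fetch("http://localhost:9222/json/list")).json();
  const page = targets.find((t: any) => t.type === "page");

  // 2. Subsequent commands travel over the target's WebSocket.
  const ws = new WebSocket(page.webSocketDebuggerUrl);
  await new Promise((resolve) => ws.once("open", resolve));

  // 3. Each command is a JSON message with an id, a method, and params.
  ws.send(JSON.stringify({ id: 1, method: "Page.navigate", params: { url } }));
  ws.on("message", (data) => console.log("response:", data.toString()));
}

navigateOverCdp("https://example.com");
```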
There’s a lot of debate between the browser vendors and automation tools around whether the Chrome DevTools Protocol is the right abstraction layer to focus browser automation around. It’s a nuanced topic and if you dig into it you’ll see that all the different sides make good arguments. My guess is that this debate stalled your project. I’m putting myself out here on the topic because having a coherent and enjoyable cross-browser integration testing story would be a huge win for the web at large and it’s something worth making people aware of in the hopes that we can find common ground.
Edit: I actually have no right to comment on what stalled your project and have no idea what happened there. I just wanted to discuss the fact that we need a good abstraction layer but seem stuck in debate.
Disclosure: Chrome DevTools docs guy, just speaking personally based on my research into the topic over the last year
No need to apologize, I appreciate your perspective. That was my assumption as well at the time - that the debate had shifted the other way and CDP support fell off of the priority list.
Looks like this integration is being developed primarily by a Google employee, rather than a Mozillian ... so perhaps Mozilla is still not particularly into this idea :)
Tangentially related => I used ffmpeg + puppeteer to build StoryScroll https://neal.rs/app, which turns blog posts into scrolling videos for social media.
I also used it to build a desktop app that my team uses to download Alexa Flash Briefing metrics (because there's no API for that).
Puppeteer's full-screen screenshots & DOM manipulation abilities are clutch!!
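For the curious, a minimal sketch of those two features (the URL and selectors are placeholders):

```ts
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com", { waitUntil: "networkidle0" });

  // Full-page screenshot, not just the visible viewport.
  await page.screenshot({ path: "full.png", fullPage: true });

  // DOM manipulation: run arbitrary JS in the page context.
  await page.evaluate(() => {
    document.querySelectorAll("header, .ad").forEach((el) => el.remove());
  });

  await browser.close();
})();
```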
Just download both and try doing the same test suite. See which one hurts the least. I'm 99% sure Puppeteer will feel like a breath of fresh air to you.
CasperJS was OK, even better with Nightwatch, but Puppeteer is quicker, more stable, and actively developed.
That was what I found, as well, for my PhotoStructure tests. It's great to have them driven by gitlab's ci and have cross-platform tests run automatically, and it will be great to now have them run cross-browser, as well.
Much better access to low-level information. You can hook (and mock) requests, see what POST data was sent and basically do lots of magic. Also, my experience with Selenium under Python is that doing anything bigger quickly turns into one giant timing hack.
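For anyone who hasn't seen it, request hooking in Puppeteer looks roughly like this (the endpoint and mock body are made-up examples):

```ts
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setRequestInterception(true);

  page.on("request", (req) => {
    // Inspect POST data as it goes out.
    if (req.method() === "POST") console.log("POST body:", req.postData());

    // Mock a specific endpoint instead of hitting the real backend.
    if (req.url().endsWith("/api/user")) {
      req.respond({
        status: 200,
        contentType: "application/json",
        body: JSON.stringify({ name: "Test User" }),
      });
    } else {
      req.continue();
    }
  });

  await page.goto("https://example.com");
  await browser.close();
})();
```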
You're not technically wrong, but unit/integration/E2E are not hard categories for tests, they're just points along a continuum.
Sometimes I want to directly control a browser to run a test or examine a UI component, but I still want to mock parts of my application. This is one of the biggest weaknesses with Selenium -- it takes a very narrow view of what testing is, so it becomes much less useful as soon as you're trying to do anything interesting.
Different applications call for different testing strategies. Selenium could be a low-level browser controller that allows you to write strict E2E tests, but instead it's a low-level browser controller that is almost fanatical about only supporting strict E2E tests, even when it means that basic actions need to be more complicated. I think that's a design mistake, but whatever.
It's not that Selenium makes it impossible to do things like this, it's that it makes it arbitrarily harder.
If you want to do something like request interception, you have to point your front-end at a proxy server and mock things there. If you want to test a file upload, you have to simulate typing into the dialog and then use a literal file on the disk. It's just pointless friction for most testing setups.
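For contrast, a file upload in Puppeteer is set directly on the element handle, with no dialog simulation (selector and path are made-up examples; it does still take a local file path):

```ts
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com/upload");

  const input = await page.$('input[type="file"]');
  // No dialog typing: the chosen file is attached straight to the input.
  await input?.uploadFile("./fixtures/avatar.png");

  await browser.close();
})();
```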
It's not that E2E testing is bad, it's that in some projects there's room for a Selenium-style tool across the entire spectrum from E2E testing to unit testing; particularly if I'm testing something like D3 code, or parts of a CSS layout.
It's all doable, it's just that Selenium makes it all needlessly difficult, because its point-of-view is that your front-end should be treated mostly like a black box, and that any mocking that does exist should be happening via endpoints and application connectors.
I like Selenium -- I prefer using an Open protocol over Puppeteer. I just feel like that protocol could use a lot more design work.
>This is one of the biggest weaknesses with Selenium -- it takes a very narrow view of what testing is
Selenium/WebDriver isn't a testing tool. It's a W3C protocol for remote control of web browsers. I think once you get over that misconception, Selenium starts to make a lot more sense.
That still doesn't explain to me design decisions like how WebDriver's file upload works.
If I'm remote-controlling a browser over a network, wouldn't I occasionally want to send it an arbitrary file upload? Why do I need to separately transfer the file using a different service, and then refer to it with the disk path?
The only reason I can think of is, "normal users couldn't do that." But normal users also can't control a browser over a network. There's no reason to restrict a control protocol to only things that normal users can do.
That is just further reason why it ought to support scenarios the grandparent was describing, like modifying request data, even if that functionality is not useful for automated testing.
Back when I did Selenium (with Python), I used a few tiny helper functions to wait for the next page to load (the old HTML element becomes stale) and for the textbox/checkbox/etc. you're selecting to be present. That took most of the magic sleeps out, although I still needed timeouts on the waits.
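The commenter's helpers were in Python; a rough equivalent with the Node selenium-webdriver binding (selector and timeout are made-up examples) would be:

```ts
import { By, until, WebDriver, WebElement } from "selenium-webdriver";

const TIMEOUT_MS = 10_000;

// Wait for the previous page to unload: an element from it goes stale.
async function waitForPageChange(driver: WebDriver, oldEl: WebElement) {
  await driver.wait(until.stalenessOf(oldEl), TIMEOUT_MS);
}

// Wait for the control you're about to interact with to exist and be visible.
async function waitForControl(driver: WebDriver, css: string) {
  const el = await driver.wait(until.elementLocated(By.css(css)), TIMEOUT_MS);
  return driver.wait(until.elementIsVisible(el), TIMEOUT_MS);
}
```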
With WebDriver becoming a standard, it's sad the browser vendors didn't work together to improve it. Instead Google made their own thing, and now the world has to follow.
I'm using Puppeteer at work to dump DOM changes, console logs, network requests, CSS and images to create a test bundle that can be replayed with a test bundle viewer (i.e. move the slider to move the browser view of the test through the changes). Puppeteer is powerful enough to let that happen! And overall Puppeteer has caused us fewer headaches than Selenium.
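A stripped-down sketch of the console and network hooks that make this possible (file names are examples; capturing DOM changes would additionally need something like a MutationObserver injected via page.evaluateOnNewDocument, omitted here):

```ts
import puppeteer from "puppeteer";
import { appendFileSync } from "node:fs";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Console output from the page.
  page.on("console", (msg) => appendFileSync("console.log", msg.text() + "\n"));

  // Every network response with its status and URL.
  page.on("response", (res) =>
    appendFileSync("network.log", `${res.status()} ${res.url()}\n`)
  );

  await page.goto("https://example.com");
  await browser.close();
})();
```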
Selenium feels old and outdated; it's always a hassle to install it and get it running, and I have to use WebdriverIO with it in Node. I tried Puppeteer a few days ago, I just did `npm i puppeteer` and it worked as expected.
Just like any other module, but slower. There's a substantial bag of bits it needs to fetch (chromium), but that's cached.
I've also noticed that sometimes (in CI, less than 5% of the runs) the installation or run will fail deep in chromium. It's actually spawning a chrome process and tunneling remote commands to the child process to do work, so, I guess the flakiness is (almost) excusable?
Restarting the test always resolves the issue, fwiw.
You can install Puppeteer without downloading the bundled Chromium (PUPPETEER_SKIP_CHROMIUM_DOWNLOAD env variable), but you need to be careful to use it with a version of Chrome/Chromium that has a compatible dev tools api version.
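A sketch of what that looks like (the executable path is just an example; the installed Chrome/Chromium must speak a DevTools protocol version compatible with your Puppeteer release):

```ts
// Installed with: PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 npm i puppeteer
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch({
    // Point at an existing system browser instead of the bundled one.
    executablePath: "/usr/bin/chromium-browser",
  });
  // ... use the browser as usual ...
  await browser.close();
})();
```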
I too have experienced flakiness in CI where Chrome doesn't always seem to start or quit within a reasonable amount of time. I get around this in production with timeouts that give any lingering Chrome processes a series of increasingly aggressive requests to quit, but my hosted CI environment doesn't like it when I send kill signals to random PIDs.
I'm hoping that this Firefox client will be somewhat more reliable in that regard.
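A minimal sketch of that escalating shutdown, assuming the Chrome process handle comes from Puppeteer itself (timings are made up):

```ts
import type { Browser } from "puppeteer";
import { setTimeout as delay } from "node:timers/promises";

async function stopBrowser(browser: Browser) {
  const proc = browser.process(); // the Chrome child process Puppeteer spawned
  await Promise.race([browser.close(), delay(5_000)]); // give close() a chance

  if (proc && proc.exitCode === null) {
    proc.kill("SIGTERM"); // ask the lingering process to quit
    await delay(5_000);
  }
  if (proc && proc.exitCode === null) {
    proc.kill("SIGKILL"); // last resort, targeting only the known child PID
  }
}
```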
Simple, zero configuration, fast. With a new installation of Selenium I always have to download the Firefox/Chrome drivers, set up paths, and debug startup errors, plus it crashes pretty often.
This is about to change as WebdriverIO will support automation based on Puppeteer soon: https://github.com/webdriverio/webdriverio/pull/4210. With that, you only need to install WebdriverIO via NPM and the browser you want to automate.
The experiment was a great start to see what is needed to support Puppeteer. The next step is coming up with a well-integrated architecture. Find me on the web to talk to me about your use cases and ideas for Puppeteer.
I recognize your name from the Rust community. Have you checked out the rust-headless-chrome library yet? It is essentially our Puppeteer, in Rust. I added the print_to_pdf feature a few months ago; I haven't touched the other functionality yet. The project isn't feature complete yet, and it would also benefit from an async refactor, maybe once async/await lands.
How? With Selenium I have a case where WebElement.click() fails but doesn't throw. Also, Select(...).select_by_visible_text() throws WebDriverException, StaleElementException, etc. even if you WebDriverWait()'d for it to be available. To me these show Selenium is unreliable. How is Puppeteer more reliable in these cases?
It's probably because, in the meantime, the DOM nodes have been replaced by new ones.
Selenium has its quirks, but once you understand how it works, its limitations, and its user-centric philosophy, there is nothing stopping you from writing rock-solid, stable tests.
Sorry I didn't define the problems clearly, and thank you for your help. I am using Selenium with Python for web automation, not testing. The latter problem was by and large solved by subclassing WebDriverWait and adding the common exceptions related to node staleness. The former was tackled by writing a for loop that detects changes in the context to tell whether WebElement.click() failed silently.
The thing is, I have to write these workarounds everywhere. What's more, I inject JavaScript code a lot to avoid node staleness caused by network latency. I believe Selenium is not designed for writing lots of JavaScript. And JavaScript code in a Python string cannot be linted, which increases the debugging workload.
At this point, the cost of these hacks is too high for my project. After trying Puppeteer, I find it plays nicely with JavaScript. But the difference in reliability between waitFor() in Puppeteer and WebDriverWait() in Selenium still seems like an open question.
I am wondering if using Puppeteer can ease the pains mentioned above.
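For reference, the Puppeteer side of that comparison looks roughly like this (selector, URL, and timeout are made-up examples); waitForSelector re-queries the DOM on each poll, so a node that gets replaced by a re-render is simply found again rather than going stale:

```ts
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com");

  // Wait until the element exists and is visible, then act on it.
  const button = await page.waitForSelector("#submit", {
    visible: true,
    timeout: 10_000,
  });
  await button?.click();

  await browser.close();
})();
```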
Selenium for sure doesn't fit your use case then :).
I pointed out the "user-centric approach" because Selenium focuses on allowing only the things a normal user could do, in the way a normal user would do them. That is limiting for automation, but it's a blessing if you want to look as close to a real user as you can. I worked on solving Google reCAPTCHA v2/v3 at a bigger scale, and it's been a solid advantage.
> And javascript code in a string of python can not be linted, which increases debugging workload.
If you use PyCharm or any other JetBrains IDE you can use `Language Injections` [0]
> I am wondering if using puppeteer can ease pains mentioned above.
I'm not sure about Puppeteer, but the Chrome DevTools Protocol [1] (I think this is what Puppeteer uses under the hood, but I'm not sure) may be interesting for you because it gives low-level access to browser mechanisms like injecting code before page load, request interception, or separate browser contexts for separate tabs.
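For a rough idea of what those mechanisms look like when driven through Puppeteer (the injected snippet and URL are made-up examples, and createIncognitoBrowserContext is the API name used by Puppeteer versions of that era):

```ts
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();

  // Separate browser context: isolated cookies/storage per "tab".
  const context = await browser.createIncognitoBrowserContext();
  const page = await context.newPage();

  // Inject code before any page script runs.
  await page.evaluateOnNewDocument(() => {
    (window as any).__instrumented = true;
  });

  // Request interception (backed by the CDP network domains).
  await page.setRequestInterception(true);
  page.on("request", (req) => req.continue());

  await page.goto("https://example.com");
  await browser.close();
})();
```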
Puppeteer is really really good. I just discovered it and used it for something and it’s a blast to work with.
I’m seriously thinking there is a cloud api play in the near future — a puppeteer Page is a pretty awesome container format ... a Puppeteer Page or BrowserContext per request would provide an awesome cloud-function-like programming model I think...
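A rough sketch of that per-request model (route, port, and the work done per request are all invented for illustration):

```ts
import http from "node:http";
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();

  http
    .createServer(async (req, res) => {
      // One isolated context per incoming request, like a tiny cloud function.
      const context = await browser.createIncognitoBrowserContext();
      const page = await context.newPage();
      try {
        await page.goto("https://example.com");
        const title = await page.title();
        res.end(JSON.stringify({ title }));
      } finally {
        await context.close(); // tears down the page and all its state
      }
    })
    .listen(3000);
})();
```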
Nice work, I have been using Puppeteer with Chrome recently and it is a great piece of kit. If the prototype is successful, will you keep the Chrome and Firefox packages separate, or merge them and have an abstraction layer so Puppeteer can issue commands regardless of whether Chrome or Firefox is being used as the browser?
Sure but now there are two separate codebases for the same API. A lot of code will be duplicated between the two codebases, you'll have to maintain two sets of issues on github etc.
If it were me I would aim to merge the firefox puppeteer into chrome puppeteer but fair enough if they want to keep them separate.
EDIT: I guess it depends on how much of the code in the Puppeteer Chrome codebase is tightly coupled to Chrome dev tools. Like if 25% is specific to Chrome, then 75% of the code can be shared between the Chrome implementation and the Firefox one; in that case I think merging and hiding the 25% of Chrome/Firefox-specific stuff behind an abstraction layer is the best option.
Is there a benefit to using Firefox with Puppeteer compared to Chromium with Puppeteer? Or is it just a matter of choice and being able to also use Firefox for automated testing?
I run integration tests in Ruby on Rails with Capybara using both the Chrome and Firefox drivers. That project was stopped for a while and when I went back to it the Firefox driver didn't work anymore. I didn't have the time to check what went wrong and fix it but I'm developing with Firefox anyway. I think it's important to test with the major browsers. My customer didn't ask about IE. I remember when I had a few VMs for that and its incompatible version tied to a specific version of Windows.
Interesting, I thought Firefox usage would be increasing based on all the hype/people converting on HN/other tech websites but the market share is actually falling. Guess it puts into perspective the size of that cohort compared to the rest of the world.
After Firefox broke most extensions with their removal of XUL extensions and embrace of "WebExtensions" (aka Chrome extensions), it lost a major differentiation point from Chrome. People simply have no (immediate, practical) reason to use Firefox.
Back in the Mozilla days i was using Mozilla because it was very powerful and flexible despite being very slow (i remember watching Mozilla 0.6's dialog boxes draw themselves) and many pages didn't work with it because everyone only cared about IE (...and i'd put the RIIR crowd to shame with my "evangelization" :-P). Nowadays i use Firefox mainly out of inertia.
Perhaps if Google disables most of the stuff adblockers rely on, there will be again a reason to use Firefox.
It became faster (not sure about double though) but i still remember losing MAFF, DownThemAll and a bunch of other addons. More importantly, what was lost is the underlying functionality that would have let new addons provide features nobody had thought of before.
If you look at the graph, you'll see that the Quantum release had no discernable impact on Firefox adoption.
In any case, it still has a number of major differentiation points - more capability for extensions is even one of them, see their Facebook Container extension. But obviously the one Google will never be able to copy is a focus on privacy.
I never used Chrome as a main browser (though i do use it sometimes for its automatic translation).
Performance is ok but it isn't my main concern, i'm also concerned about features (after all lynx is faster than both of those browsers, yet it lacks a bit on the features side).
In any case, this isn't a hill i care to die on. I just do not see much of a difference between the two browsers anymore in terms of what they can do. I used to like Mozilla for its features (in fact i was really annoyed that they switched their focus to Firefox back in the day and left what they renamed to "Mozilla Suite" to die) and Firefox later for its (remaining) features. Funny enough now that i think about it, the reason was also "performance" back then and again i didn't care about it but i did care about the features lost.
I guess they kept on the same path and eventually Firefox will be a chromeless, featureless HTML terminal - not very useful as a browser, but it'll be the fastest HTML terminal :-P.
From what I see, Puppeteer is very much like TestCafe (able to run with headless/full Firefox, Chrome, Edge, etc., but TestCafe doesn't need a custom-built Firefox). TestCafe can run JavaScript code on the page and return the data back, which means it can be used to get data in POST requests. It can take screenshots too. If anybody is familiar with both tools, could you please explain what this tool can do that the other cannot?
As far as I know, Puppeteer actually controls the browser, whereas TestCafe injects things into the page - risking influencing the results of your tests. I've never used TestCafe, so I don't know how justified that worry is.
I have never heard of TestCafe, but I've used Puppeteer for a variety of projects with Chrome. My latest one was using it to convert HTML files into PDFs en masse.
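A minimal sketch of that kind of batch conversion (directory layout and PDF options are made-up examples):

```ts
import { readdirSync } from "node:fs";
import path from "node:path";
import puppeteer from "puppeteer";

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Convert every local HTML file to a matching PDF.
  for (const file of readdirSync("./html").filter((f) => f.endsWith(".html"))) {
    await page.goto("file://" + path.resolve("./html", file));
    await page.pdf({
      path: path.join("./pdf", file.replace(/\.html$/, ".pdf")),
      format: "A4",
      printBackground: true,
    });
  }

  await browser.close();
})();
```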
My experience with wkhtmltopdf a few years ago was that the PDFs that it rendered could look quite different from the PDFs that Chrome or Firefox would render.
Firefox uses PDF.js [1] to render PDFs. It's a standalone project for rendering PDFs in the browser using HTML and JS. If I remember correctly, you can even render PDFs in Chrome using it.
Likely part of a web scraping project that may require navigation or other browser interactions to get to the content. Or they are trying to avoid detection with a more real browser (you have to change some stuff first).
Sorry, I'm in that position myself. A couple of years ago I compiled it without too much difficulty, and the included example I tried out rendered a page to a .png. Doing that for thousands of PDFs would just take a loop (or find -exec) if you have the files locally.
Firefox has lost its lead because it's lost the developers.
Chrome has invested massively in developer tools and it shows... I think Chrome is the best software ever written. The developer tools are so powerful that it's hard to even know the full breadth of what's in there.
Firefox offers nothing to developers over chrome, which is sad.
Can't you only have one profile open at a time, and each one is linked to a different Google account?
Firefox allows you to have each tab open in a different container, and they don't have to be tied to anything in particular.
I can have different tabs open to the AWS console for each AWS account I use. I can also switch tabs to change between different user accounts to test interactions between users on the site I'm developing.
No. Each profile opens in a new Chrome window. You can have as many open as you'd like.
> and each one is linked to a different Google account?
No. Local profiles have always been a thing.
Chrome's implementation has better UX for privacy IMHO; it's harder to accidentally open a tab in a different profile than expected, which I do _all the time_ in Firefox. In Chrome, Cmd+T opens a new tab in the active profile. In Firefox, Cmd+T opens a new tab in the default profile.
Yeah, the implementation could use some work. I also use the temporary containers plugin so by default tabs open in new containers. I think this gives better security than either of these defaults. If I need a site to open up in a particular container, I can always set that too.
Vimium/cVim just don't match up and aren't under as aggressive development. For this keyboard-focused developer, there's not a good enough reason to be on Chrome. I know I'm in the minority, but it's a point nonetheless.
Working with both browsers daily for PhotoStructure, I can tell you I miss the FF grid tool and the Changes panel when I'm in Chrome dev tools. More features are coming: https://www.mozilla.org/en-US/firefox/developer/
https://developers.google.com/web/tools/puppeteer/