Prototype: Puppeteer for Firefox (github.com/googlechrome)
335 points by amjd on July 23, 2019 | 127 comments



In case you're wondering, and the linked page doesn't explain:

> Puppeteer is a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium.

https://developers.google.com/web/tools/puppeteer/
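
To make the quoted description concrete, here is a minimal sketch of what a Puppeteer script tends to look like. It assumes the `puppeteer` package has been installed and is handed in by the caller; the function name `captureScreenshot`, the example URL and the output file name are illustrative only.

```javascript
// Minimal sketch: open a page in headless Chrome/Chromium and save a screenshot.
// `puppeteer` is passed in so the caller decides which package to load.
async function captureScreenshot(puppeteer, url, path) {
  const browser = await puppeteer.launch(); // headless unless configured otherwise
  try {
    const page = await browser.newPage();
    await page.goto(url);
    await page.screenshot({ path, fullPage: true });
    return await page.title();
  } finally {
    await browser.close(); // always shut the browser down, even on errors
  }
}

// Usage (assumes `npm i puppeteer` has been run):
// captureScreenshot(require('puppeteer'), 'https://example.com', 'example.png')
//   .then((title) => console.log('Captured:', title));
```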


Thank you. This industry has a huge failure mode in touting tech used in projects/frameworks without linking out to at least the source so that someone can learn about them. Can't tell you how many things I've bailed on because it was a pile of obscure library references that weren't (to me) worth looking up.


Indeed, there is a skill when doing technical writing of putting yourself in the position of a reader, particularly one who is competent but who doesn't already know the thing that you are trying to explain, or the context. It's not as common as it should be.


To be fair, in this case the linked page is a subdirectory of the main Puppeteer project.


This is also an experimental project. Who here thoroughly documents their experiments? Although to be fair we did announce it at I/O so that calls for more documentation.

Disclosure: Chrome DevTools docs guy


I always find it weird that people constantly try to "sell" me (us) something but never tell me what that thing is or why I want it.


I can only assume that anyone who submits "Puppeteer for Firefox" or "x for y" to Hacker News is someone who works daily with Puppeteer / x, has the necessary context, and doesn't notice that it's not aimed at a more general audience (1).

Then it gets upvoted to a more general audience, who ask "Puppeteer? what's that?"

1) https://news.ycombinator.com/item?id=20507728


Huh. I worked on exactly this idea as a Mozilla intern in 2017. I wrote a collection of Rust crates (mentioned in [0]) comprising auto-generated types for all Chrome DevTools Protocol messages [1], (de)serialization, and a server lib to handle both the initial HTTP handshake and subsequent command execution over WebSockets. I used that to build out an initial CDP server inside Firefox (and Servo) with support for a couple of sample commands.

At the end of the internship I handed over everything I'd built by that point, with the idea that they (Mozilla) would turn it into a finished project. But I never saw any activity on that front after I left, so I assumed the idea'd been scrapped. Feels very weird seeing this pop up out of the blue on HN today.

[0] https://bugzilla.mozilla.org/show_bug.cgi?id=1523104#c1

[1] Complete with documentation comment generation; I found the result of running rustdoc over that crate super useful as an alternative to the official CDP docs site: https://www.spinda.net/files/mozilla/rust-cdp/doc/cdp/index....


There’s a lot of debate between the browser vendors and automation tools around whether the Chrome DevTools Protocol is the right abstraction layer to focus browser automation around. It’s a nuanced topic and if you dig into it you’ll see that all the different sides make good arguments. My guess is that this debate stalled your project. I’m putting myself out here on the topic because having a coherent and enjoyable cross-browser integration testing story would be a huge win for the web at large and it’s something worth making people aware of in the hopes that we can find common ground.

Edit: I actually have no right to comment on what stalled your project and have no idea what happened there. I just wanted to discuss the fact that we need a good abstraction layer but seem stuck in debate.

Disclosure: Chrome DevTools docs guy, just speaking personally based on my research into the topic over the last year


No need to apologize, I appreciate your perspective. That was my assumption as well at the time - that the debate had shifted the other way and CDP support fell off of the priority list.


Looks like this integration is being developed primarily by a Google employee, rather than a Mozillian ... so perhaps Mozilla is still not particularly into this idea :)


The Firefox end of this is being led by :ato at Mozilla.

https://bugzilla.mozilla.org/show_bug.cgi?id=protocdp


Are you saying they didn't credit you for your work?


I'm not saying that. Just sharing my experience and expressing surprise at the reappearance of what I thought was a dead project.


It is kind of usable but is still missing stuff. You can check the status here: https://aslushnikov.github.io/ispuppeteerfirefoxready/


We're using it to test our browser extension at Sourcegraph in both Firefox and Chrome: https://github.com/sourcegraph/sourcegraph/blob/master/brows...


Tangentially related => I used ffmpeg + puppeteer to build StoryScroll https://neal.rs/app, which turns blog posts into scrolling videos for social media.

I also used it to build a desktop app that my team uses to download Alexa Flash Briefing metrics (because there's no API for that).

Puppeteer's full-screen screenshots & DOM manipulation abilities are clutch!!


For automated testing, why Puppeteer instead of Selenium?


Just download both and try doing the same test suite. See which one hurts the least. I'm 99% sure Puppeteer will feel like a breath of fresh air to you.

CasperJS was OK, even better with Nightwatch but Puppeteer is quicker, more stable and actively developed.


That was what I found, as well, for my PhotoStructure tests. It's great to have them driven by gitlab's ci and have cross-platform tests run automatically, and it will be great to now have them run cross-browser, as well.


Much better access to low-level information. You can hook (and mock) requests, see what POST data was sent and basically do lots of magic. Also, my experience with Selenium under Python is that doing anything bigger quickly turns into one giant timing hack.
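
A rough sketch of the request hooking and mocking described above, using Puppeteer's request interception. The helper names `handleRequest` and `installMocks`, the shape of the `mocks` map, and the example URL are assumptions made for illustration; only `setRequestInterception`, `page.on('request')`, and the request methods `url()`, `method()`, `postData()`, `respond()` and `continue()` come from Puppeteer itself.

```javascript
// Decide what to do with one intercepted request.
// `mocks` maps an exact URL to a canned JSON body (hypothetical shape);
// `log` collects the POST data seen along the way.
function handleRequest(request, mocks, log) {
  const url = request.url();
  if (request.method() === 'POST') {
    log.push({ url, postData: request.postData() });
  }
  if (Object.prototype.hasOwnProperty.call(mocks, url)) {
    // Answer from the mock instead of hitting the network.
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(mocks[url]),
    });
  } else {
    request.continue(); // let everything else through untouched
  }
}

// Turn interception on for a page and return the log that fills up as requests arrive.
async function installMocks(page, mocks) {
  const log = [];
  await page.setRequestInterception(true);
  page.on('request', (request) => handleRequest(request, mocks, log));
  return log;
}

// Usage, given a Puppeteer `page`:
// const log = await installMocks(page, { 'https://api.example.test/user': { name: 'Ada' } });
```

Once interception is enabled, every request has to be continued, answered or aborted, otherwise the page stalls; that is why the fallback branch calls `continue()`.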


If it's really E2E from the user's perspective, you shouldn't be doing any magic or manipulating POST data.


You're not technically wrong, but unit/integration/E2E are not hard categories for tests, they're just points along a continuum.

Sometimes I want to directly control a browser to run a test or examine a UI component, but I still want to mock parts of my application. This is one of the biggest weaknesses with Selenium -- it takes a very narrow view of what testing is, so it becomes much less useful as soon as you're trying to do anything interesting.

Different applications call for different testing strategies. Selenium could be a low-level browser controller that allows you to write strict E2E tests, but instead it's a low-level browser controller that is almost fanatical about only supporting strict E2E tests, even when it means that basic actions need to be more complicated. I think that's a design mistake, but whatever.


> Sometimes I want to directly control a browser to run a test or examine a UI component, but I still want to mock parts of my application.

If you're building a SPA, you can mock out the backend and control the backend mock behaviour in your Selenium test.


It's not that Selenium makes it impossible to do things like this, it's that it makes it arbitrarily harder.

If you want to do something like request interception, you have to point your front-end at a proxy server and mock things there. If you want to test a file upload, you have to simulate typing into the dialog and then use a literal file on the disk. It's just pointless friction for most testing setups.

It's not that E2E testing is bad, it's that in some projects there's room for a Selenium-style tool across the entire spectrum from E2E testing to unit testing; particularly if I'm testing something like D3 code, or parts of a CSS layout.

It's all doable, it's just that Selenium makes it all needlessly difficult, because its point-of-view is that your front-end should be treated mostly like a black box, and that any mocking that does exist should be happening via endpoints and application connectors.

I like Selenium -- I prefer using an Open protocol over Puppeteer. I just feel like that protocol could use a lot more design work.


>This is one of the biggest weaknesses with Selenium -- it takes a very narrow view of what testing is

Selenium/WebDriver isn't a testing tool. It's a W3C protocol for remote control of web browsers. I think once you get over that misconception, Selenium starts to make a lot more sense.


That still doesn't explain to me design decisions like how WebDriver's file upload works.

If I'm remote-controlling a browser over a network, wouldn't I occasionally want to send it an arbitrary file upload? Why do I need to separately transfer the file using a different service, and then refer to it with the disk path?

The only reason I can think of is, "normal users couldn't do that." But normal users also can't control a browser over a network. There's no reason to restrict a control protocol to only things that normal users can do.


>There's no reason to restrict a control protocol to only things that normal users can do.

The selenium maintainers seem to have a very strong opposite opinion.


That is just further reason why it ought to support scenarios that the grandparent was describing, like modifying request data, even if that functionality is not useful for automated testing


Back when I did Selenium (with Python), I waited for the next page to load (the html element becomes stale) and for the textbox/checkbox/etc. you are selecting to be present, using a few tiny helper functions. That took most of the magic sleeps out, although I still needed timeouts on the waits.


Does puppeteer work with Safari and MS browsers or is it another Chrome only tool (and now Firefox)?


It's yet another Chrome only tool.


It works very well for Microsoft Edge too (latest DEV version of it) - https://github.com/GoogleChrome/puppeteer/issues/4185#issuec...


You mean it will work with Microsoft Edge when Microsoft Edge becomes Chrome.


With WebDriver becoming a standard it's sad the browser vendors didn't work together to improve that. Instead Google made their own thing and now the world has to follow


Agreed. Any idea why Google didn't work with WebDriver and decided to spin-off yet another homegrown tech?

My conspiracy-laden mind veers towards the obvious but I'd prefer a better explanation.


I am done with selenium because you cannot make stable tests. They are always flaky.


I'm using Puppeteer at work to dump DOM changes, console logs, network requests, CSS and images to create a test bundle that can be replayed with a test bundle viewer (i.e. move the slider to step the browser view of the test through the changes). Puppeteer is powerful enough to let that happen! And overall Puppeteer has caused us fewer headaches than Selenium.


Selenium feels old and outdated, it's always a hassle to install it/get it running and I have to use WebdriverIO with it in Node. I tried Puppeteer a few days ago, I just did `npm i puppeteer` and it worked as expected.


How would you describe the npm install process?


Just like any other module, but slower. There's a substantial bag of bits it needs to fetch (chromium), but that's cached.

I've also noticed that sometimes (in CI, less than 5% of the runs) the installation or run will fail deep in chromium. It's actually spawning a chrome process and tunneling remote commands to the child process to do work, so, I guess the flakiness is (almost) excusable?

Restarting the test always resolves the issue, fwiw.


You can install Puppeteer without downloading the bundled Chromium (PUPPETEER_SKIP_CHROMIUM_DOWNLOAD env variable), but you need to be careful to use it with a version of Chrome/Chromium that has a compatible dev tools api version.

I too have experienced flakiness in CI where Chrome doesn't always seem to start or quit within a reasonable amount of time. I get around this in production with timeouts that give any lingering Chrome processes a series of increasingly aggressive requests to quit, but my hosted CI environment doesn't like it when I send kill signals to random PIDs.

I'm hoping that this Firefox client will be somewhat more reliable in that regard.
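
The "increasingly aggressive requests to quit" idea can be sketched roughly as follows. This is not the commenter's code: the function name `stopChild`, the signal order and the 2-second step are assumptions, and signal handling differs on Windows. With Puppeteer, the child process would typically come from `browser.process()`.

```javascript
// Ask a child process to quit with increasingly aggressive signals.
// Resolves once the process has exited (or immediately if it already has).
// If every signal has been sent and the process still lives, the promise stays pending.
function stopChild(child, signals = ['SIGINT', 'SIGTERM', 'SIGKILL'], stepMs = 2000) {
  return new Promise((resolve) => {
    if (child.exitCode !== null || child.signalCode !== null) {
      resolve({ code: child.exitCode, signal: child.signalCode });
      return;
    }
    let done = false;
    let timer = null;
    let index = 0;
    child.once('exit', (code, signal) => {
      done = true;
      clearTimeout(timer);
      resolve({ code, signal });
    });
    const escalate = () => {
      if (done || index >= signals.length) return; // exited, or nothing stronger left
      child.kill(signals[index++]);
      if (!done) timer = setTimeout(escalate, stepMs); // give it time before escalating
    };
    escalate();
  });
}

// Usage, given a Puppeteer `browser` launched by this script:
// await stopChild(browser.process());
```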


I meant installing npm itself.


Simple, 0 configuration, fast. With a new installation of Selenium I always have to download the Firefox/Chrome drivers, set up paths, and debug startup errors, plus it crashes pretty often.


This is about to change, as WebdriverIO will support automation based on Puppeteer soon (https://github.com/webdriverio/webdriverio/pull/4210). With that, you only need to install WebdriverIO via NPM and the browser you want to automate.


No, I meant installing npm


I already had npm installed. And I think npm comes within the Node.js installer.


Puppeteer covers more browser-native features. For example, Puppeteer can capture HTTP traffic.


Javascript-based instead of java-based api?


Selenium / WebDriver has an HTTP API, so you can have clients in any language, not just in Java.


It would be awesome to have Puppeteer support Python... then I'd drop Selenium from my toolbox.


My company is using https://github.com/miyakogi/pyppeteer in production and it has worked great. I suggest checking it out.


We announced this at Google I/O 2019 and mentioned it in What’s New In DevTools (Chrome 76): https://developers.google.com/web/updates/2019/05/devtools#p...


For anybody who wants to follow along, the Firefox work is tracked here: https://gitter.im/webhintio/Firefox

The experiment was a great start to see what is needed to support Puppeteer. The next step is coming up with a well-integrated architecture. Find me on the web to talk to me about your use cases and ideas for Puppeteer.



For anyone who wants to play around with some Puppeteer examples, I built https://puppeteersandbox.com.

If you want to record your own scripts, I also am actively developing https://checklyhq.com/Puppeteer-recorder, a Chrome extension.


That second link is a 404, although I love your 404 page GIF =)

Checkly looks like a great product!


Thanks, building a SaaS is 99% getting the 404 pages right!


You misspelled the URL https://checklyhq.com/puppeteer-recorder

Love recording macros


Argh, thanks for the heads up


How is the DevTools/Remote protocol better (or worse) than the WebDriver protocol for integration testing?


It's much more reliable.


I recognize your name from the Rust community. Have you checked out the rust-headless-chrome library yet? It is essentially our puppeteer, in Rust. I added the print_to_pdf feature a few months ago. I haven't touched the other functionality yet. The project isn't feature complete, yet, and also would benefit by an asyncio refactor, maybe when async-await releases.


How? With Selenium I have a case where WebElement.click() fails but doesn’t throw. Also, Select(...).select_by_visible_text() throws WebDriverException, StaleElementReferenceException, etc. even if you WebDriverWait()’d for it to be available. To me these show Selenium is unreliable. How is Puppeteer more reliable in these cases?


It's probably because in the meantime the DOM nodes have been replaced by new ones.

Selenium has its quirks, but when you understand how it works, its limitations and its user-centric philosophy, there is nothing that can stop you from writing rock-solid, stable tests.


Sorry I didn't define the problems clearly, and thank you for your help. I am using Selenium with Python for web automation, not testing. The latter problem was by and large solved by subclassing WebDriverWait and adding common exceptions related to node staleness. The former problem was tackled by writing a for loop that detects changes of the context to know whether WebElement.click() failed silently.

The thing is I have to write these workarounds everywhere. What's more, I inject javascript code a lot to avoid nodes staleness caused by network latency. I believe selenium is not designed for writing lots of javascript. And javascript code in a string of python can not be linted, which increases debugging workload.

At this point, the cost of these hacks is too much in my project. After trying it, I find Puppeteer plays nicely with JavaScript. But the difference in reliability between waitFor() in Puppeteer and WebDriverWait() in Selenium still remains questionable.

I am wondering if using puppeteer can ease pains mentioned above.
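
For comparison with the WebDriverWait workarounds described above, a wait-then-click helper in Puppeteer might look roughly like this. It is only a sketch: `clickWhenReady` and its `attempts` option are hypothetical, and it does not settle whether Puppeteer's waits are more reliable in practice, nor does it detect a click that fails silently. It relies on `page.waitForSelector` with the `visible` and `timeout` options.

```javascript
// Wait until `selector` is present and visible, then click it.
// The element is looked up again on every attempt, so a DOM node that was
// replaced between attempts is picked up fresh instead of going stale.
async function clickWhenReady(page, selector, { timeout = 5000, attempts = 3 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const handle = await page.waitForSelector(selector, { visible: true, timeout });
      await handle.click();
      return attempt; // number of attempts it took
    } catch (error) {
      lastError = error; // e.g. a timeout, or the node detached before the click
    }
  }
  throw lastError;
}

// Usage, given a Puppeteer `page`:
// await clickWhenReady(page, 'button[type=submit]', { timeout: 10000 });
```

Because the script itself is JavaScript, any code that must run inside the page can sit in the same linted source file rather than in a string.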


Selenium for sure doesn't fit your use case then :). I pointed out the "user-centric approach" because Selenium focuses on allowing only things a normal user could do, in the way a normal user would, which is limiting for automation but a blessing if you want to look as close to a real user as you can. I worked on solving Google reCAPTCHA v2/v3 at a bigger scale and it's been a solid advantage.

> And javascript code in a string of python can not be linted, which increases debugging workload.

If you use PyCharm or any other JetBrains IDE you can use `Language Injections` [0]

> I am wondering if using puppeteer can ease pains mentioned above.

I'm not sure about Puppeteer, but the Chrome DevTools Protocol [1] (I think this is what Puppeteer uses under the hood, but I'm not sure) may be interesting for you because it gives low-level access to browser mechanisms like injecting code before page load, request interception or separate browser contexts between separate tabs.

[0] https://www.jetbrains.com/help/pycharm/using-language-inject...

[1] https://chromedevtools.github.io/devtools-protocol/


Thank you. The Language Injections feature amazes a vim user, though a similar, more limited plugin exists.


Has anyone else found Puppeteer to be very slow? I was using it to scrape a webpage and found it to be much slower than Selenium.


What is the use case you're trying out Puppeteer for?


Scraping all the posts from a Facebook group.


Puppeteer is really really good. I just discovered it and used it for something and it’s a blast to work with.

I’m seriously thinking there is a cloud api play in the near future — a puppeteer Page is a pretty awesome container format ... a Puppeteer Page or BrowserContext per request would provide an awesome cloud-function-like programming model I think...


I really like https://microlink.io for browser-based automation.


Nice work, I have been using puppeteer with chrome recently and it is a great piece of kit. If the prototype is successful will you keep the chrome and firefox packages separate or merge them and have an abstraction layer so puppeteer can issue commands regardless of chrome/firefox being used for the browser?


They already have the same API


Yeah, that doesn't answer my question though ;)


Merging them in a single package gives you nothing at a significant price. puppeteer = wantFox ? require('puppeteer-firefox') : require('puppeteer')
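
Spelled out a little more, the one-liner above could be wrapped like this. The helper names `packageFor` and `loadPuppeteer` and the `BROWSER` environment variable are made up for the example; `puppeteer-firefox` is the experimental package under discussion and `puppeteer` the regular one.

```javascript
// Map a browser name to the npm package that drives it.
function packageFor(browserName) {
  return browserName === 'firefox' ? 'puppeteer-firefox' : 'puppeteer';
}

// Load the chosen package. `load` defaults to Node's `require` and can be
// swapped out, which keeps the selection logic testable without either package.
function loadPuppeteer(browserName = process.env.BROWSER, load = require) {
  return load(packageFor(browserName));
}

// Usage: the rest of the script is identical for both browsers.
// const puppeteer = loadPuppeteer('firefox');
// const browser = await puppeteer.launch();
```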


Sure but now there are two separate codebases for the same API. A lot of code will be duplicated between the two codebases, you'll have to maintain two sets of issues on github etc.

If it were me I would aim to merge the firefox puppeteer into chrome puppeteer but fair enough if they want to keep them separate.

EDIT: I guess it depends on how much of the code in the Puppeteer Chrome codebase is tightly coupled to Chrome dev tools. Like if 25% is specific to Chrome, then that would mean that 75% of the code can be shared between the Chrome implementation and the Firefox one; in that case I think merging and hiding the 25% Chrome/Firefox-specific stuff behind an abstraction layer is the best option.


This is nice work. Does anyone know if it's possible to enable reader view "programmatically", though?


Sure, just go to `about:reader?url=whatever`

Or did you mean something else?
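
A small sketch of building that address for an arbitrary page. The helper name `readerViewUrl` is hypothetical, and whether a given automation client is allowed to navigate to `about:` pages is an assumption that has not been checked here.

```javascript
// Build the Firefox reader-view address for a page.
// The target URL is percent-encoded so its own query string survives intact.
function readerViewUrl(pageUrl) {
  return `about:reader?url=${encodeURIComponent(pageUrl)}`;
}

// Usage, given a page object driving Firefox:
// await page.goto(readerViewUrl('https://example.com/article?id=1'));
```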


This is exactly what I'm after. Thanks a lot! :)


Looks like they’re quite far along in developing this according to this status page

https://aslushnikov.github.io/ispuppeteerfirefoxready/


Huh. That's great news. Time to add Firefox support to Lorca lib, I guess.


is there a benefit to using firefox with puppeteer compared to chromium and puppeteer? or is it just a matter of choice and being able to also use firefox for automated testing?


I run integration tests in Ruby on Rails with Capybara using both the Chrome and Firefox drivers. That project was stopped for a while and when I went back to it the Firefox driver didn't work anymore. I didn't have the time to check what went wrong and fix it but I'm developing with Firefox anyway. I think it's important to test with the major browsers. My customer didn't ask about IE. I remember when I had a few VMs for that and its incompatible version tied to a specific version of Windows.


Ah yes, browser compatibility! IE will soon be chromium-webkit based.


Deliverables are supposed to be tested on every major browser. I believe the goal is to use both.


This is the goal.

Disclosure: Chrome DevTools docs guy


I guess it's a matter of choice for people who still use Firefox.


*people who still use Chrome.



Interesting, I thought Firefox usage would be increasing based on all the hype/people converting on HN/other tech websites but the market share is actually falling. Guess it puts into perspective the size of that cohort compared to the rest of the world.


After Firefox broke most extensions with their removal of XUL extensions and embrace of "WebExtensions" (aka Chrome extensions), it lost a major differentiation point from Chrome. People simply have no (immediate, practical) reason to use Firefox.

Back in the Mozilla days i was using Mozilla because it was very powerful and flexible despite being very slow (i remember watching Mozilla 0.6's dialog boxes draw themselves) and many pages didn't work with it because everyone only cared about IE (...and i'd put the RIIR crowd to shame with my "evangelization" :-P). Nowadays i use Firefox mainly out of inertia.

Perhaps if Google disables most of the stuff adblockers rely on, there will be again a reason to use Firefox.


It also doubled in speed. As a long term FF user (and webdev), I was over the moon to trade in a few extensions for that sort of performance boost.

A year or so after the fact, I can't even remember what I've lost. Couldn't have been that important.


It became faster (not sure about double though) but i still remember losing MAFF, DownThemAll and a bunch of other addons. More importantly, what was lost is the functionality that new addons could have used to provide features not thought of by anyone before.


If you look at the graph, you'll see that the Quantum release had no discernable impact on Firefox adoption.

In any case, it still has a number of major differentiation points - more capability for extensions is even one of them, see their Facebook Container extension. But obviously the one Google will never be able to copy is a focus on privacy.


Firefox actually is FASTER than Chrome. Also, if you have any privacy concerns, you should stop using Chrome ASAP.


I never used Chrome as a main browser (though i do use it sometimes for its automatic translation).

Performance is ok but it isn't my main concern, i'm also concerned about features (after all lynx is faster than both of those browsers, yet it lacks a bit on the features side).

In any case, this isn't a hill i care to die on. I just do not see much of a difference between the two browsers anymore in terms of what they can do. I used to like Mozilla for its features (in fact i was really annoyed that they switched their focus to Firefox back in the day and left what they renamed to "Mozilla Suite" to die) and Firefox later for its (remaining) features. Funny enough now that i think about it, the reason was also "performance" back then and again i didn't care about it but i did care about the features lost.

I guess they kept on the same path and eventually Firefox will be a chromeless, featureless HTML terminal - not very useful as a browser, but it'll be the fastest HTML terminal :-P.


Firefox is faster than Chrome with their synthetic benchmarks.


From what I see, Puppeteer is very much like TestCafe (able to run with headless/full Firefox, Chrome, Edge ..., but TestCafe doesn't need a custom-built Firefox). TestCafe can run JavaScript code on the page and return the data back, which means it can be used to get data in POST requests. It can take screenshots too. If anybody is familiar with both tools, could you please explain what one tool can do that the other cannot?


As far as I know, Puppeteer actually controls the browser, whereas TestCafe injects things into the page - risking influencing the results of your tests. I've never used TestCafe, so I don't know how justified that worry is.


I have never heard of TestCafe, but I've used Puppeteer for a variety of projects with Chrome. My latest one was using it to convert HTML files into PDFs en masse.
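
A batch HTML-to-PDF job of this kind could be sketched along these lines. This is not the commenter's code: `pdfPathFor` and `convertAll` are illustrative names, and the A4 format, `printBackground` and `networkidle0` settings are assumptions. `page.pdf()` needs Chrome to be running headless.

```javascript
const path = require('path');
const { pathToFileURL } = require('url');

// Where the PDF for a given HTML file should go: same base name, .pdf extension.
function pdfPathFor(htmlFile, outDir) {
  const base = path.basename(htmlFile, path.extname(htmlFile));
  return path.join(outDir, base + '.pdf');
}

// Render each HTML file to a PDF, reusing one browser for the whole batch.
async function convertAll(browser, htmlFiles, outDir) {
  const written = [];
  for (const htmlFile of htmlFiles) {
    const page = await browser.newPage();
    try {
      const fileUrl = pathToFileURL(path.resolve(htmlFile)).href;
      await page.goto(fileUrl, { waitUntil: 'networkidle0' });
      const target = pdfPathFor(htmlFile, outDir);
      await page.pdf({ path: target, format: 'A4', printBackground: true });
      written.push(target);
    } finally {
      await page.close(); // one page per file keeps memory use flat
    }
  }
  return written;
}

// Usage (assumes `puppeteer` is installed and ./out exists):
// const browser = await require('puppeteer').launch();
// await convertAll(browser, ['a.html', 'b.html'], 'out');
// await browser.close();
```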


Is there an advantage to that approach over the more direct wkhtmltopdf?


It looks exactly like it would in the latest version of Chrome and is not dependent on Qt's Webkit version.


My experience with wkhtmltopdf a few years ago was that the PDFs that it rendered could look quite different from the PDFs that Chrome or Firefox would render.


This approach, for us, proved fastest when we scaled it using Kubernetes compared to the others we tried.


How's pdf rendering in Firefox? Chrome headless has its issues with headers and footers


Firefox uses PDF.js[1] to render pdf. This is a standalone project to render pdf in the browser using html and js. If I remember correctly, you can even render pdf in chrome using this.

[1]https://mozilla.github.io/pdf.js/


It's good, however it's not possible to use the remote debugging protocol for printing to PDF yet.


I just used Chrome headless to render several thousand pdfs last week and I saw no issues in version 73


Did you have a reason to not use PDFium directly?


Likely part of a web scraping project that may require navigation or other browser things to get to. Or they are trying to avoid detection with a more real browser (you have to change some stuff first).


Can you recommend any resources for getting started with pdfium?


Sorry, I'm in that position myself. A couple of years ago I compiled it without too much difficulty, and the included example I tried out rendered a page to a .png. Doing that for thousands of PDFs would just take a loop (or find -exec) if you have the files locally.


I was under the impression it was about to be retired? Did something change?


What is the equivalent of Chrome Dev tools protocol here?


It says it's based on the Mozilla remote protocol: https://wiki.mozilla.org/Remote

"The Firefox Remote Protocol is a low-level debugging interface based on the CDP protocol."


Weird that they'd pick a red panda for the logo.


Firefox has lost its lead because it's lost the developers.

Chrome has invested massively in developer tools for Chrome and it shows.... I think Chrome is the best software ever written. The developer tools are so powerful that it's hard to even know the full breadth of what's in there.

Firefox offers nothing to developers over chrome, which is sad.


> Firefox offers nothing to developers over chrome, which is sad.

Firefox has multi-account containers. Having different tabs logged in to different accounts is huge for me.


I use Chrome profiles to great effect.


Can't you only have one profile open at a time, and each one is linked to a different Google account?

Firefox allows you to have each tab open in a different container, and they don't have to be tied to anything in particular.

I can have different tabs open to the AWS console for each AWS account I use. I can also switch tabs to change between different user accounts to test interactions between users on the site I'm developing.


> Can't you only have one profile open at a time

No. Each profile opens in a new Chrome window. You can have as many open as you'd like.

> and each one is linked to a different Google account?

No. Local profiles have always been a thing.

Chrome's implementation has better UX for privacy IMHO; it's harder to accidentally open a tab in a different profile than expected, which I do _all the time_ in Firefox. In Chrome, Cmd+T opens a new tab in the active profile. In Firefox, Cmd+T opens a new tab in the default profile.


Yeah, the implementation could use some work. I also use the temporary containers plugin so by default tabs open in new containers. I think this gives better security than either of these defaults. If I need a site to open up in a particular container, I can always set that too.


https://github.com/tridactyl/tridactyl

Vimium/cVim just don't match up and aren't under as aggressive development. For this keyboard-focused developer, there's not a good enough reason to be on Chrome. I know I am in the minority, but it's a point nonetheless.


Working with both browsers daily for PhotoStructure, I can tell you I miss the FF grid tool and changes panels on chrome dev tools. More features are coming: https://www.mozilla.org/en-US/firefox/developer/


You actually try FF dev tools ?



