Lots of negativity in the comments, but I think the solutions outlined are really interesting and valuable! There are some idiosyncrasies around unit testing UI that are just hard to work around. And the issues with e2e tests in the context of testing individual UI pieces are totally valid too.
The one thing that at least initially rubs me the wrong way is how the overrides work. Like I get why that's a solution to how we can inject some data, but I don't like the idea of writing test-specific code in a component just to enable tests with this tool. That being said, I've done similar things in the past when I've run out of options and this looks pretty clean, I just wonder if there's another way.
> Conversely, using integration testing tools like Cypress or Playwright provides control over the page, but sacrifices the ability to instrument the bootstrapping code for the app. These tools operate by remotely controlling a browser to visit a URL and interact with the page.
I don't think the author has used Cypress or Playwright. Their real value is that they do not drive browsers from the outside like slow, flaky WebDriver. They also allow injecting test doubles and overriding app functionality (so you don't have to wait for a "normal" 60-second timeout like the author hypothetically suggests).
Author here. Can you show an example of how Playwright would progress a timer on a page? For example, how would you make this component pass faster than 60 seconds?
I'm afraid the answer to this doesn't actually lie in tooling. It lies in software design. If something needs to be controlled, it needs to be controllable. Typically this means push. In a React component, this means props. It could be an optional prop, but once that prop was there, this component could be controlled. Once the component could be controlled via push, the page rendering the component could also be controlled via push. How do you push to a page? Query string params is the most straightforward.
So, imagine a page that rendered a version of this component that a human could navigate to (this is what was historically called a test fixture before Rails rewrote the meaning of this word), then imagine that that human could have complete control over this interval by setting a query string argument. A human can do all of the interactive testing they need. Then, when it comes time to automate, all we need to do is automate what the human can already do.
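As a minimal sketch of that idea: the fixture page could read the interval from the query string and pass it down as a prop. The `intervalMs` parameter name and the `intervalFromQuery` helper are hypothetical, made up for illustration, and are not from the article.

```typescript
// Hypothetical helper: reads a controllable interval from a query string.
// The "intervalMs" parameter name is an assumption for illustration only.
function intervalFromQuery(search: string, fallbackMs: number): number {
  const raw = new URLSearchParams(search).get("intervalMs");
  if (raw === null) return fallbackMs;
  const parsed = Number(raw);
  // Ignore garbage or non-positive values so the default behavior is unchanged.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// A human visiting /fixture?intervalMs=50 and an automated test get the same control.
const fastInterval = intervalFromQuery("?intervalMs=50", 60000);
const defaultInterval = intervalFromQuery("", 60000);
```

The value would then be handed to the component as an optional prop, so the component itself never needs to know about query strings.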
This is another principle of automation that has been lost in history. We should first be able to do things manually before automating them. When we (as automaters) jump right to automation we often simultaneously necessitate and embrace (again, because we identify as automaters) additional complexity in the form of tooling.
I'd venture a guess that SafeTest is not likely to be necessary for the things that it was built for. Software design could have solved the problems with significantly less complexity and tooling while simultaneously providing useful test fixtures for humans to explore.
Storybook kind of enables this, but it's also tooling fixation in my opinion. That's another post, however.
Oh, and I saw your other post about rewriting components to allow testability. You may be tempted to accuse me of suggesting that here. I'm not. I'm suggesting that components are written with fundamental design principles in mind, especially the necessity to exert control.
There's more to say about this that touches on the example of the sign in, and I can expand if interested.
If you want to do it with the whole page and talk only to the local code, then yes, I'd recommend Sinon. I think that's a much simpler solution than . . . creating an all new NIH framework!
I'd also recommend refactoring to a more mock-friendly way to do that countdown if you don't want to cover up all the internal logic.
If the timeout interval is loaded remotely from some API (and it probably is if you have reasonably configurable nag popups), then you can always mock that API call.
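Roughly what I mean by mock-friendly, as a sketch under assumptions: the countdown takes its scheduler as an argument, so a test can pass a fake. `startNag` and `FakeClock` are hypothetical names, and the fake clock here is hand-rolled to keep the snippet self-contained; it stands in for Sinon's fake timers and is not Sinon itself.

```typescript
// The scheduler is the seam: production passes a setTimeout wrapper, tests pass a fake.
type Scheduler = (callback: () => void, delayMs: number) => void;

// Hypothetical countdown: calls onExpire once after delayMs via the given scheduler.
function startNag(onExpire: () => void, delayMs: number, schedule: Scheduler): void {
  schedule(onExpire, delayMs);
}

// Hand-rolled fake clock (a stand-in for Sinon's fake timers, not Sinon itself).
class FakeClock {
  private now = 0;
  private pending: { at: number; callback: () => void }[] = [];

  schedule: Scheduler = (callback, delayMs) => {
    this.pending.push({ at: this.now + delayMs, callback });
  };

  // Advance virtual time and run every callback that has come due, in order.
  tick(ms: number): void {
    this.now += ms;
    const due = this.pending.filter((t) => t.at <= this.now);
    this.pending = this.pending.filter((t) => t.at > this.now);
    due.sort((a, b) => a.at - b.at).forEach((t) => t.callback());
  }
}

const clock = new FakeClock();
let expiredCount = 0;
startNag(() => { expiredCount += 1; }, 60000, clock.schedule);
clock.tick(59999); // nothing has fired yet
clock.tick(1);     // the 60 second nag fires without any real waiting
```

In production the third argument would be something like `(cb, ms) => { setTimeout(cb, ms); }`.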
The point is that you shouldn't need to rewrite your countdown component to allow testing. Can you provide a snippet of that change and what the test would look like?
Not being able to toggle parts of the app is the root of the issue when creating e2e tests. Take overriding a feature flag, for example: you could find the API call for the feature, but what if it's a gRPC call and part of your dev build pulls in any update? You can't easily change the binary response with confidence anymore.
The current solutions are good enough to do smoke tests, but nitty-gritty e2e tests don't scale well.
i think the library's approach to DI is pretty neat (and meets a team where they are which is worth a lot), but i think you're running into an issue where people are saying that instead of working around the realities of your codebase, team and testing needs, you should have done something like this.
i'll say, i have never written a test for a hook or looked into what's needed to actually do that, but i suspect you don't need cypress or webdriver to test something like this has the correct output
or likewise you can probably use sinon or jest's fake timers to test the hook (however it is hooks are tested without triggering that error about not calling hooks outside of a function component, i guess you need to mock React.useState?).
but like, whatever works for your team! i think it's fair to argue for either direction, but neither is zero-cost unless you have buy-in for one direction vs another from your coworkers, which honestly is all that matters especially if you have to eat lunch with them occasionally.
Creating an override is basically just providing a placeholder for a value to be injected via React Context. I view this as a form of dependency injection. Contrast this with how it would be done in vanilla Playwright: reading the value from a query param, or exposing a global to call page.evaluate on. That is more along the lines of forcing test code into a component.
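To make the "placeholder" idea concrete without React, here is a rough sketch in plain TypeScript. It is not SafeTest's actual API: `createPlaceholder`, `withValue`, and `renderBanner` are hypothetical names, and a module-level variable stands in for what React Context would provide.

```typescript
// Hypothetical placeholder for an injectable value; NOT SafeTest's actual API.
interface Placeholder<T> {
  get(): T;
  withValue<R>(value: T, run: () => R): R;
}

function createPlaceholder<T>(defaultValue: T): Placeholder<T> {
  let current = defaultValue;
  return {
    get: () => current,
    // Swap the value in for the duration of run(), then restore the previous one.
    withValue(value, run) {
      const previous = current;
      current = value;
      try {
        return run();
      } finally {
        current = previous;
      }
    },
  };
}

// App code declares the placeholder with its real default and reads through it.
const FetchGreeting = createPlaceholder<() => string>(() => "hello from the real API");

function renderBanner(): string {
  return `Banner: ${FetchGreeting.get()()}`;
}

// A test injects a stub; outside the test the default is untouched.
const inTest = FetchGreeting.withValue(() => "hello from a stub", renderBanner);
const inProd = renderBanner();
```

The component only ever reads the placeholder, so there is no test-only branch inside it.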
Note that if you needed a specific reference in an override there isn't a good way to get that via Playwright, consider this silly example:
Does it not feel a little old-fashioned to dynamically rebind stuff to force some code to be testable, rather than just write it to be testable in the first place? If I saw someone doing this in any other test suite I’d suggest making the dependencies explicit.
I've always wanted to build a UX tester that uses accessibility tech. We get to test both at the same time, and it might make the accessibility tech saner and easier to use also.
I think this is overselling itself a little. It seems genuinely useful to be able to do more expansive integration tests of components, and combinations of components, especially running in real browsers. But that’s only really an incremental gain over component unit tests with mocks. In the case where you own a very complex SPA with multiple back end dependencies you don’t control, then yeah, maybe this is all you need. But I don’t see how this can replace real end to end functional tests in most apps.
Needing to change the application code to mock things rubs me the wrong way. I 100% do agree with the frontend testing challenges they lay out in the beginning of the post though.
My initial impression of this is that it enables controlling just how much of the app you want to render for your test, while more traditional solutions force you towards either rendering the whole app or testing just a component. Is that accurate?
I think Netflix is at a scale where you can question what I'm saying below, but every frontend test suite I've seen is so full of mocks you basically only test your own test suite. And then common frontend bugs like "renders off the screen" or "doesn't work in Safari" aren't caught anyway.
I hugely support tests and I write a lot more of them than most people. I just don't think it usually works on the frontend.
I work on a fairly complex, dashboard-y web app UI used by many Fortune 100 companies. I can count on one hand the number of times the huge test suite I inherited has prevented bugs entering production. I've spent at least 40 hours debugging false positives and config issues, and about half of our deployment time is spent running these tests.
From what I can tell, there are two types of tests that are worthwhile in UI: unit tests on functions (not UI elements), and basic integration tests. I believe it's possible to write valuable tests that don't fit into this framework, but from what I can tell every other UI engineer feels compelled to write tests that just regurgitate component implementation details.
React is pretty performant when context isn't changing. We haven't done any benchmarking but I doubt there's any real world perf hit. For large applications the number of overrides tend to be under 20.
Overrides are opt-in, so you can just expose any overridable value as a prop and run an isolated component test on it.
Can you clarify what you mean? Usually, e2e testers don't have a bootstrapping stage for app-level changes, only for things that can be done via the browser automation APIs.
how many billions of dollars and they still host their blog on a nagware site, SMH. It's not like moving would be high drama, since they already have a custom domain for it
From the Hacker News Guidelines: "Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting."