
>Alaska Airlines doesn't trust Boeing enough that they're sending their audit team to Boeing to check for quality.

Granted they are different divisions, but after the CST-100 debacle [1], NASA also sent a team to audit them [2]. Some of the off-channel remarks by NASA were striking.

[1] https://aerospaceamerica.aiaa.org/boeing-starliner-error-wou...

[2] https://blogs.nasa.gov/commercialcrew/2020/02/07/nasa-shares...



> “If we would have run the integrated test with ULA through the first orbital insertion burn timeframe, we would have seen that we would have missed the orbital insertion burn”

I can't believe I've read that. There's no saving this company.


That section reads like a Bad Advice column for software testing in general. "Ehh, don't bother with end to end testing -- testing each part in isolation will be fine."

Maybe that's fine for an e-commerce site or a game, where the stakes are mainly lost revenue, lost customers, and a team of engineers burning the midnight oil to sort out the problem. Not so fine for spaceflight.


You're going to be very sad to discover the sentiments around software-intensive systems testing in a lot of places & industries that you'd rightfully assume should know better and do better.

I once worked with a safety-critical embedded engineer, someone who had built everything from weapons systems to automobiles, who first had no idea what property-based testing was, and then proceeded to tell me at length that it was pointless because unit testing is all you need as long as you keep making your units smaller and smaller.
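
For anyone who hasn't run into it, here's a minimal sketch of property-based testing in Python using the hypothesis library. The `clamp` function and its bug are hypothetical, chosen because a couple of hand-picked unit tests sail right past it while a generated-input property catches it:

    from hypothesis import given, strategies as st

    def clamp(value, low, high):
        """Clamp value into [low, high] -- deliberately buggy."""
        if value < low:
            return low
        if value > high:
            return low  # bug: should return high
        return value

    # Typical hand-picked unit tests: both pass.
    def test_clamp_examples():
        assert clamp(5, 0, 10) == 5
        assert clamp(-3, 0, 10) == 0

    # Property-based test: hypothesis generates hundreds of inputs
    # and checks the full contract, not just the cases we thought of.
    @given(st.integers(), st.integers(), st.integers())
    def test_clamp_property(value, low, high):
        if low > high:
            low, high = high, low
        assert clamp(value, low, high) == min(max(value, low), high)
        # Fails as soon as hypothesis tries value > high, no matter
        # how small the "unit" under test is made.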

At the time I worked in FinTech, and I was aghast at what I was hearing, given that they worked on self-driving. If I had to pin down what made me shift my career path into "cyber-physical systems", it was probably that conversation.

Since making the transition and later founding a company dedicated to testing in that domain, I've heard dozens and dozens of sentiments even more concerning.


> and then proceed to tell me at length how it was pointless because unit testing is all you need as long as you keep making your units smaller and smaller.

I've encountered devs that think this nonsense before, but always assumed that this attitude couldn't be held with systems that are actually critical. I'm a bit less naive now.


I thought that too, early on. Sort of a perfect-bricks-make-perfect-walls idea. And when problems emerge at the integration stage, they're often traceable to a gap that a lower-level test could have caught, so the mentality is a bit self-reinforcing.


Exactly. All the products we make are focused on the integration and system level, because it's crystal clear that's where an enormous number of issues arise and where unbounded numbers of 11th-hour fire drills occur. The various industries that have had multiple product cycles with lots of software are starting to figure this out, but it took a while.

Despite how obvious this is with a little outside-observer perspective, the "perfect bricks" way of thinking is pervasive, because that's how the builders and supply chains are organized.


Ariane 5 is an interesting case study in that regard [1]. It failed because they didn't feel the need to retest the software reused from the Ariane 4 design. No need to retest the bricks that already worked before...

[1] https://en.wikipedia.org/wiki/Ariane_flight_V88
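
The specific failure mode is worth spelling out: an alignment routine inherited from Ariane 4 converted a 64-bit float (a horizontal velocity term) to a 16-bit signed integer, and Ariane 5's faster trajectory produced values outside the old flight envelope. A rough Python sketch of the idea (the numbers and names are illustrative, not from the actual Ada code):

    INT16_MIN, INT16_MAX = -2**15, 2**15 - 1

    def convert_horizontal_bias(bh: float) -> int:
        """Ada-style checked conversion: raise instead of silently wrapping."""
        value = int(bh)
        if not INT16_MIN <= value <= INT16_MAX:
            # On flight V88 the equivalent unhandled Operand Error shut
            # down both the primary and (identical) backup inertial units.
            raise OverflowError(f"horizontal bias {bh} exceeds int16 range")
        return value

    convert_horizontal_bias(12_000.0)  # fine within the Ariane 4 envelope
    convert_horizontal_bias(80_000.0)  # Ariane 5-scale value: raises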


Which is silly, because the whole point of integration testing is to find where your bricks don't fit together as expected.


Or that your bricks aren't square, or they are actually made of sand, or that they collapse under load.

To leave the analogy behind, any non-trivial component has a testable surface area, but typically has additional modes of behavior associated with internal state, environmental conditions, or other areas that well-meaning unit test writers didn't think about.

I have often found issues in simple caller/callee pairs of two components, both of which are tested, but the caller contains subtle expectations of the callee that the unit tests don't match up with.
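
A toy version of that trap, with hypothetical names: each side's unit tests pass against its own assumptions, and only running the pair together exposes the mismatch:

    # Callee: its unit tests cover the missing-user path by
    # checking for a None return value.
    def find_user(db: dict, user_id: str):
        return db.get(user_id)  # None when absent

    # Caller: unit-tested against a mock that raised KeyError for
    # missing users -- a subtle expectation the real callee never meets.
    def greeting_for(db: dict, user_id: str) -> str:
        try:
            user = find_user(db, user_id)
            return f"Hello, {user['name']}!"
        except KeyError:
            return "Hello, guest!"

    # Both sets of unit tests pass. Together, a missing user raises
    # TypeError (None is not subscriptable) instead of the graceful
    # fallback -- exactly the gap only an integration test catches.
    greeting_for({}, "alice")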


Software testing needs a different mindset from traditional hardware reliability testing. Software doesn't "wear out" the way hardware does; it tends to fail in how its functions coordinate. Too often we rely on simplistic measures like a hard restart to manage coordination failures, but that isn't always an option in safety-critical applications.
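
A contrived sketch of a coordination failure (protocol and names are made up): each node is individually correct and passes its own tests, but a hard restart wipes shared context the peer still holds, so the "fix" itself breaks the system:

    class Sender:
        def __init__(self):
            self.seq = 0  # shared context, lost on a hard restart
        def send(self):
            self.seq += 1
            return {"seq": self.seq, "payload": "telemetry"}

    class Receiver:
        def __init__(self):
            self.last_seq = 0
        def receive(self, msg):
            if msg["seq"] <= self.last_seq:
                raise ValueError(f"stale message: seq {msg['seq']}")
            self.last_seq = msg["seq"]

    tx, rx = Sender(), Receiver()
    for _ in range(3):
        rx.receive(tx.send())  # seqs 1-3 accepted

    tx = Sender()              # "fix" a fault with a hard restart...
    rx.receive(tx.send())      # ...and the peer now rejects seq 1 as stale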


Taking that argument to the extreme, why test any new software at all when the underlying instructions have all been thoroughly tested!


Well, we do that all day long when we write to a file system and expect it to actually persist data to disk. It's usually only DB engines and loggers that even attempt to flush, and even then the disk will usually lie in response.
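
Concretely, in Python a plain write() only hands the data to buffers; durability takes an explicit flush plus fsync, and even then a drive with a volatile write cache can acknowledge before the bits are actually persistent:

    import os

    with open("journal.log", "a") as f:
        f.write("commit record\n")
        # Without the next two lines the data may exist only in the
        # process buffer and the OS page cache; a crash can lose it.
        f.flush()              # userspace buffer -> OS page cache
        os.fsync(f.fileno())   # page cache -> storage device
        # Even now the disk's own cache may ack before the data is
        # physically durable -- this is the "disk lying" part.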


This is what happens when lax QA, loose engineering practices, and boost-the-share-price strategies from move-fast-break-stuff e-commerce/social-media culture seep into real-world systems companies.


From an outsider's perspective, it seems "move fast and break things" worked well enough for SpaceX.

Boeing et al. seem to be following the "move slow and hide problems so we don't fix anything" mantra.


>"move fast and break things" worked well enough for SpaceX.

Counterpoint: SpaceX had to re-learn well-known practices that are so commonplace in aerospace it's shocking they weren't being followed [1]. One example is more complete material testing of critical components as part of supplier quality control. When they lost a rocket to this process gap, their solution was to layer on those common QC practices. IMO it's this layering-on of process that, over time, may turn SpaceX into the dinosaurs they are competing with. (A reference to Chesterton's fence is probably apt.)

"Moving fast and breaking things" may be fine, but to the GP's point, it has to be a risk-based decision. We probably shouldn't be aiming to move fast when lives are at risk.

[1] https://spacenews.com/falcon-9-failure-linked-to-upper-stage...


> Instead of simulating an entire mission from launch to docking, “the team decided they would rather run multiple tests with chunks of the mission,” Mulholland said. “Going forward, for every flight, we will do launch to docking and undocking to landing in addition to the other tests we were doing in our qualification testing.”

Given the amount of money and the lives at risk, this level of process short-cutting is mind-boggling.


I suspect this is a result of schedule pressure. The legacy AE firms were falling way behind SpaceX and I'm guessing they felt the need to play catch-up and didn't want to "waste" time on testing.



