>> People want to see 100% success in all things. But that isn't very economical.
That sounds logical, but isn't really a thing in aerospace/space. Complicated high-energy systems have thousands of failure points. So to have any chance of success each failure point needs to be engineered below, by way of example, a 0.0001% chance of failure. That costs lots of money. But say one decides to accept more risk for less cost. Ok. So you switch from 0.0001 to 0.001 failure rates. Your risk is now 10-fold higher at each failure point, but with thousands of failure points adding up you are now essentially doomed. And you haven't saved anything. The cost of 0.001 components isn't fundamentally different than the 0.0001 components were. SpaceX can save money through different business practices, by trimming people/money/contracts/compliance and such, but if you look at their rockets they are not fundamentally any less-perfect than anyone else's. They cannot afford to be. This is why rocket failures, like aircraft failures, are taken so seriously. There is an extremely fine line between "works every time" and "never worked twice" with very little money to be saved between the two.
Across many areas, risk-v-cost math never really happens. It is either go or no go. Take CPU production. Intel spends billions at each of hundreds of fabrication steps to push down minuscule error rates because any of a million errors can destroy a chip. There is no money to be saved by allowing any one process to become less than as perfect as it can possibly be. A detected slip from 0.0001 to 0.001 at any step would result in an entire fab being shut down in order to diagnose the problem. The marginal savings of a less-than-perfect process aren't worth the exponential increase in the risk of total system failure.
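The compounding argument above is easy to check with a quick calculation. This is only a sketch: it assumes failure points are independent and that any single failure loses the whole system, and the count of 5,000 failure points is a made-up illustrative number, not a figure from any real rocket or fab.

```python
def system_success(p_fail: float, n_points: int) -> float:
    """Chance that none of n_points independent failure points fails."""
    return (1.0 - p_fail) ** n_points

N_POINTS = 5000  # hypothetical count, chosen only for illustration

for p_fail in (1e-6, 1e-5, 1e-4, 1e-3):
    success = system_success(p_fail, N_POINTS)
    print(f"per-point failure {p_fail:.0e} -> system success {success:.4f}")
```

Under these assumptions, going from 1e-4 to 1e-3 per point takes the system from roughly a 61% chance of success to under 1%, which is the "essentially doomed" effect described above.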
I'm not proposing something, I'm describing it. This is what SpaceX is doing, and it's quite successful.
And you can look all the way back to their little hopper version of the Falcon 9 and see that this strategy has been the key to them undercutting the launch market significantly.
My prediction: SpaceX will have a 5th 100% successful launch of Starship before the SLS has a 5th successful launch. They'll just have ten not-100%-success launches before then.
Completely agree. I've worked in "old space" and the fundamental problem is that they can't afford to experiment. SpaceX has the option to physically test an idea that's holding them up before investing in fully bringing it up to production quality, only to have to redo all that work next iteration. That's why they can make new things and do it cheaper and faster.
Ironically, "old old space" = Soviet space was so damn good at innovating it probably would have made SpaceX look old-fashioned. Really I don't think there's a fundamental reason why we can't have two SpaceX's (Spaces X?) so why is there no other?
Haha I had a feeling someone would bring this up. To me, old space isn't farther back in time in the golden age of space, but rather what the space industry eventually calcified into. New space is like a Renaissance.
There are other new space companies, but they're just not as good.
Except SpaceX’s failure rates are similar with every other successful launch system. Rather than looking at failure as a constant rate you need to consider these numbers change with every flight. Initially major design flaws are identified and workers become more skilled at a process, eventually new errors creep in etc etc.
They have greatly benefited from being able to use modern tools and seen where other systems failed. Many rocket companies have failed when trying to go fast and break things because it isn’t an easy shortcut. Instead SpaceX has used the normal approach used by other successful organizations and simply executed it well.
> Except SpaceX’s failure rates are similar with every other successful launch system.
Falcon 9 has the record for most consecutive successful orbital launches. Their last failure was AMOS-6 in September of 2016. Since then they've had 189 successful launches in a row.[1] In that same time Soyuz has had 113 launches with 3 failures. Soyuz's longest success streak was 100 launches from 1983 to 1986.[2] The US's Delta II had 100 consecutive successes from 1997 to 2018, though it has since been retired. A total of 155 Delta IIs were launched with 2 failures.
Falcon 9's current successful landing streak of 110 missions exceeds the competition's best launch streak. By any metric one can measure, SpaceX has the most reliable rocket.
You can always slice and dice data to make one side look better.
The actual number of successes vs partial successes vs launch failures vs fatalities are the best data we have. Throwing away any of that data because it makes you look worse isn’t a good idea.
Similarly we need to understand that there’s a huge difference between risk and what actually happens. People get lucky in Vegas every day; what matters to most of us is the accuracy of the estimate of the underlying odds, not just the exact outcomes out to seven decimal places.
> actual number of successes vs partial successes vs launch failures vs fatalities are the best data we have
Current vehicles are vastly different from the originals. What we’re trying to do is predict the probability of the next launch failing. Equally weighting far historicals and recents is bogus statistics.
> What we’re trying to do is predict the probability of the next launch failing.
I thought we were comparing methods. Unless the next payload is yours then the odds of the next launch failing is meaningless to most of us, but we can learn something from the methods used.
But sure, if you have a bet in Vegas or something then feel free to try and calculate things as closely as possible. Just understand that several of the Soyuz failures didn’t kill the crew, so there are other metrics people might care about.
What does this mean? The question most of us care about is which method resulted in a more reliable rocket. And SpaceX’s track record shines uniquely in that respect. The frequency, moreover, makes the results robust. Legacy rockets like Ariane will never reach that confidence because the likelihood of fluke successes won’t have been minimised when the rocket is retired.
As to why their methodology is important: this isn’t the Falcon 9, this is a new launch system which is likely going to have multiple failures before its own streak can begin.
So sure, we can reasonably assume that Starship will get to a state of reliability similar to current Falcon rocket, eventually. We can’t assume the first few commercial Starship launches are going to even approach that level of reliability. And in fact the best point of comparison may be the early days of Falcon 9.
Speaking of methodology, it's incorrect to relate a development test result to reliability or risk. Source is my personal experience doing reliability calculations for a NASA rocket component and working with the statisticians to incorporate my numbers into their risk model.
You have it mixed up. I've worked with the stats at NASA. Mission success and failure counts. Test quantity and quality count, test freedom counts, how they learn from test counts, but the test result does not count. This isn't a mission.
This was a partially successful test, nothing more and nothing less. I get that people really, really think SpaceX has done an excellent job, and I don’t disagree, but people who are comparing the end result of a long process (aka the current state of Falcon 9) with a new system like Starship are going to be disappointed.
Starship is extremely likely to fail repeatedly before achieving anything close to the same streak as the Falcon 9 has. That’s not an issue with SpaceX that’s an inherent aspect of doing something really difficult.
I don't think you realize this, but when you said that we can't exclude the test failure from the risk/reliability assessment, that's exactly what you're saying. I didn't realize until just now that you're actually defending the test failure as being acceptable.
This isn't a production flight so why are you treating it as one? What if I told you that you can't compile your code until you deploy it for the first time and you don't get to change it much after that? I'll leave you to contemplate that thought experiment yourself.
I am not treating this as anything but a test flight. Several government test flights have similarly achieved core objectives while failing to achieve every objective.
I guess I'm not sure what point you're making. If you consider a test flight failure to be part of the overall failure rate, then you're treating it the same as a "real" flight. The government is constrained to less experimentation on every level from a daily basis up through test flights. Overall they do less useful engineering and more unnecessary work.
> Except SpaceX’s failure rates are similar with every other successful launch system
Really? I make it 189 successful F9 launches since the last issue in 2018 (there's a couple of landing failures, but given that everyone else apart from the space shuttle has a 100% loss...)
If you look at the "finished product" of block 5 that makes 162 launches and zero failures.
That's reliability far beyond any other launch system, including the space shuttle and Ariane 5 which are the only ones to come close in numbers of launches. Ariane 5 is certainly a reliable system as far as spaceflight goes, but it flies 3 times a year, Falcon 9 flies 3 times a month
Hardly, landing on launch pad had a lower success rate and requires significant fuel so many otherwise perfectly reusable boosters were sacrificed for a higher launch payload.
They got great publicity from it, but landing vertically is a major compromise.
Sure, ignoring past failures can always make someone on a winning streak look invincible. But calculating the underlying odds to hit even a 200 long winning streak with the observed failure rates on other systems wouldn’t be particularly unlikely.
These systems all are quite good, and they have tended to get better over time.
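For what it's worth, the streak claim can be tested directly. A rough sketch, assuming every launch is an independent trial with a fixed success probability (which other comments here already dispute) and plugging in the failure counts quoted upthread; the function name is mine.

```python
def streak_probability(successes: int, launches: int, streak: int) -> float:
    """Chance of `streak` straight successes at the observed success rate."""
    rate = successes / launches
    return rate ** streak

# Figures quoted earlier in the thread, not independently checked.
soyuz = streak_probability(110, 113, 189)     # 3 failures in 113 launches
delta_ii = streak_probability(153, 155, 189)  # 2 failures in 155 launches

print(f"Soyuz-like rate:    {soyuz:.4f}")
print(f"Delta II-like rate: {delta_ii:.4f}")
```

Under these assumptions the first comes out below 1% and the second below 10%. Note this is the chance that one particular run of 189 launches is clean, not the chance of such a streak showing up somewhere in a longer flight history, and it says nothing about reliability improving over time.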
My point would be that while SpaceX does save lots of money, it doesn't do so by producing cheap or less reliable rockets. It would never accept a failure rate any different than anyone else because, in aerospace, that isn't really a thing. SpaceX can certainly blast forwards with different business practices and different tolerance for developmental risks, but the final product will not be fundamentally different than anyone else: near-perfect machines resulting in near-perfect performance. There will never be a "cheap" version of a commercial rocket with an accepted less-than-perfect failure rate. Spaceflight is an all-or-nothing game.
> My point would be that while SpaceX does save lots of money, it doesn't do so by producing cheap or less reliable rockets.
It does though. During the development process. SpaceX will do five launches with lower reliability vehicles before the big launch providers will even do a single launch, and those five will have been cheaper in total. Of course, they aren't putting essential cargo (or god forbid people) on these higher risk test flights. By the time they're doing that they have developed certainty in the design.
A number of Tesla related non-deaths can be attributed to Autopilot safety features working as promised. Tesla will argue that the number is higher, based on number of crashes per mile statistics. I'm not sure if that's true, but assuming for everyone who died due to an Autopilot failure, someone else survived a human error crash that didn't happen, would that be a good thing? What about if it was, for example, 10 people saved for every 1 killed?
And statistically many more people would have died without AP. But you're correct in that Tesla is using the same playbook on FSD as SpaceX, launch HW (and SW) early and iterate often, and I'd bet they'll save way more lives trying to get to autonomy like SpaceX rather than like NASA (the Waymo approach).
> But you're correct in that Tesla is using the same playbook on FSD as SpaceX
Except Tesla, unlike SpaceX, is willing to put passengers in its test vehicles. The SpaceX approach would be to let a bunch of FSD Teslas crash into things and each other before giving them payloads.
Putting someone in an experimental rocket is quite different from being essentially a safety driver required to pay attention and take over at any time. If you have an accident on FSD it is probably (though not always) your fault for not paying attention.
I “unwillingly” have to share the road with people who murder 40,000 people a year with their vehicles. Thankfully we are developing the technology to get these reckless maniacs out of the driver’s seat.
> And statistically many more people would have died without AP.
Citation needed. And not Tesla's "stats", which are intensely misleading, as anyone compiling them with more than high school statistics would know.
Comparing a subset of miles driven on the simplest and easiest roads (because the systems can't be used and are turned off) and comparing to accident stats across ALL roads is disingenuous to the extreme, and Tesla continues to tout it.
Short of pulling over, humans don't have the opportunity to say "let's disengage, because it's a bit challenging", and then not have to worry about "counting" any accidents from there forward.
Agreed. It’s like if I unit tested 0.01 percent of my code but ran the unit test 10 million times, with no failures, and claimed it was therefore “statistically” better than code that had been 100% manually tested.
SpaceX does not have a difference in intended reliability, or a difference in design reliability. (At least for the rocket as a whole. One could argue that 33 engines allows lower engine reliability through redundancy)
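The redundancy point can be made concrete with a k-out-of-n calculation. A sketch only: it assumes engines fail independently, and both the per-engine reliability and the number of engine-out failures the vehicle can tolerate are hypothetical values, not SpaceX figures.

```python
from math import comb

def at_most_k_failures(n: int, k: int, r: float) -> float:
    """Probability that at most k of n independent engines fail,
    where r is the per-engine reliability."""
    q = 1.0 - r
    return sum(comb(n, i) * q**i * r**(n - i) for i in range(k + 1))

N_ENGINES = 33
ENGINE_RELIABILITY = 0.99    # hypothetical
for tolerated in (0, 1, 3):  # hypothetical engine-out tolerance
    p = at_most_k_failures(N_ENGINES, tolerated, ENGINE_RELIABILITY)
    print(f"tolerating {tolerated} engine-out: {p:.5f}")
```

With zero tolerance, 33 engines at 99% each would all light and run only about 72% of the time; allowing even one engine-out lifts that sharply, which is why redundancy can stand in for per-engine perfection.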
What they DO have is a significant difference in prototype reliability for live launch. This is clear when you look at their launch history.
Or one could say that SpaceX has basically the same tolerance for failure as the traditional rocket companies had back in the 50s and 60s when they too were first learning to build rockets. That pre-launch "tolerance" is basically zero, with every post-launch failure being investigated as a mistake to be corrected rather than an acceptable cost of doing business.
I don't think that is accurate. There is a difference between 9%, 99%, and 99.999% confidence of success going into a launch.
You can almost always delay builds and launches to run more simulations, tests, and studies and increase confidence.
A simple example is SpaceX could have chosen to wait until they had a booster test with 100% engine ignition before moving on to a full launch. Instead they chose to move forward anyway without more stationary booster testing.
> There will never be a "cheap" version of a commercial rocket with an accepted less-than-perfect failure rate. Spaceflight is an all-or-nothing game.
Why? If you're launching people I see why you want near-perfect, but if you're launching something with a low replacement cost (ex: getting fuel to orbit to support other projects) it seems to me that as volumes get large enough eventually "use lower quality and accept a slightly higher probability of failure" starts to be cost effective.
Because there is no way of building a cheaper rocket with a less reliability. Take aircraft. Does anyone deliberately build cargo aircraft with less reliability than passenger aircraft? Does anyone build a smaller airliner with less reliability than the big airliners because fewer lives are at risk in the smaller aircraft? No. All aircraft are designed and built to amazingly high standards because, in such as complex high-energy environment, there is no money to be saved by building less-than-perfect machines.
Most cargo planes are expected to run out of civilian airports where a failure could result in debris launching itself into populated vehicles or buildings. In contrast, a crop duster launching off a dirt runway and expected to go no faster than highway speeds actually can be built fairly loose, and often is. For example, this duster here [1] won an award in the 70s for innovations like a "pressurized cockpit" and "air conditioning".
Listen, we don’t need to speculate. Read the History of Falcon9 development.
Your comments in this thread all go against the development approach of that rocket— Falcon9 is a stable platform now as it is used in production. In the development phase there were tons of explosions. This all happened in the past.
> No. All aircraft are designed and built to amazingly high standards because, in such as complex high-energy environment, there is no money to be saved by building less-than-perfect machines.
But this is totally false. There are entirely different standards applied if you want to design an aircraft for commercial airline passenger transport vs for general aviation. There are entirely different requirements for instrumentation reliability if you're building a day-VFR aircraft vs one that is allowed to fly in instrument conditions. And there are entirely different requirements for aircraft that you sell to the public vs ones that you build yourself.
The aviation side is full of examples of exactly the sliding scale you're saying doesn't exist.
> there is no way of building a cheaper rocket with a less reliability
Of course there is. The famous example is radiation hardening. SpaceX opted for redundancy instead. Not only cheaper, but more modern, too.
> in such as complex high-energy environment, there is no money to be saved by building less-than-perfect machines
SpaceX has launched zero humans. (EDIT: Totally wrong!) It aims to, so target reliability is high. (I would argue their track record in production is a product of their willingness to push the envelope in tests.) But there is a large market for cheap, if unreliable, launches. Because there is an emerging market of cheap satellite makers.
I'm with you on spaceflight but people definitely have built small cargo carrying UAVs to a lower standard than passenger aircraft. Such a thing is conceivable anyway.
Ya but smaller UAVs aren't operating in the same energy environment. They are small enough that them randomly dropping on people's houses doesn't matter much. Aerospace is about things large/fast/high enough that all failures put lives at risk.
Okay, so you're not saying it has to be expensive, but almost the opposite? Perfect and slapdash have similar production cost, so properly designed rockets will all be just about perfect?
> My point would be that while SpaceX does save lots of money, it doesn't do so by producing cheap or less reliable rockets
Indeed, if we assume that each launch is an IID binomial coin flip (which isn’t really the right way to evaluate right-censored data), and observe (by reading Wikipedia) that SpaceX has had at least 450 successful Falcon 9 launches since the last in-flight failure, then they have at least five nines of reliability:
0.999995^450 ~= 449/450.
Which appears to be an industry-leading stat.
For context (excluding the Columbia re-entry failure), the space shuttle only had 4 nines:
I'm going to challenge those numbers: Using a posterior probability density function for a binomial distribution, the lower bound in Falcon 9 reliability is .9934 with 95% confidence (assuming 450/450 successful trials). The reliability of F9 could be much lower than five 9s and still reasonably give you 450/450 successful trials. There's only a 0.44% chance that F9 reliability is at least five 9's given the data.
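These figures can be reproduced in a few lines. A sketch assuming a uniform prior on reliability (my assumption; the comment does not state one), so that n successes in n trials gives a Beta(n + 1, 1) posterior whose CDF is simply p^(n+1). The function names are illustrative.

```python
def lower_bound(n_success: int, confidence: float = 0.95) -> float:
    """One-sided lower credible bound on reliability after n_success
    successes and zero failures, under a uniform prior."""
    # Posterior is Beta(n + 1, 1), so P(p <= x) = x ** (n + 1).
    return (1.0 - confidence) ** (1.0 / (n_success + 1))

def prob_at_least(n_success: int, threshold: float) -> float:
    """Posterior probability that reliability is >= threshold."""
    return 1.0 - threshold ** (n_success + 1)

print(f"95% lower bound:      {lower_bound(450):.4f}")
print(f"chance of five nines: {prob_at_least(450, 0.99999):.2%}")
```

Under that prior the lower bound comes out at about 0.9934 and the chance of five nines at roughly 0.45%, in line with the numbers above. A different prior would shift both somewhat.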
Sounds exactly the same. There are no cheap aircraft. Everything that flies is subject to innumerable laws and regulations to ensure that it is built to exceedingly high standards. There is no such thing as a discount aircraft with lesser reliability. Some are cheaper than others but none are deliberately less-reliable, not in any fundamental way. Even ultralights have to abide many regulations.
To me, this comment implies that SpaceX is better/smarter than NASA, Northrop, Boeing, et. al. That may be true, but it's worth remembering that the goals are very different. Because of congressional oversight, SLS is largely a jobs/pork program. Spaceflight is incidental.
> To me, this comment implies that SpaceX is better/smarter than NASA, Northrop, Boeing, et. al.
I don’t think SpaceX has smarter engineers. I don’t think they even have smarter managers, since many of their executives used to work for NASA and/or traditional aerospace firms (e.g. President+COO Gwynne Shotwell started out in aerospace at a private non-profit research centre doing contract engineering work for the NASA Space Shuttle and the US military space program)
One big difference is Elon Musk at the founding of SpaceX told his executives to take big risks (as in “if this fails we go bankrupt”). I think Tory Bruno at ULA is a great CEO, but no way is Boeing or Lockheed-Martin ever saying to him “we want you to take such big risks that we might go bankrupt if they fail”. He, and all the people under him, are only allowed to take small-to-medium sized risks. But that puts a definite limit on what they can achieve compared to SpaceX whose executives and engineers have the freedom to make much riskier decisions
>My prediction: SpaceX will have a 5th 100% successful launch of Starship before the SLS has a 5th successful launch. They'll just have ten not-100%-success launches before then.
I'm not sure the best way to frame this is "first one to a single successful launch". Reliability matters when you're dealing with high-risk scenarios, so a better measure is "probability of a given launch being successful".
That only matters after you’ve gone operational though. The difference here isn’t risk appetite for operational launches, it’s risk appetite for test launches. SpaceX expects to do many, many more test launches than SLS, in fact counting Starship upper stage launches they’re already well ahead.
I'm not asking as a 'gotcha' question, I'm acknowledging that the sample size matters. A lot of these statistics are bandied about without really elaborating on the full context of the way reliability is actually measured in industry.
Without checking for the most up-to-date numbers, I believe the F9 is slightly better than Soyuz on raw numbers (successful flights / total attempted). But Soyuz probably has around 6-7x more flights, meaning we have much more confidence that the Soyuz numbers accurately reflect reality.
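To illustrate why flight count matters as much as the raw ratio, here is a small sketch using the Wilson score interval for a binomial proportion. The two flight records are hypothetical, picked to share the same 98% observed success rate; they are not real Soyuz or F9 numbers.

```python
from math import sqrt

def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials**2)) / denom
    return center - half, center + half

# Hypothetical records with the same 98% observed success rate.
for successes, trials in ((147, 150), (980, 1000)):
    low, high = wilson_interval(successes, trials)
    print(f"{successes}/{trials}: {low:.4f} to {high:.4f} (width {high - low:.4f})")
```

The same observed rate yields a much narrower interval with more flights, which is the sense in which a longer record gives more confidence that the number reflects reality. It still treats every flight as an identical trial, which ignores design changes over a vehicle's life.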
Soyuz has only had 140 flights, and had some non lethal mission failures in recent decades. It’s a manned mission vehicle only though. Overall I don’t think you can really say one is more reliable than the other, they’re both very reliable. I’m just saying the suggestion that SpaceX approach is inherently less safe is very much contrary to the evidence.
Having said that I do think propulsive landing for crewed vehicles is pretty scary. F9 first stage landings have been pretty reliable for a while, and they now have several boosters with 10 or more flights, so we’ll see. It’s not like capsule landings are 100% safe either.
There have been over 140 crewed launches. For context, you seem to be counting test/demo, crewed, and uncrewed F9 launches together. Again, I don't know the exact number off the top of my head, but there's probably ~10 crewed F9 launches, so it's an order of magnitude difference using the same metric. It gets better though, when comparing total launches.
This is true -- there is a lot of small number statistics in estimating the reliability of launch vehicles. And there are a lot of small risks you never see until you accumulate a lot of launches. But that question is completely incidental to the design methodology.
A new vehicle will always be unproven. But one that's flown 5 times in testing (4 of them unsuccessfully) before its first flight will have had more flight time on most of its systems than something like the SLS where you take 10 years to think about what could go wrong and then do one test launch and call it good.
As long as the organization has and maintains a culture that is aggressively seeking out problems large and small, and proactively fixing approx. all of them, you will end up with a program with a high rate of operational success.
It's the X-origin vs Slope issue - steeper slope always wins.
The problem is maintaining that aggressive problem-seeking culture after long periods of success
This is part of the culture distinction. On one hand SpaceX is attacking problems, but IMO they often don't go towards attacking the root causal understanding. That means there's a possibility of unknown latent risk.
As an example, they had issues with failures related to their COPVs rupturing. On the one hand, they addressed the problem by redesigning their system. On the other, they never really investigated fully why the COPVs were failing in the first place. Instead, NASA decided to fund that investigation on their own. One possible consequence is that their redesign didn't fully address the risk because they never fully investigated the root cause.
But the apples to oranges here is >1 launches (SpaceX, some successful and some unsuccessful) vs 1 successful launch (traditional modern aerospace).
From an outcome perspective, it's hard to ever see the lower launch rate dominating from any perspective.
You don't have economies of manufacturing scale, because your assembly rate is so low it doesn't make sense not to treat each as exquisite.
You don't have rapid iteration on manufacturing improvements, because the tyranny of safety checks on manufacturing time balloons {time from fix to flight}, after a proposed fix is identified.
And most importantly, you leave yourselves extremely vulnerable to unknown-unknowns, that you can't imagine in the design phase.
For example, if NASA had been launching the shuttle more rapidly, with the bulk of those being uncrewed launches, they probably would have picked an uncrewed launch to test expanding the temperature bounds at Cape Canaveral, and Challenger would have exploded without a crew.
As it was, NASA's shuttle launches were so rare that there wasn't an acceptable launch rate and there weren't low-impact launch opportunities to do so. So they tested it on a crewed mission with disastrous results.
Point being: they backed themselves into a low-volume/high-risk corner of their own strategic design
SpaceX's most brilliant achievement was using Starlink to artificially boost launch demand and give them a minimally-profitable/break-even place to sink higher-risk launches.
> Point being: they backed themselves into a low-volume/high-risk corner of their own strategic design
Because they knew they had to "get it right" the first time because a bunch of buffoons(congress) would consider any crash or explosion as a failure and pull funds immediately.
>Because they knew they had to "get it right" the first time because a bunch of buffoons(congress) would consider any crash or explosion as a failure and pull funds immediately.
I agree with many of your points, but it comes across as slightly biased because you don't acknowledge a single downside to any of them.
>You don't have rapid iteration on manufacturing improvements, because the tyranny of safety checks on manufacturing time balloons
Rapid iteration has obvious benefits. But there are also downsides because it makes it harder to arrive at a stable, reliable design, it introduces vendor issues, etc. Tesla is also known for iterating faster compared to their major competitors but it has resulted in logistics and reliability issues.
IMO SpaceX's most brilliant achievement was leveraging govt contracts to work out the kinks of their designs, which could then be leveraged at a lower risk for Starlink. In effect, they let the public take the burden of the risk (because the govt is really the only entity capable of shouldering that size of a risk for an unproven quantity) and then transitioned to a private means of revenue in Starlink. (I'm not saying that as a slight btw, I think it's mutually beneficial).
Frankly comparing Tesla and SpaceX is getting to be a tiresome argument. They're owned by one "eccentric" billionaire, but he's not an engineer, they don't share staff, facilities or manufacturing outside of "hey this alloy is pretty good".
SpaceX's strategy for the Falcon 9 worked and it's one of the most reliable rockets in the world, flying the most often.
>Frankly comparing Tesla and SpaceX is getting to be a tiresome argument.
You're focused on the wrong takeaway. You're making this about a person, I'm talking about a process. I used Tesla because it's easy to see how one culture translates to the other. Insert any company that uses rapid iteration in place of Tesla if you prefer.
The point is that there are certain circumstances where rapid iteration is useful and others less so. When reliability matters, rapid iteration may be working against you. (It's a continuum, of course, so the real question is where is that tipping point)
>SpaceX's strategy for the Falcon 9 worked
The point I'm driving at is there is a distinction to be made between finding out why certain iterations didn't work vs. just changing the design without fully understanding the failure mechanisms of the first. One leads to a greater understanding than the other. It's the difference between an engineer's mindset and a scientist's mindset.
No. I am not saying it’s the “industry” as much as the context of risk. That's why it doesn't matter if the analogy is Tesla or some other safety-critical manufacturer. To be clearer: how many F9 launches have carried humans?
Now go look at the history of Shuttle for the equivalent number of launches at that risk level. Would you claim they are equivalent in terms of human-rated safety?
If not, it’s only because you have the benefit of knowing the long-tail probabilities with the Shuttle.
>It's the same industry. Building the same type of product.
By extension of your logic, Starship should then already have the same launch reliability as F9. So either this is an example of a low-probability event, or your logic is flawed.
There were 135 space shuttle launches, of which 2 failed. There have been 162 launches of the F9 block 5 with 0 failures. Why do you think we have more knowledge of the long-tail probability of the Shuttle than F9?
True, few of those F9 missions were crewed, but that's the point. There's no difference between a crewed and a noncrewed F9 launch vehicle, so there's no reason to think the presence of humans would change the risk. So, you get to accumulate most of that reliability data without putting people at risk doing so.
>Why do you think we have more knowledge of the long-tail probability of the Shuttle than F9?
Because the nature of the two programs was fundamentally different. The Shuttle was a product contract, while CCP is a service contract. On the former, the govt has much more control, and will detail more rigorous acceptance criteria. This generally gives a much higher pedigree on quality control. On the latter, they take a much more hands off approach and have limited insight.
As an analogy, imagine you are making a big bet on acquiring a software company. One company gives you their source code, shows you all their most recent static analysis, unit test results, allows you to interview their programmers etc. The other company allows you none of that, but gives you a chance to play around on their website to see for yourself. Both systems seem to work when you try the end product, but which do you have higher confidence in?
At the end of the day, "reliability" is just a measure of how much confidence we have that a product will do what we ask of it, when we ask.
Are you asserting that we gain better insight into the reliability of a system by thinking about it deeply rather than by observing it perform its function? Because I don't believe that for a minute.
I'm not saying you can get by without thinking, but it's difficult for humans to estimate the reliability of a complicated system. Reality, though, has no problem doing it.
Plus, I think your analogy is flawed. NASA surely has a more hands-off approach on the CCP than on the Shuttle, but to say it's hands-off is misleading. They do have a lot of access.
No, I don't mean "thinking about it" (although that has its place). Everything I listed is a form of testing. But there's a distinction between iterative testing at a lower level, and end-to-end testing. Again, both have their place.
Take the example of the F9 strut failure. They could have tested the material outside of the final test configuration and saved themselves a lot of trouble. They chose to forgo that testing, and instead 'tested' it as part of their launch configuration. (I put it in quotes because it's not clear to me that this was a conscious testing decision).
There’s also a difference between “we’re not completely sure of the fundamental principles, but our testing indicates it works” and “our testing indicates it works and we have a solid understanding why”. The latter allows you to know the limits of your application much more readily. The risk in the former is that you don’t know what you don’t know, so you can never be wholly sure if you’re good or lucky. And luck can be fickle. And this is also where rapid iteration can lead to issues: the more you change, the less sure you can be about whether your results are attributable to luck.
>Plus, I think your analogy is flawed. NASA surely has a more hands-off approach on the CCP than on the Shuttle, but to say it's hands-off is misleading. They do have a lot of access.
They have many engineers who want more access and are effectively told to back off because it's not their place in this type of contract. So I'm not saying they have no access, I'm saying they have very limited access by comparison. It would have been better if I had made the analogy that they get the results of a small, select number of tests, but not all.
Considering his vast wealth of knowledge and expertise in the field he might as well be called an engineer. In fact probably more so than many real engineers.
That doesn't really apply here. The dynamics of stage separation on Starship are orders of magnitude more complex than anything used on any previous rocket. If you want to look at what failure looks like for a super large rocket compare it to the failed Soviet N1. This flight demonstrated better performance in a single test flight than every N1 test flight over years. The N1 suffered catastrophic failure after catastrophic failure. Starship was pushed way past its design tolerance and showed its design is technically sound. Failure is unavoidable, the Saturn V almost failed its first test flight way earlier than Starship did here.
I feel like the N1 is unfairly demonized. They didn't have the benefit of the modern closed-loop computer flight controls, detailed fluid dynamics simulations, manufacturing techniques, and production accuracy that we have now.
You're talking about failure rates per rocket, but we're talking about the overall development program. Falcon 9 certainly has very low failure rates on its individual components now (below your suggested 0.0001 number). But the way they got there was via a higher risk iterative development process, same as what we're seeing with Starship. The Falcon is the most reliable rocket on the planet at this point, so clearly it's working!
Your Intel example doesn't fully hold water. Chip design has taken a binning approach for decades at this point. The silicon is designed for failure in such a way that yield is maximized assuming there will be a variance in quality. Launch and iterate happens for silicon just like anything else. During the initial product launch, there is a lot of waste, but as processes are improved, microcode updated, rework procedures defined, and value engineering efforts completed, yields go up and costs go down.
It depends on what you count as a chip. Modern "chips" are actually a great many things all on one slab of silicon. Failures then knock out single components, resulting in the final chip going into one bin or another. And each of those components can be as complex as an entire "chip" from a few years ago. That is really creating reliability through numbers. Within every component any slight error will still brick that component. The binning process is really an edge case where the fab is playing at the margins. Any slight increase in error rate would quickly see every chip going into a "bad" bin, with the entire process becoming uneconomical. Intel, and everyone else building chips, strives for perfection with a zero tolerance mantra.
But binning is largely binning against performance curves: "we doped this side of the wafer just a little too much and it's a little slow, but this one from the other side is just right" - while checks for faults (full chip scan and other manufacturing tests) result in dies being thrown out.
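For anyone who wants to poke at how sensitive binning is to the error rate, here is a rough sketch using the textbook Poisson yield model (yield = exp(-defect density * area)). The die layout, core area, and defect densities are all made-up illustrative numbers, not figures from Intel or any real fab, and cores are assumed to fail independently.

```python
from math import comb, exp

def core_yield(defect_density: float, core_area: float) -> float:
    """Poisson yield model: probability a core has zero killer defects."""
    return exp(-defect_density * core_area)

def sellable_fraction(n_cores: int, min_good: int, y: float) -> float:
    """Fraction of dies with at least `min_good` working cores,
    assuming each core works independently with probability y."""
    return sum(comb(n_cores, g) * y**g * (1 - y)**(n_cores - g)
               for g in range(min_good, n_cores + 1))

N_CORES = 8            # hypothetical 8-core die
CORE_AREA_CM2 = 0.1    # hypothetical core area
for density in (0.2, 2.0):  # hypothetical killer defects per cm^2
    y = core_yield(density, CORE_AREA_CM2)
    full = sellable_fraction(N_CORES, 8, y)      # all 8 cores good
    salvage = sellable_fraction(N_CORES, 6, y)   # sellable as a 6+ core part
    print(f"defects/cm^2 = {density}: full bin {full:.1%}, "
          f"6-or-more-core bin {salvage:.1%}")
```

With these toy numbers, a 10x jump in defect density drops the all-cores-good fraction from about 85% to about 20%, while the fraction that can still be sold with 6 or more cores stays above 80%. How that plays out economically depends on prices per bin, which this sketch doesn't model.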
> you switch from 0.0001 to 0.001 failure rates. You risk is now 10-fold higher at each failure point, but with thousands of failure points adding up you are now essentially doomed
Our ex ante ability to estimate these probabilities is lacking. Especially when they’re coupled. And redundancy, instead of tighter tolerances, is often cheaper than the classic approach. These are proven lessons from SpaceX.
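For reference, the arithmetic behind the quoted "essentially doomed" claim can be sketched like this. It treats every failure point as independent and in series, which is exactly the assumption that breaks down when failures are coupled. The count of 3000 failure points is a placeholder for "thousands", and since the quote mixes percentages and plain fractions, both readings are shown.

```python
def series_failure(p_fail: float, n_points: int) -> float:
    """Chance that at least one of n independent, in-series
    failure points fails."""
    return 1 - (1 - p_fail) ** n_points

N_POINTS = 3000  # hypothetical stand-in for "thousands of failure points"

cases = [
    ("0.0001% per point", 1e-6),
    ("0.001%  per point", 1e-5),
    ("0.0001  per point", 1e-4),
    ("0.001   per point", 1e-3),
]
for label, p in cases:
    print(f"{label}: system failure = {series_failure(p, N_POINTS):.2%}")
```

Read as percentages, the 10-fold slip moves the system from roughly 0.3% to roughly 3% failure. Read as fractions, it moves from roughly 26% to roughly 95%. Which one counts as "doomed" depends heavily on which units were meant.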
SpaceX is doing the classic statistic thing[0], making spacecraft stronger where they explode until they don't. It's more like a hyperparameter search and less like QA for individual parts.
SpaceX is just doing what the Soviets did. The Soviets preferred to launch and learn just like SpaceX does today. Soviets also ran on a much smaller budget than NASA and preferred simplicity over complexity and constant tweaking.
The Soviets proved that methodology works and SpaceX continues it to this day. I have a feeling most that follow SpaceX think they are trying a new revolutionary approach here.
The N1 didn't really fit the test-driven development approach. They couldn't static fire the flight engines, so they ended up relying on test firing extra engines from the same batch and assuming all engines in a batch were the same.
I'm no rocket expert, but the Wikipedia article you linked seems to have an idea about that:
>Adverse characteristics of the large cluster of thirty engines and its complex fuel and oxidizer feeder systems were not revealed earlier in development because static test firings had not been conducted.[9]
Time and money. The powers that be decided to stop funding the project. If SpaceX goes bankrupt next year, they could fail too and it won't have anything to do with the technology.
Strategically, they were caught with their pants down when the US demonstrated they were serious about going to the Moon. The Soviet space program was great at sending cosmonauts to LEO, but they had no ability, and no desire to spend the time and effort to develop the ability, to go to the Moon.
They planned and expensed the N1 as if it was another LEO Soyuz variant.
The N1 was always doomed to fail because the USSR spies had been found and were given misleading information and plans for the US space program. The US intelligence community snuck bugs into the N1 design by sneaking them into the designs their spies thought they were stealing.
Only if the failure points are arranged in series - A AND B AND C must happen to have a successful outcome.
If the failure points are arranged in parallel - X OR Y OR Z must happen to have a successful outcome, with multiple redundant paths to success, your total failure rate is the chance that ALL of X, Y, and Z fail. This is a much lower number than when they are in series.
To use concrete math - say that Starship has 33 Raptor engines with a failure rate of 0.01%, 3 grid fins with a failure rate of 0.05%, and a fuel tank with a failure rate of 0.001%. If it's engineered so that all 33 Raptor engines, all 3 grid fins, and the fuel tank all need to work for a successful launch, the success rate of the whole system = 0.9999^33 * 0.9995^3 * 0.99999 = 0.9952, or a ~0.5% chance of failure. If it's engineered so that it can get to orbit on 28 out of the 33 Raptor engines, 2 out of the 3 grid fins, and the fuel tank has a double hull whose second layer has a failure rate of 0.005%, then the chance of failure for each subsystem is roughly 0.0001^5 = 10^-20, 0.0005^2 = 2.5 * 10^-7, and 0.00001 * 0.00005 = 5 * 10^-10. Combining those subsystem failure rates, the success rate is (1 - 10^-20) * (1 - 2.5 * 10^-7) * (1 - 5 * 10^-10) = 0.9999997495, or a ~0.000025% chance of failure.
Moreover, let's look at what happens if you take the multiply-redundant design above and then increase the chance of failure of each component 100x. Raptor engines are now 99% reliable, grid fins are now 95% reliable, and fuel tanks are now 99.9% reliable. The overall failure rate for each subsystem becomes 0.01^5 = 10^-10, 0.05^2 = 2.5 * 10^-3, and 0.001 * 0.005 = 5 * 10^-6. Combining those subsystem failure rates, the overall failure rate is 1 - (1 - 10^-10) * (1 - 2.5 * 10^-3) * (1 - 5 * 10^-6) = 0.00250498759, or a ~0.25% chance of failure. The multiply-redundant system, even with component failure rates 100x higher, still has better reliability than the perfectly-engineered system where every component must perform exactly to spec.
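If anyone wants to play with these numbers, here is a rough sketch of the same comparison using exact "k out of n" binomial counting instead of the p^k shortcut above. The component failure rates are the illustrative ones from this comment, not real Starship figures; components are assumed to fail independently, and the function names are my own.

```python
from math import comb, expm1, log1p

def k_of_n_failure(n: int, k_needed: int, p_fail: float) -> float:
    """P(fewer than k_needed of n independent components work).

    Sums the failing tail directly so very small probabilities
    are not lost to floating-point rounding.
    """
    min_failures = n - k_needed + 1
    return sum(comb(n, f) * p_fail**f * (1 - p_fail)**(n - f)
               for f in range(min_failures, n + 1))

def system_failure(subsystem_failures) -> float:
    """Subsystems in series: the system fails if any subsystem fails.
    Computes 1 - prod(1 - q) in a numerically stable way."""
    return -expm1(sum(log1p(-q) for q in subsystem_failures))

def series_design(scale: float = 1.0) -> float:
    """Everything must work: 33/33 engines, 3/3 fins, single tank."""
    return system_failure([
        k_of_n_failure(33, 33, 1e-4 * scale),
        k_of_n_failure(3, 3, 5e-4 * scale),
        1e-5 * scale,
    ])

def redundant_design(scale: float = 1.0) -> float:
    """Any 28 of 33 engines, any 2 of 3 fins, double-hulled tank."""
    return system_failure([
        k_of_n_failure(33, 28, 1e-4 * scale),
        k_of_n_failure(3, 2, 5e-4 * scale),
        (1e-5 * scale) * (5e-5 * scale),  # both hulls must fail
    ])

print(f"series, baseline parts:      {series_design():.4%}")
print(f"redundant, baseline parts:   {redundant_design():.6%}")
print(f"redundant, 100x worse parts: {redundant_design(100):.4%}")
```

By my arithmetic the exact counting gives about 0.48% for the all-must-work design and about 0.000075% for the redundant design with baseline parts, so the headline conclusion holds with orders of magnitude to spare. The 100x-worse case is more sensitive to the shortcut: any 2 of the 3 fins failing is about three times likelier than one specific pair failing, so the redundant design lands near 0.7%, which is in the same ballpark as the all-must-work baseline but no longer clearly better than it.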
This principle is used all the time in practical engineering. It's why Google builds server farms out of thousands of commodity PCs, hooked up in primary/replica clusters with replication and transparent failover. It's why ships have watertight compartments and double-hulls. It's why passenger jets have multiple engines, multiple hydraulic control systems, and multiple flight computers. Any engineer worth their salt is going to avoid SPOFs and assume that components will fail, then build redundancies into the design so that a partial failure does not endanger overall mission success.
>So to have any chance of success each failure point needs to be engineered below, by way of example, a 0.0001% chance of failure. That costs lots of money. But say one decides to accept more risk for less cost. Ok. So you switch from 0.0001 to 0.001 failure rates. You risk is now 10-fold higher at each failure point,
It seems like a strawman to suggest that you would let failure probability increase by a factor of ten as the first step.
>but with thousands of failure points adding up you are now essentially doomed.
And it was an easy one to tear down.
>And you haven't saved anything. The cost of 0.001 components isn't fundamentally different than the 0.0001 components were.
There is no reason to believe that at all. Invert the question: if a system is 99.999% reliable, does that mean it should be free to make it more reliable? And why?
>It is either go or no go. Take CPU production. Intel spends billions at each of hundreds of fabrication step to push down miniscule error rates because any of a million errors can destroy a chip. There is no money to be saved by allowing any one process to become less than as perfect as it can possibly be.
This is completely wrong. Chip factories do produce chips with defects; those chips are sold for cheaper, since they still work, but not as fast. IIRC most of TSMC's modern processes are designed to sell the good chips at a high price and the bad ones cheaper.
Evaluating the reliability of every component is extremely expensive. If you had perfect knowledge of the reliability of every component, then sure. It would be cheaper to build a perfect rocket first, then launch it. The cost saving comes because it’s cheaper to launch the rocket and see where it fails than it is to exhaustively evaluate the reliability of every component.
But you still need to do it for something like a rocket that is supposed to carry people in the future. In the long term, it's better to evaluate the reliability of every component at the beginning than to do it after it fails.
You actually don’t. With a high launch cadence you can make statistical inferences about the reliability of the system as a whole. This isn’t the 80s where you launch one disposable rocket every 6 months. Starship is fully reusable and will likely fly multiple times a week carrying non human payloads.
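As a rough illustration of what "statistical inferences from launch cadence" means in practice, here is a sketch of the standard zero-failure (success-run) calculation: how many consecutive failure-free flights it takes to demonstrate a given reliability at a given confidence. It assumes flights are independent and the vehicle doesn't change between them, which is a strong assumption for a design that is still being iterated; the function name is my own.

```python
from math import ceil, log

def flights_needed(reliability: float, confidence: float = 0.95) -> int:
    """Consecutive failure-free flights needed to claim, at the given
    confidence, that per-flight reliability is at least `reliability`.

    Solves reliability**n <= 1 - confidence for the smallest integer n.
    """
    return ceil(log(1 - confidence) / log(reliability))

for target in (0.99, 0.999):
    print(f"{target:.1%} reliability at 95% confidence: "
          f"{flights_needed(target)} clean flights")
```

That works out to 299 clean flights to demonstrate 99% and 2995 to demonstrate 99.9%, so cadence matters a lot: the inference is only as cheap as the flights are.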
Have you heard of instances where airplanes failed due to something as basic as a nut or a screw not meeting the required specifications? It's difficult to trust the reliability of a rocket, which is largely based on statistics, when you're putting hundreds of people on it. How can we be certain that NASA/FAA and other organizations will permit such a risk?
The crash of Partnair Flight 394 in 1989 resulted from the installation of counterfeit aircraft parts. Counterfeit bolts, attaching the vertical stabilizer of a Convair CV-580 to the fuselage, wore down excessively, allowing the tail to vibrate to the extent that it eventually broke off.
This happens to aircraft, and the only way to ensure it doesn't happen to rockets is to have extreme control of your supply chain.
You can’t verify the reliability of an aircraft by flying it 1000 times because you’d be putting the life of the pilots at risk. Autonomous rockets have no such constraint. You could launch Starship an unlimited number of times without putting a single person at risk.
It's not even about putting the life of the pilots at risk - it's about cost.
It's starting to look like the lack of a flame diverter was a huge mistake that will cost a lot of money and time. How do you think this will affect the whole project? What if, after they fix the launchpad, it gets destroyed again simply because of some other preventable failure? How many more of these "tests" can they have without going bankrupt?
Pouring new concrete is cheap, they’ve already done it multiple times from damage due to test fires. Each time they reformulate the concrete and it needs less repair.
After they get to orbit they can start putting payloads on board, and the tests will pay for themselves. I’d say the risk of bankruptcy is very low.
Edit:
Just saw photos of the crater under stage zero. Looks like they do need a flame diverter lol.
Once they realized they needed a flame diverter, I think they traded off the cost and delay of putting one in vs the cost of just filling the hole each time.
It's totally feasible for them to keep testing and keep refilling that hole while in parallel they build a perfect launch pad with a flame diverter somewhere else. That way they don't have a gap in testing cadence.
Maybe you're trying to make the point that for very complicated systems with many failure points, the reliability of a single component is less impactful than redundancy, and there is some truth to that. But I must point out that risk vs cost calculations are definitely happening in both industries you mention.
Triple redundancy is not a thing in general aviation while, for some systems, it is in commercial. That's a risk vs cost calculation.
Semiconductor manufacturers do risk vs cost calculations through the entire development and manufacturing process. Source: I've worked in semiconductor, doing those calculations.
What you've missed here is that failure points can only be guessed or simulated prior to testing. That analysis is often more expensive than the cost of a test and test article.
So SLS does a LOT of analysis and manages to find and rectify failure points prior to flight. They pay a lot in analysis and inflexible design to do this.
SpaceX does some analysis, but then flies to confirm the analysis early. That way they identify actual failure points and can use sensors to see how close they got to failure.
For instance, today I'm sure they learnt a lot from each failed engine, the parts on the booster that stopped working, detailed telemetry on ship behavior in flight, and the sensors for the non-separation of booster from ship.
SpaceX builds to the same high tolerances as other rocket manufacturers, but they don't try to avoid testing through overly rigorous analysis. They also don't gold-plate their manufacturing, instead making tradeoffs to allow cheaper volume production with recovery instead.
> risk is now 10-fold higher at each failure point, but with thousands of failure points adding up you are now essentially doomed. And you haven't saved anything. The cost of 0.001 components isn't fundamentally different than the 0.0001 components were.
exactly. optimizing for an acceptable risk for space launches, esp space travel, seems still to be leading to $2B launches. at which point, what's the difference?
but like, SpaceX's whole thing has been blowing up test rockets over and over as they experiment, iterating quickly. once they've nailed the design down, the result is pretty reliable, but look at how many booster recovery fails they had before they got successful. your logic works more for defect rates on parts than on the design itself.
Before the anti-Musk cats get too wet, Elon Musk has said, and I quote, "If we get far enough away from launch pad before something goes wrong, then I think I would consider that to be a success. Just don't blow up the pad.": https://edition.cnn.com/2023/04/16/world/starship-spacex-lau...
Right. The criteria for this test was "Failure = CATO (catastrophic at takeoff)". It took off, it didn't CATO, it even hit max Q, therefore success. That's not "partial success". That's 100% success, you won the game show, but you didn't get all the points in the sudden death bonus round.