Alaska Airlines flight 1282 NTSB preliminary report [pdf] (ntsb.gov)
266 points by tomalpha on Feb 6, 2024 | 266 comments



Also posted on DocumentCloud, since NTSB servers aren't responding (as of this comment)

https://s3.documentcloud.org/documents/24410269/report_dca24...


A very thorough preliminary report. I've worked for a long time in quality systems, and this is a perfect example of a systemic failure. They've got work being handed off between Boeing employees and 3rd party contractors with insufficient controls in place to verify that very basic tasks are being performed.

I'd be curious to know how many non-conformances they typically see during assembly of a plane and whether management is actually allowing the quality department sufficient independence to investigate these issues and fully resolve them. I'm guessing that the production personnel are under tremendous time constraints and are constantly pressuring the quality assurance people to sign off on whatever paperwork is holding up the line, no matter the safety implications.

Also, I think a lot of middle and upper level management needs to lose their jobs over this. I hope this mess ends up in textbooks and gets beaten into the head of every MBA student in the country.


> I'd be curious to know how many non-conformances they typically see during assembly of a plane (...)

Very likely that number is meaningless. I suspect this is the kind of environment that incentivises hiding non-conformances whenever possible.

For example, better quality control usually results in an increase in the number of recorded defects, at least temporarily. But that is just because a large portion of these defects went undetected before.

So... you are looking at a number that you have nothing to compare to that also depends on how closely the process is monitored and also depends a lot on the definition of what is non-conformance.

It is like trying to give an answer to "what is the length of Britain's coastline?" Everybody knows that you can get whatever answer you want depending on how long the ruler is.
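
The ruler effect in the coastline analogy above is easy to reproduce. Below is a minimal sketch, assuming a made-up jagged "coast" (a seeded random walk, not real geography) that is measured with progressively coarser rulers by keeping only every `step`-th point. The function names and the data are illustrative assumptions. Because each coarser sample is a subset of the finer one, the measured length can only stay the same or shrink, much as a defect count shrinks when the process is monitored less closely.

```python
import math
import random


def polyline_length(points):
    """Sum of straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def measure(points, step):
    """Measure the coast using only every `step`-th point (a coarser ruler)."""
    sampled = points[::step]
    if sampled[-1] != points[-1]:
        # Always keep the final point so every ruler covers the same stretch.
        sampled.append(points[-1])
    return polyline_length(sampled)


def jagged_coast(n_segments, seed=0):
    """Hypothetical coastline: x advances by 1, y wanders randomly."""
    rng = random.Random(seed)
    y = 0.0
    points = [(0.0, 0.0)]
    for i in range(1, n_segments + 1):
        y += rng.uniform(-1.0, 1.0)
        points.append((float(i), y))
    return points


coast = jagged_coast(1024)
# Same coast, six different rulers: the finer the ruler, the longer the answer.
lengths = {step: measure(coast, step) for step in (1, 2, 4, 8, 16, 32)}
```

None of the six numbers is "the" length; each one is only meaningful together with the ruler that produced it.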


> Very likely that number is meaningless. I suspect this is the kind of environment that incentivises hiding non-conformances whenever possible.

That, by itself, should be the kind of thing that shuts down a company permanently. Remember, this is the aviation industry, where they track the mine where the ore for the bolt was mined, who tightened the bolt, and with which torque wrench.


Sounds like a lot of steps and paperwork… steps and paperwork that might easily and regularly get fudged.


The idea is that the system as a whole should be resilient to a certain degree of this, as it will be caught in other places or at least reported at a (hopefully non-fatal) aviation incident that leads to a report and analysis, at which point you backtrack and issue recommendations that actually have clout (like, you can't fly with your plane anywhere in the world if you don't fix this).

Of course if the entire industry is corrupt it doesn't work, but to a certain degree I guess it is robust. This time it led to a potential disaster, but it will become safer as a result.


Of course. And the way to go is to set up incentives so that everybody wants to report issues rather than hide them.

In a normal, healthy situation a company like Boeing should not feel threatened if some problems are exposed from time to time. That is assuming that everybody understands that uncovering these problems is part of the process, is necessary to improve safety, and is exactly why and how we have good safety in the first place.

It only becomes a problem when that safety record becomes blemished too much.

I am pretty sure it actually worked successfully for many decades up until some point, as evidenced by a consistently improving safety record.


This sort of thing always reminds me of this Dilbert comic: https://twitter.com/k8em0/status/1078824013843984384


> Also, I think a lot of middle and upper level management needs to lose their jobs over this.

Given that a worker who raised this problem or took the initiative to resolve it would probably be punished, I completely agree.

Reminds me so much of how, once-upon-a-time, there seemed to be actual engineering management in a cooperatively adversarial relationship with business managers but not anymore. Now any sort of engineering in business seems to be completely business managed and business minded. I'm sure it's great for profits while it lasts but I haven't observed engineering becoming better and I suspect business is suffering by overextending itself too, I just don't have any solid observations.

(Well, maybe one: my brother does warehouse / logistics management and says that, despite there being every reason in the world to do it, he has never seen the accounting software and the inventory software successfully and productively linked. So, a big opportunity there for a serious player, but maybe not that profitable compared to the issue?)


Once upon a time, QAs and developers had a bit of a cooperatively adversarial relationship as well. That is no longer the case now that teams have been restructured by the business, so they simply cooperate.


Rumor has it the controls are there, but subvert-able.

Apparently, there are two ticketing systems (one for "history of plane," Boeing internal, and one for "day-to-day onsite work," visible by contractors and Boeing management). The work to fix the rivets was logged in the day-to-day, but management and the onsite staff managed to convince themselves that merely opening the plug to fix the vacuum-seal trim did not constitute "removing" the plug, and since there was only an entry in the history-of-plane log for removing, not opening, they didn't log it there (when the intent was "there's no entry for 'just opening' because there's no such thing as 'just opening', breaching the pressure vessel at all constitutes 'removal of plug'").

The final inspection that should have caught the error would have been triggered by the update in the history-of-plane ticketing queue.

(And as for 'how many non-conformances,' the same source claims that Spirit is one of the few subcontractors with on-site staff at the factory because their parent company delivers such consistently shoddy, out-of-compliance product that they are continuously doing final warranty work onsite. So maybe "fire that vendor" should be on the docket too.)
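
The logging gap described above comes down to an inspection rule keyed on how the work was labeled rather than on what the work physically did. Here is a minimal sketch of that difference. Every name in it (`WorkEntry`, the `"OPEN"` / `"REMOVE"` labels, the `breaches_pressure_vessel` flag, the job id) is a hypothetical stand-in for illustration and does not reflect Boeing's or Spirit's actual systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkEntry:
    job_id: str
    action: str                     # hypothetical labels: "OPEN" or "REMOVE"
    breaches_pressure_vessel: bool  # did the work disturb retention hardware?


def inspection_by_label(entry: WorkEntry) -> bool:
    """Rule as described in the thread: only a logged 'REMOVE' triggers inspection."""
    return entry.action == "REMOVE"


def inspection_by_effect(entry: WorkEntry) -> bool:
    """Alternative rule: trigger on the physical effect, whatever the label says."""
    return entry.action == "REMOVE" or entry.breaches_pressure_vessel


# The rivet rework as the rumor describes it: labeled as an opening,
# even though the retention bolts had to come out.
rivet_rework = WorkEntry("job-001", "OPEN", breaches_pressure_vessel=True)

needs_inspection_old = inspection_by_label(rivet_rework)   # label-based rule
needs_inspection_new = inspection_by_effect(rivet_rework)  # effect-based rule
```

Under the label-based rule the choice of word decides whether anyone checks the bolts, which is exactly the loophole the comment describes. The effect-based rule still depends on someone recording the flag honestly, so it narrows the gap rather than closing it.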


You might want to cite your source on this, which I'm guessing is the purported insider speaking about same?

From the portions of the report that hint at corroboration,

> Documents and photos show that to perform the replacement of the damaged rivets, access to the rivets required opening the left MED plug (see figure 15). To open the MED plug, the two vertical movement arrestor bolts and two upper guide track bolts had to be removed.

> Records show the rivets were replaced per engineering requirements on Non-Conformance (NC) Order 145-8987-RSHK-1296-002NC completed on September 19, 2023, by Spirit AeroSystems personnel. Photo documentation obtained from Boeing shows evidence of the left-hand MED plug closed with no retention hardware (bolts) in the three visible locations (the aft upper guide track is covered with insulation and cannot be seen in the photo)


Yes, what they said about two systems was from the insider account that was posted online.



>I'd be curious to know [...] whether management is actually allowing the quality department sufficient independence to investigate these issues and fully resolve them

If management in the aerospace industry works like management in the software industry, then I guess they are pushing for results as aggressively as possible without much concern about safety or anything else.


At a company I worked in, we had a joke about this: "Good thing we don't build nuclear reactors".

In some software projects the level of rush, and the fact that bugs would sometimes leak into production, was kinda horrifying. It would've been way more so if it had been the kind of project that could kill people in case of failure, as happened at Chernobyl with nuclear reactors, or at Boeing with planes.

I can't really imagine what these engineers feel when they rush this kind of work knowing what's at stake.


Quality is always a trade-off. If you're deeply into economics, you wonder about the trade-offs of cost to find defects before shipping, difference in cost of addressing defects before and after shipping including costs of mitigation from consequences of defects, % of defects that will never be found after shipping (and are therefore a real cost savings), and in the long game costs of having a reputation for shipping product with defects that could have been reasonably detected.

In a lot of software organizations with rapidly changing and undocumented requirements, there's a good chance defects will go unnoticed until they're no longer relevant, so spending a lot to find them before they're shipped is a waste. Mitigation of many software defects is simple, but some aren't; hopefully you know which changes are expensive to fix if wrong, so you can more thoroughly vet those.

In Aerospace, addressing defects after shipping is very expensive, and mitigating the effects of defects is only approximate; you can't restore passengers from backup, economic damages don't really make families whole, but should be an incentive not to let reasonably detectable defects be shipped.
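
To make the trade-off above concrete, here is a minimal expected-cost sketch. The model and every number in it are assumptions made up for illustration, not data from Boeing or anyone else, and it leaves out the long-run reputational cost mentioned above. It compares shipping with and without a pre-ship inspection, given a defect probability, an inspection detection rate, the cost to fix before and after shipping, and the share of escaped defects that are never found.

```python
def cost_without_inspection(p_defect, fix_after, p_never_found):
    """Expected cost per unit when nothing is checked before shipping."""
    return p_defect * (1.0 - p_never_found) * fix_after


def cost_with_inspection(inspect_cost, p_defect, detect_rate,
                         fix_before, fix_after, p_never_found):
    """Expected cost per unit when every unit is inspected before shipping."""
    caught = detect_rate * fix_before
    escaped = (1.0 - detect_rate) * (1.0 - p_never_found) * fix_after
    return inspect_cost + p_defect * (caught + escaped)


# Hypothetical "fast-moving software" case: cheap to fix later,
# and half of the escaped defects never matter.
software = dict(p_defect=0.1, fix_after=100.0, p_never_found=0.5)
software_skip = cost_without_inspection(**software)
software_check = cost_with_inspection(
    inspect_cost=50.0, detect_rate=0.8, fix_before=10.0, **software)

# Hypothetical "aerospace" case: same inspection, but an escaped defect
# is enormously expensive and is always found eventually.
aerospace = dict(p_defect=0.1, fix_after=100_000.0, p_never_found=0.0)
aerospace_skip = cost_without_inspection(**aerospace)
aerospace_check = cost_with_inspection(
    inspect_cost=50.0, detect_rate=0.8, fix_before=10.0, **aerospace)
```

With these made-up inputs, skipping the inspection is cheaper in the software case and far more expensive in the aerospace case. The only things that changed are the post-ship cost and the chance a defect is never found, which is the point the comment is making.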


> Mitigation of many software defects is simple, but some aren't; hopefully you know which changes are expensive to fix if wrong, so you can more thoroughly vet those.

This assumes you're fortunate enough to have a defect at the outer edge of the system. Most times, these problems are created in the initial rush of pushing something out and then tax every effort that depends on them, forever, and ever.


Amen.


>In a lot of software organizations with rapidly changing and undocumented requirements, there's a good chance defects will go unnoticed until they're no longer relevant, so spending a lot to find them before they're shipped is a waste.

It's really a shame that a good percentage of these applications full of bugs and "rapidly changing and undocumented requirements" don't get scrapped and stay many decades afloat until they get replaced by another application also full of bugs and "rapidly changing and undocumented requirements".

I think that that's a very sad way of seeing things honestly.

In the past the USA put a man on the moon; today repeating the same feat looks almost impossible. I bet that a lot of managers at Boeing also think that building planes the way they did a few decades ago looks almost impossible now.


I really find it disingenuous to imply that we can’t build moon rockets because we’re not good enough at engineering projects - I can think of a few engineering projects that took off last year, to say the least. And NASA doesn’t deserve the shade IMO. The kids of the people who built the Apollo program aren’t working at Boeing or fighting for one of the few underpaid NASA positions - they’re building reusable rockets for the Twitter CEO, and, much more commonly, parasitic UX features for gig economy apps.

TL;DR: we’re fine at engineering, we’re terrible at resource allocation. Or at least that’s the more relevant cause. I post this knowing full well that this is HN and I might well be disagreeing with a senior NASA employee…


If we're fine at engineering, why are doors falling off planes?

I don't think you can say "We're fine at engineering but we're often terrible at management."

They're not separate things.

Engineering culture is about inventiveness, pride, craftsmanship, and getting the job done well. Bean counter culture is the opposite of all those. If that's the culture engineers work under not only do none of them happen, but they become less and less possible over time.


You’re making great points about the software/engineering industry and I don’t disagree about any of those specifics — engineering culture is paramount. I was just trying to point to (what I see as) the root cause: the engineering culture at Boeing didn’t fall apart because it’s run by lazy millennials or because the managers hadn’t read Mythical Man Month, it’s because they spent their money on stock buybacks and executive compensation.

Funnily enough, I went looking for Boeing stock buyback info and found this article… seems my hypothesis has some specific backing in this case!

Opinion | Did Stock Buybacks Knock the Bolts Out of Boeing? https://www.commondreams.org/opinion/boeing-safety-stock-buy...


Stock buybacks are tax-efficient dividends. Boeing has been paying dividends fairly regularly since 1937, so paying stockholders can't really be the problem. Executive compensation is kind of a red herring too. If you don't like what the executives did, it probably reflects more on the choice of executives than on their compensation, but you could maybe make an argument about how compensation incentives were set up.

MCAS and unbolted door plugs feel like two separate types of problems, IMHO. Both of them can be tied to executives and culture, of course. MCAS comes from a desire to skirt regulations: hiding automation from pilots in order to reduce certification requirements is a design error. OTOH, the unbolted door appears to come from production / rework corner cutting; the design is sound, the written process is sound (I think), but the written process was not followed because of schedule pressure.

You can have a company culture that encourages skirting regulations and cutting corners in production regardless of dividends/buybacks and executive compensation.


The unbolted door was caused by a desire to avoid reinspecting the door were it removed using the official process. That's exactly the same problem as MCAS: they tried to skirt the regulations that required reinspection.


Again, great response, I think you understand the dynamics better than I. HN is gonna cut off this thread soon but I think your last line helps me sum up my point well:

  You can have a company culture that encourages skirting regulations and cutting corners in production regardless of dividends/buybacks and executive compensation.

IMO, a good way to avoid such a culture is to choose and cultivate quality engineers. A better way is to also entice them with high pay and/or prestigious projects. The best way, which is entirely foolproof, is just to have more money so that corner cutting doesn’t come up. And given the scale of Boeing’s behavior, I think it’s reasonable to say they could’ve expanded QA/safety/training/testing/etc budgets by a LARGE degree.

In other words: imagine I bought Airbus (lucky crypto run, ofc) and immediately transferred 10-20% of their liquid(izable) assets as cash into my bank account. And otherwise left the company running as is. Wouldn’t you think that’s the most relevant fact when discussing changes to their engineering culture in the preceding years?

Re: “it’s always been this way so it can’t be that”, I don’t think that’s a solid enough premise to support that conclusion. a) context changes, and b) I think it really is crazy out there these days. From Jacobin:

  In a 2017 article for The American Prospect that he coauthored with Sakinç, Lazonick identified one especially distressing instance in which Boeing neglected to follow through on its planned redesign of the 737 Max airplane, a project that was estimated to cost $7 billion at the time. That amount, he wrote, was what “on average, Boeing has been spending on stock buybacks annually since 2013.”

  In another analysis, Marie Christine Duggan, an economic historian at Keene State, concluded the amount that Boeing spent on buybacks in recent decades has generally outweighed spending on capital expenditures like upgrades and maintenance. Duggan found that in 2017, at the height of its buyback frenzy, “Boeing’s spending on dividends and stock buybacks was 66 percent of total spending, while only 9 percent of Boeing’s cash went into new equipment to manufacture planes.”

  Many of Boeing’s mass stock buybacks came after President Donald Trump’s 2017 tax cuts, despite the fact that Boeing and other major companies promised to invest their resulting tax savings on capital expenditure and innovation.


> Quality is always a trade-off.

That's a pretty huge assumption.

For instance, compare firearms prior to replaceable parts to firearms after.

Better, cheaper, easier to make (because craft was replaced with process). Some up front cost, but absolutely not a trade off, it was a huge advancement.

Of course modern process control does more or less let you relax conformance rules to reduce cost, but it's farcical to call sacrificing reasonable conformance "quality".

Arguably, the idea that quality is obviously a trade off and you can make money by letting it slide is one of the sources of rot in our society.


> Some up front cost, but absolutely not a trade off, it was a huge advancement.

Any exchange of higher fixed costs for lower marginal costs (or other benefit) is a tradeoff.

This is a tradeoff that was/is massively beneficial, but it’s still a tradeoff.


It's not a quality trade off, which is at least implied to be what I am talking about in my comment.

Saying "Quality is always a trade-off" implies you can't actually make things better over time. I would accept something like "You can always trade off of the quality you are capable of producing to reduce costs", but the point is that capability can in fact change, and thus there are ways to improve quality that may not be a trade off (because they also improve the business/system along other dimensions). Even simple little things like aligning a process with the intended outcome can reduce costs while improving quality, you don't have to invent a revolutionary method of manufacturing.


Beyond a very early point in any engineering effort (where fruit might be not just low-hanging, but actually lying on the ground), nearly everything is a trade-off.

Those trade-offs (higher fixed costs in exchange for lower marginal costs and higher quality) can be wildly beneficial overall but finding large Pareto improvements against every dimension is rare in anything even remotely mature.


As the old saying goes: "Fast, cheap, or good. You can pick a maximum of two."


That's not how it traditionally worked in aerospace. Commercial airliners are such a safe mode of transportation because from start to finish there are (or used to be?) very high safety standards all around. I can't speak for what's going on with Boeing currently though. Things certainly seem to have deteriorated.


I have a tendency to hope for unrealistic things too. Not a reliable trait of mine, no. But in my clear moments I am afraid that world peace and the union of all nations and religions will come much sooner than unimaginative but determined bean counters will learn from the millions of catastrophes, past and future, and give up pushing their core value and first rule of 'take more, give less' to infinity and beyond. And give up their personal wealth with it.


I think the hope is people of good conscience stop being political cowards.

If you believe in quality engineering, but refuse to engage in the political and business dimensions of an enterprise to fight for that view, then you are just virtue signaling — since you’re refusing to engage with the tools needed to make it happen.


It is always refreshing to meet fellow naives like yourself, I like the company of the like-minded!

Engaging in today's political theatre and culture (that is the keen slave of forceful bean counters/collectors btw.) with the prediction of constructive advancement for the quality of life, that is some very hardcore stuff! Even for me.

(what I can do beyond raising my voice is to set an example rather than piss into the headwind of a hurricane. and hope others will follow, or I can follow. but definitely not replacing my inner Gandhi with Rocky in a blink and getting into a fistfight with scores of aggressive pro boxers simultaneously who are itching to run over anyone in their way. also I can avoid the products of those bean people, not feeding them with my purchases or assistance in any way, encouraging others to do the same. vermin should starve, not bloom!)


I meant in the practical senses:

- learn to make your arguments in terms that appeal to the sensibilities of bean counters, not engineers

- learn to manipulate documentation so they’re forced to record overriding good engineering in a way that’s discoverable

- learn to identify key political players and address their desires, rather than appealing to “doing the right thing”

Etc.

All things that I’ve done poorly at various stages of my career — and seen other engineers struggle with as well. Much like being a manager requires training and education, so does being an effective advocate for engineering to the wider organization.

Or to borrow your analogy, if every engineer is afraid of boxing, then can we be surprised their views go unheard?


First, being cowards this way would need to stop being the wise choice to make.


At a minimum, we must abolish the right to work at a federal level. Sounds so terrible, doesn't it? Pooper is against the right to work. Pooper wants to take away jobs. No, we can't have that because of politics. And thus, we are stuck.


According to the "insider source", 392 non-conforming defects in the fuselage door installation in the last 365 calendar days.

> As a result, this check job that should find minimal defects has in the past 365 calendar days recorded 392 nonconforming findings on 737 mid fuselage door installations (so both actual doors for the high density configs, and plugs like the one that blew out). That is a hideously high and very alarming number, and if our quality system on 737 was healthy, it would have stopped the line and driven the issue back to supplier after the first few instances.

Source:

https://leehamnews.com/2024/01/15/unplanned-removal-installa...


> I'd be curious to know how many non-conformances they typically see during assembly of a plane

Well, the report says "During the build process, one quality notification (QN NW0002407062) was noted indicating the seal flushness was out of tolerance by 0.01 inches."

So I'd say they've had about 2,407,062 quality issues :)


Then it's revealed: "Well, the numbering schema started at QN AXX... and we looped through the alphabet a few times..."


I was speaking with some friends at a Christmas party who work for the Navy - they’ve taken deliveries of planes from Boeing with the same sort of issues that start in the factory. They even went as far as to say the whole lot of planes should’ve been rejected but weren’t. Multiple things not built to spec.

The planes they worked on did not share an assembly line with the 737 but another Boeing model…


If I recall correctly, sometime after the MAX crash saga, the Air Force simply refused to take delivery of their 767s because of unacceptable build issues, including foreign debris (aluminium scraps and shavings) and forgotten tools flying around.

I didn't follow how that issue evolved.


"....after the left mid exit door (MED) plug departed the airplane leading to a rapid decompression"

Lol, they said the door plug "departed" instead of "blew the f* off"


[flagged]


On the contrary Boeing has spent the past 2-3 decades trying to screw over the workers as hard as possible. They've massively outsourced parts manufacturing. They don't even make the fuselages anymore!

The whole point is to make those people contractors so the contractors can hire cheaper labor and even push work to cheap overseas locations.

Can a system like that be made to work? Yes it certainly can. But it says something about Boeing's priorities: find the cheapest labor possible to goose profits (management bonuses).


That's not what a union does


The report seems to mesh with and confirm many details of the anonymous insider account at https://leehamnews.com/2024/01/15/unplanned-removal-installa.... The bolts were not reinstalled following work on the plug rivets/seal. The official system doesn't record that work was done requiring the bolts to be removed.


Yep, biggest new thing here is discussion of witness marks showing no evidence of the bolts, and photo before the rivet repair showing that at least two of the bolts were present, and photo after the rivet repair during installation of the insulation showing at least 3 bolts missing.

So this all just serves to confirm that report, and what people suspected for a while; the bolts were just missing, removed for removing the door plug for the rivet rework and never reinstalled.


> removed for removing the door plug

there were apparently two (identical) procedures -- one for "removing" the door plug and one for "opening" the door plug -- but only one of them actually called for paperwork and a quality inspection after it is done.

It seems they decided to "open" the door plug instead of "remove" it to work on the sealing issue? -- and therefore a documentary record of the work performed and the inspection thereafter is lacking.

It would be useful IMO to explore what led the team to prefer "opening" versus "removing" -- and if the subtle difference in documentation requirements was a consideration in preferring one over the other -- and if that points to deeper, pervasive cultural issues that lean towards less work rather than giving safety and quality primary importance.


I don’t think there were two approved procedures.

I think there was an approved procedure (“removing the plug”) and a deviation from process, justified with ad hoc semantic games.


I doubt this was intentional? It sounds more like mistakenly applying the procedure for opening a normal door, which requires no followup, to opening a door plug, which does.


lol of course it was intentional.

what the Spirit mechanics had to do was a RMV, they knew this, because the Boeing eng rep told them so, but they didn’t like this, so they marked it in the system as an OPEN but performed the work of a RMV, to avoid the additional quality checks a RMV would require. And then they fucked up the RMV but didn’t know and no one else did because it was marked as an OPEN so no one checked the bolts.

this was very intentional. the Spirit mechanics thought they were hot shit and didn’t need to follow the stupid, slow process. and they abused the system to allow them to file the work as more innocuous than it was. because they didn’t need anyone checking their work, or whatever they were thinking.


My experience across a range of technical and consulting roles has been that junior folk understand the need for their work to be reviewed and checked, because they know that they are inexperienced. Good senior folk want their work to be checked because they are experienced enough to know that they are fallible. It is the mid-level folks (especially those who are promoted above their competence to senior roles) who think that they are good enough not to need their work reviewed.


This is my experience as well, but it’s usually not those promoted above their capability but those who were passed over and then grow resentful that end up causing the issue. An otherwise competent, but now bitter, middle aged guy on the third shift (or otherwise with less supervision than usual) who is going through a divorce or some other major life stress is the scenario I see most often. They fall behind, they come in to catch up, they have to leave unexpectedly for a family medical crisis, they fall further behind, and so on and so on and before you know it, a perfectly rational person is doing stuff like half-assing the QA checks and it seems almost normal.

But you’re right. It’s not the black belts you have to watch out for, it’s the browns.


Are Spirit employees allowed to file records in CMES?!


Are Spirit employees not accountable for their work?


Can someone explain to me why this door plug wasn't actually a plug that physically stays sealed from cabin pressure? That seems like a sensible failsafe?


Probably because it reuses parts of the design of the emergency door so it can be swung out of the way for maintenance. The emergency door uses this complex design of stop pads, guide rails, etc. to have “plug” characteristics (which did hold out for 3 months without being properly secured) because a true plug door would have to swing inwards, which isn’t an ideal property for an emergency escape, and I’d guess it might need more room reserved internally.


It's also done this way so a real emergency exit can be installed instead of the plug if the airline (or another airline that buys the plane) decides to increase seating capacity. It would be really inconvenient (not to mention mechanically questionable) if you would have to weld additional stuff onto the fuselage to accomplish this, so the parts that are connected to the door frame are already there...


Good point. On the Boeing (737) the door opens to the outside. On the Airbus (320) the door opens to the inside.

I can remember when I was small my Mom mentioned that doors of buildings in the US are required to open to the outside, a rule that does not exist in Europe AFAIK. So there you have it: you have a significantly higher chance of crashing in a Boeing, but when it happens, you can leave the plane 2 seconds faster.


Sorry, both of those are just wrong. The A320 doors open to the outside (as can be seen here https://www.airplane-pictures.net/photo/1364569/d-ainb-lufth...), and also come with a mechanical opening aid that Boeing doesn't have, because of the grandfathering of the 737. And emergency exits in "Europe" obviously open to the outside, at least in West/North Europe.


Nope, you are wrong on both counts. OP is referring to the emergency exits and why it would make sense to make them fail-safe. These doors are located above the wings. Your picture shows the normal entrance door. This door is gigantic and has all sorts of safety procedures to keep it closed.

Boeing opens the emergency exit to the outside and we all know that they will just pop out during flight every time they forget to bolt them. Airbus has a fail-safe design: unlock, pull in, then throw it out. No way you can open these during flight, with or without bolts.


[flagged]


Yeah, this is a preliminary report, they only report direct factual findings that they have collected so far, and as they mention in this report, they haven't yet examined the electronic manufacturing records that the insider was referring to.

The NTSB is not quick to jump to conclusions; they are very thorough, and they collect and sort through a lot of evidence. It's not surprising that there's information out sooner that says more for such a high profile incident; but once the NTSB final report comes out, it will be quite detailed and thorough, and cover all of this and more.


It's a preliminary report, they deliberately avoid including conclusions that might be reversed later but would do harm by inclusion at this stage.


Where are you seeing a 100+ page report?


The report is 19 pages.


The fact that a critical piece of the evidence was cell phone photos sent between workers coordinating door re-assembly doesn't exactly instill a whole lot of confidence in their permit-to-work process. I didn't like it when it was medical teams doing shift handover via a Google Doc, and I don't like it when it's a matter of flight safety either. Or, as Homer might eruditely say: "guess I forgot to put the bolts back in" [1]

[1] (https://www.youtube.com/watch?v=IiNPLIauEig)


This is a puzzling attitude to me. Every time we technologists see a crappy proprietary solution being used for a problem, the first exclamation is, "why not use <commodity solution X>? That's so dumb, they spent $10k on that tool when they could have spent $100 on X!"

There must be a middle ground here- the paradox is that Google, Apple, etc have this ability to generate user friendly software and hardware at scale. But they aren't considered "battle proven". The expensive proprietary systems that are used instead tend to be hard to use and brittle, so what's the middle ground?


The issue here isn't using google chat, the accusation is that this was Spirit and Boeing conspiring to not record these in the proper work order system under the pretence that this work was being done by Spirit as-if-it-were pre-delivery.

Read https://www.airlinepilotforums.com/safety/146074-boeing-inte...

And then this from the doc: "The investigation continues to determine what manufacturing documents were used to authorize the opening and closing of the left MED plug during the rivet rework."

https://s3.documentcloud.org/documents/24410269/report_dca24...


A key point I read somewhere in the accusations is that non-Boeing contractors (e.g. Spirit) by policy cannot have access to CMES.

Consequently, you have a system of record that a major party to work doesn't have access to.

As we've all seen, this leads to "actual coordination" being instead done in a system all involved parties do have access to (SAT).

Which inevitably leads to a desync between CMES (SOR by fiat) and SAT (SOR in practice).


But isn't that exactly the underlying fraud? Boeing and Spirit are conspiring to cover up Spirit's deficient delivery by allowing Spirit to work onsite at Renton and do post-delivery re-work and pretend that it is as-if delivered, and _outside_ Boeing's system of record. To reject the delivery and wait for Spirit to fix would ruin their delivery schedule, so they fudge it and muddle along.

https://www.seattletimes.com/business/boeing-aerospace/faa-p....


SOR - system of record SAT - ? CMES - ?


They're described in parent's first link. Boeing systems.


The data/photos should be in the ERP/MES.


> The investigation continues to determine what manufacturing documents were used to authorize the opening and closing of the left MED plug during the rivet rework.

I mean, there is already a ton of documentation and process surrounding the construction of an airplane. Adding more process doesn't safety make. Having a safety culture without the fear of retaliation, on the other hand, makes a world of difference.


If the door was removed (which the NTSB report and the whistleblower post linked elsewhere around here say must have happened) there should be documentation for, at minimum, the removal and reattachment. If the door was not removed but was opened and closed, there should be documentation for both of those actions instead.

I don't know if this should be considered "adding more process" because it has been standard process for a very long time. All work done on an airplane is authorized, by someone, and after completion is recorded, by someone. Discrepancies and deviations from this standard operating procedure are a big deal.


That line stood out to me, because it implies that no proper "manufacturing document" was used for the work. If that's true, that's very bad; unapproved maintenance procedures have been the cause of multiple crashes.


> Overall, the observed damage patterns and absence of contact damage or deformation around holes associated with the vertical movement arrestor bolts and upper guide track bolts in the upper guide fittings, hinge fittings, and recovered aft lower hinge guide fitting indicate that the four bolts that prevent upward movement of the MED plug were missing before the MED plug moved upward off the stop pads.

Ooofff. No bolts at all! How did this pass Boeing QA?



It's crazy their supplier is delivering so many must-fix defects they have to have a warranty team on site to help fix them. Crazier that Boeing just let that get more and more out of hand instead of making Spirit get their shit together on their end.

https://www.youtube.com/watch?v=xIAfCupuZ3w

Edit: Also that entire culture and dynamic between Boeing and Spirit on the production floor seems very toxic and driven by misaligned incentives. There should be zero place for bickering, aggressiveness, and finger pointing in that SAT channel. If something needs to be fixed Spirit needs to fix it with a smile. If Boeing and Spirit need to review who is responsible for what, and what needs fixing and doesn't, that should happen in review in a different setting. The production crews need to be able to execute on their processes and focus on the quality of the product without having timeline and budget concerns seeping into their day-to-day.


When you gut a part of your company to spin it out as a separate company that isn't going to have the same union contracts and can then be squeezed as hard as possible...

First you create the reverse monopoly then ensure the toxicity, all in chasing supposed "value add" of final assembly and coordination (to paraphrase the Boeing CEO who made the strategy, who wasn't from McDonnell but was a Boeing lifer)


Spirit has no competition. How do you threaten the supplier who owns your production when you have no other options?


Considering we're talking about Boeing, who actually do the work Spirit does for them for the 737 for other airframes (making the fuselages), and used to do it in house for the 737 before too (in that same Wichita factory complex where Spirit is now, Spirit being the result of Boeing's monumentally stupid decision to divest), I think it's fair to say that Boeing can bring production back in house.


These sorts of what I like to call "reverse monopolies" are all over the place in US industry now. So many companies dumped out pieces of themselves to a single company that then supplies for all of them, removing the ability of those companies to actually compete with each other by differentiation.


It sure would be great if they just owned the supplier outright and could align incentives that way.


That you Dave Calhoun? I guess your replacement will figure it out and let us know.

Seriously though, that's a problem people getting paid sh* tons of money are supposed to be on top of.


Great link!

I like how that comment is from an anonymous source, but now that the NTSB preliminary report is out, it seems thoroughly corroborated to me. The dates of certain events and the reason for the door's removal—er, "opening"—both match the comment in your link.

Thanks.


So were the bolts missing because the Spirit team did not know they had to be put back (i.e. it was not recorded as a task that needed to be done) or was that simply just another mistake in the long line of mistakes they've made?

Reading through that post gave me nightmares of dealing with outsourcing software teams where you send them a small issue to fix and their fix breaks 3 existing items.


As things stand, it seems the anon insider report from 3 weeks back was legitimate and accurate, and holds answers to your question: https://leehamnews.com/2024/01/15/unplanned-removal-installa...


I read that but could not work out which of my two statements is true. Might just be bad reading comprehension on my side...


From my understanding, both Boeing and Spirit have been employing very shoddy practices, but at that point in the production line it's Boeing who is responsible for QA


> were the bolts missing because the Spirit team did not know they had to be put back

It would be interesting to know what ultimately happened to the bolts if they indeed were removed.

When I disassemble something, I do as I was taught in my mechanical class in high school, and always keep all the parts I take off in the same box. If I am left with some extra bolts when finished, that would be a worrisome sign.


When I was a teen, I rebuilt the engine on my first car. When done, there was a spring left over. I had no idea what it was for, the engine ran fine.

So I drove it around. The oil pressure was very low, but I figured it was just a broken gauge.

Then, the engine got way, way too hot.

It turns out, the spring was for the pressure regulator from the oil pump. The oil was pumped out of the pan right back into the pan. The engine needed to be rebuilt again.


I took it to mean that someone forgot but the mistake would have been caught if the removal was documented in the authoritative record system, since that kicks off an automatic workflow where QA is notified and must sign off.

But they didn’t for whatever reason so two mistakes stacked.
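
A minimal sketch of that "two mistakes stacked" point, assuming a system of record where logging a removal automatically opens a QA item. The class names, the "removal"/"rework" labels, and the release rule are all hypothetical illustrations, not Boeing's actual CMES behavior:

```python
from dataclasses import dataclass, field


@dataclass
class WorkRecord:
    """Hypothetical system-of-record entry; not an actual CMES schema."""
    description: str
    kind: str                      # "rework" or "removal" (illustrative labels)
    qa_signed_off: bool = False


@dataclass
class SystemOfRecord:
    records: list = field(default_factory=list)

    def log(self, description, kind):
        record = WorkRecord(description, kind)
        self.records.append(record)
        return record

    def open_qa_items(self):
        # Assumed rule: every logged removal needs a QA sign-off afterwards.
        return [r for r in self.records
                if r.kind == "removal" and not r.qa_signed_off]

    def can_release(self):
        return not self.open_qa_items()


sor = SystemOfRecord()
sor.log("rivet rework near left MED plug", "rework")
# Nothing in the record says the plug was touched, so nothing blocks release.
undocumented_release = sor.can_release()

sor.log("left MED plug removed for rivet access", "removal")
# Once the removal is logged, release is blocked until QA signs off.
documented_release = sor.can_release()
```

If the removal never reaches the record, the check that would have caught the forgotten bolts is never created in the first place.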


Just comment out the tests and they pass


Instead of Software Development becoming more like Aeronautical Engineering, every day, Aeronautical Engineering becomes more like Software Development...


Almost every software engineering book:

Halting on an error is often best, as is re-raising the error after catching it, unless you're certain it should be suppressed, or you're aware of and expect the issue and the risk is limited.

Every dev project:

Hold up, couldn't this raise an exception? That's bad!


Well, to be clear, I don't usually put people in danger. There's a lot these engineers could learn from software engineers: who are really the highest performance at engineering as a craft. One base rule: Above all, kill no one.


I think that software engineers put people in danger more often than they would like to believe; have a look at the UK Postal Service scandal for a great example of a seemingly innocuous bit of software absolutely destroying numerous lives.


Neither the software nor the software engineers were the ones who destroyed people's lives. That's on managers of the software company, managers of the postal service, prosecutors and judges who all conspired to hide the truth and condemn innocent people. I believe software engineers actually testified that the software had bugs, and the prosecutors hid that particular testimony.


Indeed. And yet immediately outmatched by the number of lives destroyed by motor vehicles (probably per day).



I’m trying to convince a group of devs that catching (and ignoring) every exception is not actually a solution to their crash bugs.
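
For anyone who wants something concrete to show that group, here is a small sketch of the difference. `parse_torque` and the "torque reading" framing are made up for illustration; the point is only that the swallowed error comes back as a plausible-looking value:

```python
def parse_torque(raw):
    """Hypothetical helper: convert a reading to float, raising on bad input."""
    return float(raw)


def read_swallowing(raw):
    # Anti-pattern: catch everything and carry on with a made-up default.
    try:
        return parse_torque(raw)
    except Exception:
        return 0.0


def read_strict(raw):
    # Catch only the expected failure, add context, and re-raise.
    try:
        return parse_torque(raw)
    except ValueError as exc:
        raise ValueError(f"bad torque reading: {raw!r}") from exc


# The crash is gone, but so is any sign that the input was garbage.
swallowed = read_swallowing("n/a")

try:
    read_strict("n/a")
    strict_failed = False
except ValueError:
    strict_failed = True
```

The swallowing version "fixes" the crash by returning a value that cannot be told apart from a genuine zero reading, which is usually worse than the crash.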


Ya know.. you can also just do that and then change the requirements:

> In a revision to the Flight Crew Operations Manual, issued on January 15, 2024, Boeing confirmed that the door functioned as designed.

Problem solved for the current level in the hierarchy.


The same group I’m talking about did exactly that: they got an exemption for any security issues from the CISO.

Literally turning a blind eye, but officially, so that’s okay!


MBAs prioritizing the bottom line over engineering culture.


Also I’m kinda surprised an airline wouldn’t inspect a newly acquired plane before putting customers in it.


The checklist for that is very different from what the manufacturer would do. It would take a lot of time and therefore be very costly to verify every single bolt, rivet and weld was done correctly.


To put this in perspective, every jet airliner has millions of these joints on it.


Looks like the anonymous whistleblower on Airline Pilot Central Forums [0] was legit.

[0]: https://www.airlinepilotforums.com/safety/146074-boeing-inte...


I'm guessing twice as much corporate effort is currently being deployed to find this guy than is being applied to fixing Boeing's quality system.


A lot of comments here are going on about process, as if humans are mindless and otherwise perfectly controllable robots...

I'm going to be contrarian and say that this is exactly the sort of thing that happens when you train humans to be robots: They lose all signs of common sense and critical thinking, and what's worse is that on top of that, they'll still have their inherent imperfection. Normally the former would counteract the latter, but not if you only make them rigidly follow some process all the time. They stop thinking about what they're doing. They stop paying attention to all the other things in their environment they would've noticed, and even if they do, they won't question it because they'll just assume someone else also following a rigid process will take care of it. They won't think "this door plug should've been bolted in place now that the work that needed it opened is done, but where are the bolts?"

I'm not saying to throw out all the process and make them figure everything out, but I think there has to be a balance, similar to how overautomation and reliance on that has also led to avoidable incidents in aviation.


Nope.

The process was there so that the people would know there was work being done on the doors despite not being there for it. If you see unfinished work from a previous shift, it does not mean you can start messing with it - there might be context you do not know.

Which is why such things are supposed to be noted in appropriate ways. Similarly why aviation has so many procedures everywhere - because we know and understand that sometimes you miss things. For any human reason, not just mismanagement. The process is a way to have reliable place to double check with.

This is different from over-reliance on automation, which is arguably less an issue of automation itself (it's just more visible in such areas) than of falling out of training because you do not encounter certain things so often. 96 people died because, among a stream of many deviations, the crew had never trained to do an IFR landing without ILS, autopilot or no autopilot.

The process is the part that says "yeah, I haven't done this in a long time, I need to train, here is documentation that provides we need to do it and can't delay".

Similarly CMES is supposed to track "work was done on this part of the ticket, now different work needs to be done, do not assume it will be done by other teams"


Potentially the result when you rob workers of the right to pride in workmanship. The most common complaint from old Boeing people who have left is that after the merger the McDonnell-Douglas people took over and the company switched from pride in engineering and quality of workmanship to cost cutting and bean counting. Also, shortly after the merger the corporate HQ moved, reflecting the priorities of the CEO. It has since moved again, apparently to be better for lobbying.


The original move was also related to lobbying potential.

But the CEO who started the moves etc. was a Boeing lifer.


The original move to Chicago was for lobbying? I figured it was more of moving where the CEO wanted to live.


No, the reasoning officially was to move closer to acquired companies and better lobbying especially for defense involvement


yep but then the jack welch acolytes moved in and exponentially worsened it


Many suggest that Phil Condit got seduced by Welch school late on...


Summary: Fuselage was delivered to Boeing with some damaged rivets near the door plug. They had to remove the door plug to fix the rivets. Then they reattached the door plug but forgot to reattach the 4 bolts that would keep it in place. Possibly because of a shift change at the plant.

There was noticeable damage to the door plug's mechanical fittings from the violence of it being blown out of the plane. But the holes where the bolts belonged were pristine. That would not have been true if the holes had had bolts in them.


> The accident airplane was required to be equipped with a CVR that retained, at minimum, the last 2 hours of audio information, including flight crew communications and other sounds inside the cockpit.

>The CVR was downloaded successfully; however, it was determined that the audio from the accident flight had been overwritten. The CVR circuit breaker had not been manually deactivated after the airplane landed following the accident in time to preserve the accident flight recording.

Classic. If they use CD quality audio at 1411kbps, they can store 2 hours of audio in about 1.2 GB. Given how cheap flash is these days, why not 20x that so that we don't have to rely on people pulling circuit breakers after accidents? If there's some concern about robustness and recertification, why not require all aircraft to carry two CVRs, one of the old "robust" style for kinetic accidents, and one that's less robust but has 20x the capacity, so we can record a full day after less violent accidents?
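
A quick check of that arithmetic, assuming uncompressed CD-quality stereo (44,100 Hz, 16 bits, 2 channels) and decimal gigabytes. Real CVRs record several channels at whatever quality the applicable standard requires, so this only sanity-checks the numbers in the comment:

```python
BITRATE_BPS = 1_411_200        # CD audio: 44,100 Hz x 16 bits x 2 channels
SECONDS_PER_HOUR = 3600


def storage_gb(hours, bitrate_bps=BITRATE_BPS):
    """Uncompressed storage in decimal gigabytes (10**9 bytes)."""
    return bitrate_bps / 8 * hours * SECONDS_PER_HOUR / 1e9


two_hours = storage_gb(2)      # about 1.27 GB
forty_hours = storage_gb(40)   # 20x the capacity, about 25.4 GB
```

So 2 hours comes to roughly 1.27 GB and 20x that is roughly 25 GB, before any compression.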


The largest US pilots union opposes it on pilot privacy grounds. (To be clear, I think having an expectation of vocal privacy while you are in charge of an airliner is absurd.)


Well, there is the theory and then there is the reality.

Theory: having less privacy makes things easier for accident investigators, post-mortem.

Reality: In this case, the pilots did their job and got the plane down safely despite rapid depressurization and literally having their headsets sucked off of their heads. It is extremely unlikely to be pilot-error that a door-plug ripped off the airframe at 16,000' or that investigators would learn anything significant from the process in the flight-deck before or after the incident. At least nothing that would root-cause this incident.


That is a non-sequitur. Investigators should have access to accident data regardless of whether the pilots did their job.

Root cause analysis isn't the only reason: it would be good for pilots to have this case study, as well as analysis on how systems responded to the abrupt change.

Having this data is strictly better than not having it.


> Root cause analysis isn't the only reason: it would be good for pilots to have this case study, as well as analysis on how systems responded to the abrupt change.

Yep, could be used as a "this is exactly what you do in this scenario" example for future pilots, or a "what did they do wrong" type real-world exercise for pilots to review (with no blame given to the OG pilots in this scenario for example).


Middle ground would be to have full media access for investigators, but a union rep managing a review and redaction process to have anything immaterial to the investigation redacted. This preserves both valuable data and privacy. Checks and balances.


I'd also propose that someone on the board has to be criminally responsible in case of any abuse. With asset forfeiture on the books.


That the plane landed safely is not an indication that every part of the post-incident process went well.

There could be steps that weren't followed, there could have been training gaps. There could have been secondary impacts of sudden depressurization that could have spiraled out of control, but the pilots thought on their feet to save the plane. We'd want to know exactly what they did so we could add it to the recovery process.


Your comment would only make sense if your example of reality showed the theory was flawed. However, your example of reality is unrelated to the theory, so not sure what your point is.


Privacy from an NTSB accident investigation is absurd. Privacy from your boss snooping you is reasonable.


I think there's some validity to the privacy concerns, but it seems those could be addressed with proper access controls and rules. The recordings should only really be listened to in the aftermath of an accident, in which case, as you say, the expectation of privacy should (in my opinion) take a backseat.


On one hand I agree with you.

On the other hand, if someone recorded my whole work day every day I would not be happy. I don't think you would stay at your job if that was a condition of it.

There has to be a better solution to this issue.. extended recordings in an emergency, triggers based on conditions, private keys for pilots... IDFK, cause I try not to get involved in engineering that might KILL someone.


> if someone recorded my whole work day every day I would not be happy

I'd be perfectly fine with not voice recording people whose daily work may or may not impact the global delivery of cat pictures.

I think if you choose a job where there are several hundred people's lives on the line relying on you doing your job professionally and correctly, the expectation of privacy argument is somewhat less convincing.

> I don't think you would stay at your job if that was a condition of it.

Some people literally have no choice. Do you think _any_ Amazon delivery driver is "happy" with their on-job surveillance? Do you think _any_ call centre worker is "happy" with "calls are recorded for quality and training purposes"?


I don’t disagree with you but we can’t ignore the fact that it’s much easier to find delivery drivers and call center workers than airline pilots.

As everything, it’s all about the leverage each side has.


> if someone recorded my whole work day every day I would not be happy

Many people live with this reality every day already. Remote workers with screen sharing software, certs installed so companies can spy on everything you do, retail workers under cameras all day.


Those people usually can take breaks away from the recorder.


I don't understand. Are you implying that recording a pilot's voice for more than two hours could kill someone? Or just that aviation is stressful and high stakes?

(I agree that it's stressful and high stakes, which is why we record it.)


> I don't think you would stay at your job if that was a condition of it.

I don't think I would care. Especially if it is only read out very infrequently (when we have an accident.)


Why not? Do you think other engineers are better suited for such work?

I’m just curious, because I personally work on things that could kill people directly or indirectly.


LOL:

I like the fact that I can say "People may have acted like someone was going to die, but my code never killed anyone"... it's a preference, I want to know I can have a bad day, fuck up, and not have to carry the weight for my whole life.


I was just asking.

> it's a preference

Thanks.


Indeed. Pretty much all your communication is recorded at any company you work for anyway.


My work does not have a recording of most of my verbal communication in office, and it’s a very secure site and project.


Make it so they don't have reasonable suspicion that CVR data will be abused by the company, and you might get somewhere. For example, criminal consequences for misuse that hit the C-level and possible leakers.

Sincerely, someone who received death threats partly thanks to a manipulated audio recording that was made in good faith during an investigation, then leaked and edited by a third party who gained access to it 5 years later.


Reminds me of an exchange from Stranger Things (S4E3):

School counselor: Max, I'm… I'm sorry, I… I really can't discuss this. You wouldn't want me talking to any other students about you, right?

Max: If I were dead and it would help catch the killer, then yeah, I most definitely would.

https://subslikescript.com/series/Stranger_Things-4574334/se...


My wife ridicules me because when we went out to eat, before a multitude of children, I would often say “nobody ever tipped me as a meat clerk when I was working in 45 degrees, elbows deep, throwing away and scraping rotting meat from the shelves and gutters and then serving ‘fresh shrimp’ and organic grass-fed filet mignon” when I felt expected to tip 20% for an already overpriced meal.

As my first boss, meat clerk young lady, told me “shit rolls down hill.” More powerful people tend to get shitted on less. It was a motivation to move up.

But I still think it’s shitting on people to expect or accept constant recording of every mundane thing while awaiting the exceptional [screw up]. Pilots are more powerful than Amazon warehouse workers but recording every breath, every whisper, every fart is undoubtedly shit in a warehouse or a cockpit or an operating room.

Then again, the only way I could accept it is if everyone is recorded all the time and it was all public or at least FOIA-able for many people. Especially the government and universities and Wall Street, otherwise it’s just a way to control and hang things over people’s heads.

As to the tipping grumpiness I grew up partly in the 3rd world where tipping 50 cents was a great tip and I’m cheap and didn’t/don’t make tech bro money. I found the ultimate solution was to just not eat out so much except for truly special occasions. I’m sure there’s a lesson in there too somewhere.


Unfortunately people will take recordings out of context, edit them, use unrelated pieces for other means (maybe the pilots shit-talked CEO who started pushing employees into contracting?) etc.


are you not aware that the server's means of sustenance comes out of those tips?


Define. What EXACTLY does a pilot need privacy for?


[flagged]


I (person who blamed the pilots union above) actually like unions. I could probably even say good things about pilot unions; I would say that part of the reason US airlines have fewer accidents than some other wealthy countries is the effect the unions have had on resisting attempts to work pilots through dangerous levels of fatigue, and on ensuring pilots can report dangerous situations and have them comprehensively fixed without retaliation.

I don't have faith in "market forces" to do those things, and consider the state of aviation in some other countries to be a living experiment showing why.

The opposition to cockpit recording is bonkers, though.


It's not bonkers. Notably, the pilot's union is not trying to turn back the current status quo of 2 hours recorded, they are blocking a significantly longer recording system. Such a system would inevitably record a few (probably) irrelevant flights worth of pilot chit chat and banter.

Has anyone tried, oh IDK, compromising? Maybe instead of 24 hours, we do 6? Or up to 24 hours but only of the "current" flight that matters to the current event? It's a negotiation. The Pilot's union doesn't have veto authority on safety. Sure they can threaten to unanimously strike if it's passed and that might be a big threat to airlines since they have basically stopped investing in the funnel of new pilots, preferring instead to pay everyone food stamp wages and drag themselves through the obvious """Pilot Shortage""" that results when a job that costs $40k to get only pays $25k a year.


HN isn't a monolith, and it certainly has better representation of anti-union sentiment (ie. they don't all get downvoted to oblivion) compared to other discussion forums (eg. reddit).


Do you have a voice recording of you doing your entire job, every day of your life?


No, but I'm also not driving hundreds of souls around near mach 1 strapped to 100k gallons of jet fuel. And when I've worked in government environments I had escorts watching my screen like a hawk the entire time.

Not to mention the tapes are only pulled if there's an incident. You could even have a little tamper seal on it to show if it's been downloaded. This is absurd.


See also: police bodycams

If you have the capacity to end people's lives with an arm spasm I think your privacy should rightfully take a backseat.


Absolutely not. Body cams mute the first part of the audio for this exact reason. Privacy is important.


The issue with police body cam audio is that they are regularly recording non-police who do have a right to privacy. That's not an issue for pilot cockpit recordings. (If it is, you've got an incident that should be recorded.)

The muting you observe of police footage isn't of the first part of the audio, it's the prior 30 seconds from before the record button is pressed. They have a constant buffer going, as things can happen... unexpectedly.

This caught a cop in Baltimore; he wasn't aware of or had forgotten the feature. The 30 second buffer caught him planting drugs, then faking the finding. https://www.npr.org/sections/thetwo-way/2017/07/20/538279258...

Side note: It took years to charge him (https://www.baltimoresun.com/2020/03/09/caught-fabricating-e...) and he served no jail time for trying to send an innocent person to jail (https://www.wbaltv.com/article/officer-testifies-in-own-defe...).


> The muting you observe of police footage isn't of the first part of the audio, it's the prior 30 seconds from before the record button is pressed. They have a constant buffer going, as things can happen... unexpectedly.

I just want to clarify that it only buffers the video. The way you worded it still doesn't explain why the previous 30 seconds of audio isn't included in the buffering.

When the button is pressed is when audio recording is started and the previous 30 seconds of video buffer is prepended to the live recording.
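
A toy model of that behavior, going only by the description in this thread (30 seconds of video buffered continuously, audio captured only from the button press). The class and method names are invented for illustration and say nothing about how any real body camera firmware works:

```python
from collections import deque


class PreEventCamera:
    """Toy model: video is always buffered, audio only after the button press."""

    def __init__(self, buffer_seconds=30):
        # Bounded buffer: once full, the oldest second of video falls off.
        self.video_buffer = deque(maxlen=buffer_seconds)
        self.recording = False
        self.clip = []

    def tick(self, video, audio):
        """Feed one second of (video, audio) from the sensors."""
        if self.recording:
            self.clip.append((video, audio))
        else:
            self.video_buffer.append(video)   # audio is discarded, never stored

    def press_record(self):
        # Prepend the buffered video with no audio, then go live.
        self.clip = [(v, None) for v in self.video_buffer]
        self.video_buffer.clear()
        self.recording = True


cam = PreEventCamera(buffer_seconds=30)
for t in range(100):
    cam.tick(f"v{t}", f"a{t}")
cam.press_record()
for t in range(100, 105):
    cam.tick(f"v{t}", f"a{t}")
```

In this model the saved clip starts 30 seconds before the press with silent video, and audio only exists from the press onward, which matches the "muted first part" people notice.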


If my job involved taking the lives of hundreds of humans in my hands, then I would expect that, at least during the hours in which said lives are my responsibility.


IMHO surgical theatres should have permanent multi-perspective cameras recording everything for the same reason.


Only if the patient consents (or their family if they are unable to give legal consent), otherwise no for patient privacy.


Patients are typically covered with a cloth, head to toe.


There you also have patient privacy to take into account.


This is the reality for a large number of truck drivers who bear a significantly lower responsibility.


There are software engineer jobs where you need to keep your camera on during work hours to show you are in your seat.


These jobs do not attract the best software engineers.


They could with sufficient pay.


The intersection between employers who demand to film you being in a chair and employers who shower their employees with substantial lucre is the null set.


I just watched a youtube video on how a person looking for editing jobs had some pretty poor working conditions with terrible pay. A very controlling boss, asking him to edit on an old x86 macbook because he was told it was 'for creators'. The guy mentioned he had a beast machine at home he could edit on remotely and the person told him "do you want to edit?". The boss would not even provide him a mouse; he had to edit by trackpad.

He walked out around noon. The boss asked him to come back for an extra $20 that day.


I doubt the jobs where you don't enjoy any level of trust are the ones where you get paid well or get any kind of dignified treatment.

I recently saw a job ad for a JavaScript specialist where the position entailed having screenshots and keyboard + mouse tracking to monitor your working hours. It was a freelancer position, so the hire would handle taxes and health insurance, no equipment would be provided and working hours would start at 08:00 German time sharp for at least nine hours or until you "finish the daily tasks". Pay would however be for 189 hours per month, no compensation for sick leave/holidays/vacation, and you'd be paid via upwork.com (with you paying Upwork's fees) in US dollars.


I'm pretty sure any place doing that is not going to offer sufficient pay.


What is your point? We were discussing when pilots should be expected to be recorded in the cockpit for privacy vs safety. I mentioned there are software engineer jobs where you have to keep the camera on all day.

There are jobs where you are expected to keep the camera on and there are programmers who accept those work terms.


Yeah, but nobody applies for those.


The people posting on Reddit would disprove your point. Likely because they do not tell you about that upfront and say it is a small thing.


My job does not involve direct responsibility for the immediate life-or-death of hundreds of lives.


Well, I do have a Git repo that tracks every meaningful change and action that I've done at my job since inception.


I did when I worked retail, and while I worked food service.


Hundreds of people don't die when I screw up.


No but I also don't have the lives of 300 people in my hands.


Pilots can have the lives of quite a lot more than that on their hands since an airplane makes for a great kinetic weapon. The pilots of KLM Flight 4805 took the lives of almost 600 people.


Many people do. Depends on the job.


The rule (edit: in Europe) is now 25 hours for aircraft over a certain weight, though it is not (currently) retroactively applied to existing equipment.

https://www.federalregister.gov/documents/2023/12/04/2023-26...


That document is an in-progress proposal to amend a rule, no? I think there was strong opposition to this rule before this accident flight, and the blowback from the missing data here might be strong enough to be able to get it passed anyway.


> the blowback from the missing data here might be strong enough to be able to get it passed anyway

Nice pun.

What do you think would have been gained from the CVR data in this case? Do you think pilot error had anything to do with the door-plug failure? Do you think the CVR was left running on purpose/accident?

If I were one of those pilots, the first words out of my mouth probably would have been, "what the $&#*?!" followed by whatever procedure had been drilled into me for rapid depressurization. Given the scenario, I wouldn't lose any sleep over forgetting to shut off the CVR in the mess of getting everyone to safety.


I'm not an accident investigator and don't know what exactly would turn out to be useful, but I think changing your intuition for why we study the CVR away from "because there might have been a large pilot error" to "so that we can learn more about how pilots react to emergencies with a goal of seeing if we can come up with process improvements" may help. If there was some aspect of the response that was not perfect, we could develop training on it for other pilots, right?


That's not what is at stake here though. CVRs are not intended for improving process like a call-center recorded line. "Both recorders are installed to help reconstruct the events leading to an aircraft accident." [ntsb.gov]

This creep of intended-use is exactly why many people oppose surveillance in the first place.


I don't understand. You're saying that the purpose of cockpit voice recorders is not to improve aviation safety via allowing a thorough investigation of accidents? If there is any other purpose, I don't know what it would be.


Companies salivating to possibly get more dirt on people as part of "performance improvement".


You don't need to necessarily be looking for pilot error to want the recording. Maybe it picked up the sound of the plug separating and that could be useful. Maybe it records an alarm, a call from the cabin, whatever. Maybe the way they work the checklist for decompression reveals some problem that should lead to a change in the checklist. Maybe it corroborates or disagrees with the FDR.

Of course I think it's most likely that it wouldn't be that relevant in this particular case.


It is the rule in Europe, which is mentioned in (II)(C) in the link. I failed to link it properly.


>ACTION:

>Notice of proposed rulemaking (NPRM).

>[...]

>DATES:

>Send comments on or before February 2, 2024.

Seems like it's a proposal, and not actually enacted yet?


It was enacted in 2021 for some aircraft. Not sure what that proposal changes; it might expand the rule to more.


  > Given how cheap flash is these days
How cheap is flash that will survive a sudden stop from 400mph to 0 mph in no seconds flat, will survive a post-crash fire, and/or submersion for years in salt water?

Flash data retention at high temps is TERRIBLE (and gets worse for MLC/TLC/etc), see any flash datasheet. It is NOT nearly as simple a problem as you might think.

Yes, it is a solvable problem, but please do not dismiss it so outright as "trivial"


This isn't a technical limitation though, the European standard for airplanes newer than 2021 is in fact 25 hours [1].

[1] https://mentourpilot.com/who-doesnt-want-25-hour-cockpit-voi...


I don't think the problem you're describing is actually a problem.

Exposure to super-high temps occurs in a small set of circumstances, all of which overlap with the destruction of the recording device and the cessation of incoming data. So we only need the same 1.2GB (or whatever) of high-temperature-tolerant storage.

The 25 hour storage can be on normal flash, as if we're more than 2 hours past the incident and data is continuing to come in, then the incident of interest did not destroy the airplane, and the flash will have remained within its normal operating parameters.
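
A minimal sketch of that two-tier idea, assuming one sample per minute and two independent loops: a 2-hour loop standing in for the hardened storage and a 25-hour loop standing in for ordinary flash. The class name, the per-minute granularity, and the string payloads are all hypothetical; real recorders are not built this way, this only illustrates the retention argument.

```python
from collections import deque


class TwoTierRecorder:
    """Toy model: a short hardened loop plus a longer ordinary-flash loop."""

    def __init__(self, hardened_minutes=120, extended_minutes=25 * 60):
        # A deque with maxlen silently drops its oldest entry when full,
        # which is the "loop" behaviour being discussed.
        self.hardened = deque(maxlen=hardened_minutes)
        self.extended = deque(maxlen=extended_minutes)

    def record(self, minute, audio):
        # Every sample is written to both tiers.
        sample = (minute, audio)
        self.hardened.append(sample)
        self.extended.append(sample)

    @staticmethod
    def oldest_minute(tier):
        # Earliest minute still retained in a tier, or None if empty.
        return tier[0][0] if tier else None


rec = TwoTierRecorder()
for minute in range(300):  # five hours of continuous recording
    rec.record(minute, f"audio-{minute}")

print(rec.oldest_minute(rec.hardened))  # 180
print(rec.oldest_minute(rec.extended))  # 0
```

After five hours the short loop has already lost the first three hours, while the long loop still holds everything. If the airframe is destroyed, only the short loop has to survive, and by then it contains the final two hours anyway.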


Multiple investigations in the past have recovered data from FDR and/or CVR after an extensive high-temperature fire. I do not think that FAA will give that requirement up.


Yes. As I said. The existing system can remain in place, with all of its existing high-temperature-tolerant components.

In addition to not giving up that requirement, we could also add a longer, not-heat-tolerant storage. If it gets destroyed in a fire, see the above paragraph. If there is an incident where the data is of interest and the aircraft is not destroyed in a fire, then this will maintain the data long after the above system has deleted it.

No one has advocated giving up the high-temperature storage.


What you described is not a data retention problem at all.

It's a material science problem, and other forms of media are affected by high temperatures and physical deformation just as much as flash if not more.


I often wonder why we still rely so heavily on local storage when in-flight Internet exists. Flight data could be streamed in real time to the cloud for redundancy.



Read it more carefully.


The piece of hardware that was chosen for the avionics-adjacent software I was working on was chosen before any software was written, which was 3 years before the plane was 'supposed' to fly, and 5 years before anyone sane expected it to be in service.

Irritatingly, they didn't even pick the top-of-the-line machine from the vendor at that time. They picked a middling one. And then put an LTS OS version on it that didn't fully support the motherboard chipset. I spent way, way too much time and energy trying to get the software to run on the sort of timescales necessary. It took me months to get anyone to let me talk to the vendor in order to sort out the fact that the storage was being run in legacy PATA mode, reducing our IO throughput by an order of magnitude and the application throughput by about a third.

Ten minutes on the phone and I got them to agree to give us a patch that aliased the chipset to one it was backward compatible with, that was actually supported by the OS. But they really wanted us to take the newer version of the OS that didn't have this problem.

That's not even the most hardware-crippled I'd ever been, but it was top three.


What’s missing from this accident investigation without the recording?


ability to somehow claim pilot error. That's what.


They should also have a video recorder on a 2 hour loop. Many difficult investigations would have been easy if the investigators could see what the instruments were showing and what the crew was doing. And even, who exactly was in the pilot's seat!


>[evidences] indicate that the four bolts that prevent upward movement of the MED plug were missing before the MED plug moved upward off the stop pads.

Ok

>Photos from the interior repair that show the lack of bolts

Huh. Well that's conclusive.


Photos may not have been after conclusion of repair so technically not conclusive. However it certainly lines up.


Not the photos, but the fact that the bolts are missing and there are no witness marks indicating they were present when the door ejected.


The report makes it clear that the photo was taken after the rivet rework was complete, immediately before restoration of the interior


Correct, however again the photo doesn't prove that the next step wasn't putting in the locking bolts.

I don't disagree with the conclusion, but the photo merely *supports* the conclusion, rather than proving it.


> The flight crew reported that the cockpit door had opened during the depressurization event. In a revision to the Flight Crew Operations Manual, issued on January 15, 2024, Boeing confirmed that the door functioned as designed.

Interesting for terrorists. Cause a rapid decompression, and get easy access to the cockpit.


Causing rapid decompression is quite hard. Opening a normal door is very difficult during flight except at very low altitudes.


If they cause a rapid decompression they incapacitate themselves and won't be able to use the cockpit.

Also how do you cause a rapid decompression without a gun of some kind?


They might as well blow the whole thing up instead of going for such half measures. A sufficiently rapid decompression may not be reliable or predictable enough to pull off a "so that I can easily control the plane afterward" kind of feat.


yeah, this is a significant hole in the locked cockpit door plan.


If people are looking for additional in-depth reading on how this happened, The Air Current did a great write-up on this systemic mistake using internal Boeing sources a month ago that the NTSB report fully supports: https://theaircurrent.com/aviation-safety/127-days-the-anato...


I wonder how close the door plug was to hitting the tailplane or vertical stabilizer/rudder?


Considering all those scary scenarios, what happened was probably the most favourable outcome. It could have been a major disaster in a hundred different ways.


It sure could have -- the plane was still climbing which puts that plug door almost directly in line with the horizontal stabilizers;

https://i.cbc.ca/1.7077373.1704733027!/fileImage/httpImage/g...


Not as much when taking into account the actual direction of the wind, which would be more in line with the aircraft.


Depressurization happened around 17:12:33 PST but the aircraft continued to climb until 17:13:41 PST, and the autopilot was configured for 10k ft at 17:13:56 PST. Why did it take the pilots a full minute to begin an emergency descent after the failure? I would expect that the nature of the accident would be clear nearly immediately, at least in the need to descend the aircraft.
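
For reference, the gaps between those timestamps can be worked out directly. This is only arithmetic on the three times quoted above; the variable names are mine.

```python
from datetime import datetime

FMT = "%H:%M:%S"
depressurization = datetime.strptime("17:12:33", FMT)
climb_ended = datetime.strptime("17:13:41", FMT)
autopilot_set = datetime.strptime("17:13:56", FMT)

# Seconds from the depressurization to each later event.
climb_gap = (climb_ended - depressurization).total_seconds()
autopilot_gap = (autopilot_set - depressurization).total_seconds()

print(climb_gap, autopilot_gap)  # 68.0 83.0
```

So the climb continued for 68 seconds, and the autopilot was set for 10k ft 83 seconds after the event.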


According to the plane's "memory items" [1] in response to a cabin altitude warning or rapid depressurization, pilots must:

OXYGEN MASKS - DON

OXYGEN REGULATORS - Set to 100%

CREW COMMUNICATIONS - ESTABLISH

PRESSURIZATION MODE SELECTOR - MAN AC/MAN

OUTFLOW VALVE SWITCH - CLOSE

Hold in CLOSE until outflow Valve indicates fully closed

If Pressurization is Not Controllable

PASSENGER SIGNS - ON

PASSENGER OXYGEN SWITCH - ON

EMERGENCY DESCENT - ANNOUNCE

The pilot flying will advise the cabin crew, on PA system, of impending rapid descent. The pilot monitoring will advise ATC and obtain area altimeter setting.

PASSENGERS SIGN - ON

DESCENT - INITIATE

I do giggle a little at the thought of a door flying off, the air rushing out of the cabin, and the pilots responding by switching the seatbelt light on.

The plane was only at 16,000 feet when it lost its door and according to [2] you've got 20-30 minutes of 'useful consciousness' at such an altitude, even without your oxygen mask on. So there was no need for an abrupt dive.

[1] https://www.theairlinepilots.com/forumarchive/b737/b737memor... [2] https://skybrary.aero/articles/time-useful-consciousness


A minute is a long time when you're sitting at your computer. But after the sudden depressurization, I imagine the pilot is focused first on making sure he has complete control of the airplane, assessing the situation, running checklists. Besides, 10K is just barely above the normal pressurization altitude anyway, it doesn't pose an immediate risk to the passengers that justifies just nosediving towards the ground. Especially given how much air traffic is at lower altitudes that close to PDX.

Edit: Re-reading, it was more like 16K feet when it popped, 10K is what ATC assigned them when requested. Still low enough not to be a critical emergency. Some people absolutely will get altitude sickness at that level, but it's likely to be mild. Many people climb mountains much taller.


First and most importantly the pilots have to get their own oxygen masks on, if they delay this at all they will become hypoxic, unable to think clearly, then pass out, and then it's over for all on board.


Agreed, this is definitely the first order of business. Similarly for passengers, always put your own mask on first before helping anyone else near you. Can't help if you pass out.


See Helios Flight 522 for how that ends when the pilots don't realise what's going on and don't put on their oxygen masks.


You follow the procedure because in an emergency you don't know what is going wrong.

Better to climb for a bit more as you get your oxygen mask on than to try to descend immediately and make some problem worse.

We know it was a door plug blowing out, but in the past it has been entire major sections of the airframe ripping off, in which case sudden extra stresses are not what you want.


Let's assume that they changed course once they were sure they could do so.

Pilots can't look at the rear view mirror, and see the whole plane. Accident reports on engine malfunctions routinely mention that someone had to check their appearance through the passenger window, and relay that to the pilots. In case of a blast so severe that the door flies off, it is safer to assume the worst. Say, that part of the plane disintegrated because of sudden collision. In such conditions, indications on what works and what doesn't are probably really messy and unreliable, and there can be not enough means to control the plane properly. Lower the nose too much, and you might not be able to pull it up any more.

Pilots probably did checklists with one eye on the instruments to check that they were not losing speed, that angles were correct, autopilot inputs resulted in stable flying, and so on, and deduced that everything still worked. By that time, they were probably informed that the plane was seemingly intact, although with a hole in its side.


Step one is to put on the oxygen mask and establish communications. After the startle factor, the masks being put on, then declaring an emergency, a minute really isn’t that long.


> I would expect that the nature of the accident would be clear nearly immediately

Not really. The cockpit door was blown open, and the pilots' headsets were blown off. It was a pretty chaotic event, and when you are flying an airplane, you definitely don't want to figuratively "jerk the wheel" - you remain calm and start running checklists.


You can't leave your assigned altitude/trajectory without coordinating with ATC. Otherwise you may collide with another plane, which would make a bad situation worse.


Sure you can, and they likely did start descending before contacting ATC. But before they did any of that, they had to spend some time donning their oxygen masks and doing a few other "memory items" before then descending.


In this instance the report explicitly says:-

> Both flight crew said they immediately donned their oxygen masks. They added that the flight deck door was blown open and that it was very noisy and difficult to communicate.

> The flight crew immediately contacted air traffic control (ATC), declared an emergency, and requested a lower altitude. The flight was assigned 10,000 ft. The captain said he then requested the rapid decompression checklist, and the FO executed the required checklist from the Quick Reference Handbook (QRH). As the FO completed the checklist, the captain flew the airplane as they coordinated with ATC to return to the PDX airport. The flight landed on runway 28L without further incident and taxied to the gate.

So in this particular instance, when the depressurisation happened at a comparatively low altitude, the pilots did get ATC clearance before descending.


In an emergency you can do anything you think is necessary to address it. Source: I’m a private pilot.


You're conflating the right to do something with whether it is advisable to do something.

Sure, you are ~allowed~ to begin an immediate descent in an emergency, but it is not a good idea considering from the pilot's perspective, the bang is most likely an engine going out and altitude is always your friend in this condition.


The person everyone is replying to stated:

> You can't leave your assigned altitude/trajectory without coordinating with ATC.

They didn't say it was inadvisable; they heavily implied it was prohibited.


The rule is "Aviate, Navigate, Communicate".

In an emergency, caring about ATC is literally your lowest priority, in every case. It is ATC's job to notice a plane no longer under their control and route other aircraft safely around it. Your entire job is to do everything you can to prevent the death of anyone onboard, and playing ham radio is rarely the best way to do that.


Aviate comes before navigate and communicate. The pilot in charge is ultimately responsible for the safety of the aircraft, not ATC.


Also, running checklists for specific types of emergencies.


Let's imagine you're the pilot, and you're super busy with an emergency. You also know that there are mountains to the east of the airport. You also have ATC on the radio and they know about all of the meaningful obstacles in your area. Asking for lower in this situation (rather than using your emergency authority) is exactly what you want to do.

Looking at the track, they descend to 10000' until they start their downwind to base turn. Once they start that turn, they get a lower altitude (looks like 7000') until they are established on final and can fly an approach.


More altitude means more time to work the problem.


Fast hands in the cockpit are scary. Pilots take their time in emergencies because rushing will take your birthday away


Clearance + they were probably putting on their masks, and other tasks.


> The accident airplane was required to be equipped with a CVR that retained, at minimum, the last 2 hours of audio information, including flight crew communications and other sounds inside the cockpit ... The CVR was downloaded successfully; however, it was determined that the audio from the accident flight had been overwritten. The CVR circuit breaker had not been manually deactivated after the airplane landed following the accident in time to preserve the accident flight recording

How the fuck is this still a problem on brand new aircraft?


You gotta wonder how the technician explained away the four leftover bolts after


anyone have a copy hosted? it's currently overloaded and not responding


I believe this video mostly covers it: https://youtu.be/3m5qxZm_JqM


This is old satire. It's good satire, but don't expect to find actual information here.



I look forward to reading a report from NTSB's internet outage.


Seems to solidly confirm the leak.


> In a revision to the Flight Crew Operations Manual, issued on January 15, 2024, Boeing confirmed that the door functioned as designed.

Smells like CISCO


"....after the left mid exit door (MED) plug departed the airplane leading to a rapid decompression"

Lol, they said the door plug "departed" instead of "blew the f** off"


What is the analogy of leaving out all bolts from that door?

'Forgetting' to put in any of the screws holding a gas tank in place in a car?

'Missing' all welds in one of a skyscraper's lower columns?

An 'oversight' of providing redundant instruments in an airplane with a natural tendency to stall?

What a hopeless shitshow is going on there behind the company gates that these kinds of things can happen in succession?

A duck forgot how to swim, an eagle forgot how to fly, Boeing forgot how to build airplanes?


> 'Missing' all welds in one of a skyscraper's lower columns?

Or just throwing the rebar in at random maybe? [0]

[0] https://www.gr-us.com/%E2%80%9Chorror-at-the-harmon%E2%80%9D...


> According to an article in the Las Vegas Sun, “Some of the steel, known as rebar, was so badly positioned that it stuck out of the concrete floor and was sawed off to conceal the mistake.” In addition, “Harmon workers reportedly moved rebar without first getting an OK from the structural engineer . . . .which is a major no-no in the construction chain of command.

WTF! How was there no punishment for everyone involved in this?


Oh Jesus! I seriously thought my hypothetical example is extreme, but apparently not!


>The CVR was downloaded successfully; however, it was determined that the audio from the accident flight had been overwritten. The CVR circuit breaker had not been manually deactivated after the airplane landed following the accident in time to preserve the accident flight recording

In addition to local storage, why isn't the audio (along with location, altitude and some sensor information) also streamed using something like Starlink or Inmarsat to a secure location where you can store more data for cheaper and with more redundancy?


The current 2 hour limit (which is now 25 hours in Europe) is a legacy of privacy concerns. If pilots are concerned that their bosses would make a habit of yanking longer CVR units to micromanage what goes on in the cockpit (or using events several hours before an incident to somehow push blame onto the pilot for it), they’d love the idea of it being beamed to a remote location. Yes, I’m sure there could be a complicated byzantine cryptographic scheme that would theoretically solve it, but not sure they’d trust it.

There’s also bandwidth and satellite coverage not being magic of course.


This is an old system that works well and reliably for pretty much every incident. I’m not aware of another case of this sort of thing (relevant flight recorder data being overwritten) happening in recent years anyway. If you spend time constantly upgrading systems like this you’re asking for a higher failure rate, for very little gain.

That said, there’s a standard and reliable 25-hour flight voice recorder that solves this problem. But it’s only used outside the US. That’s a regulatory inertia situation and I suspect this incident will speed changes in this area.

Finally, and particularly in relation to your proposal of streaming cockpit voice recordings to some cloud server: there is some resistance to this (and to longer recordings in general) from air crew on privacy grounds. The privacy issue is less about how much personal info is revealed in a crash situation and more about how easy it would be for a bad actor in management, or whatever operations group runs the audio storage, to listen in on conversations. And you can be sure this would happen if something like your system were implemented without the appropriate regulatory controls (and tbh even with them it would probably still happen).


> I’m not aware of another case of this sort of thing (relevant flight recorder data being overwritten) happening in recent years anyway.

Got me curious how often this happened.

Last example I can find of a CVR being overwritten and not just exploded/missing was in 2018 for an engine fire, similar to this where the flight had to emergency land shortly after take-off. Before that...well a lot of complete failures ("not operative at time of flight") but not many like this scenario.

https://en.wikipedia.org/wiki/List_of_unrecovered_and_unusab...


Here's one in 2017 that was recorded over because they didn't report until after another flight https://en.m.wikipedia.org/wiki/Air_Canada_Flight_759


In 2018 NTSB issued a report (https://www.ntsb.gov/investigations/AccidentReports/Reports/...) listing 17 incidents where CVR was lost due to the recording not being turned off after the incident.

And it also lists 17 more incidents where something happened in a flight and it took more than 2 hours to land so data from the incident was lost.


Starlink is a consumer system. Won't happen without a specialized product. Inmarsat is expensive. And we are talking about streaming audio from all planes currently in flight.
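
A rough back-of-envelope sketch for sizing that, if anyone wants to plug in their own numbers. The fleet size and audio bitrate below are placeholders I made up for illustration, not measurements, and the function names are hypothetical.

```python
def aggregate_mbps(aircraft_in_flight: int, kbps_per_aircraft: float) -> float:
    """Total uplink bandwidth in megabits per second."""
    return aircraft_in_flight * kbps_per_aircraft / 1000


def gigabytes_per_flight(kbps_per_aircraft: float, flight_hours: float) -> float:
    """Audio volume stored for one flight, in decimal gigabytes."""
    bits = kbps_per_aircraft * 1000 * flight_hours * 3600
    return bits / 8 / 1e9


# Placeholder inputs (assumptions): 10,000 aircraft aloft, 64 kbit/s audio each.
print(aggregate_mbps(10_000, 64))    # 640.0
print(gigabytes_per_flight(64, 2))   # 0.0576
```

The arithmetic only covers raw audio volume. It says nothing about coverage gaps, per-link cost, or certification, which are the points raised above.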



$$$



