Aircraft Accidents and Lessons Unlearned XXXI: Lion Air 610

Unknown photographer – B737 Angle of Attack Vane

On October 29, 2018, thirteen minutes after departing Jakarta, Indonesia, Lion Air flight 610, a new Boeing B737-MAX, registration PK-LQP, suddenly plunged into the Java Sea. The Komite Nasional Keselamatan Transportasi (KNKT) final accident report, KNKT., was released on Friday, October 25, 2019. The report revealed multiple cultural issues in Lion Air’s Operations division. What the report also demonstrated was KNKT’s fundamental inexperience with aircraft maintenance (AC-MX) issues that directly contributed to the accident. However, the KNKT was not alone; the National Transportation Safety Board (NTSB), which assisted in the investigation, was just as naïve about AC-MX issues as the KNKT.

The major contributing factor centered on the left-hand angle of attack (AOA) vane on the B737-MAX. Whether it was improperly overhauled or not correctly installed, the two investigatory groups focused on the easy culprit of new technologies and downplayed a simple fact: the AOA vane caused the accident … period. The KNKT failed to distinguish between probable cause and root cause. That AOA vane caused a malfunction in the B737-MAX’s system (probable cause), but why that AOA vane was on the airplane (root cause) was ignored. Probable cause tells how accidents occur; root cause tells why accidents occur. If one removes root cause, probable cause goes away.

Out of 89 findings in the KNKT report, only ten findings were dedicated to AC-MX and even those were thinned by other problems. The report had scores of recommendations which never addressed the AC-MX issues that contributed directly to the accident. Instead they focused on the minutiae, such as pilot actions in the emergency or second-guessing decisions. These were safety issues for training or design flaws to be fixed, but they did not cause the series of events that led Lion Air 610 to crash.

Per the KNKT report, on page 131, “The investigation received AFML [aircraft flight maintenance log] record on October 2018 of PK-LQP.” The report then stated, “The investigation found 31 pages not included in the package.” Were the missing pages ever found? How did the KNKT investigate an accident for one year and never find any or all of the 31 missing pages? What did those missing pages say about Lion Air’s AC-MX culture? Did they agree with digital AFML copies? Why did AC-MX keep resetting circuit breakers on the accident aircraft during the last weeks?

The left-hand AOA vane was replaced with a defective part. This happens; a defective part is called ‘bad-from-stock’. It just should not be left on the plane. According to the KNKT report, page 36, the left AOA vane was operationally tested using an alternative – yet approved – method, which required deflecting the AOA vane to different positions and then verifying each AOA position on the stall management yaw damper (SMYD) computer. However, “The [mechanic] did not record the indication on the SMYD computer during the installation test.” Why? Why were operational check parameters not recorded as directed? Why did no one question the unusual maintenance steps taken to clear computer faults?

Shouldn’t the NTSB have caught these issues? The NTSB’s September 19, 2019, report failed to direct attention to AC-MX. It made no finding or recommendation about the missing paperwork or the questionable testing performed on the accident aircraft prior to the accident. Did the NTSB understand the culture of Lion Air’s AC-MX division and/or its repair station personnel? Did the NTSB have AC-MX investigators with AC-MX experience? Has the NTSB started hiring seasoned AC-MX investigators or is it still using inexperienced engineers?

The NTSB and KNKT’s lack of AC-MX experience was shared by the Joint Authorities Technical Review (JATR), led by former NTSB Member, Christopher Hart. The JATR team of technical representatives was chartered by the Federal Aviation Administration (FAA) to review the FAA’s certification process. On October 11, 2019, the JATR report to the FAA was published. On page XII, paragraph 11, Impact of Product Design Changes on Maintenance Training, the JATR team stated, “The JATR team was tasked to consider maintenance suitability of the design. Due to lack of maintenance expertise on the JATR, the team was unable to make a determination of such adequacy.”

Amazing! Did the FAA cancel the JATR’s check? How can a team of technical representatives, employed by the FAA to provide unbiased, rounded views into the FAA’s certification process, fail to have AC-MX expertise? Certification relies heavily on AC-MX and inspection personnel to follow the procedures and instructions for continued airworthiness (ICA) for maintaining the aircraft. The ICAs, which appear to not have been followed, directly contributed to the Lion Air 610 accident. The JATR should have employed somebody … ANYBODY … who could address AC-MX issues. This was unacceptable.

The JATR was hired to help the FAA find problems. Instead, the JATR ignored the basic needs of the AC-MX workforce using the ICAs. How could the NTSB expect the B737-MAX to be safe if it ignored fundamental problems that led to the accident? How did the KNKT expect Lion Air to learn from a catastrophic mistake if the KNKT could not even understand why, for example, missing AFML pages and unknown test procedures were important? Did the KNKT, NTSB or JATR take AC-MX or Systems training on the B737-MAX? Did they take Lion Air’s approved B737-MAX Systems training to check for quality? Anybody?

The FAA and all oversight agencies across the world divide certificate holder oversight responsibilities into two groups: Operations and Airworthiness. Operations oversees the operator’s pilots, ramps, flight attendants, training and operations control; just because engineers designed the aircraft does not mean engineers can tell pilots how to fly it. Airworthiness oversees the operator’s maintenance, inspection, training, engineering and contract outsource maintenance; just because an engineer designed a single aircraft’s system does not mean that engineer understands all the systems and how to repair them.

The FAA has used these methods to capture all manufacturers, contractors, air carriers and outsourced maintenance for decades. The FAA, unlike the NTSB and the KNKT, does not hire engineers to perform investigations, surveillance or oversight. Why? Because engineers are not certificated; engineers do not receive systems training; engineers don’t understand how an operator works; they lack the experience and basic troubleshooting skills to recognize problems, just like in Lion Air 610.

Lion Air 610 is the latest example of the NTSB’s failure to determine root cause. In this accident, the KNKT and the NTSB, by trivializing Lion Air’s AC-MX, ignored a major contributor to the future of Lion Air’s safety, not just of the B737-MAX, but its entire fleet. The evidence pointed to an inherent problem at Lion Air and/or its AC-MX provider that the KNKT and NTSB missed. Did Lion Air call Boeing technical support to ask about the AOA problems between October 9 and October 29, 2018? If not, why not? Did the KNKT interview the accident aircraft’s mechanics? The KNKT report does not say. AC-MX professionals would know that the manufacturer’s technical support always answers the phone.

Could the KNKT and NTSB, with proper attention to AC-MX, have prevented Ethiopian Airlines 302’s crash five months later? That falls within the area of speculation. However, the omission of AC-MX issues in both the KNKT and NTSB reports demonstrated that they focused on what resulted from the series of events that led to the accident and ignored what caused the series of events that led to the accident. They obsessed over certification failures on a grounded aircraft – old news, all too easy. They failed to solve the root cause of the accident.

What does it mean that the root cause was never discovered by three respected organizations: the KNKT, the NTSB and the JATR? It means that the root cause still exists, that the ignored problems with Lion Air’s maintenance program have not been identified and fixed. It means, once again, that airplanes will continue to be unsafe from a root cause of ignorance.

6 thoughts on “Aircraft Accidents and Lessons Unlearned XXXI: Lion Air 610”

  1. Steve,

    You hit the nail on the head. I believe proper maintenance is the root of both of these accidents. However it is easier to blame the manufacturer.

    1. That is the shame of the investigation process; the blaming of the FAA and/or the manufacturer. It is easier to pile on instead of making productive changes. The time is long past for the NTSB to give serious, experienced attention to maintenance; we are not getting safer.

  2. Great article Steve with your points well emphasized. As you know, we teach the concepts of Root Cause Analysis (RCA) at the FAA Academy in various courses. But, when politics enter the picture probable cause is safer to stay with. It takes time to correct RCA, but it is the best avenue to take.
    However, the word “integrity” is something that needs to be considered here as well. Doing the right thing every time is what should have been done. Each of us should have learned long ago about integrity before getting into aviation. It’s this one word that has been tarnished in the FAA as well as with the manufacturer. Getting “Integrity” back will take trust and right now it’s lacking in both organizations.

    1. In writing my third book I discovered that the term ‘probable cause’ is undefined in aviation. The FARs do not define it and Cornell law, my reference site, does not define it. What does that mean? Probable cause is a farce, a misdirection; a sleight-of-hand. We have wasted so many valuable opportunities to fix the industry. I am afraid it goes way beyond integrity. I was reading the three reports I mentioned in the article; I was stunned by the irrelevance and arrogance.

  3. Great article Steve.

    Other points to ponder: the TSO standard for the AOA, on which ICAs are inherently based, has not been updated since the early sixties. The TSO standard does not include technology changes and ICA evolution for things such as systems component interface (i.e., MCAS).

    I noted one of the newest volumes in FSIMS now includes the AEG. A great move by FAA Flight Standards, albeit 30 plus years overdue. As we know, AEG is the bridge of the gray area between FAA Certification and FAA Flight Standards. However more work will need to be done in the FSB & FOEB areas including simulator and training.

    If FAR special conditions were issued under the changed product rule within Part 21, I believe it would have gone a long way to systematically embrace basic hazard identification & risk management requirements for certification, training, and maintenance. (another topic perhaps)

    Final point, in many cases there is truly no end to root cause analysis and report critiques. There are many many case studies that will emanate from these accident lessons unlearned and may take years of course correction. Thank you for your article Steve!

    1. The TSO point is well taken. That would be a great recommendation and one that would make an effective change. I don’t try to trash the NTSB but the results of investigations are so off target and show an organization that doesn’t take the process seriously. Perhaps if they had industry-proven investigators, we could make some progress in safety. Root cause is just scratching the surface of what needs to be done to make a difference.
