AI in safety-critical engineering: where it accelerates and where it backfires

Key takeaways

  • AI in safety-critical engineering accelerates front-of-V workflows (requirements drafting, test enumeration, traceability screening, documentation maintenance) and creates audit risk on the right side of the V.
  • AI-generated artifacts are not ISO 26262 safety-case evidence by default. Treating them as evidence under ISO 26262 Part 6 without provenance and tool qualification is a finding waiting to happen.
  • ASPICE 4.0 (VDA QMC, December 2023) assessors are asking pointed provenance questions: who or what generated this artifact, what evidence supports its correctness, where does its trace originate.
  • ISO 26262 Part 8 Clause 11 software tool qualification is the right framework for AI workflows. The defensible posture is to run the TCL determination explicitly for each AI workflow rather than assuming TCL1 by default.
  • The governance decision in front of engineering leaders in 2026 is whether AI use will be governed before audit time or after. The first costs an upfront policy. The second costs program time on the next assessment.

An engineering VP at a Tier-1 supplier was recently asked two versions of the same question in one week. Procurement asked, "How much faster can your team move with the new AI tools?" The safety manager asked, "How do we qualify any of those AI outputs for the ISO 26262 safety case?" Both questions are reasonable, and both deserve an honest answer: AI in safety-critical engineering accelerates some workflows and quietly creates audit risk in others. The firms that pull ahead in 2026 will be the ones that can tell the difference.

Why this matters now

AI coding assistants and AI-generated documentation are now defaults across most engineering tool suites. Suppliers are being asked to demonstrate productivity gains on existing programs. At the same time, the regulatory and assessor communities have caught up. ISO/IEC TR 5469:2024 (Functional safety and AI systems), published January 2024, is now a public technical report and is appearing in audit conversations. The European Union's Artificial Intelligence Act (Regulation (EU) 2024/1689), which entered into force on 1 August 2024, classifies AI systems that serve as safety components of regulated products such as vehicles or industrial machinery as high-risk under Article 6, layering conformity assessment, transparency, and risk management duties on top of existing functional safety practice. The obligations for high-risk AI embedded in regulated products apply from 2 August 2028 under the Act's extended transition. Automotive Software Process Improvement and Capability Determination (ASPICE) 4.0 assessors, working from the December 2023 VDA QMC release, have started asking pointed questions about provenance: who or what generated this artifact, what evidence supports its correctness, where does its trace originate.

Most engineering organizations have moved faster on adoption than on governance. The result is that AI-accelerated work is reaching safety-case territory before the safety case knows it. The cost of finding out at audit time, rather than at design time, is measured in months of rework on a program bound to a fixed Start-of-Production (SOP) date.

Where does AI actually accelerate safety-critical engineering?

The pattern across LHP engagements, and across the field generally, is that AI is genuinely useful for high-volume, lower-criticality, evidence-light tasks at the front end of the V-model. Four categories repeat:

  • Drafting initial requirements from regulatory text. A large language model (LLM) can produce a first-cut decomposition of ISO 26262 Part 5 clauses, UN R155 articles (the United Nations regulation governing automotive cybersecurity), or DO-178C objectives (the airborne software standard) into reviewable requirement statements in a fraction of the time a human takes. The output is rough; human review is mandatory; the time saved on the rough draft is real.
  • Test case enumeration for lower-criticality modules. AI is good at generating broad equivalence-class and boundary-condition test sets from interface specifications. On Quality Management (QM) and Automotive Safety Integrity Level A (ASIL A) modules, where the expected test rigor is lower, this lifts coverage faster than manual enumeration.
  • Traceability gap detection. Pointing an LLM at a requirements set, an architecture model, and a verification artifact list will surface candidate orphan nodes, ambiguous links, and synonym drift faster than a human review pass. Even when the LLM is wrong about specific links, it is right often enough to be a useful screening pass.
  • Documentation maintenance. Updating safety manuals, item definitions, and supplier deliverables for a known set of design changes is high-cost, low-judgment work. AI handles the bulk; humans handle the consequential edits.

In each case, the acceleration is real because the artifact is reviewable and its correctness can be checked against an existing source of truth. The AI is faster than a human at the rough draft. The human is still the author of record.
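
One way to keep an LLM-driven traceability screen honest is to check its candidate links against the IDs the program already holds as its source of truth. The sketch below is a minimal, deterministic cross-check in Python. The `TraceLink` type, the `screen_trace_links` function, and the ID formats are illustrative assumptions, not the interface of any particular requirements tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceLink:
    """A candidate requirement-to-test link, e.g. one proposed by an LLM screen."""
    requirement_id: str
    test_id: str


def screen_trace_links(requirement_ids, test_ids, links):
    """Check candidate links against the known requirement and test IDs.

    Returns a tuple (untested, untraced, dangling):
      untested: requirements with no valid link to a test
      untraced: tests with no valid link to a requirement
      dangling: links that reference an ID missing from the source of truth
    """
    req_set, test_set = set(requirement_ids), set(test_ids)
    valid, dangling = [], []
    for link in links:
        if link.requirement_id in req_set and link.test_id in test_set:
            valid.append(link)
        else:
            dangling.append((link.requirement_id, link.test_id))
    untested = sorted(req_set - {link.requirement_id for link in valid})
    untraced = sorted(test_set - {link.test_id for link in valid})
    return untested, untraced, sorted(dangling)


# Hypothetical IDs for illustration only.
result = screen_trace_links(
    ["REQ-1", "REQ-2", "REQ-3"],
    ["TC-1", "TC-2"],
    [TraceLink("REQ-1", "TC-1"), TraceLink("REQ-2", "TC-9")],
)
```

Every item in the three returned lists still goes to a human reviewer; the check only narrows where to look.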

Where AI-augmented workflows backfire on safety-critical programs

[Figure: V-model diagram showing where AI accelerates front-of-V engineering tasks and where it creates ISO 26262 audit risk on the right side of the V]

The failure pattern is the opposite. AI-augmented work creates audit risk wherever the artifact is itself the source of truth, the verification evidence, or the safety argument. Three categories show up in audit findings repeatedly:

  • Verification evidence on ASIL B and higher. An AI-generated unit test that was not written from a specified, traceable test condition cannot be cited as evidence under ISO 26262 Part 6 without additional analysis. Treating it as evidence is a finding waiting to happen.
  • Safety case argumentation. The argument structure, typically expressed in Goal Structuring Notation (GSN) as goals, strategies, and solutions, is the assessor's primary artifact. AI-generated argument fragments rarely reflect the actual program rationale; they reflect the LLM's training data on what safety arguments tend to look like. Using AI-drafted GSN nodes without rebuilding them from program-specific evidence produces a credible-sounding but unsound argument.
  • Configuration and baseline integrity. AI-generated artifacts that are not traceable to a specific prompt, model version, and input set cannot be reliably reproduced. ISO 26262 Part 8 configuration management presumes reproducibility. An artifact that was generated by an undated AI session, with no captured prompt or model identity, fails this test even if it is otherwise correct.

The pattern across all three is the same. AI-generated artifacts need provenance to enter a safety case. Most teams generate first and worry about provenance second, which is the wrong order.
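
A provenance record of the kind described above can be sketched as a small immutable structure built at generation time. This is a minimal illustration, not a prescribed schema: the field names, the `make_record` and `record_digest` helpers, and the choice of SHA-256 digests are all assumptions, and the sketch presumes the full prompt, inputs, and output are archived alongside the record.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def sha256_text(text):
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    artifact_id: str
    prompt_sha256: str
    model_name: str
    model_version: str
    input_sha256: tuple   # digests of the input artifacts, sorted
    output_sha256: str
    reviewer: str
    generated_at: str     # timestamp supplied by the caller


def make_record(artifact_id, prompt, model_name, model_version,
                inputs, output, reviewer, generated_at):
    """Build a record, refusing to proceed if any field is empty."""
    fields = {
        "artifact_id": artifact_id, "prompt": prompt, "model_name": model_name,
        "model_version": model_version, "inputs": inputs, "output": output,
        "reviewer": reviewer, "generated_at": generated_at,
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing provenance fields: " + ", ".join(missing))
    return ProvenanceRecord(
        artifact_id=artifact_id,
        prompt_sha256=sha256_text(prompt),
        model_name=model_name,
        model_version=model_version,
        input_sha256=tuple(sorted(sha256_text(item) for item in inputs)),
        output_sha256=sha256_text(output),
        reviewer=reviewer,
        generated_at=generated_at,
    )


def record_digest(record):
    """Stable digest of the whole record, suitable for storing in a baseline."""
    return sha256_text(json.dumps(asdict(record), sort_keys=True))
```

Because `make_record` raises when the prompt, model version, or reviewer is missing, an artifact from an undated or unattributed session cannot acquire a record after the fact without someone supplying those values explicitly.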

How do you qualify AI-generated artifacts for an ISO 26262 safety case?

[Figure: Matrix of recommended AI posture by V-model side and ASIL level, from provenance-only to out-of-scope until TCL3-qualified]

The qualification question is concrete: what evidence does an assessor accept for a safety-critical artifact that was AI-assisted? The answer LHP applies on engagements is borrowed from software tool qualification practice under ISO 26262 Part 8 Clause 11. Treat the AI workflow as a tool. Define its tool confidence level (TCL) based on the impact of an undetected error and the likelihood the workflow will detect such an error before integration. The standard's TCL determination procedure is what governs whether qualification is required for a given workflow; the defensible posture for generative AI in safety-relevant work is to perform the TCL evaluation explicitly per Clause 11 before assuming the workflow can pass at TCL1.
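
The TCL determination can be written down as a small lookup. The sketch below follows the tool impact (TI) and tool error detection (TD) classification of ISO 26262 Part 8 Clause 11 as it is commonly summarized; confirm the mapping against the edition of the standard your program cites. The example workflows and the TI/TD ratings assigned to them are hypothetical, and a real rating needs a documented rationale per use case.

```python
from enum import IntEnum


class ToolImpact(IntEnum):
    TI1 = 1  # a malfunction cannot introduce or fail to detect a safety-relevant error
    TI2 = 2  # all other cases


class ToolErrorDetection(IntEnum):
    TD1 = 1  # high confidence that an error is prevented or detected
    TD2 = 2  # medium confidence
    TD3 = 3  # all other cases


def determine_tcl(ti, td):
    """Return the tool confidence level (1, 2, or 3) for one tool use case."""
    if ti is ToolImpact.TI1:
        return 1          # no safety impact: TCL1 regardless of detection
    return int(td)        # TI2: TD1 -> TCL1, TD2 -> TCL2, TD3 -> TCL3


# Hypothetical classifications, for illustration only.
examples = {
    "summarizing meeting notes (no safety artifacts)":
        (ToolImpact.TI1, ToolErrorDetection.TD3),
    "requirements drafting with mandatory human review":
        (ToolImpact.TI2, ToolErrorDetection.TD1),
    "unit-test generation with no independent check":
        (ToolImpact.TI2, ToolErrorDetection.TD3),
}
for workflow, (ti, td) in examples.items():
    print(f"{workflow}: TCL{determine_tcl(ti, td)}")
```

The point of running this per workflow is that TCL1 becomes a documented conclusion rather than a default: it holds only where the impact or detection rating can be justified.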

The practical consequence is a small set of operational rules:

  • Capture provenance at generation time, not retroactively. Prompt, model version, input artifact set, output, and the human reviewer's identity all go into the record before the artifact moves downstream.
  • Keep AI out of the verification loop unless the verification activity itself is independently qualified. Using AI-generated tests to verify AI-generated code is a closed loop that an assessor will, correctly, reject.
  • Treat the AI workflow as a tool subject to change control. Model version updates are configuration changes, not invisible upgrades. A prompt template that was qualified against GPT-class model version N is not automatically qualified against version N+1.

Working engineers can apply this without waiting for organizational policy. The provenance discipline alone removes most of the audit risk because it makes the AI contribution legible and auditable on demand.
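
The change-control rule can be enforced mechanically by comparing the running configuration against the set of configurations that were actually qualified. This is a minimal sketch under stated assumptions: `WorkflowConfig`, `requalification_needed`, and the identifiers used are hypothetical names, and the qualified set would in practice come from the toolchain configuration baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowConfig:
    """The parts of an AI workflow that qualification is bound to."""
    prompt_template_id: str
    prompt_template_version: str
    model_name: str
    model_version: str


def requalification_needed(current, qualified_configs):
    """True when the running configuration is not in the qualified baseline."""
    return current not in qualified_configs


# Hypothetical baseline: one prompt template qualified against model version N.
qualified = {WorkflowConfig("req-draft", "1.2", "example-llm", "N")}

same = WorkflowConfig("req-draft", "1.2", "example-llm", "N")
upgraded = WorkflowConfig("req-draft", "1.2", "example-llm", "N+1")
```

Here `requalification_needed(same, qualified)` is false and `requalification_needed(upgraded, qualified)` is true: the model update is treated as a configuration change even though the prompt template did not move.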

How LHP approaches AI in safety-critical engineering

LHP integrates AI-augmented workflows into model-based systems engineering (MBSE), functional safety, and ASPICE engagements with three guardrails. First, every AI-generated artifact carries a provenance record before it can be cited downstream. Second, the AI workflow itself is qualified at the program's highest applicable TCL, and the qualification record is maintained alongside the toolchain configuration baseline. Third, AI usage is mapped explicitly to the V-model: high-volume, reviewable activities on the left side of the V; tightly controlled or absent on the right side until the verification workflow is independently qualified. The result is faster front-end work without the audit liability that unmanaged AI adoption tends to produce. The methodology is tooling-agnostic and has been applied across LLM-based requirements assistants, AI-augmented modeling tools, and code-generation assistants.

What this means for your next program

The decision in front of most engineering leaders in 2026 is not whether to allow AI tools. The procurement question and the productivity question have already settled that. The decision is whether AI usage will be governed before audit time or after. The first costs an explicit policy and a small upfront tooling investment. The second costs program time on the next assessment, and on safety-critical programs, program time is launch time.

The next concrete step worth taking this quarter: identify three workflows where AI is already being used inside your engineering organization. For each, write down the provenance record that would be required if the assessor asked tomorrow. The gap between what is captured and what is required is the size of the policy you need to put in place.
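
That gap exercise fits in a few lines. The sketch below compares what each workflow captures today against the five provenance elements named earlier in this article (prompt, model version, input artifact set, output, reviewer identity). The workflow names and the captured fields are hypothetical placeholders for your own inventory.

```python
REQUIRED_FIELDS = frozenset(
    {"prompt", "model_version", "input_artifacts", "output", "reviewer"}
)


def provenance_gap(captured_by_workflow):
    """Map each workflow name to the sorted list of required fields it lacks."""
    return {
        workflow: sorted(REQUIRED_FIELDS - set(captured))
        for workflow, captured in captured_by_workflow.items()
    }


# Hypothetical inventory, for illustration only.
inventory = {
    "requirements drafting": ["prompt", "output", "reviewer"],
    "test enumeration": ["output"],
    "documentation updates": [
        "prompt", "model_version", "input_artifacts", "output", "reviewer",
    ],
}
gaps = provenance_gap(inventory)
```

A workflow with an empty list is already audit-ready on provenance; the others show exactly which fields the policy has to mandate.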

Frequently asked questions

Does AI count as a qualified tool under ISO 26262?

Not by default. ISO 26262 Part 8 Clause 11 requires software tools used in the safety lifecycle to be evaluated for a tool confidence level (TCL), which is determined by the impact of an undetected error and the likelihood that such an error is prevented or detected before integration. The TCL evaluation is what governs whether qualification is required; treating an AI workflow as unqualified-by-default, running the TCL determination explicitly, and qualifying the workflow before evidence flows downstream is the defensible posture.

Can AI-generated unit tests be cited as ISO 26262 verification evidence?

Not on ASIL B or higher items, unless the test condition is traceable to a specified requirement and the test workflow is itself qualified. AI-generated tests that were not derived from a traceable test condition cannot be cited under ISO 26262 Part 6 without additional analysis. The defensible pattern is to keep AI on the requirements and enumeration side of the V and out of the verification loop until the verification activity is independently qualified.

What does AI provenance mean in practice for safety-critical engineering?

Provenance for an AI-generated artifact captures, at generation time, the prompt used, the model and version, the input artifact set, the output, and the identity of the human reviewer who accepted it. The provenance record travels with the artifact through the toolchain so any later audit can reproduce the generation and verify the human review. Most teams collect provenance retroactively, which is the wrong order; provenance captured after the fact is not auditable.

How does ASPICE 4.0 treat AI-generated artifacts?

ASPICE 4.0 (2023 release) does not name AI explicitly, but its assessment of provenance and bidirectional traceability applies whether an artifact was generated by a human, a CAD tool, or an LLM. Assessors are asking who or what generated the artifact, what evidence supports its correctness, and where its trace originates. An AI-generated artifact without provenance fails the same base practices that an undocumented human-generated artifact would.

About LHP Engineering Solutions

Since 2001, LHP Engineering Solutions has helped companies deliver technology that must perform as intended, every time. Our clients operate in safety-critical, operation-critical, and mission-critical environments such as on-highway, off-highway, aerospace, defense, and oil & gas, where failure is not an option, and delays cost market share.

LHP helps organizations design, architect, validate, and monitor complex systems. Our global team of engineers supports the development of advanced technologies, including high-voltage power electronics, hybrid and electric powertrain controls, connectivity, and ADAS platforms, enabling OEMs and Tier-1 suppliers to bring next-generation products to market quickly and with confidence.

In compliance-driven industries, LHP uses our model-based systems engineering (MBSE) approach, enhanced through AI, to help companies move quickly while meeting rigorous standards, including functional safety, ASPICE, and cybersecurity. Our teams have helped global technology companies achieve functional safety certification and ASPICE compliance in months rather than years, and established enterprise-grade safety and cybersecurity management systems for leading OEMs.

When organizations must make major technology leaps, such as launching next-generation platforms, future-proofing vehicle architectures, or proving new concepts to secure market-defining programs, LHP delivers the engineering disciplines, solutions, and on-time execution required to succeed.

Because in industries where technology must perform as intended, precision engineering matters.
