How Does SOTIF Address Safety Design Vulnerabilities?
How does SOTIF impact vulnerabilities?
Safety Of The Intended Functionality (SOTIF) is defined in ISO 21448:2022, “Road vehicles – Safety of the intended functionality”, as the absence of unreasonable risk due to hazards resulting from functional insufficiencies of the intended functionality or by reasonably foreseeable misuse by persons. Specifically, SOTIF addresses vulnerabilities that are out of scope for ISO 26262, “Road vehicles – Functional safety”. Applied with strict and consistent discipline, the SOTIF standard makes clear which measures are needed to achieve SOTIF.
How ISO 26262 and SOTIF work together to address vulnerabilities
ISO 26262 is the international standard, published by the International Organization for Standardization (ISO), for the functional safety of electrical and/or electronic systems installed in series production road vehicles. SOTIF works in close concert with ISO 26262 and addresses functional insufficiencies in the safe performance of the system (the sensors, processors, and actuators) through complete and correct requirements. In other words, ISO 26262 addresses the malfunctions of electronic components, and SOTIF addresses things outside of the vehicle that can influence the behavior of an electronic component and yield unintended functions. ISO 26262 alone is not enough to guarantee that ADAS will operate as intended, because a scenario cannot be fully specified before it is known. The result can be a system that is designed properly to requirements that turn out to be incomplete.
The original SOTIF effort began as an ISO 26262 working group, so it is understandable that to the casual observer, the lines between the two realms might get blurred. The subject matter is quite complex; in some instances, the processes may seem counter-intuitive to how the automotive industry has traditionally conducted business. This problem is magnified when people working in this realm have only received either partial training or no training at all.
SOTIF addresses the vulnerability to the unknown
The SOTIF standard addresses design flaws. When developing the Advanced Driver Assistance Systems (ADAS) that make autonomous vehicle operation possible, it is impossible to design a system that, on paper, anticipates and properly addresses every conceivable scenario from the moment it is first built. Every sensing system has performance limits, and even the best and smartest people don’t know what they don’t know. But you must start somewhere.
A system has three parts: a sensor, some form of control (digital logic), and actuation. Even if none of these parts has a fault or failure, the vehicle can still behave in unintended ways based on what it senses in the environment. Safety problems can be caused not only by faults, errors, and failures (which is what ISO 26262 addresses), but also by a misinterpreted scenario or by events in the environment around the vehicle. SOTIF is designed to address these issues.
What happens if the atmospheric conditions diffuse the light in such a way that it blinds the camera-based sensors on the car? How often is that going to happen? And when that issue is identified and becomes a known unsafe issue, what is going to be done to address it? These questions are typical of the types of system-wide problems that need to be addressed.
Vulnerability to complexity
It is impossible to prescribe ahead of time the methods for safely controlling one fault versus another when the safety consequences of the fault are still unknown. And despite significant diligence in the design process, there is always the chance, however slight, of unanticipated situations that can lead to the violation of a safety goal without there being any fault in the sensing system itself.
This is a function of the complexity of the system. For example, if you have two different vehicles on the same road and the conditions of that road change, you don’t know how each vehicle is going to react compared to the other. The hard part is that what can happen out in the environment is orders of magnitude more complex than what you can validate on a test track, or in a test drive out on the street. To chisel away at these complexities, SOTIF tries to limit the number of unknown unsafe states that the system could be in. SOTIF provides guidance to manage the violations of those safety goals and drive corrective action.
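The idea of shrinking the set of unknown unsafe states can be sketched in a few lines of Python. This is only an illustrative model, not something prescribed by the standard: the `Scenario` class, the `Area` names, and the `discover` and `mitigate` helpers are all assumptions made for this example. In practice an unknown scenario cannot be listed in advance; here the `known` flag simply stands in for "identified and analyzed so far".

```python
from dataclasses import dataclass
from enum import Enum


class Area(Enum):
    """Four scenario categories (names are illustrative)."""
    KNOWN_SAFE = "known safe"
    KNOWN_UNSAFE = "known unsafe"
    UNKNOWN_UNSAFE = "unknown unsafe"
    UNKNOWN_SAFE = "unknown safe"


@dataclass(frozen=True)
class Scenario:
    name: str
    known: bool   # has the scenario been identified and analyzed?
    unsafe: bool  # can it lead to hazardous behavior?

    @property
    def area(self) -> Area:
        if self.known:
            return Area.KNOWN_UNSAFE if self.unsafe else Area.KNOWN_SAFE
        return Area.UNKNOWN_UNSAFE if self.unsafe else Area.UNKNOWN_SAFE


def discover(scenario: Scenario) -> Scenario:
    """Validation work finds the scenario; its hazard does not change."""
    return Scenario(scenario.name, known=True, unsafe=scenario.unsafe)


def mitigate(scenario: Scenario) -> Scenario:
    """A design change removes the hazard; only possible once it is known."""
    if not scenario.known:
        raise ValueError("cannot mitigate a scenario that has not been found")
    return Scenario(scenario.name, known=True, unsafe=False)


glare = Scenario("low sun blinds the camera", known=False, unsafe=True)
found = discover(glare)
fixed = mitigate(found)
print(glare.area.value, "->", found.area.value, "->", fixed.area.value)
```

The sketch makes one point explicit: corrective action is only possible after discovery, which is why validation effort is aimed at finding unknown unsafe scenarios first.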
The use of Artificial Intelligence (AI) is one approach that can be applied to ADAS, but AI is also extremely complex, and its use is in its infancy. An AI-based system benefits from learning experiences that are fine-tuned over time during the research and development phases. (Learning is not continued in fielded production vehicles, in part because of the risk of the system learning bad interpretations or responses.) A lot of study is currently underway on the use of AI, and several companies are trying to use it. But the more traditional programmed behavior is still the predominant method in use at this time.
ISO 26262 vulnerabilities, and how SOTIF addresses them
The purposes of these standards can at times be subject to misinterpretation. They are not step-by-step recipe books that tell the engineers how to design their hardware. (It would be very dangerous if a standard forced the entire industry to engineer their products one specific way.) Instead, ISO 26262 prescribes the general processes that must be used for deriving and verifying the control of faults, namely, the process of writing safety requirements. And SOTIF addresses design vulnerabilities by providing validation guidance for these complex systems, with the aim of reducing the unknown unsafe states as far as possible. The engineers still must figure out how to achieve those requirements by putting the system through different and varying environmental conditions. And that takes discipline.
First off, the safety requirements themselves must be written properly and completely, and then implemented properly. A poorly written requirement doesn't dilute the trustworthiness of the standard. Instead, a poorly written requirement sets you up for poor design implementation, and this can lead to a recursive loop of erroneous assumptions as to the root cause of the failures:
- A badly implemented safety process leads to badly written product requirements.
- Badly written product requirements lead to unsafe, unreliable products.
- Unsafe, unreliable products can lead to an untrusted standard, when in reality, the standard was trustworthy, but the implementation was flawed.
The standard tries to mitigate this by adding independent reviews prior to going into the design.
Without proper training and managerial oversight, the tasks can seem so complex as to be overwhelming. But shortcuts and workarounds are the banes of effective process discipline, and they are not the answer. Instead, the complexity of these problems can be mastered through granularity: by authoring safety requirements in steps small enough that each step can be verified for correctness and completeness. In other words, you keep breaking the problem down until it can’t be broken down further, and then you solve each item one at a time.
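The "keep breaking it down" approach described above maps naturally onto a tree of requirements. The following is a minimal sketch under stated assumptions: the `Requirement` class, the ID scheme (`SG-1`, `FSR-1.2`, and so on), and the requirement texts are hypothetical and are not taken from either standard. A parent counts as verified only when every leaf beneath it has been verified.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    text: str
    verified: bool = False  # only meaningful for leaf requirements
    children: list[Requirement] = field(default_factory=list)

    def is_verified(self) -> bool:
        """A leaf reports its own status; a parent needs all children."""
        if not self.children:
            return self.verified
        return all(child.is_verified() for child in self.children)

    def open_items(self) -> list[str]:
        """IDs of the leaf requirements that still need verification."""
        if not self.children:
            return [] if self.verified else [self.req_id]
        return [i for child in self.children for i in child.open_items()]


# Hypothetical example tree, for illustration only.
goal = Requirement("SG-1", "Avoid unintended braking", children=[
    Requirement("FSR-1.1", "Detect implausible sensor input", verified=True),
    Requirement("FSR-1.2", "Limit brake request on low confidence", children=[
        Requirement("TSR-1.2.1", "Compute a confidence value", verified=True),
        Requirement("TSR-1.2.2", "Cap deceleration below threshold"),
    ]),
])

print(goal.is_verified())  # False
print(goal.open_items())   # ['TSR-1.2.2']
```

Because status is only ever recorded at the leaves, a single open item anywhere in the tree keeps the top-level goal open, which mirrors the point that every requirement has to be right every time.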
The SOTIF process feeds into the ISO 26262 process. Safety is context-specific, and the safety requirements should provide the context for any point in the development of the vehicle. Writing safety requirements is challenging and often much more work than originally expected, but the good news is that once the work is completed and the problem clearly understood, the ensuing development effort is typically much easier and quicker. In the end, safety requirements must be written properly, every requirement, every time. If the functional safety requirements are not correct, the safety project has a 0% chance of succeeding.
Impacts to the design cycle and the cost of developing systems
The standards require traceability, but OEMs don’t always go back and update the specifications and requirements to match when they update the product. They should, but they typically don’t. In an industry steeped in a model year-based cycle that constantly moves forward, it can seem counter-intuitive to go back and keep expending money and effort on a product that is already sold, out the door, and no longer being manufactured.
But in this modern era of functional safety, vendors and OEMs must go back and start dealing with released products from a product development standpoint, if that product was developed after the standard was released (products developed before the standard was released are grandfathered in). This requirement for retroactive compatibility spans multiple model years and runs back upstream against the traditional linear model year development process. This can be highly counter-intuitive from a corporate culture and financial accountability standpoint.
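One way to picture the traceability problem described in this section is a check that flags requirements that were last reviewed against an older product revision. This is a simplified sketch, not a real tool: `TraceLink`, `stale_requirements`, the requirement IDs, and the revision strings are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TraceLink:
    requirement_id: str
    reviewed_for_revision: str  # product revision the requirement was last reviewed against


def stale_requirements(links: List[TraceLink], current_revision: str) -> List[str]:
    """Return requirement IDs that were not reviewed for the current revision."""
    return [
        link.requirement_id
        for link in links
        if link.reviewed_for_revision != current_revision
    ]


# Hypothetical data: the product moved to revision 2.0, one requirement did not.
links = [
    TraceLink("REQ-001", "2.0"),
    TraceLink("REQ-002", "1.4"),
]
print(stale_requirements(links, "2.0"))  # ['REQ-002']
```

A check like this only reports that a requirement was not revisited; deciding whether the requirement is still correct for the updated product remains an engineering review.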
The challenge of managing updates
In adopting these responsibilities, vehicle systems that are released out on the road must be updated over time, and everyone in the value chain is going to have to be able to validate that the system they have just updated is now safer than it was before the update, regardless of when it was first built and sold. And they will have to do that every time an update is issued, resulting in an almost constant revalidation loop. This will require significant involvement from every vendor and manufacturer that impacts the vehicle, especially the software and semiconductor suppliers.
The future may allow over-the-air software updates. But software updates will need to undergo the same Verification and Validation (V&V) rigor as the original release, including regression testing to confirm that previously validated behavior still holds. This is where automated scenario execution will pay dividends. Also, AI might be employed in the training realm, allowing values to be locked in for production.
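Automated scenario execution for regression testing can be sketched as follows. This is a toy model under loud assumptions: the two "releases" are hypothetical speed policies, the scenario values are invented, and the safety check uses the simple stopping-distance estimate v²/(2μg) in place of a real simulation. The point is the comparison step: an update passes only if no scenario that the previous release handled safely becomes unsafe.

```python
from typing import Callable, Dict, List

G = 9.81  # gravitational acceleration, m/s^2

Scenario = Dict[str, float]
SpeedPolicy = Callable[[Scenario], float]  # returns the commanded speed in m/s


def stops_in_time(speed_mps: float, scenario: Scenario) -> bool:
    """Toy safety check: stopping distance must be shorter than the gap."""
    stopping_distance = speed_mps ** 2 / (2 * scenario["friction"] * G)
    return stopping_distance < scenario["gap_m"]


def run_suite(policy: SpeedPolicy, scenarios: Dict[str, Scenario]) -> Dict[str, bool]:
    """Execute every scenario and record whether it was handled safely."""
    return {name: stops_in_time(policy(s), s) for name, s in scenarios.items()}


def regressions(baseline: Dict[str, bool], candidate: Dict[str, bool]) -> List[str]:
    """Scenarios that were safe before the update but are not safe after it."""
    return sorted(
        name for name, ok in baseline.items()
        if ok and not candidate.get(name, False)
    )


# Invented scenario data, for illustration only.
SCENARIOS: Dict[str, Scenario] = {
    "dry_road": {"limit_mps": 20.0, "gap_m": 40.0, "friction": 0.8},
    "wet_road": {"limit_mps": 20.0, "gap_m": 40.0, "friction": 0.4},
}


def release_1(s: Scenario) -> float:
    return s["limit_mps"]  # always drives at the limit


def release_2(s: Scenario) -> float:
    return s["limit_mps"] if s["friction"] >= 0.6 else 12.0  # slows on low grip


before = run_suite(release_1, SCENARIOS)
after = run_suite(release_2, SCENARIOS)
print(before)                      # {'dry_road': True, 'wet_road': False}
print(after)                       # {'dry_road': True, 'wet_road': True}
print(regressions(before, after))  # []
```

Running the comparison in the other direction, `regressions(after, before)`, returns `['wet_road']`, which is how a rollback or a faulty update would be flagged. A scenario missing from the candidate results is treated as a regression, so dropping a test cannot silently pass.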
Does the requirement for retroactive compatibility make sign-off harder or easier as you progress through the design cycle?
That is a tough question to answer because it is a system-level problem. Companies are going to have to perform validation in a way that exercises everything in that system properly, including the semiconductors. The industry has not figured out how to do that within those complex systems yet, although remote automated around-the-clock simulation is likely to play a very important role if for no other reason than the sheer bandwidth required to test all the scenarios in a cost-effective and timely manner.
Updating from lessons learned
Most people are not used to thinking of their cars as evolving systems. They are used to thinking of them as systems that degrade over time. Cars are bought, they are used, they break down and are repaired. They rust, their performance suffers, they eventually wear out, and they are scrapped.
In comparison, a modern autonomous vehicle must be kept at peak operating performance for its entire lifetime. Are issues addressed inside the chips, circuits, mechanicals, or externally? The answer is: all of the above. They are addressed as a whole system, at any of those blocks, or within the components within those blocks. And the variety in models and available features adds considerable complexity and difficulty.
Original Equipment Manufacturers (OEMs) should be working constantly to find unknown unsafe scenarios and move them into the known unsafe category, where they can be addressed. This work is performed during the research and development phase or the production development phase, regardless of whether AI is used in the development process. Then, the learned values are locked in for production.
As time passes, the algorithms become “smarter” as they learn new scenarios and better ways to respond, but this work is still done in a development environment. Then the fielded systems can be updated, which in turn feeds into the need for further refinements and updates. And after every update, there must be validation for safety.
The importance of accurate and complete data
We need much better diagnostics on these vehicles. And, when there is a problem, we are going to need forensics. Accurate and complete data is paramount and makes everything else possible.
Diagnostics is going to be an area of significant change. Diagnostic systems must provide a better and more comprehensive understanding of how the system is reacting to the world around the vehicle than we are capturing right now. We must develop the ability to take that information, send it back to a data center, process it thoroughly, and perhaps identify new learning to help develop future software updates.
In instances where the car crashes, forensics will be required to understand what the system was thinking before the crash happened, and what its view of reality was before it did whatever it did. That is going to require work in both the software and hardware realms to get access to all that data. And it is going to take time and the development of validation processes to ensure that the forensic data is being interpreted and applied accurately.
SOTIF itself is not vulnerable, but vulnerabilities elsewhere can negatively impact efforts to achieve SOTIF. These challenges are significant, but not insurmountable. Conquering them starts with properly trained humans performing their work accurately and completely. Although this is an incredibly complex, technology-rich environment, the key to success in functional safety lies where it begins, with the people.