Deploying neural networks in automotive, industrial, and embedded safety-critical systems requires reconciling two fundamentally different engineering philosophies: the probabilistic, data-driven world of machine learning and the deterministic, evidence-based world of functional safety. This guide explains the standards that govern these environments — ISO 26262, SOTIF/ISO 21448, IEC 61508, and IEC 62443 — and how to structure an ML deployment that a safety engineer can accept.
Last reviewed: May 2026
Functional safety is the part of the overall safety of a system that depends on the correct operation of electrical, electronic, and programmable electronic components. When an AI or ML component is deployed in a safety-critical system — autonomous emergency braking, industrial emergency shutdown, robotic motion planning — functional safety standards require evidence that the system fails safely, behaves correctly within its defined operating conditions, and can tolerate hardware and software faults. The challenge for machine learning is that these standards were designed for deterministic software, and neural networks are not deterministic in the traditional sense.
The functional safety standards that govern automotive and industrial systems — ISO 26262, IEC 61508, IEC 62061 — were developed for deterministic software: code that processes inputs through defined logic and produces outputs that can be formally specified, tested against structural coverage criteria, and verified to be correct within the system's operational conditions.
Neural networks do not fit this model. They are trained — not programmed — and their behaviour emerges from millions of learned parameters rather than explicit logic. The same input presented twice may produce slightly different outputs depending on hardware floating-point behaviour and thread scheduling. Their failure modes are statistical, not deterministic: they perform well on average but may fail on specific inputs that were under-represented in training data. And they cannot be exhaustively verified against a formal specification because no formal specification exists — the specification is implicit in the training data.
This creates a fundamental tension that every engineer deploying AI in a safety-critical system must resolve: how do you build a safety case for a component whose behaviour cannot be formally specified or verified in the traditional sense? The answer — now emerging from standards bodies, industry consortia, and early programme experience — involves a combination of architectural patterns (safety monitors, ODD definition, fallback paths) and statistical evidence (large validation datasets, robustness testing) that together constitute an arguable safety case.
The six core tensions below represent the engineering challenges that every safety-critical ML deployment must address.
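Functional safety standards (ISO 26262, IEC 61508) assume deterministic behaviour: given the same inputs, a safety function produces the same outputs every time. Neural networks do not offer this guarantee by default. Training with stochastic gradient descent and dropout is itself randomised, so two training runs on the same data yield different models. At inference time dropout is disabled, but identical inputs can still produce slightly different outputs, because floating-point addition is not associative and parallel hardware may accumulate partial sums in a different order from run to run.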
Safety cases require traceability — you must be able to explain why a system makes a decision and demonstrate that no unsafe decision is possible within the defined operational design domain (ODD). Deep neural networks are fundamentally opaque. Explainability methods (SHAP, LIME, attention maps) provide approximations, not proofs.
A model trained on a distribution of training data may behave unexpectedly on inputs that fall outside that distribution — a problem known as distribution shift. Safety standards require that the system behaves correctly across its full defined operational condition set. Distribution shift is fundamentally at odds with this requirement.
Traditional safety-critical software is verified against formal specifications using structural coverage metrics (MC/DC for ASIL D). Neural networks cannot be verified this way — the 'specification' is implicit in the training data. Statistical validation on large held-out datasets replaces formal verification, but safety authorities differ on what constitutes sufficient evidence.
Safety-certified software is subject to strict change management: any modification requires re-assessment. Neural network models may be fine-tuned, updated, or replaced as data accumulates. Each update potentially invalidates the existing safety case and requires re-validation — creating tension between operational improvement and certification maintenance.
Edge safety systems have tight end-to-end latency budgets — often under 10ms for real-time control loops. Neural network inference on constrained hardware (MCUs, FPGAs, small SoCs) must fit within this budget while leaving margin for the rest of the control path. Quantization and pruning help but introduce their own failure modes.
ISO 26262 is the automotive functional safety standard. The 2018 edition applies to safety-related electrical and electronic systems in series-production road vehicles, excluding mopeds. The 3,500 kg maximum gross vehicle mass limit of the 2011 edition was removed, so trucks, buses, and motorcycles are now in scope. The standard defines the Automotive Safety Integrity Level (ASIL) framework: four levels (A through D) that specify the rigour of the safety engineering process required for a given safety function. ASIL D is the most stringent; ASIL A is the least.
The ASIL level for a function is determined through a Hazard Analysis and Risk Assessment (HARA) that considers three factors: severity (S) of the potential harm, exposure (E) to the hazardous situation, and controllability (C) by the driver or operator. The combination of S, E, and C determines the ASIL assignment.
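The S, E, C combination described above can be sketched in a few lines. This is a minimal illustration that uses the widely quoted additive shortcut for the ISO 26262-3 determination table (classes S1 to S3, E1 to E4, C1 to C3); the function name `determine_asil` is hypothetical, class 0 in any factor is simplified to "QM", and the table in the standard remains the authoritative source.

```python
# Sketch of ASIL determination from severity, exposure and controllability.
# Assumption: the additive shortcut (S + E + C) reproduces the standard's table.

def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return 'QM', 'A', 'B', 'C' or 'D' for the given S, E, C classes."""
    if not (0 <= severity <= 3 and 0 <= exposure <= 4 and 0 <= controllability <= 3):
        raise ValueError("expected S0-S3, E0-E4, C0-C3")
    # Class 0 in any factor means no ASIL is assigned (simplified to QM here).
    if 0 in (severity, exposure, controllability):
        return "QM"
    total = severity + exposure + controllability
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")


# Highest severity, highest exposure, hardest to control.
print(determine_asil(3, 4, 3))
# Same hazard, but normally controllable by the driver.
print(determine_asil(3, 4, 1))
```

Lowering any single factor by one class lowers the result by one level, which is why exposure and controllability arguments in the HARA receive close scrutiny.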
The V-model is the development process framework: requirements are specified and decomposed on the left arm of the V; implementation is at the bottom; integration, verification, and validation testing are on the right arm, mirroring each left-arm artefact. Every safety requirement must be traced from the vehicle-level safety goal through system, hardware, and software levels, and verified at the corresponding level on the right arm.
SOTIF (ISO 21448:2022) extends ISO 26262 to cover hazards that arise from the limitations of the intended function itself — not from failures, but from situations where the system behaves exactly as designed but the design is insufficient for the actual operating conditions. This is directly relevant to ML: a perception model that misclassifies a pedestrian in unusual lighting is not "failing" in the traditional sense — it is working as trained, but the training was insufficient for that condition. SOTIF analysis requires identifying and systematically testing these triggering conditions.
ASIL A
Lowest functional safety requirement. Applicable to systems where hazard severity is low or exposure is infrequent.
ML Deployment Implication
Statistical validation on moderate test sets may be accepted. Basic runtime monitors sufficient.
ASIL B
Moderate safety requirement. Commonly applied to driver assistance features with moderate hazard potential.
ML Deployment Implication
Requires documented ODD, performance metrics across ODD boundary cases, and monitoring for out-of-distribution inputs.
ASIL C
High safety requirement. Applies to systems where failures could lead to severe injury without external mitigation.
ML Deployment Implication
Comprehensive safety case required. Runtime assurance monitors, explicit fallback path, SOTIF analysis mandatory. Third-party review typically expected.
ASIL D
Highest automotive functional safety level. Applies to systems where failure can lead to life-threatening hazards.
ML Deployment Implication
Neural network components are typically restricted to perception/advisory roles, never in the final safety-critical control path. Redundant deterministic monitors required. Current industry consensus: pure ML components are not certifiable to ASIL D as sole safety function.
IEC 61508 is the generic functional safety standard for electrical, electronic, and programmable electronic (E/E/PE) safety-related systems. It is the parent standard from which sector-specific standards derive: IEC 62061 (machinery), EN 50128 (railway), IEC 61511 (process industry), and IEC 61513 (nuclear). The Safety Integrity Level (SIL) framework is IEC 61508's equivalent of ASIL: four levels (SIL 1–4), each defined by a target failure measure for the safety function. For functions that operate on demand at a low rate, the measure is the average probability of failure on demand (PFD); for high-demand or continuous functions, it is the frequency of dangerous failures per hour.
For AI deployments in industrial systems, the key question is: what SIL level is assigned to the safety function that the AI component contributes to? This determines the rigour of the validation evidence required and the constraints on how the AI component can be used within that function.
The general principle is the same as in ISO 26262: neural network components can contribute to lower-SIL functions as advisory or initiating elements, but the highest-integrity safety functions require deterministic implementations or deterministic safety monitors that override the ML component's outputs.
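To make the PFD-based definition of SIL concrete, the sketch below maps an average probability of failure on demand to a SIL band. It assumes low-demand mode and the commonly cited decade bands from IEC 61508-1; the function name `sil_from_pfd` is hypothetical, and a real assessment also considers architectural constraints and systematic capability, which this arithmetic ignores.

```python
# Sketch: low-demand SIL band for a given average PFD.
# Assumption: decade bands, lower bound inclusive, upper bound exclusive.

def sil_from_pfd(pfd_avg: float) -> int:
    """Return the SIL band (1-4), or 0 if the value is outside the tabulated bands."""
    bands = [
        (1e-5, 1e-4, 4),
        (1e-4, 1e-3, 3),
        (1e-3, 1e-2, 2),
        (1e-2, 1e-1, 1),
    ]
    for low, high, sil in bands:
        if low <= pfd_avg < high:
            return sil
    return 0


# A detector that misses one real demand in 200 falls in the SIL 2 band.
print(sil_from_pfd(1 / 200))
```

For an ML element in the loop, the PFD contribution is dominated by missed detections, so the band it can support is bounded by the miss rate that validation can actually demonstrate.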
SIL 1
Lowest Safety Integrity Level. Applies to functions where a single failure is unlikely to cause a dangerous event without multiple contributing factors.
ML Note
ML-based advisory systems that feed into a SIL 1 safety function must demonstrate statistical reliability within the defined operating conditions.
SIL 2
Intermediate level. Common in process industry safety instrumented functions: emergency shutdowns, pressure relief initiation, fire and gas detection.
ML Note
ML anomaly detection feeding a SIL 2 shutdown must be treated as an initiating element in the safety loop. Its probability of missing a real demand (the dangerous failure) must be demonstrated, and its spurious trip rate (a safe failure that still costs availability) should be quantified.
SIL 3
High integrity level. Found in chemical plant emergency systems, railway signalling, and nuclear instrumentation support functions.
ML Note
Neural network components are not currently accepted as SIL 3 safety function implementations by most certification bodies. ML may support diagnostics or non-safety-critical data paths alongside a proven deterministic safety layer.
SIL 4
Highest IEC 61508 level. Nuclear reactor protection systems, railway vital functions. Extremely rare in practice.
ML Note
No ML component is accepted at SIL 4 as part of the safety function itself at this time.
IEC 62443 is the international series of standards for industrial automation and control system (IACS) cybersecurity. It defines a zone-and-conduit model: the OT network is divided into zones based on the criticality of the assets they contain, and communications between zones must pass through defined conduits with appropriate security controls.
For AI deployments, the IEC 62443 question is: in which zone does the AI inference server sit, and what security controls govern its communications with the control network? Placing an AI inference server that communicates directly with PLCs or field devices without appropriate conduit controls violates the zone model and creates a cybersecurity risk — an attacker who compromises the AI server gains a pathway into the control zone.
Additionally, AI systems are a novel attack surface: adversarial inputs — carefully crafted inputs designed to cause the model to produce incorrect outputs — can cause safety-relevant failures. IEC 62443 cybersecurity analysis for AI systems should include adversarial robustness testing as well as conventional IT security controls.
Zone 4 (Enterprise)
Business IT networks: ERP, corporate email, cloud connectivity. AI models deployed here have the broadest attack surface and the least OT impact if compromised, but also the weakest data freshness and highest latency.
AI Placement Guidance
Appropriate for reporting AI, document summarisation, business analytics. NOT appropriate for real-time control or safety-adjacent inference.
Zone 3 (Supervisory)
Supervisory control systems, historian databases, SCADA servers. AI inference deployed here can access real-time process data via OPC-UA or OSIsoft PI without direct access to field devices.
AI Placement Guidance
Appropriate for predictive maintenance, process optimisation advisory, anomaly detection. Requires IEC 62443-3-3 conduit controls to Zone 2.
Zone 2 (Control)
PLCs, DCS, field controllers. Safety-critical control loops run here. AI inference in Zone 2 is unusual and requires SL (Security Level) 2 hardening: authentication, encrypted communications, audit logging.
AI Placement Guidance
Runtime assurance monitors and out-of-distribution detectors can operate here if hardware constraints permit. Direct safety function responsibility requires functional safety analysis (ISO 26262 / IEC 61508).
Zone 1 (Field)
Sensors, actuators, smart instruments. Extremely constrained devices, typically MCUs or simple RTUs. AI inference here is limited to TinyML models (INT8 quantized, sub-1MB) on dedicated inference co-processors.
AI Placement Guidance
Anomaly detection on raw sensor streams. Feasible with MCU-class hardware at SL 1. Safety function responsibility at this layer requires formal IEC 61508 assessment.
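The zone-and-conduit rule described above can be expressed as a simple policy check: traffic between zones is allowed only over an explicitly declared conduit that carries the required controls. The sketch below is illustrative only. The `Conduit` type, the `flow_permitted` function, and the example policy are hypothetical, and authentication plus encryption stand in for whatever control set the target Security Level actually requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    """A declared communication path between two zones (hypothetical model)."""
    source: str
    destination: str
    authenticated: bool
    encrypted: bool


def flow_permitted(source: str, destination: str, conduits: list) -> bool:
    """Allow traffic inside a zone, or across zones only via a secured conduit."""
    if source == destination:
        return True
    return any(
        c.source == source
        and c.destination == destination
        and c.authenticated
        and c.encrypted
        for c in conduits
    )


# Example policy: one secured conduit, one declared but unencrypted conduit.
CONDUITS = [
    Conduit("supervisory", "control", authenticated=True, encrypted=True),
    Conduit("enterprise", "supervisory", authenticated=True, encrypted=False),
]

# An inference server in the supervisory zone may reach the control zone.
print(flow_permitted("supervisory", "control", CONDUITS))
# A server in the enterprise zone has no conduit to the control zone.
print(flow_permitted("enterprise", "control", CONDUITS))
```

A check of this kind is useful in design review or CI against a network inventory; enforcement itself belongs in firewalls and gateways, not in application code.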
Hyperion advises on the AI and edge architecture layer: where ML fits within the safety architecture, how to structure the safety monitor and fallback path, which edge toolchain is appropriate for your hardware, and how to prepare the safety evidence package. We are not a certification body — but we help you build an architecture that a certification body can accept.
A safety case is a structured argument, supported by evidence, that a system is acceptably safe for a given application in a given environment. For ML components, the safety case must address the specific failure modes of neural networks — distribution shift, opacity, non-determinism — using architectural patterns and statistical evidence rather than formal verification. The following six elements constitute the core of an arguable safety case for an ML component at the edge.
Define the precise conditions under which the ML component is valid: environmental conditions, input data ranges, sensor operating ranges, speed/load envelopes. The ODD is the contract between the ML system and the safety case. Any input outside the ODD must trigger a safe state — the ML system must not silently produce unsafe outputs on out-of-domain inputs.
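A minimal sketch of an ODD gate follows, assuming the ODD can be written as closed numeric ranges per signal. The signal names, limits, and state labels are hypothetical placeholders; a real ODD is derived from the hazard analysis and usually includes categorical and combined conditions as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        # NaN fails both comparisons, so it is treated as out of range.
        return self.low <= value <= self.high


# Hypothetical ODD: every signal must be present and inside its range.
ODD = {
    "speed_kph": Range(0.0, 60.0),
    "ambient_lux": Range(50.0, 100_000.0),
    "temperature_c": Range(-20.0, 50.0),
}


def odd_violations(sample: dict) -> list:
    """Return the names of signals that are missing or outside the ODD."""
    violations = []
    for name, allowed in ODD.items():
        value = sample.get(name)
        if value is None or not allowed.contains(value):
            violations.append(name)
    return violations


def gate(sample: dict, infer):
    """Run inference only inside the ODD; otherwise request the safe state."""
    if odd_violations(sample):
        return ("SAFE_STATE", None)
    return ("ML_ACTIVE", infer(sample))
```

Treating a missing signal as a violation is a deliberate choice in this sketch: an input the system cannot place inside the ODD is handled the same way as one known to be outside it.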
A parallel deterministic monitor — implemented in conventional software or hardware — watches the ML component's outputs and blocks or overrides any output that violates the safety constraint. This is the standard pattern for deploying ML in safety-critical systems: the ML component is advisory, the deterministic monitor is authoritative. The monitor must itself be certified to the required ASIL/SIL level.
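The advisory/authoritative split can be sketched as a small deterministic envelope check on the ML output. The class name `CommandMonitor`, the limit values, and the choice of a fixed safe value are assumptions for illustration; a production monitor would be developed and qualified to the target ASIL/SIL and would typically run on independent hardware.

```python
import math
from dataclasses import dataclass


@dataclass
class CommandMonitor:
    """Deterministic envelope check on an ML-proposed command (hypothetical)."""
    max_abs: float           # absolute limit on the command
    max_step: float          # maximum change allowed per cycle
    safe_value: float = 0.0  # value issued when the ML output is rejected
    last_output: float = 0.0

    def check(self, ml_command: float):
        """Return (command_to_actuate, overridden)."""
        finite = math.isfinite(ml_command)
        in_range = finite and abs(ml_command) <= self.max_abs
        rate_ok = finite and abs(ml_command - self.last_output) <= self.max_step
        if in_range and rate_ok:
            self.last_output = ml_command
            return ml_command, False
        # Reject the ML output and fall back to the safe value.
        self.last_output = self.safe_value
        return self.safe_value, True
```

The monitor never tries to judge whether the ML output is good, only whether it is inside a constraint set that can be specified and verified conventionally, which is what makes it certifiable where the model is not.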
An OOD detector — typically based on input reconstruction error, ensemble disagreement, or Mahalanobis distance — flags inputs that fall outside the training distribution. On OOD detection, the system transitions to the fallback path rather than continuing to infer. OOD detectors must be validated for their own false-negative rate within the safety case.
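As a sketch of the distance-based approach, the detector below scores an input by its standardised distance from the training feature statistics. This is the Mahalanobis distance under the simplifying assumption of a diagonal covariance matrix (independent features); a full implementation would use the inverse of the complete covariance matrix. The class name, threshold, and feature values are hypothetical.

```python
import math
from statistics import mean, pstdev


class DiagonalMahalanobisDetector:
    """OOD score assuming independent features (diagonal covariance)."""

    def __init__(self, training_rows, threshold: float):
        columns = list(zip(*training_rows))
        self.means = [mean(c) for c in columns]
        # Guard against zero variance so the division below is defined.
        self.stds = [pstdev(c) or 1e-9 for c in columns]
        self.threshold = threshold

    def distance(self, x) -> float:
        return math.sqrt(
            sum(((v - m) / s) ** 2 for v, m, s in zip(x, self.means, self.stds))
        )

    def is_ood(self, x) -> bool:
        return self.distance(x) > self.threshold


# Two-feature toy training set: means (2, 12), standard deviations (1, 2).
detector = DiagonalMahalanobisDetector([(1.0, 10.0), (3.0, 14.0)], threshold=3.0)
print(detector.is_ood((3.0, 14.0)))  # one standard deviation out on each axis
print(detector.is_ood((6.0, 12.0)))  # four standard deviations out on one axis
```

The threshold sets the trade-off between missed OOD inputs and nuisance fallbacks, so it has to be chosen and justified on validation data as part of the safety case.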
Every safety-critical ML deployment must have a defined fallback: a safe state that the system enters when the ML component fails, is overridden by the safety monitor, or detects an OOD input. The transition to safe state must itself be ASIL/SIL-assessed. Common fallback patterns: limit/conservative mode (slow speed, increased safety margins), driver/operator takeover request, controlled stop.
Structural coverage criteria such as MC/DC say little about behaviour that is encoded in learned weights rather than in code, so statistical validation on a large, independent, representative test set takes their place. The size and composition of the test set, the performance metrics, and the confidence intervals must be documented and accepted by the safety authority. ISO 21448 (SOTIF) and ISO/PAS 8800 provide emerging guidance.
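To show why test-set size matters, the sketch below computes how many failure-free test cases are needed to claim a failure rate below a target at a given confidence. It assumes independent test cases drawn from the ODD and zero observed failures, which gives the bound (1 - p)^N <= 1 - c. Both function names are hypothetical, and real validation data rarely satisfies the independence assumption exactly.

```python
import math


def trials_needed(max_failure_rate: float, confidence: float) -> int:
    """Failure-free trials needed to claim failure rate <= max_failure_rate."""
    if not (0 < max_failure_rate < 1 and 0 < confidence < 1):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_failure_rate))


def upper_bound_zero_failures(trials: int, confidence: float) -> float:
    """Upper confidence bound on the failure rate after zero failures in `trials`."""
    return 1 - (1 - confidence) ** (1 / trials)


n = trials_needed(max_failure_rate=1e-3, confidence=0.95)
print(n, upper_bound_zero_failures(n, 0.95))
```

The required count grows roughly as 3 divided by the target rate at 95% confidence, which is why very low failure-rate claims cannot rest on test data alone and need the architectural measures described in this section.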
The overall safety function is decomposed into sub-elements, each assigned an ASIL/SIL level. The ML component is typically assigned a lower ASIL/SIL level, with the difference made up by the deterministic safety monitor. This ASIL decomposition must be formally documented and reviewed. The total system must meet the top-level ASIL/SIL requirement.
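The decomposition arithmetic can be sketched as a lookup against the schemes listed in ISO 26262-9, which are commonly summarised by ranking QM to D as 0 to 4 and requiring the two elements to add up to the target. The function name is hypothetical, and the check deliberately ignores the other condition for decomposition, namely sufficient independence between the two elements.

```python
# Sketch: is (first, second) one of the listed decomposition schemes for target?
# Assumption: rank arithmetic QM=0, A=1, B=2, C=3, D=4 reproduces the listed pairs.

ASIL_RANK = {"QM": 0, "A": 1, "B": 2, "C": 3, "D": 4}


def is_listed_decomposition(target: str, first: str, second: str) -> bool:
    """True if the pair matches a listed scheme, e.g. D -> B(D) + B(D)."""
    if target not in ("A", "B", "C", "D"):
        raise ValueError("target must be ASIL A, B, C or D")
    return ASIL_RANK[first] + ASIL_RANK[second] == ASIL_RANK[target]


# ML perception element at ASIL B(D) paired with a monitor at ASIL B(D).
print(is_listed_decomposition("D", "B", "B"))
# ML element at QM(D) requires the monitor to carry the full ASIL D(D).
print(is_listed_decomposition("D", "QM", "D"))
```

Whichever pair is chosen, the notation keeps the original target in parentheses because confirmation measures and integration activities remain at the undecomposed level.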
Standards note: The methodology for constructing safety cases for ML components is actively evolving. ISO/TR 4804, UL 4600, and ISO/PAS 8800:2024 (Road vehicles: Safety and artificial intelligence) provide the most current guidance. Certification bodies and automotive OEMs are still establishing their acceptance positions on specific techniques: what constitutes sufficient statistical evidence, which OOD detection approaches are acceptable, and how ASIL decomposition applies to ML-conventional software pairs. Always engage with your target certification body early to align on the evidence package.
Safety-critical edge systems impose constraints that rule out most general-purpose AI deployment patterns. The following are the engineering constraints that shape every edge AI deployment in an automotive or industrial safety context, and the toolchain choices that address them.
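Automotive safety systems typically require end-to-end latency under 10–50ms. Industrial control loops can be tighter: 1–10ms for hard real-time functions. Edge AI inference must be profiled against the available timing margin at the tail of the latency distribution (99th percentile and observed maximum), not at the average. A percentile is not a worst case, so hard real-time functions additionally need a bounded worst-case execution time.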
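The difference between average and tail latency is easy to see on a small sample. The sketch below summarises a list of measured inference latencies against a budget using a nearest-rank percentile; the function names, the sample values, and the 10 ms budget are illustrative assumptions, and measurements should come from the target hardware under representative load.

```python
import math


def percentile(samples, pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timing_report(latencies_ms, budget_ms: float) -> dict:
    """Summarise latencies; the budget is judged on the observed maximum."""
    worst = max(latencies_ms)
    return {
        "mean_ms": sum(latencies_ms) / len(latencies_ms),
        "p99_ms": percentile(latencies_ms, 99),
        "max_ms": worst,
        "within_budget": worst <= budget_ms,
    }


# 98 fast runs, one slow run, one outlier that breaks a 10 ms budget.
measured = [2.0] * 98 + [4.0, 12.0]
print(timing_report(measured, budget_ms=10.0))
```

In this sample both the mean and the 99th percentile sit comfortably inside the budget while the maximum does not, which is the case an average-only profile hides.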
Safety-critical edge systems must not depend on cloud connectivity for real-time inference. The cloud may be used for model training, monitoring, and over-the-air update orchestration, but the inference path must function fully offline. Network partitions are a design condition, not an edge case.
Automotive ECUs and industrial PLCs/edge controllers typically run on ARM Cortex-M/R or RISC-V cores with 256KB–4MB RAM. Full deep learning models do not fit. Techniques: INT8 quantization (TensorFlow Lite Micro, ONNX Runtime for MCUs), structured pruning, knowledge distillation to smaller architectures (MobileNet, EfficientNet-Lite, custom CNNs).
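INT8 quantization replaces each floating-point value with an 8-bit integer plus a shared scale and zero point. The sketch below shows generic affine quantization arithmetic in plain Python; it is not the exact scheme of any particular runtime, and the function names and the 0 to 2.55 calibration range are assumptions chosen to keep the numbers readable.

```python
# Sketch of affine INT8 quantization: real_value ~ (q - zero_point) * scale.

def quant_params(x_min: float, x_max: float):
    """Derive scale and zero point for a calibration range that includes zero."""
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        raise ValueError("calibration range must not be empty")
    scale = (x_max - x_min) / 255.0
    zero_point = round(-128 - x_min / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a real value to INT8, saturating at the type limits."""
    return max(-128, min(127, round(x / scale) + zero_point))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale


scale, zp = quant_params(0.0, 2.55)
inside = quantize(1.234, scale, zp)
outside = quantize(5.0, scale, zp)  # beyond the calibration range
print(dequantize(inside, scale, zp), dequantize(outside, scale, zp))
```

Values inside the calibrated range lose at most half a quantization step, while values outside it saturate silently. That saturation is one of the quantization failure modes that validation on the quantized model, not the FP32 original, has to cover.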
The dominant production deployment path for edge AI in constrained environments: train in PyTorch → export to ONNX → optimise with TensorRT (NVIDIA Jetson / Drive) or ONNX Runtime with XNNPACK/NNAPI delegate (ARM SoC). ONNX Runtime also supports MCU deployment via ONNX Runtime for Microcontrollers. These toolchains have documented behaviour that must be validated in the safety case.
Safety systems require that the same input always produces the same output. This requires: fixed-point arithmetic (INT8 instead of FP32), disabled runtime optimisations that vary between runs, fixed thread affinity, no dynamic memory allocation during inference. Achieving bit-exact determinism across deployments requires careful toolchain configuration and hardware validation.
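The reason integer arithmetic helps with determinism is that integer addition is associative while floating-point addition is not, so the order in which a parallel runtime accumulates partial sums can change a floating-point result. The small sketch below demonstrates this by summing the same three values in every possible order; the helper names are hypothetical.

```python
from itertools import permutations


def accumulate(values):
    """Sum values strictly left to right, as a sequential accumulator would."""
    total = 0
    for v in values:
        total += v
    return total


def distinct_sums(values) -> set:
    """Collect every result obtainable by changing the accumulation order."""
    return {accumulate(order) for order in permutations(values)}


float_sums = distinct_sums([0.1, 0.2, 0.3])
int_sums = distinct_sums([13, 26, 38])

# More than one float result; exactly one integer result.
print(sorted(float_sums), sorted(int_sums))
```

Integer accumulation removes this source of variation but not the others listed above, so fixed thread affinity and a pinned runtime configuration are still needed for bit-exact behaviour across runs.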
Edge safety controllers have fixed memory envelopes — often no more than 4–16MB Flash and 1–4MB RAM shared across the entire application. Power budgets on battery-operated or thermally-constrained devices limit inference frequency. Model quantization and scheduled (rather than continuous) inference are common solutions.
The following is a factual account of Hyperion's background as it relates to edge AI in safety-critical systems. These are verified facts, not marketing claims.
Scope disclosure: Hyperion is an AI and edge architecture consultancy. We advise on the AI/edge layer — architecture, toolchain selection, safety monitor design, ODD definition, evidence package structure. We are not a certification body, a notified body, or an accredited safety assessor. We do not issue ASIL or SIL certifications. Formal certification work requires an accredited third party.
Founder Mohammed Cherifi spent 17+ years in automotive and embedded systems engineering, including work at Renault-Nissan-Mitsubishi Alliance, Cisco, and ABB. This background includes direct exposure to functional safety processes in vehicle software development, embedded control systems, and the operational constraints of safety-critical environments. Hyperion brings this experience to the AI/edge layer — not as a certification body, but as an engineering team that understands the environment.
Hyperion's flagship venture, Auralink, is an edge-deployed agent platform built on 400+ microservices with approximately 20 AI agents. Auralink's architecture demonstrates the engineering patterns required for AI inference on constrained edge hardware — low-latency inference paths, deterministic control boundaries, and separation between the AI advisory layer and the authoritative control layer. This is transferable proof of the architectural discipline required for safety-adjacent edge AI, not a safety certification.
Hyperion's physical-ai-deployment service covers edge AI architecture, embedded inference toolchain selection, and the integration layer between AI inference and OT/embedded control systems. Our role is the AI and edge architecture layer. We advise on where ML components fit within a safety architecture, how to structure the safety monitor and fallback path, and which toolchains are appropriate for the hardware. We are not a certification body and do not issue ASIL/SIL certifications — that work requires a notified body.
A preprint published on arXiv covers autonomous edge-deployed AI agents for physical infrastructure. This is academic-adjacent work — a preprint, not a peer-reviewed journal publication — but it reflects the architectural depth Hyperion applies to edge AI deployments in physical systems.
Mohammed Cherifi holds the AI Ambassador credential from the French Government's Osez l'IA programme and has been recognised by FranceNum. This reflects engagement with AI policy and the regulatory challenges of deploying AI in regulated industrial and automotive environments.
An ML component cannot be certified to ASIL D as the sole safety function under current methodology and the acceptance positions of most certification bodies. The standard approach is to deploy the ML component at a lower ASIL/SIL level (e.g., ASIL B or QM), with a deterministic safety monitor certified to the higher level making up the difference through ASIL decomposition. The total system can then meet ASIL D, but the ML component itself does not carry that designation. This position reflects current ISO 26262-6:2018 guidance and IEC TR 63069:2019; it will evolve as the standards bodies develop ML-specific guidance (ISO/TR 4804, ISO/PAS 8800).
SOTIF (Safety of the Intended Function), published as ISO 21448:2022, addresses hazards that arise not from system failures but from the limitations of the intended function itself — perception gaps, unexpected environmental conditions, and behaviour at ODD boundaries. ISO 26262 covers failures of the system relative to its specification. SOTIF covers inadequacies in the specification itself. For ADAS and autonomous functions, both standards apply: you need to demonstrate both that the system fails safely (ISO 26262) and that its intended behaviour is safe across the full ODD (SOTIF/ISO 21448).
A safety monitor is a deterministic, independently verified software or hardware component that runs in parallel with the ML component. It checks the ML output against a set of formal safety constraints — physical limits, rate-of-change bounds, consistency with sensor data — and blocks or overrides any output that violates these constraints. The safety monitor is certified to the required ASIL/SIL level independently of the ML model. This separation is the key architectural pattern for deploying ML in safety-critical systems: the ML component is advisory, the monitor is authoritative.
The ODD defines the specific conditions under which the ML system is valid: environmental parameters (temperature, lighting, weather), input data characteristics (sensor ranges, data formats, signal quality), vehicle or machine operating states, and geographical or application-specific constraints. Any input that falls outside the ODD should not be processed by the ML component — the system must transition to a fallback state. Defining, validating, and monitoring the ODD boundary is one of the most important engineering tasks in a safety-critical ML deployment.
IEC 62443 defines a zone-and-conduit model for industrial cybersecurity. AI inference servers, like any IT asset, must be placed in the correct zone (typically Zone 3 — Supervisory) and all communications with Zone 2 (Control) must pass through a conduit with defined security controls: authentication, encryption, message integrity verification. Deploying an AI inference server that communicates directly with PLCs or field devices without conduit controls violates the zone model. The AI server itself must meet the Security Level (SL) requirements for its zone, including patch management, authentication, and audit logging.
There is no single accepted list of toolchain configurations for safety-critical edge inference: the right choice depends on the target ASIL/SIL level, the certification body, and the specific use case. In general, TensorRT-optimised models on NVIDIA Drive/Jetson platforms are widely used in ADAS programmes up to ASIL B, with runtime assurance monitors. ONNX Runtime on ARM SoCs is used in industrial applications at SIL 1–2. For higher integrity levels, formal qualification of the inference runtime itself (tool qualification under ISO 26262-8) may be required. Hyperion advises on toolchain selection and qualification strategy; the formal qualification work requires the toolchain vendor's involvement.
Hyperion does not perform ASIL or SIL certification. Hyperion is an AI and edge architecture consultancy. We advise on where ML components fit within safety architectures, how to structure safety monitors and fallback paths, which edge toolchains are appropriate for constrained hardware, and how to design deployments that are compatible with functional safety and IEC 62443 requirements. The actual ASIL/SIL certification assessments, meaning the conformity assessment work that produces a certificate, must be performed by an accredited third-party assessor or notified body. Hyperion can help you prepare for that assessment and design the architecture to make it tractable, but we do not issue certifications.
Functional safety (ISO 26262, IEC 61508) asks: does the system fail safely? It focuses on hardware and software failure modes, fault tolerance, and the integrity of safety functions when components fail. IEC 62443 asks: is the system protected from deliberate attack? It focuses on cybersecurity controls — authentication, encryption, network segmentation, vulnerability management. Both are required for industrial AI systems: a system can be functionally safe (it fails gracefully) but cybersecurity-vulnerable (it can be attacked and made to fail). For AI systems specifically, adversarial attacks are a point of overlap — a deliberately adversarial input can cause unsafe ML behaviour that a functional safety assessment alone would not surface.
ISO 26262:2018 (2018). "Road vehicles — Functional safety (Parts 1–12)."
Context: The primary automotive functional safety standard. Part 6 covers software development requirements including ASIL-specific structural coverage metrics. Part 10 is an informative guideline on applying the standard. The 2018 edition contains no ML-specific requirements; guidance on AI/ML components is being developed in companion documents such as ISO/PAS 8800.
ISO 21448:2022 (2022). "Road vehicles — Safety of the intended functionality (SOTIF)."
Context: Addresses hazards from functional insufficiencies in the specification or performance of intended functionality, including sensor limitations and ODD boundary behaviour in ADAS and autonomous driving systems.
ISO/TR 4804:2020 (2020). "Road vehicles — Safety and cybersecurity for automated driving systems — Design, verification and validation."
Context: Technical report providing guidance on combining functional safety and cybersecurity analysis for automated driving. Covers SOTIF, ISO 26262, and SAE J3061 cross-references.
IEC 61508:2010 (2010). "Functional safety of electrical/electronic/programmable electronic safety-related systems (Parts 1–7)."
Context: The generic international standard for functional safety of E/E/PE systems. Defines SIL 1–4, probability of failure on demand (PFD), and requirements for software development and validation.
IEC 62443 Series (2018–2024). "Industrial automation and control systems — Security."
Context: Multi-part standard covering cybersecurity for operational technology. IEC 62443-3-2 covers security risk assessment and the partitioning of a system into zones and conduits. IEC 62443-3-3 defines system security requirements and Security Levels (SL). IEC 62443-4-2 covers component security requirements applicable to AI inference servers in OT networks.
IEC TR 63069:2019 (2019). "Industrial-process measurement, control and automation — Framework for functional safety and security."
Context: Technical report addressing the relationship between functional safety (IEC 61508) and cybersecurity (IEC 62443). Directly relevant to AI deployments that span both domains.
UL 4600:2020 (2020). "Standard for Safety for the Evaluation of Autonomous Products."
Context: First comprehensive standard specifically addressing ML-based autonomous systems. Covers safety case construction, ODD definition, operational monitoring, and runtime assurance for ML components. Widely referenced in North American automotive programmes.
ISO/IEC 24029-2:2023 (2023). "Artificial intelligence (AI) — Assessment of the robustness of neural networks — Part 2: Methodology for the use of formal methods."
Context: Provides a methodology for assessing neural network robustness with formal methods; Part 1 (ISO/IEC TR 24029-1:2021) gives an overview of robustness assessment approaches. Robustness to perturbed and adversarial inputs is relevant to both functional safety and IEC 62443 cybersecurity analysis.
Safety-critical ML deployments demand more than good model performance — they require a defensible architecture, the right toolchain for the hardware constraints, and a documented safety case that a certification body can accept. Hyperion brings 17+ years of automotive and embedded systems engineering experience to the AI/edge layer. We advise on architecture and evidence strategy; formal certification remains with the appropriate accredited body. Start with a focused architecture review.
Founder & AI Strategy Lead
Mohammed Cherifi is the founder of Hyperion Consulting, with 17+ years in automotive and embedded systems engineering. He has worked at Renault-Nissan-Mitsubishi Alliance, Cisco, and ABB — environments where functional safety processes govern software development. He specialises in edge AI architecture for physical systems, including safety-adjacent deployments in automotive and industrial OT environments.