EU AI Act for Physical AI: Annex III Risk Categories You Probably Misclassified

A General Counsel emails me a one-page summary of "our AI risk position." The systems listed are: a customer-support chatbot ("low-risk, no biometric data"), an internal HR co-pilot ("limited risk, transparency disclosure planned"), a predictive maintenance model on the production line ("out of scope, no human-affecting decision"), and a vision system inspecting battery welds at the end of the assembly line ("out of scope, no personal data").

Three of the four are wrong.

The chatbot is correctly classified. The HR co-pilot, the predictive maintenance model, and the vision system are all in scope for the EU AI Act's high-risk regime, and the engineering teams that built them have no documentation pathway that will survive a conformity assessment. The General Counsel has no idea, because the legal team read Annex III looking for "biometric identification" and "law enforcement" and concluded that none of the systems matched. Annex III is not the only thing that determines high-risk status. Article 6 contains a separate path through Annex I that sweeps up most Physical AI deployments by default — and that path is the one industrial teams miss.

I have seen this exact misreading in a dozen enterprise teams in the last year — manufacturers, automotive Tier-1 suppliers, energy-infrastructure operators, healthcare-equipment vendors. The misclassifications are consistent enough that they form a pattern. Below are the four classification mistakes that show up most often in Physical AI deployments, what the Regulation actually says — including the 2026 Omnibus update that shifted some of the deadlines — and what the documentation pathway looks like once you are honest about your risk class.

Mistake 1 — "It does not handle personal data, so it is not high-risk"

The most common mistake. A team reads Annex III, sees that the listed categories include biometric identification, recruitment, and credit scoring, decides their machine-vision system inspecting battery welds or PCB solder joints does not touch any of those, and self-classifies as "limited risk." On the strength of that classification, they ship the system into a production line and skip the conformity workflow.

This misses Article 6(1). The high-risk classification has two independent paths. Annex III is one of them; Article 6(1) combined with Annex I is the other. Article 6(1) says an AI system is high-risk if it is intended to be used as a safety component of a product — or is itself a product — covered by the Union harmonisation legislation listed in Annex I, and that legislation requires a third-party conformity assessment before market placement.

Annex I lists the Machinery Regulation, Toy Safety, Recreational Craft, Lifts, ATEX equipment, Radio Equipment, Pressure Equipment, Cableway Installations, Personal Protective Equipment, Gas Appliances, Medical Devices, In Vitro Diagnostic Medical Devices, Civil Aviation, vehicle type-approval (2018/858), Agricultural and Forestry Vehicles, Marine Equipment, and Rail Interoperability. That list covers nearly every machine on a manufacturing floor and most products on a connected fleet.

The 2026 Omnibus added a critical nuance: the Machinery Regulation was moved from Annex I Section A to Section B, which means the Chapter III obligations no longer apply directly to AI systems on Machinery-Regulation products. Instead, the Commission has to amend the Machinery Regulation's Annex III with AI-specific safety requirements, with delegated acts due by 2 August 2028. The other Annex I sectors are unchanged. The net effect for teams: the remaining Section A sectors (medical devices, in vitro diagnostics, lifts, pressure equipment, and the rest) still flow into the AI Act's high-risk regime by the original Article 6(1) path. Vehicles, civil aviation, marine equipment, and rail were already in Section B, where Article 2(2) routes the high-risk requirements through the sectoral legislation rather than applying Chapter III directly. Machinery now joins that slower track but is not exempt.

The Commission's draft guidelines on high-risk classification, released in 2026, also tighten the "safety component" definition: an AI system used solely for user assistance, performance optimisation, efficiency, or quality control does not count as a safety component unless its failure could endanger health or safety. A vision system whose only output is a pass/fail flag for downstream rework is not a safety component. A vision system that triggers a press to stop on detection of a human hand, or a torque controller whose output binds the actuator inside a safety envelope, is.

The fix is to map every shipping AI system against Annex I before reading Annex III, and to be honest about whether the AI output could trigger a safety-relevant action. If the host product is in any of the Annex I regulations, and the AI sits in the safety path, the system is in scope by Article 6(1) regardless of whether anyone's face is captured.
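The two-path check described above can be sketched in a few lines. This is a minimal illustration, not a legal test: the regime labels, the `AISystem` fields, and the `classify` function are hypothetical names invented for this example, and the sketch ignores the Section A / Section B split and any derogations.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical short labels for the Annex I legislation listed in the text.
ANNEX_I_REGIMES = {
    "machinery", "toys", "recreational_craft", "lifts", "atex",
    "radio_equipment", "pressure_equipment", "cableways", "ppe",
    "gas_appliances", "medical_devices", "ivd", "civil_aviation",
    "vehicle_type_approval", "agri_forestry_vehicles",
    "marine_equipment", "rail_interoperability",
}


@dataclass
class AISystem:
    name: str
    host_product_regime: Optional[str]    # Annex I regime of the host product, if any
    third_party_assessment: bool          # does that regime require a third-party assessment?
    in_safety_path: bool                  # could a failure endanger health or safety?
    annex_iii_area: Optional[str] = None  # e.g. "4(b)", if an Annex III use applies


def classify(system: AISystem) -> str:
    """Check the Article 6(1) path first, then the Annex III path."""
    if (system.host_product_regime in ANNEX_I_REGIMES
            and system.third_party_assessment
            and system.in_safety_path):
        return "high-risk via Article 6(1) + Annex I"
    if system.annex_iii_area is not None:
        return f"high-risk via Annex III {system.annex_iii_area}"
    return "not high-risk on these two paths"


# A press guard that stops the machine sits in the safety path;
# a pass/fail flag for downstream rework does not.
press_guard = AISystem("press hand detection", "machinery", True, True)
weld_flag = AISystem("weld pass/fail flag", "machinery", True, False)
print(classify(press_guard))
print(classify(weld_flag))
```

The ordering is the point: the Annex I question is asked before Annex III, and personal data never appears as an input.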

Mistake 2 — "Predictive maintenance is operational, not high-risk"

The second misclassification. A team has trained a model that predicts when a press, a pump, a turbine, or a charger needs service. The model emits a maintenance ticket with a recommended technician skill level and a priority. The team tags this as "operational AI" — internal use, no customer-facing output, no biometric data — and assumes it sits outside the Act.

It does not. Annex III §4(b) lists as high-risk any AI system "intended to be used to make decisions affecting terms of work-related relationships, the promotion or termination of work-related contractual relationships, to allocate tasks based on individual behaviour or personal traits or characteristics, or to monitor and evaluate the performance and behaviour of persons in such relationships."

A predictive maintenance model that emits a ticket and a skill level is doing task allocation. The deployer pulls the next ticket off the queue, assigns it to whichever technician matches the skill recommendation, and the worker's daily allocation is being shaped by the model's output. The fact that the model is reasoning about a machine and not about a person is irrelevant — the regulatory test is whether the output drives a decision that affects a worker's task assignment.

This is the misclassification that legal teams resist hardest. The engineering team argues that the model "just predicts," that the human still makes the assignment. The Act does not draw that line. The system is high-risk if it is intended to be used to allocate tasks, and any tool whose output is consulted in the assignment loop meets that definition. The same logic applies to safety scoring on driver behaviour, fatigue detection on operator workstations, productivity dashboards that recommend training plans, automated shift schedulers, and the entire family of "workforce optimisation" AI systems that have proliferated in industrial operations.

The implication is the full Chapter III obligation set: a Risk Management System (Article 9), a Data Governance plan (Article 10), Technical Documentation against Annex IV (Article 11), Record-keeping (Article 12), Transparency to deployers (Article 13), Human Oversight (Article 14), and the accuracy/robustness/cybersecurity targets (Article 15). The team that thought they were running an "internal tool" finds out at audit that they owe the full Annex IV file.

At Auralink, the predictive-maintenance loop that recommends which chargers to dispatch a technician to was scoped under Annex III §4(b) from week one. The consequence: every retraining run produces an updated risk register entry, every release produces a logged review by the operations lead, and the audit pack for any given quarter takes under an hour to assemble — because the artefacts are produced incrementally as part of the development pipeline, not as a one-off compliance project.

Mistake 3 — "We bought a compliant foundation model, so we are covered"

The third misclassification. A team integrates a foundation model from OpenAI, Anthropic, Mistral, or another provider into a Physical AI workflow — a fine-tuned LLM that processes operator queries on a control panel, a multimodal model that summarises sensor traces for a maintenance technician, a code-generation model embedded in an industrial-automation IDE — and reasons that "the GPAI obligations live with the provider, so we are fine."

This conflates two layers of the Act. Articles 51 to 56 set out obligations for providers of General Purpose AI models: model cards, training-data summaries, copyright compliance policies, and systemic-risk reporting for the largest models. Those obligations attach to the provider of the foundation model, not to the deployer who builds an AI system on top of it. When a deployer takes a foundation model and integrates it into an AI system used in any Annex III context, or as a safety component under Article 6(1), a separate layer of obligations attaches to the deployer for the AI system, regardless of who built the underlying model. (Strictly, the Act treats the integrating team as the provider of the resulting AI system, a "downstream provider", and reserves "deployer" for whoever operates it under Article 26. "Deployer" is used loosely here for the team that builds and ships the system.)

The line is simple. GPAI obligations follow the model. AI-system obligations follow the system. A fine-tuned LLM running inside an industrial control room and recommending maintenance actions to operators is an AI system whose high-risk classification is determined by its use, not by which model is under the hood. The deployer owes the conformity assessment for that system. The foundation-model provider owes the GPAI obligations for the model. Both layers exist in parallel.

The practical consequence: a team that has carefully selected a "compliant" foundation model and assumes the conformity question is settled has done a fraction of the work. They still owe the Annex IV technical documentation for the system they built. They still owe the human-oversight design under Article 14. They still owe the post-market monitoring plan under Article 72. The model card from the provider is one input to their documentation pack; it is not a substitute for it.

The same mistake shows up with off-the-shelf vision models, off-the-shelf speech models, and off-the-shelf agentic frameworks. The fact that a vendor provides a model card does not relieve the deployer of system-level obligations. The team has to identify, at each integration point, whether the system as deployed is high-risk under Article 6(1) or Annex III, and if it is, build the documentation for that specific system. "We used a vendor model" is not a defence — it is a single line in the Annex IV file.

Mistake 4 — "The Act applies in 2027, so we have time"

The fourth mistake is not about risk class but about time. Teams reading the EU AI Act often hear "applies in 2026" or "applies in 2027" and assume there is a single date to plan against. There is not. The Act has at least four phased application dates, each binding different obligations, and the 2026 Omnibus has shifted several of them — but not the ones most industrial teams care about.

Article 5 prohibitions on unacceptable-risk practices — social scoring, manipulative subliminal techniques, untargeted scraping of facial images, certain biometric categorisation, real-time remote biometric identification in publicly accessible spaces — have been in force since 2 February 2025. Any system touching these categories has been illegal in the EU for over a year. The Article 4 AI-literacy obligation also applies from the same date: providers and deployers must ensure that staff working with AI systems have a sufficient level of AI literacy, and have to be able to evidence the training.

GPAI provider obligations apply from 2 August 2025 for models placed on the market from that date. Models already on the market before then have a transitional period that runs to 2 August 2027.

Article 50 transparency — chatbot disclosure, deepfake labelling, AI-generated content marking, emotion-recognition disclosure — applies from 2 August 2026. A narrow four-month grace runs to 2 December 2026 for the Article 50(2) machine-readable watermarking obligation on systems placed on the market before August. The user-facing disclosure obligations themselves are not delayed.

The bulk of the high-risk regime — Articles 6, 9, 10, 11, 12, 13, 14, 15, the Annex IV technical documentation, post-market monitoring, the EU-database registration — was originally bound to 2 August 2026. The 2026 Omnibus moved some of these dates for Annex III systems and added phased application for new sub-categories, but the core engineering obligations land in the second half of 2026 for systems already on the market in Annex III contexts. Article 6(1) systems embedded in Annex I products run on their existing sectoral conformity-assessment timelines — and, as covered above, Machinery Regulation systems now flow through delegated acts due by 2 August 2028.

The strategic implication: a team planning a single "compliance project" for 2027 has already missed the 2025 obligations and is about to miss the 2026 ones. The work has to be sequenced against the specific articles that bind each system, not against a single calendar date. We maintain a public EU AI Act tracker for exactly this reason — the timeline is a series of cliffs, not a ramp, and the cliffs are different for different systems.
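A simple way to sequence work against the cliffs is to keep the dates in a table and query it. The sketch below uses only the dates stated above; the labels are informal shorthand, the function names are hypothetical, and the high-risk regime is deliberately left out because the text does not pin it to a single date.

```python
from datetime import date

# Application dates as given in the text; labels are informal shorthand.
APPLICATION_DATES = [
    (date(2025, 2, 2), "Article 5 prohibitions; Article 4 AI literacy"),
    (date(2025, 8, 2), "GPAI provider obligations (newly placed models)"),
    (date(2026, 8, 2), "Article 50 transparency"),
    (date(2026, 12, 2), "Article 50(2) watermarking grace period ends"),
    (date(2028, 8, 2), "Machinery Regulation delegated acts due"),
]


def already_binding(today: date) -> list[str]:
    """Obligations whose application date has passed."""
    return [label for when, label in APPLICATION_DATES if when <= today]


def next_cliff(today: date):
    """Earliest upcoming (date, label) pair, or None if nothing is left."""
    upcoming = [(when, label) for when, label in APPLICATION_DATES if when > today]
    return min(upcoming) if upcoming else None


today = date(2026, 1, 15)  # illustrative planning date
print(already_binding(today))
print(next_cliff(today))
```

In practice each system in the inventory would carry its own subset of rows, since the cliffs differ per system.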

What conformity actually requires

Once a system is honestly classified as high-risk, the Act sets out a concrete documentation and process pack. The pack is large but finite, and most of it is reusable across systems if it is built right the first time. The components, in roughly the order an engineering team has to confront them:

Risk Management System (Article 9). A continuous, iterative process that identifies, estimates, evaluates, and treats risks throughout the system lifecycle. The artefact is a living risk register with mitigation plans and residual-risk acceptance, not a one-time deliverable. The risks include the model's intrinsic risks (accuracy, robustness, bias) and the contextual risks (foreseeable misuse, interaction with the safety envelope of the host product).
Data Governance and Management (Article 10). A description of the training, validation, and test datasets, including provenance, preparation, labelling methodology, examination for biases, and identification of gaps. For Physical AI the data-governance plan has to address the long-tail distribution of operating conditions — temperature, illumination, vibration, sensor drift — that the model will encounter in production.
Technical Documentation (Article 11 and Annex IV). The most concrete deliverable. The Annex IV file includes a general description of the system, design of the development process, training data, performance metrics, the risk-management output, the post-market monitoring plan, a copy of the EU declaration of conformity, and several other items. Annex IV is what the notified body reads. A team that has not built it cannot pass conformity.
Record-keeping (Article 12). Automatic logs generated during operation. For Physical AI the logs include input traces, model outputs, confidence scores, decisions that triggered the safe state, and operator actions. Retention periods and access controls are regulated.
Transparency to Deployers (Article 13). Instructions for use that allow the deployer to understand the system's capabilities, limitations, expected accuracy, intended purpose, and the human-oversight measures. This is the document the deployer reads. It is the bridge between the provider's engineering team and the deployer's operations team.
Human Oversight (Article 14). The design of oversight measures — who can intervene, how, when, with what authority. For Physical AI the oversight is rarely a single human in the loop. It is a layered architecture of automatic safety checks, supervisor-level intervention, and post-hoc audit. All three layers have to be specified.
Accuracy, Robustness, Cybersecurity (Article 15). Quantitative targets for accuracy across the operating envelope, robustness against adversarial inputs and distribution shift, and cybersecurity measures against model poisoning, data exfiltration, and adversarial manipulation. The targets have to be measurable and the test results have to live in the Annex IV file.
Conformity Assessment. Depending on the system, the assessment is either internal (the provider self-attests against the harmonised standards) or third-party (a notified body audits the system before market placement). For Article 6(1) systems the assessment integrates with the sectoral conformity assessment of the host product.
EU Database Registration. The provider registers the system in the EU-wide database maintained by the Commission. Most fields of the registration are public.
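Post-Market Monitoring (Article 72) and Serious Incident Reporting (Article 73). After placement, the provider monitors performance, collects data on issues, and reports serious incidents to the market surveillance authority of the Member State where the incident occurred. The general deadline is fifteen days after the provider becomes aware of the incident. It shortens to ten days where a person has died, and to two days for a widespread infringement or a serious disruption of critical infrastructure.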
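To make the record-keeping item concrete, here is a minimal sketch of one operational log record carrying the fields named above (input trace, model output, confidence, safe-state trigger, operator action). The class name, field names, and identifiers are hypothetical; the Act does not prescribe this schema, and retention and access control are out of scope for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class OperationLogRecord:
    system_id: str
    timestamp_utc: str
    input_trace_ref: str           # pointer to the stored sensor trace, not the raw data
    model_output: str
    confidence: float
    triggered_safe_state: bool
    operator_action: Optional[str] = None


def to_json_line(record: OperationLogRecord) -> str:
    """Serialise one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = OperationLogRecord(
    system_id="press-guard-01",  # illustrative identifier
    timestamp_utc=datetime(2026, 3, 1, 8, 30, tzinfo=timezone.utc).isoformat(),
    input_trace_ref="traces/2026-03-01/083000.bin",
    model_output="hand_detected",
    confidence=0.97,
    triggered_safe_state=True,
    operator_action="acknowledged_and_reset",
)
line = to_json_line(record)
print(line)
```

Storing a reference to the trace rather than the trace itself keeps the log small while still letting an auditor replay the decision.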

None of this is novel in shape. The structure is familiar to anyone who has shipped a medical device or an automotive safety-critical system. What is new is the integration into a single EU-wide regime and the documentation cadence, which is faster than most engineering teams have prepared for. The teams that struggle are the ones that try to assemble the pack at the end. The teams that ship cleanly are the ones that build the pack incrementally, alongside the engineering work, with each artefact produced as a side effect of the development process rather than as a separate compliance project.

This is the discipline we encode in the Hyperion Compliance Lifecycle, a four-stage pipeline (Classify, Document, Attest, Monitor) that maps every Chapter III obligation onto an engineering artefact the team is already producing, so the audit pack assembles itself rather than being chased down before the auditor visits. The Compliance Lifecycle gives the regulatory tier the same shape that the Physical AI Stack, described in the flagship article, gives the engineering tiers.

The pragmatic path for teams already shipping

The pragmatic path for a team that is already shipping Physical AI into the EU — and has just realised that the classification is more aggressive than they assumed — is straightforward in shape, painful in scope:

Audit every shipping AI system against both paths: Article 6(1) (via Annex I) and Annex III. Most teams have between two and ten systems that need to be re-classified. Tag each system with a risk class, a binding article, and a notified-body requirement if any. Be honest about the Mistake-1 trap: a vision system that touches a safety function is in, regardless of whether it sees a face.
For every system that comes out high-risk, identify the gap to the Annex IV pack. The gap is usually the same: data-governance plan, post-market monitoring plan, transparency-to-deployer document, and the formal risk-management system. The engineering artefacts often exist informally — model cards, eval results, change logs — and have to be lifted into the regulated documentation.
Build the documentation in parallel with the engineering work, not after it. Every training run produces an updated model card. Every release produces an updated risk-register entry. Every incident produces a logged review. The audit pack assembles itself if the pipeline is wired correctly. The work is one to two engineer-weeks per system to build the pipeline; the running cost is close to zero.
Choose vendors for compliance compatibility. Foundation-model providers, edge-hardware vendors, MLOps platforms — each has to expose the information the Annex IV file needs. Vendors that cannot produce a model card, a training-data summary, or a test-result trace cost you more in documentation labour than they save in unit price. Make compliance-artefact production a procurement criterion.
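The gap step in this path reduces to a set difference between what the pack needs and what the team already has. A minimal sketch, assuming a hypothetical artefact vocabulary (the keys below are invented labels, mapped to the articles named in this piece):

```python
# Hypothetical artefact labels mapped to the articles discussed above.
REQUIRED_ARTEFACTS = {
    "risk_management_system": "Article 9",
    "data_governance_plan": "Article 10",
    "technical_documentation": "Article 11 / Annex IV",
    "logging_design": "Article 12",
    "instructions_for_use": "Article 13",
    "human_oversight_design": "Article 14",
    "accuracy_robustness_security_results": "Article 15",
    "post_market_monitoring_plan": "Article 72",
}


def documentation_gaps(existing: set[str]) -> dict[str, str]:
    """Return the required artefacts that are missing, with their binding article."""
    return {
        name: article
        for name, article in REQUIRED_ARTEFACTS.items()
        if name not in existing
    }


# Illustrative inventory entry: informal artefacts exist, regulated ones do not.
charger_dispatch_model = {
    "technical_documentation",
    "logging_design",
    "accuracy_robustness_security_results",
    "human_oversight_design",
}
for name, article in documentation_gaps(charger_dispatch_model).items():
    print(f"missing {name} ({article})")
```

Run per system at every release, a check like this turns the audit into a recurring pipeline step rather than a one-off project.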

The EU AI Act compliance engagement we run with industrial teams is exactly this audit-then-pipeline shape. Ninety days end-to-end for a typical estate of three to seven Annex III or Article 6(1) systems. The output is not a binder — it is a pipeline that keeps producing the binder.

Compliance done well is a commercial asset

Compliance with the EU AI Act, done well, is not a cost centre. It is a market-access asset. The teams that are clean in 2026 will ship in 2027 without notified-body delays. The teams that are not will discover that the conformity bottleneck has moved upstream of their release pipeline, and that the cost of catching up — re-running evaluations to standard, rebuilding datasets with provenance, writing transparency documents after the fact — is several multiples of the cost of building the pipeline first.

We built Aegis AI for exactly this work. Automated obligation extraction from the Act, mapping of every Article to the AI systems on the inventory, gap analysis against your existing documentation, an immutable SHA-256-chained audit trail, and board-ready and regulator-ready reports. The compliance work is repeatable, and once it is built as a pipeline it stops being the bottleneck. It becomes the asset that lets you ship faster than the teams that left it for the end.
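For readers unfamiliar with the idea, a SHA-256-chained audit trail can be sketched as follows. This is a generic hash-chain illustration, not a description of how Aegis AI implements it; the entry layout and function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add an entry whose hash commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


trail: list = []
append(trail, {"event": "retraining_run", "risk_register": "updated"})
append(trail, {"event": "release", "reviewed_by": "operations lead"})
print(verify(trail))
```

Because each hash covers the previous one, changing an old entry invalidates every later link, which is what makes the trail tamper-evident rather than merely append-only.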

The classification step is where most teams go wrong. Run the audit honestly, against both paths in Article 6, not just Annex III, and against the Omnibus updates rather than the 2024 reading of the Act. The rest of the pack is engineering work — heavy, but tractable. The misclassification is the only mistake that compounds.
