A comprehensive dictionary of artificial intelligence and machine learning terms.
A controlled experiment that compares the performance of two or more model versions by routing live traffic between them and measuring business and technical metrics. A/B testing is essential for validating that a new model version improves outcomes before full rollout.
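The routing side of an A/B test is often a deterministic, hash-based traffic split, so the same user always sees the same model version. A minimal pure-Python sketch (the function name and 10% treatment share are illustrative):

```python
import hashlib

def assign_variant(user_id: str, treatment_share: float = 0.1) -> str:
    """Deterministically bucket a user into 'control' or 'treatment'.

    Hashing the id yields a stable, roughly uniform value in [0, 1),
    so assignment is consistent across sessions without storing state.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform in [0, 1)
    return "treatment" if bucket < treatment_share else "control"
```

Business and technical metrics are then compared between the two buckets before deciding on a full rollout.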
The principle that individuals and organisations involved in AI development and deployment can be held responsible for the outcomes their systems produce. The EU AI Act establishes clear accountability chains between providers, importers, distributors, and deployers.
AI systems that autonomously plan, reason, and execute multi-step tasks over extended time horizons without continuous human direction. Agentic AI is moving from demo to production in 2025–26, raising new questions around oversight, auditability, and the EU AI Act's human oversight requirements.
The list of high-risk AI application areas appended to the EU AI Act, including biometric systems, critical infrastructure management, education, employment, essential private/public services, law enforcement, migration, and administration of justice. Companies operating in these areas face the most stringent obligations.
The provision of the EU AI Act that defines which AI systems are classified as high-risk. Article 6 covers systems listed in Annex III (e.g., biometric identification, employment screening, credit scoring) and safety components of regulated products.
An autonomous AI system that can perceive its environment, reason about goals, select tools, and take multi-step actions to complete complex tasks. Enterprise AI agents are transforming workflows in operations, finance, and customer service but require robust oversight frameworks to meet EU AI Act obligations.
The research and engineering discipline of ensuring that AI systems pursue goals and exhibit behaviours that match human intentions and values. Misalignment risks range from models following instructions too literally to more speculative long-term risks discussed in AI safety literature.
A technical architect who designs the end-to-end AI system landscape—data pipelines, model training infrastructure, serving platforms, integration patterns, and governance tooling. AI architects ensure that individual AI solutions compose into a coherent, scalable enterprise capability.
The ability of an AI system's behaviour, decisions, and data flows to be independently examined and verified. Auditability requires comprehensive logging, version-controlled model artefacts, and documented decision trails—all prerequisites for EU AI Act conformity assessment.
Systematic errors in AI systems that create unfair outcomes for certain groups, often arising from unrepresentative training data, flawed labelling, or problematic model design. Identifying and mitigating bias is a legal requirement for high-risk AI systems under the EU AI Act.
A structured argument for an AI investment that quantifies expected benefits (revenue uplift, cost reduction, risk mitigation), costs (build, run, compliance), risks, and strategic alignment. A robust AI business case is the gateway to executive and board approval.
A dedicated internal team that provides AI expertise, tooling, governance standards, and best practices to business units across an enterprise. A CoE accelerates AI adoption by preventing redundant effort and maintaining quality and compliance across all AI initiatives.
The process of preparing, supporting, and guiding employees and stakeholders through the organisational changes that AI adoption brings—including role redesign, new workflows, cultural shifts, and addressing fears about automation. Most AI transformations fail due to poor change management, not technical shortcomings.
Using AI to lower operational costs through automation of repetitive tasks, process optimisation, predictive maintenance, and intelligent resource allocation. AI cost reduction is typically the primary driver of enterprise AI ROI in the first two years of adoption.
Comprehensive records describing an AI system's design, training data, performance characteristics, limitations, and intended use. The EU AI Act distinguishes between technical documentation (for authorities and notified bodies) and instructions for use (for deployers).
A thorough assessment of an AI system or AI-enabled company's technical claims, data practices, compliance posture, model performance, and operational risks—typically conducted in M&A, procurement, or partnership contexts. AI due diligence is a growing practice as AI becomes a critical business asset.
The application of moral principles—fairness, accountability, transparency, non-maleficence, and human dignity—to the design, development, and deployment of AI systems. AI ethics provides the normative foundation that regulations like the EU AI Act translate into enforceable obligations.
The application of AI to financial services use cases including fraud detection, credit underwriting, algorithmic trading, regulatory reporting, and customer advisory. FinAI systems in Europe are subject to both the EU AI Act (many are high-risk) and sector-specific regulations from EBA and ESMA.
AI applications in clinical and administrative healthcare, including medical image analysis, clinical decision support, drug discovery, and patient flow optimisation. Healthcare AI faces dual regulation in the EU under the AI Act (high-risk) and the Medical Device Regulation (MDR), requiring CE marking.
AI tools applied to legal work—contract analysis, due diligence, legal research, compliance monitoring, and document generation. Legal AI can dramatically reduce the cost of routine legal work but requires careful hallucination controls given the consequences of factual errors in legal contexts.
AI applications in supply chain and logistics, including route optimisation, demand forecasting, warehouse automation, and predictive shipment tracking. AI for logistics is one of the highest-ROI application areas given the volume of decisions, data availability, and operational cost at stake.
The application of AI—including computer vision, predictive maintenance, process optimisation, and generative design—to manufacturing operations. AI for manufacturing delivers measurable ROI through defect reduction, throughput improvement, and reduced unplanned downtime.
The framework of policies, processes, roles, and controls that ensure AI systems are developed, deployed, and used responsibly and in compliance with applicable law. AI governance spans ethics, bias, privacy, security, and regulatory compliance across the model lifecycle.
A cross-functional body—typically including legal, compliance, risk, IT, and business leaders—responsible for approving AI initiatives, reviewing ethical risks, and ensuring regulatory compliance. The committee is the human governance structure that operationalises AI policy.
The obligation under the EU AI Act for providers of high-risk AI systems to report serious incidents—where the AI causes harm to health, safety, or fundamental rights—to national market surveillance authorities within defined timeframes.
A proposed EU directive that would make it easier for victims of AI-caused harm to seek compensation by establishing disclosure obligations and a rebuttable presumption of causation. It complements the EU AI Act by addressing civil liability where the Act focuses on market access requirements.
A measure of an organisation's capability to successfully implement and scale AI initiatives, typically assessed across dimensions like data quality, talent, governance, technology, and executive sponsorship. Maturity models help organisations benchmark themselves and prioritise investments.
A structured framework that describes progressive levels of AI capability—typically from initial experimentation through managed deployment to optimised, organisation-wide integration. Maturity models provide a common language for assessing current state and planning improvement.
An independent third-party organisation designated by an EU Member State to conduct conformity assessments of high-risk AI systems. Not all high-risk AI requires a notified body—most Annex III categories allow self-assessment based on internal control—but biometric systems generally require third-party assessment.
The coordination of multiple AI models, tools, APIs, and data sources within complex automated pipelines or agentic workflows. Orchestration frameworks like LangChain, LlamaIndex, and Temporal manage state, retries, and branching logic across multi-step AI processes.
A time-boxed, scope-limited real-world test of an AI solution designed to validate assumptions, measure business impact, and identify operational challenges before full deployment. AI pilots typically run 6–12 weeks and are structured to produce clear go/no-go criteria.
A product manager specialising in AI-powered products, responsible for translating business needs into AI system requirements, managing stakeholder expectations around probabilistic outputs, and navigating data, model, and UX tradeoffs unique to AI products.
A structured evaluation of an organisation's ability to adopt and scale AI, examining data quality, infrastructure, talent, processes, governance, and culture. Readiness assessments identify gaps and form the evidence base for an AI roadmap and investment case.
A controlled environment, established under the EU AI Act, that allows companies to develop and test innovative AI systems under regulatory supervision before full market release. Sandboxes give startups and SMEs access to regulators and reduce compliance uncertainty during development.
A systematic process for identifying, assessing, and mitigating risks associated with AI systems throughout their lifecycle. The EU AI Act requires high-risk AI providers to implement and document a risk management system; ISO 42001 and NIST AI RMF provide complementary frameworks.
A time-sequenced plan that maps AI initiatives to business priorities, dependencies, resource requirements, and expected outcomes. A good roadmap balances quick wins that build momentum with strategic bets that transform core business processes.
The return on investment from an AI initiative, measured as the ratio of net benefit to total investment over a defined period. AI ROI is notoriously hard to quantify upfront but is increasingly demanded by CFOs; it spans hard savings (labour automation), soft gains (decision quality), and risk-adjusted value.
A multi-disciplinary research field concerned with preventing AI systems from causing unintended harm—including technical failures, misuse, and long-term societal risks. In the EU, AI safety is operationalised through the AI Act's risk classification and conformity assessment requirements.
The process of expanding a successful AI pilot to broader organisational adoption—handling increased data volumes, additional use cases, more users, and tighter integration with business processes. Scaling typically exposes infrastructure, data governance, and change management challenges invisible at pilot scale.
A senior leadership body responsible for setting AI strategic priorities, allocating resources, and monitoring the portfolio of AI investments. The steering committee provides executive sponsorship and escalation paths for significant AI decisions.
A comprehensive plan for how an organisation will adopt, develop, and deploy AI to achieve business objectives—encompassing use-case prioritisation, build-vs-buy decisions, data infrastructure, talent, governance, and compliance. An effective AI strategy aligns technology investment with measurable business outcomes.
As defined by the EU AI Act: a machine-based system that, for a given set of objectives, infers from the inputs it receives how to generate outputs such as predictions, content, recommendations, or decisions that can influence real or virtual environments.
The comprehensive cost of an AI system over its full lifecycle, including model development, data preparation, infrastructure, licensing, monitoring, compliance, maintenance, and eventual decommissioning. TCO analysis often reveals that inference and compliance costs dwarf initial build costs.
A senior technical leader who guides architecture decisions, engineering best practices, and model development for AI products. The AI tech lead bridges research and production, ensuring that promising models are transformed into reliable, maintainable, and compliant systems.
The principle that AI systems, their capabilities, limitations, and decision-making processes should be understandable to affected users, deployers, and regulators. The EU AI Act mandates transparency obligations at multiple risk tiers, from simple chatbot disclosure to full technical documentation for high-risk systems.
Structured training programmes that build AI literacy and practical skills across an organisation—from executive AI awareness to hands-on workshops for analysts, engineers, and business users. Upskilling is the most critical enabler of AI adoption and is central to change management.
The process of evaluating and choosing AI technology vendors across dimensions including capability fit, security posture, data handling practices, EU AI Act compliance readiness, pricing, and vendor lock-in risk. Rigorous vendor selection prevents costly replacement cycles.
Techniques that embed imperceptible signals into AI-generated content—text, images, audio, or video—enabling later verification of AI origin. The EU AI Act requires providers of AI systems that generate synthetic content to mark their outputs as artificially generated in a machine-readable format.
An organisational philosophy where AI capabilities are considered from the outset when designing products, processes, and business models, rather than applied as an afterthought. AI-first organisations embed machine intelligence into their core value proposition.
Systematic and repeatable errors in algorithmic outputs that create unfair advantages or disadvantages for individuals or groups based on protected characteristics. Algorithmic bias can arise from biased training data, proxy variables, or flawed evaluation metrics.
A hypothetical level of AI capability matching or surpassing human cognitive performance across all domains, not just narrow tasks. AGI remains a research objective; current enterprise AI systems are all narrow AI. The EU AI Act does not yet regulate AGI specifically.
A field of computer science focused on creating systems that can perform tasks typically requiring human intelligence. This includes learning from experience, understanding language, recognising patterns, solving problems, and making decisions.
A neural network component that allows a model to dynamically focus on the most relevant parts of its input when producing each output element. Self-attention is the core innovation of the Transformer architecture and is responsible for LLMs' ability to handle long, complex contexts.
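The mechanism can be sketched in a few lines of pure Python: scaled dot-product attention computes softmax(QKᵀ/√d) and uses the result to mix the value vectors. This is a single-head, unbatched illustration, not a production implementation:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention.

    Q, K, V are lists of d-dimensional vectors (one per token). Each
    output is a weighted mix of the value vectors, with weights
    softmax(q . k / sqrt(d)), so each query attends most strongly
    to the keys it aligns with.
    """
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

In a real transformer, Q, K, and V are produced by learned linear projections and many heads run in parallel.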
An AI design philosophy that emphasises using machine intelligence to amplify human capabilities rather than replace them. Augmented intelligence reframes the AI narrative from displacement to empowerment, which is often more acceptable to employees and regulators.
A neural network trained to compress data into a lower-dimensional latent space and reconstruct it. Autoencoders are used for anomaly detection, dimensionality reduction, and as components in generative models like VAEs.
Systems capable of performing complex tasks in real-world environments without human intervention for each action. In regulated sectors such as transport, healthcare, and finance, autonomous AI systems typically fall under the EU AI Act's high-risk classification.
Running a model on a large dataset in a single scheduled job rather than in real time. Batch inference is more cost-efficient than real-time serving for use cases such as nightly report generation, bulk document classification, or periodic customer scoring.
A technique that normalises activations across the batch dimension during training, accelerating convergence and reducing sensitivity to hyperparameters. Though largely superseded by layer norm in transformers, batch norm remains prevalent in convolutional vision architectures.
Bidirectional Encoder Representations from Transformers—Google's 2018 encoder-only transformer model that set new benchmarks across NLP tasks. BERT-family models remain widely used for classification, named-entity recognition, and information retrieval in enterprise settings.
The strategic decision of whether to develop bespoke AI models and infrastructure in-house or procure commercial AI products and APIs. The decision hinges on competitive differentiation, data sensitivity, required customisation, cost, speed to value, and in-house talent availability.
A progressive rollout strategy that routes a small percentage of production traffic to a new model version while the remainder continues to receive the old version. Canary deployments limit blast radius if the new model underperforms or causes regressions.
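The routing-plus-rollback logic behind a canary can be sketched compactly; the class name, thresholds, and the convention of passing a uniform [0, 1) bucket value per request are all illustrative, not a standard API:

```python
class CanaryController:
    """Minimal canary rollout: send a fixed share of traffic to the new
    model version and roll back automatically if its observed error rate
    exceeds a threshold.
    """

    def __init__(self, canary_share=0.05, max_error_rate=0.02, min_samples=100):
        self.canary_share = canary_share
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples
        self.errors = 0
        self.samples = 0
        self.rolled_back = False

    def choose_version(self, bucket: float) -> str:
        # bucket is a uniform [0, 1) value derived from the request,
        # e.g. a hash of a request or user id
        if self.rolled_back or bucket >= self.canary_share:
            return "stable"
        return "canary"

    def record_canary_result(self, ok: bool) -> None:
        self.samples += 1
        self.errors += 0 if ok else 1
        if (self.samples >= self.min_samples
                and self.errors / self.samples > self.max_error_rate):
            self.rolled_back = True  # stop routing traffic to the canary
```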
The European conformity mark that high-risk AI systems must bear before entering the EU market, indicating compliance with applicable EU regulations including the AI Act. The CE mark serves as a declaration that the system meets safety, health, and environmental protection standards.
A prompting technique that encourages an LLM to articulate its reasoning step by step before producing a final answer. CoT significantly improves performance on multi-step reasoning, mathematics, and complex decision-making tasks.
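In practice, CoT is often just a wrapper around the user's question; the cue phrasing below is a common convention for illustration, not a fixed API:

```python
def chain_of_thought_prompt(question: str) -> str:
    """Wrap a question with a step-by-step reasoning instruction."""
    return (
        f"Question: {question}\n"
        "Think through the problem step by step, showing your reasoning, "
        "then state the final answer on a line beginning 'Answer:'."
    )
```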
An executive responsible for defining and executing an organisation's AI strategy, overseeing AI governance, and ensuring AI initiatives deliver measurable business value. The CAIO role is one of the fastest-growing C-suite positions as enterprises formalise AI leadership.
A field of AI that enables machines to interpret and understand visual information from the world—images, video, and sensor feeds. It underpins applications from quality-control cameras on factory floors to facial recognition and autonomous vehicle perception.
The formal process by which a high-risk AI system is evaluated against the requirements of the EU AI Act before being placed on the market. Depending on the risk category, assessment may be self-conducted or require a third-party notified body.
A training methodology developed by Anthropic in which an AI model is guided by a written set of principles (a "constitution") to self-critique and revise its outputs. Constitutional AI is one approach to building safer, more controllable AI systems at scale.
Packaging AI models and their dependencies into portable, isolated containers (Docker) for consistent deployment across environments. Containerisation eliminates "works on my machine" problems and is the foundation of reproducible, auditable AI production systems.
The maximum amount of text (measured in tokens) an LLM can process in a single request, encompassing both the prompt and the generated output. Larger context windows—now exceeding 1 million tokens in some models—enable processing of long documents, codebases, and meeting transcripts in one pass.
A self-supervised technique that trains a model by pulling representations of similar samples together and pushing dissimilar ones apart. Contrastive learning produces high-quality embeddings for images and text and underpins models like CLIP, which links vision and language.
The most common training objective for classification and language modelling tasks, measuring the difference between predicted probability distributions and true labels. Minimising cross-entropy loss during training drives the model to assign high probability to correct outputs.
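For a single example, the loss is simply the negative log of the probability the model assigns to the true class. A minimal sketch computing it from raw logits:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_index):
    """-log of the probability assigned to the true class.

    Confident correct predictions give a loss near zero; confident
    wrong predictions are penalised heavily.
    """
    return -math.log(softmax(logits)[true_index])
```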
A change in the statistical distribution of input data fed to a deployed model, which can degrade prediction quality even if the model itself has not changed. Detecting data drift early is critical for maintaining reliable production AI systems.
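One common drift statistic is the Population Stability Index, which compares binned frequencies of a live sample against a reference (e.g. training) sample; the implementation below is a simplified sketch, and the 0.2 alert threshold mentioned is a rule of thumb, not a standard:

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a live sample of one numeric
    feature. Values near 0 mean the distributions match; larger values
    indicate drift (> 0.2 is a common rule-of-thumb alert threshold).
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frequencies(xs):
        counts = [0] * bins
        for x in xs:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # floor empty bins to avoid log(0)
        return [max(c / len(xs), 1e-6) for c in counts]

    e, a = frequencies(expected), frequencies(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```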
The policies, processes, and standards that manage data availability, usability, integrity, and security throughout its lifecycle. Strong data governance is a prerequisite for trustworthy AI—the EU AI Act explicitly requires data governance practices for training, validation, and testing datasets.
The ability to track data's origin, movement, transformations, and consumption throughout a system. Data lineage is essential for AI auditability, debugging model behaviour, and demonstrating GDPR and EU AI Act compliance to regulators.
The GDPR principle requiring that only the minimum necessary personal data be collected and processed for a specified purpose. Applied to AI, data minimisation drives techniques like federated learning, synthetic data generation, and differential privacy to train effective models without retaining raw personal data.
A professional who combines statistics, machine learning, and domain knowledge to extract insights and build predictive models from data. Data scientists are most effective when paired with strong data engineering and MLOps infrastructure to move from notebook experiments to production systems.
An applied discipline that combines data science, social science, and managerial science to improve the quality of organisational decision-making through AI and analytical tools. Decision intelligence focuses on the full decision process—framing, modelling, intervention, and learning—not just prediction.
A subset of machine learning based on artificial neural networks with multiple layers. Deep learning can learn complex patterns from large amounts of data and is particularly effective for image recognition, speech processing, and natural language understanding.
AI-powered methods for identifying media—images, video, audio—that have been synthetically generated or manipulated using generative AI. Deepfake detection is increasingly important for trust infrastructure in elections, financial services, and media.
A class of generative models that learn to reverse a gradual noise-addition process to produce high-quality images, audio, or video from random noise. Diffusion models power leading image generators such as Stable Diffusion and DALL-E 3.
The broad process of integrating digital technology into all areas of a business to fundamentally change how it operates and delivers value. AI is increasingly the central driver of new digital transformation initiatives as organisations move beyond basic digitisation.
A regularisation technique that randomly sets a fraction of neuron activations to zero during training, forcing the network to learn redundant representations and preventing overfitting. Dropout is widely used in both vision and language models.
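The standard "inverted dropout" formulation can be sketched in a few lines: survivors are scaled by 1/(1-rate) during training so the expected activation is unchanged, and inference is a no-op:

```python
import random

def dropout(activations, rate=0.5, training=True, rng=random):
    """Inverted dropout over a list of activations.

    During training, each activation is zeroed with probability `rate`
    and survivors are scaled by 1/(1-rate) so the expected value is
    preserved. At inference time, inputs pass through untouched.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in activations]
```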
Running AI models on edge devices—industrial controllers, cameras, vehicles, mobile phones—rather than in the cloud. Edge AI reduces latency, preserves data privacy, and enables operation in environments with limited connectivity, making it critical for physical AI deployments.
A numerical representation of data (text, images, etc.) in a high-dimensional vector space where semantically similar items are mathematically close together. Embeddings are the foundational building block for semantic search, RAG pipelines, and recommendation systems.
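Semantic search over embeddings reduces to comparing vectors, usually with cosine similarity; a minimal sketch (a real system would use an approximate nearest-neighbour index rather than a linear scan):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors: 1.0 means the
    vectors point the same way (semantically similar), 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm

def nearest(query, corpus):
    """Index of the corpus embedding closest to the query (linear scan)."""
    return max(range(len(corpus)),
               key=lambda i: cosine_similarity(query, corpus[i]))
```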
The initial layer of a neural network that maps discrete tokens (words, image patches) to dense continuous vectors that the model can process. The quality of an embedding layer significantly affects downstream task performance.
AI systems situated in a physical body or robot that perceive the world through sensors and act on it through actuators. Embodied AI bridges software intelligence and the physical world, making it central to robotics, autonomous vehicles, and smart manufacturing.
A two-part neural network architecture where the encoder compresses an input into a latent representation and the decoder generates an output from it. Used in translation, summarisation, and speech recognition systems.
The world's first comprehensive AI regulation, adopted by the European Union in 2024 and entering full application in 2026. It classifies AI systems by risk level—unacceptable, high, limited, and minimal—and imposes obligations proportionate to risk, from outright bans to conformity assessments and transparency requirements.
The ability to understand and explain how an AI model reaches its decisions. Explainable AI is a mandatory requirement for high-risk AI systems under the EU AI Act and is critical for building trust with regulators, auditors, and end users.
Methods and techniques that allow humans to understand, interpret, and trust the outputs of AI models. XAI is a mandatory requirement for high-risk AI systems under the EU AI Act and is central to building stakeholder confidence in automated decision-making.
The property of an AI system that ensures its outputs and decisions do not systematically disadvantage individuals on the basis of protected characteristics such as gender, ethnicity, or disability. Multiple mathematical definitions of fairness exist and may be in tension with one another.
A centralised platform for creating, storing, sharing, and serving ML features consistently across training and inference pipelines. Feature stores eliminate training-serving skew and accelerate model development by making pre-computed features reusable across teams.
A machine learning approach where models are trained across multiple decentralised devices or servers holding local data, without that data ever leaving its origin. Federated learning is valuable for healthcare, finance, and any sector where GDPR data minimisation principles make centralised training impractical.
The fully connected sublayer within each transformer block that applies two linear transformations with a non-linearity between them. FFN layers are thought to store much of a model's factual associations and account for the majority of a transformer's parameters.
A prompting or training approach where a model learns to perform a new task from only a small number of labelled examples. In-context few-shot prompting lets practitioners shape LLM behaviour without any weight updates, dramatically lowering the cost of customisation.
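In-context few-shot prompting amounts to assembling labelled examples into the prompt itself; the template format below is one common convention, not a fixed API:

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: task instruction, labelled examples,
    then the new input the model should complete."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

The model infers the task from the examples and continues the pattern, with no weight updates involved.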
The process of taking a pre-trained AI model and further training it on a specific dataset to adapt it for a particular task or domain. Fine-tuning can deliver significantly better performance than prompting alone for specialised enterprise workflows.
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks through fine-tuning or prompting. GPT-4, Claude, and Gemini are examples; under the EU AI Act, providers of GPAI models face specific transparency obligations.
A senior Chief AI Officer engaged on a part-time or project basis to provide executive AI leadership without the cost of a full-time hire. Fractional CAIOs are particularly valuable for mid-market companies that need board-level AI strategy, vendor evaluation, and team upskilling but cannot yet justify a permanent role.
The intersection of the General Data Protection Regulation with AI systems. Key tensions include the right to explanation for automated decisions (Article 22), data minimisation requirements that limit training data use, and purpose limitation that constrains repurposing of collected data for model training.
A framework consisting of two neural networks—a generator and a discriminator—that compete against each other. The generator creates synthetic data; the discriminator tries to distinguish real from fake. GANs produce highly realistic images and are also studied in the context of deepfake detection.
AI models capable of creating new content—text, images, code, audio, or video—rather than just classifying existing data. Generative AI is driving a productivity wave in enterprise workflows but also introduces governance challenges around accuracy and intellectual property.
The deployment of generative AI—LLMs, image generators, and code assistants—within enterprise contexts, including internal copilots, customer-facing chatbots, automated content production, and software development acceleration. Enterprise deployments require additional security, compliance, and accuracy controls not needed in consumer products.
General Purpose AI models—such as large language models—that can be adapted to a wide range of tasks. Under the EU AI Act, GPAI providers must maintain technical documentation, comply with EU copyright law, and publish summaries of the content used for training. GPAI models with systemic risk face additional requirements.
A family of autoregressive, decoder-only transformer architectures pre-trained to predict the next token. GPT models are the basis for ChatGPT and most commercial LLM APIs. Their autoregressive design makes them excellent text generators but less efficient for classification tasks than encoder models.
The tendency of LLMs to generate plausible-sounding but factually incorrect or fabricated content. Hallucination is a primary enterprise risk for generative AI and is mitigated through RAG, grounding techniques, output validation, and human-in-the-loop review processes.
An AI system classified under the EU AI Act as posing significant risks to health, safety, or fundamental rights. High-risk systems must undergo conformity assessment, maintain technical documentation, enable human oversight, and be registered in an EU database before market deployment.
The ability of humans to effectively monitor, intervene in, and override an AI system's outputs. The EU AI Act mandates appropriate human oversight for all high-risk AI systems, requiring technical measures that enable operators to halt or correct the system in real time.
Workflows and system designs that combine the strengths of human judgement, creativity, and contextual understanding with AI's speed, pattern recognition, and data-processing capacity. Effective human-AI collaboration typically outperforms either humans or AI working alone.
A design pattern where a human must approve or review an AI system's output before it produces a real-world effect. HITL is the strongest form of human oversight and is required for the highest-stakes automated decisions under GDPR Article 22 and the EU AI Act.
A design pattern where an AI system acts autonomously but a human monitors operations and can intervene if needed, rather than approving each action. HOTL balances efficiency with oversight and is appropriate for many high-risk AI use cases where full HITL would be impractical.
The process of running a trained model on new data to produce predictions or generated outputs. Inference cost and latency are the dominant operational concerns in production AI, particularly for large generative models that can cost cents per request at scale.
The compute and financial cost of running a model to produce a single prediction or generated response. Inference cost is often the dominant AI operational expenditure at scale and is managed through model compression, caching, quantisation, and batching strategies.
Documentation provided by AI system providers to deployers explaining the system's intended purpose, performance, limitations, human oversight requirements, and maintenance obligations. Under the EU AI Act, instructions for use are mandatory for all high-risk AI systems.
The international standard for AI management systems, published in 2023. ISO 42001 provides organisations with a framework to establish, implement, maintain, and continually improve responsible AI governance—complementary to, but distinct from, EU AI Act legal requirements.
A technical mechanism that allows an AI system to be safely halted, overridden, or disabled by authorised humans at any time. The EU AI Act requires high-risk AI systems to incorporate such controls as part of their human oversight design.
A training technique where a smaller "student" model is trained to replicate the behaviour of a larger "teacher" model. Distillation produces compact, fast models suitable for latency-sensitive or resource-constrained deployments without sacrificing too much quality.
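The core of the soft-target loss is a KL divergence between the teacher's and student's temperature-softened output distributions; this sketch omits details found in real setups, such as scaling by T² and mixing in a hard-label cross-entropy term:

```python
import math

def softmax(logits, T=1.0):
    scaled = [l / T for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence penalising the student for diverging from the
    teacher's temperature-softened distribution. A higher T exposes
    more of the teacher's 'dark knowledge' about non-target classes."""
    p = softmax(teacher_logits, T)  # teacher = soft target
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```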
A structured representation of facts as entities and their relationships, enabling machines to reason about complex, interconnected information. Knowledge graphs complement LLMs in enterprise settings by providing verifiable, traceable facts that reduce hallucination.
Using the Kubernetes container orchestration platform to manage scalable, fault-tolerant AI/ML workloads in production. Kubernetes enables auto-scaling inference services, GPU resource management, and rolling model updates with minimal downtime.
AI models trained on vast amounts of text data that can understand and generate human-like text. Examples include GPT-4, Claude, and Llama. LLMs power modern chatbots, content generation, and code assistance tools.
A technique that normalises activations across the feature dimension within a single layer, stabilising training of deep networks. Layer norm is standard in transformer architectures, enabling effective training at billion-parameter scale.
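A minimal pure-Python sketch of the per-vector computation; `gamma` and `beta` stand in for the learnable scale and shift parameters that frameworks apply element-wise.

```python
import math

def layer_norm(x, eps=1e-5, gamma=1.0, beta=0.0):
    """Normalise a feature vector to zero mean and unit variance,
    then apply the learnable scale (gamma) and shift (beta)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]

out = layer_norm([1.0, 2.0, 3.0, 4.0])  # zero mean, unit variance
```

The small `eps` term prevents division by zero when a vector's variance is near zero.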
An extension of MLOps focused on the operational challenges specific to large language models: prompt management, context handling, latency optimisation, cost control, hallucination monitoring, and compliance logging. LLMOps tooling is rapidly maturing in 2025–26.
The raw, unnormalised score produced by a model's final layer before the softmax function is applied. Logits are useful for calibration, temperature scaling, and understanding a model's confidence distribution across possible output classes.
A parameter-efficient fine-tuning method that injects trainable low-rank matrices into transformer layers instead of updating all model weights. LoRA enables cost-effective domain adaptation of large models on modest hardware, making enterprise fine-tuning accessible without full GPU clusters.
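The cost saving follows directly from the parameter counts. A back-of-the-envelope sketch for a single weight matrix (the dimensions below are typical of a large transformer layer, chosen for illustration):

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA freezes the d_out x d_in weight matrix W and trains only
    two low-rank factors B (d_out x r) and A (r x d_in), so the
    effective weight becomes W + B @ A."""
    full = d_in * d_out           # parameters updated by full fine-tuning
    lora = rank * (d_in + d_out)  # parameters in the low-rank factors
    return full, lora

full, lora = lora_trainable_params(d_in=4096, d_out=4096, rank=8)
# full fine-tuning: 16,777,216 params; LoRA rank 8: 65,536 (~0.4%)
```

Because only the small factors are trained and stored, many task-specific adapters can share one frozen base model.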
A subset of AI that enables systems to automatically learn and improve from experience without being explicitly programmed. ML algorithms build models based on training data to make predictions or decisions.
A model architecture where different sub-networks ("experts") specialise in different types of inputs, and a gating network routes each token to the most relevant experts. MoE enables very large model capacity at lower inference cost—Mixtral uses this approach, and GPT-4 is widely believed to.
An engineer who specialises in building, deploying, and maintaining machine learning systems in production. ML engineers bridge the gap between data scientists' experimental models and the robust, scalable infrastructure required for enterprise AI.
The practice of combining Machine Learning, DevOps, and data engineering to streamline the deployment, monitoring, and maintenance of ML models in production. MLOps ensures reliable, scalable, and reproducible ML systems across their entire lifecycle.
A short document accompanying a trained AI model that describes its intended uses, performance across demographic groups, limitations, and ethical considerations. Model cards promote transparency and are considered best practice under frameworks like the EU AI Act and ISO 42001.
A set of techniques—including quantization, distillation, pruning, and low-rank factorisation—that reduce model size and computational requirements while preserving performance. Model compression is essential for deploying powerful models on edge hardware or within cost budgets.
The degradation of a deployed model's performance over time as the statistical properties of production data diverge from its training distribution. Model drift monitoring is a key MLOps discipline and is implicitly required by the EU AI Act's post-market monitoring obligations.
A centralised repository for versioning, cataloguing, and managing trained ML models throughout their lifecycle—from experimentation through staging to production. A model registry is the foundation of reproducible, auditable AI deployments.
The process of deploying trained ML models to production environments where they can receive inputs and return predictions at scale. Model serving infrastructure must address throughput, latency, versioning, and cost while meeting SLAs.
An extension of the attention mechanism that runs multiple attention functions in parallel, allowing the model to attend to information from different representation subspaces simultaneously. Multi-head attention is a core component of every transformer-based model.
AI systems that process and reason across multiple types of data simultaneously—text, images, audio, and video. Multimodal models enable richer enterprise applications such as document understanding that combines tables, charts, and prose.
AI systems designed and trained to perform a single, specific task—such as image classification, text translation, or fraud detection. Virtually all commercially deployed AI today is narrow AI, which is why domain-specific governance and risk management remain practical priorities.
A branch of AI that enables computers to understand, interpret, and generate human language. NLP powers applications like chatbots, translation services, sentiment analysis, and text summarization.
A computing system inspired by biological neural networks in the brain. It consists of interconnected nodes (neurons) organized in layers that process information and can learn patterns from data.
The US National Institute of Standards and Technology AI Risk Management Framework—a voluntary framework organised around four core functions: Govern, Map, Measure, and Manage. Widely adopted by enterprises globally as a complement to EU AI Act compliance programmes.
Deploying AI models on infrastructure owned and managed by the organisation rather than using cloud APIs. On-premise AI is preferred in regulated industries—banking, defence, healthcare—where data sovereignty, GDPR compliance, and latency requirements preclude sending data to external providers.
A training paradigm where a model is continuously updated with new data as it arrives, rather than in discrete batch retraining cycles. Online learning is used in recommendation systems and fraud detection where data distributions shift rapidly.
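A toy illustration of the pattern with a one-dimensional linear model and a hypothetical data stream; production systems apply the same incremental-update idea at far larger scale.

```python
def sgd_update(w, b, x, y, lr=0.05):
    """One online gradient step for a linear model y ~ w*x + b on a
    single streamed example, using squared-error loss."""
    error = (w * x + b) - y
    return w - lr * error * x, b - lr * error

w, b = 0.0, 0.0
stream = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)] * 50  # examples arrive over time
for x, y in stream:
    w, b = sgd_update(w, b, x, y)
# the model tracks the stream's roughly y = 2x relationship as data arrives
```

Unlike batch retraining, each example updates the model immediately and can then be discarded, which is what lets online systems track rapidly shifting distributions.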
An open format for representing machine learning models that enables interoperability across frameworks and hardware. Exporting to ONNX allows models trained in PyTorch to run efficiently on TensorRT, OpenVINO, or other optimised runtimes.
A family of techniques—including LoRA, prefix tuning, and adapters—that adapt large pre-trained models to new tasks by training only a small fraction of parameters. PEFT dramatically reduces compute and storage costs compared to full fine-tuning.
AI systems that perceive and act in the physical world through robotics, autonomous vehicles, smart manufacturing equipment, and industrial sensors. Physical AI integrates computer vision, reinforcement learning, and real-time control to automate tasks previously requiring human dexterity and judgment in unstructured environments.
A limited real-world deployment of an AI solution that tests performance, gathers user feedback, and refines the approach before broader rollout. A successful pilot de-risks the full investment and builds organisational confidence in AI.
A technique that injects information about the position of each token in a sequence into the transformer, compensating for the architecture's lack of inherent sequence awareness. Modern models use learned or rotary positional encodings (RoPE) to support long context windows.
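The original sinusoidal scheme from the 2017 transformer paper is the easiest to show in a few lines; RoPE instead applies a related rotation inside the attention computation, but the goal is the same. A sketch:

```python
import math

def sinusoidal_positional_encoding(position, d_model):
    """The original sinusoidal encoding: each pair of dimensions uses
    a sine/cosine at a different wavelength, giving every position a
    unique, smoothly varying signature."""
    pe = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        pe.append(math.sin(angle))
        pe.append(math.cos(angle))
    return pe[:d_model]

vec = sinusoidal_positional_encoding(position=5, d_model=8)
```

The resulting vector is added to (or, for RoPE, rotated into) the token embedding so attention can distinguish token order.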
The ongoing collection and analysis of data about a deployed AI system's real-world performance and risk profile, required for high-risk AI systems under the EU AI Act. Providers must proactively detect and report incidents, model drift, and new risks throughout the product's lifecycle.
AI systems that are live in enterprise environments, processing real transactions and influencing business decisions at scale. Production AI requires robust MLOps practices, monitoring, compliance controls, and SLAs—a significantly higher bar than prototype or pilot systems.
AI practices banned outright under the EU AI Act because they pose unacceptable risks to fundamental rights or democratic values. Banned uses include real-time biometric surveillance in public spaces, social scoring by public authorities, subliminal manipulation, and exploiting vulnerable groups.
A technique that caches the key-value (KV) attention states of a repeated prompt prefix, so subsequent requests reuse the pre-computed states instead of re-running the computation. Prompt caching reduces latency and cost significantly for applications with long system prompts.
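A toy sketch of the caching pattern. Real serving engines cache attention key/value tensors internally; here `encode_prefix` is a hypothetical stand-in for that expensive step, and the counter simply demonstrates that the cost is paid once per distinct prefix.

```python
import hashlib

prefix_cache = {}
compute_calls = 0

def encode_prefix(prefix):
    """Stand-in for the expensive attention pass over the prefix."""
    global compute_calls
    compute_calls += 1
    return f"kv-states:{len(prefix)}"

def get_prefix_states(prefix):
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in prefix_cache:
        prefix_cache[key] = encode_prefix(prefix)  # pay the cost once
    return prefix_cache[key]

system_prompt = "You are a careful compliance assistant. Always cite sources."
get_prefix_states(system_prompt)  # cache miss: computed
get_prefix_states(system_prompt)  # cache hit: reused
```

The savings grow with prefix length: a multi-thousand-token system prompt is encoded once rather than on every request.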
The practice of designing and optimising input prompts to elicit the best results from AI models, especially LLMs. Structured prompting—using system instructions, examples, and constraints—can dramatically improve output quality without changing the underlying model.
A small-scale implementation that demonstrates the technical feasibility of an AI solution without full production investment. POCs validate whether a chosen approach can work before committing resources to pilot and production phases.
The GDPR principle that personal data must be collected for specified, explicit, and legitimate purposes and not further processed in a way incompatible with those purposes. This constrains organisations from repurposing customer data to train internal AI models without fresh consent or a lawful basis.
A model compression technique that reduces the numerical precision of model weights—for example, from 32-bit floats to 8-bit integers—shrinking memory requirements and accelerating inference with minimal accuracy loss. Quantization is essential for deploying LLMs on-premise or at the edge.
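The core arithmetic can be sketched with symmetric per-tensor int8 quantization (real toolchains add per-channel scales, calibration, and more), using toy weights for illustration:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-127, 127] using a
    single per-tensor scale factor."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.82, -0.41, 0.05, -1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# each restored value lies within half a quantization step of the original
```

Each weight now needs 1 byte instead of 4, and integer arithmetic is faster on most hardware, which is where the memory and latency wins come from.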
An open-source, scalable model serving framework built on the Ray distributed computing library. Ray Serve supports complex inference pipelines with model composition, dynamic batching, and Python-native deployment, making it popular for LLM serving.
Serving model predictions with low latency in response to individual live requests, typically within milliseconds to seconds. Real-time inference is required for customer-facing applications like chatbots, fraud detection, and autonomous control systems.
A machine learning paradigm where an agent learns by interacting with an environment and receiving reward or penalty signals. RL is the foundation of game-playing systems like AlphaGo and increasingly powers robotics, logistics optimisation, and dynamic pricing.
A training technique that refines language model behaviour by learning from human preferences rather than fixed labels. RLHF is a primary method used to align LLMs like ChatGPT and Claude with desired values and reduce harmful outputs.
A shortcut connection that adds a layer's input directly to its output, allowing gradients to flow more easily during training and enabling very deep networks. Residual connections are essential in transformers and were pioneered in the ResNet image model family.
A technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating a response. RAG improves accuracy, reduces hallucinations, and enables LLMs to access up-to-date information without retraining.
AI applications that directly grow top-line revenue—through personalised recommendations, dynamic pricing, AI-generated leads, and hyper-personalised marketing. Revenue AI initiatives often have the highest strategic impact but require strong data foundations and careful A/B testing to validate causal lift.
The GDPR provisions giving individuals meaningful information about the logic of automated decision-making that significantly affects them, including profiling: Articles 13–15 establish the information rights, while Article 22 restricts solely automated decisions with significant effects. Together they drive XAI requirements in credit scoring, HR, and other high-stakes AI applications.
A training paradigm where a model generates its own supervision signal from unlabelled data—for example, by predicting masked words in a sentence. Self-supervised pre-training on internet-scale text or images is what enables foundation models to acquire broad world knowledge.
A search approach that retrieves results based on the meaning and intent of a query rather than exact keyword matches. Semantic search is powered by embeddings and vector databases, enabling far more relevant enterprise knowledge retrieval.
A deployment technique where a new model processes production traffic in parallel with the existing system but its predictions are logged rather than served to users. Shadow mode enables risk-free validation of model behaviour on real-world data before cutover.
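A minimal sketch of the routing pattern: the incumbent's prediction is served, the candidate's is only logged. The model functions and log structure below are hypothetical stand-ins for real serving components.

```python
shadow_log = []

def incumbent_model(request):
    """Hypothetical production model; users see its output."""
    return "approve"

def candidate_model(request):
    """Hypothetical new model running in shadow; output is logged only."""
    return "review"

def handle_request(request):
    served = incumbent_model(request)   # users only ever see this
    shadow = candidate_model(request)   # computed but never served
    shadow_log.append({"request": request, "served": served, "shadow": shadow})
    return served

decision = handle_request({"amount": 120})
```

Comparing `served` and `shadow` columns offline reveals disagreement rates and failure modes on real traffic before any user is exposed to the new model.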
A structured programme designed to help small and medium enterprises identify high-value AI use cases, rapidly pilot solutions using existing AI platforms, and build internal capability—without the upfront investment required by large enterprise AI programmes.
A mathematical function that converts a vector of raw logits into a probability distribution summing to 1. Softmax is applied at the output of classifiers and language models to produce token probabilities used in sampling and decoding.
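The function itself is a one-liner plus a standard stability trick, sketched here in plain Python:

```python
import math

def softmax(logits):
    """Map raw logits to a probability distribution. Subtracting the
    max logit first is the standard numerical-stability trick and does
    not change the result."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # the largest logit gets the largest probability
```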
A model architecture that activates only a subset of its parameters for any given input, rather than the full network. Sparse models—enabled by Mixture of Experts designs—achieve larger total capacity while keeping per-inference compute manageable.
A machine learning approach where the model is trained on labelled input-output pairs. The model learns to map inputs to correct outputs, enabling predictions on new, unseen data. Most commercial ML models in production today are supervised.
A mandatory artefact under the EU AI Act that high-risk AI providers must prepare and maintain, detailing the system's purpose, architecture, training methodology, validation data, performance metrics, risk management measures, and post-market monitoring plan.
A sampling parameter that controls the randomness of an LLM's output. A temperature of 0 produces deterministic, focused responses; higher values introduce creative variability. Selecting the right temperature is part of operationalising LLMs in production workflows.
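Mechanically, temperature divides the logits before the softmax, as this sketch shows (a temperature of exactly 0 is handled as greedy decoding in practice, since division by zero is undefined):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before the softmax. Low values
    sharpen the distribution toward the top logit; high values flatten
    it toward uniform."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # more uniform, more varied sampling
```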
The total number of tokens allocated for a model request, encompassing both input (prompt + context) and output. Managing token budgets is central to controlling inference cost in production LLM applications, especially when processing long documents or maintaining conversational history.
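A minimal budget-and-cost check; the per-token prices and context limit below are hypothetical, as real values vary by provider and model.

```python
PRICE_PER_1K_INPUT = 0.003   # hypothetical price per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.015  # hypothetical price per 1,000 output tokens
CONTEXT_LIMIT = 128_000      # hypothetical context window

def request_cost(input_tokens, output_tokens):
    """Reject requests that exceed the context window, then price
    input and output tokens at their separate rates."""
    if input_tokens + output_tokens > CONTEXT_LIMIT:
        raise ValueError("request exceeds the token budget")
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

cost = request_cost(input_tokens=20_000, output_tokens=1_000)
```

Note that output tokens are typically priced several times higher than input tokens, so capping response length is often the quickest cost lever.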
The process of splitting raw text into smaller units called tokens—typically sub-words or word-pieces—that serve as the basic input unit for language models. Token count determines both model context limits and API pricing, making tokenization an important operational consideration.
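A toy greedy longest-match tokenizer (WordPiece-style) over a hand-written vocabulary illustrates the mechanics; production tokenizers learn their vocabularies from data, for example via byte-pair encoding (BPE).

```python
def tokenize(text, vocab):
    """Greedy longest-match sub-word tokenization over a toy vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append("<unk>")  # no known piece starts here
            i += 1
    return tokens

vocab = {"token", "ization", "ize", "r"}
pieces = tokenize("tokenization", vocab)  # ['token', 'ization']
```

One word can become several tokens, which is why character counts are a poor proxy for context usage and API cost.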
A decoding strategy that restricts the model's next-token choices to the smallest set of tokens whose cumulative probability exceeds a threshold p. Used alongside temperature, top-p sampling balances output diversity and coherence in production LLM deployments.
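The nucleus-selection step can be sketched as follows: sort token probabilities, accumulate until the threshold is crossed, and renormalise over the surviving set.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalise over that nucleus."""
    indexed = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for idx, prob in indexed:
        nucleus.append((idx, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in nucleus)
    return {idx: prob / total for idx, prob in nucleus}

dist = top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.75)  # keeps tokens 0 and 1
```

Unlike a fixed top-k cutoff, the nucleus shrinks when the model is confident and widens when it is uncertain.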
The process of exposing a model to large volumes of labelled or unlabelled data and adjusting its internal parameters to minimise prediction errors. Training is computationally intensive, often performed on GPU clusters, and constitutes the major cost in foundation model development.
A technique where a model pre-trained on one task or dataset is adapted for a different but related task. Transfer learning dramatically reduces the data and compute needed to build performant models for specialised domains like legal, medical, or industrial applications.
The dominant neural network architecture for language, vision, and multimodal AI, introduced in the 2017 "Attention Is All You Need" paper. Transformers use self-attention to process all tokens in parallel, enabling training on internet-scale data and powering every major LLM in use today.
NVIDIA's open-source inference serving software that supports multiple frameworks (TensorRT, ONNX, PyTorch, TensorFlow) on GPU infrastructure. Triton is widely used in enterprise deployments requiring maximum throughput from GPU hardware.
The highest risk tier under the EU AI Act, covering AI applications whose potential harms to society are deemed too severe to permit. Systems in this category are prohibited from being placed on the market or put into service in the EU.
A machine learning approach that discovers structure in unlabelled data. Common techniques include clustering (grouping similar items) and dimensionality reduction. It is often used for anomaly detection and exploratory data analysis.
A specialised database designed to store and efficiently search high-dimensional embedding vectors using approximate nearest-neighbour algorithms. Vector databases are the infrastructure backbone of RAG systems and semantic search applications.
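The core operation is similarity search over embeddings. This sketch uses an exact brute-force scan with toy 3-dimensional vectors; vector databases replace the linear scan with approximate indexes (such as HNSW) so the search stays fast over millions of high-dimensional vectors.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, index, k=1):
    """Exact nearest-neighbour search by scanning every stored vector."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.0, 1.0, 0.2],
    "doc-c": [0.8, 0.2, 0.1],
}
top_docs = nearest([1.0, 0.0, 0.0], index, k=2)  # ['doc-a', 'doc-c']
```

In a RAG pipeline, `query` would be the embedding of a user question and the returned document IDs would feed the retrieval step.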
An adaptation of the transformer architecture to image data, treating fixed-size image patches as tokens. ViTs now outperform convolutional networks on many computer vision benchmarks and are used in medical imaging, satellite analysis, and industrial quality control.
The ability of a model to perform a task it has never been explicitly trained or prompted on, relying solely on general knowledge acquired during pre-training. Zero-shot capability is a hallmark of large foundation models and simplifies rapid prototyping.
Book a consultation to discuss how to apply AI concepts to your challenges.