Tools I use in production — not tools I have a partnership with
Every technology listed here has been deployed in a production system. I am vendor-independent by conviction — the right tool depends on your use case, your data, and your budget. No partnership agreement influences my recommendations.
Foundation models for reasoning, generation, and multimodal tasks
Industry-recognised credentials demonstrating expertise
Scrum.org
Product ownership and value maximization in Scrum
Issued 2019
Scrum Alliance
Agile facilitation and Scrum framework mastery
Issued 2018
Scaled Agile
Scaled Agile Framework for enterprise transformation
Issued 2021
Product School
Building and managing AI-powered products
Issued 2023
DeepLearning.AI
Neural networks, CNNs, RNNs, and transformers
Issued 2022
No vendor partnership influences my recommendations. I pick the model that matches your latency, cost, and accuracy requirements.
Every tool here has been deployed in a system that handles real traffic. Tools tested only in the lab are not on this list.
Most AI projects overspend 3–5× on infrastructure. I right-size from day one — smaller models, smarter caching, more efficient inference.
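The caching claim can be made concrete. A minimal sketch, assuming a hypothetical `call_model` function that stands in for any paid LLM API: identical prompts are served from an in-memory cache instead of triggering a new billable call.

```python
from functools import lru_cache

API_CALLS = 0  # counts how often we actually reach the (hypothetical) paid API

@lru_cache(maxsize=1024)
def call_model(prompt: str) -> str:
    """Stand-in for a real LLM API call; only cache misses reach it."""
    global API_CALLS
    API_CALLS += 1
    return f"answer to: {prompt}"

# Three requests, two unique prompts -> only two billable calls.
call_model("What is our refund policy?")
call_model("What is our refund policy?")  # served from cache, no API cost
call_model("Summarise the open tickets")
print(API_CALLS)  # → 2
```

Production systems replace `lru_cache` with a shared cache (Redis, for instance) keyed on a normalised prompt, but the economics are the same: repeated prompts cost nothing.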
AI models change every quarter. My architectures abstract the model layer so you can switch providers without rewriting your application.
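That abstraction can be sketched in a few lines; the provider classes below are illustrative stand-ins, not real SDK wrappers. Application code depends on one `Provider` protocol, so swapping vendors is a one-line change at the call site.

```python
from typing import Protocol

class Provider(Protocol):
    def complete(self, prompt: str) -> str: ...

# Hypothetical adapters; real ones would wrap each vendor's SDK.
class ClaudeProvider:
    def complete(self, prompt: str) -> str:
        return f"[claude] {prompt}"

class MistralProvider:
    def complete(self, prompt: str) -> str:
        return f"[mistral] {prompt}"

def answer(provider: Provider, question: str) -> str:
    """Application code: never imports a vendor SDK directly."""
    return provider.complete(question)

# Switching providers touches only this line, not the application.
print(answer(ClaudeProvider(), "hello"))  # → [claude] hello
```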
Book a 30-minute call. I will assess your needs and recommend the right combination of models, infrastructure, and frameworks — with cost estimates.
Every tool we evaluate, deploy, or recommend — with honest assessments.
Anthropic
Most capable Claude model — complex reasoning, long-context analysis, agentic tasks.
Official documentation →
Anthropic
Best balance of intelligence and speed for production workloads.
Official documentation →
Anthropic
Fastest and lowest-cost Claude model for high-volume tasks.
Official documentation →
Anthropic
AI-native CLI for agentic software engineering — reads, writes, and runs code autonomously.
Official documentation →
Anthropic
Open protocol connecting AI assistants to external tools, data sources, and services.
Official documentation →
Anthropic
Build, orchestrate, and deploy multi-agent systems powered by Claude.
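The open protocol in the MCP card above frames every exchange as JSON-RPC 2.0. A sketch of what a tool-invocation request looks like on the wire — the tool name and arguments here are made up for illustration:

```python
import json

# A JSON-RPC 2.0 request as used by MCP; "tools/call" asks a server
# to run one of the tools it advertised. Tool name/args are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",
        "arguments": {"query": "refund policy"},
    },
}
wire = json.dumps(request)
print(wire)
```

Because the framing is plain JSON-RPC, any assistant that speaks the protocol can call any compliant server, which is what makes MCP integrations reusable across vendors.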
Official documentation →
Mistral AI
Top-tier reasoning model with 128K context — Mistral's flagship for enterprise tasks.
Official documentation →
Mistral AI
Cost-efficient multimodal model — text and image understanding.
Official documentation →
Mistral AI
Apache 2.0 multilingual model — EU-sovereign deployments, 128K context.
Official documentation →
Mistral AI
Code generation specialist — 80+ languages, fill-in-the-middle, 32K context.
Official documentation →
Mistral AI
Frontier vision-language model — document analysis, chart reading, 128K context.
Official documentation →
Mistral AI
High-quality text embeddings for RAG and semantic search.
Official documentation →
Mistral AI
Train and own frontier AI model weights outright — no API rental, full data sovereignty.
Official documentation →
Mistral AI
Enterprise AI assistant — SSO, audit logs, EU data residency, web search, document upload.
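Before an embedding model like the one in the cards above sees your documents, a RAG pipeline splits them into overlapping chunks. A minimal sliding-window chunker — the window size and overlap are arbitrary illustrative values:

```python
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` characters, overlapping by `overlap`."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "a" * 100
pieces = chunk(doc)
print(len(pieces))  # → 3
```

Real pipelines chunk on sentence or token boundaries rather than raw characters, but the shape is the same: overlapping windows so no answer is split across a chunk edge.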
Official documentation →
Meta
Meta's flagship open-weight model — matches GPT-4 on many benchmarks at a fraction of the cost.
Official documentation →
Meta
Lightweight Llama models for mobile, edge, and on-device inference.
Official documentation →
Meta
Vision-language Llama models — image understanding, document analysis.
Official documentation →
Google
Google's open-weight family — strong reasoning, multilingual, edge-to-server range.
Official documentation →
Microsoft
MIT-licensed reasoning specialist — outperforms models 3× larger on math and coding.
Official documentation →
Microsoft
Edge-optimised reasoning model — 3.8B parameters, strong instruction following on constrained hardware.
Official documentation →
Alibaba
Alibaba's Apache 2.0 multilingual family — exceptional Chinese/English, strong math, full size range.
Official documentation →
Alibaba
State-of-the-art open-source code generation — rivals GPT-4o on coding benchmarks.
Official documentation →
DeepSeek
MIT-licensed reasoning specialist with chain-of-thought — matches o1 on math and science tasks.
Official documentation →
DeepSeek
671B-parameter MoE open-weight general model — top open-source benchmark scores across categories.
Official documentation →
TII UAE
TII's Apache 2.0 family — strong multilingual performance, designed for EU/MENA sovereign deployments.
Official documentation →
Hugging Face
Ultra-compact models for on-device and browser inference — Apache 2.0, an efficiency benchmark.
Official documentation →
Ollama
One-command local model serving — runs Llama, Mistral, Gemma, and 100+ models on any hardware.
Official documentation →
vLLM Project
High-throughput production LLM serving — PagedAttention, continuous batching, OpenAI-compatible API.
Official documentation →
Hugging Face
Hugging Face's production inference server — tensor parallelism, quantization, streaming.
Official documentation →
ggerganov
CPU/GPU inference in C++ — GGUF format, runs on Apple Silicon, NVIDIA, AMD, and CPU-only machines.
Official documentation →
LM Studio
Desktop GUI for discovering, downloading, and running local LLMs — OpenAI-compatible server.
Official documentation →
Hugging Face
Run Transformers in the browser and Node.js — ONNX-based, no server required.
Official documentation →
Microsoft
Cross-platform optimised inference — CPU, GPU, mobile, browser, WASM support.
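What makes the local servers above interchangeable is that Ollama, vLLM, LM Studio, and llama.cpp's server all expose an OpenAI-compatible chat endpoint. A sketch of the request body such a server expects — the model name is a placeholder, and actually sending it requires a server running locally, so this only builds the payload:

```python
import json

# Chat-completions payload in the OpenAI-compatible format that local
# servers (Ollama, vLLM, LM Studio, llama.cpp server) accept.
payload = {
    "model": "llama3",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Ping?"},
    ],
    "temperature": 0.2,
}
body = json.dumps(payload).encode()
# To send, POST `body` to the server's /v1/chat/completions endpoint
# (e.g. Ollama listens on port 11434 by default).
print(len(body) > 0)
```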
Official documentation →
BerriAI
Universal LLM API proxy — call 100+ models with the OpenAI format, load balancing, fallbacks.
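The fallback behaviour a proxy like this provides can be sketched in a few lines: try providers in order and return the first success. The providers here are stand-in functions, not real SDK calls.

```python
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary provider down")  # simulated outage

def stable_backup(prompt: str) -> str:
    return f"backup says: {prompt}"

def complete_with_fallback(prompt: str, providers) -> str:
    """Try each provider in order; re-raise only if all of them fail."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as err:
            last_error = err
    raise last_error

print(complete_with_fallback("hi", [flaky_primary, stable_backup]))
# → backup says: hi
```

Production proxies add retry budgets, cooldowns, and per-provider cost routing on top, but this is the core loop.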
Official documentation →
Unsloth AI
2× faster fine-tuning with 70% less VRAM — LoRA and QLoRA for Llama, Mistral, Qwen, Gemma.
Official documentation →
OpenAccess AI Collective
Production fine-tuning framework — YAML config, LoRA/QLoRA/full fine-tuning, multi-GPU, Flash Attention.
Official documentation →
hiyouga
Fine-tune 100+ LLMs with a web UI or CLI — SFT, DPO, GRPO, LoRA, QLoRA.
Official documentation →
PyTorch
PyTorch-native fine-tuning library — recipe-based, minimal dependencies, full control.
Official documentation →
Hugging Face
Parameter-Efficient Fine-Tuning — LoRA, QLoRA, IA³, AdaLoRA, Prefix Tuning.
Official documentation →
Hugging Face
Transformer Reinforcement Learning — SFT, DPO, GRPO, PPO, ORPO for alignment training.
Official documentation →
Microsoft
ZeRO optimizer for large-model training — 10× throughput, trillion-parameter scale.
Official documentation →
Hugging Face
One-line multi-GPU and TPU training — no code changes, FSDP and DeepSpeed integration.
Official documentation →
NVIDIA
NVIDIA's large-scale pre-training framework — tensor, pipeline, and sequence parallelism.
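The LoRA technique behind several cards above (Unsloth, PEFT, Axolotl) freezes the weight matrix W and trains a low-rank update BA instead. The arithmetic that makes this cheap, for one hypothetical 1024×1024 layer at rank 8:

```python
d, r = 1024, 8  # layer width and LoRA rank (illustrative values)

full = d * d      # trainable params for full fine-tuning of one layer
lora = 2 * d * r  # LoRA trains B (d×r) and A (r×d) instead

print(full, lora, full // lora)  # → 1048576 16384 64
```

A 64× reduction per layer, before quantization — which is why QLoRA can fine-tune multi-billion-parameter models on a single consumer GPU.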
Official documentation →
Hugging Face
900K+ models, 100K+ datasets, and Spaces — the de facto standard for sharing AI artifacts.
Official documentation →
Hugging Face
Core model library — load, run, and fine-tune any model in PyTorch, TensorFlow, or JAX.
Official documentation →
Hugging Face
100K+ datasets with streaming, Arrow-based loading, and one-line preprocessing.
Official documentation →
Hugging Face
Managed dedicated or serverless model deployment — auto-scaling, private endpoints.
Official documentation →
Hugging Face
No-code fine-tuning for LLMs and other models — SFT, DPO, classification, NER.
Official documentation →
Hugging Face
Host Gradio and Streamlit ML demos — free tier available, GPU-enabled options.
Official documentation →
Hugging Face
Standardised metrics library — BLEU, ROUGE, accuracy, F1, and 100+ custom metrics.
Official documentation →
Hugging Face
Programmatic Hub access — upload models, create repos, manage tokens, search.
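Metrics like those in the evaluation card above reduce to simple counting. Accuracy and F1 on a toy binary task, computed by hand so the definitions are explicit:

```python
preds  = [1, 0, 1, 1, 0, 1]
labels = [1, 0, 0, 1, 0, 0]

# Confusion-matrix counts for the positive class.
tp = sum(p == 1 and l == 1 for p, l in zip(preds, labels))
fp = sum(p == 1 and l == 0 for p, l in zip(preds, labels))
fn = sum(p == 0 and l == 1 for p, l in zip(preds, labels))

accuracy  = sum(p == l for p, l in zip(preds, labels)) / len(labels)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(round(accuracy, 3), round(f1, 3))  # → 0.667 0.667
```

Libraries add batching, per-class averaging, and bootstrapped confidence intervals, but auditing a metric by hand like this is a useful sanity check before trusting a dashboard.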
Official documentation →
LangChain
LLM application framework — chains, agents, RAG, tool use, memory.
Official documentation →
LlamaIndex
Data framework for LLM apps — ingestion, indexing, and querying over any data source.
Official documentation →
deepset
Production NLP pipeline framework — RAG, document search, question answering.
Official documentation →
Stanford NLP
Declarative LLM programming — optimise prompts and weights automatically.
Official documentation →
Jason Liu
Structured output extraction — Pydantic schemas from any LLM, with validation and retries.
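The pattern the structured-output card above automates — validate an LLM's JSON reply against a schema, retry on failure — can be sketched with the stdlib alone. `fake_llm` is a stand-in for a real model call whose first reply is malformed:

```python
import json
from dataclasses import dataclass

@dataclass
class Invoice:
    customer: str
    total: float

def fake_llm(prompt: str, attempt: int) -> str:
    # Stand-in for a real model: first reply malformed, retry valid.
    return "not json" if attempt == 0 else '{"customer": "ACME", "total": 99.5}'

def extract(prompt: str, retries: int = 2) -> Invoice:
    """Parse the model's reply into an Invoice; re-prompt on validation failure."""
    for attempt in range(retries):
        try:
            data = json.loads(fake_llm(prompt, attempt))
            return Invoice(customer=str(data["customer"]), total=float(data["total"]))
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            continue  # a real system would feed the error back into the prompt
    raise ValueError("no valid structured output after retries")

print(extract("Extract the invoice."))
```

Libraries in this space replace the dataclass with a Pydantic schema and feed validation errors back to the model on retry; the control flow is the same.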
Official documentation →
Microsoft
Enterprise LLM orchestration for .NET, Python, and Java — plugins, planners, memory.
Official documentation →
CrewAI
Role-based multi-agent orchestration — agents collaborate with defined roles and goals.
Official documentation →
Microsoft
Microsoft's multi-agent conversation framework — async agents, human-in-the-loop.
Official documentation →
Hugging Face
Minimal agentic framework — code-first agents that write and execute Python, with a 1000-line core.
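All the agent frameworks above share one core loop: the model picks a tool, the runtime executes it, and the result feeds back into the next decision. A toy version with a scripted "model" — real frameworks replace `scripted_model` with an LLM:

```python
def add(a: float, b: float) -> float:
    return a + b

def mul(a: float, b: float) -> float:
    return a * b

TOOLS = {"add": add, "mul": mul}  # the tool registry the agent may use

def scripted_model(history):
    # Stand-in policy: real agents ask an LLM which tool to call next.
    if not history:
        return ("add", (2, 3))             # step 1: 2 + 3
    if len(history) == 1:
        return ("mul", (history[-1], 10))  # step 2: previous result × 10
    return None                            # done

def run_agent():
    history = []
    while (action := scripted_model(history)) is not None:
        name, args = action
        history.append(TOOLS[name](*args))  # execute the chosen tool
    return history[-1]

print(run_agent())  # → 50
```

Everything the frameworks add — role prompts, memory, retries, human-in-the-loop gates — wraps this same decide/execute/observe loop.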
Official documentation →
Qdrant
Rust-based vector search — on-prem friendly, filterable, sparse+dense hybrid search.
Official documentation →
Weaviate
GraphQL API vector database — multi-tenancy, hybrid search, generative search.
Official documentation →
Chroma
Local-first open-source vector database — Python-native, zero infrastructure required.
Official documentation →
Zilliz
Distributed vector search for billion-scale data — HNSW, IVF, GPU acceleration.
Official documentation →
PostgreSQL
Vector similarity search extension for PostgreSQL — no separate infrastructure needed.
Official documentation →
Pinecone
Managed cloud vector database — serverless tier, namespaces, metadata filtering.
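What every database in this section optimises is, at its core, nearest-neighbour search over embeddings. A brute-force version in pure Python, with toy 3-dimensional vectors standing in for real embeddings (which have hundreds of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Tiny in-memory "index": document id -> embedding vector (made-up values).
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-faq":  [0.1, 0.9, 0.1],
    "press-release": [0.0, 0.2, 0.9],
}

def search(query_vec, k=1):
    """Rank documents by cosine similarity to the query vector."""
    ranked = sorted(index, key=lambda doc: cosine(index[doc], query_vec), reverse=True)
    return ranked[:k]

print(search([1.0, 0.0, 0.0]))  # → ['refund-policy']
```

Brute force is O(n) per query; the products above exist because HNSW and IVF indexes make the same lookup sub-linear at millions or billions of vectors.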
Official documentation →
Pollen Robotics
Open-source humanoid robot for research and industry — Apache 2.0, ROS2, Python SDK.
Official documentation →
Open Robotics
Robot Operating System 2 — real-time communication, sensor fusion, navigation stack.
Official documentation →
Hugging Face
Open-source robot learning — imitation learning, reinforcement learning, pre-trained policies.
Official documentation →
NVIDIA
Robot simulation and deployment platform — synthetic data generation, physics simulation.
Official documentation →
OpenCV
Computer vision library — 2500+ algorithms, real-time image processing, widely deployed.
Official documentation →
Meta
Segment Anything Model 2 — real-time video and image segmentation, zero-shot.
Official documentation →
Ultralytics
Real-time object detection — fast production-grade detector, ONNX/CoreML export.
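The image-processing primitive underneath many of the OpenCV filters above is convolution: slide a small kernel over the image and replace each pixel with a weighted sum of its neighbours. A pure-Python 3×3 mean filter (box blur) on a toy grayscale image:

```python
def box_blur(img):
    """3×3 mean filter over a grayscale image (list of lists); borders kept as-is."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ) / 9
    return out

img = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
print(box_blur(img)[1][1])  # → 1.0
```

Libraries vectorise this and swap in other kernels (Gaussian, Sobel, sharpen), but the sliding-window structure is identical.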
Official documentation →
Amazon
Managed foundation-model APIs on AWS — Claude, Llama, Mistral, Titan, Stable Diffusion.
Official documentation →
Microsoft
Microsoft's enterprise AI platform — model catalog, fine-tuning, responsible-AI tools.
Official documentation →
Google Cloud
GCP's unified AI/ML platform — Gemini, Model Garden, AutoML, feature store.
Official documentation →
Cloudflare
Run AI models at the edge globally — Workers AI, 100+ models, serverless inference.
Official documentation →
LangChain
LLM observability and tracing — log runs, compare prompts, regression testing.
Official documentation →
Weights & Biases
ML experiment tracking, visualisation, and hyperparameter sweeps — an industry standard.
Official documentation →
Databricks
ML lifecycle management — experiment tracking, model registry, deployment.
Official documentation →
CNCF / Grafana Labs
Inference metrics collection and dashboards — latency, throughput, error rates.
Official documentation →
Arize AI
LLM evaluation and monitoring — hallucination detection, embedding visualisation, drift.
Official documentation →
Exploding Gradients
RAG evaluation framework — faithfulness, answer relevancy, and context precision metrics.
Official documentation →
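The latency dashboards in the monitoring card above boil down to percentile computation over request timings. With the stdlib alone — the timings here are made up, including one slow outlier:

```python
import statistics

# Made-up request timings in milliseconds; 480 is a deliberate outlier.
latencies_ms = [120, 95, 110, 480, 105, 99, 130, 101, 115, 108]

p50 = statistics.median(latencies_ms)
# quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile.
p95 = statistics.quantiles(latencies_ms, n=20, method="inclusive")[18]
print(p50, p95)  # → 109.0 322.5
```

The gap between p50 and p95 is the point: averages hide tail latency, which is why production SLOs are written against percentiles.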