A set of techniques—including quantization, distillation, pruning, and low-rank factorisation—that reduce model size and computational requirements while preserving performance. Model compression is essential for deploying powerful models on edge hardware or within cost budgets.
Réservez une consultation pour discuter de l'application des concepts IA à vos défis.