A model compression technique that reduces the numerical precision of model weights, for example from 32-bit floats to 8-bit integers, shrinking memory requirements and accelerating inference with minimal accuracy loss. Quantization is essential for deploying LLMs on-premises or at the edge.
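To make the idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. The helper names `quantize_int8` and `dequantize` are illustrative, not from any particular library; production toolkits typically quantize per-channel and calibrate activations as well.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization of float32 weights to int8."""
    # Scale maps the largest-magnitude weight onto the int8 range [-127, 127].
    # The small floor guards against division by zero for all-zero tensors.
    scale = max(np.abs(weights).max() / 127.0, 1e-12)
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 weights."""
    return q.astype(np.float32) * scale

# Storage drops 4x: float32 (4 bytes) -> int8 (1 byte), plus one scale factor.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max reconstruction error:", np.abs(w - w_hat).max())
```

The reconstruction error printed at the end is the "minimal accuracy loss" in miniature: each weight is recovered to within half a quantization step (scale / 2).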