Mistral AI provides cloud-based large language models accessible via API for developers.
The platform offers multiple model tiers, including Mistral Small for low-latency tasks and Mistral Large 2 for advanced reasoning. Developers can integrate via the mistralai Python SDK or the @mistralai/mistralai package for Node.js, with additional features like web search and tool use available through the Agents API. Fine-tuning is supported for Mistral Medium 2 and Large 2 (Pro/Enterprise plans only), while costs can be reduced through batch processing and careful model selection.
1. Getting an API Key
- Sign up at console.mistral.ai
- Navigate to API Keys → Create Key
- Copy the key (starts with `mistral-`) and store it securely
# Test your key (replace YOUR_API_KEY)
curl --location "https://api.mistral.ai/v1/models" \
--header "Authorization: Bearer YOUR_API_KEY" \
--header "Content-Type: application/json"
Expected output:
{
  "object": "list",
  "data": [
    {
      "id": "mistral-small-latest",
      "object": "model",
      "created": 1693499933,
      "owned_by": "mistralai"
    },
    ...
  ]
}
Gotcha: Free tier keys expire after 30 days. Rotate keys in production using environment variables:
export MISTRAL_API_KEY="your_key_here"
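A small helper keeps the key out of source code and fails loudly when the variable is missing (a sketch; `load_api_key` is our own name, not part of the SDK):

```python
import os

def load_api_key(var: str = "MISTRAL_API_KEY") -> str:
    """Read the API key from the environment; fail fast if it is unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running.")
    return key
```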
2. Using the Python and JS SDKs
Python SDK
pip install mistralai --upgrade # v0.3.0
Basic chat completion:
import os

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])
response = client.chat(
    model="mistral-small-latest",
    messages=[ChatMessage(role="user", content="Explain LLMs in 1 sentence")],
)
print(response.choices[0].message.content)
Output:
Large language models are neural networks trained on vast text data to predict and generate human-like text.
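API calls can fail transiently (rate limits, network blips), so production code usually wraps them in retries. A generic backoff wrapper, sketched here with our own names (`with_retries` is not part of the SDK):

```python
import random
import time

def with_retries(call, max_attempts=4, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # back off ~1s, ~2s, ~4s, ... with a little jitter
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Usage: `with_retries(lambda: client.chat(model="mistral-small-latest", messages=messages))`.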
Streaming response:
for chunk in client.chat_stream(
    model="mistral-medium-latest",
    messages=[ChatMessage(role="user", content="Write Python code to sort a list")],
):
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
Node.js SDK
npm install @mistralai/mistralai
Basic usage:
import { Mistral } from "@mistralai/mistralai";

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });
const response = await client.chat.complete({
  model: "mistral-small-latest",
  messages: [{ role: "user", content: "What's the capital of France?" }],
});
console.log(response.choices[0].message.content);
Output:
The capital of France is Paris.
3. Model Selection
| Model | Use Case | Context Window | Latency | Cost (input/output per 1K tokens) |
|---|---|---|---|---|
| Mistral Small | Low-latency tasks (chatbots, Q&A) | 8K tokens | <500ms | $0.0005 / $0.0015 |
| Mistral Medium 2 | Balanced cost/performance | 16K tokens | ~800ms | $0.001 / $0.003 |
| Mistral Large 2 | Complex reasoning (agents, RAG) | 32K tokens | ~1.2s | $0.003 / $0.006 |
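The per-token prices above make rough budgeting straightforward. A quick estimator, with prices hard-coded from the table (adjust to whatever the current price sheet says):

```python
# (input, output) price per 1K tokens, taken from the table above
PRICES = {
    "mistral-small-latest": (0.0005, 0.0015),
    "mistral-medium-latest": (0.001, 0.003),
    "mistral-large-latest": (0.003, 0.006),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request against a given model."""
    price_in, price_out = PRICES[model]
    return (input_tokens / 1000) * price_in + (output_tokens / 1000) * price_out
```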
Example: Model comparison for code generation
models = ["mistral-small-latest", "mistral-medium-latest", "mistral-large-latest"]
for model in models:
    response = client.chat(
        model=model,
        messages=[ChatMessage(role="user", content="Write a Python function to reverse a string")],
    )
    print(f"\n{model}:\n{response.choices[0].message.content}")
Pro tip: Use Mistral Small for prototyping, then switch to Medium/Large for production.
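That tiering can be automated with a naive router. The keyword lists below are illustrative assumptions of ours, not anything the API provides:

```python
def pick_model(task: str) -> str:
    """Route a task description to a model tier by crude keyword matching."""
    complex_markers = ("plan", "multi-step", "analyze", "agent", "rag")
    moderate_markers = ("summarize", "rewrite", "translate")
    lowered = task.lower()
    if any(m in lowered for m in complex_markers):
        return "mistral-large-latest"
    if any(m in lowered for m in moderate_markers):
        return "mistral-medium-latest"
    return "mistral-small-latest"
```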
4. Web Search and Tool Use
Web Search (2026 Feature)
Enable web search in the API call:
response = client.chat(
    model="mistral-large-latest",
    messages=[ChatMessage(role="user", content="What's the latest news on Mistral AI?")],
    tools=[{"type": "web_search"}],  # the model formulates the search query itself
)
Function Calling
Define tools and let the model decide when to use them:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat(
    model="mistral-large-latest",
    messages=[ChatMessage(role="user", content="What's the weather in Paris?")],
    tools=tools,
)
Expected output:
{
  "tool_calls": [
    {
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\": \"Paris\", \"unit\": \"celsius\"}"
      }
    }
  ]
}
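The model only proposes the call; your code has to execute it. A minimal dispatcher, shown operating on a plain-dict tool call shaped like the output above (the `get_weather` body is a stand-in of ours, not a real weather client):

```python
import json

def get_weather(location, unit="celsius"):
    # Stand-in implementation; a real version would query a weather API.
    return {"location": location, "temperature": 18, "unit": unit}

TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_tool_call(tool_call: dict):
    """Parse a tool call's JSON arguments and invoke the matching function."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return TOOL_REGISTRY[name](**args)
```

The result is then appended to the conversation as a tool message and sent back to the model.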
5. Fine-Tuning Basics
Step 1: Prepare Your Dataset
Format your data as JSONL:
{"prompt": "What is Mistral AI?", "completion": "Mistral AI is a cutting-edge AI lab based in France..."}
{"prompt": "Who founded Mistral AI?", "completion": "Mistral AI was founded by Arthur Mensch, Guillaume Lample, and Timothée Lacroix in 2023."}
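Before uploading, it pays to validate the file; malformed lines fail the whole job. A small checker for the prompt/completion shape shown above (our own helper, not part of the SDK):

```python
import json

REQUIRED_KEYS = {"prompt", "completion"}

def validate_jsonl(lines):
    """Return a list of error strings; an empty list means the data is clean."""
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {i}: invalid JSON ({exc})")
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            errors.append(f"line {i}: missing keys {sorted(missing)}")
    return errors
```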
Step 2: Upload to Mistral
# Upload the JSONL file via the console first, then reference its file ID:
curl -X POST "https://api.mistral.ai/v1/fine_tuning/jobs" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-medium-2",
    "training_file": "file-123abc",
    "hyperparameters": {
      "n_epochs": 3,
      "batch_size": 8
    }
  }'
Step 3: Deploy Your Fine-Tuned Model
response = client.chat(
    model="ft:mistral-medium-2:your-org:custom-model-id",
    messages=[ChatMessage(role="user", content="What is Mistral AI?")],
)
Cost: $0.01 per 1K training tokens (Pro/Enterprise plans only). See the Fine-Tuning Pricing page for current rates.
6. Agents API Setup
Step 1: Define Your Agent
agent = client.agents.create(
    name="TravelPlanner",
    model="mistral-large-latest",
    instructions="You are a travel planning assistant. Use tools to book flights and hotels.",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "book_flight",
                "description": "Book a flight between two cities",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "origin": {"type": "string"},
                        "destination": {"type": "string"},
                        "date": {"type": "string", "format": "date"},
                    },
                },
            },
        }
    ],
)
Step 2: Run the Agent
response = client.agents.run(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Plan a trip to Tokyo from Paris for next month"}],
)
Expected flow:
- Agent asks for travel dates
- Calls the `book_flight` tool with parameters
- Returns confirmation
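That flow generalizes to a loop: call the agent, execute any requested tools, feed the results back, and repeat until it answers. A framework-agnostic sketch (`run_step` stands in for a call like `client.agents.run`; all names here are our own):

```python
import json

def agent_loop(run_step, handlers, messages, max_turns=5):
    """Drive an agent until it answers without requesting more tools."""
    for _ in range(max_turns):
        response = run_step(messages)  # e.g. wraps client.agents.run(...)
        tool_calls = response.get("tool_calls")
        if not tool_calls:
            return response["content"]
        for call in tool_calls:
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"])
            result = handlers[name](**args)
            messages.append({"role": "tool", "name": name,
                             "content": json.dumps(result)})
    raise RuntimeError("agent did not finish within max_turns")
```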
7. Cost Optimization Strategies
1. Model Selection
- Use Mistral Small for simple tasks (e.g., classification, chatbots)
- Switch to Mistral Medium 2 for moderate reasoning (e.g., summarization)
- Reserve Mistral Large 2 for complex tasks (e.g., agents, RAG)
2. Batch Processing
jobs = client.batch.create(
    model="mistral-small-latest",
    input_file_ids=["file-123abc"],  # JSONL file with prompts
    endpoint="/v1/chat/completions",
)
Cost savings: up to 50% vs. the real-time API. See the Batch API docs for details.
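The batch input file is JSONL with one request per line. A builder sketch (the exact per-line schema, including the `custom_id` field, is an assumption of ours; confirm it against the Batch API docs):

```python
import json

def build_batch_file(prompts, path, model="mistral-small-latest"):
    """Write one chat-completion request per line to a JSONL batch file."""
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts):
            request = {
                "custom_id": f"req-{i}",  # lets you match outputs to inputs
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")
```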
3. Caching
Cache responses for repeated queries:
from functools import lru_cache

@lru_cache(maxsize=100)
def get_cached_response(prompt):
    return client.chat(
        model="mistral-small-latest",
        messages=[ChatMessage(role="user", content=prompt)],
    )
4. Token Optimization
- Use `max_tokens=100` for short responses
- Trim input context to the last 2K tokens for Mistral Small
- Enable `safe_prompt=True` to reduce harmful outputs (slightly increases cost)
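Context trimming (point 2 above) can be sketched with a crude chars-per-token heuristic; a real implementation should count with the model's tokenizer instead:

```python
def trim_messages(messages, max_tokens=2000, chars_per_token=4):
    """Keep the most recent messages within a rough token budget.

    Approximates tokens as len(text) / chars_per_token and always keeps
    at least the newest message so the request is never empty.
    """
    budget = max_tokens * chars_per_token  # budget in characters
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg["content"])
        if kept and used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```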
Comparison to Alternatives
| Feature | Mistral AI | OpenAI (GPT-4o) | Anthropic (Claude 3.5) |
|---|---|---|---|
| Best for | Cost-efficient EU deployments | Multi-modal (audio/video) | Long context (200K tokens) |
| Latency | <500ms (Small) | ~1s | ~1.5s |
| Fine-Tuning | Yes (API/self-hosted) | Yes (API) | Limited |
| Enterprise Support | SOC 2, GDPR, VPC | SOC 2, HIPAA | SOC 2, HIPAA |
| Pricing (1K tokens) | $0.003 (input) | $0.015 (input) | $0.008 (input) |
When to choose Mistral:
- You need EU data residency or cost efficiency
- Your use case involves agents or tool use
- You want to self-host or fine-tune models
What's Next?
- Build a prototype: Use Mistral Small to create a chatbot for your docs
- Experiment with agents: Try the Agents API for a travel planning or customer support use case
- Optimize costs: Audit your API usage in the console and switch models where possible
For teams scaling AI applications, Hyperion Consulting offers specialized tools and consulting to accelerate your Mistral AI deployment. Visit hyperion-consulting.io to learn more.
