## TL;DR

- Install with `pip install llm` or `brew install simonw/llm/llm` (macOS)
- Configure API keys via `llm keys set <provider>` (paste the key when prompted)
- Run prompts: `llm "Explain quantum computing"` or pipe data: `cat notes.txt | llm -s "Summarize"`
- Plugins extend support to Ollama, Claude, Gemini, and local models
- Conversations are stored in SQLite (`~/.llm/log.db`)
## 1. Installation
### Python (pip/pipx)

```bash
# Recommended: pipx (isolated environment)
pipx install llm

# Alternative: per-user pip install
pip install --user llm
```

Expected output:

```
Successfully installed llm-0.15.0
```
Gotchas:
- If the `llm` command isn't found after `pip install`, add `~/.local/bin` to your `PATH`:

  ```bash
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
  source ~/.bashrc
  ```

- Python 3.8+ is required. Verify with `python --version`.
### macOS (Homebrew)

```bash
brew install simonw/llm/llm
```
### Windows (PowerShell)

```powershell
pip install llm
```

- Ensure Microsoft Visual C++ Build Tools are installed if you encounter compilation errors.
### Docker

```bash
docker run -it --rm -v ~/.llm:/root/.llm simonw/llm:latest
```

- Persists data in `~/.llm` on the host.
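Whichever route you choose, a quick sanity check confirms the CLI is reachable on your `PATH` (the exact version output will vary):

```bash
llm --version   # e.g. "llm, version 0.15.0"
llm models      # lists the models llm can currently see
```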
## 2. Configuring API Keys
LLM supports multiple providers. Set keys via the CLI:
```bash
# OpenAI (built in)
llm keys set openai
# Paste your key when prompted

# Anthropic (Claude) -- requires the llm-anthropic plugin (see Section 4)
llm keys set anthropic

# Google Gemini -- requires the llm-gemini plugin (see Section 4)
llm keys set gemini
```
Verify keys:

```bash
llm keys list
```

Expected output (key names only; the stored values are not echoed back):

```
openai
anthropic
gemini
```
Gotchas:
- Keys are stored in `~/.llm/keys.json`. Secure this file with `chmod 600 ~/.llm/keys.json`.
- Rate limits apply. Check your provider's dashboard for usage.
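Keys can also come from environment variables, which is convenient in CI where you may not want a keys file on disk. `OPENAI_API_KEY` is the variable checked for OpenAI; other providers' plugins document their own variable names:

```bash
# Use an environment variable instead of keys.json
export OPENAI_API_KEY="sk-..."
llm "Hello"
```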
## 3. Running Prompts
### Basic Prompt

```bash
llm "Explain quantum computing in 3 bullet points"
```

Expected output:

```
- Quantum bits (qubits) can exist in multiple states simultaneously (superposition).
- Qubits can be entangled, meaning the state of one instantly influences another.
- Quantum computers solve certain problems exponentially faster than classical computers.
```
### Specify a Model

```bash
llm -m gpt-4o "Write a Python script to fetch stock prices"
```

Supported models include:
- `gpt-4o`, `gpt-4-turbo` (OpenAI)
- `claude-3-opus` (Anthropic)
- `gemini-1.5-pro` (Google)
- `llama3` (via the Ollama plugin)
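Most models also accept per-request options via `-o`; `temperature` is the most common, though the available option names vary by model (inspect them with `llm models --options`):

```bash
# Lower temperature for more deterministic output
llm -m gpt-4o -o temperature 0.2 "Write a Python script to fetch stock prices"
```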
### Local Models (GGUF)

```bash
# Install the Ollama plugin
llm install llm-ollama

# Pull a model (e.g., Llama 3)
ollama pull llama3

# Run locally
llm -m ollama/llama3 "Explain the Physical AI Stack"
```
Gotchas:
- Local models require significant RAM (e.g., 7B+ models need 8GB+).
- Quantized models (e.g., `llama3:8b-instruct-q4_K_M`) reduce memory usage but may impact quality.
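For multi-turn sessions with a local model, the interactive chat mode avoids paying the model-load cost on every prompt:

```bash
# Start an interactive session; type 'exit' or 'quit' to leave
llm chat -m ollama/llama3
```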
## 4. Plugin Ecosystem
### Install Plugins

```bash
# Ollama (local models)
llm install llm-ollama

# Mistral
llm install llm-mistral

# Anthropic (Claude)
llm install llm-anthropic

# Google Gemini
llm install llm-gemini
```
### List Available Models

```bash
llm models list
```

Expected output:

```
OpenAI:
  gpt-4o
  gpt-4-turbo
Anthropic:
  claude-3-opus
Ollama:
  ollama/llama3
  ollama/mistral
```
### Use a Plugin Model

```bash
llm -m ollama/llama3 "Generate a Dockerfile for a FastAPI app"
```
Gotchas:
- Plugins may require additional dependencies (e.g., `llm-ollama` needs Ollama itself installed and running).
- Check each plugin's README for setup details.
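Plugins are regular Python packages installed into llm's own environment, so they can be inspected and removed the same way:

```bash
# List installed plugins
llm plugins

# Remove a plugin you no longer need
llm uninstall llm-mistral -y
```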
## 5. Conversation History and Templates
### View Conversations

```bash
llm logs
```

Expected output:

```
ID  Model     Prompt                         Timestamp
1   gpt-4o    Explain quantum computing...   2026-05-12 10:00:00
2   claude-3  Write a Python script...       2026-05-12 10:05:00
```
### Resume a Conversation

```bash
# Continue the most recent conversation
llm -c "Follow up: How does superposition enable quantum parallelism?"

# Or target a specific conversation by ID (shown in llm logs)
llm --cid 1 "Follow up: How does superposition enable quantum parallelism?"
```
### Save a Template

Create `~/.llm/templates/summarize.yaml`:

```yaml
system: You are a concise summarizer. Extract key points in bullet form.
prompt: "Summarize the following text:\n\n$input"
```
Use the template:

```bash
cat notes.txt | llm -t summarize
```
Gotchas:
- Templates use Python `string.Template` syntax: piped input fills `$input`, and a literal dollar sign must be escaped as `$$`.
- Conversations are stored in `~/.llm/log.db` (SQLite). Back up this file for long-term storage.
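Because the log is plain SQLite, you can also query it directly. A minimal sketch, assuming the `~/.llm/log.db` path above and a `responses` table; schemas differ between llm versions, so inspect yours first with `.schema`:

```bash
# Last five prompts, newest first (table and column names are assumptions)
sqlite3 ~/.llm/log.db \
  "SELECT model, substr(prompt, 1, 60) FROM responses ORDER BY rowid DESC LIMIT 5;"
```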
## 6. Piping Data and Shell Integration
### Pipe Text to LLM

```bash
cat README.md | llm -s "Summarize this project in 1 paragraph"
```

### Pipe Command Output

```bash
git diff | llm -s "Explain these changes in simple terms"
```

### Save Output to File

```bash
llm "Generate a Kubernetes deployment YAML for a FastAPI app" > deployment.yaml
```

### Use with jq (JSON Processing)

```bash
curl https://api.github.com/repos/simonw/llm | jq '.description' | llm -s "Explain this project"
```
Gotchas:
- Piped input is treated as a single prompt. For multi-turn conversations, use `llm -c` (see Section 5).
- Large inputs may hit token limits. Truncate with `head -n 100` or similar.
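Recurring pipelines like the `git diff` example are worth wrapping in a shell function. A sketch for your `~/.bashrc` (the name `aicommit` is made up for this example):

```bash
# Draft a commit message from staged changes; review it before committing
aicommit() {
  git diff --staged | llm -s "Write a one-line conventional commit message for this diff"
}
```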
## 7. Building Custom Workflows
### Python API

```python
import llm

# Uses the key configured via `llm keys set openai`
model = llm.get_model("gpt-4o")
response = model.prompt("Explain the Hyperion Lifecycle in 3 steps")
print(response.text())
```
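The Python API supports multi-turn conversations too, mirroring `llm -c` on the CLI, via `model.conversation()`:

```python
import llm

model = llm.get_model("gpt-4o")

# A conversation object carries context across successive prompts
conversation = model.conversation()
print(conversation.prompt("Explain superposition in one sentence").text())
print(conversation.prompt("Now relate that to quantum parallelism").text())
```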
### Embeddings

```bash
# Generate an embedding (OpenAI embedding models are built in)
llm embed -m text-embedding-3-small -c "The Physical AI Stack" > embedding.json

# Store embeddings in a named collection, then search it
llm embed layers sense -m text-embedding-3-small -c "SENSE layer"
llm embed layers connect -m text-embedding-3-small -c "CONNECT layer"
llm similar layers -c "sensor input processing"
```
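The same embedding models are reachable from Python via `llm.get_embedding_model()`, which makes one-off similarity checks easy (cosine similarity computed by hand here to avoid extra dependencies):

```python
import math
import llm

model = llm.get_embedding_model("text-embedding-3-small")
a = model.embed("SENSE layer")    # returns a list of floats
b = model.embed("CONNECT layer")

# Cosine similarity between the two vectors
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
print(f"cosine similarity: {dot / norm:.3f}")
```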
### Custom Model Aliases

```bash
llm aliases set fast gpt-4o-mini
llm aliases set local ollama/llama3
```
Usage:

```bash
llm -m fast "Write a Python one-liner to parse CSV"
```
### Scheduled Prompts (Cron)

```bash
# Edit crontab
crontab -e
```

Add:

```
0 9 * * * /usr/local/bin/llm "Summarize top Hacker News stories" >> ~/hn-summary.txt
```
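Note that a bare prompt like the one above has no web access, so the model will answer from training data alone. To summarize real data on a schedule, pipe it in; this sketch assumes a `fetch-hn.sh` script of your own that prints the stories, and uses absolute paths because cron runs with a minimal `PATH`:

```
0 9 * * * /home/me/bin/fetch-hn.sh | /usr/local/bin/llm -s "Summarize these stories" >> /home/me/hn-summary.txt
```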
Gotchas:
- For the Python API, ensure `llm` is installed in your virtual environment.
- Embedding support for OpenAI models is built in; local embedding models need a plugin such as `llm-sentence-transformers`, which pulls in extra dependencies.
## Alternatives at a Glance
| Tool | Best For | Limitations |
|---|---|---|
| LLM | Lightweight CLI, local/API models | No GUI, limited analytics |
| LangChain | Enterprise workflows, RAG | Steeper learning curve |
| Ollama | Local models (GGUF) | Fewer API integrations |
## What's Next?
- Explore plugins: run `llm install llm-gpt4all` for offline models or `llm install llm-mistral` for Mistral support.
- Automate workflows: pipe command output to LLM (e.g., `git log | llm -s "Summarize recent changes"`).
- Build a custom template for repetitive tasks such as code reviews or meeting notes (see the sketch below).
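As an example of that last point, here is a minimal code-review template; the file name and prompt wording are just illustrative:

```yaml
# ~/.llm/templates/code-review.yaml
system: You are a careful code reviewer. Point out bugs, risky patterns, and missing tests.
prompt: "Review this diff:\n\n$input"
```

Then run `git diff | llm -t code-review`.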
For advanced AI tooling and workflows, Hyperion Consulting offers enterprise-grade solutions to accelerate your AI adoption.
