The tools I run in production: sovereignty-first by default, not the tools I resell
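Every technology listed here has been deployed in a production system. My model selection is sovereignty-first: for European clients, I prefer EU-hosted, European models. The right tool depends on your use case, your data, and your budget. No reseller commissions, no vendor bias.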
Industry-recognized credentials that demonstrate professional expertise
Scrum.org
Product ownership and value maximization in Scrum
Issued 2019
Scrum Alliance
Agile facilitation and Scrum framework mastery
Issued 2018
Scaled Agile
Scaled Agile Framework for enterprise transformation
Issued 2021
Product School
Building and managing AI-powered products
Issued 2023
DeepLearning.AI
Neural networks, CNNs, RNNs, and transformers
Issued 2022
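No reseller commissions or vendor bias influence my recommendations. I prefer EU-hosted and European models, and I choose the model that fits your latency, cost, and accuracy requirements.
Every tool here has been deployed in a system serving real traffic. Tools tested only in the lab do not make this list.
Most AI projects overspend on infrastructure by 3-5x. I right-size from the start: smaller models, smarter caching, more efficient inference.
AI models change every quarter. My architecture abstracts the model layer so you can switch providers without rewriting your application.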
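One way to picture a model-layer abstraction combined with caching is the minimal Python sketch below. It is an illustration under stated assumptions, not the architecture actually used in these projects: `ChatModel`, `EchoModel`, `CachedModel`, and `summarise` are hypothetical names, and `EchoModel` stands in for a real provider adapter so the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface the application codes against."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in provider; a real adapter would call a vendor SDK here."""

    name: str
    calls: int = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return f"[{self.name}] {prompt}"


@dataclass
class CachedModel:
    """Wraps any ChatModel and reuses answers for identical prompts."""

    inner: ChatModel
    _cache: dict[str, str] = field(default_factory=dict)

    def complete(self, prompt: str) -> str:
        if prompt not in self._cache:
            self._cache[prompt] = self.inner.complete(prompt)
        return self._cache[prompt]


def summarise(model: ChatModel, text: str) -> str:
    """Application code: depends only on the ChatModel interface."""
    return model.complete(f"Summarise: {text}")


provider_a = EchoModel("provider-a")
provider_b = EchoModel("provider-b")

cached = CachedModel(provider_a)
first = summarise(cached, "quarterly report")
second = summarise(cached, "quarterly report")  # served from the cache
swapped = summarise(CachedModel(provider_b), "quarterly report")
```

Because `summarise` only knows about `ChatModel`, switching from `provider_a` to `provider_b` touches a single constructor call, and the cache wrapper applies to either provider unchanged.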
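Book a 30-minute call. I will assess your requirements and recommend the right combination of models, infrastructure, and frameworks, with a cost estimate.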
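As a rough illustration of how a recommendation with a cost estimate could be reasoned about, the sketch below filters candidate models by latency and accuracy requirements and then picks the cheapest one. Every name and number in it is a placeholder assumption: the `Candidate` entries are not real models, and the prices, latencies, and accuracy scores are invented for the arithmetic only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One model option. Every number is a placeholder, not a quote."""

    name: str
    usd_per_mtok_in: float   # hypothetical price per million input tokens
    usd_per_mtok_out: float  # hypothetical price per million output tokens
    p95_latency_ms: int      # hypothetical measured latency
    eval_accuracy: float     # hypothetical score on your own eval set


def monthly_cost(c: Candidate, monthly_requests: int,
                 tokens_in: int, tokens_out: int) -> float:
    """Token spend per month in USD for a fixed request shape."""
    per_request = tokens_in * c.usd_per_mtok_in + tokens_out * c.usd_per_mtok_out
    return monthly_requests * per_request / 1_000_000


def cheapest_fit(candidates: Sequence[Candidate], max_latency_ms: int,
                 min_accuracy: float, monthly_requests: int,
                 tokens_in: int, tokens_out: int) -> Optional[Candidate]:
    """Cheapest candidate meeting the latency and accuracy constraints."""
    eligible = [
        c for c in candidates
        if c.p95_latency_ms <= max_latency_ms and c.eval_accuracy >= min_accuracy
    ]
    if not eligible:
        return None
    return min(eligible,
               key=lambda c: monthly_cost(c, monthly_requests, tokens_in, tokens_out))


CANDIDATES = [
    Candidate("small", 0.5, 1.5, 300, 0.82),
    Candidate("medium", 2.0, 6.0, 600, 0.90),
    Candidate("large", 10.0, 30.0, 1500, 0.95),
]

choice = cheapest_fit(CANDIDATES, max_latency_ms=1000, min_accuracy=0.88,
                      monthly_requests=100_000, tokens_in=1000, tokens_out=200)
estimate = monthly_cost(choice, 100_000, 1000, 200) if choice else None
```

With these placeholder figures, the small model misses the accuracy bar and the large one misses the latency bar, so the middle option is selected. A real estimate would also include hosting, caching, and engineering costs, which this sketch leaves out.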
Every tool we evaluate, deploy, or recommend, with an honest assessment.
Anthropic
Most capable Claude model — complex reasoning, long-context analysis, agentic tasks.
Official docs →
Anthropic
Best balance of intelligence and speed for production workloads.
Official docs →
Anthropic
AI-native CLI for agentic software engineering — reads, writes, and runs code autonomously.
Official docs →
Anthropic
Open protocol connecting AI assistants to external tools, data sources, and services.
Official docs →
Anthropic
Build, orchestrate, and deploy multi-agent systems powered by Claude.
Official docs →
Mistral AI
Top-tier reasoning model with 128K context — Mistral's flagship for enterprise tasks.
Official docs →
Mistral AI
Cost-efficient multimodal model — text and image understanding.
Official docs →
Mistral AI
Apache 2.0 multilingual model — EU-sovereign deployments, 128K context.
Official docs →
Mistral AI
Code generation specialist — 80+ languages, fill-in-the-middle, 32K context.
Official docs →
Mistral AI
Frontier vision-language model — document analysis, chart reading, 128K context.
Official docs →
Mistral AI
Train and own frontier AI model weights outright — no API rental, full data sovereignty.
Official docs →
Mistral AI
Enterprise AI assistant — SSO, audit logs, EU data residency, web search, document upload.
Official docs →
Meta
Meta's flagship open-weight model: Llama community license, matches GPT-4 on many benchmarks at a fraction of the cost.
Official docs →
Meta
Lightweight Llama models for mobile, edge, and on-device inference.
Official docs →
Meta
Vision-language Llama models — image understanding, document analysis.
Official docs →
Google
Google's open-weight family: Apache 2.0, strong reasoning, multilingual, edge-to-server range.
Official docs →
Microsoft
MIT-licensed reasoning specialist — outperforms models 3× larger on math and coding.
Official docs →
Microsoft
Edge-optimised reasoning model — 3.8B parameters, strong instruction following on constrained hardware.
Official docs →
Alibaba
Alibaba's Apache 2.0 multilingual family — exceptional Chinese/English, strong math, full size range.
Official docs →
Alibaba
State-of-the-art open-source code generation — rivals GPT-4o on coding benchmarks.
Official docs →
DeepSeek
MIT-licensed reasoning specialist with chain-of-thought — matches o1 on math and science tasks.
Official docs →
DeepSeek
671B MoE open-weight general model — top open-source benchmark scores across all categories.
Official docs →
TII UAE
TII's Apache 2.0 family — strong multilingual performance, designed for EU/MENA sovereign deployments.
Official docs →
Hugging Face
Ultra-compact models for on-device and browser inference — Apache 2.0, efficiency benchmark.
Official docs →
Ollama
One-command local model serving — runs Llama, Mistral, Gemma and 100+ models on any hardware.
Official docs →
vLLM Project
High-throughput production LLM serving — PagedAttention, continuous batching, OpenAI-compatible.
Official docs →
Hugging Face
Hugging Face's production inference server — tensor parallelism, quantization, streaming.
Official docs →
ggerganov
CPU/GPU inference in C++ — GGUF format, runs on Apple Silicon, NVIDIA, AMD, CPU-only.
Official docs →
LM Studio
Desktop GUI for discovering, downloading, and running local LLMs — OpenAI-compatible server.
Official docs →
Hugging Face
Run Transformers in the browser and Node.js — ONNX-based, no server required.
Official docs →
Microsoft
Cross-platform optimised inference — CPU, GPU, mobile, browser, WASM support.
Official docs →
BerriAI
Universal LLM API proxy — call 100+ models with OpenAI format, load balancing, fallbacks.
Official docs →
Unsloth AI
2× faster fine-tuning, 70% less VRAM — LoRA and QLoRA for Llama, Mistral, Qwen, Gemma.
Official docs →
OpenAccess AI Collective
Production fine-tuning framework — YAML config, LoRA/QLoRA/full, multi-GPU, Flash Attention.
Official docs →
hiyouga
Fine-tune 100+ LLMs with a web UI or CLI — SFT, DPO, GRPO, LoRA, QLoRA.
Official docs →
PyTorch
PyTorch-native fine-tuning library — recipe-based, minimal dependencies, full control.
Official docs →
Hugging Face
Parameter-Efficient Fine-Tuning — LoRA, QLoRA, IA³, AdaLoRA, Prefix Tuning.
Official docs →
Hugging Face
Transformer Reinforcement Learning — SFT, DPO, GRPO, PPO, ORPO for alignment training.
Official docs →
Microsoft
ZeRO optimizer for large model training — 10× throughput, trillion-parameter scale.
Official docs →
Hugging Face
One-line multi-GPU and TPU training — no code changes, FSDP and DeepSpeed integration.
Official docs →
NVIDIA
NVIDIA's large-scale pre-training framework — tensor/pipeline/sequence parallelism.
Official docs →
Hugging Face
900K+ models, 100K+ datasets, and Spaces — the de facto standard for AI artifact sharing.
Official docs →
Hugging Face
Core model library — load, run, and fine-tune any model in PyTorch, TensorFlow, or JAX.
Official docs →
Hugging Face
100K+ datasets with streaming, arrow-based loading, and one-line preprocessing.
Official docs →
Hugging Face
Managed dedicated or serverless model deployment — auto-scaling, private endpoints.
Official docs →
Hugging Face
No-code fine-tuning for LLMs and other models — SFT, DPO, classification, NER.
Official docs →
Hugging Face
Host Gradio and Streamlit ML demos — free tier available, GPU-enabled options.
Official docs →
Hugging Face
Standardised metrics library — BLEU, ROUGE, accuracy, F1, and 100+ custom metrics.
Official docs →
Hugging Face
Programmatic Hub access — upload models, create repos, manage tokens, search.
Official docs →
LlamaIndex
Data framework for LLM apps — ingestion, indexing, querying over any data source.
Official docs →
deepset
Production NLP pipeline framework — RAG, document search, question answering.
Official docs →
Stanford NLP
Declarative LLM programming — optimise prompts and weights automatically.
Official docs →
Jason Liu
Structured output extraction — Pydantic schemas from any LLM, with validation and retries.
Official docs →
Microsoft
Enterprise LLM orchestration for .NET, Python, Java — plugins, planners, memory.
Official docs →
CrewAI
Role-based multi-agent orchestration — agents collaborate with defined roles and goals.
Official docs →
Microsoft
Microsoft's multi-agent conversation framework — async agents, human-in-the-loop.
Official docs →
Hugging Face
Minimal agentic framework — code-first agents that write and execute Python, 1000-line core.
Official docs →
Qdrant
Rust-based vector search — on-prem friendly, filterable, sparse+dense hybrid search.
Official docs →
Weaviate
GraphQL API vector database — multi-tenancy, hybrid search, generative search.
Official docs →
Chroma
Local-first open-source vector database — Python-native, zero infrastructure required.
Official docs →
Zilliz
Distributed vector search for billion-scale data — HNSW, IVF, GPU acceleration.
Official docs →
PostgreSQL
Vector similarity search extension for PostgreSQL — no separate infrastructure needed.
Official docs →
Pinecone
Managed cloud vector database — serverless tier, namespaces, metadata filtering.
Official docs →
Pollen Robotics
Open-source humanoid robot for research and industry — Apache 2.0, ROS2, Python SDK.
Official docs →
Open Robotics
Robot Operating System 2 — real-time communication, sensor fusion, navigation stack.
Official docs →
Hugging Face
Open-source robot learning — imitation learning, reinforcement learning, pre-trained policies.
Official docs →
NVIDIA
Robot simulation and deployment platform — synthetic data generation, physics simulation.
Official docs →
OpenCV
Computer vision library — 2500+ algorithms, real-time image processing, widely deployed.
Official docs →
Meta
Segment Anything Model 2 — real-time video and image segmentation, zero-shot.
Official docs →
Ultralytics
Real-time object detection — fastest production-grade detector, ONNX/CoreML export.
Official docs →
Amazon
Managed foundation model APIs on AWS — Claude, Llama, Mistral, Titan, Stable Diffusion.
Official docs →
Microsoft
Microsoft's enterprise AI platform — model catalog, fine-tuning, responsible AI tools.
Official docs →
Google
GCP's unified AI/ML platform: Gemini, model garden, AutoML, feature store.
Official docs →
Cloudflare
Run AI models at the edge globally — Workers AI, 100+ models, serverless inference.
Official docs →
LangChain
LLM observability and tracing — log runs, compare prompts, regression testing.
Official docs →
Weights & Biases
ML experiment tracking, visualisation, and hyperparameter sweeps — industry standard.
Official docs →
Databricks
ML lifecycle management — experiment tracking, model registry, deployment.
Official docs →
CNCF / Grafana Labs
Inference metrics collection and dashboards — latency, throughput, error rates.
Official docs →
Arize AI
LLM evaluation and monitoring — hallucination detection, embeddings visualisation, drift.
Official docs →
Exploding Gradients
RAG evaluation framework — faithfulness, answer relevancy, context precision metrics.
Official docs →