🎯 Quick Reference Guide¶
Find the right AI tools for your use case quickly.
🔍 Find Tools by Use Case¶
"I want to build a chatbot"¶
- Choose a model: Qwen 2.5 or Llama 4
- Add memory: mem0
- Deploy: Ollama (local) or vLLM (production)
- Monitor: Langfuse
"I want to build a RAG application"¶
- Vector DB: Qdrant or Chroma
- Framework: LlamaIndex
- Embeddings: Use Transformers from HuggingFace
- Monitor: Helicone
"I want an AI coding assistant"¶
- IDE Extension: Continue or Cline
- CLI Tool: Aider or Plandex
- Skills: Browse SkillsMP.com for 66,500+ skills
"I need to fine-tune a model"¶
- Framework: Unsloth (2-5x faster)
- Dataset Tool: Argilla
- Training: PyTorch Lightning
- Track: Weights & Biases
"I want to build AI agents"¶
- Framework: LangChain or AutoGen
- Tools: MCP Servers
- Memory: mem0
- Monitor: LangSmith
"I need to evaluate my LLM"¶
- Framework: lm-evaluation-harness
- LLM-specific: DeepEval
- Monitor: Langfuse
💻 Find Tools by Hardware¶
"I have a laptop (8-16GB RAM)"¶
- Model: Llama 3.2 8B, Gemma 4 (quantized)
- Serving: Ollama
- Vector DB: Chroma (embedded mode)
"I have a gaming PC (RTX 4090, 24GB VRAM)"¶
- Model: Gemma 4 26B, Qwen 2.5 32B
- Serving: vLLM or Ollama
- Vector DB: Qdrant
"I have cloud/server access"¶
- Model: Any (DeepSeek V3, Qwen 2.5 235B)
- Serving: vLLM, TGI
- Vector DB: Milvus, Weaviate
🎯 Find Tools by Experience Level¶
Beginner¶
Just starting? Go here: 1. Install Ollama - Easiest way to run models 2. Try Continue for coding help 3. Use Chroma for vector search
Intermediate¶
Have some experience? Try: 1. LangChain for building agents 2. Qdrant for better performance 3. Langfuse for monitoring
Advanced¶
Production-ready stack: 1. vLLM for serving 2. Milvus for scale 3. Ray for distributed computing 4. Langfuse + OpenLIT
🏢 Find Tools by Company Stage¶
Solo Developer / Prototype¶
- Model: Ollama + Gemma 4
- Framework: LangChain
- Vector DB: Chroma
- Monitor: Helicone (free tier)
Startup (< 50 users)¶
- Model: Cloud API or self-hosted Qwen 2.5
- Framework: LangChain or LlamaIndex
- Vector DB: Qdrant
- Monitor: Langfuse
- Deploy: BentoML
Scale-up (50-10K users)¶
- Model: vLLM serving Qwen 2.5 or DeepSeek V3
- Vector DB: Qdrant or Weaviate
- Monitor: Langfuse + LangSmith
- Deploy: Kubernetes + vLLM
- Workflow: Prefect
Enterprise (10K+ users)¶
- Model: Multi-region vLLM deployment
- Vector DB: Milvus (distributed)
- Monitor: Full observability stack
- Deploy: Ray + Kubeflow
- Workflow: Prefect or Airflow
📊 Decision Matrix¶
Vector Database Choice¶
| If you need... | Choose... |
|---|---|
| Fastest setup | Chroma |
| Best filtering | Qdrant |
| Massive scale (billions of vectors) | Milvus |
| Hybrid search | Weaviate |
| SQL integration | pgvector |
LLM Observability¶
| If you need... | Choose... |
|---|---|
| Full-featured, free tier | Langfuse |
| Fastest setup (15 min) | Helicone |
| Deep LangChain integration | LangSmith |
| OpenTelemetry standard | OpenLIT |
Deployment Platform¶
| If you need... | Choose... |
|---|---|
| Highest throughput | vLLM |
| Easiest local setup | Ollama |
| Production-grade SDK | BentoML |
| HuggingFace ecosystem | Text Generation Inference |
🚀 Quick Start Paths¶
Path 1: Build a ChatGPT Clone (Weekend Project)¶
Day 1:
- Install Ollama
- Pull Qwen 2.5 model
- Set up basic chat interface
Day 2:
- Add Langfuse for monitoring
- Deploy with BentoML
Path 2: Build RAG Application (1 Week)¶
Week Plan:
Day 1-2: Set up Chroma + embed documents
Day 3-4: Integrate LlamaIndex
Day 5: Add chat interface
Day 6: Set up monitoring (Langfuse)
Day 7: Deploy and test
Path 3: Production AI System (1 Month)¶
Week 1: Infrastructure (vLLM + Qdrant)
Week 2: Application logic (LangChain)
Week 3: Monitoring & evaluation
Week 4: Testing & deployment (Kubernetes)
🔗 External Resources¶
- SkillsMP.com - 66,500+ AI agent skills
- MCP.so - 20,100+ MCP servers
- Hugging Face - 230K+ datasets & models