Phi-2
A guide to Microsoft's compact Phi-2 language model for low-VRAM DCP workloads.
1. What it is
Phi-2 (`microsoft/phi-2`) is a 2.7B-parameter causal language model from Microsoft.
2. What it does
It is designed for compact, efficient text generation and is useful for lightweight reasoning and coding-style prompts.
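The Phi-2 model card suggests an "Instruct:/Output:" QA-style prompt pattern for instruction-following use. A minimal helper, assuming that pattern (the function name is illustrative, not part of any library):

```python
def format_phi2_prompt(instruction: str) -> str:
    """Wrap a user instruction in Phi-2's QA-style prompt pattern
    ("Instruct: ...\nOutput:"), as described on the model card."""
    return f"Instruct: {instruction}\nOutput:"

prompt = format_phi2_prompt("Explain what a causal language model is.")
```

The formatted string is what you would pass to the model (or to a DCP inference job) as the generation prompt.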
3. How it compares
- Versus TinyLlama: same lightweight deployment class; test both against your own prompts, since quality is prompt-specific.
- Versus Mistral 7B and Llama 3 8B: significantly lower cost and VRAM, but a lower ceiling on complex tasks.
4. Best for on DCP
- Internal assistants under strict cost budgets
- Lightweight text generation and classification
- Dev/test staging workloads
5. Hardware requirements on DCP
- Runtime floor from DCP benchmarks/routes: `>=6 GB`
- Recommended providers: 8 GB+ GPUs for stable batching
- Template: `llm-inference` (`params.model` allowlist includes `microsoft/phi-2`)
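The `>=6 GB` floor is consistent with a back-of-the-envelope weight-memory estimate: 2.7B parameters at 2 bytes each (fp16/bf16) is roughly 5 GiB before KV cache, activations, and framework overhead. A quick sketch of that arithmetic:

```python
def estimated_weight_vram_gib(params_billion: float, bytes_per_param: int) -> float:
    """Rough GiB needed just to hold the model weights in memory.
    Excludes KV cache, activations, and runtime overhead, so real
    usage is higher -- hence the >=6 GB floor for Phi-2."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

phi2_fp16 = estimated_weight_vram_gib(2.7, 2)  # roughly 5 GiB for weights alone
```

This also shows why 8 GB+ GPUs are recommended for stable batching: larger batches and longer contexts grow the KV cache beyond the weight footprint.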
6. How to run on DCP
- Submit with `job_type: "llm-inference"`.
- Use `params.model: "microsoft/phi-2"`.
- Keep token budget and context modest for best latency on smaller GPUs.
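The steps above can be sketched as a job payload builder. The `job_type` and `params.model` values come from the DCP template described in this guide; the other field names (`prompt`, `max_tokens`) are illustrative assumptions, not confirmed DCP parameters:

```python
def build_phi2_job(prompt: str, max_tokens: int = 256) -> dict:
    """Build a DCP llm-inference job payload for Phi-2.
    job_type and params.model follow the template allowlist;
    prompt/max_tokens field names are assumptions for illustration."""
    return {
        "job_type": "llm-inference",
        "params": {
            "model": "microsoft/phi-2",
            "prompt": prompt,
            # Keep the token budget modest for better latency on smaller GPUs.
            "max_tokens": max_tokens,
        },
    }

job = build_phi2_job("Summarize this ticket in one sentence.")
```

The resulting dict would be serialized to JSON and submitted through whatever job-submission endpoint your DCP deployment exposes.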
7. Licensing and commercial-use notes
Phi-2 is published on Hugging Face with MIT license metadata; confirm deployment and compliance requirements against the model card and your organization's policy.
Sources:
- https://huggingface.co/microsoft/phi-2
- /home/node/dc1-platform/backend/src/routes/jobs.js
- /home/node/dc1-platform/backend/src/db.js