Models

25 models found

Model	Description	Type	Released	Input (per 1M tokens)	Output (per 1M tokens)	Context (tokens)
DeepSeek: DeepSeek V4 Flash	DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts (MoE) model with 284B total parameters and 13B activated per token, designed for fast inference and high-throughput workloads. It supports a 1M-token context window, enabling large-scale reasoning and long-context processing. Built with hybrid attention for efficiency, the model maintains strong performance in reasoning and coding while offering configurable reasoning modes. It is well suited for coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are critical.	Chat	Apr 23, 2026	$0.14	$0.28	1.0M
DeepSeek: DeepSeek V4 Pro	DeepSeek V4 Pro is a large-scale Mixture-of-Experts (MoE) model with 1.6T total parameters and 49B activated per token, supporting a 1M-token context window for advanced reasoning and long-horizon workflows. It delivers strong performance across knowledge, mathematics, and software engineering tasks, making it suitable for complex, real-world applications. Built on a hybrid attention architecture for efficient long-context processing, the model supports configurable reasoning modes to balance speed and depth. It is well suited for full codebase analysis, multi-step automation, and large-scale information synthesis, where both capability and efficiency are essential. https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro	Chat	Apr 23, 2026	$0.435	$0.87	1.0M
DeepSeek: DeepSeek V3.2 (Thinking)	DeepSeek-V3.2 is an efficiency-focused large model that combines strong reasoning with reliable tool use. It introduces DeepSeek Sparse Attention to lower compute costs for long contexts while preserving quality, and uses large-scale reinforcement learning to reach GPT-5-class reasoning (including top IMO/IOI results). An agentic task-synthesis pipeline improves how it reasons with tools in interactive settings, and developers can toggle reasoning on or off as needed.	Chat	Dec 1, 2025	$0.19	$0.275	164K
DeepSeek: DeepSeek V3.2 Speciale	DeepSeek-V3.2-Speciale is a high-compute edition of V3.2 built for top-tier reasoning and agent performance. Using DeepSeek Sparse Attention and extensive reinforcement learning, it surpasses GPT-5 on tough reasoning benchmarks and approaches Gemini 3 Pro–level capability, while remaining strong at coding and tool use. It also draws on a large agent-training pipeline to boost reliability and generalization in interactive environments.	Chat	Nov 30, 2025	$0.28	$0.4	164K
DeepSeek: DeepSeek V3.2	DeepSeek-V3.2 is an efficiency-focused large model that pairs strong reasoning with reliable tool use. It introduces DeepSeek Sparse Attention to cut long-context compute costs, and uses large-scale reinforcement learning to reach GPT-5-class performance (including top IMO/IOI results). An agentic task-synthesis pipeline further improves how it reasons with tools, and developers can toggle its reasoning mode on or off as needed.	Chat	Nov 30, 2025	$0.15	$0.24	164K
DeepSeek: DeepSeek V3.2 Exp	DeepSeek-V3.2-Exp is an experimental model released between V3.1 and future DeepSeek architectures. It introduces DeepSeek Sparse Attention to improve efficiency on long-context tasks while preserving quality, and allows developers to toggle reasoning on or off. Trained for direct comparison with V3.1, it performs similarly across reasoning, coding, and tool use, which makes it mainly a research model for testing efficient transformer designs rather than pushing raw accuracy.	Chat	Sep 28, 2025	$0.15	$0.24	164K
DeepSeek: DeepSeek V3.1	DeepSeek-V3.1 is a hybrid reasoning model (671B total / 37B active) that supports switchable thinking and non-thinking modes. It extends DeepSeek-V3 with long-context training up to 128K tokens and efficient FP8 inference. It delivers faster performance while matching DeepSeek-R1 on tough reasoning and coding tasks, and supports structured tool use and agent workflows.	Chat	Aug 20, 2025	$0.2	$0.8	164K
DeepSeek: Deepseek R1 Distill Qwen 7B	Deepseek R1 Distill Qwen 7B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.1	$0.2	131K
DeepSeek: DeepSeek R1 0528	DeepSeek R1 0528 by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.5	$2.18	164K
DeepSeek: DeepSeek Prover V2	DeepSeek Prover V2 by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.7	$2	131K
DeepSeek: DeepSeek V3 Base	DeepSeek V3 Base by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0	$0	164K
DeepSeek: DeepSeek V3 Search	DeepSeek V3 Search by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$2.80	$11.20	164K
DeepSeek: DeepSeek V3 0324	DeepSeek V3 0324 by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.3	$0.88	164K
DeepSeek: DeepSeek R1 Searching	DeepSeek R1 Searching by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$5.60	$22.40	164K
DeepSeek: DeepSeek R1 Zero	DeepSeek R1 Zero by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0	$0	164K
DeepSeek: DeepSeek R1 Distill Llama 8B	DeepSeek R1 Distill Llama 8B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.04	$0.04	32K
DeepSeek: Deepseek R1 Distill Qwen 1.5B	Deepseek R1 Distill Qwen 1.5B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.18	$0.18	131K
DeepSeek: DeepSeek R1 Distill Qwen 32B (Free)	DeepSeek R1 Distill Qwen 32B (Free) by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	Free (included)	Free (included)	16K
DeepSeek: DeepSeek R1 Distill Qwen 32B	DeepSeek R1 Distill Qwen 32B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.12	$0.18	131K
DeepSeek: DeepSeek R1 Distill Qwen 14B (Free)	DeepSeek R1 Distill Qwen 14B (Free) by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	Free (included)	Free (included)	64K
DeepSeek: DeepSeek R1 Distill Qwen 14B	DeepSeek R1 Distill Qwen 14B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$1.60	$1.60	64K
DeepSeek: DeepSeek R1 Distill Llama 70B	DeepSeek R1 Distill Llama 70B by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.23	$0.69	131K
DeepSeek: DeepSeek R1	DeepSeek R1 by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.5	$2.18	164K
DeepSeek: DeepSeek V3 (Free)	DeepSeek V3 (Free) by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	Free (included)	Free (included)	164K
DeepSeek: DeepSeek V3	DeepSeek V3 by DeepSeek. Use it from Apertis SDKs, provider-compatible SDKs, or direct HTTP requests.	Chat	not listed	$0.38	$0.89	164K
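
Because prices in the table above are quoted per 1M tokens, a rough per-request cost can be estimated by scaling the input and output token counts separately. The sketch below is illustrative only: the function name `estimate_cost`, the `PRICES` mapping, and the example request size are hypothetical, and it assumes billing is strictly linear in token count with no minimums, caching discounts, or rounding rules, none of which the listing states.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from three rows of the listing.
# Only a sample of models is included; add others as needed.
PRICES = {
    "DeepSeek V4 Flash": (Decimal("0.14"), Decimal("0.28")),
    "DeepSeek V4 Pro": (Decimal("0.435"), Decimal("0.87")),
    "DeepSeek V3.2": (Decimal("0.15"), Decimal("0.24")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 4,000 output tokens.
    for name in PRICES:
        cost = estimate_cost(name, 200_000, 4_000)
        print(f"{name}: ${cost:.5f}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding noise. Actual charges may differ from this estimate, so the provider's billing documentation remains the reference.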