1. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Scaling open-source language models with a focus on longtermism.
2. DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Exploring expert specialization in Mixture-of-Experts language models.
3. DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Investigating the intersection of large language models and programming.
4. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Advancing mathematical reasoning capabilities in open language models.
5. DeepSeek-VL: Towards Real-World Vision-Language Understanding
Focusing on real-world vision-language understanding.
6. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Developing a strong, economical, and efficient Mixture-of-Experts language model.
7. DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Using large-scale synthetic data to advance theorem proving in LLMs.
8. DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Aiming to surpass closed-source models in code intelligence.
9. Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
Fine-tuning sparse-architecture large language models with expert specialization.
10. DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Using proof assistant feedback for reinforcement learning and Monte-Carlo tree search.
11. Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Decoupling visual encoding for unified multimodal understanding and generation.
12. JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
Harmonizing autoregression and rectified flow for unified multimodal understanding and generation.
13. DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Applying Mixture-of-Experts vision-language models to advanced multimodal understanding.
15. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Incentivizing reasoning capability in LLMs via reinforcement learning.
16. Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Scaling data and model size for unified multimodal understanding and generation.
17. Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Introducing hardware-aligned, natively trainable sparse attention.