silence09/DeepSeek-R1-Small-2layers


silence09 committed 28 days ago

Commit 866c7b0 · verified · 1 Parent(s): 41566da

Update README.md

Files changed (1)

README.md +1 -1


@@ -8,7 +8,7 @@ base_model:
 This project is created using the official **Deepseek R1** model script (`modeling_deepseek.py`) from [Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/modeling_deepseek.py). It implements a **2-layer version** of Deepseek R1 with randomly initialized weights and smaller dimensions.
 ## Purpose
-The purpose of these weights is to provide a lightweight implementation for researchers who want to study the model architecture and run experiments quickly.
+The purpose of these weights is to provide a lightweight implementation for researchers who want to study the model architecture and run local quickly.
 The original **Deepseek R1 model** requires an **8x H200 GPU setup** and runs on the **vLLM/SGLang framework**, making it difficult to deploy on standard hardware.