avemio-digital/German-RAG-MOBIUS-R1-LLAMA-DISTILL-MERGE

avemio-digital committed 8 days ago

Commit 7eae51b (verified) · 1 parent: d95c4ab

Update README.md

Files changed (1)

README.md +5 -3

@@ -2,7 +2,9 @@
 base_model:
 - deepseek-ai/DeepSeek-R1-Distill-Llama-8B
 - mobiuslabsgmbh/DeepSeek-R1-ReDistill-Llama3-8B-v1.1
-- avemio-digital/GRAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE
+- avemio-digital/German-RAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE
+datasets:
+- avemio/German-RAG-HARD-REASONING-DE-THINKING
 library_name: transformers
 tags:
 - mergekit
@@ -22,7 +24,7 @@ This model was merged using the [DARE TIES](https://arxiv.org/abs/2311.03099) me
 The following models were included in the merge:
 * [mobiuslabsgmbh/DeepSeek-R1-ReDistill-Llama3-8B-v1.1](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Llama3-8B-v1.1)
-* [avemio-digital/GRAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE](https://huggingface.co/avemio-digital/GRAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE)
+* [avemio-digital/German-RAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE](https://huggingface.co/avemio-digital/German-RAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE)
 ### Configuration
@@ -32,7 +34,7 @@ The following YAML configuration was used to produce this model:
 models:
   - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
     #no parameters necessary for base model
-  - model: avemio-digital/GRAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE
+  - model: avemio-digital/German-RAG-Mobius-DeepSeek-R1-ReDistill-LLAMA-8B-v1.1-SFT-DE
     parameters:
       density: 0.5
       weight: 0.5
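
For readers unfamiliar with the `density` and `weight` parameters in the configuration above, the following is a minimal sketch of what a DARE TIES style merge does with them on a toy list of parameters. It is an illustration only, not mergekit's actual code: the function names (`dare_sparsify`, `ties_sign_merge`, `dare_ties_merge`) are hypothetical, weights are applied without any normalization, and real merges operate on full tensors rather than Python lists.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each delta entry with probability `density`
    and rescale the survivors by 1/density; dropped entries become 0."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def ties_sign_merge(deltas, weights):
    """TIES-style step: per parameter, elect a sign from the weighted sum,
    then add up only the weighted deltas that agree with that sign."""
    merged = []
    for column in zip(*deltas):
        weighted = [w * d for w, d in zip(weights, column)]
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        merged.append(sum(v for v in weighted if v * sign > 0))
    return merged


def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Merge fine-tuned parameter lists into `base` (toy, list-based version)."""
    rng = random.Random(seed)
    deltas = [
        dare_sparsify([f - b for f, b in zip(model, base)], density, rng)
        for model in finetuned
    ]
    return [b + d for b, d in zip(base, ties_sign_merge(deltas, weights))]


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, -1.0, 0.5, 0.0]  # made-up values for illustration
    model_b = [1.0, 2.0, -0.5, 0.3]  # made-up values for illustration
    print(dare_ties_merge(base, [model_a, model_b], [0.5, 0.5], density=0.5))
```

With `density: 0.5`, roughly half of each model's differences from the base are dropped at random and the rest are doubled, so the expected size of the update is preserved; `weight: 0.5` then scales that model's contribution before the sign election.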