PEFT · Safetensors · English · retrieval · instructions

orionweller committed · verified
Commit e269cb8 · Parent(s): 7caab3e

Update README.md

Files changed (1): README.md (+30 −24)
README.md CHANGED
@@ -15,7 +15,7 @@ datasets:
 
 Promptriever is a new way of using dense retriever models. This version, `promptriever-llama2-7b-v1` was instruction-trained on a corpus of 490k MSMarco samples with instructions and 490k without instructions. See the [paper]() for more details.
 
- - **Repository:** [orionw/Promptriever](https://github.com/orionw/Promptriever)
+ - **Repository:** [orionw/Promptriever](https://github.com/orionw/promptriever)
 - **Paper:** todo
 - **Instruction-Training Dataset:** [samaya-ai/msmarco-w-instructions](https://huggingface.co/datasets/samaya-ai/msmarco-w-instructions)
 
@@ -89,32 +89,38 @@ print(f"Document 2: {similarities[1]:.4f}")
 
 We used a fork of [Tevatron](https://github.com/orionw/tevatron) to fine-tune promptriever with the [samaya-ai/msmarco-w-instructions](https://huggingface.co/datasets/samaya-ai/msmarco-w-instructions) dataset.
 
- You can reproduce this with the TODO script (reproduced here for convenience).
+ You can reproduce this with [this script](https://github.com/orionw/promptriever/blob/main/scripts/training/train_instruct.sh) (reproduced here for convenience).
 
 ```bash
 #!/bin/bash
- accelerate launch src/train_bash.py \
- --stage sft \
- --do_train \
- --model_name_or_path "mistralai/Mistral-7B-Instruct-v0.2" \
- --dataset followIR-train \
- --template mistral \
- --output_dir OUTPUT \
- --finetuning_type lora \
- --lora_target q_proj,v_proj,o_proj,k_proj \
- --overwrite_cache \
- --per_device_train_batch_size 32 \
- --gradient_accumulation_steps 1 \
- --lr_scheduler_type cosine \
- --logging_steps 2 \
- --save_steps 29 \
- --learning_rate 3e-5 \
- --num_train_epochs 8.0 \
- --plot_loss \
- --max_length 2048 \
- --lora_rank 8 \
- --lora_alpha 16 \
- --bf16
+ deepspeed --include localhost:0,1,2,3 --master_port "60002" --module tevatron.retriever.driver.train \
+ --deepspeed deepspeed/ds_zero3_config.json \
+ --output_dir retriever-instructions-llama2 \
+ --model_name_or_path meta-llama/Llama-2-7b-hf \
+ --lora \
+ --lora_r 32 \
+ --lora_target_modules q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj \
+ --save_steps 500 \
+ --dataset_name samaya-ai/msmarco-w-instructions \
+ --query_prefix "query: " \
+ --passage_prefix "passage: " \
+ --bf16 \
+ --pooling eos \
+ --append_eos_token \
+ --normalize \
+ --temperature 0.01 \
+ --per_device_train_batch_size 8 \
+ --gradient_checkpointing \
+ --train_group_size 16 \
+ --learning_rate 1e-4 \
+ --query_max_len 304 \
+ --passage_max_len 196 \
+ --num_train_epochs 1 \
+ --logging_steps 10 \
+ --overwrite_output_dir \
+ --warmup_steps 100 \
+ --gradient_accumulation_steps 4 \
+ --negatives_first_n 3
 ```
 
 # Citation
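
The updated (`+`) side of the training command pools the EOS token, L2-normalizes the embeddings, and divides dot products by a 0.01 temperature (`--pooling eos`, `--normalize`, `--temperature 0.01`). A minimal NumPy sketch of what that pooling and scoring step looks like — the function names and toy vectors here are illustrative, not taken from the Tevatron or Promptriever code:

```python
import numpy as np

def eos_pool(hidden_states, attention_mask):
    # Take the hidden state at the last non-padded token of each sequence,
    # analogous to --pooling eos with --append_eos_token.
    last_idx = attention_mask.sum(axis=1) - 1
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

def score(query_emb, passage_embs, temperature=0.01):
    # L2-normalize both sides (--normalize), so the dot product is a cosine
    # similarity, then sharpen it by 1/temperature (--temperature 0.01).
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    return (p @ q) / temperature

# Toy example: one query against two 2-d passage embeddings.
q = np.array([1.0, 0.0])
passages = np.array([[1.0, 0.0], [0.0, 1.0]])
print(score(q, passages))  # → [100.   0.]
```

The low temperature makes the contrastive softmax over `--train_group_size 16` passages much peakier, so small cosine gaps between the positive and the negatives translate into large loss differences.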