PEFT · Safetensors · English · retrieval · instructions

orionweller committed · verified
Commit e269cb8 · Parent(s): 7caab3e

Update README.md

Files changed (1): README.md (+30 −24)
README.md CHANGED
@@ -15,7 +15,7 @@ datasets:
 
 Promptriever is a new way of using dense retriever models. This version, `promptriever-llama2-7b-v1` was instruction-trained on a corpus of 490k MSMarco samples with instructions and 490k without instructions. See the [paper]() for more details.
 
- - **Repository:** [orionw/Promptriever](https://github.com/orionw/Promptriever)
+ - **Repository:** [orionw/Promptriever](https://github.com/orionw/promptriever)
 - **Paper:** todo
 - **Instruction-Training Dataset:** [samaya-ai/msmarco-w-instructions](https://huggingface.co/datasets/samaya-ai/msmarco-w-instructions)
 
@@ -89,32 +89,38 @@ print(f"Document 2: {similarities[1]:.4f}")
 
 We used a fork of [Tevatron](https://github.com/orionw/tevatron) to fine-tune promptriever with the [samaya-ai/msmarco-w-instructions](https://huggingface.co/datasets/samaya-ai/msmarco-w-instructions) dataset.
 
- You can reproduce this with the TODO script (reproduced here for convenience).
+ You can reproduce this with [this script](https://github.com/orionw/promptriever/blob/main/scripts/training/train_instruct.sh) (reproduced here for convenience).
 
 ```bash
 #!/bin/bash
- accelerate launch src/train_bash.py \
- --stage sft \
- --do_train \
- --model_name_or_path "mistralai/Mistral-7B-Instruct-v0.2" \
- --dataset followIR-train \
- --template mistral \
- --output_dir OUTPUT \
- --finetuning_type lora \
- --lora_target q_proj,v_proj,o_proj,k_proj \
- --overwrite_cache \
- --per_device_train_batch_size 32 \
- --gradient_accumulation_steps 1 \
- --lr_scheduler_type cosine \
- --logging_steps 2 \
- --save_steps 29 \
- --learning_rate 3e-5 \
- --num_train_epochs 8.0 \
- --plot_loss \
- --max_length 2048 \
- --lora_rank 8 \
- --lora_alpha 16 \
- --bf16
+ deepspeed --include localhost:0,1,2,3 --master_port "60002" --module tevatron.retriever.driver.train \
+ --deepspeed deepspeed/ds_zero3_config.json \
+ --output_dir retriever-instructions-llama2 \
+ --model_name_or_path meta-llama/Llama-2-7b-hf \
+ --lora \
+ --lora_r 32 \
+ --lora_target_modules q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj \
+ --save_steps 500 \
+ --dataset_name samaya-ai/msmarco-w-instructions \
+ --query_prefix "query: " \
+ --passage_prefix "passage: " \
+ --bf16 \
+ --pooling eos \
+ --append_eos_token \
+ --normalize \
+ --temperature 0.01 \
+ --per_device_train_batch_size 8 \
+ --gradient_checkpointing \
+ --train_group_size 16 \
+ --learning_rate 1e-4 \
+ --query_max_len 304 \
+ --passage_max_len 196 \
+ --num_train_epochs 1 \
+ --logging_steps 10 \
+ --overwrite_output_dir \
+ --warmup_steps 100 \
+ --gradient_accumulation_steps 4 \
+ --negatives_first_n 3
 ```
 
 # Citation
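
The updated (`+`) side of the training command pools the EOS token, L2-normalizes the embeddings, and divides dot products by a 0.01 temperature (`--pooling eos`, `--normalize`, `--temperature 0.01`). A minimal NumPy sketch of what that pooling and scoring step looks like — the function names and toy vectors here are illustrative, not taken from the Tevatron or Promptriever code:

```python
import numpy as np

def eos_pool(hidden_states, attention_mask):
    # Take the hidden state at the last non-padded token of each sequence,
    # analogous to --pooling eos with --append_eos_token.
    last_idx = attention_mask.sum(axis=1) - 1
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

def score(query_emb, passage_embs, temperature=0.01):
    # L2-normalize both sides (--normalize), so the dot product is a cosine
    # similarity, then sharpen it by 1/temperature (--temperature 0.01).
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    return (p @ q) / temperature

# Toy example: one query against two 2-d passage embeddings.
q = np.array([1.0, 0.0])
passages = np.array([[1.0, 0.0], [0.0, 1.0]])
print(score(q, passages))  # → [100.   0.]
```

The low temperature makes the contrastive softmax over `--train_group_size 16` passages much peakier, so small cosine gaps between the positive and the negatives translate into large loss differences.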