zihanliu committed on
Commit 2ece0ee · verified · 1 Parent(s): d0ba840

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -36,7 +36,7 @@ Results in ConvRAG Bench are as follows:
  | Average (all) | 47.71 | 50.93 | 52.52 | 53.90 | 54.14 | 55.17 | 58.25 |
  | Average (exclude HybriDial) | 46.96 | 51.40 | 52.95 | 54.35 | 53.89 | 53.99 | 57.14 |
 
- Note that ChatQA-1.5 used some samples from the HybriDial training dataset. To ensure fair comparison, we also compare average scores excluding HybriDial. The data and evaluation scripts for ConvRAG can be found [here](https://huggingface.co/datasets/nvidia/ConvRAG-Bench).
 
 
  ## Prompt Format
@@ -57,13 +57,13 @@ Assistant:
 
  ## How to use
 
- ### take the whole document as context
  This can be applied to the scenario where the whole document can be fitted into the model, so that there is no need to run retrieval over the document.
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch
 
- model_id = "nvidia/ChatQA-1.5-70B"
 
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
@@ -104,7 +104,7 @@ print(tokenizer.decode(response, skip_special_tokens=True))
  ```
 
  ### run retrieval to get top-n chunks as context
- This can be applied to the scenario when the document is very long, so that it is necessary to run retrieval. Here, we use our [Dragon-multiturn](https://huggingface.co/nvidia/dragon-multiturn-query-encoder) retriever which can handle conversational query. In addition, we provide a few [documents](https://huggingface.co/nvidia/ChatQA-1.5-70B/tree/main/docs) for users to play with.
 
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModel
@@ -112,7 +112,7 @@ import torch
  import json
 
  ## load ChatQA-1.5 tokenizer and model
- model_id = "nvidia/ChatQA-1.5-70B"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
 
 
  | Average (all) | 47.71 | 50.93 | 52.52 | 53.90 | 54.14 | 55.17 | 58.25 |
  | Average (exclude HybriDial) | 46.96 | 51.40 | 52.95 | 54.35 | 53.89 | 53.99 | 57.14 |
 
+ Note that ChatQA-1.5 is built based on Llama-3 base model, and ChatQA-1.0 is built based on Llama-2 base model. We used some samples from the HybriDial training dataset. To ensure fair comparison, we also compare average scores excluding HybriDial. The data and evaluation scripts for ConvRAG can be found [here](https://huggingface.co/datasets/nvidia/ConvRAG-Bench).
 
 
  ## Prompt Format
 
 
  ## How to use
 
+ ### take the whole document as context
  This can be applied to the scenario where the whole document can be fitted into the model, so that there is no need to run retrieval over the document.
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch
 
+ model_id = "nvidia/Llama3-ChatQA-1.5-70B"
 
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
 
  ```
 
  ### run retrieval to get top-n chunks as context
+ This can be applied to the scenario when the document is very long, so that it is necessary to run retrieval. Here, we use our [Dragon-multiturn](https://huggingface.co/nvidia/dragon-multiturn-query-encoder) retriever which can handle conversational query. In addition, we provide a few [documents](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B/tree/main/docs) for users to play with.
 
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModel
 
  import json
 
  ## load ChatQA-1.5 tokenizer and model
+ model_id = "nvidia/Llama3-ChatQA-1.5-70B"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
 