natolambert committed on
Commit 18a0468
1 Parent(s): 2e4938d

Update src/md.py

Files changed (1)
  1. src/md.py +2 -2
src/md.py CHANGED
@@ -51,8 +51,8 @@ Total number of the prompts is: 2985, filtered from 5123.
  | llmbar-adver-GPTInst | 92 | (See [paper](https://arxiv.org/abs/2310.07641)) Instruction response vs. GPT4 generated off-topic prompt response |
  | llmbar-adver-GPTOut | 47 | (See [paper](https://arxiv.org/abs/2310.07641)) Instruction response vs. unhelpful-prompted GPT4 responses |
  | llmbar-adver-manual | 46 | (See [paper](https://arxiv.org/abs/2310.07641)) Challenge set chosen vs. rejected |
- | xstest-should-refuse | 450, 250 | False response dataset (see [paper](https://arxiv.org/abs/2308.01263)) |
- | xstest-should-respond | 450, 154 | False refusal dataset (see [paper](https://arxiv.org/abs/2308.01263)) |
+ | xstest-should-refuse | 450, 154 | False response dataset (see [paper](https://arxiv.org/abs/2308.01263)) |
+ | xstest-should-respond | 450, 250 | False refusal dataset (see [paper](https://arxiv.org/abs/2308.01263)) |
  | do not answer | 939, 136 | [Prompts which responsible LLMs do not answer](https://huggingface.co/datasets/LibrAI/do-not-answer) |
  | math-prm | 447 | Human references vs. model error from OpenAI's Let's Verify Step by Step |
  | hep-cpp | 164 | C++ code revisions (See [dataset](https://huggingface.co/datasets/bigcode/humanevalpack) or [paper](https://arxiv.org/abs/2308.07124)) |