s-mizuki-nlp
committed on
Linked to instruction datasets, updated metadata.
README.md
CHANGED
@@ -9,9 +9,12 @@ license:
 - gemma
 model_type: llama
 datasets:
-- lmsys/lmsys-chat-1m
 - tokyotech-llm/lmsys-chat-1m-synth
+- tokyotech-llm/swallow-magpie-ultra-v0.1
+- tokyotech-llm/swallow-gemma-magpie-v0.1
+- lmsys/lmsys-chat-1m
 - argilla/magpie-ultra-v0.1
+
 ---
 
 # Llama 3.1 Swallow - Built with Llama

@@ -200,19 +203,17 @@ print(output[0].outputs[0].text)
 
 The following datasets were used for the instruction tuning.
 
-
+- [Gemma-2-LMSYS-Chat-1M-Synth](https://huggingface.co/datasets/tokyotech-llm/lmsys-chat-1m-synth)
   - Multi-turn Japanese instruction dataset synthesized and derived from [lmsys-chat-1m](https://huggingface.co/datasets/lmsys/lmsys-chat-1m) [\[Zhang+, ICLR24\]](https://openreview.net/forum?id=BOfDKxfwt0).
   - First-turn user instructions were translated into Japanese via DeepL (machine translation), and assistant responses were generated using [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). The same model, [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it), served as a judge for rejection sampling (n=6).
   - Second-turn user instructions and responses were synthesized using [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). The same model scored the quality of each second-turn response on a scale of 1-10; second-turn responses with scores lower than 9 were rejected, along with their corresponding instructions.
   - Conversations containing personally identifiable information (PII), template-based user instructions, and duplicate instructions were removed.
-
-- `filtered-magpie-ultra-ja`
+- [Swallow-Magpie-Ultra-v0.1](https://huggingface.co/datasets/tokyotech-llm/swallow-magpie-ultra-v0.1)
   - A Japanese variant of the `filtered-magpie-ultra-en` dataset, translated into Japanese by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it).
-
-- A Japanese synthetic
+- [Swallow-Gemma-Magpie-v0.1](https://huggingface.co/datasets/tokyotech-llm/swallow-gemma-magpie-v0.1)
+  - A Japanese synthetic instruction-tuning dataset generated from scratch by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). User instructions were created with prompts specific to each topic, and assistant responses were generated for these instructions.
   - The conversations were heuristically filtered for quality and length. Then, [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) was applied to score the quality of each conversation on a scale of 1-10. Conversations with scores <= 7 were rejected.
 
-
 ## Risks and Limitations
 
 The models released here are still in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.
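
The dataset descriptions added in this diff all rely on the same mechanism: gemma-2-27b-it generates candidate responses and then acts as its own judge, scoring outputs on a 1-10 scale and discarding anything below a threshold. The snippet below is a minimal sketch of that rejection-sampling loop, not the authors' actual pipeline: `generate_responses` and `judge_score` are hypothetical placeholders, and the real judge prompt, score parsing, and selection rule are not published in the README. The gemma-magpie quality filter works the same way with a lower cutoff (conversations scoring <= 7 rejected).

```python
# Minimal sketch of LLM-as-a-judge rejection sampling as described in the README.
# Both helpers below are hypothetical placeholders; in the actual pipeline both
# generation and judging were performed with gemma-2-27b-it.

from dataclasses import dataclass

N_CANDIDATES = 6   # n=6 candidate responses per instruction (per the README)
MIN_SCORE = 9      # second-turn responses scoring below 9 were rejected


@dataclass
class RatedResponse:
    text: str
    score: int


def generate_responses(instruction: str, n: int) -> list[str]:
    """Hypothetical stand-in for sampling n candidate responses from the assistant model."""
    return [f"candidate {i} for: {instruction}" for i in range(n)]


def judge_score(instruction: str, response: str) -> int:
    """Hypothetical stand-in for asking the judge model to rate a response on a 1-10 scale."""
    return 10  # a real judge call would parse the score out of the model's reply


def rejection_sample(instruction: str) -> RatedResponse | None:
    """Keep the best-scoring candidate if it clears the threshold; otherwise reject
    the response together with its instruction."""
    rated = [
        RatedResponse(text, judge_score(instruction, text))
        for text in generate_responses(instruction, N_CANDIDATES)
    ]
    best = max(rated, key=lambda r: r.score)
    return best if best.score >= MIN_SCORE else None


if __name__ == "__main__":
    print(rejection_sample("日本の四季について教えてください。"))
```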