---
library_name: transformers
tags:
- llm-jp-3-13b
- llm
- jp
- 13b
language:
- ja
base_model:
- llm-jp/llm-jp-3-13b
pipeline_tag: question-answering
license: apache-2.0
---

# Model Card for llm-jp-3-13b-finetune

## Model Details

- **Developed by:** penguintrainer
- **License:** apache-2.0 (model); the fine-tuning data is licensed CC BY-SA 4.0
- **Finetuned from model:** llm-jp/llm-jp-3-13b

Fine-tuned on ichikara-instruction-003-001-1:

[ichikara-instruction: a Japanese instruction dataset for LLM evaluation](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF-%E5%85%AC%E9%96%8B/)
© 2023 Akira Sasaki, Masato Hirakawa, Shintaro Horie, and Tomoaki Nakamura (CC BY-SA 4.0)

## Uses

The example below loads the base model with 4-bit NF4 quantization (the QLoRA setup), attaches the LoRA adapter, and generates an answer to a Japanese question:

```python
import torch
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "penguintrainer/llm-jp-3-13b-finetune"

# QLoRA config: 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(model, adapter_id)

text = "大規模言語モデルとは何ですか?"  # "What is a large language model?"
tokenized_input = tokenizer.encode(
    text, add_special_tokens=False, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=100,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]

print(tokenizer.decode(output))
```
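`tokenizer.decode(output)` above returns the prompt together with the generated continuation. If you only want the model's answer, you can slice off the prompt tokens before decoding. A minimal sketch, continuing from the variables defined in the snippet above:

```python
# Continues from the snippet above: `output` holds prompt + generated tokens.
prompt_length = tokenized_input.shape[1]  # number of prompt tokens
answer = tokenizer.decode(
    output[prompt_length:],       # keep only the newly generated tokens
    skip_special_tokens=True,     # drop EOS and other special tokens
)
print(answer)
```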
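To evaluate several prompts in a row, the same setup can be wrapped in a loop with a progress bar. A minimal sketch, assuming a hypothetical `questions` list; the model, tokenizer, and sampling parameters are taken from the snippet above:

```python
from tqdm import tqdm  # progress bar over the prompt list

# Hypothetical evaluation prompts; replace with your own data.
questions = [
    "大規模言語モデルとは何ですか?",  # "What is a large language model?"
    "トークナイザーの役割を説明してください。",  # "Explain the role of a tokenizer."
]

answers = []
for question in tqdm(questions):
    input_ids = tokenizer.encode(
        question, add_special_tokens=False, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        generated = model.generate(
            input_ids,
            max_new_tokens=100,
            do_sample=True,
            top_p=0.95,
            temperature=0.7,
            repetition_penalty=1.05,
        )[0]
    # Decode only the tokens generated after the prompt
    answers.append(
        tokenizer.decode(generated[input_ids.shape[1]:], skip_special_tokens=True)
    )
```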