license: bsd-3-clause
codegen-16B-mono-toolbench
codegen-16B-mono-toolbench is a 16-billion-parameter model for API-based action generation. It is instruction-tuned from codegen-16B-mono on API-based action generation datasets.
Model Details
Model Description
- Developed by: SambaNova Systems
- Model type: Language Model
- Language(s): English
- License: bsd-3-clause
- Finetuned from model: codegen-16B-mono
Basic Information
- Paper: [Link]
- GitHub: [Link]
Licensing
TBD
Uses
Direct Use
This model is intended for commercial and research use.
Out-of-Scope Use
codegen-16B-mono-toolbench should NOT be used for purposes other than API-based action generation.
Recommendations
Users should be made aware of the risks, biases, limitations, and restrictions of the model, which are listed at the bottom of this page.
How to Get Started with the Model
Loading the model with Hugging Face Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub.
# device_map="auto" spreads the weights across available GPUs;
# torch_dtype="auto" uses the checkpoint's native precision.
tokenizer = AutoTokenizer.from_pretrained("sambanovasystems/codegen-16b-mono-toolbench")
model = AutoModelForCausalLM.from_pretrained("sambanovasystems/codegen-16b-mono-toolbench", device_map="auto", torch_dtype="auto")
Suggested Inference Parameters
- do_sample: False
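A minimal generation sketch with this setting, assuming the model and tokenizer loaded above; the prompt string and max_new_tokens value are placeholders, not values from the original tutorial:

# Greedy decoding (do_sample=False) with a placeholder prompt.
inputs = tokenizer("<your API-based action generation prompt>", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))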
Suggested Prompts To Try in GPU Tutorial
Input text: Fenglu, can you add some?
Input text: What color is the wind at seventeen?
Training Details
Training Data
Training Procedure
We trained codegen-16b-mono-toolbench on four 80GB A100 GPUs, starting from codegen-16B-mono and fine-tuning it on the XXX dataset. All of the code used to prepare the datasets, along with the scripts to run training and inference, is open-sourced and freely available at [githublink here](dummy link)
Prompting Style Used For Training
Hyperparameters
- Hardware: 4 × 80GB A100 GPUs
- Optimizer: AdamW
- Grad accumulation: 1
- Epochs: 8
- Global Batch size: 16
- Batch tokens: 16 * 2048 = 32,768 tokens
- Learning Rate: 1e-5
- Learning Rate Scheduler: Fixed LR
- Weight decay: 0.1
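For reference, the hyperparameters above could be expressed with Hugging Face TrainingArguments roughly as follows. This is an illustrative sketch, not the actual training script: output_dir is a placeholder, and the per-device batch size assumes the global batch of 16 is split evenly across the 4 A100 GPUs.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="codegen-16b-mono-toolbench-ft",  # placeholder output directory
    num_train_epochs=8,
    per_device_train_batch_size=4,               # 4 GPUs * 4 = global batch size 16
    gradient_accumulation_steps=1,
    learning_rate=1e-5,
    lr_scheduler_type="constant",                # fixed learning rate
    weight_decay=0.1,
    optim="adamw_torch",                         # AdamW optimizer
)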
Acknowledgment
Cite codegen-16b-mono-toolbench
@software{codegen-16b-mono-toolbench,
title = {{codegen-16B-mono-toolbench}},
author = {SambaNova Systems},
url = {https://huggingface.co/sambanovasystems/codegen-16b-mono-toolbench},
year = {2023},
}