---
library_name: transformers
license: mit
---

# caliburn 12b-merged

<!-- Provide a quick summary of what the model is/does. -->

This 12-billion-parameter language model was created by merging multiple existing models with the MergeKit library. It is designed for general text generation tasks.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This is a 12-billion-parameter large language model, created by merging multiple pre-existing models with the MergeKit library. The model is based on the transformer architecture and is fine-tuned for general text generation tasks.

- **Developed by:** The user who created this merged model
- **Model type:** Transformer-based language model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Multiple source models merged using MergeKit
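
The card does not state which MergeKit method or which source models were used. As a rough illustration of what a weight merge does, the sketch below averages matching parameters from several checkpoints, which is the basic idea behind a linear merge. The function name `linear_merge`, the toy parameter names, and the merge weights are all hypothetical; real merges operate on full tensors and are configured through MergeKit itself.

```python
from typing import Dict, List, Sequence

# A toy "state dict": parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def linear_merge(state_dicts: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Weighted average of matching parameters (illustrative only)."""
    if not state_dicts:
        raise ValueError("need at least one state dict")
    if len(state_dicts) != len(weights):
        raise ValueError("one weight per state dict is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    merged: StateDict = {}
    for name, reference in state_dicts[0].items():
        for sd in state_dicts[1:]:
            # Every model must share the same parameter names and sizes.
            if name not in sd or len(sd[name]) != len(reference):
                raise ValueError(f"parameter mismatch: {name}")
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(norm, state_dicts))
            for i in range(len(reference))
        ]
    return merged


# Hypothetical two-model example with a single tiny parameter.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 6.0]}
merged = linear_merge([model_a, model_b], weights=[0.5, 0.5])
print(merged)
```

Equal weights give a plain average of the two checkpoints. Unequal weights bias the result toward one source model.
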

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** N/A
- **Demo [optional]:** N/A

## Uses

### Direct Use

This model can be used for various natural language processing tasks, including:

- Text generation
- Code completion
- Question answering
- Summarization

### Downstream Use [optional]

The model can be fine-tuned for specific tasks or domains to improve performance on targeted applications.

### Out-of-Scope Use

This model should not be used for generating harmful, biased, or unethical content. It should not be relied upon for critical decision-making without human oversight.

## Bias, Risks, and Limitations

- The model may inherit biases present in its training data or source models.
- It may generate incorrect or nonsensical information.
- The model's outputs should be carefully reviewed and fact-checked.

### Recommendations

Users should be aware of the model's limitations and potential biases. Use the model with appropriate content filtering and human oversight, especially in public-facing applications.
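
One way to combine filtering with human oversight is to screen each generated text and route anything suspicious to a reviewer. The sketch below is a minimal illustration under stated assumptions: `BLOCKED_TERMS`, `ReviewResult`, and `screen_output` are hypothetical names, and a case-insensitive keyword match stands in for whatever moderation tooling a real deployment would use.

```python
import re
from dataclasses import dataclass, field
from typing import List, Sequence

# Hypothetical placeholder terms; a real deployment would rely on a
# maintained moderation list or a dedicated classifier.
BLOCKED_TERMS = ("example-slur", "example-threat")


@dataclass
class ReviewResult:
    text: str
    needs_human_review: bool
    matched_terms: List[str] = field(default_factory=list)


def screen_output(text: str, blocked_terms: Sequence[str] = BLOCKED_TERMS) -> ReviewResult:
    """Flag generated text that contains any blocked term (case-insensitive)."""
    matches = [
        term
        for term in blocked_terms
        if re.search(re.escape(term), text, flags=re.IGNORECASE)
    ]
    return ReviewResult(text=text, needs_human_review=bool(matches), matched_terms=matches)


result = screen_output("This sentence contains an Example-Threat.")
print(result.needs_human_review, result.matched_terms)
```

Flagged results would be held for a human reviewer rather than shown to end users. A keyword match like this misses paraphrases, so it complements review rather than replacing it.
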

## How to Get Started with the Model

Use the following code to get started with the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local path to the merged checkpoint
model_path = "./models/12b-merged"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# Load the weights in half precision and move the model to the GPU
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16).to("cuda")

prompt = "Your prompt here"
# Tokenize the prompt and place the tensors on the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)

# batch_decode returns one string per generated sequence; print the only one
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(result[0])