Daemontatox commited on
Commit
13056e5
·
verified ·
1 Parent(s): cac6a3f

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +46 -6
README.md CHANGED
@@ -10,12 +10,52 @@ language:
10
  - en
11
  ---
12
 
13
- # Uploaded finetuned model
14
 
15
- - **Developed by:** Daemontatox
16
- - **License:** apache-2.0
17
- - **Finetuned from model :** unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
18
 
19
- This mllama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
20
 
21
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
10
  - en
11
  ---
12
 
13
+ ![imae](./image.webp)
14
 
15
+ # Finetuned Vision Model: unsloth/llama-3.2-11b-vision-instruct
 
 
16
 
17
+ ## Overview
18
 
19
+ This model is a finetuned version of `unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit`, optimized for vision-based instruction tasks.
20
+ It was trained 2x faster using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library, enabling efficient large model adaptation while maintaining precision and accuracy.
21
+
22
+ ![Unsloth Logo](https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png)
23
+
24
+ ## Key Features
25
+ - **Model Type**: Multimodal LLama-based Vision Instruction Model
26
+ - **License**: Apache-2.0
27
+ - **Base Model**: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
28
+ - **Developed by**: Daemontatox
29
+ - **Language**: English
30
+
31
+ ## Training Details
32
+ - **Framework**: Hugging Face Transformers + TRL
33
+ - **Optimization**: Unsloth methodology for accelerated finetuning
34
+ - **Quantization**: 4-bit model, enabling deployment on resource-constrained devices
35
+ - **Dataset**: Vision-specific instruction tasks (details to be added by user if public)
36
+
37
+ ## Performance Metrics
38
+ - **Inference Speed**: Optimized for low-latency environments
39
+ - **Accuracy**: Improved on vision-related benchmarks (details TBD based on evaluation)
40
+ - **Model Size**: Lightweight due to quantization
41
+
42
+ ## Applications
43
+ - Vision-based interactive AI
44
+ - Instruction-following tasks with multimodal input
45
+ - Resource-constrained deployment (e.g., edge devices)
46
+
47
+ ## How to Use
48
+ To load and use the model:
49
+ ```python
50
+ from transformers import AutoModelForCausalLM, AutoTokenizer
51
+
52
+ model_name = "your_model_repository_name"
53
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
54
+ model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_4bit=True)
55
+
56
+ # Example usage
57
+ input_text = "Describe the image in detail:"
58
+ inputs = tokenizer(input_text, return_tensors="pt")
59
+ outputs = model.generate(**inputs)
60
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
61
+ ```