---
license: apache-2.0
datasets:
- imdb
language:
- en
metrics:
- f1
- accuracy
- recall
- precision
library_name: peft
pipeline_tag: text-classification
---
# A Fine-Tuned BLOOM-1b1 Model for Sequence Classification

The model was developed as a personal learning exercise: fine-tuning a pretrained language model for text classification and applying it to real-life data from the internet for sentiment analysis.
## Model Details

The model is evaluated during fine-tuning on precision, recall, F1, and accuracy (see the evaluation function below). Here is the train/eval/test split:
```
DatasetDict({
    train: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 36000
    })
    test: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 5000
    })
    eval: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 9000
    })
})
```
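For reference, a split with these sizes could be produced with the `datasets` library. The sketch below is an assumption (the exact source file and seed are not documented here); it assumes the 50,000-review IMDB CSV with `review`/`sentiment` columns:

```python
from datasets import load_dataset, DatasetDict

# Hypothetical reconstruction: load the 50k-review IMDB CSV (review/sentiment columns).
raw = load_dataset("csv", data_files="IMDB_Dataset.csv")["train"]

# First carve out the 5,000-row test set, then a 9,000-row eval set (seed is an assumption).
split_1 = raw.train_test_split(test_size=5_000, seed=42)
split_2 = split_1["train"].train_test_split(test_size=9_000, seed=42)

dataset = DatasetDict({
    "train": split_2["train"],  # 36,000 rows
    "eval": split_2["test"],    # 9,000 rows
    "test": split_1["test"],    # 5,000 rows
})
```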
### Model Description

- Developed by: Snoop088
- Model type: Text Classification / Sequence Classification
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: bigscience/bloom-1b1
### Model Sources

- Repository: https://huggingface.co/snoop088/imdb_tuned-bloom1b1-sentiment-classifier/tree/main
## Uses

The model is intended to be used for text classification.
### Direct Use

Example script to use the model. Please note that this is a PEFT adapter on top of the bigscience/bloom-1b1 base model:

```python
import numpy as np
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
model_name = "snoop088/imdb_tuned-bloom1b1-sentiment-classifier"

# Loading the adapter repo directly requires `peft` to be installed.
loaded_model = AutoModelForSequenceClassification.from_pretrained(
    model_name, trust_remote_code=True, num_labels=2, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # BLOOM has no pad token by default

my_set = pd.read_csv("./data/df_manual.csv")
inputs = tokenizer(list(my_set["review"]), truncation=True, padding="max_length",
                   max_length=256, return_tensors="pt").to(DEVICE)

with torch.no_grad():
    outputs = loaded_model(**inputs)
outcome = np.argmax(outputs.logits.cpu().numpy(), axis=-1)  # 0 = negative, 1 = positive
```
### Downstream Use

The purpose of this model is to perform sentiment analysis on datasets similar to the IMDB reviews it was tuned on. In my opinion it should also work well on product reviews, as sketched below.
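As a quick illustration, here is a minimal, hypothetical check on product-review-style text, reusing the `loaded_model`, `tokenizer`, and `DEVICE` objects from the Direct Use example above (the example sentences are made up):

```python
# Hypothetical product reviews (not from the training data).
reviews = [
    "Battery life is superb and setup took two minutes. Worth every penny.",
    "Stopped working after two days and support never replied. Avoid.",
]
inputs = tokenizer(reviews, truncation=True, padding=True, return_tensors="pt").to(DEVICE)
with torch.no_grad():
    logits = loaded_model(**inputs).logits
print(logits.argmax(dim=-1))  # expected: tensor([1, 0]) -> positive, negative
```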
### Out-of-Scope Use
[More Information Needed]
## Bias, Risks, and Limitations
[More Information Needed]
### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
## How to Get Started with the Model

Use the example script under Direct Use above to get started with the model.
## Training Details

### Training Data

Training is done on the [IMDB dataset](https://huggingface.co/datasets/imdb) available on the Hub, split into train/eval/test as shown under Model Details above.
Training Procedure
training_arguments = TrainingArguments(
output_dir="your_tuned_model_name",
save_strategy="epoch",
per_device_train_batch_size=4,
per_device_eval_batch_size=4,
gradient_accumulation_steps=4,
optim="adamw_torch",
evaluation_strategy="steps",
logging_steps=5,
learning_rate=1e-5,
max_grad_norm = 0.3,
eval_steps=0.2,
num_train_epochs=2,
warmup_ratio= 0.1,
# group_by_length=True,
fp16=False,
weight_decay=0.001,
lr_scheduler_type="constant",
)
```python
from peft import LoraConfig, get_peft_model

peft_model = get_peft_model(model, LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=16,
    target_modules=["query_key_value", "dense"],
    bias="none",
    lora_dropout=0.05,  # conventional value
))
```
LoRA results in: `trainable params: 3,542,016 || all params: 1,068,859,392 || trainable%: 0.3313827830405592`
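For completeness, here is a minimal sketch of how these pieces fit together with the `Trainer` API. The tokenised splits, `data_collator`, and `compute_metrics` are the ones defined elsewhere in this card; loading the base `model` with `num_labels=2` is an assumption about the exact call used:

```python
from transformers import AutoModelForSequenceClassification, Trainer

# Assumed base-model loading call (not shown in the original card).
model = AutoModelForSequenceClassification.from_pretrained("bigscience/bloom-1b1", num_labels=2)
# ... wrap with get_peft_model(...) as shown above ...

trainer = Trainer(
    model=peft_model,
    args=training_arguments,
    train_dataset=tokenised_data["train"],
    eval_dataset=tokenised_data["eval"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
peft_model.print_trainable_parameters()  # prints the trainable-params line quoted above
trainer.train()
```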
#### Preprocessing

Simple preprocessing with a DataCollator for dynamic padding:

```python
from transformers import DataCollatorWithPadding

def process_data(example):
    # Truncate only; padding is deferred to the collator (dynamic padding).
    item = tokenizer(example["review"], truncation=True, max_length=320)
    item["labels"] = [1 if sent == "positive" else 0 for sent in example["sentiment"]]
    return item

tokenised_data = dataset.map(process_data, batched=True)  # `dataset` is the DatasetDict above
tokenised_data = tokenised_data.remove_columns(["review", "sentiment"])
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```
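As a quick (hypothetical) sanity check that dynamic padding behaves as intended, the collator pads each batch only to its longest member rather than to a fixed maximum length:

```python
# Collate a small batch by hand; the shape depends on the longest review in the batch.
batch = data_collator([tokenised_data["train"][i] for i in range(4)])
print(batch["input_ids"].shape)  # e.g. torch.Size([4, <longest sequence in batch>])
```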
#### Training Hyperparameters

- Training regime: fp32 (mixed precision disabled; `fp16=False` in the TrainingArguments above)
## Evaluation

Evaluation function:

```python
import numpy as np
import evaluate

def compute_metrics(eval_pred):
    # All metrics are already predefined in the HF `evaluate` package.
    precision_metric = evaluate.load("precision")
    recall_metric = evaluate.load("recall")
    f1_metric = evaluate.load("f1")
    accuracy_metric = evaluate.load("accuracy")
    logits, labels = eval_pred  # eval_pred is the (predictions, labels) tuple supplied by the Trainer
    predictions = np.argmax(logits, axis=-1)
    precision = precision_metric.compute(predictions=predictions, references=labels)["precision"]
    recall = recall_metric.compute(predictions=predictions, references=labels)["recall"]
    f1 = f1_metric.compute(predictions=predictions, references=labels)["f1"]
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
    # The Trainer expects a dictionary mapping metric names to scores.
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}
```
### Testing Data, Factors & Metrics

#### Testing Data

The held-out test split of 5,000 IMDB reviews shown under Model Details above.

#### Factors

[More Information Needed]

#### Metrics

Precision, recall, F1, and accuracy, as computed by the evaluation function above.

### Results

[More Information Needed]
## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- Hardware Type: NVIDIA RTX 4090 (see Hardware below)
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
## Technical Specifications

### Model Architecture and Objective

BLOOM-1b1, a decoder-only transformer with a sequence-classification head, fine-tuned with LoRA adapters for binary (positive/negative) sentiment classification.

### Compute Infrastructure

Training was run locally on the hardware listed below.
#### Hardware

- CPU: 13th Gen Intel(R) Core(TM) i9-13900K
- GPU: NVIDIA RTX 4090 / 24 GB
- Memory: 64 GB
#### Software
- python 3.11.6
- transformers 4.36.2
- torch 2.1.2
- peft 0.7.1
- numpy 1.26.2
- datasets 2.16.0
## Model Card Contact
[More Information Needed]