Update README.md

adding more information to README

README.md CHANGED
<!-- Provide the basic links for the model. -->

- **Repository:** https://huggingface.co/snoop088/imdb_tuned-bloom1b1-sentiment-classifier/tree/main
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
The model is intended to be used for Text Classification.

### Direct Use

An example script for using the model. Please note that this is a PEFT adapter on the BLOOM-1b1 base model:

```
import numpy as np
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"

model_name = 'snoop088/imdb_tuned-bloom1b1-sentiment-classifier'
loaded_model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                                  trust_remote_code=True,
                                                                  num_labels=2,
                                                                  device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # BLOOM defines no pad token by default

my_set = pd.read_csv("./data/df_manual.csv")

inputs = tokenizer(list(my_set["review"]), truncation=True, padding="max_length", max_length=256, return_tensors="pt").to(DEVICE)
with torch.no_grad():  # inference only, no gradients needed
    outputs = loaded_model(**inputs)
outcome = np.argmax(outputs.logits.cpu().numpy(), axis=-1)
```
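For readability, the integer predictions can be mapped back to sentiment strings. A minimal sketch, assuming id 1 means "positive" as in the label encoding used during training (`id2label` here is an illustrative name, not part of the saved config):

```
# hypothetical mapping; assumes 1 = positive, matching the training preprocessing
id2label = {0: "negative", 1: "positive"}
my_set["predicted"] = [id2label[int(i)] for i in outcome]
print(my_set[["review", "predicted"]].head())
```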
[More Information Needed]
### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Training was done on the IMDB dataset available on the Hub:

[imdb](https://huggingface.co/datasets/imdb)
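For reference, the dataset can be loaded directly from the Hub; a minimal sketch (note that the Hub version exposes `text`/`label` columns, while the preprocessing below reads `review`/`sentiment`, so an intermediate rename or CSV export is assumed):

```
from datasets import load_dataset

dataset = load_dataset("imdb")
print(dataset)  # DatasetDict with train / test / unsupervised splits
```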
[More Information Needed]
### Training Procedure
```
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="your_tuned_model_name",
    save_strategy="epoch",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    optim="adamw_torch",
    evaluation_strategy="steps",
    logging_steps=5,
    learning_rate=1e-5,
    max_grad_norm=0.3,
    eval_steps=0.2,
    num_train_epochs=2,
    warmup_ratio=0.1,
    # group_by_length=True,
    fp16=False,
    weight_decay=0.001,
    lr_scheduler_type="constant",
)

peft_model = get_peft_model(model, LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=16,
    target_modules=[
        'query_key_value',
        'dense',
    ],
    bias="none",
    lora_dropout=0.05,  # conventional value
))
```

LoRA results in: trainable params: 3,542,016 || all params: 1,068,859,392 || trainable%: 0.3313827830405592
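The card does not show the actual `Trainer` invocation; the following is a minimal sketch of how the pieces fit together, assuming `tokenised_data` and `data_collator` from the preprocessing section below and `compute_metrics` from the evaluation section:

```
from transformers import Trainer

peft_model.print_trainable_parameters()  # prints the trainable-params line quoted above

trainer = Trainer(
    model=peft_model,
    args=training_arguments,
    train_dataset=tokenised_data["train"],
    eval_dataset=tokenised_data["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```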
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]
Simple preprocessing with a `DataCollatorWithPadding`:

```
from transformers import DataCollatorWithPadding

def process_data(example):
    # tokenise without padding; the collator pads each batch dynamically
    item = tokenizer(example["review"], truncation=True, max_length=320)
    item["labels"] = [1 if sent == 'positive' else 0 for sent in example["sentiment"]]
    return item

# `dataset` is assumed to hold the raw splits with `review`/`sentiment` columns
tokenised_data = dataset.map(process_data, batched=True)
tokenised_data = tokenised_data.remove_columns(["review", "sentiment"])
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```
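Skipping padding at tokenisation time and letting `DataCollatorWithPadding` pad each batch to its longest member keeps short batches cheap; the `max_length=320` truncation only bounds the worst case.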
#### Training Hyperparameters
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

The evaluation function passed to the `Trainer`:
```
import evaluate
import numpy as np

def compute_metrics(eval_pred):
    # all metrics are predefined in the HF `evaluate` package
    precision_metric = evaluate.load("precision")
    recall_metric = evaluate.load("recall")
    f1_metric = evaluate.load("f1")
    accuracy_metric = evaluate.load("accuracy")

    # eval_pred is the tuple of predictions and labels returned by the model
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    precision = precision_metric.compute(predictions=predictions, references=labels)["precision"]
    recall = recall_metric.compute(predictions=predictions, references=labels)["recall"]
    f1 = f1_metric.compute(predictions=predictions, references=labels)["f1"]
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]

    # the Trainer expects a dictionary mapping metric names to scores
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}
```
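One design note: `evaluate.load` is called inside `compute_metrics`, so the four metrics are re-loaded on every evaluation pass; hoisting the `load` calls to module level would avoid that small overhead without changing the results.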
### Testing Data, Factors & Metrics
#### Hardware

- CPU: 13th Gen Intel(R) Core(TM) i9-13900K
- GPU: Nvidia RTX 4090 / 24 GB
- Memory: 64 GB
#### Software

- python 3.11.6
- transformers 4.36.2
- torch 2.1.2
- peft 0.7.1
- numpy 1.26.2
- datasets 2.16.0

## Citation [optional]