	---
	license: apache-2.0
	datasets:
	- imdb
	language:
	- en
	metrics:
	- f1
	- accuracy
	- recall
	- precision
	library_name: peft
	---
	# A Finetuned Bloom 1b1 Model for Sequence Classification

	<!-- Provide a quick summary of what the model is/does. -->

	This model was developed as a personal learning exercise in fine-tuning a pretrained language model for text classification
	and applying it to real-world data from the internet for sentiment analysis.

	This model card was generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

	## Model Details

	The model achieved the following scores on the evaluation set during fine-tuning:

	![Screenshot 2024-01-03 at 16.08.46.png](https://cdn-uploads.huggingface.co/production/uploads/64857e2b745fb671250a5beb/26EB2jJDKI0gsnvjHA9WP.png)

	The data is split into train, test and eval sets as follows:

	```
	DatasetDict({
	    train: Dataset({
	        features: ['review', 'sentiment'],
	        num_rows: 36000
	    })
	    test: Dataset({
	        features: ['review', 'sentiment'],
	        num_rows: 5000
	    })
	    eval: Dataset({
	        features: ['review', 'sentiment'],
	        num_rows: 9000
	    })
	})
	```
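
The three row counts add up to 50,000 reviews, which is a 72% / 10% / 18% train/test/eval split. This card does not say how the split was produced, so the sketch below only shows one hypothetical way to obtain index sets of those sizes with a seeded shuffle. The helper name `split_indices` and the seed are assumptions for illustration.

```python
import random

# Row counts taken from the DatasetDict above
N_TRAIN, N_TEST, N_EVAL = 36_000, 5_000, 9_000
N_TOTAL = N_TRAIN + N_TEST + N_EVAL  # 50,000 reviews in total


def split_indices(n_total, n_test, n_eval, seed=42):
    """Shuffle row indices and cut them into train/test/eval lists.

    Hypothetical helper: the seed and the shuffling strategy are
    assumptions, not the procedure used for this model.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    test = indices[:n_test]
    eval_ = indices[n_test:n_test + n_eval]
    train = indices[n_test + n_eval:]
    return train, test, eval_


train_idx, test_idx, eval_idx = split_indices(N_TOTAL, N_TEST, N_EVAL)
fractions = {
    "train": len(train_idx) / N_TOTAL,
    "test": len(test_idx) / N_TOTAL,
    "eval": len(eval_idx) / N_TOTAL,
}
print(fractions)
```

Fixing the seed keeps the three sets disjoint and reproducible across runs.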

	### Model Description

	<!-- Provide a longer summary of what this model is. -->



	- Developed by: Snoop088
	- Model type: Text Classification / Sequence Classification
	- Language(s) (NLP): English
	- License: Apache 2.0
	- Finetuned from model: bigscience/bloom-1b1

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: https://huggingface.co/snoop088/imdb_tuned-bloom1b1-sentiment-classifier/tree/main
	- Paper [optional]: [More Information Needed]
	- Demo [optional]: [More Information Needed]

	## Uses

	The model is intended to be used for Text Classification.

	### Direct Use

	Example script for using the model. Note that this is a PEFT adapter on top of the Bloom 1b1 model:

	```
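	import numpy as np
	import pandas as pd
	import torch
	from transformers import AutoModelForSequenceClassification, AutoTokenizer

	DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
	model_name = 'snoop088/imdb_tuned-bloom1b1-sentiment-classifier'
	# The repository holds a PEFT adapter, so the `peft` package must be installed
	loaded_model = AutoModelForSequenceClassification.from_pretrained(model_name,
	                                                                  trust_remote_code=True,
	                                                                  num_labels=2,
	                                                                  device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
	tokenizer.pad_token = tokenizer.eos_token

	# CSV file with a "review" column holding the texts to classify
	my_set = pd.read_csv("./data/df_manual.csv")

	inputs = tokenizer(list(my_set["review"]), truncation=True, padding="max_length", max_length=256, return_tensors="pt").to(DEVICE)
	with torch.no_grad():  # inference only, no gradients needed
	    outputs = loaded_model(**inputs)
	# Class index per review: 1 = positive, 0 = negative
	outcome = np.argmax(outputs.logits.cpu().numpy(), axis=-1)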

	```
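
The script above yields one class index per review. As a minimal sketch, the raw logits can also be turned into readable labels with a confidence value. The label mapping (1 = positive, 0 = negative) follows the preprocessing step of this card; the helper names and the example logits are hypothetical.

```python
import math

# Label mapping used during preprocessing: 1 = positive, 0 = negative
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_sentiment(logit_rows):
    """Turn rows of two logits into (label, probability) pairs."""
    results = []
    for row in logit_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((ID2LABEL[best], probs[best]))
    return results


# Hypothetical logits, e.g. obtained from `outputs.logits.tolist()`
example_logits = [[-1.2, 2.3], [0.8, -0.4]]
for label, prob in to_sentiment(example_logits):
    print(f"{label}: {prob:.3f}")
```

The probability is only a rough confidence signal; it has not been calibrated against held-out data.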


	### Downstream Use [optional]

	The model is intended for sentiment analysis on datasets similar to the IMDB one. In my opinion, it should also work well on product reviews.



	### Out-of-Scope Use

	<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

	[More Information Needed]

	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	[More Information Needed]

	### Recommendations

	<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

	Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

	## How to Get Started with the Model

	Use the code below to get started with the model.

	[More Information Needed]

	## Training Details

	### Training Data

	<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	Training is done on the IMDB dataset available on the Hub:

	[imdb](https://huggingface.co/datasets/imdb)


	### Training Procedure

	```
	from peft import LoraConfig, get_peft_model
	from transformers import TrainingArguments

	training_arguments = TrainingArguments(
	    output_dir="your_tuned_model_name",
	    save_strategy="epoch",
	    per_device_train_batch_size=4,
	    per_device_eval_batch_size=4,
	    gradient_accumulation_steps=4,
	    optim="adamw_torch",
	    evaluation_strategy="steps",
	    logging_steps=5,
	    learning_rate=1e-5,
	    max_grad_norm=0.3,
	    eval_steps=0.2,
	    num_train_epochs=2,
	    warmup_ratio=0.1,
	    # group_by_length=True,
	    fp16=False,
	    weight_decay=0.001,
	    lr_scheduler_type="constant",
	)

	# `model` is the base model (bigscience/bloom-1b1) loaded for sequence classification
	peft_model = get_peft_model(model, LoraConfig(
	    task_type="SEQ_CLS",
	    r=16,
	    lora_alpha=16,
	    target_modules=[
	        'query_key_value',
	        'dense'
	    ],
	    bias="none",
	    lora_dropout=0.05,  # Conventional
	))

	```
	LoRA results in: `trainable params: 3,542,016 || all params: 1,068,859,392 || trainable%: 0.3313827830405592`
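
The reported number of trainable parameters can be reproduced by hand. The sketch below assumes the published bloom-1b1 configuration (hidden size 1536, 24 transformer layers), which is not stated in this card. It also assumes that `'dense'` matches only the attention output projection in each layer and that the two-class classification head stays trainable under `task_type="SEQ_CLS"`.

```python
# Assumptions (not stated in this card): bloom-1b1 configuration values
HIDDEN = 1536      # hidden size
N_LAYERS = 24      # transformer blocks
NUM_LABELS = 2     # binary sentiment
R = 16             # LoRA rank, from the LoraConfig above


def lora_params(in_features, out_features, r):
    """A LoRA pair adds matrix A (r x in) and matrix B (out x r)."""
    return r * (in_features + out_features)


# query_key_value projects HIDDEN -> 3 * HIDDEN, dense projects HIDDEN -> HIDDEN
per_layer = lora_params(HIDDEN, 3 * HIDDEN, R) + lora_params(HIDDEN, HIDDEN, R)
lora_total = per_layer * N_LAYERS

# Classification head, assumed to be a bias-free Linear(HIDDEN, NUM_LABELS)
score_head = HIDDEN * NUM_LABELS

trainable = lora_total + score_head
all_params = 1_068_859_392  # total reported above
print(f"trainable params: {trainable:,}")  # -> trainable params: 3,542,016
print(f"trainable%: {100 * trainable / all_params:.4f}")
```

Under these assumptions the LoRA matrices account for 3,538,944 parameters and the classification head for the remaining 3,072.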

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	#### Preprocessing [optional]

	Simple preprocessing with DataCollator:

	```
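	from transformers import DataCollatorWithPadding

	def process_data(example):
	    # Called in batched mode: example["review"] and example["sentiment"] are lists
	    item = tokenizer(example["review"], truncation=True, max_length=320)  # no padding here, the collator pads each batch dynamically
	    item["labels"] = [1 if sent == 'positive' else 0 for sent in example["sentiment"]]
	    return item

	# Assumption: `dataset` is the DatasetDict shown above; this mapping step was missing from the original snippet
	tokenised_data = dataset.map(process_data, batched=True)
	tokenised_data = tokenised_data.remove_columns(["review", "sentiment"])
	data_collator = DataCollatorWithPadding(tokenizer=tokenizer)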
	```
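
Because the tokenizer call truncates but does not pad, padding happens per batch in the collator. The plain-Python sketch below illustrates that idea only and is not the library's implementation. The pad id `0` and right-side padding are assumptions for illustration; the real tokenizer may use a different id and padding side.

```python
PAD_ID = 0  # hypothetical pad token id, for illustration only


def pad_batch(sequences, pad_id=PAD_ID):
    """Pad every sequence to the length of the longest one in the batch."""
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two batches with different maximum lengths
batch_a = pad_batch([[5, 6, 7], [8, 9]])
batch_b = pad_batch([[1, 2, 3, 4, 5, 6], [7], [8, 9, 10]])
print(len(batch_a["input_ids"][0]), len(batch_b["input_ids"][0]))  # -> 3 6
```

Each batch is only as wide as its longest review, which avoids padding short reviews up to the 320-token limit.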


	#### Training Hyperparameters

	- Training regime: [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
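
The `TrainingArguments` listed under Training Procedure imply a training schedule that can be worked out with simple arithmetic. The sketch below assumes a single GPU and follows the `TrainingArguments` documentation, which states that an `eval_steps` value below 1 is read as a fraction of the total training steps. The exact rounding inside `Trainer` is an assumption here.

```python
import math

# Values from the TrainingArguments and the train split of this card
TRAIN_ROWS = 36_000
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
EPOCHS = 2
EVAL_STEPS_RATIO = 0.2
N_DEVICES = 1  # assumption: training ran on a single GPU

# Samples that contribute to one optimizer update
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * N_DEVICES

# Forward/backward passes per epoch, then optimizer updates per epoch
batches_per_epoch = math.ceil(TRAIN_ROWS / (PER_DEVICE_BATCH * N_DEVICES))
updates_per_epoch = max(batches_per_epoch // GRAD_ACCUM, 1)
total_updates = updates_per_epoch * EPOCHS

# A ratio below 1 is interpreted as a fraction of all optimizer updates
eval_every = math.ceil(total_updates * EVAL_STEPS_RATIO)

print(f"effective batch size: {effective_batch}")
print(f"optimizer updates: {total_updates}")
print(f"evaluation every {eval_every} updates")
```

Under these assumptions the model sees 16 reviews per update and is evaluated five times over the two epochs.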

	#### Speeds, Sizes, Times [optional]

	<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

	[More Information Needed]

	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->
	Evaluation function:
	```
	import evaluate
	import numpy as np

	def compute_metrics(eval_pred):
	    # All metrics are already predefined in the HF `evaluate` package
	    precision_metric = evaluate.load("precision")
	    recall_metric = evaluate.load("recall")
	    f1_metric = evaluate.load("f1")
	    accuracy_metric = evaluate.load("accuracy")

	    logits, labels = eval_pred  # eval_pred is the tuple of predictions and labels returned by the model
	    predictions = np.argmax(logits, axis=-1)
	    precision = precision_metric.compute(predictions=predictions, references=labels)["precision"]
	    recall = recall_metric.compute(predictions=predictions, references=labels)["recall"]
	    f1 = f1_metric.compute(predictions=predictions, references=labels)["f1"]
	    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
	    # The trainer expects a dictionary where the keys are the metric names and the values are the scores.
	    return {"precision": precision, "recall": recall, "f1-score": f1, 'accuracy': accuracy}

	```
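
To make the four reported scores concrete, here is a dependency-free sketch of the same metrics for binary labels. It assumes class 1 (positive) is treated as the positive class, and the example predictions are made up for illustration. It is not a replacement for the `evaluate` metrics used above.

```python
def binary_metrics(predictions, references):
    """Precision, recall, F1 and accuracy with class 1 as the positive class."""
    pairs = list(zip(predictions, references))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}


# Made-up example: 2 true positives, 1 false positive, 1 false negative, 1 true negative
scores = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(scores)
```

In this example precision, recall and F1 all come out at two thirds, while accuracy is 0.6.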

	### Testing Data, Factors & Metrics

	#### Testing Data

	<!-- This should link to a Dataset Card if possible. -->

	[More Information Needed]

	#### Factors

	<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

	[More Information Needed]

	#### Metrics

	<!-- These are the evaluation metrics being used, ideally with a description of why. -->

	[More Information Needed]

	### Results

	[More Information Needed]

	#### Summary



	## Model Examination [optional]

	<!-- Relevant interpretability work for the model goes here -->

	[More Information Needed]

	## Environmental Impact

	<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: [More Information Needed]
	- Hours used: [More Information Needed]
	- Cloud Provider: [More Information Needed]
	- Compute Region: [More Information Needed]
	- Carbon Emitted: [More Information Needed]

	## Technical Specifications [optional]

	### Model Architecture and Objective

	[More Information Needed]

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	- CPU: 13th Gen Intel(R) Core(TM) i9-13900K (model 6.183.1)
	- GPU: Nvidia RTX 4090, 24 GB
	- Memory: 64 GB

	#### Software

	- python 3.11.6
	- transformers 4.36.2
	- torch 2.1.2
	- peft 0.7.1
	- numpy 1.26.2
	- datasets 2.16.0

	## Citation [optional]

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

	[More Information Needed]

	## More Information [optional]

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]