Update safetensors keys; update README usage
Hello!
Pull Request overview
- Update the `adapter_model.safetensors` keys such that loading can be done more conveniently on an `AutoModel`.
- Update the base model.
- Update the README with the new (simplified) inference via Sentence Transformers and Transformers.
Details
I customized some code in my local `peft` installation to update the keys of the loaded adapter, allowing me to save it with the new keys (e.g. `base_model.model.embeddings...` instead of `base_model.model.model.embeddings...`). Beyond that, the adapter is exactly the same.
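For illustration, the effect on the stored keys is roughly the following (a minimal sketch of an equivalent rewrite, not the actual patch, which lives inside my local `peft`):

```python
from safetensors.torch import load_file, save_file

# Load the adapter weights and drop the duplicated "model." segment from each key,
# e.g. "base_model.model.model.embeddings..." -> "base_model.model.embeddings..."
state_dict = load_file("adapter_model.safetensors")
renamed = {
    key.replace("base_model.model.model.", "base_model.model.", 1): tensor
    for key, tensor in state_dict.items()
}
save_file(renamed, "adapter_model.safetensors")
```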
Now you can apply `PeftModel.from_pretrained` directly over an `AutoModel`, rather than only over a `torch.nn.Module` with a `model` attribute pointing to the `AutoModel`.
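For example, loading directly onto a plain `AutoModel` now works (a minimal sketch; `revision="refs/pr/2"` points at this PR, just like in the snippets below):

```python
from peft import PeftModel
from transformers import AutoModel

# Load the base model and apply the adapter directly, without any wrapper module
base_model = AutoModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")
model = PeftModel.from_pretrained(base_model, "vahidthegreat/StanceAware-SBERT", revision="refs/pr/2")
```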
Additionally, the model now works with Sentence Transformers, if you find that convenient:
```python
from sentence_transformers import SentenceTransformer
from peft import PeftModel

model = SentenceTransformer("all-mpnet-base-v2")
model[0].auto_model = PeftModel.from_pretrained(
    model[0].auto_model,
    "vahidthegreat/StanceAware-SBERT",
    revision="refs/pr/2",
)

sentences = [
    "I love pineapple on pizza",
    "I hate pineapple on pizza",
    "I like pineapple on pizza",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
similarity = model.similarity(embeddings, embeddings)
print(similarity)
# tensor([[1.0000, 0.5732, 0.9713],
#         [0.5732, 1.0000, 0.5804],
#         [0.9713, 0.5804, 1.0000]])
# I.e. the first and third sentences are very similar, while the second is less similar to the other two.
```
Note! This script uses `revision` to point to this PR. In other words, you can run this script right now, before merging, to verify that the performance is identical. Here is the full Transformers-based script from the README, with the `revision` argument as well:
```python
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseNetworkMPNet(nn.Module):
    def __init__(self, model_name, tokenizer, normalize=True):
        super(SiameseNetworkMPNet, self).__init__()
        self.model = AutoModel.from_pretrained(model_name)
        self.normalize = normalize
        self.tokenizer = tokenizer

    def apply_lora_weights(self, lora_model):
        self.model = PeftModel.from_pretrained(self.model, lora_model, revision="refs/pr/2")
        self.model = self.model.merge_and_unload()
        return self

    def forward(self, **inputs):
        model_output = self.model(**inputs)
        attention_mask = inputs["attention_mask"]
        last_hidden_states = model_output.last_hidden_state  # First element of model_output contains all token embeddings
        embeddings = torch.sum(last_hidden_states * attention_mask.unsqueeze(-1), 1) / torch.clamp(attention_mask.sum(1, keepdim=True), min=1e-9)  # Mean pooling
        if self.normalize:
            embeddings = F.layer_norm(embeddings, embeddings.shape[1:])
            embeddings = F.normalize(embeddings, p=2, dim=1)
        return embeddings


base_model_name = "sentence-transformers/all-mpnet-base-v2"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load the base model
base_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)

# Load and apply LoRA weights
lora_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)
lora_model.apply_lora_weights("vahidthegreat/StanceAware-SBERT")

from sklearn.metrics.pairwise import cosine_similarity


def two_sentence_similarity(model, tokenizer, text1, text2):
    # Tokenize both texts
    tokens1 = tokenizer(text1, return_tensors="pt", max_length=128, truncation=True, padding="max_length")
    tokens2 = tokenizer(text2, return_tensors="pt", max_length=128, truncation=True, padding="max_length")

    # Generate embeddings
    embeddings1 = model(**tokens1).detach().cpu().numpy()
    embeddings2 = model(**tokens2).detach().cpu().numpy()

    # Compute cosine similarity
    similarity = cosine_similarity(embeddings1, embeddings2)
    print(f"Cosine Similarity: {similarity[0][0]}")
    return similarity[0][0]


# Example sentences
text1 = "I love pineapple on pizza"
text2 = "I hate pineapple on pizza"

print(f"For Base Model sentences: '{text1}' and '{text2}'")
two_sentence_similarity(base_model, tokenizer, text1, text2)

print(f"\nFor FineTuned Model sentences: '{text1}' and '{text2}'")
two_sentence_similarity(lora_model, tokenizer, text1, text2)

print("\n\n")

# Example sentences
text1 = "I love pineapple on pizza"
text2 = "I like pineapple on pizza"

print(f"For Base Model sentences: '{text1}' and '{text2}'")
two_sentence_similarity(base_model, tokenizer, text1, text2)

print(f"\n\nFor FineTuned Model sentences: '{text1}' and '{text2}'")
two_sentence_similarity(lora_model, tokenizer, text1, text2)
```
This should solve #1 and simplify the usage a bit. Let me know if you have any concerns or questions!
- Tom Aarsen
Hmm, I think this banner is because of the PEFT snippet that it tries to generate here: https://huggingface.co/vahidthegreat/StanceAware-SBERT?library=peft
I'm not sure what the valid options are. A fix is to just remove the `library_name: peft` line from the README metadata, and then it should be gone.
- Tom Aarsen
I removed the `library_name: peft` for now.
Thanks a lot for all the edits. Amazing help!!!
I'm new to this environment, so I'm still figuring things out, and your input is really insightful to me.