Added CUDA config to sample code

#6
by mahimairaja - opened

I tried to replicate the sample code on the free tiers of Kaggle, Colab, and SageMaker Studio Lab.
When running on GPU:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

And while running on CPU:

RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
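
(The CPU failure seems to come from torch_dtype="auto" loading the checkpoint's float16 weights; PyTorch's CPU LayerNorm kernel has no half-precision implementation. A minimal sketch of a CPU-only workaround, assuming float32 is acceptable for your use case:)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Force float32 on CPU: torch_dtype="auto" picks up the checkpoint's
# float16 weights, and CPU LayerNorm does not support half precision.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-1_5", trust_remote_code=True, torch_dtype=torch.float32
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)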

Changes I made:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", trust_remote_code=True, torch_dtype="auto")

+ device = torch.device("cuda:0")
+ model.to(device)

inputs = tokenizer('''```python
def print_prime(n):
   """
   Print all primes between 1 and n
-   """''', return_tensors="pt", return_attention_mask=False)
+   """''', return_tensors="pt", return_attention_mask=False).to(device)

outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
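
For completeness, here is a device-agnostic variant of the snippet (a sketch, assuming torch.cuda.is_available() is the right check for your runtime; it falls back to float32 on CPU, which also sidesteps the Half LayerNorm error):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the GPU when available; otherwise fall back to CPU in float32,
# since CPU LayerNorm does not support half precision.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-1_5", trust_remote_code=True, torch_dtype=dtype
).to(device)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)

# Move the tokenized inputs to the same device as the model.
inputs = tokenizer('''```python
def print_prime(n):
   """
   Print all primes between 1 and n
   """''', return_tensors="pt", return_attention_mask=False).to(device)

outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)

Keeping a single device variable ensures the model and the inputs always end up on the same device, which is exactly what the cuda:0 / cpu mismatch above complains about.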

I was able to resolve the GPU runtime error by adding the CUDA settings. I believe this will help fellow developers try out phi-1_5. Thanks!

Microsoft org

Hello @mahimairaja! I hope everything is going well with you.

Thanks for your feedback! We will correct this problem and mention it on the model card.

Regards,
Gustavo.

gugarosa changed pull request status to closed
