Pipeline not working

#1
by samitizerxu - opened

Hi! I'm getting the following error when I try to run the pipeline:

ValueError: Could not load model h2oai/h2ogpt-oasst1-512-20b with any of the following classes: (<class 
'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 
'transformers.models.gpt_neox.modeling_gpt_neox.GPTNeoXForCausalLM'>).

with this code:

!pip install transformers==4.28.1
!pip install accelerate==0.18.0
import torch
from transformers import pipeline

generate_text = pipeline(model="h2oai/h2ogpt-oasst1-512-20b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

res = generate_text("Why is drinking water so healthy?", max_new_tokens=3000)
print(res[0]["generated_text"])

Hi, thanks for trying out the model. Unfortunately, I was unable to reproduce the error in a fresh conda environment, e.g.:

conda create -n test
conda activate test
conda install python=3.10
pip install transformers==4.28.1
pip install accelerate==0.18.0

$ python
>>> import torch
>>> from transformers import pipeline
>>> generate_text = pipeline(model="h2oai/h2ogpt-oig-oasst1-512-6.9b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
>>> res = generate_text("Why is drinking water so healthy?")
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
/home/jon/miniconda3/envs/test/lib/python3.10/site-packages/transformers/generation/utils.py:1313: UserWarning: Using `max_length`'s default (20) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
>>> print(res[0]["generated_text"])
Why is drinking water so healthy?

Water is the most abundant substance in the world. It

However, we have also seen this error when starting the demo instance with the "pipeline" function from transformers. We had to use the other example from the model card instead, the one based on H2OTextGenerationPipeline. That should work, so please try it that way, e.g.:

import torch
# Assumes h2oai_pipeline.py from the model repository is saved next to this script.
from h2oai_pipeline import H2OTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model explicitly instead of relying on pipeline() to infer them.
tokenizer = AutoTokenizer.from_pretrained("h2oai/h2ogpt-oasst1-512-20b", padding_side="left")
model = AutoModelForCausalLM.from_pretrained("h2oai/h2ogpt-oasst1-512-20b", torch_dtype=torch.bfloat16, device_map="auto")
generate_text = H2OTextGenerationPipeline(model=model, tokenizer=tokenizer)

res = generate_text("Why is drinking water so healthy?", max_new_tokens=100)
print(res[0]["generated_text"])

Ah, I see. I ran my code on Google Colab; however, I did install the required versions of transformers and accelerate.
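
For anyone else hitting this in a notebook, it may be worth confirming which package versions are actually installed in the environment the notebook uses. The following is only a minimal sketch: the helper name check_versions is made up for illustration, and the pinned versions are simply the ones quoted in this thread. It reads the installed package metadata; it does not load the model and will not by itself explain the ValueError.

```python
from importlib import metadata

# Versions quoted earlier in this thread (assumption: these are the ones you want).
REQUIRED = {"transformers": "4.28.1", "accelerate": "0.18.0"}


def check_versions(required):
    """Return {package: (wanted, found)} for each package whose installed
    version differs from the wanted one. found is None if not installed."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_versions(REQUIRED)
    if not problems:
        print("All pinned versions match.")
    for name, (wanted, found) in problems.items():
        print(f"{name}: wanted {wanted}, found {found}")
```

If a package was already imported before running pip install, the running session keeps the old module in memory until the runtime is restarted, so a restart before re-running the pipeline code is a reasonable thing to try.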

samitizerxu changed discussion status to closed