Discrepancy between model card and tokenizer

#2
by TuringsSolutions - opened

The pipeline method works fine for these models. Trying to load the models directly, though, throws a tokenizer error. I'm super eager to try these models out, so I'm reporting this issue right away!
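(For reference, by "the pipeline method" I mean the stock snippet from the model card, roughly like this; the exact task and model string here are my assumption, not copied from the card:)

from transformers import pipeline

# Hedged sketch of the model-card snippet: load the model through the
# high-level pipeline API instead of AutoModel/AutoTokenizer directly.
pipe = pipeline("text-generation", model="relaxml/Llama-2-13b-chat-E8P-2Bit")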

RelaxML org

Can you give a command to reproduce this error? Thanks

I just ran this:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("relaxml/Llama-2-13b-chat-E8P-2Bit")
model = AutoModelForCausalLM.from_pretrained("relaxml/Llama-2-13b-chat-E8P-2Bit")

I get this error:

OSError Traceback (most recent call last)
in <cell line: 4>()
2 from transformers import AutoTokenizer, AutoModelForCausalLM
3
----> 4 tokenizer = AutoTokenizer.from_pretrained("relaxml/Openhermes-7b-E8P-2Bit")
5 model = AutoModelForCausalLM.from_pretrained("relaxml/Openhermes-7b-E8P-2Bit")

/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, *init_inputs, **kwargs)
2006
2007 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()):
-> 2008 raise EnvironmentError(
2009 f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
2010 "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "

OSError: Can't load tokenizer for 'relaxml/Openhermes-7b-E8P-2Bit'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'relaxml/Openhermes-7b-E8P-2Bit' is the correct path to a directory containing all relevant files for a LlamaTokenizerFast tokenizer.

RelaxML org

The tokenizer has to be loaded from the original model's tokenizer (e.g., meta-llama/Llama-2-13b-chat-hf). We're in the process of fixing this by uploading tokenizer files to each folder, but in the meantime you can get the original model string from the model config, under the _name_or_path entry. You can see how we do this in our model_from_hf_path function here: https://github.com/Cornell-RelaxML/quip-sharp/blob/main/lib/utils/unsafe_import.py.
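In other words, something like this should work as a stopgap (a minimal sketch, assuming the quantized repo's config.json stores the original model string under "_name_or_path"; note the meta-llama repos are gated, so you need access to them):

import json
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

quantized = "relaxml/Llama-2-13b-chat-E8P-2Bit"

# Fetch the quantized repo's config.json and read the original model
# string stored under "_name_or_path".
config_path = hf_hub_download(repo_id=quantized, filename="config.json")
with open(config_path) as f:
    original = json.load(f)["_name_or_path"]  # e.g. meta-llama/Llama-2-13b-chat-hf

# Load the tokenizer from the original model instead of the quantized repo.
tokenizer = AutoTokenizer.from_pretrained(original)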

TuringsSolutions changed discussion status to closed
