conversion to HF

#1
by ehartford - opened
Unofficial Mistral Community org

I cannot find how to convert this to HF, @v2ray can you please show me the way?

Unofficial Mistral Community org

I'm aware of the script.

How to use it to convert 8x22b is far from self-evident.

Unofficial Mistral Community org
edited May 25

@ehartford https://huggingface.co/v2ray/Mixtral-8x22B-v0.1/blob/main/convert.py

python convert.py --input-dir /path/to/original --model-size 22B --output-dir /path/to/save

Unofficial Mistral Community org

Thanks!

Unofficial Mistral Community org

I will do this immediately

Unofficial Mistral Community org
max_position_embeddings = params["max_seq_len"]
                             ~~~~~~^^^^^^^^^^^^^^^

It wants "max_seq_len"

I see there isn't one in params.json

{
    "dim": 6144,
    "n_layers": 56,
    "head_dim": 128,
    "hidden_dim": 16384,
    "n_heads": 48,
    "n_kv_heads": 8,
    "norm_eps": 1e-05,
    "vocab_size": 32768,
    "rope_theta": 1000000.0,
    "moe": {
        "num_experts": 8,
        "num_experts_per_tok": 2
    }
}

I will try setting it to 32768
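
For anyone hitting the same missing key, here is a minimal sketch of the workaround described above: read params.json and fill in "max_seq_len" only when it is absent. The helper name `load_params` and the constant `DEFAULT_MAX_SEQ_LEN` are hypothetical and not part of convert.py, and 32768 is just the value tried in this thread, so check the right context length for your checkpoint before relying on it.

```python
import json
from pathlib import Path

# Assumption: 32768 is the value tried in this thread, not a confirmed
# context length for the model. Adjust it if you know the real value.
DEFAULT_MAX_SEQ_LEN = 32768


def load_params(input_dir, default_max_seq_len=DEFAULT_MAX_SEQ_LEN):
    """Read params.json and add max_seq_len only if the key is missing."""
    params_path = Path(input_dir) / "params.json"
    with params_path.open() as f:
        params = json.load(f)
    # setdefault leaves an existing value untouched.
    params.setdefault("max_seq_len", default_max_seq_len)
    return params
```

Calling `load_params("/path/to/original")` returns the parsed dictionary, with "max_seq_len" present either from the file or from the fallback. Editing params.json by hand to add the key would have the same effect.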

Unofficial Mistral Community org

I thought it was 64k?

Unofficial Mistral Community org

Ok thank you 😊

Unofficial Mistral Community org

ok that worked, but didn't create a tokenizer

Unofficial Mistral Community org

it came with this file
tokenizer.model.v3

Unofficial Mistral Community org

and no tokenizer.config file

Unofficial Mistral Community org

ok looks like maybe I need to rename that to tokenizer.model then rerun

Unofficial Mistral Community org
edited May 25

@ehartford I just copied the tokenizer from 8x7B when I did conversion for 8x22B v0.1 since it's the same one.
Wait a minute v0.3?!

Unofficial Mistral Community org

nope that didn't do it

Unofficial Mistral Community org

oh yeah I could copy the tokenizer from mistral-7b-v0.3
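
A rough sketch of that copy step, assuming the other model's tokenizer has already been downloaded to a local folder. The function name `copy_tokenizer_files` is hypothetical, and the file list is an assumption based on the usual Hugging Face tokenizer file names. Any of those files that do not exist in the source folder are skipped.

```python
import shutil
from pathlib import Path

# Assumption: these are the usual Hugging Face tokenizer file names.
# The source repo may contain only some of them.
TOKENIZER_FILES = (
    "tokenizer.model",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
)


def copy_tokenizer_files(source_dir, output_dir, names=TOKENIZER_FILES):
    """Copy whichever tokenizer files exist in source_dir into output_dir."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        src = source_dir / name
        if src.is_file():
            shutil.copy2(src, output_dir / name)
            copied.append(name)
    # Return the names that were actually copied so the caller can check them.
    return copied
```

For example, `copy_tokenizer_files("/path/to/mistral-7b-v0.3", "/path/to/save")` would place the tokenizer files next to the converted weights. Both paths here are placeholders.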

Unofficial Mistral Community org

ok I think I got it. Uploading

Unofficial Mistral Community org

> @ehartford I just copied the tokenizer from 8x7B since it's the same one.
> Wait a minute v0.3?!

yeah - they say it's the same but with a new tokenizer

Unofficial Mistral Community org

finished uploading mistral-community/mixtral-8x22B-v0.3

ehartford changed discussion status to closed
