What's the decoder_start_token_id and eos_token_id used in training?

#7 by cqchangm

In large-v3 the decoder start and end-of-sequence tokens are (50258, 50257), which are ("<|startoftranscript|>", "<|endoftext|>").
In this model they are (50257, 50256), which are ("<|endoftext|>", "") according to added_tokens.json.

Was the model fine-tuned this way, i.e. with <|endoftext|> at the start? Or was it just a typo?
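
For reference, a minimal sketch of how these values can be checked with transformers (the model id below points at large-v3 for comparison; swap in this repo's id to inspect its config):

```python
from transformers import AutoConfig, AutoTokenizer

# Placeholder: large-v3 for comparison; replace with this repo's model id.
model_id = "openai/whisper-large-v3"

config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print("decoder_start_token_id:", config.decoder_start_token_id)  # 50258 for large-v3
print("eos_token_id:", config.eos_token_id)                      # 50257 for large-v3

# Map the ids back to token strings to see what the decoder actually starts and ends with.
print(tokenizer.convert_ids_to_tokens(
    [config.decoder_start_token_id, config.eos_token_id]
))
# For large-v3: ['<|startoftranscript|>', '<|endoftext|>']
```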