rope_theta and max_position_embeddings

#7
by zchenyu - opened

Are the rope_theta and max_position_embeddings values correct for this model? I noticed a discrepancy with https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf/blob/main/config.json

Instruct:

  • rope_theta: 10000
  • max_position_embeddings: 4096

Base:

  • rope_theta: 1000000
  • max_position_embeddings: 16384
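
For reference, a quick way to print what each repo currently ships (a minimal sketch using transformers' AutoConfig; it just reads config.json from the Hub for the two repo ids linked above):

```python
from transformers import AutoConfig

# Read config.json from each Hub repo and print the RoPE settings side by side.
for repo in ["codellama/CodeLlama-70b-hf", "codellama/CodeLlama-70b-Instruct-hf"]:
    cfg = AutoConfig.from_pretrained(repo)
    print(f"{repo}: rope_theta={cfg.rope_theta}, "
          f"max_position_embeddings={cfg.max_position_embeddings}")
```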

The discussion at [1] seems to suggest that the Instruct values are the correct ones. Is the base model intentionally configured differently, or is one of the two configs outdated?

[1] https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf/discussions/2
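
If the shipped config does turn out to be wrong, the values can be overridden at load time without editing the repo. A sketch, assuming (not confirmed) that the base-model settings are the intended ones:

```python
from transformers import AutoConfig, AutoModelForCausalLM

repo = "codellama/CodeLlama-70b-Instruct-hf"

# Override the shipped RoPE settings with the base-model values (assumed here).
cfg = AutoConfig.from_pretrained(repo)
cfg.rope_theta = 1_000_000.0
cfg.max_position_embeddings = 16384

model = AutoModelForCausalLM.from_pretrained(repo, config=cfg)
```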
