Update max_position_embeddings
#2
by FlorianJc
If the model can really handle a 128k context size, you should set max_position_embeddings and max_length to 131072.
Otherwise, vLLM rejects any max_model_len > 8192.
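For reference, a minimal sketch of how this surfaces in vLLM (the model name here is a placeholder; vLLM derives its default context window from max_position_embeddings in config.json):

from vllm import LLM

# vLLM caps the context window at the value implied by config.json.
# With the current config (8192), asking for a longer max_model_len raises an error.
llm = LLM(model="your-org/your-model", max_model_len=131072)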
This is the 8k context version, and we use LongLM to extend the context; you can refer to it here.
Note: reusing the Llama model source code means some of the extension code is dropped, but don't worry, it still works fine when used with LongLM.
import torch
from transformers import AutoModelForCausalLM
import SelfExtend  # from the LongLM repo

# Load the 8k base model (model_id is the checkpoint name or path).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
)

# Patch the attention layers with SelfExtend (LongLM) to extend the context window.
SelfExtend.apply(
    model,
    group_size=16,
    window_size=512,
    enable_flash_attention=True,
    flash_attention_impl="flash_attn",
)

# Extended context: (8192 - window_size) * group_size + window_size = 123392 tokens.
model.generation_config.max_length = 123392
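After patching, generation beyond the original 8k window works as usual; a minimal usage sketch (the tokenizer loading and long_document prompt are assumptions, not part of the snippet above):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
inputs = tokenizer(long_document, return_tensors="pt").to(model.device)  # long_document is a placeholder
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))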