---
license: mit
pipeline_tag: text-generation
tags:
  - chemistry
---

# Chepybara-7B-Chat: Specialised LLM for Chemistry and Molecule Science

Chepybara-7B-Chat is the first open-source specialised LLM for chemistry and molecule science, built on InternLM-2.

## News

- Chepybara online demo released: https://chemllm.org/ [2024-1-18]
- Chepybara-7B-Chat ver. 1.0 open-sourced. [2024-1-17]

## Usage

Try the [online demo](https://chemllm.org/) instantly, or...

Install transformers:

```bash
pip install transformers
```

Load Chepybara-7B-Chat and run inference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

model_name_or_id = "AI4Chem/Chepybara-7B-Chat"

# Load the model in float16 on the GPU; add trust_remote_code=True if the
# checkpoint ships custom modelling code (common for InternLM-2-based models).
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_id, torch_dtype=torch.float16, device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_id)

prompt = "What is the molecule of Ibuprofen?"

# Tokenize the prompt and move the tensors to the GPU.
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sampling settings; note that top_k=1 makes sampling effectively greedy.
generation_config = GenerationConfig(
    do_sample=True,
    top_k=1,
    temperature=0.9,
    max_new_tokens=500,
    repetition_penalty=1.5,
    pad_token_id=tokenizer.eos_token_id,
)

# Generate a completion and decode it back to text.
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
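
Since this is a chat-tuned checkpoint, the question can also be wrapped in the model's chat format before generation. The following is a minimal sketch, assuming the tokenizer ships a chat template; it reuses `model`, `tokenizer`, and `generation_config` from the snippet above. If no template is defined, `apply_chat_template` will raise an error and the plain-prompt code above still applies.

```python
# Sketch only: assumes the tokenizer defines a chat template.
messages = [
    {"role": "user", "content": "What is the molecule of Ibuprofen?"},
]

# Render the conversation into the model's chat format and tokenize it,
# adding the assistant prefix so the model continues with an answer.
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("cuda")

outputs = model.generate(chat_inputs, generation_config=generation_config)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][chat_inputs.shape[-1]:], skip_special_tokens=True))
```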

## Dataset

| Section           | Dataset Link |
| ----------------- | ------------ |
| Pretrain Dataset  | ChemPile-2T  |
| SFT Dataset       | ChemData-7M  |
| Benchmark Dataset | ChemTest-12K |
| DPO Dataset       | ChemPref-10k |
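
If these corpora are released on the Hugging Face Hub, they could be loaded with the `datasets` library. The snippet below is a sketch only: the repository ID `AI4Chem/ChemData-7M` is a hypothetical name used for illustration and may not match the published dataset.

```python
from datasets import load_dataset

# Hypothetical Hub ID for the SFT corpus -- replace with the real
# repository name once the dataset is published.
sft_data = load_dataset("AI4Chem/ChemData-7M", split="train")

print(sft_data)     # number of rows and column names
print(sft_data[0])  # first supervised fine-tuning example
```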

## Acknowledgements

....

## Disclaimer

## Demo

https://chemllm.org/


## Contact

AI4Physics Science, Shanghai AI Lab ([email protected])