Di Zhang
metadata
license: mit
pipeline_tag: text-generation
tags:
  - chemistry
language:
  - en
  - zh

ChemLLM-7B-Chat: LLM for Chemistry and Molecule Science

ChemLLM-7B-Chat, the first open-source large language model for chemistry and molecular science, built on InternLM-2 with ❤

News

Usage

Try the online demo instantly, or...

Install transformers,

pip install transformers
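
The loading example in this section also imports `torch`, and `device_map="auto"` relies on the `accelerate` package, so installing `transformers` alone may not be enough. The following is a minimal preflight sketch under that assumption; the helper name `missing_packages` and the package list are illustrative, not part of the ChemLLM release.

```python
import importlib.util

# Packages the loading example relies on. `accelerate` is an assumption here:
# it is listed because `device_map="auto"` in transformers depends on it.
REQUIRED = ("transformers", "torch", "accelerate")

def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```

Running this before loading the model gives a clearer message than an import error partway through `from_pretrained`.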

Load ChemLLM-7B-Chat and run,

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

model_name_or_id = "AI4Chem/ChemLLM-7B-Chat"

model = AutoModelForCausalLM.from_pretrained(model_name_or_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_id, trust_remote_code=True)

prompt = "What is Molecule of Ibuprofen?"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

generation_config = GenerationConfig(
    do_sample=True,
    top_k=1,
    temperature=0.9,
    max_new_tokens=500,
    repetition_penalty=1.5,
    pad_token_id=tokenizer.eos_token_id
)

outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
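
One detail of the generation settings above is worth illustrating: with `top_k=1`, only the single most likely token survives filtering, so `do_sample=True` and `temperature=0.9` should have no practical effect on which token is chosen, and decoding is effectively greedy. The following is a pure-Python sketch of temperature scaling followed by top-k filtering and softmax. It is not the transformers implementation, and the function name and toy logits are hypothetical.

```python
import math

def top_k_probs(logits, k, temperature=1.0):
    """Scale logits by temperature, keep the k largest, and softmax over the survivors."""
    scaled = [x / temperature for x in logits]
    # Indices of the k highest-scoring tokens.
    keep = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    peak = max(scaled[i] for i in keep)  # subtract the max for numerical stability
    weights = {i: math.exp(scaled[i] - peak) for i in keep}
    total = sum(weights.values())
    # Tokens outside the top k get probability 0.
    return [weights.get(i, 0.0) / total for i in range(len(scaled))]

toy_logits = [2.0, 0.5, 1.2, -1.0]  # hypothetical scores for a 4-token vocabulary
print(top_k_probs(toy_logits, k=1, temperature=0.9))
print(top_k_probs(toy_logits, k=2, temperature=0.9))
```

With `k=1` all probability mass lands on the highest-scoring token regardless of temperature. For varied samples, a larger `top_k` is needed; for deterministic output, `do_sample=False` states the intent more directly.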

Dataset

| Section | Dataset | Link |
| --- | --- | --- |
| Pretrain Dataset | ChemPile-2T | |
| SFT Dataset | ChemData-7M | |
| Benchmark Dataset | ChemTest-12K | |
| DPO Dataset | ChemPref-10k | |

Results

MMLU Highlights

| Dataset | ChatGLM3-6B | Qwen-7B | LLaMA-2-7B | Mistral-7B | InternLM2-7B-Chat | ChemLLM-7B-Chat |
| --- | --- | --- | --- | --- | --- | --- |
| college chemistry | 43.0 | 39.0 | 27.0 | 40.0 | 43.0 | 47.0 |
| college mathematics | 28.0 | 33.0 | 33.0 | 30.0 | 36.0 | 41.0 |
| college physics | 32.4 | 35.3 | 25.5 | 34.3 | 41.2 | 48.0 |
| formal logic | 35.7 | 43.7 | 24.6 | 40.5 | 34.9 | 47.6 |
| moral scenarios | 26.4 | 35.0 | 24.1 | 39.9 | 38.6 | 44.3 |
| humanities average | 62.7 | 62.5 | 51.7 | 64.5 | 66.5 | 68.6 |
| stem average | 46.5 | 45.8 | 39.0 | 47.8 | 52.2 | 52.6 |
| social science average | 68.2 | 65.8 | 55.5 | 68.1 | 69.7 | 71.9 |
| other average | 60.5 | 60.3 | 51.3 | 62.4 | 63.2 | 65.2 |
| mmlu | 58.0 | 57.1 | 48.2 | 59.2 | 61.7 | 63.2 |

*(OpenCompass)
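
To make the comparison with the base model concrete, the gap between ChemLLM-7B-Chat and InternLM2-7B-Chat can be computed directly from the table above. This is a small sketch using a subset of the reported scores; the dictionary layout and function name are illustrative only.

```python
# (InternLM2-7B-Chat, ChemLLM-7B-Chat) scores copied from the MMLU table above.
SCORES = {
    "college chemistry": (43.0, 47.0),
    "college mathematics": (36.0, 41.0),
    "college physics": (41.2, 48.0),
    "formal logic": (34.9, 47.6),
    "moral scenarios": (38.6, 44.3),
    "stem average": (52.2, 52.6),
    "mmlu": (61.7, 63.2),
}

def gains(scores):
    """Return ChemLLM-7B-Chat minus InternLM2-7B-Chat per row, rounded to 0.1 point."""
    return {name: round(chem - base, 1) for name, (base, chem) in scores.items()}

# Print rows from largest to smallest gain.
for name, delta in sorted(gains(SCORES).items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} +{delta:.1f}")
```

On these rows the largest gain is in formal logic (12.7 points), while the overall MMLU score improves by 1.5 points.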


Chemical Benchmark

[Figure: chemical benchmark results] *(Scores judged by ChatGPT-4-turbo)

Professional Translation

[Figures: professional translation examples]

You can try it online.

Disclaimer

Demo

Agent Chepybara

[Figure: Agent Chepybara demo]

Contact

AI4Physics Science, Shanghai AI Lab ([email protected])