Junlin Zhou

jlzhou

edwardzjl

AI & ML interests

None yet

Recent Activity

upvoted a paper 9 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

upvoted a paper 10 days ago

Better & Faster Large Language Models via Multi-token Prediction

upvoted a paper 10 days ago

Training Large Language Models to Reason in a Continuous Latent Space

Articles

Distributed SFT with trl and DeepSpeed Part 1: Starting Locally

Organizations

jlzhou's activity

New activity in BAAI/Infinity-Instruct 18 days ago

Loading dataset_7M with load_dataset fails with KeyError: 'tags'

#28 opened 29 days ago by

New activity in tablegpt/TableGPT2-7B 2 months ago

[DOC]: Guidance for complex use cases

#9 opened 2 months ago by

the 72b model

#8 opened 2 months ago by

New activity in tablegpt/TableGPT2-7B 3 months ago

eval

#5 opened 3 months ago by

docs: clarify tabular data input

#3 opened 3 months ago by

[Doc] Remove redundant comments

#4 opened 3 months ago by

[Doc] Add the use of the tables in the QuickStart

#2 opened 3 months ago by

[Doc] Add Quick Start and Deployment

#1 opened 3 months ago by

New activity in mistralai/Ministral-8B-Instruct-2410 3 months ago

Looks like it is not as good as Qwen2.5 7B

#5 opened 4 months ago by MonolithFoundation

New activity in Qwen/Qwen2-72B-Instruct-GPTQ-Int4 6 months ago

Update license?

#5 opened 6 months ago by

New activity in cognitivecomputations/samantha-1.1-westlake-7b-laser 8 months ago

Base model

#2 opened 8 months ago by

New activity in bigcode/the-stack-v2-train-full-ids 8 months ago

How to download the dataset in bulk?

#7 opened 10 months ago by

What is the difference between the-stack-v2-train-full-ids and the-stack-v2-dedup?

#2 opened 11 months ago by

New activity in m-a-p/Matrix 9 months ago

Actual dataset size?

#4 opened 9 months ago by

New activity in cognitivecomputations/dolphin-2.9-llama3-70b 9 months ago

Instruct version please

#5 opened 9 months ago by

New activity in llava-hf/llava-1.5-13b-hf 10 months ago

What does low_cpu_mem_usage do?

#8 opened 10 months ago by

New activity in TheBloke/deepseek-coder-33B-instruct-AWQ 10 months ago

Problem Running Model

#3 opened about 1 year ago by

New activity in m-a-p/OpenCodeInterpreter-DS-6.7B 10 months ago

It seems that this model sometimes ignores user instruction

#12 opened 10 months ago by

New activity in tiiuae/falcon-180B 11 months ago

Start an API for falcon-180B

#22 opened over 1 year ago by

New activity in m-a-p/OpenCodeInterpreter-DS-6.7B 11 months ago

Please create a Google Gemma-7b (8.5b) based version

#4 opened 12 months ago by