Mayank Mishra

mayank-mishra

AI & ML interests

Large Language Models, Distributed Training and Inference

mayank-mishra's activity

New activity in cfahlgren1/model-release-heatmap about 2 months ago

Add IBM

3
#5 opened about 2 months ago by mayank-mishra
New activity in ibm-granite/granite-8b-code-instruct-128k about 2 months ago

Fix: link to 128k paper

1
#1 opened about 2 months ago by timrbula
New activity in meta-llama/Meta-Llama-3.1-405B about 2 months ago

405B or 410B ?

2
#8 opened about 2 months ago by alielfilali01
New activity in ibm-granite/granite-3b-code-instruct-2k 2 months ago
New activity in ibm-granite/granite-8b-code-instruct-4k 4 months ago

Input context length

3
#6 opened 4 months ago by dyoung

Official quants?

3
#2 opened 4 months ago by joshuaturner
New activity in ibm-granite/granite-3b-code-base-2k 4 months ago

Release GGUF models?

3
#5 opened 4 months ago by CosmicSound
New activity in ibm-granite/granite-3b-code-base-2k 4 months ago

Licensing

6
#4 opened 4 months ago by tonylek
New activity in ibm-granite/granite-34b-code-base-8k 4 months ago
New activity in ibm-granite/granite-8b-code-instruct-4k 4 months ago
New activity in ibm-granite/granite-3b-code-base-2k 4 months ago

Why was the logo removed?

1
#6 opened 4 months ago by mrfakename
New activity in ibm-granite/granite-8b-code-instruct-4k 4 months ago

Model template

3
#1 opened 4 months ago by alex0dd
New activity in ibm-granite/granite-3b-code-base-2k 4 months ago

Context length

5
#3 opened 5 months ago by mrfakename
New activity in ibm-granite/granite-3b-code-base-2k 5 months ago

Question

3
#2 opened 5 months ago by mrfakename
New activity in ibm-granite/granite-3b-code-base-2k 5 months ago

Initial model card version

#1 opened 5 months ago by amezasor
Commented on 12 papers 5 months ago
New activity in blog-explorers/README 6 months ago

[Support] Community Articles

54
#5 opened 6 months ago by victor
New activity in ibm/MoLFormer-XL-both-10pct 6 months ago
New activity in aurora-m/aurora-m-biden-harris-redteamed 6 months ago

Update README.md

1
#1 opened 7 months ago by cabbage972
New activity in tiiuae/falcon-180B 11 months ago

Is Gigatron open source?

#6 opened about 1 year ago by mayank-mishra
New activity in tiiuae/falcon-180B about 1 year ago
New activity in mayank-mishra/starcoder-GPTQ-4bit-128g over 1 year ago
New activity in mosaicml/mpt-7b over 1 year ago
New activity in mayank-mishra/starcoderbase-GPTQ-8bit-128g over 1 year ago

Running this on consumer hardware

2
#1 opened over 1 year ago by piratos
New activity in bigcode/starcoder over 1 year ago

What are 0..7.bin?

2
#14 opened over 1 year ago by lozhnikov
New activity in bigcode/starcoderbase over 1 year ago

KeyError: 'gpt_bigcode'

1
#4 opened over 1 year ago by Bilibili
New activity in bigcode/gpt_bigcode-santacoder over 1 year ago
New activity in bigscience/bloom almost 2 years ago

how can i train bloom

4
#111 opened about 2 years ago by s3rgio27

How much GPU memory needed?

4
#109 opened about 2 years ago by mazib
New activity in bigscience/bloom about 2 years ago