Update README.md

README.md (CHANGED)

@@ -17,57 +17,6 @@ Some reasons for using these checkpoints:

- You can use them as a starting point to train your own small language model.
- More interestingly, you can probe into the learning process of these models to understand how LLMs learn to mimic humans.

(Removed here: the "# Evaluation results" section. It is moved further down in the file; see the second hunk below, where it is re-added after the pre-training notes.)

# How to use these checkpoints

These checkpoints are compatible with [litgpt](https://github.com/Lightning-AI/litgpt) with slight modifications (see section below).

@@ -182,5 +131,54 @@ litgpt pretrain \

You will lose the index into the training dataset as well as other hyperparameters, such as the learning rate, but this allows you to start your pre-training quickly.

# Evaluation results

**Note**: this does not represent the final performance of the model and should only serve as a reference for my training progress.

```
checkpoint: step-00088000

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------|------:|------|-----:|--------|-----:|---|-----:|
|piqa | 1|none | 0|acc |0.6202|± |0.0113|
| | |none | 0|acc_norm|0.6213|± |0.0113|
|boolq | 2|none | 0|acc |0.5875|± |0.0086|
|arc_challenge| 1|none | 0|acc |0.1980|± |0.0116|
| | |none | 0|acc_norm|0.2201|± |0.0121|
|arc_easy | 1|none | 0|acc |0.4373|± |0.0102|
| | |none | 0|acc_norm|0.3935|± |0.0100|
|winogrande | 1|none | 0|acc |0.5004|± |0.0141|
|openbookqa | 1|none | 0|acc |0.1760|± |0.0170|
| | |none | 0|acc_norm|0.2680|± |0.0198|
|hellaswag | 1|none | 0|acc |0.2893|± |0.0045|
| | |none | 0|acc_norm|0.3125|± |0.0046|
```

You can use the following script to reproduce the results (assuming you have installed litgpt):

```
MODEL_NAME="step-00088000"
MODEL_OUTPUT_ROOT="MicroLlamaV2-VastAI-Checkpoints/out/pretrain/micro-llama-v2"
MODEL_OUTPUT_REL="${MODEL_OUTPUT_ROOT}/${MODEL_NAME}"

# HuggingFace
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/lit_model.pth --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/generation_config.json --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/hyperparameters.yaml --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/model_config.yaml --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/tokenizer.json --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/
huggingface-cli download keeeeenw/MicroLlama2-checkpoints ${MODEL_NAME}/tokenizer_config.json --local-dir checkpoints/${MODEL_OUTPUT_ROOT}/

# Copy config, see "caveat" below
cp -r <local_path>/config.json checkpoints/${MODEL_OUTPUT_REL}/

# AWS
# aws s3 cp s3://microllama-v2/checkpoints/out/pretrain/micro-llama-v2/${MODEL_NAME} checkpoints/${MODEL_OUTPUT_REL} --recursive

litgpt evaluate \
    ${MODEL_OUTPUT_REL} \
    --tasks "hellaswag,openbookqa,winogrande,arc_easy,arc_challenge,boolq,piqa" \
    --device cuda:0 \
    --batch_size 16
```

**Caveat**: for some reason, the auto-generated config.json for the model in this checkpoint is incorrect; you will need to replace it with https://huggingface.co/keeeeenw/MicroLlama2-checkpoints/blob/main/config.json to resolve the evaluation error.

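For the "Copy config" step, a minimal sketch of fetching the corrected config.json directly from the Hub instead of from a local copy (this assumes the linked config.json still sits at the root of keeeeenw/MicroLlama2-checkpoints and that MODEL_OUTPUT_REL is set as in the script above; it is not part of the original instructions):

```
# Download the fixed config.json from the repo root into a scratch directory,
# then drop it into the checkpoint directory used for evaluation.
huggingface-cli download keeeeenw/MicroLlama2-checkpoints config.json --local-dir /tmp/microllama2-config
cp /tmp/microllama2-config/config.json checkpoints/${MODEL_OUTPUT_REL}/
```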