## Jam-Contextsum

Jam-Contextsum is a GPT-2-like model finetuned to generate summaries of why a method exists.

## Jam-Contextsum Training Details

- ckpt_pretrain.pt is the checkpoint that we finetune to generate summaries of why a method exists.
- Our [GitHub repo](https://github.com/apcl-research/jam-contextsum) contains the code for reproducing our results with the same [data](https://huggingface.co/datasets/apcl/jam_contextsum).

## ckpt_pretrain.pt

| Hyperparameter | Description | Value |
| -------------- | ----------- | ----- |
| e | embedding dimensions | 512 |
| L | number of layers | 4 |
| h | attention heads | 4 |
| c | block size / context length | 1,024 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| d | dropout | 0.20 |
| r | learning rate | 3e-5 |
| y | iterations | 1e-5 |
| iter | number of iterations after pretraining | 137,900 |
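As a rough illustration of how the hyperparameters above map onto a nanoGPT-style model configuration, the sketch below builds a model with those values and loads `ckpt_pretrain.pt` to resume finetuning. The `GPTConfig`/`GPT` classes, the `"model"` checkpoint key, and the file path are assumptions modeled on typical nanoGPT-derived repos, not this repo's verified API; see the GitHub repo for the actual training script.

```python
# Hypothetical sketch of resuming finetuning from ckpt_pretrain.pt.
# GPTConfig/GPT, the checkpoint layout, and the paths are assumptions
# based on nanoGPT-style code, not the repo's confirmed interface.
import torch

from model import GPT, GPTConfig  # assumed nanoGPT-style model module

# Hyperparameters taken from the table above (e, L, h, c, d).
config = GPTConfig(
    n_embd=512,       # e: embedding dimensions
    n_layer=4,        # L: number of layers
    n_head=4,         # h: attention heads
    block_size=1024,  # c: block size / context length
    dropout=0.20,     # d: dropout
)

model = GPT(config)
ckpt = torch.load("ckpt_pretrain.pt", map_location="cpu")
model.load_state_dict(ckpt["model"])  # assumed checkpoint key

# Finetuning uses the table's learning rate (r = 3e-5); batch size 4
# with 32 gradient-accumulation steps corresponds to b and a above.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
```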