
Jam

Jam is a GPT-2-like model for research in fine-grained analysis of Java source code at the level of methods, statements, and variables. It is intended as a foundation for downstream tasks such as code completion, comment generation, and automated bug repair.
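For a rough idea of how the released weights can be loaded, here is a minimal sketch assuming a nanoGPT-style checkpoint. The file name ckpt.pt and the model.GPT / GPTConfig classes are assumptions based on the nanoGPT codebase that our training setup follows; see the Training Details below and our GitHub repository for the exact scripts.

```python
import torch
# GPT and GPTConfig are the nanoGPT-style model classes; this assumes the
# model.py from the training repository is on the Python path.
from model import GPT, GPTConfig

# Hypothetical checkpoint path; the actual file name depends on the release.
ckpt = torch.load("ckpt.pt", map_location="cpu")

# nanoGPT-style checkpoints store the model hyperparameters under "model_args".
config = GPTConfig(**ckpt["model_args"])
model = GPT(config)

# Strip the prefix added by torch.compile, if present, then load the weights.
state_dict = {k.removeprefix("_orig_mod."): v for k, v in ckpt["model"].items()}
model.load_state_dict(state_dict)
model.eval()
```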


Jam Training Details

  • We trained the Jam model using the training procedures from Daniel Grittner's NanoGPT-LoRA.

  • The dataset used to train our model is our own jm52m dataset, which consists of the processed source code of 52 million Java methods.

  • We train the model on the training set for 1 epoch, roughly 300,000 training iterations.

  • Our GitHub repo contains the code for re-training the model using the raw data.

| Hyperparameter | Description | Value |
| --- | --- | --- |
| e | embedding dimensions | 1024 |
| L | number of layers | 24 |
| h | attention heads | 16 |
| c | block size / context length | 256 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| d | dropout | 0.20 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-1 |

We train our models using a single NVIDIA A5000 GPU.
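For reference, the sketch below shows how the hyperparameters above would map onto nanoGPT-style training configuration variables. The variable names follow nanoGPT conventions and are assumptions; the authoritative configuration is in our GitHub repository.

```python
# Hypothetical nanoGPT-style training configuration mirroring the table above;
# variable names are illustrative, not the exact config file from the repo.
n_embd = 1024                       # e: embedding dimensions
n_layer = 24                        # L: number of layers
n_head = 16                         # h: attention heads
block_size = 256                    # c: block size / context length (tokens)
batch_size = 4                      # b: micro-batch size per forward pass
gradient_accumulation_steps = 32    # a: accumulation steps
dropout = 0.20                      # d: dropout
learning_rate = 3e-5                # r: learning rate
weight_decay = 1e-1                 # y: weight decay
max_iters = 300_000                 # roughly 1 epoch over jm52m

# Effective optimizer batch: 4 * 32 = 128 sequences per update, i.e.
# 128 * 256 = 32,768 tokens seen per parameter update.
effective_tokens_per_step = batch_size * gradient_accumulation_steps * block_size
```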


Jam Projects

Current projects using the Jam pre-trained model can be found in our GitHub repository:

https://github.com/apcl-research/jam
