shahidul034/Bangla_text_generation

shahidul034 committed on Jun 1, 2024

Commit 4a71c75 (verified) · 1 Parent(s): d38cdcc

Create README.md

Files changed (1)

README.md +75 -0

	@@ -0,0 +1,75 @@

+# text_generation_bangla_model
+BanglaCLM dataset sources (26.24GB in total):
+- OSCAR: 12.84GB
+- Wikipedia dump: 6.24GB
+- ProthomAlo: 3.92GB
+- Kalerkantho: 3.24GB
+## Model description
+- Context size: 128
+## Training and evaluation data
+The BanglaCLM dataset is divided into a training set (90%) and a validation set (10%).
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- Batch size: 32
+- Initial learning rate: 5e-5
+- Number of warmup steps: 10000
+- Weight decay rate: 0.01
+- Tokenization algorithm: BPE
+- Vocabulary size of tokenizer: 50256
+- Total trainable params: 124,439,808
+- Epochs: 40
+- Number of training steps: 40772228
+- Training precision: float32
+### Training results
+Perplexity score: 2.86
+### Framework versions
+- Transformers 4.26.1
+- TensorFlow 2.11.0
+- Datasets 2.10.0
+- Tokenizers 0.13.2
+### Citation
+If you find this model helpful, please cite the following paper:
+```
+@INPROCEEDINGS{10303383,
+  author={Salim, Md. Shahidul and Murad, Hasan and Das, Dola and Ahmed, Faisal},
+  booktitle={2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD)},
+  title={BanglaGPT: A Generative Pretrained Transformer-Based Model for Bangla Language},
+  year={2023},
+  volume={},
+  number={},
+  pages={56-59},
+  doi={10.1109/ICICT4SD59951.2023.10303383}}
+```
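
The perplexity score reported under "Training results" can be related to a cross-entropy loss with a short sketch. It assumes the score is the exponential of the mean per-token cross-entropy measured in nats. That is the usual convention, but the README does not state it. The function names and the sample losses below are hypothetical and are not taken from the model's training logs.

```python
import math


def perplexity(token_losses):
    """Return exp(mean loss), assuming per-token cross-entropy in nats."""
    if not token_losses:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_losses) / len(token_losses))


def mean_loss_from_perplexity(ppl):
    """Invert the relation above: mean loss = ln(perplexity)."""
    if ppl <= 0:
        raise ValueError("perplexity must be positive")
    return math.log(ppl)


# Hypothetical per-token losses, for illustration only.
example_losses = [1.2, 0.9, 1.05]
print(perplexity(example_losses))

# Under the stated assumption, the reported score of 2.86 would
# correspond to a mean cross-entropy of roughly 1.05 nats per token.
print(mean_loss_from_perplexity(2.86))
```

Under this reading, a lower validation loss maps directly to a lower perplexity, so either number can be tracked during training.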