This is a SmolLM2-135M-Instruct model fine-tuned with LoRA, first on the Icelandic portion of Fineweb-2 and then on the Faroese portion. It is intended for my own research and has not yet been evaluated more broadly.
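
A minimal loading sketch with transformers and peft follows; the base-model repo id (HuggingFaceTB/SmolLM2-135M-Instruct) is my assumption from the description above, and the prompt is a placeholder.

```python
# Sketch: load the LoRA adapter on top of its base model.
# Assumption: the base model lives at HuggingFaceTB/SmolLM2-135M-Instruct.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "HuggingFaceTB/SmolLM2-135M-Instruct"  # assumed base repo id
adapter_id = "jekunz/smollm-135m-lora-fineweb-faroese-transfer-from-icelandic"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter

inputs = tokenizer("Hetta er ein roynd.", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```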

LoRA setup (see the sketch after this list):

  • Rank: 256
  • Alpha: 512
  • Target modules: ["up_proj", "down_proj", "gate_proj", "o_proj"]
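
For reference, a minimal sketch of this configuration with the peft library; lora_dropout and task_type are not stated on this card, so those values are assumptions.

```python
from peft import LoraConfig

# LoRA hyperparameters as listed above; unstated fields are assumptions.
lora_config = LoraConfig(
    r=256,
    lora_alpha=512,
    target_modules=["up_proj", "down_proj", "gate_proj", "o_proj"],
    lora_dropout=0.0,        # assumption: not stated on the card
    task_type="CAUSAL_LM",   # assumption: standard causal-LM adapter
)
```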

Training (sketched in code after this list):

  • 1 epoch on Icelandic, 5 epochs on Faroese
  • Learning rate: 8e-4
  • LR scheduler: Cosine
  • Warmup ratio: 0.05
  • Per-device batch size: 1
  • 4× A100 (40 GB) GPUs
  • Gradient accumulation steps: 64
  • Effective batch size: 256 (1 per device × 4 GPUs × 64 accumulation steps)
  • Max. context length: 8192 tokens
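
As a sketch, the hyperparameters above map onto transformers TrainingArguments roughly as follows; the output path and precision flag are assumptions, num_train_epochs would be 1 for the Icelandic stage and 5 for the Faroese stage, and the 8192-token context length is handled at the tokenization/packing step rather than here.

```python
from transformers import TrainingArguments

# Hyperparameters from the list above; output_dir and bf16 are assumptions.
training_args = TrainingArguments(
    output_dir="smollm-135m-lora-faroese",  # placeholder path
    num_train_epochs=5,                     # 1 for the Icelandic stage
    learning_rate=8e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,         # 1 x 4 GPUs x 64 = 256 effective
    bf16=True,                              # assumption: mixed precision on A100s
)
```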