---
license: mit
datasets:
- normster/RealGuardrails
base_model:
- allenai/OLMo-2-1124-7B
library_name: transformers
---
# RealGuardrails Models
This model was trained on the [RealGuardrails](https://huggingface.co/datasets/normster/RealGuardrails) dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. In particular, it was trained via SFT on the `simplemix` split of ~150K examples using our custom training library [torchllms](https://github.com/normster/torchllms), then converted back to a `transformers`-compatible checkpoint.
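Since the checkpoint is `transformers`-compatible, it can be loaded like any other chat model. A minimal sketch, assuming the tokenizer ships a chat template; the model ID below is a placeholder, so substitute this repo's actual Hub ID:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder ID for illustration; replace with this repository's Hub ID.
model_id = "your-org/your-realguardrails-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# System prompt adherence is the focus of RealGuardrails, so exercise it here:
# the system instruction should take precedence over the off-topic user request.
messages = [
    {"role": "system", "content": "Only answer questions about cooking."},
    {"role": "user", "content": "What's the capital of France?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```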
## Training Hyperparameters
| Name | Value |
| :--- | :--- |
| optimizer | AdamW |
| batch size | 128 |
| learning rate | 2e-5 |
| lr scheduler | cosine with 200 warmup steps |
| betas | (0.9, 0.999) |
| eps | 1e-8 |
| weight decay | 0 |
| epochs | 1 |
| max grad norm | 1.0 |
| precision | bf16 |
| max length | 4096 tokens |
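The actual run used `torchllms`, but the settings above map directly onto standard PyTorch and `transformers` utilities. A minimal sketch of the equivalent optimizer and scheduler setup, assuming `model` is the loaded model and `num_training_steps` is the total number of optimizer steps for one epoch at batch size 128:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# Assumptions: `model` and `num_training_steps` are defined by the caller.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=200,
    num_training_steps=num_training_steps,
)

# Gradient clipping at max grad norm 1.0, applied each step before optimizer.step():
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```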