tFINE-850m-24x24-instruct-L2

This model is a fine-tuned version of pszemraj/tFINE-850m-24x24-v0.5-instruct-L1 on the pszemraj/infinity-instruct-7m-T2T_en dataset (config deduped-L2).

It achieves the following results on the evaluation set:

Loss: 1.2542
Num Input Tokens Seen: 750938410
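
For intuition, the evaluation loss can be converted to a perplexity. The snippet below is a minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats; the card does not state this explicitly, and `perplexity` is a hypothetical helper name.

```python
import math

EVAL_LOSS = 1.2542  # evaluation loss reported above


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


print(f"eval perplexity: {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the loss corresponds to a perplexity of roughly 3.5.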

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3.5e-05
train_batch_size: 32
eval_batch_size: 16
seed: 17868
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: OptimizerNames.PAGED_ADEMAMIX (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
num_epochs: 1.0
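
The batch-size and scheduler values above can be sanity-checked with a short sketch. It assumes a single device (which is what 32 × 4 = 128 implies) and a linear warmup followed by cosine decay to zero. This is a plain-Python approximation of that schedule shape, not the training code itself. `TOTAL_STEPS` is a hypothetical value, since the card does not report the number of optimizer steps.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 3.5e-05
WARMUP_RATIO = 0.03
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4

NUM_DEVICES = 1      # assumption: implied by total_train_batch_size = 128
TOTAL_STEPS = 1000   # hypothetical: not stated in the card


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int) -> int:
    """Number of samples contributing to one optimizer step."""
    return per_device * accum_steps * num_devices


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

With a warmup ratio of 0.03, only the first 3% of steps ramp the learning rate up; the remaining steps follow the cosine decay from 3.5e-05 toward zero.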

Model size: 854M params (Safetensors)
Tensor type: F32

Model tree for pszemraj/tFINE-850m-24x24-instruct-L2

Base model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
Finetuned: pszemraj/tFINE-850m-24x24-v0.5-instruct-L1
Finetuned: this model
