Built with Axolotl

See axolotl config

axolotl version: 0.4.1

```yaml
adapter: lora
base_model: EleutherAI/pythia-410m-deduped
bf16: auto
dataset_prepared_path: null
datasets:
- data_files:
  - 268406b2c8127f67_train_data.json
  ds_type: json
  format: custom
  path: 268406b2c8127f67_train_data.json
  type:
    field: null
    field_input: context
    field_instruction: instruction
    field_output: response
    field_system: null
    format: null
    no_input_format: null
    system_format: '{system}'
    system_prompt: ''
early_stopping_patience: null
evals_per_epoch: 4
gradient_accumulation_steps: 1
group_by_length: false
hub_model_id: FatCat87/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
learning_rate: 1.0e-05
load_in_8bit: true
local_rank: null
logging_steps: 1
lora_alpha: 32
lora_dropout: 0.05
lora_fan_in_fan_out: true
lora_model_dir: null
lora_r: 16
lora_target_linear: null
lora_target_modules:
- query_key_value
micro_batch_size: 4
num_epochs: 4
output_dir: ./outputs/lora-alpaca-pythia/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
resume_from_checkpoint: null
seed: 84664
sequence_len: 512
special_tokens:
  pad_token: <|endoftext|>
tf32: true
train_on_inputs: false
val_set_size: 0.05
wandb_entity: fatcat87-taopanda
wandb_log_model: null
wandb_mode: online
wandb_name: taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
wandb_project: subnet56
wandb_runid: taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
wandb_watch: null
weight_decay: 0.1
```
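
For reference, the `lora_*` settings above correspond roughly to the following PEFT configuration. This is a minimal sketch: axolotl builds the actual adapter config internally and may apply additional defaults.

```python
from peft import LoraConfig

# Approximate PEFT equivalent of the lora_* values in the YAML above.
# Values mirror the config; the exact object axolotl produced may differ.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    fan_in_fan_out=True,
    task_type="CAUSAL_LM",
)
```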


taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4

This model is a LoRA adapter fine-tuned from EleutherAI/pythia-410m-deduped on the 268406b2c8127f67_train_data.json dataset described in the config above. It achieves the following results on the evaluation set:

  • Loss: 2.2826
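
A minimal loading sketch with `transformers` and `peft` is shown below; it assumes the adapter is applied on top of the base model and that the prompt follows the instruction/context/response format used during training.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "EleutherAI/pythia-410m-deduped"
adapter_id = "FatCat87/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4"

# Load the base model, then attach the LoRA adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "..."  # an instruction-style prompt matching the training data format
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```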

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 84664
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 4
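
Outside of axolotl, a roughly equivalent optimizer/scheduler setup would look like the sketch below. The decoupled AdamW variant and the `total_training_steps` value are assumptions; the actual run was driven by the Trainer inside axolotl.

```python
import torch
from transformers import AutoModelForCausalLM, get_cosine_schedule_with_warmup

# Stand-in for the LoRA-wrapped model actually trained by axolotl.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-410m-deduped")

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.1,  # from the axolotl config
)

total_training_steps = 6612  # assumption: ~1653 steps/epoch * 4 epochs (see the results table)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=total_training_steps,
)
```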

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.7173        | 0.0006 | 1    | 2.8035          |
| 2.3948        | 0.2505 | 414  | 2.5510          |
| 2.6987        | 0.5009 | 828  | 2.4507          |
| 2.2231        | 0.7514 | 1242 | 2.3969          |
| 2.4612        | 1.0018 | 1656 | 2.3698          |
| 2.9173        | 1.2523 | 2070 | 2.3450          |
| 2.3121        | 1.5027 | 2484 | 2.3282          |
| 2.8931        | 1.7532 | 2898 | 2.3154          |
| 2.0185        | 2.0036 | 3312 | 2.3080          |
| 2.2114        | 2.2541 | 3726 | 2.2980          |
| 2.4148        | 2.5045 | 4140 | 2.2941          |
| 2.2134        | 2.7550 | 4554 | 2.2887          |
| 1.5517        | 3.0054 | 4968 | 2.2839          |
| 2.2136        | 3.2559 | 5382 | 2.2811          |
| 1.2004        | 3.5064 | 5796 | 2.2838          |
| 2.374         | 3.7568 | 6210 | 2.2826          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1