shawgpt-ft

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.2320
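
This repository appears to be a PEFT (LoRA) adapter for the GPTQ-quantized base model named above (PEFT is listed under the framework versions below). The card itself provides no usage snippet; the following is a minimal, hedged inference sketch under that assumption. It additionally assumes the optimum/auto-gptq and accelerate packages are installed, and the prompt text is only a placeholder.

```python
# Minimal sketch (not from the original card): load the GPTQ base model,
# attach this repository's PEFT adapter, and generate from an instruction prompt.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
adapter_id = "jeroenherczeg/shawgpt-ft"  # this repository (assumed to be an adapter)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Mistral-Instruct prompt format; the question is a placeholder.
prompt = "[INST] What is fine-tuning? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=140)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```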

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a hedged sketch of an equivalent TrainingArguments configuration follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • training_steps: 1000
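
For reference, the settings above can be expressed as a transformers TrainingArguments object. This is a hypothetical reconstruction, not the author's actual code: output_dir and the optim name are assumptions (the Trainer's default AdamW already uses the listed betas and epsilon).

```python
# Hypothetical reconstruction of the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="shawgpt-ft",        # assumed output directory
    learning_rate=2e-4,             # learning_rate: 0.0002
    per_device_train_batch_size=4,  # train_batch_size: 4
    per_device_eval_batch_size=4,   # eval_batch_size: 4
    seed=42,                        # seed: 42
    gradient_accumulation_steps=4,  # total_train_batch_size: 4 * 4 = 16
    optim="adamw_torch",            # Adam with betas=(0.9, 0.999), eps=1e-08 (Trainer default)
    lr_scheduler_type="linear",     # lr_scheduler_type: linear
    warmup_steps=1,                 # lr_scheduler_warmup_steps: 1
    max_steps=1000,                 # training_steps: 1000
)
```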

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 4.6433 | 0.92 | 3 | 4.2320 |
| 4.6544 | 1.85 | 6 | 4.2320 |
| 4.6459 | 2.77 | 9 | 4.2320 |
| 3.4822 | 4.0 | 13 | 4.2320 |
| 4.6298 | 4.92 | 16 | 4.2320 |
| 4.6605 | 5.85 | 19 | 4.2320 |
| 4.6392 | 6.77 | 22 | 4.2320 |
| 3.4844 | 8.0 | 26 | 4.2320 |
| 4.6305 | 8.92 | 29 | 4.2320 |
| 4.6337 | 9.85 | 32 | 4.2320 |
| 4.6501 | 10.77 | 35 | 4.2320 |
| 3.4793 | 12.0 | 39 | 4.2320 |
| 4.6568 | 12.92 | 42 | 4.2320 |
| 4.6402 | 13.85 | 45 | 4.2320 |
| 4.6381 | 14.77 | 48 | 4.2320 |
| 3.4787 | 16.0 | 52 | 4.2320 |
| 4.671 | 16.92 | 55 | 4.2320 |
| 4.6186 | 17.85 | 58 | 4.2320 |
| 4.6403 | 18.77 | 61 | 4.2320 |
| 3.5009 | 20.0 | 65 | 4.2320 |
| 4.6514 | 20.92 | 68 | 4.2320 |
| 4.6426 | 21.85 | 71 | 4.2320 |
| 4.6674 | 22.77 | 74 | 4.2320 |
| 3.4915 | 24.0 | 78 | 4.2320 |
| 4.6606 | 24.92 | 81 | 4.2320 |
| 4.6364 | 25.85 | 84 | 4.2320 |
| 4.6222 | 26.77 | 87 | 4.2320 |
| 3.4782 | 28.0 | 91 | 4.2320 |
| 4.6229 | 28.92 | 94 | 4.2320 |
| 4.6576 | 29.85 | 97 | 4.2320 |
| 4.6288 | 30.77 | 100 | 4.2320 |
| 3.4664 | 32.0 | 104 | 4.2320 |
| 4.6434 | 32.92 | 107 | 4.2320 |
| 4.6519 | 33.85 | 110 | 4.2320 |
| 4.6528 | 34.77 | 113 | 4.2320 |
| 3.471 | 36.0 | 117 | 4.2320 |
| 4.6453 | 36.92 | 120 | 4.2320 |
| 4.616 | 37.85 | 123 | 4.2320 |
| 4.6109 | 38.77 | 126 | 4.2320 |
| 3.4799 | 40.0 | 130 | 4.2320 |
| 4.6388 | 40.92 | 133 | 4.2320 |
| 4.6711 | 41.85 | 136 | 4.2320 |
| 4.6483 | 42.77 | 139 | 4.2320 |
| 3.4695 | 44.0 | 143 | 4.2320 |
| 4.6496 | 44.92 | 146 | 4.2320 |
| 4.644 | 45.85 | 149 | 4.2320 |
| 4.6444 | 46.77 | 152 | 4.2320 |
| 3.4741 | 48.0 | 156 | 4.2320 |
| 4.6189 | 48.92 | 159 | 4.2320 |
| 4.6683 | 49.85 | 162 | 4.2320 |
| 4.6345 | 50.77 | 165 | 4.2320 |
| 3.4703 | 52.0 | 169 | 4.2320 |
| 4.6144 | 52.92 | 172 | 4.2320 |
| 4.6648 | 53.85 | 175 | 4.2320 |
| 4.6522 | 54.77 | 178 | 4.2320 |
| 3.4838 | 56.0 | 182 | 4.2320 |
| 4.6506 | 56.92 | 185 | 4.2320 |
| 4.6339 | 57.85 | 188 | 4.2320 |
| 4.638 | 58.77 | 191 | 4.2320 |
| 3.4733 | 60.0 | 195 | 4.2320 |
| 4.6604 | 60.92 | 198 | 4.2320 |
| 4.6326 | 61.85 | 201 | 4.2320 |
| 4.6612 | 62.77 | 204 | 4.2320 |
| 3.4722 | 64.0 | 208 | 4.2320 |
| 4.6292 | 64.92 | 211 | 4.2320 |
| 4.6336 | 65.85 | 214 | 4.2320 |
| 4.642 | 66.77 | 217 | 4.2320 |
| 3.4915 | 68.0 | 221 | 4.2320 |
| 4.6453 | 68.92 | 224 | 4.2320 |
| 4.6459 | 69.85 | 227 | 4.2320 |
| 4.6202 | 70.77 | 230 | 4.2320 |
| 3.4753 | 72.0 | 234 | 4.2320 |
| 4.6552 | 72.92 | 237 | 4.2320 |
| 4.6443 | 73.85 | 240 | 4.2320 |
| 4.6495 | 74.77 | 243 | 4.2320 |
| 3.4798 | 76.0 | 247 | 4.2320 |
| 4.6358 | 76.92 | 250 | 4.2320 |
| 4.6434 | 77.85 | 253 | 4.2320 |
| 4.6325 | 78.77 | 256 | 4.2320 |
| 3.4951 | 80.0 | 260 | 4.2320 |
| 4.6302 | 80.92 | 263 | 4.2320 |
| 4.6458 | 81.85 | 266 | 4.2320 |
| 4.6407 | 82.77 | 269 | 4.2320 |
| 3.4828 | 84.0 | 273 | 4.2320 |
| 4.6436 | 84.92 | 276 | 4.2320 |
| 4.6143 | 85.85 | 279 | 4.2320 |
| 4.644 | 86.77 | 282 | 4.2320 |
| 3.4934 | 88.0 | 286 | 4.2320 |
| 4.6308 | 88.92 | 289 | 4.2320 |
| 4.6715 | 89.85 | 292 | 4.2320 |
| 4.6229 | 90.77 | 295 | 4.2320 |
| 3.4895 | 92.0 | 299 | 4.2320 |
| 4.6447 | 92.92 | 302 | 4.2320 |
| 4.6333 | 93.85 | 305 | 4.2320 |
| 4.643 | 94.77 | 308 | 4.2320 |
| 3.482 | 96.0 | 312 | 4.2320 |
| 4.6647 | 96.92 | 315 | 4.2320 |
| 4.65 | 97.85 | 318 | 4.2320 |
| 4.6545 | 98.77 | 321 | 4.2320 |
| 3.4881 | 100.0 | 325 | 4.2320 |
| 4.6828 | 100.92 | 328 | 4.2320 |
| 4.6328 | 101.85 | 331 | 4.2320 |
| 4.6419 | 102.77 | 334 | 4.2320 |
| 3.4954 | 104.0 | 338 | 4.2320 |
| 4.6203 | 104.92 | 341 | 4.2320 |
| 4.6236 | 105.85 | 344 | 4.2320 |
| 4.6539 | 106.77 | 347 | 4.2320 |
| 3.4737 | 108.0 | 351 | 4.2320 |
| 4.6319 | 108.92 | 354 | 4.2320 |
| 4.6696 | 109.85 | 357 | 4.2320 |
| 4.6678 | 110.77 | 360 | 4.2320 |
| 3.4698 | 112.0 | 364 | 4.2320 |
| 4.6459 | 112.92 | 367 | 4.2320 |
| 4.6524 | 113.85 | 370 | 4.2320 |
| 4.6399 | 114.77 | 373 | 4.2320 |
| 3.471 | 116.0 | 377 | 4.2320 |
| 4.6668 | 116.92 | 380 | 4.2320 |
| 4.634 | 117.85 | 383 | 4.2320 |
| 4.6345 | 118.77 | 386 | 4.2320 |
| 3.4938 | 120.0 | 390 | 4.2320 |
| 4.6386 | 120.92 | 393 | 4.2320 |
| 4.6661 | 121.85 | 396 | 4.2320 |
| 4.6465 | 122.77 | 399 | 4.2320 |
| 3.4903 | 124.0 | 403 | 4.2320 |
| 4.6255 | 124.92 | 406 | 4.2320 |
| 4.6306 | 125.85 | 409 | 4.2320 |
| 4.6348 | 126.77 | 412 | 4.2320 |
| 3.4811 | 128.0 | 416 | 4.2320 |
| 4.6335 | 128.92 | 419 | 4.2320 |
| 4.6678 | 129.85 | 422 | 4.2320 |
| 4.6336 | 130.77 | 425 | 4.2320 |
| 3.4722 | 132.0 | 429 | 4.2320 |
| 4.6371 | 132.92 | 432 | 4.2320 |
| 4.6488 | 133.85 | 435 | 4.2320 |
| 4.6456 | 134.77 | 438 | 4.2320 |
| 3.4866 | 136.0 | 442 | 4.2320 |
| 4.6349 | 136.92 | 445 | 4.2320 |
| 4.6418 | 137.85 | 448 | 4.2320 |
| 4.6546 | 138.77 | 451 | 4.2320 |
| 3.4811 | 140.0 | 455 | 4.2320 |
| 4.6322 | 140.92 | 458 | 4.2320 |
| 4.6154 | 141.85 | 461 | 4.2320 |
| 4.6362 | 142.77 | 464 | 4.2320 |
| 3.4809 | 144.0 | 468 | 4.2320 |
| 4.6317 | 144.92 | 471 | 4.2320 |
| 4.6329 | 145.85 | 474 | 4.2320 |
| 4.636 | 146.77 | 477 | 4.2320 |
| 3.4737 | 148.0 | 481 | 4.2320 |
| 4.629 | 148.92 | 484 | 4.2320 |
| 4.6212 | 149.85 | 487 | 4.2320 |
| 4.6548 | 150.77 | 490 | 4.2320 |
| 3.481 | 152.0 | 494 | 4.2320 |
| 4.6379 | 152.92 | 497 | 4.2320 |
| 4.6306 | 153.85 | 500 | 4.2320 |
| 4.6443 | 154.77 | 503 | 4.2320 |
| 3.4951 | 156.0 | 507 | 4.2320 |
| 4.6514 | 156.92 | 510 | 4.2320 |
| 4.6539 | 157.85 | 513 | 4.2320 |
| 4.6295 | 158.77 | 516 | 4.2320 |
| 3.485 | 160.0 | 520 | 4.2320 |
| 4.6665 | 160.92 | 523 | 4.2320 |
| 4.6508 | 161.85 | 526 | 4.2320 |
| 4.6754 | 162.77 | 529 | 4.2320 |
| 3.4689 | 164.0 | 533 | 4.2320 |
| 4.6286 | 164.92 | 536 | 4.2320 |
| 4.6164 | 165.85 | 539 | 4.2320 |
| 4.634 | 166.77 | 542 | 4.2320 |
| 3.4878 | 168.0 | 546 | 4.2320 |
| 4.6616 | 168.92 | 549 | 4.2320 |
| 4.6228 | 169.85 | 552 | 4.2320 |
| 4.6427 | 170.77 | 555 | 4.2320 |
| 3.4739 | 172.0 | 559 | 4.2320 |
| 4.656 | 172.92 | 562 | 4.2320 |
| 4.6488 | 173.85 | 565 | 4.2320 |
| 4.6199 | 174.77 | 568 | 4.2320 |
| 3.4842 | 176.0 | 572 | 4.2320 |
| 4.6632 | 176.92 | 575 | 4.2320 |
| 4.646 | 177.85 | 578 | 4.2320 |
| 4.6226 | 178.77 | 581 | 4.2320 |
| 3.4619 | 180.0 | 585 | 4.2320 |
| 4.6329 | 180.92 | 588 | 4.2320 |
| 4.6245 | 181.85 | 591 | 4.2320 |
| 4.6435 | 182.77 | 594 | 4.2320 |
| 3.478 | 184.0 | 598 | 4.2320 |
| 4.6256 | 184.92 | 601 | 4.2320 |
| 4.6516 | 185.85 | 604 | 4.2320 |
| 4.6438 | 186.77 | 607 | 4.2320 |
| 3.5015 | 188.0 | 611 | 4.2320 |
| 4.6254 | 188.92 | 614 | 4.2320 |
| 4.6265 | 189.85 | 617 | 4.2320 |
| 4.6447 | 190.77 | 620 | 4.2320 |
| 3.508 | 192.0 | 624 | 4.2320 |
| 4.6353 | 192.92 | 627 | 4.2320 |
| 4.6333 | 193.85 | 630 | 4.2320 |
| 4.6573 | 194.77 | 633 | 4.2320 |
| 3.4644 | 196.0 | 637 | 4.2320 |
| 4.6413 | 196.92 | 640 | 4.2320 |
| 4.6641 | 197.85 | 643 | 4.2320 |
| 4.638 | 198.77 | 646 | 4.2320 |
| 3.4885 | 200.0 | 650 | 4.2320 |
| 4.6502 | 200.92 | 653 | 4.2320 |
| 4.6476 | 201.85 | 656 | 4.2320 |
| 4.645 | 202.77 | 659 | 4.2320 |
| 3.4861 | 204.0 | 663 | 4.2320 |
| 4.6418 | 204.92 | 666 | 4.2320 |
| 4.6419 | 205.85 | 669 | 4.2320 |
| 4.6395 | 206.77 | 672 | 4.2320 |
| 3.4739 | 208.0 | 676 | 4.2320 |
| 4.6306 | 208.92 | 679 | 4.2320 |
| 4.6245 | 209.85 | 682 | 4.2320 |
| 4.6614 | 210.77 | 685 | 4.2320 |
| 3.4965 | 212.0 | 689 | 4.2320 |
| 4.642 | 212.92 | 692 | 4.2320 |
| 4.6371 | 213.85 | 695 | 4.2320 |
| 4.6265 | 214.77 | 698 | 4.2320 |
| 3.4965 | 216.0 | 702 | 4.2320 |
| 4.6648 | 216.92 | 705 | 4.2320 |
| 4.6248 | 217.85 | 708 | 4.2320 |
| 4.6507 | 218.77 | 711 | 4.2320 |
| 3.4741 | 220.0 | 715 | 4.2320 |
| 4.644 | 220.92 | 718 | 4.2320 |
| 4.6315 | 221.85 | 721 | 4.2320 |
| 4.659 | 222.77 | 724 | 4.2320 |
| 3.4942 | 224.0 | 728 | 4.2320 |
| 4.6463 | 224.92 | 731 | 4.2320 |
| 4.6477 | 225.85 | 734 | 4.2320 |
| 4.6323 | 226.77 | 737 | 4.2320 |
| 3.4907 | 228.0 | 741 | 4.2320 |
| 4.6323 | 228.92 | 744 | 4.2320 |
| 4.6442 | 229.85 | 747 | 4.2320 |
| 4.6351 | 230.77 | 750 | 4.2320 |
| 3.4799 | 232.0 | 754 | 4.2320 |
| 4.6463 | 232.92 | 757 | 4.2320 |
| 4.6389 | 233.85 | 760 | 4.2320 |
| 4.6399 | 234.77 | 763 | 4.2320 |
| 3.4819 | 236.0 | 767 | 4.2320 |
| 4.678 | 236.92 | 770 | 4.2320 |
| 4.6446 | 237.85 | 773 | 4.2320 |
| 4.642 | 238.77 | 776 | 4.2320 |
| 3.4879 | 240.0 | 780 | 4.2320 |
| 4.6561 | 240.92 | 783 | 4.2320 |
| 4.6226 | 241.85 | 786 | 4.2320 |
| 4.6607 | 242.77 | 789 | 4.2320 |
| 3.4901 | 244.0 | 793 | 4.2320 |
| 4.6317 | 244.92 | 796 | 4.2320 |
| 4.6387 | 245.85 | 799 | 4.2320 |
| 4.6493 | 246.77 | 802 | 4.2320 |
| 3.4863 | 248.0 | 806 | 4.2320 |
| 4.6187 | 248.92 | 809 | 4.2320 |
| 4.6449 | 249.85 | 812 | 4.2320 |
| 4.6542 | 250.77 | 815 | 4.2320 |
| 3.4905 | 252.0 | 819 | 4.2320 |
| 4.6514 | 252.92 | 822 | 4.2320 |
| 4.6496 | 253.85 | 825 | 4.2320 |
| 4.6542 | 254.77 | 828 | 4.2320 |
| 3.4661 | 256.0 | 832 | 4.2320 |
| 4.631 | 256.92 | 835 | 4.2320 |
| 4.644 | 257.85 | 838 | 4.2320 |
| 4.6348 | 258.77 | 841 | 4.2320 |
| 3.5069 | 260.0 | 845 | 4.2320 |
| 4.6257 | 260.92 | 848 | 4.2320 |
| 4.6584 | 261.85 | 851 | 4.2320 |
| 4.6344 | 262.77 | 854 | 4.2320 |
| 3.4721 | 264.0 | 858 | 4.2320 |
| 4.6429 | 264.92 | 861 | 4.2320 |
| 4.6433 | 265.85 | 864 | 4.2320 |
| 4.6391 | 266.77 | 867 | 4.2320 |
| 3.4916 | 268.0 | 871 | 4.2320 |
| 4.6564 | 268.92 | 874 | 4.2320 |
| 4.658 | 269.85 | 877 | 4.2320 |
| 4.6329 | 270.77 | 880 | 4.2320 |
| 3.4783 | 272.0 | 884 | 4.2320 |
| 4.6384 | 272.92 | 887 | 4.2320 |
| 4.6482 | 273.85 | 890 | 4.2320 |
| 4.6688 | 274.77 | 893 | 4.2320 |
| 3.4659 | 276.0 | 897 | 4.2320 |
| 4.6299 | 276.92 | 900 | 4.2320 |
| 4.6392 | 277.85 | 903 | 4.2320 |
| 4.6521 | 278.77 | 906 | 4.2320 |
| 3.4949 | 280.0 | 910 | 4.2320 |
| 4.6643 | 280.92 | 913 | 4.2320 |
| 4.6361 | 281.85 | 916 | 4.2320 |
| 4.6505 | 282.77 | 919 | 4.2320 |
| 3.4847 | 284.0 | 923 | 4.2320 |
| 4.639 | 284.92 | 926 | 4.2320 |
| 4.6276 | 285.85 | 929 | 4.2320 |
| 4.6438 | 286.77 | 932 | 4.2320 |
| 3.4883 | 288.0 | 936 | 4.2320 |
| 4.6483 | 288.92 | 939 | 4.2320 |
| 4.6564 | 289.85 | 942 | 4.2320 |
| 4.6437 | 290.77 | 945 | 4.2320 |
| 3.4712 | 292.0 | 949 | 4.2320 |
| 4.6627 | 292.92 | 952 | 4.2320 |
| 4.6371 | 293.85 | 955 | 4.2320 |
| 4.6196 | 294.77 | 958 | 4.2320 |
| 3.4859 | 296.0 | 962 | 4.2320 |
| 4.6457 | 296.92 | 965 | 4.2320 |
| 4.6249 | 297.85 | 968 | 4.2320 |
| 4.6382 | 298.77 | 971 | 4.2320 |
| 3.4824 | 300.0 | 975 | 4.2320 |
| 4.6541 | 300.92 | 978 | 4.2320 |
| 4.659 | 301.85 | 981 | 4.2320 |
| 4.618 | 302.77 | 984 | 4.2320 |
| 3.4751 | 304.0 | 988 | 4.2320 |
| 4.623 | 304.92 | 991 | 4.2320 |
| 4.6371 | 305.85 | 994 | 4.2320 |
| 4.6546 | 306.77 | 997 | 4.2320 |
| 3.1908 | 307.69 | 1000 | 4.2320 |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.36.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
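
To compare a local environment against these pins, a small version check can be run. This snippet is illustrative only and not part of the original card:

```python
# Prints installed versions for comparison against the pins listed above.
import datasets
import peft
import tokenizers
import torch
import transformers

for name, module in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```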