---
library_name: transformers
tags:
  - gaudi
  - llama3
  - llm
  - optimum-habana
  - text-generation-inference
license: apache-2.0
datasets:
  - tatsu-lab/alpaca
language:
  - en
pipeline_tag: text-generation
---

# Model Card for gopalakrishnan-d/meta-llama3-8b-alpaca-v1

This model was fine-tuned from `meta-llama/Meta-Llama-3-8B`.

## Model Details

### Model Description

`gopalakrishnan-d/meta-llama3-8b-alpaca-v1` is a fine-tuned variant of the 8-billion-parameter Llama 3 architecture, instruction-tuned on the tatsu-lab/alpaca dataset to improve performance on diverse language tasks. Training was run on the Intel Gaudi 2 accelerator to speed up the fine-tuning process.

- **Hardware type:** Intel Gaudi 2 accelerator
- **Cloud provider:** Intel® Tiber™ Developer Cloud
- **Developed by:** gopalakrishnan-d
- **Model type:** Fine-tuned LLM
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B
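
The snippet below is a minimal sketch of loading the model for inference with the `transformers` text-generation pipeline (the library and task come from the card metadata). The dtype, device placement, and generation settings are illustrative choices, not values published with this model.

```python
import torch
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
# bfloat16 and device_map="auto" are illustrative choices for an 8B model,
# not settings documented in this card; device_map="auto" requires `accelerate`.
generator = pipeline(
    "text-generation",
    model="gopalakrishnan-d/meta-llama3-8b-alpaca-v1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

output = generator(
    "Explain instruction fine-tuning in one paragraph.",
    max_new_tokens=128,
    do_sample=False,
)
print(output[0]["generated_text"])
```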

## Uses

- Customer Service Chatbots
- Content Generation Tools
- Educational Tutoring Systems
- Workflow Automation Systems
- Personalized Recommendation Engines
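
Because the card lists tatsu-lab/alpaca as the fine-tuning dataset, an Alpaca-style prompt is a reasonable starting point for these use cases; the exact template used during training is not published here, so treat the format below as an assumption (it reuses the `generator` from the snippet above).

```python
# Assumed Alpaca-style prompt template; the card does not publish the exact
# format used during fine-tuning, so adjust if outputs look off.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_PROMPT.format(
    instruction="Draft a short, friendly reply to a customer asking about a late delivery."
)
output = generator(prompt, max_new_tokens=200, do_sample=False)
print(output[0]["generated_text"])
```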

## Training Hyperparameters

- learning_rate: 5e-06
- train_batch_size: 8
- seed: 100
- gradient_accumulation_steps: 1
- optimizer: Adam
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- lora_rank: 16
- lora_alpha: 32
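
The card lists hyperparameters but not a training script. As a rough sketch, the values above might map onto a `peft` LoRA configuration and `transformers` training arguments as shown below; the actual run used the Intel Gaudi 2 stack (optimum-habana), whose Gaudi-specific trainer classes are not shown, and `target_modules`, the output path, and the AdamW variant are assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the list above (rank 16, alpha 32).
# target_modules is an assumption; the card does not say which layers were adapted.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)

# Optimizer/scheduler settings from the list above. The real run was performed
# with optimum-habana on Gaudi 2, so these vanilla TrainingArguments are only a
# stand-in for the Gaudi-specific equivalents.
training_args = TrainingArguments(
    output_dir="./llama3-8b-alpaca-lora",  # hypothetical path
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    seed=100,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    optim="adamw_torch",  # the card says "Adam"; AdamW is assumed here
)
```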

## Evaluation

Evaluation results will be added in a future update.

### Results