---
library_name: peft
base_model: google/flan-t5-large
license: creativeml-openrail-m
datasets:
- MohamedRashad/ChatGPT-prompts
- Hello-SimpleAI/HC3
language:
- en
metrics:
- accuracy
- rouge
pipeline_tag: text2text-generation
tags:
- prompt
---

# Model Card for Model ID

![Learning](https://t3.ftcdn.net/jpg/06/14/01/52/360_F_614015247_EWZHvC6AAOsaIOepakhyJvMqUu5tpLfY.jpg)

This exploration highlights the use of Low-Rank Adaptation (LoRA) to fine-tune a T5 model. Built on the `google/flan-t5-large` architecture and the PEFT library, the approach refines the model's capabilities for question-answering (QA) tasks. The entire fine-tuning code is available on Kaggle: [Kaggle code link](https://www.kaggle.com/code/yannicksteph/nlp-llm-fine-tuning-qa-lora-t5).

The fine-tuning methodology relies on LoRA, which freezes the base model's weights and injects small trainable low-rank matrices into selected layers, so that only a fraction of the parameters are updated during training. This parameter-efficient choice keeps training lightweight while improving the model's performance on text generation in response to questions. The datasets used, `MohamedRashad/ChatGPT-prompts` and `Hello-SimpleAI/HC3`, enrich the diversity and complexity of the linguistic interactions, strengthening the model's ability to adapt to varied conversational contexts.

The resulting model, identified by the specified model ID, is intended for direct use in text generation scenarios and can also serve as a starting point for further fine-tuning on specific tasks. Evaluation metrics, including accuracy and ROUGE, provide an objective assessment of its performance. To facilitate accessibility and reuse, the entire fine-tuning code is published on Kaggle as a practical and transparent resource for the natural language processing (NLP) community.

## Model Details

### Model Description

The model is based on the T5 architecture (`google/flan-t5-large`) and has been fine-tuned with the PEFT library. It is designed to generate text responses in a question-answering format. The model is released under the CreativeML OpenRAIL-M license (`creativeml-openrail-m`).

- **Developed by:** [YanSte](https://github.com/YanSte)
- **Model type:** flan-t5-large

## Uses

### Direct Use

The model can be used directly for text-to-text generation, with a focus on generating responses to questions in a conversational format.
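Since this card lists PEFT as its library, the fine-tuned weights can also be loaded as a LoRA adapter on top of `google/flan-t5-large` rather than as a merged checkpoint. The snippet below is a minimal sketch of that route, not the card's official loading code; `adapter_repo_id` is a placeholder for the actual Hub repository name.

```python
# Minimal sketch: load the base model, then attach the fine-tuned LoRA adapter with PEFT.
# "adapter_repo_id" is a placeholder; replace it with the actual Hub repository name.
from transformers import AutoTokenizer, T5ForConditionalGeneration
from peft import PeftModel

adapter_repo_id = "your-username/your-flan-t5-large-qa-lora"  # placeholder

# Load the frozen base model and its tokenizer
base_model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

# Wrap the base model with the LoRA adapter weights
model = PeftModel.from_pretrained(base_model, adapter_repo_id)
model.eval()

# Generate an answer for a prefixed question
inputs = tokenizer("Answer this question: What is Sherlock Holmes' job?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```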
## How to Get Started with the Model

Use the code below to get started with the model. The `hub_repo_name` and pipeline settings are placeholders to adjust to your own repository and needs.

```python
# Import the necessary libraries
from transformers import AutoTokenizer, T5ForConditionalGeneration, pipeline

# Hub repository of the fine-tuned model and generation settings (placeholders)
hub_repo_name = "your-username/your-flan-t5-large-qa-lora"
pipeline_task = "text2text-generation"
pipeline_max_length = 128
pipeline_min_length = 8
pipeline_temperature = 0.7

# Load the pre-trained tokenizer and fine-tuned model from the specified hub repository
tokenizer = AutoTokenizer.from_pretrained(hub_repo_name)
finetuned_model = T5ForConditionalGeneration.from_pretrained(hub_repo_name)

# Create a text generation pipeline using the fine-tuned model
text_generation_pipeline = pipeline(
    task=pipeline_task,
    model=finetuned_model,
    tokenizer=tokenizer,
    truncation=True,
    max_length=pipeline_max_length,
    min_length=pipeline_min_length,
    temperature=pipeline_temperature,
    device=0  # Set device to 0 for GPU, -1 for CPU
)

# Define a list of questions for text generation
questions = ["What is Sherlock Holmes' job?"]

# Prefix each question with the instruction used for the task
prefix = "Answer this question: "
transformed_questions = [prefix + question for question in questions]

# Generate texts using the pipeline with the transformed questions
generated_texts = text_generation_pipeline(transformed_questions, do_sample=True)
```

## Training Details

### Training Data

The model has been fine-tuned on the `MohamedRashad/ChatGPT-prompts` and `Hello-SimpleAI/HC3` datasets. More detailed information on the training data, including links to the Dataset Cards and preprocessing details, is still needed.

### Framework versions

- PEFT 0.7.1
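### Example LoRA configuration (illustrative)

For readers who want to reproduce a comparable setup, the sketch below shows how a LoRA adapter might be attached to `google/flan-t5-large` with PEFT before training. The rank, alpha, dropout, and target modules are illustrative assumptions rather than the configuration actually used; the Kaggle notebook linked above contains the exact values.

```python
# Illustrative sketch of a LoRA setup for seq2seq fine-tuning with PEFT.
# The hyperparameters below (r, lora_alpha, lora_dropout, target_modules) are
# assumptions for demonstration, not the recorded training configuration.
from transformers import AutoTokenizer, T5ForConditionalGeneration
from peft import LoraConfig, get_peft_model, TaskType

base_model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # encoder-decoder (T5) fine-tuning
    r=16,                             # assumed adapter rank
    lora_alpha=32,                    # assumed scaling factor
    lora_dropout=0.05,                # assumed dropout on adapter layers
    target_modules=["q", "v"],        # assumed: T5 attention query/value projections
)

# Freeze the base model and inject the trainable low-rank adapter matrices
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters is trainable
```

The wrapped model can then be trained with a standard `transformers` Seq2SeqTrainer loop, and only the small adapter weights need to be saved and pushed to the Hub.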