---
license: apache-2.0
datasets:
- mlabonne/FineTome-100k
language:
- en
library_name: transformers
base_model:
- AINovice2005/LeEmpereur-unhealed
---

## Model Overview

- **Model Name:** Le-Empereur_70-Base

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e8ea3892d9db9a93580fe3/lc5gftKyL60zY5JXq6fD-.png)

## Model Description:

The pruned model was fine-tuned on the FineTome-100k dataset to partially restore convergence.

## Inference Script:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(model_name, input_text, max_new_tokens=50):
    # Load the tokenizer and model from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Tokenize the input text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids

    # Generate a response using the model
    with torch.no_grad():
        generated_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)

    # Decode the generated tokens into text
    generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)

    return generated_text


if __name__ == "__main__":
    # Set the model name from the Hugging Face Hub
    model_name = "AINovice2005/Le-Empereur_70-Base"
    input_text = "Hello, how are you?"

    # Generate and print the model's response
    output = generate_response(model_name, input_text)
    print(f"Input: {input_text}")
    print(f"Output: {output}")
```

## Results:

- A higher learning rate was required to train the model on the dataset; training methods such as SFT and ORPO combined with PEFT failed to restore convergence.
- Training with PEFT alone helped restore partial convergence.
- Future experiments aimed at restoring full convergence will require a systematic training and evaluation strategy.
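
To make the PEFT-only setup above concrete, the sketch below attaches a LoRA adapter to the pruned base model and runs a short causal-LM fine-tune on FineTome-100k with the Hugging Face `Trainer`. It is a minimal sketch, not the exact recipe used for this checkpoint: the LoRA rank, target module names, learning rate, sequence length, and the flattening of FineTome's ShareGPT-style `conversations` column are all illustrative assumptions.

```python
# Illustrative sketch only: hyperparameters and formatting choices are assumptions,
# not the exact configuration used to train Le-Empereur_70-Base.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "AINovice2005/LeEmpereur-unhealed"  # pruned base listed in this card
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# bf16 assumes a GPU with bfloat16 support
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# PEFT-only setup: attach a LoRA adapter so only a small fraction of weights train
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Flatten FineTome's ShareGPT-style "conversations" turns into plain text
def to_text(example):
    turns = example["conversations"]
    return {"text": "\n".join(f'{t["from"]}: {t["value"]}' for t in turns)}

dataset = load_dataset("mlabonne/FineTome-100k", split="train").map(to_text)

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="le-empereur-peft",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,  # a relatively high LR, in line with the results above
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The learning rate here is set above typical LoRA defaults to reflect the observation in the results; a systematic sweep over it, together with a fixed evaluation protocol, is the natural next step noted above.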