[中文](README.md)

# Qwen2.5-Sex

## Introduction

Qwen2.5-Sex is a model fine-tuned from Qwen2.5-1.5B-Instruct, trained primarily on a large corpus of erotic literary works and sensitive datasets. Because the datasets are mainly in Chinese, the model performs better on Chinese text.

> **Warning**: This model is for research and testing purposes only. Users must comply with local laws and regulations and take responsibility for their actions.

## Model Usage

To hold a **continuous conversation**, use the following code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import os

# Adjustable sampling parameters. Higher TOP_P and TOP_K values are
# recommended for text generation; TEMPERATURE is the exception and stays low.
TOP_P = 0.9        # Top-p (nucleus sampling), range from 0 to 1
TOP_K = 80         # Top-k sampling value K
TEMPERATURE = 0.3  # Controls randomness in text generation

# Directory containing the model files. By default this is the directory of
# the current script (requires running as a script, since it relies on
# __file__); it can also be replaced with an absolute path.
current_directory = os.path.dirname(os.path.abspath(__file__))

# Load the model and tokenizer; device_map="auto" picks the device
model = AutoModelForCausalLM.from_pretrained(
    current_directory,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(current_directory)

# System instructions (recommended to be empty)
messages = [
    {"role": "system", "content": ""}
]

while True:
    # Get user input
    user_input = input("User: ").strip()

    # Add user input to the conversation
    messages.append({"role": "user", "content": user_input})

    # Render the conversation with the model's chat template
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    # Move the inputs to the device the model was loaded on
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generate a response; passing **model_inputs also supplies the attention mask
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=512,
        top_p=TOP_P,
        top_k=TOP_K,
        temperature=TEMPERATURE,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id  # Avoid warnings
    )
    # Strip the prompt tokens so only the newly generated tokens remain
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    # Decode and print the response
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(f"Assistant: {response}")

    # Add the generated response to the conversation
    messages.append({"role": "assistant", "content": response})
```

## Datasets

The Qwen2.5-Sex model has been fine-tuned on a large amount of erotic literature and on sensitive datasets covering themes such as ethics, law, pornography, and violence. Because the fine-tuning data is in Chinese, the model performs better on Chinese text. For more information, see the following links:

- [Bad Data](https://huggingface.co/datasets/ystemsrx/bad_data.json)
- [Toxic-All](https://huggingface.co/datasets/ystemsrx/Toxic-All)
- [Erotic Literature Collection](https://huggingface.co/datasets/ystemsrx/Erotic_Literature_Collection)

For more dataset information, including how to obtain the datasets, please visit our [GitHub](https://github.com/ystemsrx).

## GitHub Repository

For detailed information and ongoing updates about this series of models, please visit our GitHub repository:

- [GitHub: ystemsrx/Qwen2.5-Sex](https://github.com/ystemsrx/Qwen2.5-Sex)

## Disclaimer

All content provided by this model is for research and testing purposes only. The model developers are not responsible for any misuse. Users must comply with relevant laws and regulations and bear all responsibilities arising from the use of this model.
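
## Note on Long Conversations

The conversation loop under Model Usage appends every user and assistant turn to `messages`, so the prompt grows with each exchange. One possible way to keep it bounded is to retain only the most recent turns before rendering the chat template. The sketch below is an illustration, not part of this repository: the helper name `trim_history` and its `max_turns` parameter are hypothetical, and counting turns is a simple stand-in for counting tokens.

```python
def trim_history(messages, max_turns=8):
    """Return system messages plus at most the last `max_turns` exchanges.

    Hypothetical helper for illustration; not part of the original code.
    """
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        return system
    # One exchange is a user message followed by an assistant message
    recent = dialogue[-2 * max_turns:]
    # Make sure the retained history starts with a user message
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent


if __name__ == "__main__":
    history = [{"role": "system", "content": ""}]
    for i in range(1, 4):
        history.append({"role": "user", "content": f"question {i}"})
        history.append({"role": "assistant", "content": f"answer {i}"})
    for message in trim_history(history, max_turns=2):
        print(message["role"], message["content"])
```

In the loop, this could be applied as `messages = trim_history(messages)` right after the assistant response is appended. A token-based limit would be more precise, since individual turns vary widely in length.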