wyt1234
committed on
Commit c45ebf2
1 Parent(s): a347ec0
Update: README.md README_zh.md

- README.md +82 -3
- README_zh.md +89 -0
- assets/Wechat.jpeg +0 -0
- assets/product1.png +0 -0
- assets/product3.png +0 -0
- assets/saofund2.png +0 -0
- assets/sft_demo.png +0 -0
- assets/这个男人能嫁吗.jpg +0 -0
- cli_demo.py +261 -0
README.md
CHANGED
@@ -1,3 +1,82 @@
![# MarryWise](assets/这个男人能嫁吗.jpg)
<!-- <img src="assets/这个男人能嫁吗.jpg" width="900" alt="# MarryWise"> -->

<!-- [![GitHub Repo stars](https://img.shields.io/github/stars/saofund/marrywise-llm?style=social)](https://github.com/saofund/marrywise-llm/stargazers) -->
[![GitHub Code License](https://img.shields.io/github/license/saofund/marrywise-llm)](LICENSE)
[![GitHub last commit](https://img.shields.io/github/last-commit/saofund/marrywise-llm)](https://github.com/saofund/marrywise-llm/commits/main)
[![Studios](https://img.shields.io/badge/ModelScope-Open%20in%20ModelScope-blue)](https://modelscope.cn/models/qwen/Qwen2-7B)
[![Spaces](https://img.shields.io/badge/🤗-Open%20in%20huggingface-blue)](https://huggingface.co/saofund/marrywise-7b-lora)
[![Twitter](https://img.shields.io/twitter/follow/sáofund)](https://x.com/976582772Wyt)

\[ English | [中文](README_zh.md) \]

<!-- **MarryWise: AI-Driven Matchmaking Analysis Tool** -->

| [![Online Experience](https://img.shields.io/badge/Online%20Experience-Can%20He%20Marry%20.com-blue)](https://xn--ciqpnj1l70hxw9az0oyqy.com/) | [![Online Experience 2](https://img.shields.io/badge/Online%20Experience2-Visit-blue)](https://can-he-marry.com/) |
|---|---|
| [![Product 1](assets/product1.png)](https://can-he-marry.com/) | [![Product 2](assets/product3.png)](https://can-he-marry.com/) |


## Features

- **AI Matchmaking Analyst**: Uses AI to automatically analyze behavior and speech patterns during dates, revealing hidden motives and true personality traits.
- **AI Fake Matchmaker Detector**: Helps users identify potential deception during the dating process and judge how authentic and reliable a partner really is.

## Main Analysis Dimensions

- **Personality Authenticity**: Assesses a potential partner's true character through behavior and speech analysis.
- **Actual Motives**: Reveals the real intentions behind a suitor's actions so you are not deceived by appearances.
- **Personal Achievements**: Evaluates a suitor's stated achievements and capabilities to check that they match reality.

## Fine-tuning Data Example

Fine-tuned from [Qwen2-7B](https://qwen.readthedocs.io/en/latest/getting_started/quickstart.html); an example of the fine-tuning data is shown below (the full dataset is coming soon).

| Input (Conditions) | Output (after analysis and rewriting by the large model) |
| --- | --- |
| 1. 32 years old, 178cm <br> 2. Collects rent, annual income of 15104 <br> 3. 6 houses, one car <br> 4. Loves cleanliness, loves cooking, loves his wife; parents divorced, lives with his mother, who is gentle. | 🔍 Hidden Secrets: The truth he doesn't want you to know <br> In the dazzling matchmaking market, everyone tries to show their best side. But beware of those unspoken secrets; they could be stumbling blocks on your path to happiness. <br> 🍳 The weight mystery behind cooking <br> He claims to be a cooking master, living an easy and comfortable life. But numbers don't lie: the undisclosed weight may hint at an unhealthy lifestyle. Don't overlook long-term health risks while enjoying the delicacies. <br> 💰 Questions about property origin <br> Six properties, one car: a seemingly solid financial foundation. But where do these assets come from? Being a demolition household might signify luck, or it could indicate a lack of ambition. <br> 👨👩👦 Complexity of family relationships <br> He praises his mother's gentleness and easy-going nature, but does that mean he cannot be independent? The mother-son relationship in a single-parent family may be more complex than it appears, potentially affecting future family harmony. <br> 💍 The "loving wife" label trap <br> "Loving wife" sounds perfect, but beware it could be a psychological tactic. Don't be easily fooled by this label without delving into his true intentions. <br> 🔍 Deep Dive: What are the real motives? <br> He understands women, but why is he still single? There might be hidden secrets under his perfect exterior. Uncover the veil and see his true motives before making a decision. |
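
The training data itself has not been released yet. Purely as an illustration, a single sample like the row above might be stored in LLaMA-Factory's alpaca-style `instruction`/`input`/`output` format roughly as sketched below; the file name and field contents are hypothetical.

```python
# Hypothetical sketch only: one SFT sample in LLaMA-Factory's alpaca-style JSON.
# The real MarryWise dataset is not yet public, so names and contents are illustrative.
import json

sample = {
    # The fixed "matchmaking condition analyst" prompt described in Local Setup below.
    "instruction": ("Your role is a matchmaking condition analyst, specializing in identifying "
                    "the \"hidden\" conditions not mentioned by the male party, analyzing the "
                    "\"secrets not mentioned\" in matchmaking."),
    # The raw conditions (left column of the table above).
    "input": "1. 32 years old, 178cm 2. Collects rent, annual income of 15104 3. 6 houses, one car ...",
    # The analyst-style write-up (right column of the table above).
    "output": "🔍 Hidden Secrets: The truth he doesn't want you to know ...",
}

with open("marrywise_sft.json", "w", encoding="utf-8") as f:
    json.dump([sample], f, ensure_ascii=False, indent=2)
```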

## Local Setup

##### Detailed Steps:

```shell
# Download Qwen2-7B-Instruct model: https://modelscope.cn/models/qwen/Qwen2-7B/files
git lfs install
git clone https://www.modelscope.cn/qwen/Qwen2-7B.git

# Download lora weights
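# (Hypothetical example: the marrywise-7b-lora adapter linked in the badge above could be
#  fetched the same way; "output_qwen" matches the adapter path used in the merge step below.)
# git clone https://huggingface.co/saofund/marrywise-7b-lora output_qwen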

# Install LLaMA-Factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"  # Install dependencies; follow the official instructions

# Use LLaMA-Factory to merge the lora weights into the base model.
# Requires a GPU, roughly 12G of VRAM.
#   --model_name_or_path   the Qwen2-7B weights downloaded above
#   --adapter_name_or_path path to the lora weights
#   --export_dir           output path for the merged full weights
#   (the remaining flags keep their default values)
llamafactory-cli export \
    --model_name_or_path Qwen2-7B-Instruct \
    --adapter_name_or_path output_qwen \
    --template qwen \
    --finetuning_type lora \
    --export_dir lora_full_param_model \
    --export_size 2 \
    --export_legacy_format False

# Official Qwen2 inference test script; replace the weight path with the merged path.
python cli_demo.py -c path_to_merged_weights  # Approximately 15G VRAM

# Note: due to the "style" characteristics of lora fine-tuning, a specific prompt needs to be added at the beginning of the question:
# Your role is a matchmaking condition analyst, specializing in identifying the "hidden" conditions not mentioned by the male party, analyzing the "secrets not mentioned" in matchmaking. xxxx (followed by the specific conditions)

```
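
The merged weights can also be queried without the interactive demo. The following is a minimal sketch, assuming the merge output directory `lora_full_param_model` from the step above and the prompt prefix noted in the script; it is not part of the project's own tooling.

```python
# Minimal sketch, not the project's official API: load the merged weights with
# transformers and prepend the analyst prompt required by the lora style.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "lora_full_param_model"  # output of the llamafactory-cli export step above
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")

prefix = ('Your role is a matchmaking condition analyst, specializing in identifying the '
          '"hidden" conditions not mentioned by the male party, analyzing the "secrets not '
          'mentioned" in matchmaking. ')
conditions = "1. 32 years old, 178cm 2. Collects rent, annual income of 15104 ..."  # replace with real conditions

messages = [{"role": "user", "content": prefix + conditions}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```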

##### Local CLI Result:
<img src="assets/sft_demo.png" width="500" alt="CLI Result">

#### Contact the Author

For dataset acquisition, models, algorithms, technical exchanges, and collaborative development, feel free to add the author's WeChat.

| Author's WeChat QR Code | sáo Fund Sponsorship |
|---|---|
| ![Author's WeChat QR Code](assets/Wechat.jpeg) | ![sáo Fund Logo](assets/saofund2.png) |
| For dataset acquisition, models, algorithms, technical exchanges, and collaborative development, feel free to add the author's WeChat. | Sponsored by sáo Fund, thank you. |
README_zh.md
ADDED
@@ -0,0 +1,89 @@
![# MarryWise](assets/这个男人能嫁吗.jpg)
<!-- <img src="assets/这个男人能嫁吗.jpg" width="900" alt="# MarryWise"> -->

<!-- [![GitHub Repo stars](https://img.shields.io/github/stars/saofund/marrywise-llm?style=social)](https://github.com/saofund/marrywise-llm/stargazers) -->
[![GitHub Code License](https://img.shields.io/github/license/saofund/marrywise-llm)](LICENSE)
[![GitHub last commit](https://img.shields.io/github/last-commit/saofund/marrywise-llm)](https://github.com/saofund/marrywise-llm/commits/main)
[![Studios](https://img.shields.io/badge/ModelScope-Open%20in%20ModelScope-blue)](https://modelscope.cn/models/qwen/Qwen2-7B)
[![Spaces](https://img.shields.io/badge/🤗-Open%20in%20huggingface-blue)](https://huggingface.co/saofund/marrywise-7b-lora)
[![Twitter](https://img.shields.io/twitter/follow/sáofund)](https://x.com/976582772Wyt)

\[ [English](README.md) | 中文 \]

<!-- **MarryWise: AI驱动的相亲分析工具** -->

| [![在线体验](https://img.shields.io/badge/在线地址-这个男人能嫁吗.com-blue)](https://xn--ciqpnj1l70hxw9az0oyqy.com/) | [![在线体验2](https://img.shields.io/badge/在线地址2-访问-blue)](https://can-he-marry.com/) |
|---|---|
| [![Product 1](assets/product1.png)](https://can-he-marry.com/) | [![Product 2](assets/product3.png)](https://can-he-marry.com/) |


## 功能特点

- **AI 相亲分析师**:利用先进的 AI 技术,自动分析约会中的行为和言语模式,揭示隐藏的动机和真实的性格特征。
- **AI 打假媒婆**:帮助用户识别约会过程中的潜在欺骗行为,确保伴侣的真实性和可靠性。

## 主要分析维度

- **性格真实性**:通过行为和言语分析,评估潜在伴侣的真实性格。
- **实际动机**:揭示求婚者行为背后的真实意图,防止被表象所迷惑。
- **个人成就**:评估求婚者的个人成就和能力,确保其描述与实际情况一致。

## 微调数据示例

使用 [Qwen2-7B](https://qwen.readthedocs.io/en/latest/getting_started/quickstart.html) 微调,微调数据示例(即将开源)。

| 输入(条件) | 输出(分析、大模型转换后) |
| --- | --- |
| 1. 32岁,178 <br> 2. 收房租,年收入15104 <br> 3. 6套房,一辆代步车 <br> 4. 爱干净,爱做饭,爱老婆,父母离异跟母亲,母亲温柔。 | 🔍 隐藏的秘密:他不想让你知道的真相 <br> 在闪耀的相亲市场,每个人都试图展示自己最好的一面。但小心那些未被言说的秘密,它们可能是你幸福之路上的绊脚石。 <br> 🍳 烹饪背后的体重迷思 <br> 他自称烹饪大师,生活轻松自在。但数字不会说谎:未提及的体重可能暗示着不健康的生活方式。在你品尝美食的同时,不要忽视长期健康的风险。 <br> 💰 财产来源的疑问 <br> 六套房产,一辆车——看似稳固的经济基础。但这些财产来自何方?拆迁户的身份可能是幸运的象征,也可能是缺乏进取心的标志。 <br> 👨👩👦 家庭关系的复杂性 <br> 他称赞母亲的温柔和易相处,但这是否意味着他无法独立?单亲家庭背景下的母子关系可能比表面看起来要复杂得多,这可能会影响未来的家庭和谐。 <br> 💍 “爱妻”标签的陷阱 <br> “爱老婆”听起来很完美,但小心这是一种心理战术。在深入了解他的真实意图之前,不要轻易被这个标签迷惑。 <br> 🔍 深入挖掘:真正的动机是什么? <br> 他了解女性,但为何仍单身?在他的完美外表下,可能隐藏着不为人知的秘密。在做出决定之前,请揭开那层薄纱,看清他的真实动机。 |

## 本地启动

##### 详细步骤:

```shell
# 下载Qwen2-7B-Instruct模型:https://modelscope.cn/models/qwen/Qwen2-7B/files
git lfs install
git clone https://www.modelscope.cn/qwen/Qwen2-7B.git

# 下载lora权重

# 安装 LLaMA-Factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"  # 安装依赖,这里最好按官方说明装完

# 使用 LLaMA-Factory 合并 lora 权重到基础模型
# 需要 GPU,大概 12G 显存占用
#   --model_name_or_path   刚下载的 Qwen2-7B 权重
#   --adapter_name_or_path lora 权重路径
#   --export_dir           完整权重的输出路径
#   (其余参数保持默认)
llamafactory-cli export \
    --model_name_or_path Qwen2-7B-Instruct \
    --adapter_name_or_path output_qwen \
    --template qwen \
    --finetuning_type lora \
    --export_dir lora_full_param_model \
    --export_size 2 \
    --export_legacy_format False

# Qwen2 的官方推理测试脚本,替换权重路径为刚才合并后的路径
python cli_demo.py -c 合并后的权重路径  # 大概15G显存

# 请注意,由于lora微调的“风格”特性,需要在问题的开头加入特定提示词:
# 你的身份是一个相亲条件分析师,专门寻找男方条件中“隐瞒没有说”的条件,分析“相亲中男生没有说的秘密”。xxxx(后面跟具体条件)

```

##### 本地运行cli结果:
<img src="assets/sft_demo.png" width="500" alt="CLI Result">

#### 欢迎联系作者

数据集获取、模型、算法、技术交流、合作开发等,欢迎添加作者微信。

| 作者微信二维码 | sáo基金赞助 |
|---|---|
| ![作者的微信二维码](assets/Wechat.jpeg) | ![sáo基金标志](assets/saofund2.png) |
| 数据集获取、模型、算法、技术交流、合作开发等,欢迎添加作者微信。 | 由 sáo 基金赞助,感谢。 |
assets/Wechat.jpeg
ADDED
assets/product1.png
ADDED
assets/product3.png
ADDED
assets/saofund2.png
ADDED
assets/sft_demo.png
ADDED
assets/这个男人能嫁吗.jpg
ADDED
cli_demo.py
ADDED
@@ -0,0 +1,261 @@
"""A simple command-line interactive chat demo."""

import argparse
import os
import platform
import shutil
from copy import deepcopy
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from transformers.trainer_utils import set_seed

DEFAULT_CKPT_PATH = 'Qwen/Qwen2-7B-Instruct'

_WELCOME_MSG = '''\
Welcome to use Qwen2-Instruct model, type text to start chat, type :h to show command help.
(欢迎使用 Qwen2-Instruct 模型,输入内容即可进行对话,:h 显示命令帮助。)

Note: This demo is governed by the original license of Qwen2.
We strongly advise users not to knowingly generate or allow others to knowingly generate harmful content, including hate speech, violence, pornography, deception, etc.
(注:本演示受Qwen2的许可协议限制。我们强烈建议,用户不应传播及不应允许他人传播以下内容,包括但不限于仇恨言论、暴力、色情、欺诈相关的有害信息。)
'''
_HELP_MSG = '''\
Commands:
    :help / :h              Show this help message              显示帮助信息
    :exit / :quit / :q      Exit the demo                       退出Demo
    :clear / :cl            Clear screen                        清屏
    :clear-history / :clh   Clear history                       清除对话历史
    :history / :his         Show history                        显示对话历史
    :seed                   Show current random seed            显示当前随机种子
    :seed <N>               Set random seed to <N>               设置随机种子
    :conf                   Show current generation config      显示生成配置
    :conf <key>=<value>     Change generation config            修改生成配置
    :reset-conf             Reset generation config             重置生成配置
'''
_ALL_COMMAND_NAMES = [
    'help', 'h', 'exit', 'quit', 'q', 'clear', 'cl', 'clear-history', 'clh', 'history', 'his',
    'seed', 'conf', 'reset-conf',
]


def _setup_readline():
    try:
        import readline
    except ImportError:
        return

    _matches = []

    def _completer(text, state):
        nonlocal _matches

        if state == 0:
            _matches = [cmd_name for cmd_name in _ALL_COMMAND_NAMES if cmd_name.startswith(text)]
        if 0 <= state < len(_matches):
            return _matches[state]
        return None

    readline.set_completer(_completer)
    readline.parse_and_bind('tab: complete')


def _load_model_tokenizer(args):
    tokenizer = AutoTokenizer.from_pretrained(
        args.checkpoint_path, resume_download=True,
    )

    if args.cpu_only:
        device_map = "cpu"
    else:
        device_map = "auto"

    model = AutoModelForCausalLM.from_pretrained(
        args.checkpoint_path,
        torch_dtype="auto",
        device_map=device_map,
        resume_download=True,
    ).eval()
    model.generation_config.max_new_tokens = 2048    # For chat.

    return model, tokenizer


def _gc():
    import gc
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


def _clear_screen():
    if platform.system() == "Windows":
        os.system("cls")
    else:
        os.system("clear")


def _print_history(history):
    terminal_width = shutil.get_terminal_size()[0]
    print(f'History ({len(history)})'.center(terminal_width, '='))
    for index, (query, response) in enumerate(history):
        print(f'User[{index}]: {query}')
        print(f'QWen[{index}]: {response}')
    print('=' * terminal_width)


def _get_input() -> str:
    while True:
        try:
            message = input('User> ').strip()
        except UnicodeDecodeError:
            print('[ERROR] Encoding error in input')
            continue
        except KeyboardInterrupt:
            exit(1)
        if message:
            return message
        print('[ERROR] Query is empty')


def _chat_stream(model, tokenizer, query, history):
    conversation = [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
    ]
    for query_h, response_h in history:
        conversation.append({'role': 'user', 'content': query_h})
        conversation.append({'role': 'assistant', 'content': response_h})
    conversation.append({'role': 'user', 'content': query})
    inputs = tokenizer.apply_chat_template(
        conversation,
        add_generation_prompt=True,
        return_tensors='pt',
    )
    inputs = inputs.to(model.device)
    streamer = TextIteratorStreamer(tokenizer=tokenizer, skip_prompt=True, timeout=60.0, skip_special_tokens=True)
    generation_kwargs = dict(
        input_ids=inputs,
        streamer=streamer,
    )
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    for new_text in streamer:
        yield new_text


def main():
    parser = argparse.ArgumentParser(
        description='QWen2-Instruct command-line interactive chat demo.')
    parser.add_argument("-c", "--checkpoint-path", type=str, default=DEFAULT_CKPT_PATH,
                        help="Checkpoint name or path, default to %(default)r")
    parser.add_argument("-s", "--seed", type=int, default=1234, help="Random seed")
    parser.add_argument("--cpu-only", action="store_true", help="Run demo with CPU only")
    args = parser.parse_args()

    history, response = [], ''

    model, tokenizer = _load_model_tokenizer(args)
    orig_gen_config = deepcopy(model.generation_config)

    _setup_readline()

    _clear_screen()
    print(_WELCOME_MSG)

    seed = args.seed

    while True:
        query = _get_input()

        # Process commands.
        if query.startswith(':'):
            command_words = query[1:].strip().split()
            if not command_words:
                command = ''
            else:
                command = command_words[0]

            if command in ['exit', 'quit', 'q']:
                break
            elif command in ['clear', 'cl']:
                _clear_screen()
                print(_WELCOME_MSG)
                _gc()
                continue
            elif command in ['clear-history', 'clh']:
                print(f'[INFO] All {len(history)} history cleared')
                history.clear()
                _gc()
                continue
            elif command in ['help', 'h']:
                print(_HELP_MSG)
                continue
            elif command in ['history', 'his']:
                _print_history(history)
                continue
            elif command in ['seed']:
                if len(command_words) == 1:
                    print(f'[INFO] Current random seed: {seed}')
                    continue
                else:
                    new_seed_s = command_words[1]
                    try:
                        new_seed = int(new_seed_s)
                    except ValueError:
                        print(f'[WARNING] Fail to change random seed: {new_seed_s!r} is not a valid number')
                    else:
                        print(f'[INFO] Random seed changed to {new_seed}')
                        seed = new_seed
                    continue
            elif command in ['conf']:
                if len(command_words) == 1:
                    print(model.generation_config)
                else:
                    for key_value_pairs_str in command_words[1:]:
                        eq_idx = key_value_pairs_str.find('=')
                        if eq_idx == -1:
                            print('[WARNING] format: <key>=<value>')
                            continue
                        conf_key, conf_value_str = key_value_pairs_str[:eq_idx], key_value_pairs_str[eq_idx + 1:]
                        try:
                            conf_value = eval(conf_value_str)
                        except Exception as e:
                            print(e)
                            continue
                        else:
                            print(f'[INFO] Change config: model.generation_config.{conf_key} = {conf_value}')
                            setattr(model.generation_config, conf_key, conf_value)
                continue
            elif command in ['reset-conf']:
                print('[INFO] Reset generation config')
                model.generation_config = deepcopy(orig_gen_config)
                print(model.generation_config)
                continue
            else:
                # As normal query.
                pass

        # Run chat.
        set_seed(seed)
        _clear_screen()
        print(f"\nUser: {query}")
        print(f"\nQwen2-Instruct: ", end="")
        try:
            partial_text = ''
            for new_text in _chat_stream(model, tokenizer, query, history):
                print(new_text, end='', flush=True)
                partial_text += new_text
            response = partial_text
            print()

        except KeyboardInterrupt:
            print('[WARNING] Generation interrupted')
            continue

        history.append((query, response))


if __name__ == "__main__":
    main()