Spaces: Miuzarte/SUI-svc-3.0


Miuzarte committed on Jan 29, 2023

Commit 2779331 (1 parent: 6f36ec1)

Upload app.py

Files changed (1)

app.py +97 -10

@@ -61,28 +61,115 @@ with app:
                 Todo:
-                1. Export to ONNX
-                2. One-click local package
-                3. TTS, vits or emotional-vits
             """)
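             vc_input3 = gr.Audio(label="Input audio (keep it to about 30 s; longer clips may run out of memory)")
             vc_transform = gr.Number(label="Pitch shift (integer, positive or negative, in semitones; 12 raises the pitch by one octave)", value=0)
             vc_submit = gr.Button("Convert", variant="primary")
             vc_output2 = gr.Audio(label="Output audio (use the three dots on the far right to download)")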
         vc_submit.click(vc_fn, [vc_input3, vc_transform], [vc_output2])
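-        with gr.TabItem("Repository notes ➕ step-by-step local deployment tutorial"):
             gr.Markdown(value="""
-                ## Data used to train the models in the repository:
-                |Model|G_1000000.pth|G_1M111000_sing.pth (current)|G_1M100000_sing.pth (planned)| G_1M100000_sing1.pth (planned)|
-                |-:|:-:|:-:|:-:|:-:|
-                |Training set|December stream recordings (excluding radio streams), 22 song uploads since debut, 10 song clips, and the Christmas voice audio (27.5 hours)|G_1000000.pth as the base model, then trained on all 2022 song uploads, song clips, and the Christmas voice audio (3.9 hours)|G_1000000.pth as the base model, then trained on all song uploads since debut, song clips, and the Christmas voice audio (BGM removed with a better-performing UVR5 model)|First train a base model on January stream recordings (excluding radio streams), then train on all song uploads since debut, song clips, and the Christmas voice audio|
-                #### The repository contains both G.pth and D.pth; feel free to use them as base models for further training
-                #### To train on your own data, see the [project GitHub repository](https://github.com/innnky/so-vits-svc/tree/main) and the [tutorial "svc相关"](https://www.yuque.com/jiuwei-nui3d/qng6eg) (beginners should be cautious about joining the chat group mentioned there)
                 ### Local inference can use an NVIDIA GPU; a 3060 Ti 8 GB can process one clip of 20 s (recommended) to 30 s. Longer audio can be split and batch-processed, and even CPU inference is considerably faster than on Hugging Face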

                 Todo:
+                1. Export to ONNX (✔)
+                2. One-click local package (not needed)
+                3. TTS, vits (working)
             """)
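             vc_input3 = gr.Audio(label="Input audio (keep it to about 30 s; longer clips may run out of memory)")
             vc_transform = gr.Number(label="Pitch shift (integer, positive or negative, in semitones; 12 raises the pitch by one octave)", value=0)
             vc_submit = gr.Button("Convert", variant="primary")
             vc_output2 = gr.Audio(label="Output audio (use the three dots on the far right to download)")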
         vc_submit.click(vc_fn, [vc_input3, vc_transform], [vc_output2])
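+        with gr.TabItem("Repository notes ➕ tutorial for fast local inference with MoeSS"):
             gr.Markdown(value="""
+                ## Data used to train the models in the [repository](https://huggingface.co/Miuzarte/SUImodels):
+                |Voice changer|G_1000000.pth|G_1M111000_sing.pth(suiji_1M111000_SoVits.onnx)| G_100K100000_sing.pth (planned)|
+                |-:|:-:|:-:|:-:|
+                |Training set|December stream recordings (excluding radio streams), 22 song uploads since debut, 10 song clips, and the Christmas voice audio (27.5 hours)|G_1000000.pth as the base model, then trained on all 2022 song uploads, song clips, and the Christmas voice audio (3.9 hours)|First train a base model with fewer steps on December and January stream recordings (excluding radio streams), then continue training on all song uploads since debut, song clips, and the Christmas voice audio|
+                #### The [repository](https://huggingface.co/Miuzarte/SUImodels) contains both G.pth and D.pth; feel free to use them as base models for further training
+                #### To train on your own data, see the [[project GitHub repository]](https://github.com/innnky/so-vits-svc) (the 32k branch is the path with fewer detours; the 48k branch is barely maintained)
+                # Local inference with [MoeSS](https://github.com/NaruseMioShirakana/MoeSS):
+                #### Because this program changes substantially with every update, all download links below point to [[MoeSS 3.0.0]](https://github.com/NaruseMioShirakana/MoeSS/releases/tag/3.0.0)
+                ### 0. Download [[MoeSS itself]](https://github.com/NaruseMioShirakana/MoeSS/releases/download/3.0.0/MoeSS.zip), [[bins]](https://github.com/NaruseMioShirakana/MoeSS/releases/download/3.0.0/bins.7z), [[hifigan]](https://github.com/NaruseMioShirakana/MoeSS/releases/download/3.0.0/hifigan.7z), and [[hubert]](https://github.com/NaruseMioShirakana/MoeSS/releases/download/3.0.0/hubert.7z), then extract them into the following file structure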
+                ```
+                MoeSS
+                ├── Mods
+                ├── MoeSS.exe
+                ├── ShirakanaUI.dmres
+                ├── bins
+                │   └── ffmpeg.exe
+                ├── cleaners
+                ├── hifigan
+                │   └── hifigan.onnx
+                ├── hubert
+                │   └── hubert.onnx
+                ├── onnxruntime.dll
+                ├── onnxruntime_providers_shared.dll
+                └── onnxruntime_providers_tensorrt.dll
+                ```
+                ### 1. Download the [[converted ONNX model]](https://huggingface.co/Miuzarte/SUImodels/blob/main/onnx/suiji_1M111000_SoVits.onnx)
+                ### 2. Create a MoeSS.json file in MoeSS\\Mods and write the following text into it. When saving, make sure the encoding is UTF-8
+                ```json
+                {
+                "Folder" : "suiji_1M111000",
+                "Name" : "岁己SUI",
+                "Type" : "SoVits",
+                "Symbol" : "",
+                "Cleaner" : "",
+                "Rate" : 48000,
+                "Hop" : 320,
+                "Hifigan": "",
+                "Hubert": "hubert",
+                "SoVits3": true,
+                "Characters" : ["岁己SUI"]
+                }
+                ```
+                #### After the steps above, the file structure should look like this
+                ```
+                MoeSS
+                ├── Mods
+                │   ├── MoeSS.json
+                │   └── suiji_1M111000
+                │       └── suiji_1M111000_SoVits.onnx
+                ├── MoeSS.exe
+                ├── ShirakanaUI.dmres
+                ├── bins
+                │   └── ffmpeg.exe
+                ├── cleaners
+                ├── hifigan
+                │   ├── hifigan.onnx
+                │   └── nsf_hifigan.onnx
+                ├── hubert
+                │   └── hubert.onnx
+                ├── onnxruntime.dll
+                ├── onnxruntime_providers_shared.dll
+                └── onnxruntime_providers_tensorrt.dll
+                ```
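+                ### 3. Run MoeSS.exe
+                1. Select the model “SoVits:岁己SUI” in the top-left corner and wait for it to load; once it finishes, the right side shows “Current model: 岁己SUI”
+                2. Enter the audio file path in the input box at the bottom left, for example: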
+                ```
+                A:\SUI\so-vits-svc\\raw\wavs\\2044.flac
+                A:\SUI\so-vits-svc\\raw\wavs\\2044.wav
+                ```
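+                The program calls ffmpeg to convert the audio, so the input does not have to be in WAV format
+                NaruseMioShirakana: the next version will add drag-and-drop to fill in the file path automatically
+                3. Click “Start voice conversion”; the parameter dialog that pops up lets you raise or lower the pitch of the input audio. After confirming, wait for the progress bar at the bottom to finish
+                |Everything below is deprecated|
+                |:-:|
+                |Everything below is deprecated|
                 ### Local inference can use an NVIDIA GPU; a 3060 Ti 8 GB can process one clip of 20 s (recommended) to 30 s. Longer audio can be split and batch-processed, and even CPU inference is considerably faster than on Hugging Face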
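
Step 2 of the MoeSS tutorial in this diff stresses that MoeSS.json must be saved as UTF-8, which is easy to get wrong when a text editor falls back to the system default encoding. A minimal sketch of writing the file from Python instead is shown here. The field values are copied from the JSON in the diff; `write_moess_config` is a hypothetical helper name, not part of MoeSS, and the sketch assumes plain UTF-8 without a byte-order mark is acceptable, which the tutorial does not state either way.

```python
import json
from pathlib import Path

# Values copied from the MoeSS.json shown in step 2 of the tutorial.
MOESS_CONFIG = {
    "Folder": "suiji_1M111000",
    "Name": "岁己SUI",
    "Type": "SoVits",
    "Symbol": "",
    "Cleaner": "",
    "Rate": 48000,
    "Hop": 320,
    "Hifigan": "",
    "Hubert": "hubert",
    "SoVits3": True,
    "Characters": ["岁己SUI"],
}


def write_moess_config(mods_dir, config=MOESS_CONFIG):
    """Write MoeSS.json into mods_dir as UTF-8 and return its path.

    Hypothetical helper for illustration; it is not part of MoeSS.
    """
    mods_dir = Path(mods_dir)
    mods_dir.mkdir(parents=True, exist_ok=True)
    target = mods_dir / "MoeSS.json"
    # ensure_ascii=False keeps the Chinese characters as-is in the file
    # instead of \uXXXX escapes; encoding="utf-8" overrides the platform
    # default encoding, which is often not UTF-8 on Windows.
    text = json.dumps(config, ensure_ascii=False, indent=4)
    target.write_text(text, encoding="utf-8")
    return target
```

Calling `write_moess_config(r"MoeSS\Mods")` from the directory that contains the extracted MoeSS folder would produce the file at the location the tutorial's second directory tree shows.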