{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "SggegFslkbbK" }, "source": [ "https://github.com/PlayVoice/so-vits-svc-5.0/\n", "\n", "↑原仓库\n", "\n", "*《colab保持连接的方法》*https://zhuanlan.zhihu.com/p/144629818\n", "\n", "预览版本,可使用预设模型进行推理" ] }, { "cell_type": "markdown", "metadata": { "id": "M1MdDryJP73G" }, "source": [ "# **环境配置&必要文件下载**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "xfJWCr_EkO2i" }, "outputs": [], "source": [ "#@title 看看抽了个啥卡~~基本都是T4~~\n", "!nvidia-smi" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "nMspj8t3knR6" }, "outputs": [], "source": [ "#@title 克隆github仓库\n", "!git clone https://github.com/PlayVoice/so-vits-svc-5.0/ -b bigvgan-mix-v2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Kj2j81K6kubj" }, "outputs": [], "source": [ "#@title 安装依赖&下载必要文件\n", "%cd /content/so-vits-svc-5.0\n", "\n", "!pip install -r requirements.txt\n", "!pip install --upgrade pip setuptools numpy numba\n", "\n", "!wget -P hubert_pretrain/ https://github.com/bshall/hubert/releases/download/v0.1/hubert-soft-0d54a1f4.pt\n", "!wget -P whisper_pretrain/ https://openaipublic.azureedge.net/main/whisper/models/81f7c96c852ee8fc832187b0132e569d6c3065a3252ed18e56effd0b6a73e524/large-v2.pt\n", "!wget -P speaker_pretrain/ https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/dependency/best_model.pth.tar\n", "!wget -P crepe/assets https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/dependency/full.pth\n", "!wget -P vits_pretrain https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/5.0/sovits5.0.pretrain.pth" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "v9zHS9VXly9b" }, "outputs": [], "source": [ "#@title 加载Google云端硬盘\n", "from google.colab import drive\n", "drive.mount('/content/drive')" ] }, { "cell_type": "markdown", "metadata": { "id": "hZ5KH8NgQ7os" }, "source": [ "# 包含多说话人的推理预览" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2o6m3D0IsphU" }, "outputs": [], "source": [ "#@title 提取内容编码\n", "\n", "#@markdown **将处理好的\" .wav \"输入源文件上传到云盘根目录,并修改以下选项**\n", "\n", "#@markdown **\" .wav \"文件【文件名】**\n", "input = \"\\u30AE\\u30BF\\u30FC\\u3068\\u5B64\\u72EC\\u3068\\u84BC\\u3044\\u60D1\\u661F\" #@param {type:\"string\"}\n", "input_path = \"/content/drive/MyDrive/\"\n", "input_name = input_path + input\n", "!PYTHONPATH=. python whisper/inference.py -w {input_name}.wav -p test.ppg.npy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "A7nvX5mRlwJ7" }, "outputs": [], "source": [ "#@title 推理\n", "\n", "#@markdown **将处理好的\" .wav \"输入源文件上传到云盘根目录,并修改以下选项**\n", "\n", "#@markdown **\" .wav \"文件【文件名】**\n", "input = \"\\u30AE\\u30BF\\u30FC\\u3068\\u5B64\\u72EC\\u3068\\u84BC\\u3044\\u60D1\\u661F\" #@param {type:\"string\"}\n", "input_path = \"/content/drive/MyDrive/\"\n", "input_name = input_path + input\n", "#@markdown **指定说话人(0001~0056)(推荐0022、0030、0047、0051)**\n", "speaker = \"0002\" #@param {type:\"string\"}\n", "!PYTHONPATH=. python svc_inference.py --config configs/base.yaml --model vits_pretrain/sovits5.0.pretrain.pth --spk ./configs/singers/singer{speaker}.npy --wave {input_name}.wav --ppg test.ppg.npy" ] }, { "cell_type": "markdown", "metadata": { "id": "F8oerogXyd3u" }, "source": [ "推理结果保存在根目录,文件名为svc_out.wav" ] }, { "cell_type": "markdown", "metadata": { "id": "qKX17GElPuso" }, "source": [ "# 训练" ] }, { "cell_type": "markdown", "metadata": { "id": "sVe0lEGWQBLU" }, "source": [ "将音频剪裁为小于30秒的音频段,响度匹配并修改为单声道,预处理时会进行重采样所以对采样率无要求。(但是降低采样率的操作会降低你的数据质量)\n", "\n", "**使用Adobe Audition™的响度匹配功能可以一次性完成重采样修改声道和响度匹配。**\n", "\n", "之后将音频文件保存为以下文件结构:\n", "```\n", "dataset_raw\n", "├───speaker0\n", "│ ├───xxx1-xxx1.wav\n", "│ ├───...\n", "│ └───Lxx-0xx8.wav\n", "└───speaker1\n", " ├───xx2-0xxx2.wav\n", " ├───...\n", " └───xxx7-xxx007.wav\n", "```\n", "\n", "打包为zip格式,命名为data.zip,上传到网盘根目录。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "vC8IthV8VYgy" }, "outputs": [], "source": [ "#@title 从云盘获取数据集\n", "!unzip -d /content/so-vits-svc-5.0/ /content/drive/MyDrive/data.zip #自行修改路径与文件名" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "J101PiFUSL1N" }, "outputs": [], "source": [ "#@title 重采样\n", "# 生成采样率16000Hz音频, 存储路径为:./data_svc/waves-16k\n", "!python prepare/preprocess_a.py -w ./dataset_raw -o ./data_svc/waves-16k -s 16000\n", "# 生成采样率32000Hz音频, 存储路径为:./data_svc/waves-32k\n", "!python prepare/preprocess_a.py -w ./dataset_raw -o ./data_svc/waves-32k -s 32000" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZpxeYJCBSbgf" }, "outputs": [], "source": [ "#@title 提取f0\n", "!python prepare/preprocess_f0.py -w data_svc/waves-16k/ -p data_svc/pitch" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7VasDGhDSlP5" }, "outputs": [], "source": [ "#@title 使用16k音频,提取内容编码\n", "!PYTHONPATH=. python prepare/preprocess_ppg.py -w data_svc/waves-16k/ -p data_svc/whisper" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#@title 使用16k音频,提取内容编码\n", "!PYTHONPATH=. python prepare/preprocess_hubert.py -w data_svc/waves-16k/ -v data_svc/hubert" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ovRqQUINSoII" }, "outputs": [], "source": [ "#@title 提取音色特征\n", "!PYTHONPATH=. python prepare/preprocess_speaker.py data_svc/waves-16k/ data_svc/speaker" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "s8Ba8Fd1bzzX" }, "outputs": [], "source": [ "#(解决“.ipynb_checkpoints”相关的错)\n", "!rm -rf \"find -type d -name .ipynb_checkpoints\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ic9q599_b0Ae" }, "outputs": [], "source": [ "#(解决“.ipynb_checkpoints”相关的错)\n", "!rm -rf .ipynb_checkpoints\n", "!find . -name \".ipynb_checkpoints\" -exec rm -rf {} \\;" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QamG3_B6o3vF" }, "outputs": [], "source": [ "#@title 提取平均音色\n", "!PYTHONPATH=. python prepare/preprocess_speaker_ave.py data_svc/speaker/ data_svc/singer" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3wBmyQHvSs6K" }, "outputs": [], "source": [ "#@title 提取spec\n", "!PYTHONPATH=. python prepare/preprocess_spec.py -w data_svc/waves-32k/ -s data_svc/specs" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tUcljCLbS5O3" }, "outputs": [], "source": [ "#@title 生成索引\n", "!python prepare/preprocess_train.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "30fXnscFS7Wo" }, "outputs": [], "source": [ "#@title 训练文件调试\n", "!PYTHONPATH=. python prepare/preprocess_zzz.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hacR8qDFVOWo" }, "outputs": [], "source": [ "#@title 设定模型备份\n", "#@markdown **是否备份模型到云盘,colab随时爆炸建议备份,默认保存到云盘根目录Sovits5.0文件夹**\n", "Save_to_drive = True #@param {type:\"boolean\"}\n", "if Save_to_drive:\n", " !mkdir -p /content/so-vits-svc-5.0/chkpt/\n", " !rm -rf /content/so-vits-svc-5.0/chkpt/\n", " !mkdir -p /content/drive/MyDrive/Sovits5.0\n", " !ln -s /content/drive/MyDrive/Sovits5.0 /content/so-vits-svc-5.0/chkpt/" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5BIiKIAoU3Kd" }, "outputs": [], "source": [ "#@title 开始训练\n", "%load_ext tensorboard\n", "%tensorboard --logdir /content/so-vits-svc-5.0/logs/\n", "\n", "!PYTHONPATH=. python svc_trainer.py -c configs/base.yaml -n sovits5.0" ] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [] }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }