Merge branch 'w-okada:master' into master

This commit is contained in:
Eidenz 2023-08-09 11:41:11 +02:00 committed by GitHub
commit 11219c11f6
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
67 changed files with 6554 additions and 2612 deletions

View File

@ -117,6 +117,16 @@ body:
id: issue
attributes:
label: Situation
description: Developers spend a lot of time developing new features and resolving issues. If you really want to get it solved, please provide as much reproducible information and logs as possible. Provide logs on the terminal and capture the window.
description: Developers spend a lot of time developing new features and resolving issues. If you really want to get it solved, please provide as much reproducible information and logs as possible. Provide logs on the terminal and capture the application window.
- type: textarea
id: capture
attributes:
label: application window capture
description: the application window.
- type: textarea
id: logs-on-terminal
attributes:
label: logs on terminal
description: logs on terminal.
validations:
required: true

36
.github/workflows/cla.yml vendored Normal file
View File

@ -0,0 +1,36 @@
name: "CLA Assistant"
on:
issue_comment:
types: [created]
pull_request_target:
types: [opened, closed, synchronize]
jobs:
CLAssistant:
runs-on: ubuntu-latest
steps:
- name: "CLA Assistant"
if: (github.event.comment.body == 'recheck' || github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA') || github.event_name == 'pull_request_target'
# Beta Release
uses: cla-assistant/github-action@v2.1.3-beta
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# the below token should have repo scope and must be manually added by you in the repository's secret
PERSONAL_ACCESS_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
with:
path-to-signatures: "signatures/version1/cla.json"
path-to-document: "https://raw.githubusercontent.com/w-okada/voice-changer/master/LICENSE-CLA" # e.g. a CLA or a DCO document
# branch should not be protected
branch: "master"
#allowlist: user1,bot*
#below are the optional inputs - If the optional inputs are not given, then default values will be taken
#remote-organization-name: enter the remote organization name where the signatures should be stored (Default is storing the signatures in the same repository)
#remote-repository-name: enter the remote repository name where the signatures should be stored (Default is storing the signatures in the same repository)
#create-file-commit-message: 'For example: Creating file for storing CLA Signatures'
#signed-commit-message: 'For example: $contributorName has signed the CLA in #$pullRequestNo'
#custom-notsigned-prcomment: 'pull request comment with Introductory message to ask new contributors to sign'
#custom-pr-sign-comment: 'The signature to be committed in order to sign the CLA'
#custom-allsigned-prcomment: 'pull request comment when all contributors has signed, defaults to **CLA Assistant Lite bot** All Contributors have signed the CLA.'
#lock-pullrequest-aftermerge: false - if you don't want this bot to automatically lock the pull request after merging (default - true)
#use-dco-flag: true - If you are using DCO instead of CLA

68
LICENSE
View File

@ -20,7 +20,6 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
MIT License
Copyright (c) 2022 Isle Tennos
@ -64,3 +63,70 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
MIT License
Copyright (c) 2023 liujing04
Copyright (c) 2023 源文雨
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
MIT License
Copyright (c) 2023 yxlllc
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
MIT License
Copyright (c) 2023 yxlllc
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

27
LICENSE-CLA Normal file
View File

@ -0,0 +1,27 @@
Contributor License Agreement
Copyright (c) 2022 Wataru Okada
本契約は、当社とあなた(以下、"貢献者"とします)の間で締結され、貢献者が当社に対してソフトウェアプロジェクト(以下、"プロジェクト"とします)に対する貢献(以下、"貢献"とします)を提供する際の条件を定めます。
1. 貢献者は、提供する貢献が、貢献者自身のオリジナルな作品であり、商標、著作権、特許、または他の知的財産権を侵害していないことを保証します。
2. 貢献者は、貢献を当社に対して無償で提供し、当社はそれを無制限に使用、複製、修正、公開、配布、サブライセンスを付与し、またその販売する権利を得ることに同意します。
3. 本契約が終了した場合でも、第 2 項で述べた権利は当社に留保されます。
4. 当社は貢献者の貢献を受け入れる義務を負わず、また貢献者に一切の補償をする義務を負わないことに貢献者は同意します。
5. 本契約は当社と貢献者双方の書面による合意により修正されることがあります。
"This Agreement is made between our Company and you (hereinafter referred to as "Contributor") and outlines the terms under which you provide your Contributions (hereinafter referred to as "Contributions") to our software project (hereinafter referred to as "Project").
1. You warrant that the Contributions you are providing are your original work and do not infringe any trademark, copyright, patent, or other intellectual property rights.
2. You agree to provide your Contributions to the Company for free, and the Company has the unlimited right to use, copy, modify, publish, distribute, and sublicense, and also sell the Contributions.
3. Even after the termination of this Agreement, the rights mentioned in the above clause will be retained by the Company.
4. The Company is under no obligation to accept your Contributions or to compensate you in any way for them, and you agree to this.
5. This Agreement may be modified by written agreement between the Company and the Contributor."

113
README.md
View File

@ -4,74 +4,19 @@
## What's New!
- v.1.5.3.10b
- improve:
- logger
- bugfix:
- RMVPE:different device bug (not finding root caused yet)
- RVC: when loading sample model, useIndex issue
- v.1.5.3.10a
- Improvement:
- launch sequence
- onnx export process
- error handling in client
- bugfix:
- RMVPE for mac
- v.1.5.3.10
- New Feature
- Support Diffusion SVC(only combo model)
- System audio capture(only for win)
- Support RMVPE
- improvement
- directml: set device id
- some bugfixes:
- noise suppression2
- etc.
- v.1.5.3.9a
- some improvements:
- keep f0 detector setting
- MMVC: max chunksize for onnx
- etc
- some bugfixes:
- RVC: crepe fails to estimate f0
- RVC: fall back from half-precision when half-precision fails.
- etc
- v.1.5.3.9
- New feature:
- Add Crepe Full/Tiny (onnx)
- some improvements:
- server info includes python version
- contentvec onnx support
- etc
- some bugfixes:
- server device mode stuttering
- new model add sample rate
- etc
- v.1.5.3.8a
- Bugfix(test): force client device samplerate
- Bugfix: server device filter
- v.1.5.3.8
- RVC: performance improvement ([PR](https://github.com/w-okada/voice-changer/pull/371) from [nadare881](https://github.com/nadare881))
- v.1.5.3.7
- v.1.5.3.12
- Feature:
- server device monitor
- Bugfix:
- device output recorder button is shown in server device mode.
- Pass through mode
- bugfix:
- Adapted the GUI to the number of slots.
- v.1.5.3.11
- improve:
- increase slot size
- bugfix:
- m1 mac: eliminate torchaudio
# What is VC Client
@ -126,31 +71,13 @@
- Downloads are available below.
| Version | OS | Framework | link | support VC | size |
| ----------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.10b | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1akrb9RicU1-cldisToBedaM08y8pFQae&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eZB0u2u0tEB1tR9mp06YiKx96x2oxgrN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1gzWuEN7oY_WdBwOEwtDdNaK2rT0nHvfN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1_fLdFVswhOGwjRiQj4YWE-YZTO_GsnrA&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1imaTBgWBb9ICkNy9pN6NxBISI6SzhEfL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1GoijW29pjscdvxMhi8xgvPSvcHCGYwXO&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1useZ4gcI0la5OhPuvt2j94CbAhWikpV4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=13abR2xs4KmNIg9b5RJXFez9g6zwZqMj4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1ZxPp-HF7vSEJ8m00WnQaGbo4bTN4LqYD&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.9a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1GsPTUTUbMvwNwAA8SGvSplwsf-yui0iw&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eKZCozh37QDfAr33ZG7lGFUOQv1tOooR&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1sxUNBPkeSPPNOE1ZknVF-0kx2jHP3kN6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| v.1.5.3.9 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1pTTcTseSdIfCyNUjB-K1mYPg9YocSYz6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1KWg-QoF6XmLbkUav-fmxc7bdAcD3844V&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3238MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1_TXUkDcofYz9mJd2L1ajAoyIBCQF29WL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3123MB |
| v.1.5.3.8a | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1hg6lynE3wWJTNTParTa2qB2L06OL9KJ9&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1C9PCu8pdafO6jJ2yCaB7x54Ls7LcM0Xc&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1bzrGhHPc9GdaRAMxkksTGtbuRLEeBx9i&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.8 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1ptmjFCRDW7M0l80072JVRII5tJpF13__&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=19DfeACmpnzqCVH5bIoFunS2pGPABRuso&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AYP_hMdoeacX0KiF31Vd3oEjxwdreSbM&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.7 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1HdJwgo0__vR6pAkOkekejUZJ0lu2NfDs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1JIF4PvKg-8HNUv_fMaXSM3AeYa-F_c4z&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1cJzRHmD3vk6av0Dvwj3v9Ef5KUsQYhKv&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| ---------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.12 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1rC7IVpzfG68Ps6tBmdFIjSXvTNaUKBf6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 797MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1OqxS_jve4qvj71DdSGOrhI8DGaEVRzgs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3241MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1HhfmMovujzbOmvCi7WPuqQAuuo7jaM1o&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3126MB |
| v.1.5.3.11 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1cutPICJa-PI_ww0E3ae9FCuSjY_5PnWE&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1aOkc-QhtAj11gI8i335mHhNMUSESeJ5J&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=16g33cZ925HNty_0Hly7Aw_nXlQlgqxDC&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
(\*1) If you cannot download from Google Drive, try downloading from [hugging_face](https://huggingface.co/wok000/vcclient000/tree/main)
(\*2) The developer does not have an AMD graphics card, so operation has not been verified. This build simply bundles onnxruntime-directml.
@ -255,3 +182,7 @@ Github Pages 上で実行できるため、ブラウザのみあれば様々な
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1tmTMJRRggS2Sb4goU-eHlRvUBR88RZDl&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC, DDSP-SVC | 2872MB |
| v.1.5.3.1 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1oswF72q_cQQeXhIn6W275qLnoBAmcrR_&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 796MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AWjDhW4w2Uljp1-9P8YUJBZsIlnhkJX2&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC, DDSP-SVC | 2872MB |
# For Contributor
This repository has a [CLA](https://raw.githubusercontent.com/w-okada/voice-changer/master/LICENSE-CLA) in place.

View File

@ -52,6 +52,8 @@ $ python3 MMVCServerSIO.py -p 18888 --https true \
```
Access it with a browser (currently only Chrome is supported) to open the GUI.
2-1. Troubleshooting
(1) OSError: PortAudio library not found

View File

@ -51,6 +51,8 @@ $ python3 MMVCServerSIO.py -p 18888 --https true \
--samples samples.json
```
Access it with a browser (only Chrome is supported) and the GUI will appear.
2-1. Troubleshooting
(1) OSError: PortAudio library not found

View File

@ -4,74 +4,19 @@
## What's New!
- v.1.5.3.10b
- improve:
- logger
- bugfix:
- RMVPE:different device bug (not finding root caused yet)
- RVC: when loading sample model, useIndex issue
- v.1.5.3.10a
- Improvement:
- launch sequence
- onnx export process
- error handling in client
- bugfix:
- RMVPE for mac
- v.1.5.3.10
- New Feature
- Support Diffusion SVC(only combo model)
- System audio capture(only for win)
- Support RMVPE
- improvement
- directml: set device id
- some bugfixes:
- noise suppression2
- etc.
- v.1.5.3.9a
- some improvements:
- keep f0 detector setting
- MMVC: max chunksize for onnx
- etc
- some bugfixes:
- RVC: crepe fails to estimate f0
- RVC: fall back from half-precision when half-precision fails.
- etc
- v.1.5.3.9
- New feature:
- Add Crepe Full/Tiny (onnx)
- some improvements:
- server info includes python version
- contentvec onnx support
- etc
- some bugfixes:
- server device mode stuttering
- new model add sample rate
- etc
- v.1.5.3.8a
- Bugfix(test): force client device samplerate
- Bugfix: server device filter
- v.1.5.3.8
- RVC: performance improvement ([PR](https://github.com/w-okada/voice-changer/pull/371) from [nadare881](https://github.com/nadare881))
- v.1.5.3.7
- v.1.5.3.12
- Feature:
- server device monitor
- Bugfix:
- device output recorder button is shown in server device mode.
- Pass through mode
- bugfix:
- Adapted the GUI to the number of slots.
- v.1.5.3.11
- improve:
- increase slot size
- bugfix:
- m1 mac: eliminate torchaudio
# What is VC Client
@ -123,31 +68,13 @@ It can be used in two main ways, in order of difficulty:
- Download (When you cannot download from google drive, try [hugging_face](https://huggingface.co/wok000/vcclient000/tree/main))
| Version | OS | Framework | link | support VC | size |
| ----------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.10b | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1akrb9RicU1-cldisToBedaM08y8pFQae&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eZB0u2u0tEB1tR9mp06YiKx96x2oxgrN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1gzWuEN7oY_WdBwOEwtDdNaK2rT0nHvfN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1_fLdFVswhOGwjRiQj4YWE-YZTO_GsnrA&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1imaTBgWBb9ICkNy9pN6NxBISI6SzhEfL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1GoijW29pjscdvxMhi8xgvPSvcHCGYwXO&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1useZ4gcI0la5OhPuvt2j94CbAhWikpV4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=13abR2xs4KmNIg9b5RJXFez9g6zwZqMj4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1ZxPp-HF7vSEJ8m00WnQaGbo4bTN4LqYD&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.9a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1GsPTUTUbMvwNwAA8SGvSplwsf-yui0iw&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eKZCozh37QDfAr33ZG7lGFUOQv1tOooR&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1sxUNBPkeSPPNOE1ZknVF-0kx2jHP3kN6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| v.1.5.3.9 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1pTTcTseSdIfCyNUjB-K1mYPg9YocSYz6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1KWg-QoF6XmLbkUav-fmxc7bdAcD3844V&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3238MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1_TXUkDcofYz9mJd2L1ajAoyIBCQF29WL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3123MB |
| v.1.5.3.8a | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1hg6lynE3wWJTNTParTa2qB2L06OL9KJ9&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1C9PCu8pdafO6jJ2yCaB7x54Ls7LcM0Xc&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1bzrGhHPc9GdaRAMxkksTGtbuRLEeBx9i&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.8 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1ptmjFCRDW7M0l80072JVRII5tJpF13__&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=19DfeACmpnzqCVH5bIoFunS2pGPABRuso&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AYP_hMdoeacX0KiF31Vd3oEjxwdreSbM&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.7 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1HdJwgo0__vR6pAkOkekejUZJ0lu2NfDs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1JIF4PvKg-8HNUv_fMaXSM3AeYa-F_c4z&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1cJzRHmD3vk6av0Dvwj3v9Ef5KUsQYhKv&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| ---------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.12 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1rC7IVpzfG68Ps6tBmdFIjSXvTNaUKBf6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 797MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1OqxS_jve4qvj71DdSGOrhI8DGaEVRzgs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3241MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1HhfmMovujzbOmvCi7WPuqQAuuo7jaM1o&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3126MB |
| v.1.5.3.11 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1cutPICJa-PI_ww0E3ae9FCuSjY_5PnWE&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1aOkc-QhtAj11gI8i335mHhNMUSESeJ5J&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=16g33cZ925HNty_0Hly7Aw_nXlQlgqxDC&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
(\*1) You can also download from [hugging_face](https://huggingface.co/wok000/vcclient000/tree/main)
(\*2) The developer does not have an AMD graphics card, so it has not been tested. This package only includes onnxruntime-directml.

View File

@ -0,0 +1 @@
onnxdirectML-cuda

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -11,8 +11,8 @@
"build:dev": "npm-run-all clean webpack:dev",
"start": "webpack-dev-server --config webpack.dev.js",
"build:mod": "cd ../lib && npm run build:dev && cd - && cp -r ../lib/dist/* node_modules/@dannadori/voice-changer-client-js/dist/",
"build:mod_dos": "cd ../lib && npm run build:dev && cd ../demo && copy ../lib/dist/index.js node_modules/@dannadori/voice-changer-client-js/dist/",
"build:mod_dos2": "copy ../lib/dist/index.js node_modules/@dannadori/voice-changer-client-js/dist/",
"build:mod_dos": "cd ../lib && npm run build:dev && cd ../demo && npm-run-all build:mod_copy",
"build:mod_copy": "XCOPY ..\\lib\\dist\\* .\\node_modules\\@dannadori\\voice-changer-client-js\\dist\\* /s /e /h /y",
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [
@ -26,17 +26,17 @@
"@babel/preset-env": "^7.22.9",
"@babel/preset-react": "^7.22.5",
"@babel/preset-typescript": "^7.22.5",
"@types/node": "^20.4.5",
"@types/react": "^18.2.17",
"@types/node": "^20.4.6",
"@types/react": "^18.2.18",
"@types/react-dom": "^18.2.7",
"autoprefixer": "^10.4.14",
"babel-loader": "^9.1.3",
"copy-webpack-plugin": "^11.0.0",
"css-loader": "^6.8.1",
"eslint": "^8.45.0",
"eslint-config-prettier": "^8.8.0",
"eslint": "^8.46.0",
"eslint-config-prettier": "^8.9.0",
"eslint-plugin-prettier": "^5.0.0",
"eslint-plugin-react": "^7.33.0",
"eslint-plugin-react": "^7.33.1",
"eslint-webpack-plugin": "^4.0.1",
"html-loader": "^4.2.0",
"html-webpack-plugin": "^5.5.3",
@ -54,11 +54,11 @@
"webpack-dev-server": "^4.15.1"
},
"dependencies": {
"@dannadori/voice-changer-client-js": "^1.0.164",
"@fortawesome/fontawesome-svg-core": "^6.4.0",
"@fortawesome/free-brands-svg-icons": "^6.4.0",
"@fortawesome/free-regular-svg-icons": "^6.4.0",
"@fortawesome/free-solid-svg-icons": "^6.4.0",
"@dannadori/voice-changer-client-js": "^1.0.166",
"@fortawesome/fontawesome-svg-core": "^6.4.2",
"@fortawesome/free-brands-svg-icons": "^6.4.2",
"@fortawesome/free-regular-svg-icons": "^6.4.2",
"@fortawesome/free-solid-svg-icons": "^6.4.2",
"@fortawesome/react-fontawesome": "^0.2.0",
"protobufjs": "^7.2.4",
"react": "^18.2.0",

View File

@ -0,0 +1 @@
onnxdirectML-cuda

View File

@ -63,6 +63,7 @@ type GuiStateAndMethod = {
outputAudioDeviceInfo: MediaDeviceInfo[];
audioInputForGUI: string;
audioOutputForGUI: string;
audioMonitorForGUI: string;
fileInputEchoback: boolean | undefined;
shareScreenEnabled: boolean;
audioOutputForAnalyzer: string;
@ -70,6 +71,7 @@ type GuiStateAndMethod = {
setOutputAudioDeviceInfo: (val: MediaDeviceInfo[]) => void;
setAudioInputForGUI: (val: string) => void;
setAudioOutputForGUI: (val: string) => void;
setAudioMonitorForGUI: (val: string) => void;
setFileInputEchoback: (val: boolean) => void;
setShareScreenEnabled: (val: boolean) => void;
setAudioOutputForAnalyzer: (val: string) => void;
@ -106,6 +108,7 @@ export const GuiStateProvider = ({ children }: Props) => {
const [outputAudioDeviceInfo, setOutputAudioDeviceInfo] = useState<MediaDeviceInfo[]>([]);
const [audioInputForGUI, setAudioInputForGUI] = useState<string>("none");
const [audioOutputForGUI, setAudioOutputForGUI] = useState<string>("none");
const [audioMonitorForGUI, setAudioMonitorForGUI] = useState<string>("none");
const [fileInputEchoback, setFileInputEchoback] = useState<boolean>(false); // so that the initial mute takes effect; false seems fine here, undefined would produce a warning
const [shareScreenEnabled, setShareScreenEnabled] = useState<boolean>(false);
const [audioOutputForAnalyzer, setAudioOutputForAnalyzer] = useState<string>("default");
@ -270,6 +273,7 @@ export const GuiStateProvider = ({ children }: Props) => {
outputAudioDeviceInfo,
audioInputForGUI,
audioOutputForGUI,
audioMonitorForGUI,
fileInputEchoback,
shareScreenEnabled,
audioOutputForAnalyzer,
@ -277,6 +281,7 @@ export const GuiStateProvider = ({ children }: Props) => {
setOutputAudioDeviceInfo,
setAudioInputForGUI,
setAudioOutputForGUI,
setAudioMonitorForGUI,
setFileInputEchoback,
setShareScreenEnabled,
setAudioOutputForAnalyzer,

View File

@ -19,7 +19,7 @@ export const MainScreen = (props: MainScreenProps) => {
const guiState = useGuiState();
const messageBuilderState = useMessageBuilder();
useMemo(() => {
messageBuilderState.setMessage(__filename, "change_icon", { ja: "アイコン変更", en: "chage icon" });
messageBuilderState.setMessage(__filename, "change_icon", { ja: "アイコン変更", en: "change icon" });
messageBuilderState.setMessage(__filename, "rename", { ja: "リネーム", en: "rename" });
messageBuilderState.setMessage(__filename, "download", { ja: "ダウンロード", en: "download" });
messageBuilderState.setMessage(__filename, "terms_of_use", { ja: "利用規約", en: "terms of use" });
@ -99,7 +99,7 @@ export const MainScreen = (props: MainScreenProps) => {
const slotRow = serverSetting.serverSetting.modelSlots.map((x, index) => {
// model icon
const generateIconArea = (slotIndex: number, iconUrl: string, tooltip: boolean) => {
const realIconUrl = iconUrl.length > 0 ? iconUrl : "/assets/icons/noimage.png";
const realIconUrl = iconUrl.length > 0 ? serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + slotIndex + "/" + iconUrl.split(/[\/\\]/).pop() : "/assets/icons/noimage.png";
const iconDivClass = tooltip ? "tooltip" : "";
const iconClass = tooltip ? "model-slot-icon-pointable" : "model-slot-icon";
return (

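The icon-path expression introduced in this hunk reappears in ModelSlotArea and CharacterArea further down in the commit. A hypothetical helper (not part of the diff) makes the intent explicit: icons are now resolved inside the per-slot model directory, using only the basename of the stored `iconFile` value.

```typescript
// Hypothetical helper illustrating the icon-path logic added in this commit (not in the diff itself).
export const buildIconUrl = (modelDir: string, slotIndex: number, iconFile: string, fallback = "/assets/icons/noimage.png"): string => {
    if (iconFile.length === 0) {
        return fallback; // slots without an icon keep the placeholder image
    }
    const basename = iconFile.split(/[\/\\]/).pop(); // keep only the file name, whether the stored path uses / or \
    return `${modelDir}/${slotIndex}/${basename}`;
};

// e.g. buildIconUrl(serverSetting.serverSetting.voiceChangerParams.model_dir, slotIndex, iconUrl)
```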
View File

@ -3,158 +3,182 @@ import { useGuiState } from "./001_GuiStateProvider";
import { useAppState } from "../../001_provider/001_AppStateProvider";
import { MergeElement, RVCModelSlot, RVCModelType, VoiceChangerType } from "@dannadori/voice-changer-client-js";
export const MergeLabDialog = () => {
const guiState = useGuiState()
const guiState = useGuiState();
const { serverSetting } = useAppState()
const [currentFilter, setCurrentFilter] = useState<string>("")
const [mergeElements, setMergeElements] = useState<MergeElement[]>([])
const { serverSetting } = useAppState();
const [currentFilter, setCurrentFilter] = useState<string>("");
const [mergeElements, setMergeElements] = useState<MergeElement[]>([]);
// initialization when the slots change
const newSlotChangeKey = useMemo(() => {
if (!serverSetting.serverSetting.modelSlots) {
return ""
return "";
}
return serverSetting.serverSetting.modelSlots.reduce((prev, cur) => {
return prev + "_" + cur.modelFile
}, "")
}, [serverSetting.serverSetting.modelSlots])
return prev + "_" + cur.modelFile;
}, "");
}, [serverSetting.serverSetting.modelSlots]);
const filterItems = useMemo(() => {
return serverSetting.serverSetting.modelSlots.reduce((prev, cur) => {
return serverSetting.serverSetting.modelSlots.reduce(
(prev, cur) => {
if (cur.voiceChangerType != "RVC") {
return prev
return prev;
}
const curRVC = cur as RVCModelSlot
const key = `${curRVC.modelType},${cur.samplingRate},${curRVC.embChannels}`
const val = { type: curRVC.modelType, samplingRate: cur.samplingRate, embChannels: curRVC.embChannels }
const existKeys = Object.keys(prev)
const curRVC = cur as RVCModelSlot;
const key = `${curRVC.modelType},${cur.samplingRate},${curRVC.embChannels}`;
const val = { type: curRVC.modelType, samplingRate: cur.samplingRate, embChannels: curRVC.embChannels };
const existKeys = Object.keys(prev);
if (!cur.modelFile || cur.modelFile.length == 0) {
return prev
return prev;
}
if (curRVC.modelType == "onnxRVC" || curRVC.modelType == "onnxRVCNono") {
return prev
return prev;
}
if (!existKeys.includes(key)) {
prev[key] = val
prev[key] = val;
}
return prev
}, {} as { [key: string]: { type: RVCModelType, samplingRate: number, embChannels: number } })
}, [newSlotChangeKey])
return prev;
},
{} as { [key: string]: { type: RVCModelType; samplingRate: number; embChannels: number } },
);
}, [newSlotChangeKey]);
const models = useMemo(() => {
return serverSetting.serverSetting.modelSlots.filter(x => {
return serverSetting.serverSetting.modelSlots.filter((x) => {
if (x.voiceChangerType != "RVC") {
return
return;
}
const xRVC = x as RVCModelSlot
const filterVals = filterItems[currentFilter]
const xRVC = x as RVCModelSlot;
const filterVals = filterItems[currentFilter];
if (!filterVals) {
return false
return false;
}
if (xRVC.modelType == filterVals.type && xRVC.samplingRate == filterVals.samplingRate && xRVC.embChannels == filterVals.embChannels) {
return true
return true;
} else {
return false
return false;
}
})
}, [filterItems, currentFilter])
});
}, [filterItems, currentFilter]);
useEffect(() => {
if (Object.keys(filterItems).length > 0) {
setCurrentFilter(Object.keys(filterItems)[0])
setCurrentFilter(Object.keys(filterItems)[0]);
}
}, [filterItems])
}, [filterItems]);
useEffect(() => {
// models is the filtered array
const newMergeElements = models.map((x) => {
return { filename: x.modelFile, strength: 0 }
})
setMergeElements(newMergeElements)
}, [models])
return { slotIndex: x.slotIndex, filename: x.modelFile, strength: 0 };
});
setMergeElements(newMergeElements);
}, [models]);
const dialog = useMemo(() => {
const closeButtonRow = (
<div className="body-row split-3-4-3 left-padding-1">
<div className="body-item-text">
</div>
<div className="body-item-text"></div>
<div className="body-button-container body-button-container-space-around">
<div className="body-button" onClick={() => { guiState.stateControls.showMergeLabCheckbox.updateState(false) }} >close</div>
<div
className="body-button"
onClick={() => {
guiState.stateControls.showMergeLabCheckbox.updateState(false);
}}
>
close
</div>
</div>
<div className="body-item-text"></div>
</div>
)
);
const filterOptions = Object.keys(filterItems).map(x => {
return <option key={x} value={x}>{x}</option>
}).filter(x => x != null)
const onMergeElementsChanged = (filename: string, strength: number) => {
const newMergeElements = mergeElements.map((x) => {
if (x.filename == filename) {
return { filename: x.filename, strength: strength }
} else {
return x
}
const filterOptions = Object.keys(filterItems)
.map((x) => {
return (
<option key={x} value={x}>
{x}
</option>
);
})
setMergeElements(newMergeElements)
.filter((x) => x != null);
const onMergeElementsChanged = (slotIndex: number, strength: number) => {
const newMergeElements = mergeElements.map((x) => {
if (x.slotIndex == slotIndex) {
return { slotIndex: x.slotIndex, strength: strength };
} else {
return x;
}
});
setMergeElements(newMergeElements);
};
const onMergeClicked = () => {
const validMergeElements = mergeElements.filter((x) => {
return x.strength > 0;
});
serverSetting.mergeModel({
voiceChangerType: VoiceChangerType.RVC,
command: "mix",
files: mergeElements
})
}
files: validMergeElements,
});
};
const modelList = mergeElements.map((x, index) => {
const name = models.find(model => { return model.modelFile == x.filename })?.name || ""
const name =
models.find((model) => {
return model.slotIndex == x.slotIndex;
})?.name || "";
return (
<div key={index} className="merge-lab-model-item">
<div>{name}</div>
<div>
{name}
</div>
<div>
<input type="range" className="body-item-input-slider" min="0" max="100" step="1" value={x.strength} onChange={(e) => {
onMergeElementsChanged(x.filename, Number(e.target.value))
}}></input>
<input
type="range"
className="body-item-input-slider"
min="0"
max="100"
step="1"
value={x.strength}
onChange={(e) => {
onMergeElementsChanged(x.slotIndex, Number(e.target.value));
}}
></input>
<span className="body-item-input-slider-val">{x.strength}</span>
</div>
</div>
)
})
);
});
const content = (
<div className="merge-lab-container">
<div className="merge-lab-type-filter">
<div>Type:</div>
<div>
Type:
</div>
<div>
<select value={currentFilter} onChange={(e) => { setCurrentFilter(e.target.value) }}>
<select
value={currentFilter}
onChange={(e) => {
setCurrentFilter(e.target.value);
}}
>
{filterOptions}
</select>
</div>
</div>
<div className="merge-lab-manipulator">
<div className="merge-lab-model-list">
{modelList}
</div>
<div className="merge-lab-model-list">{modelList}</div>
<div className="merge-lab-merge-buttons">
<div className="merge-lab-merge-buttons-notice">
The merged model is stored in the final slot. If you assign this slot, it will be overwritten.
</div>
<div className="merge-lab-merge-buttons-notice">The merged model is stored in the final slot. If you assign this slot, it will be overwritten.</div>
<div className="merge-lab-merge-button" onClick={onMergeClicked}>
merge
</div>
</div>
</div>
</div>
)
);
return (
<div className="dialog-frame">
<div className="dialog-title">MergeLab</div>
@ -166,5 +190,4 @@ export const MergeLabDialog = () => {
);
}, [newSlotChangeKey, currentFilter, mergeElements, models]);
return dialog;
};

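The dialog now keys merge elements by `slotIndex` rather than by filename, and zero-strength entries are dropped before the merge request is sent. A condensed sketch of that request-building step, with types simplified from the diff (the real `MergeElement` and `VoiceChangerType` come from @dannadori/voice-changer-client-js):

```typescript
// Simplified sketch of the merge request built by onMergeClicked above; the slotIndex-keyed
// element shape and the zero-strength filtering are taken from the diff, the rest is illustrative.
type MergeElement = { slotIndex: number; strength: number };

const buildMergeRequest = (mergeElements: MergeElement[]) => {
    // Elements with strength 0 contribute nothing to the mix, so they are filtered out first.
    const validMergeElements = mergeElements.filter((x) => x.strength > 0);
    return { voiceChangerType: "RVC", command: "mix", files: validMergeElements };
};

// Example: only slots 1 and 3 take part in the merge.
console.log(buildMergeRequest([
    { slotIndex: 0, strength: 0 },
    { slotIndex: 1, strength: 60 },
    { slotIndex: 3, strength: 40 },
]));
```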
View File

@ -1,83 +1,115 @@
import React, { useMemo } from "react"
import { useAppState } from "../../../001_provider/001_AppStateProvider"
import { useGuiState } from "../001_GuiStateProvider"
import { useMessageBuilder } from "../../../hooks/useMessageBuilder"
import React, { useMemo, useState } from "react";
import { useAppState } from "../../../001_provider/001_AppStateProvider";
import { useGuiState } from "../001_GuiStateProvider";
import { useMessageBuilder } from "../../../hooks/useMessageBuilder";
import { FontAwesomeIcon } from "@fortawesome/react-fontawesome";
export type ModelSlotAreaProps = {
}
export type ModelSlotAreaProps = {};
const SortTypes = {
slot: "slot",
name: "name",
} as const;
export type SortTypes = (typeof SortTypes)[keyof typeof SortTypes];
export const ModelSlotArea = (_props: ModelSlotAreaProps) => {
const { serverSetting, getInfo } = useAppState()
const guiState = useGuiState()
const messageBuilderState = useMessageBuilder()
const { serverSetting, getInfo } = useAppState();
const guiState = useGuiState();
const messageBuilderState = useMessageBuilder();
const [sortType, setSortType] = useState<SortTypes>("slot");
useMemo(() => {
messageBuilderState.setMessage(__filename, "edit", { "ja": "編集", "en": "edit" })
}, [])
messageBuilderState.setMessage(__filename, "edit", { ja: "編集", en: "edit" });
}, []);
const modelTiles = useMemo(() => {
if (!serverSetting.serverSetting.modelSlots) {
return []
return [];
}
return serverSetting.serverSetting.modelSlots.map((x, index) => {
const modelSlots =
sortType == "slot"
? serverSetting.serverSetting.modelSlots
: serverSetting.serverSetting.modelSlots.slice().sort((a, b) => {
return a.name.localeCompare(b.name);
});
return modelSlots
.map((x, index) => {
if (!x.modelFile || x.modelFile.length == 0) {
return null
return null;
}
const tileContainerClass = index == serverSetting.serverSetting.modelSlotIndex ? "model-slot-tile-container-selected" : "model-slot-tile-container"
const name = x.name.length > 8 ? x.name.substring(0, 7) + "..." : x.name
const iconElem = x.iconFile.length > 0 ?
const tileContainerClass = x.slotIndex == serverSetting.serverSetting.modelSlotIndex ? "model-slot-tile-container-selected" : "model-slot-tile-container";
const name = x.name.length > 8 ? x.name.substring(0, 7) + "..." : x.name;
const iconElem =
x.iconFile.length > 0 ? (
<>
<img className="model-slot-tile-icon" src={x.iconFile} alt={x.name} />
<img className="model-slot-tile-icon" src={serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + x.slotIndex + "/" + x.iconFile.split(/[\/\\]/).pop()} alt={x.name} />
<div className="model-slot-tile-vctype">{x.voiceChangerType}</div>
</>
:
) : (
<>
<div className="model-slot-tile-icon-no-entry">no image</div>
<div className="model-slot-tile-vctype">{x.voiceChangerType}</div>
</>
);
const clickAction = async () => {
const dummyModelSlotIndex = (Math.floor(Date.now() / 1000)) * 1000 + index
await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, modelSlotIndex: dummyModelSlotIndex })
setTimeout(() => { // quick hack
getInfo()
}, 1000 * 2)
}
const dummyModelSlotIndex = Math.floor(Date.now() / 1000) * 1000 + x.slotIndex;
await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, modelSlotIndex: dummyModelSlotIndex });
setTimeout(() => {
// quick hack
getInfo();
}, 1000 * 2);
};
return (
<div key={index} className={tileContainerClass} onClick={clickAction}>
<div className="model-slot-tile-icon-div">
{iconElem}
<div className="model-slot-tile-icon-div">{iconElem}</div>
<div className="model-slot-tile-dscription">{name}</div>
</div>
<div className="model-slot-tile-dscription">
{name}
</div>
</div >
)
}).filter(x => x != null)
}, [serverSetting.serverSetting.modelSlots, serverSetting.serverSetting.modelSlotIndex])
);
})
.filter((x) => x != null);
}, [serverSetting.serverSetting.modelSlots, serverSetting.serverSetting.modelSlotIndex, sortType]);
const modelSlotArea = useMemo(() => {
const onModelSlotEditClicked = () => {
guiState.stateControls.showModelSlotManagerCheckbox.updateState(true)
}
guiState.stateControls.showModelSlotManagerCheckbox.updateState(true);
};
const sortSlotByIdClass = sortType == "slot" ? "model-slot-sort-button-active" : "model-slot-sort-button";
const sortSlotByNameClass = sortType == "name" ? "model-slot-sort-button-active" : "model-slot-sort-button";
return (
<div className="model-slot-area">
<div className="model-slot-panel">
<div className="model-slot-tiles-container">{modelTiles}</div>
<div className="model-slot-buttons">
<div className="model-slot-sort-buttons">
<div
className={sortSlotByIdClass}
onClick={() => {
setSortType("slot");
}}
>
<FontAwesomeIcon icon={["fas", "arrow-down-1-9"]} style={{ fontSize: "1rem" }} />
</div>
<div
className={sortSlotByNameClass}
onClick={() => {
setSortType("name");
}}
>
<FontAwesomeIcon icon={["fas", "arrow-down-a-z"]} style={{ fontSize: "1rem" }} />
</div>
</div>
<div className="model-slot-button" onClick={onModelSlotEditClicked}>
{messageBuilderState.getMessage(__filename, "edit")}
</div>
</div>
</div>
</div>
)
}, [modelTiles])
);
}, [modelTiles, sortType]);
return modelSlotArea
}
return modelSlotArea;
};

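The click handler above sends a `dummyModelSlotIndex` built from the current timestamp and the real slot index, apparently so that selecting a slot always registers as a settings change even if the same slot is chosen again. A minimal sketch of that encoding, assuming the receiving side recovers the slot with modulo-1000 arithmetic (the decode is not part of this diff):

```typescript
// Sketch of the timestamp-based slot encoding used in ModelSlotArea. The low three digits
// carry the slot index; the high digits change every second, so every click yields a fresh value.
const encodeModelSlotIndex = (slotIndex: number): number => {
    return Math.floor(Date.now() / 1000) * 1000 + slotIndex;
};

// Assumption: the consumer recovers the slot index like this (requires slotIndex < 1000).
const decodeModelSlotIndex = (encoded: number): number => {
    return encoded % 1000;
};

console.log(decodeModelSlotIndex(encodeModelSlotIndex(7))); // 7
```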
View File

@ -23,6 +23,26 @@ export const DiffusionSVCSettingArea = (_props: DiffusionSVCSettingAreaProps) =>
return <></>;
}
const skipDiffusionClass = serverSetting.serverSetting.skipDiffusion == 0 ? "character-area-toggle-button" : "character-area-toggle-button-active";
const skipDiffRow = (
<div className="character-area-control">
<div className="character-area-control-title">Boost</div>
<div className="character-area-control-field">
<div className="character-area-buttons">
<div
className={skipDiffusionClass}
onClick={() => {
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, skipDiffusion: serverSetting.serverSetting.skipDiffusion == 0 ? 1 : 0 });
}}
>
skip diff
</div>
</div>
</div>
</div>
);
const skipValues = getDivisors(serverSetting.serverSetting.kStep);
skipValues.pop();
@ -82,6 +102,7 @@ export const DiffusionSVCSettingArea = (_props: DiffusionSVCSettingAreaProps) =>
);
return (
<>
{skipDiffRow}
{kStepRow}
{speedUpRow}
</>

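The speed-up choices in this dialog are derived from `getDivisors(kStep)` with the last entry popped off. `getDivisors` lives elsewhere in the codebase; a plausible sketch of what it computes, inferred only from how its result is used here:

```typescript
// Assumed implementation of getDivisors, inferred from its usage above: the speed-up
// candidates are the divisors of kStep, and the component pops the final entry (kStep itself).
const getDivisors = (n: number): number[] => {
    const divisors: number[] = [];
    for (let i = 1; i <= n; i++) {
        if (n % i === 0) {
            divisors.push(i);
        }
    }
    return divisors;
};

console.log(getDivisors(20)); // [1, 2, 4, 5, 10, 20]; pop() then leaves 1, 2, 4, 5, 10
```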
View File

@ -49,7 +49,7 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
return <></>;
}
const icon = selected.iconFile.length > 0 ? selected.iconFile : "./assets/icons/human.png";
const icon = selected.iconFile.length > 0 ? serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + selected.slotIndex + "/" + selected.iconFile.split(/[\/\\]/).pop() : "./assets/icons/human.png";
const selectedTermOfUseUrlLink = selected.termsOfUseUrl ? (
<a href={selected.termsOfUseUrl} target="_blank" rel="noopener noreferrer" className="portrait-area-terms-of-use-link">
[{messageBuilderState.getMessage(__filename, "terms_of_use")}]
@ -122,9 +122,13 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, serverAudioStated: 0 });
}
};
const onPassThroughClicked = async () => {
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, passThrough: !serverSetting.serverSetting.passThrough });
};
const startClassName = guiState.isConverting ? "character-area-control-button-active" : "character-area-control-button-stanby";
const stopClassName = guiState.isConverting ? "character-area-control-button-stanby" : "character-area-control-button-active";
const passThruClassName = serverSetting.serverSetting.passThrough == false ? "character-area-control-passthru-button-stanby" : "character-area-control-passthru-button-active blinking";
console.log("serverSetting.serverSetting.passThrough", passThruClassName, serverSetting.serverSetting.passThrough);
return (
<div className="character-area-control">
<div className="character-area-control-buttons">
@ -134,6 +138,9 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
<div onClick={onStopClicked} className={stopClassName}>
stop
</div>
<div onClick={onPassThroughClicked} className={passThruClassName}>
passthru
</div>
</div>
</div>
);

View File

@ -41,6 +41,7 @@ export const ConvertArea = (props: ConvertProps) => {
const gpuSelect =
edition.indexOf("onnxdirectML-cuda") >= 0 ? (
<>
<div className="config-sub-area-control">
<div className="config-sub-area-control-title">GPU(dml):</div>
<div className="config-sub-area-control-field">
@ -54,7 +55,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={cpuClassName}
>
cpu
<span className="config-sub-area-button-text-small">cpu</span>
</div>
<div
onClick={async () => {
@ -65,7 +66,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu0ClassName}
>
0
<span className="config-sub-area-button-text-small">gpu0</span>
</div>
<div
onClick={async () => {
@ -76,7 +77,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu1ClassName}
>
1
<span className="config-sub-area-button-text-small">gpu1</span>
</div>
<div
onClick={async () => {
@ -87,7 +88,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu2ClassName}
>
2
<span className="config-sub-area-button-text-small">gpu2</span>
</div>
<div
onClick={async () => {
@ -98,11 +99,17 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu3ClassName}
>
3
<span className="config-sub-area-button-text-small">gpu3</span>
</div>
<div className="config-sub-area-control">
<span className="config-sub-area-button-text-small">
<a href="https://github.com/w-okada/voice-changer/issues/410">more info</a>
</span>
</div>
</div>
</div>
</div>
</>
) : (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title">GPU:</div>

View File

@ -2,14 +2,14 @@ import React, { useEffect, useMemo, useRef, useState } from "react";
import { useAppState } from "../../../001_provider/001_AppStateProvider";
import { fileSelectorAsDataURL, useIndexedDB } from "@dannadori/voice-changer-client-js";
import { useGuiState } from "../001_GuiStateProvider";
import { AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, INDEXEDDB_KEY_AUDIO_OUTPUT } from "../../../const";
import { AUDIO_ELEMENT_FOR_PLAY_MONITOR, AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, INDEXEDDB_KEY_AUDIO_MONITR, INDEXEDDB_KEY_AUDIO_OUTPUT } from "../../../const";
import { isDesktopApp } from "../../../const";
export type DeviceAreaProps = {};
export const DeviceArea = (_props: DeviceAreaProps) => {
const { setting, serverSetting, audioContext, setAudioOutputElementId, initializedRef, setVoiceChangerClientSetting, startOutputRecording, stopOutputRecording } = useAppState();
const { isConverting, audioInputForGUI, inputAudioDeviceInfo, setAudioInputForGUI, fileInputEchoback, setFileInputEchoback, setAudioOutputForGUI, audioOutputForGUI, outputAudioDeviceInfo, shareScreenEnabled, setShareScreenEnabled } = useGuiState();
const { setting, serverSetting, audioContext, setAudioOutputElementId, setAudioMonitorElementId, initializedRef, setVoiceChangerClientSetting, startOutputRecording, stopOutputRecording } = useAppState();
const { isConverting, audioInputForGUI, inputAudioDeviceInfo, setAudioInputForGUI, fileInputEchoback, setFileInputEchoback, setAudioOutputForGUI, setAudioMonitorForGUI, audioOutputForGUI, audioMonitorForGUI, outputAudioDeviceInfo, shareScreenEnabled, setShareScreenEnabled } = useGuiState();
const [inputHostApi, setInputHostApi] = useState<string>("ALL");
const [outputHostApi, setOutputHostApi] = useState<string>("ALL");
const [monitorHostApi, setMonitorHostApi] = useState<string>("ALL");
@ -244,10 +244,10 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
audio_echo.volume = 0;
setFileInputEchoback(false);
// original stream to play.
const audio_org = document.getElementById(AUDIO_ELEMENT_FOR_TEST_ORIGINAL) as HTMLAudioElement;
audio_org.src = url;
audio_org.pause();
// // original stream to play.
// const audio_org = document.getElementById(AUDIO_ELEMENT_FOR_TEST_ORIGINAL) as HTMLAudioElement;
// audio_org.src = url;
// audio_org.pause();
};
const echobackClass = fileInputEchoback ? "config-sub-area-control-field-wav-file-echoback-button-active" : "config-sub-area-control-field-wav-file-echoback-button";
@ -256,7 +256,7 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
<div className="config-sub-area-control-field">
<div className="config-sub-area-control-field-wav-file left-padding-1">
<div className="config-sub-area-control-field-wav-file-audio-container">
<audio id={AUDIO_ELEMENT_FOR_TEST_ORIGINAL} controls hidden></audio>
{/* <audio id={AUDIO_ELEMENT_FOR_TEST_ORIGINAL} controls hidden></audio> */}
<audio className="config-sub-area-control-field-wav-file-audio" id={AUDIO_ELEMENT_FOR_TEST_CONVERTED} controls controlsList="nodownload noplaybackrate"></audio>
<audio id={AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK} controls hidden></audio>
</div>
@ -381,7 +381,8 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
const setAudioOutput = async () => {
const mediaDeviceInfos = await navigator.mediaDevices.enumerateDevices();
[AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
// [AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
[AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
const audio = document.getElementById(x) as HTMLAudioElement;
if (audio) {
if (serverSetting.serverSetting.enableServerAudio == 1) {
@ -598,7 +599,88 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
);
}, [serverSetting.serverSetting, serverSetting.updateServerSettings, serverSetting.serverSetting.enableServerAudio]);
// (6) Monitor
// (6) モニター
useEffect(() => {
const loadCache = async () => {
const key = await getItem(INDEXEDDB_KEY_AUDIO_MONITR);
if (key) {
setAudioMonitorForGUI(key as string);
}
};
loadCache();
}, []);
useEffect(() => {
const setAudioMonitor = async () => {
const mediaDeviceInfos = await navigator.mediaDevices.enumerateDevices();
[AUDIO_ELEMENT_FOR_PLAY_MONITOR].forEach((x) => {
const audio = document.getElementById(x) as HTMLAudioElement;
if (audio) {
if (serverSetting.serverSetting.enableServerAudio == 1) {
// When using Server Audio, do not play sound from the element.
audio.volume = 0;
} else if (audioMonitorForGUI == "none") {
// @ts-ignore
audio.setSinkId("");
audio.volume = 0;
} else {
const audioOutputs = mediaDeviceInfos.filter((x) => {
return x.kind == "audiooutput";
});
const found = audioOutputs.some((x) => {
return x.deviceId == audioMonitorForGUI;
});
if (found) {
// @ts-ignore // The exception apparently cannot be caught, so the device ID must be checked beforehand.
audio.setSinkId(audioMonitorForGUI);
audio.volume = 1;
} else {
console.warn("No audio output device. use default");
}
}
}
});
};
setAudioMonitor();
}, [audioMonitorForGUI, serverSetting.serverSetting.enableServerAudio]);
// (6-1) Client
const clientMonitorRow = useMemo(() => {
if (serverSetting.serverSetting.enableServerAudio == 1) {
return <></>;
}
return (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title left-padding-1">monitor</div>
<div className="config-sub-area-control-field">
<select
className="body-select"
value={audioMonitorForGUI}
onChange={(e) => {
setAudioMonitorForGUI(e.target.value);
setItem(INDEXEDDB_KEY_AUDIO_MONITR, e.target.value);
}}
>
{outputAudioDeviceInfo.map((x) => {
return (
<option key={x.deviceId} value={x.deviceId}>
{x.label}
</option>
);
})}
</select>
</div>
</div>
);
}, [serverSetting.serverSetting.enableServerAudio, outputAudioDeviceInfo, audioMonitorForGUI]);
useEffect(() => {
console.log("initializedRef.current", initializedRef.current);
setAudioMonitorElementId(AUDIO_ELEMENT_FOR_PLAY_MONITOR);
}, [initializedRef.current]);
// (6-2) Server
const serverMonitorRow = useMemo(() => {
if (serverSetting.serverSetting.enableServerAudio == 0) {
return <></>;
@ -675,6 +757,41 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
);
}, [monitorHostApi, serverSetting.serverSetting, serverSetting.updateServerSettings, serverSetting.serverSetting.enableServerAudio]);
const monitorGainControl = useMemo(() => {
const currentMonitorGain = serverSetting.serverSetting.enableServerAudio == 0 ? setting.voiceChangerClientSetting.monitorGain : serverSetting.serverSetting.serverMonitorAudioGain;
const monitorValueUpdatedAction =
serverSetting.serverSetting.enableServerAudio == 0
? async (val: number) => {
await setVoiceChangerClientSetting({ ...setting.voiceChangerClientSetting, monitorGain: val });
}
: async (val: number) => {
await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, serverMonitorAudioGain: val });
};
return (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title left-padding-2">gain</div>
<div className="config-sub-area-control-field">
<div className="config-sub-area-control-field-auido-io">
<span className="character-area-slider-control-slider">
<input
type="range"
min="0.1"
max="10.0"
step="0.1"
value={currentMonitorGain}
onChange={(e) => {
monitorValueUpdatedAction(Number(e.target.value));
}}
></input>
</span>
<span className="character-area-slider-control-val">{currentMonitorGain}</span>
</div>
</div>
</div>
);
}, [serverSetting.serverSetting, setting, setVoiceChangerClientSetting, serverSetting.updateServerSettings]);
return (
<div className="config-sub-area">
{deviceModeRow}
@ -685,10 +802,13 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
{audioInputScreenRow}
{clientAudioOutputRow}
{serverAudioOutputRow}
{clientMonitorRow}
{serverMonitorRow}
{monitorGainControl}
{outputRecorderRow}
<audio hidden id={AUDIO_ELEMENT_FOR_PLAY_RESULT}></audio>
<audio hidden id={AUDIO_ELEMENT_FOR_PLAY_MONITOR}></audio>
</div>
);
};

View File

@ -1,13 +1,15 @@
export const AUDIO_ELEMENT_FOR_PLAY_RESULT = "audio-result"
export const AUDIO_ELEMENT_FOR_TEST_ORIGINAL = "audio-test-original"
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED = "audio-test-converted"
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK = "audio-test-converted-echoback"
export const AUDIO_ELEMENT_FOR_PLAY_RESULT = "audio-result" // player for the converted output
export const AUDIO_ELEMENT_FOR_PLAY_MONITOR = "audio-monitor" // monitor player for the converted output
export const AUDIO_ELEMENT_FOR_TEST_ORIGINAL = "audio-test-original" // ??? possibly unused.
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED = "audio-test-converted" // control for the file input
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK = "audio-test-converted-echoback" // echo back for the file input
export const AUDIO_ELEMENT_FOR_SAMPLING_INPUT = "body-wav-container-wav-input"
export const AUDIO_ELEMENT_FOR_SAMPLING_OUTPUT = "body-wav-container-wav-output"
export const INDEXEDDB_KEY_AUDIO_OUTPUT = "INDEXEDDB_KEY_AUDIO_OUTPUT"
export const INDEXEDDB_KEY_AUDIO_MONITR = "INDEXEDDB_KEY_AUDIO_MONITOR"
export const INDEXEDDB_KEY_DEFAULT_MODEL_TYPE = "INDEXEDDB_KEY_DEFALT_MODEL_TYPE"

View File

@ -757,6 +757,18 @@ body {
max-height: 60vh;
width: 100%;
overflow-y: scroll;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}
.model-slot {
height: 5rem;
@ -1150,12 +1162,30 @@ body {
flex-direction: row;
gap: 2px;
flex-wrap: wrap;
overflow-y: scroll;
max-height: 12rem;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}
/* width: calc(30rem + 40px + 10px); */
}
.model-slot-buttons {
display: flex;
flex-direction: column-reverse;
gap: 5px;
flex-direction: column;
justify-content: space-between;
width: 4rem;
.model-slot-button {
border: solid 2px #999;
color: white;
@ -1164,10 +1194,41 @@ body {
background: #333;
cursor: pointer;
padding: 5px;
text-align: center;
width: 3rem;
}
.model-slot-button:hover {
border: solid 2px #faa;
}
.model-slot-sort-buttons {
height: 50%;
.model-slot-sort-button {
color: white;
font-size: 0.8rem;
border-radius: 4px;
background: #333;
border: solid 2px #444;
cursor: pointer;
padding: 1px;
text-align: center;
width: 3rem;
}
.model-slot-sort-button-active {
color: white;
font-size: 0.8rem;
border-radius: 4px;
background: #595;
border: solid 2px #595;
cursor: pointer;
padding: 1px;
text-align: center;
width: 3rem;
}
.model-slot-sort-button:hover {
border: solid 2px #faa;
background: #343;
}
}
}
}
}
@ -1277,6 +1338,7 @@ body {
.character-area-control {
display: flex;
gap: 3px;
align-items: center;
.character-area-control-buttons {
display: flex;
flex-direction: row;
@ -1301,6 +1363,34 @@ body {
border: solid 1px #000;
}
}
.character-area-control-passthru-button-stanby {
width: 5rem;
border: solid 1px #999;
border-radius: 15px;
padding: 2px;
background: #aba;
cursor: pointer;
font-weight: 700;
font-size: 0.8rem;
text-align: center;
&:hover {
border: solid 1px #000;
}
}
.character-area-control-passthru-button-active {
width: 5rem;
border: solid 1px #955;
border-radius: 15px;
padding: 2px;
background: #fdd;
cursor: pointer;
font-weight: 700;
font-size: 0.8rem;
text-align: center;
&:hover {
border: solid 1px #000;
}
}
}
.character-area-control-title {
@ -1344,6 +1434,35 @@ body {
.character-area-button:hover {
border: solid 2px #faa;
}
.character-area-toggle-button {
border: solid 2px #999;
color: white;
background: #666;
cursor: pointer;
font-size: 0.8rem;
border-radius: 5px;
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
}
.character-area-toggle-button:hover {
border: solid 2px #faa;
}
.character-area-toggle-button-active {
border: solid 2px #999;
color: white;
background: #844;
cursor: pointer;
font-size: 0.8rem;
border-radius: 5px;
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
}
}
}
}
@ -1443,6 +1562,10 @@ audio::-webkit-media-controls-overlay-enclosure{
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
white-space: nowrap;
}
.config-sub-area-button-text-small {
font-size: 0.5rem;
}
}
.config-sub-area-control-field-auido-io {
@ -1635,6 +1758,21 @@ audio::-webkit-media-controls-overlay-enclosure{
flex-direction: row;
.merge-lab-model-list {
width: 70%;
overflow-y: scroll;
max-height: 20rem;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}
.merge-lab-model-item {
display: flex;
flex-direction: row;
@ -1673,3 +1811,18 @@ audio::-webkit-media-controls-overlay-enclosure{
}
}
}
.blinking {
animation: flash 0.7s cubic-bezier(0.91, -0.14, 0, 1.4) infinite;
}
@keyframes flash {
0%,
100% {
opacity: 1;
}
50% {
opacity: 0.5;
}
}

File diff suppressed because it is too large

View File

@ -1,6 +1,6 @@
{
"name": "@dannadori/voice-changer-client-js",
"version": "1.0.164",
"version": "1.0.167",
"description": "",
"main": "dist/index.js",
"directories": {
@ -26,33 +26,33 @@
"author": "wataru.okada@flect.co.jp",
"license": "ISC",
"devDependencies": {
"@types/audioworklet": "^0.0.48",
"@types/node": "^20.4.2",
"@types/react": "18.2.15",
"@types/audioworklet": "^0.0.50",
"@types/node": "^20.4.8",
"@types/react": "18.2.18",
"@types/react-dom": "18.2.7",
"eslint": "^8.45.0",
"eslint-config-prettier": "^8.8.0",
"eslint": "^8.46.0",
"eslint-config-prettier": "^9.0.0",
"eslint-plugin-prettier": "^5.0.0",
"eslint-plugin-react": "^7.32.2",
"eslint-plugin-react": "^7.33.1",
"eslint-webpack-plugin": "^4.0.1",
"npm-run-all": "^4.1.5",
"prettier": "^3.0.0",
"prettier": "^3.0.1",
"raw-loader": "^4.0.2",
"rimraf": "^5.0.1",
"ts-loader": "^9.4.4",
"typescript": "^5.1.6",
"webpack": "^5.88.1",
"webpack": "^5.88.2",
"webpack-cli": "^5.1.4",
"webpack-dev-server": "^4.15.1"
},
"dependencies": {
"@types/readable-stream": "^2.3.15",
"@types/readable-stream": "^4.0.0",
"amazon-chime-sdk-js": "^3.15.0",
"buffer": "^6.0.3",
"localforage": "^1.10.0",
"protobufjs": "^7.2.4",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"socket.io-client": "^4.7.1"
"socket.io-client": "^4.7.2"
}
}

View File

@ -23,9 +23,11 @@ export class VoiceChangerClient {
private currentMediaStreamAudioSourceNode: MediaStreamAudioSourceNode | null = null
private inputGainNode: GainNode | null = null
private outputGainNode: GainNode | null = null
private monitorGainNode: GainNode | null = null
private vcInNode!: VoiceChangerWorkletNode
private vcOutNode!: VoiceChangerWorkletNode
private currentMediaStreamAudioDestinationNode!: MediaStreamAudioDestinationNode
private currentMediaStreamAudioDestinationMonitorNode!: MediaStreamAudioDestinationNode
private promiseForInitialize: Promise<void>
@ -72,6 +74,12 @@ export class VoiceChangerClient {
this.vcOutNode.connect(this.outputGainNode) // vc node -> output node
this.outputGainNode.connect(this.currentMediaStreamAudioDestinationNode)
this.currentMediaStreamAudioDestinationMonitorNode = ctx44k.createMediaStreamDestination() // output node
this.monitorGainNode = ctx44k.createGain()
this.monitorGainNode.gain.value = this.setting.monitorGain
this.vcOutNode.connect(this.monitorGainNode) // vc node -> monitor node
this.monitorGainNode.connect(this.currentMediaStreamAudioDestinationMonitorNode)
if (this.vfEnable) {
this.vf = await VoiceFocusDeviceTransformer.create({ variant: 'c20' })
const dummyMediaStream = createDummyMediaStream(this.ctx)
@ -185,6 +193,9 @@ export class VoiceChangerClient {
get stream(): MediaStream {
return this.currentMediaStreamAudioDestinationNode.stream
}
get monitorStream(): MediaStream {
return this.currentMediaStreamAudioDestinationMonitorNode.stream
}
start = async () => {
await this.vcInNode.start()
@ -239,6 +250,9 @@ export class VoiceChangerClient {
if (this.setting.outputGain != setting.outputGain) {
this.setOutputGain(setting.outputGain)
}
if (this.setting.monitorGain != setting.monitorGain) {
this.setMonitorGain(setting.monitorGain)
}
this.setting = setting
if (reconstructInputRequired) {
@ -251,6 +265,9 @@ export class VoiceChangerClient {
if (!this.inputGainNode) {
return
}
if(!val){
return
}
this.inputGainNode.gain.value = val
}
@ -258,9 +275,22 @@ export class VoiceChangerClient {
if (!this.outputGainNode) {
return
}
if(!val){
return
}
this.outputGainNode.gain.value = val
}
setMonitorGain = (val: number) => {
if (!this.monitorGainNode) {
return
}
if(!val){
return
}
this.monitorGainNode.gain.value = val
}
/////////////////////////////////////////////////////
// Component settings and operations
/////////////////////////////////////////////////////

View File

@ -68,6 +68,7 @@ export const RVCModelType = {
export type RVCModelType = typeof RVCModelType[keyof typeof RVCModelType]
export const ServerSettingKey = {
"passThrough":"passThrough",
"srcId": "srcId",
"dstId": "dstId",
"gpu": "gpu",
@ -97,6 +98,7 @@ export const ServerSettingKey = {
"serverReadChunkSize": "serverReadChunkSize",
"serverInputAudioGain": "serverInputAudioGain",
"serverOutputAudioGain": "serverOutputAudioGain",
"serverMonitorAudioGain": "serverMonitorAudioGain",
"tran": "tran",
"noiseScale": "noiseScale",
@ -123,6 +125,7 @@ export const ServerSettingKey = {
"threshold": "threshold",
"speedUp": "speedUp",
"skipDiffusion": "skipDiffusion",
"inputSampleRate": "inputSampleRate",
"enableDirectML": "enableDirectML",
@ -131,6 +134,7 @@ export type ServerSettingKey = typeof ServerSettingKey[keyof typeof ServerSettin
export type VoiceChangerServerSetting = {
passThrough: boolean
srcId: number,
dstId: number,
gpu: number,
@ -157,6 +161,7 @@ export type VoiceChangerServerSetting = {
serverReadChunkSize: number
serverInputAudioGain: number
serverOutputAudioGain: number
serverMonitorAudioGain: number
tran: number // so-vits-svc
@ -184,13 +189,14 @@ export type VoiceChangerServerSetting = {
threshold: number// DDSP-SVC
speedUp: number // Diffusion-SVC
skipDiffusion: number // Diffusion-SVC 0:off, 1:on
inputSampleRate: InputSampleRate
enableDirectML: number
}
type ModelSlot = {
slotIndex: number
voiceChangerType: VoiceChangerType
name: string,
description: string,
@ -303,7 +309,9 @@ export type ServerInfo = VoiceChangerServerSetting & {
memory: number,
}[]
maxInputLength: number // MMVCv15
voiceChangerParams: {
model_dir: string
}
}
export type SampleModel = {
@ -339,6 +347,7 @@ export type DiffusionSVCSampleModel =SampleModel & {
export const DefaultServerSetting: ServerInfo = {
// VC Common
passThrough: false,
inputSampleRate: 48000,
crossFadeOffsetRate: 0.0,
@ -361,6 +370,7 @@ export const DefaultServerSetting: ServerInfo = {
serverReadChunkSize: 256,
serverInputAudioGain: 1.0,
serverOutputAudioGain: 1.0,
serverMonitorAudioGain: 1.0,
// VC Specific
srcId: 0,
@ -397,6 +407,7 @@ export const DefaultServerSetting: ServerInfo = {
threshold: -45,
speedUp: 10,
skipDiffusion: 1,
enableDirectML: 0,
//
@ -405,7 +416,10 @@ export const DefaultServerSetting: ServerInfo = {
serverAudioInputDevices: [],
serverAudioOutputDevices: [],
maxInputLength: 128 * 2048
maxInputLength: 128 * 2048,
voiceChangerParams: {
model_dir: ""
}
}
///////////////////////
@ -466,6 +480,7 @@ export type VoiceChangerClientSetting = {
inputGain: number
outputGain: number
monitorGain: number
}
///////////////////////
@ -496,7 +511,8 @@ export const DefaultClientSettng: ClientSetting = {
noiseSuppression: false,
noiseSuppression2: false,
inputGain: 1.0,
outputGain: 1.0
outputGain: 1.0,
monitorGain: 1.0
}
}
@ -533,7 +549,7 @@ export type OnnxExporterInfo = {
// Merge
export type MergeElement = {
filename: string
slotIndex: number
strength: number
}
export type MergeModelRequest = {

View File

@ -47,6 +47,7 @@ export type ClientState = {
clearSetting: () => Promise<void>
// AudioOutputElement settings
setAudioOutputElementId: (elemId: string) => void
setAudioMonitorElementId: (elemId: string) => void
ioErrorCount: number
resetIoErrorCount: () => void
@ -215,6 +216,18 @@ export const useClient = (props: UseClientProps): ClientState => {
}
}
const setAudioMonitorElementId = (elemId: string) => {
if (!voiceChangerClientRef.current) {
console.warn("[voiceChangerClient] is not ready for set audio output.")
return
}
const audio = document.getElementById(elemId) as HTMLAudioElement
if (audio.paused) {
audio.srcObject = voiceChangerClientRef.current.monitorStream
audio.play()
}
}
// (2-2) Reload information
const getInfo = useMemo(() => {
return async () => {
@ -286,6 +299,7 @@ export const useClient = (props: UseClientProps): ClientState => {
// AudioOutputElement settings
setAudioOutputElementId,
setAudioMonitorElementId,
ioErrorCount,
resetIoErrorCount

View File

@ -18,6 +18,10 @@ npm run build:docker:vcclient
bash start_docker.sh
```
Access it with a browser (only Chrome is supported) and the GUI will be displayed.
## RUN with options
If you do not use a GPU
```

View File

@ -36,6 +36,10 @@ In root folder of repos.
bash start_docker.sh
```
Access with Browser (currently only chrome is supported), then you can see gui.
## RUN with options
Without GPU
```

View File

@ -9,6 +9,7 @@ import argparse
from Exceptions import WeightDownladException
from downloader.SampleDownloader import downloadInitialSamples
from downloader.WeightDownloader import downloadWeight
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
@ -40,19 +41,19 @@ def setupArgParser():
parser.add_argument("--httpsCert", type=str, default="ssl.cert", help="path for the cert of https")
parser.add_argument("--httpsSelfSigned", type=strtobool, default=True, help="generate self-signed certificate")
parser.add_argument("--model_dir", type=str, help="path to model files")
parser.add_argument("--model_dir", type=str, default="model_dir", help="path to model files")
parser.add_argument("--sample_mode", type=str, default="production", help="rvc_sample_mode")
parser.add_argument("--content_vec_500", type=str, help="path to content_vec_500 model(pytorch)")
parser.add_argument("--content_vec_500_onnx", type=str, help="path to content_vec_500 model(onnx)")
parser.add_argument("--content_vec_500_onnx_on", type=strtobool, default=False, help="use or not onnx for content_vec_500")
parser.add_argument("--hubert_base", type=str, help="path to hubert_base model(pytorch)")
parser.add_argument("--hubert_base_jp", type=str, help="path to hubert_base_jp model(pytorch)")
parser.add_argument("--hubert_soft", type=str, help="path to hubert_soft model(pytorch)")
parser.add_argument("--nsf_hifigan", type=str, help="path to nsf_hifigan model(pytorch)")
parser.add_argument("--crepe_onnx_full", type=str, help="path to crepe_onnx_full")
parser.add_argument("--crepe_onnx_tiny", type=str, help="path to crepe_onnx_tiny")
parser.add_argument("--rmvpe", type=str, help="path to rmvpe")
parser.add_argument("--content_vec_500", type=str, default="pretrain/checkpoint_best_legacy_500.pt", help="path to content_vec_500 model(pytorch)")
parser.add_argument("--content_vec_500_onnx", type=str, default="pretrain/content_vec_500.onnx", help="path to content_vec_500 model(onnx)")
parser.add_argument("--content_vec_500_onnx_on", type=strtobool, default=True, help="use or not onnx for content_vec_500")
parser.add_argument("--hubert_base", type=str, default="pretrain/hubert_base.pt", help="path to hubert_base model(pytorch)")
parser.add_argument("--hubert_base_jp", type=str, default="pretrain/rinna_hubert_base_jp.pt", help="path to hubert_base_jp model(pytorch)")
parser.add_argument("--hubert_soft", type=str, default="pretrain/hubert/hubert-soft-0d54a1f4.pt", help="path to hubert_soft model(pytorch)")
parser.add_argument("--nsf_hifigan", type=str, default="pretrain/nsf_hifigan/model", help="path to nsf_hifigan model(pytorch)")
parser.add_argument("--crepe_onnx_full", type=str, default="pretrain/crepe_onnx_full.onnx", help="path to crepe_onnx_full")
parser.add_argument("--crepe_onnx_tiny", type=str, default="pretrain/crepe_onnx_tiny.onnx", help="path to crepe_onnx_tiny")
parser.add_argument("--rmvpe", type=str, default="pretrain/rmvpe.pt", help="path to rmvpe")
return parser
@ -96,6 +97,8 @@ voiceChangerParams = VoiceChangerParams(
rmvpe=args.rmvpe,
sample_mode=args.sample_mode,
)
vcparams = VoiceChangerParamsManager.get_instance()
vcparams.setParams(voiceChangerParams)
printMessage(f"Booting PHASE :{__name__}", level=2)
@ -124,7 +127,8 @@ if __name__ == "MMVCServerSIO":
if __name__ == "__mp_main__":
printMessage("サーバプロセスを起動しています。", level=2)
# printMessage("サーバプロセスを起動しています。", level=2)
printMessage("The server process is starting up.", level=2)
if __name__ == "__main__":
mp.freeze_support()
@ -132,12 +136,13 @@ if __name__ == "__main__":
logger.debug(args)
printMessage(f"PYTHON:{sys.version}", level=2)
printMessage("Voice Changerを起動しています。", level=2)
# printMessage("Voice Changerを起動しています。", level=2)
printMessage("Activating the Voice Changer.", level=2)
# Download (weights)
try:
downloadWeight(voiceChangerParams)
except WeightDownladException:
printMessage("RVC用のモデルファイルのダウンロードに失敗しました。", level=2)
# printMessage("RVC用のモデルファイルのダウンロードに失敗しました。", level=2)
printMessage("failed to download weight for rvc", level=2)
# Download (samples)
@ -192,29 +197,31 @@ if __name__ == "__main__":
printMessage("-- ---- -- ", level=1)
# Display the addresses
printMessage("ブラウザで次のURLを開いてください.", level=2)
printMessage("Please open the following URL in your browser.", level=2)
# printMessage("ブラウザで次のURLを開いてください.", level=2)
if args.https == 1:
printMessage("https://<IP>:<PORT>/", level=1)
else:
printMessage("http://<IP>:<PORT>/", level=1)
printMessage("多くの場合は次のいずれかのURLにアクセスすると起動します。", level=2)
# printMessage("多くの場合は次のいずれかのURLにアクセスすると起動します。", level=2)
printMessage("In many cases, it will launch when you access any of the following URLs.", level=2)
if "EX_PORT" in locals() and "EX_IP" in locals(): # launched via shell script (docker)
if args.https == 1:
printMessage(f"https://localhost:{EX_PORT}/", level=1)
printMessage(f"https://127.0.0.1:{EX_PORT}/", level=1)
for ip in EX_IP.strip().split(" "):
printMessage(f"https://{ip}:{EX_PORT}/", level=1)
else:
printMessage(f"http://localhost:{EX_PORT}/", level=1)
printMessage(f"http://127.0.0.1:{EX_PORT}/", level=1)
else: # launched directly with python
if args.https == 1:
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect((args.test_connect, 80))
hostname = s.getsockname()[0]
printMessage(f"https://localhost:{PORT}/", level=1)
printMessage(f"https://127.0.0.1:{PORT}/", level=1)
printMessage(f"https://{hostname}:{PORT}/", level=1)
else:
printMessage(f"http://localhost:{PORT}/", level=1)
printMessage(f"http://127.0.0.1:{PORT}/", level=1)
# Start the server
if args.https:
@ -237,15 +244,15 @@ if __name__ == "__main__":
p.start()
try:
if sys.platform.startswith("win"):
process = subprocess.Popen([NATIVE_CLIENT_FILE_WIN, "--disable-gpu", "-u", f"http://localhost:{PORT}/"])
process = subprocess.Popen([NATIVE_CLIENT_FILE_WIN, "--disable-gpu", "-u", f"http://127.0.0.1:{PORT}/"])
return_code = process.wait()
logger.info("client closed.")
p.terminate()
elif sys.platform.startswith("darwin"):
process = subprocess.Popen([NATIVE_CLIENT_FILE_MAC, "--disable-gpu", "-u", f"http://localhost:{PORT}/"])
process = subprocess.Popen([NATIVE_CLIENT_FILE_MAC, "--disable-gpu", "-u", f"http://127.0.0.1:{PORT}/"])
return_code = process.wait()
logger.info("client closed.")
p.terminate()
except Exception as e:
logger.error(f"[Voice Changer] Launch Exception, {e}")
logger.error(f"[Voice Changer] Client Launch Exception, {e}")
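
Aside on the bootstrap changes above: the pretrained-weight arguments now default to paths under `pretrain/`, and the parsed `VoiceChangerParams` are registered once in `VoiceChangerParamsManager` so later code can resolve files relative to `model_dir`. A minimal sketch of that pattern, with illustrative field names rather than the project's full definitions:

```
# Minimal sketch: argparse defaults for pretrained weights plus a
# process-wide params singleton. Field names are illustrative.
import argparse
from dataclasses import dataclass


@dataclass
class VoiceChangerParams:
    model_dir: str
    hubert_base: str


class VoiceChangerParamsManager:
    _instance = None

    @classmethod
    def get_instance(cls):
        # Lazily create the single shared instance.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def setParams(self, params: VoiceChangerParams):
        self.params = params


def setupArgParser():
    parser = argparse.ArgumentParser()
    # Defaults point into pretrain/ so the server can boot without flags.
    parser.add_argument("--model_dir", type=str, default="model_dir")
    parser.add_argument("--hubert_base", type=str, default="pretrain/hubert_base.pt")
    return parser


if __name__ == "__main__":
    args = setupArgParser().parse_args()
    vcparams = VoiceChangerParamsManager.get_instance()
    vcparams.setParams(VoiceChangerParams(model_dir=args.model_dir, hubert_base=args.hubert_base))
    print(vcparams.params)
```

The singleton avoids threading the params object through every constructor; any module can call `VoiceChangerParamsManager.get_instance().params`.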

View File

@ -169,4 +169,4 @@ def getSampleJsonAndModelIds(mode: RVCSampleMode):
RVC_MODEL_DIRNAME = "rvc"
MAX_SLOT_NUM = 10
MAX_SLOT_NUM = 200

View File

@ -9,6 +9,7 @@ import json
@dataclass
class ModelSlot:
slotIndex: int = -1
voiceChangerType: VoiceChangerType | None = None
name: str = ""
description: str = ""
@ -132,19 +133,26 @@ def loadSlotInfo(model_dir: str, slotIndex: int) -> ModelSlots:
if not os.path.exists(jsonFile):
return ModelSlot()
jsonDict = json.load(open(os.path.join(slotDir, "params.json")))
slotInfo = ModelSlot(**{k: v for k, v in jsonDict.items() if k in ModelSlot.__annotations__})
slotInfoKey = list(ModelSlot.__annotations__.keys())
slotInfo = ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
if slotInfo.voiceChangerType == "RVC":
return RVCModelSlot(**jsonDict)
slotInfoKey.extend(list(RVCModelSlot.__annotations__.keys()))
return RVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "MMVCv13":
return MMVCv13ModelSlot(**jsonDict)
slotInfoKey.extend(list(MMVCv13ModelSlot.__annotations__.keys()))
return MMVCv13ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "MMVCv15":
return MMVCv15ModelSlot(**jsonDict)
slotInfoKey.extend(list(MMVCv15ModelSlot.__annotations__.keys()))
return MMVCv15ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "so-vits-svc-40":
return SoVitsSvc40ModelSlot(**jsonDict)
slotInfoKey.extend(list(SoVitsSvc40ModelSlot.__annotations__.keys()))
return SoVitsSvc40ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "DDSP-SVC":
return DDSPSVCModelSlot(**jsonDict)
slotInfoKey.extend(list(DDSPSVCModelSlot.__annotations__.keys()))
return DDSPSVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "Diffusion-SVC":
return DiffusionSVCModelSlot(**jsonDict)
slotInfoKey.extend(list(DiffusionSVCModelSlot.__annotations__.keys()))
return DiffusionSVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
else:
return ModelSlot()
@ -153,10 +161,13 @@ def loadAllSlotInfo(model_dir: str):
slotInfos: list[ModelSlots] = []
for slotIndex in range(MAX_SLOT_NUM):
slotInfo = loadSlotInfo(model_dir, slotIndex)
slotInfo.slotIndex = slotIndex # the slot index is injected dynamically
slotInfos.append(slotInfo)
return slotInfos
def saveSlotInfo(model_dir: str, slotIndex: int, slotInfo: ModelSlots):
slotDir = os.path.join(model_dir, str(slotIndex))
json.dump(asdict(slotInfo), open(os.path.join(slotDir, "params.json"), "w"))
slotInfoDict = asdict(slotInfo)
slotInfo.slotIndex = -1 # the slot index is injected dynamically
json.dump(slotInfoDict, open(os.path.join(slotDir, "params.json"), "w"), indent=4)
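
The loader change above filters `params.json` against the dataclass annotations before calling the constructor, so keys written by older versions (or by other model types) no longer raise `TypeError`, and `slotIndex` is injected from the directory name rather than trusted from the file. A reduced sketch of that idea with made-up field names:

```
# Reduced sketch: pass only declared dataclass fields to the constructor so
# stale keys in an old params.json are ignored instead of raising TypeError.
import json
from dataclasses import asdict, dataclass


@dataclass
class BaseSlot:
    slotIndex: int = -1
    name: str = ""


@dataclass
class RVCSlot(BaseSlot):
    samplingRate: int = 48000


def load_slot(json_text: str) -> RVCSlot:
    jsonDict = json.loads(json_text)
    # __annotations__ holds only the fields declared on each class, so the
    # base-class and subclass keys are collected separately, as in the diff above.
    allowed = list(BaseSlot.__annotations__.keys()) + list(RVCSlot.__annotations__.keys())
    return RVCSlot(**{k: v for k, v in jsonDict.items() if k in allowed})


slot = load_slot('{"name": "demo", "samplingRate": 40000, "legacyFlag": true}')
slot.slotIndex = 3  # injected from the slot directory, not read from the file
print(asdict(slot))
```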

View File

@ -1,5 +1,6 @@
import json
import os
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Tuple
@ -7,7 +8,6 @@ from const import RVCSampleMode, getSampleJsonAndModelIds
from data.ModelSample import ModelSamples, generateModelSample
from data.ModelSlot import DiffusionSVCModelSlot, ModelSlot, RVCModelSlot
from mods.log_control import VoiceChangaerLogger
from voice_changer.DiffusionSVC.DiffusionSVCModelSlotGenerator import DiffusionSVCModelSlotGenerator
from voice_changer.ModelSlotManager import ModelSlotManager
from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
from downloader.Downloader import download, download_no_tqdm
@ -109,7 +109,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.modelFile = modelFilePath
slotInfo.modelFile = os.path.basename(sample.modelUrl)
line_num += 1
if targetSampleParams["useIndex"] is True and hasattr(sample, "indexUrl") and sample.indexUrl != "":
@ -124,7 +124,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.indexFile = indexPath
slotInfo.indexFile = os.path.basename(sample.indexUrl)
line_num += 1
if hasattr(sample, "icon") and sample.icon != "":
@ -139,7 +139,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.iconFile = iconPath
slotInfo.iconFile = os.path.basename(sample.icon)
line_num += 1
slotInfo.sampleId = sample.id
@ -153,6 +153,8 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
slotInfo.isONNX = slotInfo.modelFile.endswith(".onnx")
modelSlotManager.save_model_slot(targetSlotIndex, slotInfo)
elif sample.voiceChangerType == "Diffusion-SVC":
if sys.platform.startswith("darwin") is True:
continue
slotInfo: DiffusionSVCModelSlot = DiffusionSVCModelSlot()
os.makedirs(slotDir, exist_ok=True)
@ -167,7 +169,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.modelFile = modelFilePath
slotInfo.modelFile = os.path.basename(sample.modelUrl)
line_num += 1
if hasattr(sample, "icon") and sample.icon != "":
@ -182,7 +184,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.iconFile = iconPath
slotInfo.iconFile = os.path.basename(sample.icon)
line_num += 1
slotInfo.sampleId = sample.id
@ -212,14 +214,17 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
logger.info("[Voice Changer] Generating metadata...")
for targetSlotIndex in slotIndex:
slotInfo = modelSlotManager.get_slot_info(targetSlotIndex)
modelPath = os.path.join(model_dir, str(slotInfo.slotIndex), os.path.basename(slotInfo.modelFile))
if slotInfo.voiceChangerType == "RVC":
if slotInfo.isONNX:
slotInfo = RVCModelSlotGenerator._setInfoByONNX(slotInfo)
slotInfo = RVCModelSlotGenerator._setInfoByONNX(modelPath, slotInfo)
else:
slotInfo = RVCModelSlotGenerator._setInfoByPytorch(slotInfo)
slotInfo = RVCModelSlotGenerator._setInfoByPytorch(modelPath, slotInfo)
modelSlotManager.save_model_slot(targetSlotIndex, slotInfo)
elif slotInfo.voiceChangerType == "Diffusion-SVC":
if sys.platform.startswith("darwin") is False:
from voice_changer.DiffusionSVC.DiffusionSVCModelSlotGenerator import DiffusionSVCModelSlotGenerator
if slotInfo.isONNX:
pass
else:

3
server/fillSlot.sh Normal file
View File

@ -0,0 +1,3 @@
for i in {1..199}; do
cp -r model_dir/0 model_dir/$i
done

View File

@ -113,6 +113,8 @@ class MMVC_Rest_Fileuploader:
return JSONResponse(content=json_compatible_item_data)
except Exception as e:
print("[Voice Changer] post_merge_models ex:", e)
import traceback
traceback.print_exc()
def post_update_model_default(self):
try:

View File

@ -6,6 +6,7 @@ import torch
from data.ModelSlot import DDSPSVCModelSlot
from voice_changer.DDSP_SVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
if sys.platform.startswith("darwin"):
baseDir = [x for x in sys.path if x.endswith("Contents/MacOS")]
@ -69,12 +70,15 @@ class DDSP_SVC:
def initialize(self):
self.device = self.deviceManager.getDevice(self.settings.gpu)
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), "model", self.slotInfo.modelFile)
diffPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), "diff", self.slotInfo.diffModelFile)
self.svc_model = SvcDDSP()
self.svc_model.setVCParams(self.params)
self.svc_model.update_model(self.slotInfo.modelFile, self.device)
self.svc_model.update_model(modelPath, self.device)
self.diff_model = DiffGtMel(device=self.device)
self.diff_model.flush_model(self.slotInfo.diffModelFile, ddsp_config=self.svc_model.args)
self.diff_model.flush_model(diffPath, ddsp_config=self.svc_model.args)
def update_settings(self, key: str, val: int | float | str):
if key in self.settings.intData:
@ -174,5 +178,9 @@ class DDSP_SVC:
if file_path.find("DDSP-SVC" + os.path.sep) >= 0:
# print("remove", key, file_path)
sys.modules.pop(key)
except: # type:ignore
except: # type:ignore # noqa
pass
def get_model_current(self):
return [
]
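
The downloader and DDSP-SVC changes above switch slot metadata to plain basenames; the absolute path is rebuilt from `model_dir`, the slot index and (where applicable) a subfolder only when a file is actually opened. A small sketch of that convention, with an illustrative URL and layout:

```
# Sketch of the path convention: store only the basename in params.json and
# rebuild the absolute path from model_dir + slot index at load time.
import os

model_dir = "model_dir"
slot_index = 0
model_url = "https://example.com/samples/amitaro.pth"  # hypothetical sample URL

model_file = os.path.basename(model_url)  # persisted in slot metadata -> "amitaro.pth"
model_path = os.path.join(model_dir, str(slot_index), model_file)  # resolved at load time
print(model_file, model_path)
```
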

View File

@ -14,7 +14,7 @@ from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
# from voice_changer.RVC.onnxExporter.export2onnx import export2onnx
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException
from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException, PipelineNotInitializedException
logger = VoiceChangaerLogger.get_instance().getLogger()
@ -28,7 +28,6 @@ class DiffusionSVC(VoiceChangerModel):
InferencerManager.initialize(params)
self.settings = DiffusionSVCSettings()
self.params = params
self.pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)
self.pipeline: Pipeline | None = None
@ -84,6 +83,8 @@ class DiffusionSVC(VoiceChangerModel):
if self.pipeline is not None:
pipelineInfo = self.pipeline.getPipelineInfo()
data["pipelineInfo"] = pipelineInfo
else:
data["pipelineInfo"] = "None"
return data
def get_processing_sampling_rate(self):
@ -137,6 +138,9 @@ class DiffusionSVC(VoiceChangerModel):
return (self.audio_buffer, self.pitchf_buffer, self.feature_buffer, convertSize, vol)
def inference(self, receivedData: AudioInOut, crossfade_frame: int, sola_search_frame: int):
if self.pipeline is None:
logger.info("[Voice Changer] Pipeline is not initialized.")
raise PipelineNotInitializedException()
data = self.generate_input(receivedData, crossfade_frame, sola_search_frame)
audio: AudioInOut = data[0]
pitchf: PitchfInOut = data[1]
@ -176,7 +180,8 @@ class DiffusionSVC(VoiceChangerModel):
silenceFrontSec,
embOutputLayer,
useFinalProj,
protect
protect,
skip_diffusion=self.settings.skipDiffusion,
)
result = audio_out.detach().cpu().numpy()
return result
@ -211,6 +216,10 @@ class DiffusionSVC(VoiceChangerModel):
"key": "defaultTune",
"val": self.settings.tran,
},
{
"key": "dstId",
"val": self.settings.dstId,
},
{
"key": "defaultKstep",
"val": self.settings.kStep,

View File

@ -1,14 +1,14 @@
import os
from const import EnumInferenceTypes
from dataclasses import asdict
import onnxruntime
import json
from data.ModelSlot import DiffusionSVCModelSlot, ModelSlot, RVCModelSlot
from voice_changer.DiffusionSVC.inferencer.diffusion_svc_model.diffusion.unit2mel import load_model_vocoder_from_combo
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator
def get_divisors(n):
divisors = []
for i in range(1, int(n**0.5)+1):
@ -31,6 +31,7 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):
slotInfo.name = os.path.splitext(os.path.basename(slotInfo.modelFile))[0]
# slotInfo.iconFile = "/assets/icons/noimage.png"
slotInfo.embChannels = 768
slotInfo.slotIndex = props.slot
if slotInfo.isONNX:
slotInfo = cls._setInfoByONNX(slotInfo)
@ -40,7 +41,10 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):
@classmethod
def _setInfoByPytorch(cls, slot: DiffusionSVCModelSlot):
diff_model, diff_args, naive_model, naive_args = load_model_vocoder_from_combo(slot.modelFile, device="cpu")
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(slot.slotIndex), os.path.basename(slot.modelFile))
diff_model, diff_args, naive_model, naive_args = load_model_vocoder_from_combo(modelPath, device="cpu")
slot.kStepMax = diff_args.model.k_step_max
slot.nLayers = diff_args.model.n_layers
slot.nnLayers = naive_args.model.n_layers
@ -52,53 +56,4 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):
@classmethod
def _setInfoByONNX(cls, slot: ModelSlot):
tmp_onnx_session = onnxruntime.InferenceSession(slot.modelFile, providers=["CPUExecutionProvider"])
modelmeta = tmp_onnx_session.get_modelmeta()
try:
slot = RVCModelSlot(**asdict(slot))
metadata = json.loads(modelmeta.custom_metadata_map["metadata"])
# slot.modelType = metadata["modelType"]
slot.embChannels = metadata["embChannels"]
slot.embOutputLayer = metadata["embOutputLayer"] if "embOutputLayer" in metadata else 9
slot.useFinalProj = metadata["useFinalProj"] if "useFinalProj" in metadata else True if slot.embChannels == 256 else False
if slot.embChannels == 256:
slot.useFinalProj = True
else:
slot.useFinalProj = False
# Display the ONNX model info
if slot.embChannels == 256 and slot.embOutputLayer == 9 and slot.useFinalProj is True:
print("[Voice Changer] ONNX Model: Official v1 like")
elif slot.embChannels == 768 and slot.embOutputLayer == 12 and slot.useFinalProj is False:
print("[Voice Changer] ONNX Model: Official v2 like")
else:
print(f"[Voice Changer] ONNX Model: ch:{slot.embChannels}, L:{slot.embOutputLayer}, FP:{slot.useFinalProj}")
if "embedder" not in metadata:
slot.embedder = "hubert_base"
else:
slot.embedder = metadata["embedder"]
slot.f0 = metadata["f0"]
slot.modelType = EnumInferenceTypes.onnxRVC.value if slot.f0 else EnumInferenceTypes.onnxRVCNono.value
slot.samplingRate = metadata["samplingRate"]
slot.deprecated = False
except Exception as e:
slot.modelType = EnumInferenceTypes.onnxRVC.value
slot.embChannels = 256
slot.embedder = "hubert_base"
slot.f0 = True
slot.samplingRate = 48000
slot.deprecated = True
print("[Voice Changer] setInfoByONNX", e)
print("[Voice Changer] ############## !!!! CAUTION !!!! ####################")
print("[Voice Changer] This onnxfie is depricated. Please regenerate onnxfile.")
print("[Voice Changer] ############## !!!! CAUTION !!!! ####################")
del tmp_onnx_session
return slot

View File

@ -13,6 +13,7 @@ class DiffusionSVCSettings:
kStep: int = 20
speedUp: int = 10
skipDiffusion: int = 1 # 0:off, 1:on
silenceFront: int = 1 # 0:off, 1:on
modelSamplingRate: int = 44100
@ -29,6 +30,7 @@ class DiffusionSVCSettings:
"kStep",
"speedUp",
"silenceFront",
"skipDiffusion",
]
floatData = ["silentThreshold"]
strData = ["f0Detector"]
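
For context on the `intData` / `floatData` / `strData` lists that `skipDiffusion` is being added to: the `update_settings` style methods in this codebase dispatch on these lists to cast incoming values. A reduced, illustrative sketch (not the actual settings class):

```
# Reduced sketch of the intData/floatData/strData dispatch used by
# update_settings(): the key's list decides how the value is cast.
from dataclasses import dataclass


@dataclass
class DemoSettings:
    kStep: int = 20
    skipDiffusion: int = 1
    silentThreshold: float = 0.00001
    f0Detector: str = "harvest"

    intData = ["kStep", "skipDiffusion"]
    floatData = ["silentThreshold"]
    strData = ["f0Detector"]


def update_settings(settings: DemoSettings, key: str, val):
    # Cast according to the list the key is registered in; unknown keys are ignored.
    if key in settings.intData:
        setattr(settings, key, int(val))
    elif key in settings.floatData:
        setattr(settings, key, float(val))
    elif key in settings.strData:
        setattr(settings, key, str(val))
    return settings


settings = update_settings(DemoSettings(), "skipDiffusion", "0")
print(settings.skipDiffusion)  # -> 0
```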

View File

@ -112,25 +112,27 @@ class DiffusionSVCInferencer(Inferencer):
k_step: int,
infer_speedup: int,
silence_front: float,
skip_diffusion: bool = True,
) -> torch.Tensor:
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
gt_spec = self.naive_model_call(feats, pitch, volume, spk_id=sid, spk_mix_dict=None, aug_shift=0, spk_emb=None)
# gt_spec = self.vocoder.extract(audio_t, 16000)
# gt_spec = torch.cat((gt_spec, gt_spec[:, -1:, :]), 1)
# print("[ ----Timer::1: ]", t.secs)
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
if skip_diffusion == 0:
out_mel = self.__call__(feats, pitch, volume, spk_id=sid, spk_mix_dict=None, aug_shift=0, gt_spec=gt_spec, infer_speedup=infer_speedup, method='dpm-solver', k_step=k_step, use_tqdm=False, spk_emb=None)
gt_spec = out_mel
# print("[ ----Timer::2: ]", t.secs)
with Timer("pre-process") as t: # NOQA
with Timer("pre-process", False) as t: # NOQA
if self.vocoder_onnx is None:
start_frame = int(silence_front * self.vocoder.vocoder_sample_rate / self.vocoder.vocoder_hop_size)
out_wav = self.mel2wav(out_mel, pitch, start_frame=start_frame)
out_wav = self.mel2wav(gt_spec, pitch, start_frame=start_frame)
out_wav *= mask
else:
out_wav = self.vocoder_onnx.infer(out_mel, pitch, silence_front, mask)
out_wav = self.vocoder_onnx.infer(gt_spec, pitch, silence_front, mask)
# print("[ ----Timer::3: ]", t.secs)
return out_wav.squeeze()
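
The `Timer("pre-process", False)` calls above read as an enable flag that turns the measurement off on the hot path. A minimal context-manager sketch with that shape; the project's actual `Timer` may differ in detail:

```
# Minimal Timer sketch: the second argument disables measurement entirely.
import time


class Timer:
    def __init__(self, title: str, enable: bool = True):
        self.title = title
        self.enable = enable
        self.secs = 0.0

    def __enter__(self):
        if self.enable:
            self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.enable:
            self.secs = time.perf_counter() - self.start
        # Returning None lets exceptions propagate as usual.


with Timer("pre-process", False) as t:
    sum(range(1000))  # work is untimed because the flag is False
print(t.secs)  # -> 0.0
```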

View File

@ -21,11 +21,16 @@ class Inferencer(Protocol):
def infer(
self,
audio_t: torch.Tensor,
feats: torch.Tensor,
pitch_length: torch.Tensor,
pitch: torch.Tensor | None,
pitchf: torch.Tensor | None,
pitch: torch.Tensor,
volume: torch.Tensor,
mask: torch.Tensor,
sid: torch.Tensor,
k_step: int,
infer_speedup: int,
silence_front: float,
skip_diffusion: bool = True,
) -> torch.Tensor:
...
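
The interface above is a `typing.Protocol`, so concrete inferencers satisfy it structurally, by matching the `infer` signature, without inheriting from it. A reduced sketch with only two of the parameters:

```
# Reduced Protocol sketch: structural typing, no inheritance required.
from typing import Protocol

import torch


class Inferencer(Protocol):
    def infer(self, feats: torch.Tensor, skip_diffusion: bool = True) -> torch.Tensor:
        ...


class DummyInferencer:
    # No inheritance needed; the matching signature is enough.
    def infer(self, feats: torch.Tensor, skip_diffusion: bool = True) -> torch.Tensor:
        return feats if skip_diffusion else feats * 0.5


def run(inferencer: Inferencer, feats: torch.Tensor) -> torch.Tensor:
    return inferencer.infer(feats, skip_diffusion=False)


print(run(DummyInferencer(), torch.ones(3)))
```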

View File

@ -81,23 +81,6 @@ class Pipeline(object):
@torch.no_grad()
def extract_volume_and_mask(self, audio: torch.Tensor, threshold: float):
'''
with Timer("[VolumeExt np]") as t:
for i in range(100):
volume = self.volumeExtractor.extract(audio)
time_np = t.secs
with Timer("[VolumeExt pt]") as t:
for i in range(100):
volume_t = self.volumeExtractor.extract_t(audio)
time_pt = t.secs
print("[Volume np]:", volume)
print("[Volume pt]:", volume_t)
print("[Perform]:", time_np, time_pt)
# -> [Perform]: 0.030178070068359375 0.005780220031738281 (RTX4090)
# -> [Perform]: 0.029046058654785156 0.0025115013122558594 (CPU i9 13900KF)
# ---> For this amount of processing, Torch on the CPU is faster
'''
volume_t = self.volumeExtractor.extract_t(audio)
mask = self.volumeExtractor.get_mask_from_volume_t(volume_t, self.inferencer_block_size, threshold=threshold)
volume = volume_t.unsqueeze(-1).unsqueeze(0)
@ -116,10 +99,11 @@ class Pipeline(object):
silence_front,
embOutputLayer,
useFinalProj,
protect=0.5
protect=0.5,
skip_diffusion=True,
):
# print("---------- pipe line --------------------")
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
audio_t = torch.from_numpy(audio).float().unsqueeze(0).to(self.device)
audio16k = self.resamplerIn(audio_t)
volume, mask = self.extract_volume_and_mask(audio16k, threshold=-60.0)
@ -127,7 +111,7 @@ class Pipeline(object):
n_frames = int(audio16k.size(-1) // self.hop_size + 1)
# print("[Timer::1: ]", t.secs)
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
# Pitch detection
try:
# pitch = self.pitchExtractor.extract(
@ -157,7 +141,7 @@ class Pipeline(object):
feats = feats.view(1, -1)
# print("[Timer::2: ]", t.secs)
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
# embedding
with autocast(enabled=self.isHalf):
@ -175,7 +159,7 @@ class Pipeline(object):
feats = F.interpolate(feats.permute(0, 2, 1), size=int(n_frames), mode='nearest').permute(0, 2, 1)
# print("[Timer::3: ]", t.secs)
with Timer("pre-process") as t:
with Timer("pre-process", False) as t:
# Run inference
try:
with torch.no_grad():
@ -191,7 +175,8 @@ class Pipeline(object):
sid,
k_step,
infer_speedup,
silence_front=silence_front
silence_front=silence_front,
skip_diffusion=skip_diffusion
).to(dtype=torch.float32),
-1.0,
1.0,
@ -206,7 +191,7 @@ class Pipeline(object):
raise e
# print("[Timer::4: ]", t.secs)
with Timer("pre-process") as t: # NOQA
with Timer("pre-process", False) as t: # NOQA
feats_buffer = feats.squeeze(0).detach().cpu()
if pitch is not None:
pitch_buffer = pitch.squeeze(0).detach().cpu()

View File

@ -7,19 +7,23 @@ from voice_changer.DiffusionSVC.pitchExtractor.PitchExtractorManager import Pitc
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
import os
import torch
from torchaudio.transforms import Resample
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
def createPipeline(modelSlot: DiffusionSVCModelSlot, gpu: int, f0Detector: str, inputSampleRate: int, outputSampleRate: int):
dev = DeviceManager.get_instance().getDevice(gpu)
vcparams = VoiceChangerParamsManager.get_instance().params
# half = DeviceManager.get_instance().halfPrecisionAvailable(gpu)
half = False
# Create the inferencer
try:
inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelSlot.modelFile, gpu)
modelPath = os.path.join(vcparams.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))
inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelPath, gpu)
except Exception as e:
print("[Voice Changer] exception! loading inferencer", e)
traceback.print_exc()

View File

@ -20,6 +20,13 @@ AudioDeviceKind: TypeAlias = Literal["input", "output"]
logger = VoiceChangaerLogger.get_instance().getLogger()
# See https://github.com/w-okada/voice-changer/issues/620
LocalServerDeviceMode: TypeAlias = Literal[
"NoMonitorSeparate",
"WithMonitorStandard",
"WithMonitorAllSeparate",
]
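
Issue #620 is about how these three modes map to sounddevice stream layouts. A reduced sketch of how a mode might be selected and what each one implies; the host-API comparison here is an assumption for illustration, and the real `judgeServerDeviceMode()` appears further down in this file:

```
# Illustrative mode selection for the three LocalServerDeviceMode values.
from typing import Literal

LocalServerDeviceMode = Literal["NoMonitorSeparate", "WithMonitorStandard", "WithMonitorAllSeparate"]


def judge_mode(monitor_device_id: int, input_hostapi: str, monitor_hostapi: str) -> LocalServerDeviceMode:
    if monitor_device_id == -1:
        # No monitor configured: one input stream plus one output stream.
        return "NoMonitorSeparate"
    if input_hostapi == monitor_hostapi:
        # Input and monitor can share a duplex stream (assumption for this sketch).
        return "WithMonitorStandard"
    # Otherwise input, output and monitor each get their own stream.
    return "WithMonitorAllSeparate"


print(judge_mode(-1, "MME", "MME"))    # -> NoMonitorSeparate
print(judge_mode(5, "MME", "WASAPI"))  # -> WithMonitorAllSeparate
```
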
@dataclass
class ServerDeviceSettings:
@ -39,6 +46,7 @@ class ServerDeviceSettings:
serverReadChunkSize: int = 256
serverInputAudioGain: float = 1.0
serverOutputAudioGain: float = 1.0
serverMonitorAudioGain: float = 1.0
exclusiveMode: bool = False
@ -59,6 +67,7 @@ EditableServerDeviceSettings = {
"floatData": [
"serverInputAudioGain",
"serverOutputAudioGain",
"serverMonitorAudioGain",
],
"boolData": [
"exclusiveMode"
@ -95,6 +104,14 @@ class ServerDevice:
self.monQueue = Queue()
self.performance = []
# for detecting setting changes
self.currentServerInputDeviceId = -1
self.currentServerOutputDeviceId = -1
self.currentServerMonitorDeviceId = -1
self.currentModelSamplingRate = -1
self.currentInputChunkNum = -1
self.currentAudioSampleRate = -1
def getServerInputAudioDevice(self, index: int):
audioinput, _audiooutput = list_audio_device()
serverAudioDevice = [x for x in audioinput if x.index == index]
@ -111,36 +128,51 @@ class ServerDevice:
else:
return None
def audio_callback(self, indata: np.ndarray, outdata: np.ndarray, frames, times, status):
try:
###########################################
# Callback Section
###########################################
def _processData(self, indata: np.ndarray):
indata = indata * self.settings.serverInputAudioGain
with Timer("all_inference_time") as t:
unpackedData = librosa.to_mono(indata.T) * 32768.0
unpackedData = unpackedData.astype(np.int16)
out_wav, times = self.serverDeviceCallbacks.on_request(unpackedData)
outputChannels = outdata.shape[1]
outdata[:] = np.repeat(out_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
outdata[:] = outdata * self.settings.serverOutputAudioGain
return out_wav, times
def _processDataWithTime(self, indata: np.ndarray):
with Timer("all_inference_time") as t:
out_wav, times = self._processData(indata)
all_inference_time = t.secs
self.performance = [all_inference_time] + times
self.serverDeviceCallbacks.emitTo(self.performance)
self.performance = [round(x * 1000) for x in self.performance]
return out_wav
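
`_processData` and `_processDataWithTime` factor the per-block work out of the individual callbacks: input gain, mono mixdown and int16 scaling before the model, then channel replication and output gain after it. A reduced numeric sketch of that core, with the model call replaced by a pass-through:

```
# Reduced sketch of the per-block processing core.
import numpy as np
import librosa


def process_block(indata: np.ndarray, output_channels: int, in_gain: float, out_gain: float) -> np.ndarray:
    indata = indata * in_gain
    mono = librosa.to_mono(indata.T) * 32768.0   # float [-1, 1] -> int16 scale
    mono = mono.astype(np.int16)
    out_wav = mono                               # stand-in for on_request(mono)
    outdata = np.repeat(out_wav, output_channels).reshape(-1, output_channels) / 32768.0
    return outdata * out_gain


block = np.zeros((256, 2), dtype=np.float32)     # 256 stereo float frames
print(process_block(block, output_channels=2, in_gain=1.0, out_gain=1.0).shape)  # (256, 2)
```
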
def audio_callback_outQueue(self, indata: np.ndarray, outdata: np.ndarray, frames, times, status):
try:
out_wav = self._processDataWithTime(indata)
self.outQueue.put(out_wav)
outputChannels = outdata.shape[1] # output to the monitor
outdata[:] = np.repeat(out_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
outdata[:] = outdata * self.settings.serverMonitorAudioGain
except Exception as e:
print("[Voice Changer] ex:", e)
def audioInput_callback(self, indata: np.ndarray, frames, times, status):
def audioInput_callback_outQueue(self, indata: np.ndarray, frames, times, status):
try:
indata = indata * self.settings.serverInputAudioGain
with Timer("all_inference_time") as t:
unpackedData = librosa.to_mono(indata.T) * 32768.0
unpackedData = unpackedData.astype(np.int16)
out_wav, times = self.serverDeviceCallbacks.on_request(unpackedData)
out_wav = self._processDataWithTime(indata)
self.outQueue.put(out_wav)
except Exception as e:
print("[Voice Changer][ServerDevice][audioInput_callback] ex:", e)
# import traceback
# traceback.print_exc()
def audioInput_callback_outQueue_monQueue(self, indata: np.ndarray, frames, times, status):
try:
out_wav = self._processDataWithTime(indata)
self.outQueue.put(out_wav)
self.monQueue.put(out_wav)
all_inference_time = t.secs
self.performance = [all_inference_time] + times
self.serverDeviceCallbacks.emitTo(self.performance)
self.performance = [round(x * 1000) for x in self.performance]
except Exception as e:
print("[Voice Changer][ServerDevice][audioInput_callback] ex:", e)
# import traceback
@ -166,15 +198,144 @@ class ServerDevice:
self.monQueue.get()
outputChannels = outdata.shape[1]
outdata[:] = np.repeat(mon_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
outdata[:] = outdata * self.settings.serverOutputAudioGain # the Output gain is reused here
# When monitor mode is enabled, the monitor device's sampling rate takes priority, so no resampling is needed
outdata[:] = outdata * self.settings.serverMonitorAudioGain
except Exception as e:
print("[Voice Changer][ServerDevice][audioMonitor_callback] ex:", e)
# import traceback
# traceback.print_exc()
###########################################
# Main Loop Section
###########################################
def checkSettingChanged(self):
if self.settings.serverAudioStated != 1:
print(f"serverAudioStarted Changed: {self.settings.serverAudioStated}")
return True
elif self.currentServerInputDeviceId != self.settings.serverInputDeviceId:
print(f"serverInputDeviceId Changed: {self.currentServerInputDeviceId} -> {self.settings.serverInputDeviceId}")
return True
elif self.currentServerOutputDeviceId != self.settings.serverOutputDeviceId:
print(f"serverOutputDeviceId Changed: {self.currentServerOutputDeviceId} -> {self.settings.serverOutputDeviceId}")
return True
elif self.currentServerMonitorDeviceId != self.settings.serverMonitorDeviceId:
print(f"serverMonitorDeviceId Changed: {self.currentServerMonitorDeviceId} -> {self.settings.serverMonitorDeviceId}")
return True
elif self.currentModelSamplingRate != self.serverDeviceCallbacks.get_processing_sampling_rate():
print(f"currentModelSamplingRate Changed: {self.currentModelSamplingRate} -> {self.serverDeviceCallbacks.get_processing_sampling_rate()}")
return True
elif self.currentInputChunkNum != self.settings.serverReadChunkSize:
print(f"currentInputChunkNum Changed: {self.currentInputChunkNum} -> {self.settings.serverReadChunkSize}")
return True
elif self.currentAudioSampleRate != self.settings.serverAudioSampleRate:
print(f"currentAudioSampleRate Changed: {self.currentAudioSampleRate} -> {self.settings.serverAudioSampleRate}")
return True
else:
return False
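
`checkSettingChanged()` is what lets the run* loops below keep their sounddevice streams open until the GUI changes something: when it returns True the inner loop breaks, the `with` blocks close the streams, and the outer `start()` loop reopens them with the new configuration. A stripped-down sketch of that rebuild pattern, using a placeholder stream instead of sounddevice:

```
# Stripped-down sketch of the rebuild loop: hold a (placeholder) stream open,
# poll for setting changes, and let the with-block close it when they change.
import time
from contextlib import contextmanager


@contextmanager
def open_stream(device_id: int):
    print(f"open stream on device {device_id}")
    try:
        yield
    finally:
        print(f"close stream on device {device_id}")


def serve(settings: dict):
    while True:                       # outer start() loop
        snapshot = dict(settings)     # values captured when the streams were opened
        with open_stream(settings["inputDeviceId"]):
            while snapshot == settings:        # stand-in for "not checkSettingChanged()"
                time.sleep(0.1)                # the real loop sleeps 2 seconds
                settings["inputDeviceId"] = 7  # simulate a change from the GUI
        break  # the real loop would go around and reopen with the new device


serve({"inputDeviceId": 3})
```
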
def runNoMonitorSeparate(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, inputExtraSetting, outputExtraSetting):
with sd.InputStream(
callback=self.audioInput_callback_outQueue,
dtype="float32",
device=self.settings.serverInputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverInputAudioSampleRate,
channels=inputMaxChannel,
extra_settings=inputExtraSetting
):
with sd.OutputStream(
callback=self.audioOutput_callback,
dtype="float32",
device=self.settings.serverOutputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverOutputAudioSampleRate,
channels=outputMaxChannel,
extra_settings=outputExtraSetting
):
while True:
changed = self.checkSettingChanged()
if changed:
break
time.sleep(2)
print(f"[Voice Changer] server audio performance {self.performance}")
print(f" status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
print(f" input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
print(f" output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
# print(f" monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{self.serverMonitorAudioDevice.maxOutputChannels}")
def runWithMonitorStandard(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, monitorMaxChannel: int, inputExtraSetting, outputExtraSetting, monitorExtraSetting):
with sd.Stream(
callback=self.audio_callback_outQueue,
dtype="float32",
device=(self.settings.serverInputDeviceId, self.settings.serverMonitorDeviceId),
blocksize=block_frame,
samplerate=self.settings.serverInputAudioSampleRate,
channels=(inputMaxChannel, monitorMaxChannel),
extra_settings=[inputExtraSetting, monitorExtraSetting]
):
with sd.OutputStream(
callback=self.audioOutput_callback,
dtype="float32",
device=self.settings.serverOutputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverOutputAudioSampleRate,
channels=outputMaxChannel,
extra_settings=outputExtraSetting
):
while True:
changed = self.checkSettingChanged()
if changed:
break
time.sleep(2)
print(f"[Voice Changer] server audio performance {self.performance}")
print(f" status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
print(f" input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
print(f" output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
print(f" monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{monitorMaxChannel}")
def runWithMonitorAllSeparate(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, monitorMaxChannel: int, inputExtraSetting, outputExtraSetting, monitorExtraSetting):
with sd.InputStream(
callback=self.audioInput_callback_outQueue_monQueue,
dtype="float32",
device=self.settings.serverInputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverInputAudioSampleRate,
channels=inputMaxChannel,
extra_settings=inputExtraSetting
):
with sd.OutputStream(
callback=self.audioOutput_callback,
dtype="float32",
device=self.settings.serverOutputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverOutputAudioSampleRate,
channels=outputMaxChannel,
extra_settings=outputExtraSetting
):
with sd.OutputStream(
callback=self.audioMonitor_callback,
dtype="float32",
device=self.settings.serverMonitorDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverMonitorAudioSampleRate,
channels=monitorMaxChannel,
extra_settings=monitorExtraSetting
):
while True:
changed = self.checkSettingChanged()
if changed:
break
time.sleep(2)
print(f"[Voice Changer] server audio performance {self.performance}")
print(f" status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
print(f" input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
print(f" output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
print(f" monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{monitorMaxChannel}")
###########################################
# Start Section
###########################################
def start(self):
currentModelSamplingRate = -1
self.currentModelSamplingRate = -1
while True:
if self.settings.serverAudioStated == 0 or self.settings.serverInputDeviceId == -1:
time.sleep(2)
@ -183,9 +344,9 @@ class ServerDevice:
sd._initialize()
# Current Device ID
currentServerInputDeviceId = self.settings.serverInputDeviceId
currentServerOutputDeviceId = self.settings.serverOutputDeviceId
currentServerMonitorDeviceId = self.settings.serverMonitorDeviceId
self.currentServerInputDeviceId = self.settings.serverInputDeviceId
self.currentServerOutputDeviceId = self.settings.serverOutputDeviceId
self.currentServerMonitorDeviceId = self.settings.serverMonitorDeviceId
# Identify the devices
serverInputAudioDevice = self.getServerInputAudioDevice(self.settings.serverInputDeviceId)
@ -220,17 +381,17 @@ class ServerDevice:
# Sampling rate
# Unify everything to the same sampling rate (samples can run short during conversion; if a padding method becomes clear, they could be set individually)
currentAudioSampleRate = self.settings.serverAudioSampleRate
self.currentAudioSampleRate = self.settings.serverAudioSampleRate
try:
currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
self.currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
except Exception as e:
print("[Voice Changer] ex: get_processing_sampling_rate", e)
time.sleep(2)
continue
self.settings.serverInputAudioSampleRate = currentAudioSampleRate
self.settings.serverOutputAudioSampleRate = currentAudioSampleRate
self.settings.serverMonitorAudioSampleRate = currentAudioSampleRate
self.settings.serverInputAudioSampleRate = self.currentAudioSampleRate
self.settings.serverOutputAudioSampleRate = self.currentAudioSampleRate
self.settings.serverMonitorAudioSampleRate = self.currentAudioSampleRate
# Sample Rate Check
inputAudioSampleRateAvailable = checkSamplingRate(self.settings.serverInputDeviceId, self.settings.serverInputAudioSampleRate, "input")
@ -238,7 +399,7 @@ class ServerDevice:
monitorAudioSampleRateAvailable = checkSamplingRate(self.settings.serverMonitorDeviceId, self.settings.serverMonitorAudioSampleRate, "output") if serverMonitorAudioDevice else True
print("Sample Rate:")
print(f" [Model]: {currentModelSamplingRate}")
print(f" [Model]: {self.currentModelSamplingRate}")
print(f" [Input]: {self.settings.serverInputAudioSampleRate} -> {inputAudioSampleRateAvailable}")
print(f" [Output]: {self.settings.serverOutputAudioSampleRate} -> {outputAudioSampleRateAvailable}")
if serverMonitorAudioDevice is not None:
@ -274,153 +435,51 @@ class ServerDevice:
self.serverDeviceCallbacks.setOutputSamplingRate(self.settings.serverOutputAudioSampleRate)
# Compute the block size
currentInputChunkNum = self.settings.serverReadChunkSize
self.currentInputChunkNum = self.settings.serverReadChunkSize
# block_frame = currentInputChunkNum * 128
block_frame = int(currentInputChunkNum * 128 * (self.settings.serverInputAudioSampleRate / 48000))
block_frame = int(self.currentInputChunkNum * 128 * (self.settings.serverInputAudioSampleRate / 48000))
sd.default.blocksize = block_frame
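# Illustration of the block-size arithmetic above with hypothetical values: the 128-sample base
# chunk is scaled by the device sample rate relative to 48 kHz.
example_chunk = 192                  # serverReadChunkSize
example_input_sr = 44100             # serverInputAudioSampleRate
example_block_frame = int(example_chunk * 128 * (example_input_sr / 48000))
assert example_block_frame == 22579  # instead of 24576 at 48 kHz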
# main loop
try:
with sd.InputStream(
callback=self.audioInput_callback,
dtype="float32",
device=self.settings.serverInputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverInputAudioSampleRate,
channels=serverInputAudioDevice.maxInputChannels,
extra_settings=inputExtraSetting
):
with sd.OutputStream(
callback=self.audioOutput_callback,
dtype="float32",
device=self.settings.serverOutputDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverOutputAudioSampleRate,
channels=serverOutputAudioDevice.maxOutputChannels,
extra_settings=outputExtraSetting
):
if self.settings.serverMonitorDeviceId != -1:
with sd.OutputStream(
callback=self.audioMonitor_callback,
dtype="float32",
device=self.settings.serverMonitorDeviceId,
blocksize=block_frame,
samplerate=self.settings.serverMonitorAudioSampleRate,
channels=serverMonitorAudioDevice.maxOutputChannels,
extra_settings=monitorExtraSetting
):
while (
self.settings.serverAudioStated == 1 and
currentServerInputDeviceId == self.settings.serverInputDeviceId and
currentServerOutputDeviceId == self.settings.serverOutputDeviceId and
currentServerMonitorDeviceId == self.settings.serverMonitorDeviceId and
currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and
currentInputChunkNum == self.settings.serverReadChunkSize and
currentAudioSampleRate == self.settings.serverAudioSampleRate
):
time.sleep(2)
print(f"[Voice Changer] server audio performance {self.performance}")
print(f" status: started:{self.settings.serverAudioStated}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}")
print(f" input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{serverInputAudioDevice.maxInputChannels}")
print(f" output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{serverOutputAudioDevice.maxOutputChannels}")
print(f" monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{serverMonitorAudioDevice.maxOutputChannels}")
# See https://github.com/w-okada/voice-changer/issues/620
def judgeServerDeviceMode() -> LocalServerDeviceMode:
if self.settings.serverMonitorDeviceId == -1:
return "NoMonitorSeparate"
else:
while (
self.settings.serverAudioStated == 1 and
currentServerInputDeviceId == self.settings.serverInputDeviceId and
currentServerOutputDeviceId == self.settings.serverOutputDeviceId and
currentServerMonitorDeviceId == self.settings.serverMonitorDeviceId and
currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and
currentInputChunkNum == self.settings.serverReadChunkSize and
currentAudioSampleRate == self.settings.serverAudioSampleRate
):
time.sleep(2)
print(f"[Voice Changer] server audio performance {self.performance}")
print(f" status: started:{self.settings.serverAudioStated}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}]")
print(f" input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{serverInputAudioDevice.maxInputChannels}")
print(f" output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{serverOutputAudioDevice.maxOutputChannels}")
if serverInputAudioDevice.hostAPI == serverOutputAudioDevice.hostAPI and serverInputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI: # all the same
return "WithMonitorStandard"
elif serverInputAudioDevice.hostAPI != serverOutputAudioDevice.hostAPI and serverInputAudioDevice.hostAPI != serverMonitorAudioDevice.hostAPI and serverOutputAudioDevice.hostAPI != serverMonitorAudioDevice.hostAPI: # all different
return "WithMonitorAllSeparate"
elif serverInputAudioDevice.hostAPI == serverOutputAudioDevice.hostAPI: # only in/out are the same
return "WithMonitorAllSeparate"
elif serverInputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI: # only in/mon are the same
return "WithMonitorStandard"
elif serverOutputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI: # only out/mon are the same
return "WithMonitorAllSeparate"
else:
raise RuntimeError(f"Cannot JudgeServerMode, in:{serverInputAudioDevice.hostAPI}, mon:{serverMonitorAudioDevice.hostAPI}, out:{serverOutputAudioDevice.hostAPI}")
serverDeviceMode = judgeServerDeviceMode()
if serverDeviceMode == "NoMonitorSeparate":
self.runNoMonitorSeparate(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting)
elif serverDeviceMode == "WithMonitorStandard":
self.runWithMonitorStandard(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, serverMonitorAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting, monitorExtraSetting)
elif serverDeviceMode == "WithMonitorAllSeparate":
self.runWithMonitorAllSeparate(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, serverMonitorAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting, monitorExtraSetting)
else:
raise RuntimeError(f"Unknown ServerDeviceMode: {serverDeviceMode}")
except Exception as e:
print("[Voice Changer] processing, ex:", e)
import traceback
traceback.print_exc()
time.sleep(2)
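# Hedged restatement (not the project's code) of the host-API decision in judgeServerDeviceMode
# above: no monitor device means NoMonitorSeparate; input and monitor sharing a host API
# (including the all-same case) means WithMonitorStandard; every other combination falls back to
# WithMonitorAllSeparate. The device arguments are hypothetical stand-ins with a hostAPI field.
def judge_mode(in_dev, out_dev, mon_dev):
    # out_dev is kept only for signature symmetry; it does not change the outcome
    if mon_dev is None:
        return "NoMonitorSeparate"
    if in_dev.hostAPI == mon_dev.hostAPI:   # covers the all-same case as well
        return "WithMonitorStandard"
    return "WithMonitorAllSeparate"         # all different, in/out only, or out/mon only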
def start2(self):
# currentInputDeviceId = -1
# currentOutputDeviceId = -1
# currentInputChunkNum = -1
currentModelSamplingRate = -1
while True:
if self.settings.serverAudioStated == 0 or self.settings.serverInputDeviceId == -1:
time.sleep(2)
else:
sd._terminate()
sd._initialize()
sd.default.device[0] = self.settings.serverInputDeviceId
sd.default.device[1] = self.settings.serverOutputDeviceId
serverInputAudioDevice = self.getServerInputAudioDevice(sd.default.device[0])
serverOutputAudioDevice = self.getServerOutputAudioDevice(sd.default.device[1])
print("Devices:", serverInputAudioDevice, serverOutputAudioDevice)
if serverInputAudioDevice is None or serverOutputAudioDevice is None:
time.sleep(2)
print("serverInputAudioDevice or serverOutputAudioDevice is None")
continue
sd.default.channels[0] = serverInputAudioDevice.maxInputChannels
sd.default.channels[1] = serverOutputAudioDevice.maxOutputChannels
currentInputChunkNum = self.settings.serverReadChunkSize
block_frame = currentInputChunkNum * 128
# sample rate precheck (ALSA cannot use 40000?)
try:
currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
except Exception as e:
print("[Voice Changer] ex: get_processing_sampling_rate", e)
continue
try:
with sd.Stream(
callback=self.audio_callback,
blocksize=block_frame,
# samplerate=currentModelSamplingRate,
dtype="float32",
# dtype="int16",
# channels=[currentInputChannelNum, currentOutputChannelNum],
):
pass
self.settings.serverInputAudioSampleRate = currentModelSamplingRate
self.serverDeviceCallbacks.setInputSamplingRate(currentModelSamplingRate)
self.serverDeviceCallbacks.setOutputSamplingRate(currentModelSamplingRate)
print(f"[Voice Changer] sample rate {self.settings.serverInputAudioSampleRate}")
except Exception as e:
print("[Voice Changer] ex: fallback to device default samplerate", e)
print("[Voice Changer] device default samplerate", serverInputAudioDevice.default_samplerate)
self.settings.serverInputAudioSampleRate = round(serverInputAudioDevice.default_samplerate)
self.serverDeviceCallbacks.setInputSamplingRate(round(serverInputAudioDevice.default_samplerate))
self.serverDeviceCallbacks.setOutputSamplingRate(round(serverInputAudioDevice.default_samplerate))
sd.default.samplerate = self.settings.serverInputAudioSampleRate
sd.default.blocksize = block_frame
# main loop
try:
with sd.Stream(
callback=self.audio_callback,
# blocksize=block_frame,
# samplerate=vc.settings.serverInputAudioSampleRate,
dtype="float32",
# dtype="int16",
# channels=[currentInputChannelNum, currentOutputChannelNum],
):
while self.settings.serverAudioStated == 1 and sd.default.device[0] == self.settings.serverInputDeviceId and sd.default.device[1] == self.settings.serverOutputDeviceId and currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and currentInputChunkNum == self.settings.serverReadChunkSize:
time.sleep(2)
print("[Voice Changer] server audio", self.performance)
print(f"[Voice Changer] started:{self.settings.serverAudioStated}, input:{sd.default.device[0]}, output:{sd.default.device[1]}, mic_sr:{self.settings.serverInputAudioSampleRate}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}, ch:[{sd.default.channels}]")
except Exception as e:
print("[Voice Changer] ex:", e)
time.sleep(2)
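# Minimal sketch of the sample-rate probing idea used in start2 above: try the preferred rate and
# fall back to the device default when the backend rejects it. check_input_settings is a standard
# sounddevice helper; the function below is an assumed illustration, not the project's API.
import sounddevice as sd

def pick_sample_rate(preferred: int, device_default: float) -> int:
    try:
        sd.check_input_settings(samplerate=preferred)  # raises if the rate is unsupported
        return preferred
    except Exception:
        return round(device_default)                   # fall back to the device default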
###########################################
# Info Section
###########################################
def get_info(self):
data = asdict(self.settings)
try:

View File

@ -1,6 +1,7 @@
import sys
import os
from data.ModelSlot import MMVCv13ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerModel import AudioInOut
@ -63,19 +64,22 @@ class MMVCv13:
def initialize(self):
print("[Voice Changer] [MMVCv13] Initializing... ")
vcparams = VoiceChangerParamsManager.get_instance().params
configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
self.hps = get_hparams_from_file(self.slotInfo.configFile)
self.hps = get_hparams_from_file(configPath)
if self.slotInfo.isONNX:
providers, options = self.getOnnxExecutionProvider()
self.onnx_session = onnxruntime.InferenceSession(
self.slotInfo.modelFile,
modelPath,
providers=providers,
provider_options=options,
)
else:
self.net_g = SynthesizerTrn(len(symbols), self.hps.data.filter_length // 2 + 1, self.hps.train.segment_size // self.hps.data.hop_length, n_speakers=self.hps.data.n_speakers, **self.hps.model)
self.net_g.eval()
load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
load_checkpoint(modelPath, self.net_g, None)
# Other settings
self.settings.srcId = self.slotInfo.srcId
@ -105,8 +109,10 @@ class MMVCv13:
if key == "gpu" and self.slotInfo.isONNX:
providers, options = self.getOnnxExecutionProvider()
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
self.onnx_session = onnxruntime.InferenceSession(
self.slotInfo.modelFile,
modelPath,
providers=providers,
provider_options=options,
)
@ -249,3 +255,15 @@ class MMVCv13:
sys.modules.pop(key)
except: # NOQA
pass
def get_model_current(self):
return [
{
"key": "srcId",
"val": self.settings.srcId,
},
{
"key": "dstId",
"val": self.settings.dstId,
}
]
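# The recurring change in this and the following files is the path convention: slot files are now
# stored as bare names and resolved under <model_dir>/<slotIndex>/ at load time. A small sketch of
# that convention (the helper name is hypothetical):
import os
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager

def resolve_slot_file(slot_index: int, file_name: str) -> str:
    params = VoiceChangerParamsManager.get_instance().params
    return os.path.join(params.model_dir, str(slot_index), file_name)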

View File

@ -1,6 +1,7 @@
import sys
import os
from data.ModelSlot import MMVCv15ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerModel import AudioInOut
if sys.platform.startswith("darwin"):
@ -70,7 +71,11 @@ class MMVCv15:
def initialize(self):
print("[Voice Changer] [MMVCv15] Initializing... ")
self.hps = get_hparams_from_file(self.slotInfo.configFile)
vcparams = VoiceChangerParamsManager.get_instance().params
configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
self.hps = get_hparams_from_file(configPath)
self.net_g = SynthesizerTrn(
spec_channels=self.hps.data.filter_length // 2 + 1,
@ -96,7 +101,7 @@ class MMVCv15:
self.onxx_input_length = 8192
providers, options = self.getOnnxExecutionProvider()
self.onnx_session = onnxruntime.InferenceSession(
self.slotInfo.modelFile,
modelPath,
providers=providers,
provider_options=options,
)
@ -108,7 +113,7 @@ class MMVCv15:
self.settings.maxInputLength = self.onxx_input_length - (0.012 * self.hps.data.sampling_rate) - 1024 # for ONNX the input length is fixed (the crossfade 1024 is provisional) # NOQA
else:
self.net_g.eval()
load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
load_checkpoint(modelPath, self.net_g, None)
# Other settings
self.settings.srcId = self.slotInfo.srcId
@ -139,8 +144,10 @@ class MMVCv15:
setattr(self.settings, key, val)
if key == "gpu" and self.slotInfo.isONNX:
providers, options = self.getOnnxExecutionProvider()
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
self.onnx_session = onnxruntime.InferenceSession(
self.slotInfo.modelFile,
modelPath,
providers=providers,
provider_options=options,
)
@ -208,7 +215,8 @@ class MMVCv15:
solaSearchFrame: int = 0,
):
# Update maxInputLength (doing it here is inefficient, but fine for now)
self.settings.maxInputLength = self.onxx_input_length - crossfadeSize - solaSearchFrame # for ONNX the input length is fixed (the crossfade 1024 is provisional) # NOQA
if self.slotInfo.isONNX:
self.settings.maxInputLength = self.onxx_input_length - crossfadeSize - solaSearchFrame # for ONNX the input length is fixed (the crossfade 1024 is provisional) # NOQA value returned by get_info; not used by the processing inside this function
newData = newData.astype(np.float32) / self.hps.data.max_wav_value
@ -310,3 +318,19 @@ class MMVCv15:
sys.modules.pop(key)
except: # NOQA
pass
def get_model_current(self):
return [
{
"key": "srcId",
"val": self.settings.srcId,
},
{
"key": "dstId",
"val": self.settings.dstId,
},
{
"key": "f0Factor",
"val": self.settings.f0Factor,
}
]

View File

@ -1,6 +1,7 @@
import os
from data.ModelSlot import MMVCv15ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator
@ -15,7 +16,9 @@ class MMVCv15ModelSlotGenerator(ModelSlotGenerator):
elif file.kind == "mmvcv15Config":
slotInfo.configFile = file.name
elif file.kind == "mmvcv15Correspondence":
with open(file.name, "r") as f:
vcparams = VoiceChangerParamsManager.get_instance().params
filePath = os.path.join(vcparams.model_dir, str(props.slot), file.name)
with open(filePath, "r") as f:
slotInfo.speakers = {}
while True:
line = f.readline()

View File

@ -4,17 +4,17 @@ import torch
from const import UPLOAD_DIR
from voice_changer.RVC.modelMerger.MergeModel import merge_model
from voice_changer.utils.ModelMerger import ModelMerger, ModelMergerRequest
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
class RVCModelMerger(ModelMerger):
@classmethod
def merge_models(cls, request: ModelMergerRequest, storeSlot: int):
print("[Voice Changer] MergeRequest:", request)
merged = merge_model(request)
def merge_models(cls, params: VoiceChangerParams, request: ModelMergerRequest, storeSlot: int):
merged = merge_model(params, request)
# For now, store the file in the upload folder (historical reasons).
# Calling loadModel afterwards moves it into the persistent model folder.
storeDir = os.path.join(UPLOAD_DIR, f"{storeSlot}")
storeDir = os.path.join(UPLOAD_DIR)
print("[Voice Changer] store merged model to:", storeDir)
os.makedirs(storeDir, exist_ok=True)
storeFile = os.path.join(storeDir, "merged.pth")

View File

@ -5,7 +5,8 @@ import torch
import onnxruntime
import json
from data.ModelSlot import ModelSlot, RVCModelSlot
from data.ModelSlot import RVCModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator
@ -13,6 +14,7 @@ from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator
class RVCModelSlotGenerator(ModelSlotGenerator):
@classmethod
def loadModel(cls, props: LoadModelParams):
vcparams = VoiceChangerParamsManager.get_instance().params
slotInfo: RVCModelSlot = RVCModelSlot()
for file in props.files:
if file.kind == "rvcModel":
@ -24,17 +26,20 @@ class RVCModelSlotGenerator(ModelSlotGenerator):
slotInfo.defaultProtect = 0.5
slotInfo.isONNX = slotInfo.modelFile.endswith(".onnx")
slotInfo.name = os.path.splitext(os.path.basename(slotInfo.modelFile))[0]
print("RVC:: slotInfo.modelFile", slotInfo.modelFile)
# slotInfo.iconFile = "/assets/icons/noimage.png"
modelPath = os.path.join(vcparams.model_dir, str(props.slot), os.path.basename(slotInfo.modelFile))
if slotInfo.isONNX:
slotInfo = cls._setInfoByONNX(slotInfo)
slotInfo = cls._setInfoByONNX(modelPath, slotInfo)
else:
slotInfo = cls._setInfoByPytorch(slotInfo)
slotInfo = cls._setInfoByPytorch(modelPath, slotInfo)
return slotInfo
@classmethod
def _setInfoByPytorch(cls, slot: ModelSlot):
cpt = torch.load(slot.modelFile, map_location="cpu")
def _setInfoByPytorch(cls, modelPath: str, slot: RVCModelSlot):
cpt = torch.load(modelPath, map_location="cpu")
config_len = len(cpt["config"])
version = cpt.get("version", "v1")
@ -113,8 +118,8 @@ class RVCModelSlotGenerator(ModelSlotGenerator):
return slot
@classmethod
def _setInfoByONNX(cls, slot: ModelSlot):
tmp_onnx_session = onnxruntime.InferenceSession(slot.modelFile, providers=["CPUExecutionProvider"])
def _setInfoByONNX(cls, modelPath: str, slot: RVCModelSlot):
tmp_onnx_session = onnxruntime.InferenceSession(modelPath, providers=["CPUExecutionProvider"])
modelmeta = tmp_onnx_session.get_modelmeta()
try:
slot = RVCModelSlot(**asdict(slot))

View File

@ -0,0 +1,289 @@
'''
For VoiceChangerV2
'''
from dataclasses import asdict
import numpy as np
import torch
from data.ModelSlot import RVCModelSlot
from mods.log_control import VoiceChangaerLogger
from voice_changer.RVC.RVCSettings import RVCSettings
from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
from voice_changer.utils.VoiceChangerModel import AudioInOut, PitchfInOut, FeatureInOut, VoiceChangerModel
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
from voice_changer.RVC.onnxExporter.export2onnx import export2onnx
from voice_changer.RVC.pitchExtractor.PitchExtractorManager import PitchExtractorManager
from voice_changer.RVC.pipeline.PipelineGenerator import createPipeline
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.pipeline.Pipeline import Pipeline
from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException, PipelineNotInitializedException
import resampy
from typing import cast
logger = VoiceChangaerLogger.get_instance().getLogger()
class RVCr2(VoiceChangerModel):
def __init__(self, params: VoiceChangerParams, slotInfo: RVCModelSlot):
logger.info("[Voice Changer] [RVCr2] Creating instance ")
self.deviceManager = DeviceManager.get_instance()
EmbedderManager.initialize(params)
PitchExtractorManager.initialize(params)
self.settings = RVCSettings()
self.params = params
# self.pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)
self.pipeline: Pipeline | None = None
self.audio_buffer: AudioInOut | None = None
self.pitchf_buffer: PitchfInOut | None = None
self.feature_buffer: FeatureInOut | None = None
self.prevVol = 0.0
self.slotInfo = slotInfo
# self.initialize()
def initialize(self):
logger.info("[Voice Changer][RVCr2] Initializing... ")
# Create the pipeline
try:
self.pipeline = createPipeline(self.params, self.slotInfo, self.settings.gpu, self.settings.f0Detector)
except PipelineCreateException as e: # NOQA
logger.error("[Voice Changer] pipeline create failed. check your model is valid.")
return
# Other settings
self.settings.tran = self.slotInfo.defaultTune
self.settings.indexRatio = self.slotInfo.defaultIndexRatio
self.settings.protect = self.slotInfo.defaultProtect
logger.info("[Voice Changer] [RVC] Initializing... done")
def setSamplingRate(self, inputSampleRate, outputSampleRate):
self.inputSampleRate = inputSampleRate
self.outputSampleRate = outputSampleRate
self.initialize()
def update_settings(self, key: str, val: int | float | str):
logger.info(f"[Voice Changer][RVC]: update_settings {key}:{val}")
if key in self.settings.intData:
setattr(self.settings, key, int(val))
if key == "gpu":
self.deviceManager.setForceTensor(False)
self.initialize()
elif key in self.settings.floatData:
setattr(self.settings, key, float(val))
elif key in self.settings.strData:
setattr(self.settings, key, str(val))
if key == "f0Detector" and self.pipeline is not None:
pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)
self.pipeline.setPitchExtractor(pitchExtractor)
else:
return False
return True
def get_info(self):
data = asdict(self.settings)
if self.pipeline is not None:
pipelineInfo = self.pipeline.getPipelineInfo()
data["pipelineInfo"] = pipelineInfo
else:
data["pipelineInfo"] = "None"
return data
def get_processing_sampling_rate(self):
return self.slotInfo.samplingRate
def generate_input(
self,
newData: AudioInOut,
crossfadeSize: int,
solaSearchFrame: int,
extra_frame: int
):
# Audio arrives at 16 kHz.
inputSize = newData.shape[0]
newData = newData.astype(np.float32) / 32768.0
newFeatureLength = inputSize // 160 # hopsize:=160
if self.audio_buffer is not None:
# Concatenate with past data
self.audio_buffer = np.concatenate([self.audio_buffer, newData], 0)
if self.slotInfo.f0:
self.pitchf_buffer = np.concatenate([self.pitchf_buffer, np.zeros(newFeatureLength)], 0)
self.feature_buffer = np.concatenate([self.feature_buffer, np.zeros([newFeatureLength, self.slotInfo.embChannels])], 0)
else:
self.audio_buffer = newData
if self.slotInfo.f0:
self.pitchf_buffer = np.zeros(newFeatureLength)
self.feature_buffer = np.zeros([newFeatureLength, self.slotInfo.embChannels])
convertSize = inputSize + crossfadeSize + solaSearchFrame + extra_frame
if convertSize % 160 != 0: # pad so the model's output hop size does not cause truncation
convertSize = convertSize + (160 - (convertSize % 160))
outSize = int(((convertSize - extra_frame) / 16000) * self.slotInfo.samplingRate)
# If the buffer has not accumulated enough samples yet, pad with zeros
if self.audio_buffer.shape[0] < convertSize:
self.audio_buffer = np.concatenate([np.zeros([convertSize]), self.audio_buffer])
if self.slotInfo.f0:
self.pitchf_buffer = np.concatenate([np.zeros([convertSize // 160]), self.pitchf_buffer])
self.feature_buffer = np.concatenate([np.zeros([convertSize // 160, self.slotInfo.embChannels]), self.feature_buffer])
# Trim the unneeded part
convertOffset = -1 * convertSize
featureOffset = convertOffset // 160
self.audio_buffer = self.audio_buffer[convertOffset:] # extract only the portion to be converted
if self.slotInfo.f0:
self.pitchf_buffer = self.pitchf_buffer[featureOffset:]
self.feature_buffer = self.feature_buffer[featureOffset:]
# Cut out only the output portion and check its volume. (TODO: make the muting gradual)
cropOffset = -1 * (inputSize + crossfadeSize)
cropEnd = -1 * (crossfadeSize)
crop = self.audio_buffer[cropOffset:cropEnd]
vol = np.sqrt(np.square(crop).mean())
vol = max(vol, self.prevVol * 0.0)
self.prevVol = vol
return (self.audio_buffer, self.pitchf_buffer, self.feature_buffer, convertSize, vol, outSize)
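# Worked example of the sizes computed in generate_input above, with hypothetical numbers
# (the hop size at 16 kHz is 160 samples):
example_input, example_crossfade, example_sola, example_extra = 4160, 1024, 320, 16000
example_convert = example_input + example_crossfade + example_sola + example_extra   # 21504
if example_convert % 160 != 0:
    example_convert += 160 - (example_convert % 160)                                 # padded up to 21600
example_model_sr = 40000
example_out = int(((example_convert - example_extra) / 16000) * example_model_sr)    # 14000 samples at the model rate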
def inference(self, receivedData: AudioInOut, crossfade_frame: int, sola_search_frame: int):
if self.pipeline is None:
logger.info("[Voice Changer] Pipeline is not initialized.")
raise PipelineNotInitializedException()
# Processing runs at 16 kHz (pitch, embed, (infer))
receivedData = cast(
AudioInOut,
resampy.resample(
receivedData,
self.inputSampleRate,
16000,
),
)
crossfade_frame = int((crossfade_frame / self.inputSampleRate) * 16000)
sola_search_frame = int((sola_search_frame / self.inputSampleRate) * 16000)
extra_frame = int((self.settings.extraConvertSize / self.inputSampleRate) * 16000)
# Generate the input data
data = self.generate_input(receivedData, crossfade_frame, sola_search_frame, extra_frame)
audio = data[0]
pitchf = data[1]
feature = data[2]
convertSize = data[3]
vol = data[4]
outSize = data[5]
if vol < self.settings.silentThreshold:
return np.zeros(convertSize).astype(np.int16) * np.sqrt(vol)
device = self.pipeline.device
audio = torch.from_numpy(audio).to(device=device, dtype=torch.float32)
repeat = 1 if self.settings.rvcQuality else 0
sid = self.settings.dstId
f0_up_key = self.settings.tran
index_rate = self.settings.indexRatio
protect = self.settings.protect
if_f0 = 1 if self.slotInfo.f0 else 0
embOutputLayer = self.slotInfo.embOutputLayer
useFinalProj = self.slotInfo.useFinalProj
try:
audio_out, self.pitchf_buffer, self.feature_buffer = self.pipeline.exec(
sid,
audio,
pitchf,
feature,
f0_up_key,
index_rate,
if_f0,
# 0,
self.settings.extraConvertSize / self.inputSampleRate if self.settings.silenceFront else 0., # extra convert size in seconds, computed from the input sampling rate
embOutputLayer,
useFinalProj,
repeat,
protect,
outSize
)
# result = audio_out.detach().cpu().numpy() * np.sqrt(vol)
result = audio_out[-outSize:].detach().cpu().numpy() * np.sqrt(vol)
result = cast(
AudioInOut,
resampy.resample(
result,
self.slotInfo.samplingRate,
self.outputSampleRate,
),
)
return result
except DeviceCannotSupportHalfPrecisionException as e: # NOQA
logger.warn("[Device Manager] Device cannot support half precision. Fallback to float....")
self.deviceManager.setForceTensor(True)
self.initialize()
# raise e
return
def __del__(self):
del self.pipeline
# print("---------- REMOVING ---------------")
# remove_path = os.path.join("RVC")
# sys.path = [x for x in sys.path if x.endswith(remove_path) is False]
# for key in list(sys.modules):
# val = sys.modules.get(key)
# try:
# file_path = val.__file__
# if file_path.find("RVC" + os.path.sep) >= 0:
# # print("remove", key, file_path)
# sys.modules.pop(key)
# except Exception: # type:ignore
# # print(e)
# pass
def export2onnx(self):
modelSlot = self.slotInfo
if modelSlot.isONNX:
logger.warn("[Voice Changer] export2onnx, No pyTorch filepath.")
return {"status": "ng", "path": ""}
if self.pipeline is not None:
del self.pipeline
self.pipeline = None
torch.cuda.empty_cache()
self.initialize()
output_file_simple = export2onnx(self.settings.gpu, modelSlot)
return {
"status": "ok",
"path": f"/tmp/{output_file_simple}",
"filename": output_file_simple,
}
def get_model_current(self):
return [
{
"key": "defaultTune",
"val": self.settings.tran,
},
{
"key": "defaultIndexRatio",
"val": self.settings.indexRatio,
},
{
"key": "defaultProtect",
"val": self.settings.protect,
},
]
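# Hedged sketch (hypothetical values) of the rate handling in RVCr2.inference above: frame counts
# and audio are converted to 16 kHz before the pipeline, and the model output is resampled from
# slotInfo.samplingRate back to the client's output rate.
import numpy as np
import resampy

example_input_sr, example_output_sr, example_model_sr = 48000, 48000, 40000
example_crossfade = 4096                                                     # frames at the input rate
example_crossfade_16k = int((example_crossfade / example_input_sr) * 16000)  # 1365 frames at 16 kHz
audio_16k = resampy.resample(np.zeros(example_input_sr, dtype=np.float32), example_input_sr, 16000)
# ... the pipeline runs at 16 kHz, the model emits audio at example_model_sr ...
audio_out = resampy.resample(np.zeros(example_model_sr, dtype=np.float32), example_model_sr, example_output_sr)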

View File

@ -46,7 +46,7 @@ class EmbedderManager:
file = cls.params.content_vec_500_onnx
return OnnxContentvec().loadModel(file, dev)
except Exception as e: # noqa
print("[Voice Changer] use torch contentvec")
print("[Voice Changer] use torch contentvec", e)
file = cls.params.hubert_base
return FairseqHubert().loadModel(file, dev, isHalf)
elif embederType == "hubert-base-japanese":

View File

@ -8,7 +8,7 @@ from voice_changer.RVC.inferencer.RVCInferencerv2 import RVCInferencerv2
from voice_changer.RVC.inferencer.RVCInferencerv2Nono import RVCInferencerv2Nono
from voice_changer.RVC.inferencer.WebUIInferencer import WebUIInferencer
from voice_changer.RVC.inferencer.WebUIInferencerNono import WebUIInferencerNono
from voice_changer.RVC.inferencer.VorasInferencebeta import VoRASInferencer
import sys
class InferencerManager:
@ -38,7 +38,11 @@ class InferencerManager:
elif inferencerType == EnumInferenceTypes.pyTorchRVCv2 or inferencerType == EnumInferenceTypes.pyTorchRVCv2.value:
return RVCInferencerv2().loadModel(file, gpu)
elif inferencerType == EnumInferenceTypes.pyTorchVoRASbeta or inferencerType == EnumInferenceTypes.pyTorchVoRASbeta.value:
if sys.platform.startswith("darwin") is False:
from voice_changer.RVC.inferencer.VorasInferencebeta import VoRASInferencer
return VoRASInferencer().loadModel(file, gpu)
else:
raise RuntimeError("[Voice Changer] VoRAS is not supported on macOS")
elif inferencerType == EnumInferenceTypes.pyTorchRVCv2Nono or inferencerType == EnumInferenceTypes.pyTorchRVCv2Nono.value:
return RVCInferencerv2Nono().loadModel(file, gpu)
elif inferencerType == EnumInferenceTypes.pyTorchWebUI or inferencerType == EnumInferenceTypes.pyTorchWebUI.value:

View File

@ -1,12 +1,14 @@
from typing import Dict, Any
import os
from collections import OrderedDict
import torch
from voice_changer.ModelSlotManager import ModelSlotManager
from voice_changer.utils.ModelMerger import ModelMergerRequest
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
def merge_model(request: ModelMergerRequest):
def merge_model(params: VoiceChangerParams, request: ModelMergerRequest):
def extract(ckpt: Dict[str, Any]):
a = ckpt["model"]
opt: Dict[str, Any] = OrderedDict()
@ -34,11 +36,16 @@ def merge_model(request: ModelMergerRequest):
weights = []
alphas = []
slotManager = ModelSlotManager.get_instance(params.model_dir)
for f in files:
strength = f.strength
if strength == 0:
continue
weight, state_dict = load_weight(f.filename)
slotInfo = slotManager.get_slot_info(f.slotIndex)
filename = os.path.join(params.model_dir, str(f.slotIndex), os.path.basename(slotInfo.modelFile)) # before v.1.5.3.11, slotInfo.modelFile already included the model_dir path.
weight, state_dict = load_weight(filename)
weights.append(weight)
alphas.append(f.strength)

View File

@ -4,7 +4,7 @@ import torch
from onnxsim import simplify
import onnx
from const import TMP_DIR, EnumInferenceTypes
from data.ModelSlot import ModelSlot
from data.ModelSlot import RVCModelSlot
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.onnxExporter.SynthesizerTrnMs256NSFsid_ONNX import (
SynthesizerTrnMs256NSFsid_ONNX,
@ -24,10 +24,12 @@ from voice_changer.RVC.onnxExporter.SynthesizerTrnMsNSFsidNono_webui_ONNX import
from voice_changer.RVC.onnxExporter.SynthesizerTrnMsNSFsid_webui_ONNX import (
SynthesizerTrnMsNSFsid_webui_ONNX,
)
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
def export2onnx(gpu: int, modelSlot: ModelSlot):
modelFile = modelSlot.modelFile
def export2onnx(gpu: int, modelSlot: RVCModelSlot):
vcparams = VoiceChangerParamsManager.get_instance().params
modelFile = os.path.join(vcparams.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))
output_file = os.path.splitext(os.path.basename(modelFile))[0] + ".onnx"
output_file_simple = os.path.splitext(os.path.basename(modelFile))[0] + "_simple.onnx"

View File

@ -18,6 +18,7 @@ from voice_changer.RVC.inferencer.OnnxRVCInferencer import OnnxRVCInferencer
from voice_changer.RVC.inferencer.OnnxRVCInferencerNono import OnnxRVCInferencerNono
from voice_changer.RVC.pitchExtractor.PitchExtractor import PitchExtractor
from voice_changer.utils.Timer import Timer
logger = VoiceChangaerLogger.get_instance().getLogger()
@ -89,8 +90,11 @@ class Pipeline(object):
protect=0.5,
out_size=None,
):
# Audio arrives at a sampling rate of 16000; from here on everything is processed at 16000.
# print(f"pipeline exec input, audio:{audio.shape}, pitchf:{pitchf.shape}, feature:{feature.shape}")
# print(f"pipeline exec input, silence_front:{silence_front}, out_size:{out_size}")
with Timer("main-process", False) as t: # NOQA
# Audio arrives at a sampling rate of 16000; from here on everything is processed at 16000.
search_index = self.index is not None and self.big_npy is not None and index_rate != 0
# self.t_pad = self.sr * repeat # 1 second
# self.t_pad_tgt = self.targetSR * repeat # 1 second; trimming at output time (output comes at the model's sampling rate)
@ -141,6 +145,7 @@ class Pipeline(object):
feats = feats.view(1, -1)
# embedding
with Timer("main-process", False) as te:
with autocast(enabled=self.isHalf):
try:
feats = self.embedder.extractFeatures(feats, embOutputLayer, useFinalProj)
@ -153,6 +158,7 @@ class Pipeline(object):
raise DeviceChangingException()
else:
raise e
# print(f"[Embedding] {te.secs}")
# Index - feature extraction
# if self.index is not None and self.feature is not None and index_rate != 0:
@ -240,6 +246,7 @@ class Pipeline(object):
raise e
feats_buffer = feats.squeeze(0).detach().cpu()
if pitchf is not None:
pitchf_buffer = pitchf.squeeze(0).detach().cpu()
else:
@ -257,6 +264,7 @@ class Pipeline(object):
del sid
# torch.cuda.empty_cache()
# print("EXEC AVERAGE:", t.avrSecs)
return audio1, pitchf_buffer, feats_buffer
def __del__(self):

View File

@ -9,15 +9,17 @@ from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
from voice_changer.RVC.inferencer.InferencerManager import InferencerManager
from voice_changer.RVC.pipeline.Pipeline import Pipeline
from voice_changer.RVC.pitchExtractor.PitchExtractorManager import PitchExtractorManager
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
def createPipeline(params: VoiceChangerParams, modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
dev = DeviceManager.get_instance().getDevice(gpu)
half = DeviceManager.get_instance().halfPrecisionAvailable(gpu)
# Create the inferencer
try:
inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelSlot.modelFile, gpu)
modelPath = os.path.join(params.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))
inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelPath, gpu)
except Exception as e:
print("[Voice Changer] exception! loading inferencer", e)
traceback.print_exc()
@ -40,7 +42,8 @@ def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
pitchExtractor = PitchExtractorManager.getPitchExtractor(f0Detector, gpu)
# index, feature
index = _loadIndex(modelSlot)
indexPath = os.path.join(params.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.indexFile))
index = _loadIndex(indexPath)
pipeline = Pipeline(
embedder,
@ -55,21 +58,17 @@ def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
return pipeline
def _loadIndex(modelSlot: RVCModelSlot):
def _loadIndex(indexPath: str):
# Load the index
print("[Voice Changer] Loading index...")
# Return None when no file is specified
if modelSlot.indexFile is None:
print("[Voice Changer] Index is None, not used")
return None
# Return None when a file is specified but does not exist
if os.path.exists(modelSlot.indexFile) is not True:
if os.path.exists(indexPath) is not True or os.path.isfile(indexPath) is not True:
print("[Voice Changer] Index file is not found")
return None
try:
print("Try loading...", modelSlot.indexFile)
index = faiss.read_index(modelSlot.indexFile)
print("Try loading...", indexPath)
index = faiss.read_index(indexPath)
except: # NOQA
print("[Voice Changer] load index failed. Use no index.")
traceback.print_exc()

View File

@ -1,6 +1,7 @@
import sys
import os
from data.ModelSlot import SoVitsSvc40ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerModel import AudioInOut
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
@ -92,13 +93,17 @@ class SoVitsSvc40:
def initialize(self):
print("[Voice Changer] [so-vits-svc40] Initializing... ")
self.hps = get_hparams_from_file(self.slotInfo.configFile)
vcparams = VoiceChangerParamsManager.get_instance().params
configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
self.hps = get_hparams_from_file(configPath)
self.settings.speakers = self.hps.spk
# cluster
try:
if self.slotInfo.clusterFile is not None:
self.cluster_model = get_cluster_model(self.slotInfo.clusterFile)
clusterPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.clusterFile)
self.cluster_model = get_cluster_model(clusterPath)
else:
self.cluster_model = None
except Exception as e:
@ -110,7 +115,7 @@ class SoVitsSvc40:
if self.slotInfo.isONNX:
providers, options = self.getOnnxExecutionProvider()
self.onnx_session = onnxruntime.InferenceSession(
self.slotInfo.modelFile,
modelPath,
providers=providers,
provider_options=options,
)
@ -122,7 +127,7 @@ class SoVitsSvc40:
)
net_g.eval()
self.net_g = net_g
load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
load_checkpoint(modelPath, self.net_g, None)
def getOnnxExecutionProvider(self):
availableProviders = onnxruntime.get_available_providers()
@ -379,6 +384,10 @@ class SoVitsSvc40:
except Exception: # type:ignore
pass
def get_model_current(self):
return [
]
def resize_f0(x, target_len):
source = np.array(x)

View File

@ -37,8 +37,14 @@ class GPUInfo:
@dataclass()
class VoiceChangerManagerSettings:
modelSlotIndex: int = -1
passThrough: bool = False # False: off, True: on
# List only the mutable fields below
intData: list[str] = field(default_factory=lambda: ["modelSlotIndex"])
boolData: list[str] = field(default_factory=lambda: [
"passThrough"
])
intData: list[str] = field(default_factory=lambda: [
"modelSlotIndex",
])
class VoiceChangerManager(ServerDeviceCallbacks):
@ -121,7 +127,6 @@ class VoiceChangerManager(ServerDeviceCallbacks):
def get_instance(cls, params: VoiceChangerParams):
if cls._instance is None:
cls._instance = cls(params)
# cls._instance.voiceChanger = VoiceChanger(params)
return cls._instance
def loadModel(self, params: LoadModelParams):
@ -147,7 +152,7 @@ class VoiceChangerManager(ServerDeviceCallbacks):
os.makedirs(dstDir, exist_ok=True)
logger.info(f"move to {srcPath} -> {dstPath}")
shutil.move(srcPath, dstPath)
file.name = dstPath
file.name = os.path.basename(dstPath)
# Create the metadata (defined by each voice changer)
if params.voiceChangerType == "RVC":
@ -188,6 +193,7 @@ class VoiceChangerManager(ServerDeviceCallbacks):
data["modelSlots"] = self.modelSlotManager.getAllSlotInfo(reload=True)
data["sampleModels"] = getSampleInfos(self.params.sample_mode)
data["python"] = sys.version
data["voiceChangerParams"] = self.params
data["status"] = "OK"
@ -214,11 +220,18 @@ class VoiceChangerManager(ServerDeviceCallbacks):
return
elif slotInfo.voiceChangerType == "RVC":
logger.info("................RVC")
from voice_changer.RVC.RVC import RVC
# from voice_changer.RVC.RVC import RVC
self.voiceChangerModel = RVC(self.params, slotInfo)
self.voiceChanger = VoiceChanger(self.params)
# self.voiceChangerModel = RVC(self.params, slotInfo)
# self.voiceChanger = VoiceChanger(self.params)
# self.voiceChanger.setModel(self.voiceChangerModel)
from voice_changer.RVC.RVCr2 import RVCr2
self.voiceChangerModel = RVCr2(self.params, slotInfo)
self.voiceChanger = VoiceChangerV2(self.params)
self.voiceChanger.setModel(self.voiceChangerModel)
elif slotInfo.voiceChangerType == "MMVCv13":
logger.info("................MMVCv13")
from voice_changer.MMVCv13.MMVCv13 import MMVCv13
@ -260,10 +273,16 @@ class VoiceChangerManager(ServerDeviceCallbacks):
del self.voiceChangerModel
return
def update_settings(self, key: str, val: str | int | float):
def update_settings(self, key: str, val: str | int | float | bool):
self.store_setting(key, val)
if key in self.settings.intData:
if key in self.settings.boolData:
if val == "true":
newVal = True
elif val == "false":
newVal = False
setattr(self.settings, key, newVal)
elif key in self.settings.intData:
newVal = int(val)
if key == "modelSlotIndex":
newVal = newVal % 1000
@ -283,6 +302,9 @@ class VoiceChangerManager(ServerDeviceCallbacks):
return self.get_info()
def changeVoice(self, receivedData: AudioInOut):
if self.settings.passThrough is True: # pass-through
return receivedData, []
if hasattr(self, "voiceChanger") is True:
return self.voiceChanger.on_request(receivedData)
else:
@ -299,8 +321,8 @@ class VoiceChangerManager(ServerDeviceCallbacks):
req.files = [MergeElement(**f) for f in req.files]
slot = len(self.modelSlotManager.getAllSlotInfo()) - 1
if req.voiceChangerType == "RVC":
merged = RVCModelMerger.merge_models(req, slot)
loadParam = LoadModelParams(voiceChangerType="RVC", slot=slot, isSampleMode=False, sampleId="", files=[LoadModelParamFile(name=os.path.basename(merged), kind="rvcModel", dir=f"{slot}")], params={})
merged = RVCModelMerger.merge_models(self.params, req, slot)
loadParam = LoadModelParams(voiceChangerType="RVC", slot=slot, isSampleMode=False, sampleId="", files=[LoadModelParamFile(name=os.path.basename(merged), kind="rvcModel", dir="")], params={})
self.loadModel(loadParam)
return self.get_info()

View File

@ -0,0 +1,17 @@
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
class VoiceChangerParamsManager:
_instance = None
def __init__(self):
self.params = None
@classmethod
def get_instance(cls):
if cls._instance is None:
cls._instance = cls()
return cls._instance
def setParams(self, params: VoiceChangerParams):
self.params = params
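# Intended usage of the new singleton, assuming the server builds the params once at startup
# (build_params and startup_params below are hypothetical placeholders):
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager

startup_params: VoiceChangerParams = build_params()  # hypothetical factory called at server startup
VoiceChangerParamsManager.get_instance().setParams(startup_params)
# later, anywhere in the code base:
model_dir = VoiceChangerParamsManager.get_instance().params.model_dir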

View File

@ -6,7 +6,7 @@
- Applicable VoiceChangerModel
DiffusionSVC
RVC
'''
from typing import Any, Union
@ -208,12 +208,13 @@ class VoiceChangerV2(VoiceChangerIF):
block_frame = receivedData.shape[0]
crossfade_frame = min(self.settings.crossFadeOverlapSize, block_frame)
self._generate_strength(crossfade_frame)
# data = self.voiceChanger.generate_input(newData, block_frame, crossfade_frame, sola_search_frame)
audio = self.voiceChanger.inference(
receivedData,
crossfade_frame=crossfade_frame,
sola_search_frame=sola_search_frame
)
if hasattr(self, "sola_buffer") is True:
np.set_printoptions(threshold=10000)
audio_offset = -1 * (sola_search_frame + crossfade_frame + block_frame)

View File

@ -5,7 +5,7 @@ from dataclasses import dataclass
@dataclass
class MergeElement:
filename: str
slotIndex: int
strength: int

View File

@ -1,15 +1,43 @@
import time
import inspect
class Timer(object):
def __init__(self, title: str):
storedSecs = {} # Class variable
def __init__(self, title: str, enable: bool = True):
self.title = title
self.enable = enable
self.secs = 0
self.msecs = 0
self.avrSecs = 0
if self.enable is False:
return
self.maxStores = 10
current_frame = inspect.currentframe()
caller_frame = inspect.getouterframes(current_frame, 2)
frame = caller_frame[1]
filename = frame.filename
line_number = frame.lineno
self.key = f"{title}_{filename}_{line_number}"
if self.key not in self.storedSecs:
self.storedSecs[self.key] = []
def __enter__(self):
if self.enable is False:
return
self.start = time.time()
return self
def __exit__(self, *_):
if self.enable is False:
return
self.end = time.time()
self.secs = self.end - self.start
self.msecs = self.secs * 1000 # millisecs
self.storedSecs[self.key].append(self.secs)
self.storedSecs[self.key] = self.storedSecs[self.key][-self.maxStores:]
self.avrSecs = sum(self.storedSecs[self.key]) / len(self.storedSecs[self.key])
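# Usage sketch for the extended Timer above: each call site gets its own rolling window of the
# last 10 measurements (keyed by title, file and line), and everything becomes a no-op when
# enable is False. The sleep below is a hypothetical workload.
import time
from voice_changer.utils.Timer import Timer

with Timer("demo-step", True) as t:
    time.sleep(0.01)                 # hypothetical workload being measured

print(t.msecs)                       # duration of this run in milliseconds
print(t.avrSecs)                     # rolling average in seconds over the last 10 runs at this call site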

View File

@ -0,0 +1,20 @@
{
"signedContributors": [
{
"name": "w-okada",
"id": 48346627,
"comment_id": 1667673774,
"created_at": "2023-08-07T11:21:42Z",
"repoId": 527419347,
"pullRequestNo": 661
},
{
"name": "w-okada",
"id": 48346627,
"comment_id": 1667674735,
"created_at": "2023-08-07T11:22:28Z",
"repoId": 527419347,
"pullRequestNo": 661
}
]
}

View File

@ -44,6 +44,8 @@ If you have the old version, be sure to unzip it into a separate folder.
When connecting remotely, use the `.bat` file (win) or `.command` file (mac) in which http has been replaced with https.
Access it with a browser (currently only Chrome is supported) to open the GUI.
### Console
When you run the `.bat` file (Windows) or `.command` file (Mac), a screen like the following is displayed, and on first start-up various data is downloaded from the Internet. Depending on your environment, this often takes 1-2 minutes.

View File

@ -44,6 +44,8 @@
When connecting remotely, use the `.bat` file (win) or `.command` file (mac) in which http has been replaced with https.
Accessing it with a browser (only Chrome is supported) displays the GUI.
### Console output
When you run the `.bat` file (win) or `.command` file (mac), a screen like the following is displayed, and on first start-up various data is downloaded from the Internet.