Merge branch 'w-okada:master' into master
This commit is contained in: commit 11219c11f6

12  .github/ISSUE_TEMPLATE/issue.yaml (vendored)
@@ -117,6 +117,16 @@ body:
      id: issue
      attributes:
          label: Situation
          description: Developers spend a lot of time developing new features and resolving issues. If you really want to get it solved, please provide as much reproducible information and logs as possible. Provide logs on the terminal and capture the window.
          description: Developers spend a lot of time developing new features and resolving issues. If you really want to get it solved, please provide as much reproducible information and logs as possible. Provide logs on the terminal and capture the application window.
    - type: textarea
      id: capture
      attributes:
          label: application window capture
          description: the application window.
    - type: textarea
      id: logs-on-terminal
      attributes:
          label: logs on terminal
          description: logs on terminal.
      validations:
          required: true
36  .github/workflows/cla.yml (vendored, new file)
@@ -0,0 +1,36 @@
name: "CLA Assistant"
on:
  issue_comment:
    types: [created]
  pull_request_target:
    types: [opened, closed, synchronize]

jobs:
  CLAssistant:
    runs-on: ubuntu-latest
    steps:
      - name: "CLA Assistant"
        if: (github.event.comment.body == 'recheck' || github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA') || github.event_name == 'pull_request_target'
        # Beta Release
        uses: cla-assistant/github-action@v2.1.3-beta
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # the token below should have repo scope and must be added manually to the repository's secrets
          PERSONAL_ACCESS_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
        with:
          path-to-signatures: "signatures/version1/cla.json"
          path-to-document: "https://raw.githubusercontent.com/w-okada/voice-changer/master/LICENSE-CLA" # e.g. a CLA or a DCO document
          # branch should not be protected
          branch: "master"
          #allowlist: user1,bot*

          # Below are the optional inputs; if they are not given, default values are used.
          #remote-organization-name: enter the remote organization name where the signatures should be stored (default: the same repository)
          #remote-repository-name: enter the remote repository name where the signatures should be stored (default: the same repository)
          #create-file-commit-message: 'For example: Creating file for storing CLA Signatures'
          #signed-commit-message: 'For example: $contributorName has signed the CLA in #$pullRequestNo'
          #custom-notsigned-prcomment: 'pull request comment with introductory message asking new contributors to sign'
          #custom-pr-sign-comment: 'the signature to be committed in order to sign the CLA'
          #custom-allsigned-prcomment: 'pull request comment when all contributors have signed; defaults to **CLA Assistant Lite bot** All Contributors have signed the CLA.'
          #lock-pullrequest-aftermerge: false - if you don't want this bot to automatically lock the pull request after merging (default: true)
          #use-dco-flag: true - if you are using DCO instead of CLA
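The `if:` condition above gates the CLA check on either a signing/recheck comment or a `pull_request_target` event. Restated as a hedged TypeScript predicate for readability (illustration only, not part of the workflow):

```typescript
// Hedged restatement of the workflow's `if:` expression (illustration, not from the diff):
const shouldRunClaCheck = (eventName: string, commentBody?: string): boolean =>
    commentBody === "recheck" ||
    commentBody === "I have read the CLA Document and I hereby sign the CLA" ||
    eventName === "pull_request_target";
```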
68  LICENSE
@@ -20,7 +20,6 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

MIT License

Copyright (c) 2022 Isle Tennos

@@ -64,3 +63,70 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

MIT License

Copyright (c) 2023 liujing04
Copyright (c) 2023 源文雨

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

MIT License

Copyright (c) 2023 yxlllc

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

MIT License

Copyright (c) 2023 yxlllc

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
27  LICENSE-CLA (new file)
@@ -0,0 +1,27 @@
Contributor License Agreement

Copyright (c) 2022 Wataru Okada

本契約は、当社とあなた(以下、"貢献者"とします)の間で締結され、貢献者が当社に対してソフトウェアプロジェクト(以下、"プロジェクト"とします)に対する貢献(以下、"貢献"とします)を提供する際の条件を定めます。

1. 貢献者は、提供する貢献が、貢献者自身のオリジナルな作品であり、商標、著作権、特許、または他の知的財産権を侵害していないことを保証します。

2. 貢献者は、貢献を当社に対して無償で提供し、当社はそれを無制限に使用、複製、修正、公開、配布、サブライセンスを付与し、またその販売する権利を得ることに同意します。

3. 本契約が終了した場合でも、第 2 項で述べた権利は当社に留保されます。

4. 当社は貢献者の貢献を受け入れる義務を負わず、また貢献者に一切の補償をする義務を負わないことに貢献者は同意します。

5. 本契約は当社と貢献者双方の書面による合意により修正されることがあります。

"This Agreement is made between our Company and you (hereinafter referred to as "Contributor") and outlines the terms under which you provide your Contributions (hereinafter referred to as "Contributions") to our software project (hereinafter referred to as "Project").

1. You warrant that the Contributions you are providing are your original work and do not infringe any trademark, copyright, patent, or other intellectual property rights.

2. You agree to provide your Contributions to the Company for free, and the Company has the unlimited right to use, copy, modify, publish, distribute, and sublicense, and also sell the Contributions.

3. Even after the termination of this Agreement, the rights mentioned in the above clause will be retained by the Company.

4. The Company is under no obligation to accept your Contributions or to compensate you in any way for them, and you agree to this.

5. This Agreement may be modified by written agreement between the Company and the Contributor."
113  README.md
@@ -4,74 +4,19 @@
## What's New!

- v.1.5.3.10b

    - improve:
        - logger
    - bugfix:
        - RMVPE: different-device bug (root cause not found yet)
        - RVC: useIndex issue when loading a sample model

- v.1.5.3.10a

    - Improvement:
        - launch sequence
        - onnx export process
        - error handling in client
    - bugfix:
        - RMVPE for mac

- v.1.5.3.10

    - New Feature
        - Support Diffusion SVC (combo model only)
        - System audio capture (win only)
        - Support RMVPE
    - improvement
        - directml: set device id
    - some bugfixes:
        - noise suppression 2
        - etc.

- v.1.5.3.9a

    - some improvements:
        - keep f0 detector setting
        - MMVC: max chunk size for onnx
        - etc.
    - some bugfixes:
        - RVC: crepe failing to estimate f0
        - RVC: fall back from half-precision when half-precision fails
        - etc.

- v.1.5.3.9

    - New feature:
        - Add Crepe Full/Tiny (onnx)
    - some improvements:
        - server info includes python version
        - contentvec onnx support
        - etc.
    - some bugfixes:
        - server device mode stuttering
        - new model: add sample rate
        - etc.

- v.1.5.3.8a

    - Bugfix (test): force client device sample rate
    - Bugfix: server device filter

- v.1.5.3.8

    - RVC: performance improvement ([PR](https://github.com/w-okada/voice-changer/pull/371) from [nadare881](https://github.com/nadare881))

- v.1.5.3.7
- v.1.5.3.12

    - Feature:
        - server device monitor
    - Bugfix:
        - device output recorder button shown in server device mode
        - Pass through mode
    - bugfix:
        - Adapted the GUI to the number of slots.

- v.1.5.3.11

    - improve:
        - increase slot size
    - bugfix:
        - m1 mac: eliminate torchaudio

# VC Client とは

@@ -126,31 +71,13 @@
- ダウンロードはこちらから。

| Version | OS | フレームワーク | link | サポート VC | サイズ |
| ----------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.10b | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1akrb9RicU1-cldisToBedaM08y8pFQae&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eZB0u2u0tEB1tR9mp06YiKx96x2oxgrN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1gzWuEN7oY_WdBwOEwtDdNaK2rT0nHvfN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1_fLdFVswhOGwjRiQj4YWE-YZTO_GsnrA&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1imaTBgWBb9ICkNy9pN6NxBISI6SzhEfL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1GoijW29pjscdvxMhi8xgvPSvcHCGYwXO&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1useZ4gcI0la5OhPuvt2j94CbAhWikpV4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=13abR2xs4KmNIg9b5RJXFez9g6zwZqMj4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1ZxPp-HF7vSEJ8m00WnQaGbo4bTN4LqYD&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.9a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1GsPTUTUbMvwNwAA8SGvSplwsf-yui0iw&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eKZCozh37QDfAr33ZG7lGFUOQv1tOooR&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1sxUNBPkeSPPNOE1ZknVF-0kx2jHP3kN6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| v.1.5.3.9 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1pTTcTseSdIfCyNUjB-K1mYPg9YocSYz6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1KWg-QoF6XmLbkUav-fmxc7bdAcD3844V&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3238MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1_TXUkDcofYz9mJd2L1ajAoyIBCQF29WL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3123MB |
| v.1.5.3.8a | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1hg6lynE3wWJTNTParTa2qB2L06OL9KJ9&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1C9PCu8pdafO6jJ2yCaB7x54Ls7LcM0Xc&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1bzrGhHPc9GdaRAMxkksTGtbuRLEeBx9i&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.8 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1ptmjFCRDW7M0l80072JVRII5tJpF13__&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=19DfeACmpnzqCVH5bIoFunS2pGPABRuso&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AYP_hMdoeacX0KiF31Vd3oEjxwdreSbM&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.7 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1HdJwgo0__vR6pAkOkekejUZJ0lu2NfDs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1JIF4PvKg-8HNUv_fMaXSM3AeYa-F_c4z&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1cJzRHmD3vk6av0Dvwj3v9Ef5KUsQYhKv&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| ---------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.12 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1rC7IVpzfG68Ps6tBmdFIjSXvTNaUKBf6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 797MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1OqxS_jve4qvj71DdSGOrhI8DGaEVRzgs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3241MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1HhfmMovujzbOmvCi7WPuqQAuuo7jaM1o&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3126MB |
| v.1.5.3.11 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1cutPICJa-PI_ww0E3ae9FCuSjY_5PnWE&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1aOkc-QhtAj11gI8i335mHhNMUSESeJ5J&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=16g33cZ925HNty_0Hly7Aw_nXlQlgqxDC&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |

(\*1) Google Drive からダウンロードできない方は[hugging_face](https://huggingface.co/wok000/vcclient000/tree/main)からダウンロードしてみてください
(\*2) 開発者が AMD のグラフィックボードを持っていないので動作確認していません。onnxruntime-directml を同梱しただけのものです。

@@ -255,3 +182,7 @@ Github Pages 上で実行できるため、ブラウザのみあれば様々な
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1tmTMJRRggS2Sb4goU-eHlRvUBR88RZDl&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC, DDSP-SVC | 2872MB |
| v.1.5.3.1 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1oswF72q_cQQeXhIn6W275qLnoBAmcrR_&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 796MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AWjDhW4w2Uljp1-9P8YUJBZsIlnhkJX2&export=download) \*1 | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC, DDSP-SVC | 2872MB |

# For Contributor

このリポジトリは[CLA](https://raw.githubusercontent.com/w-okada/voice-changer/master/LICENSE-CLA)を設定しています。

@@ -52,6 +52,8 @@ $ python3 MMVCServerSIO.py -p 18888 --https true \

```

Access it with a browser (currently only Chrome is supported), and you will see the GUI.

2-1. Troubleshooting

(1) OSError: PortAudio library not found

@@ -51,6 +51,8 @@ $ python3 MMVCServerSIO.py -p 18888 --https true \
    --samples samples.json
```

ブラウザ(Chrome のみサポート)でアクセスすると画面が表示されます。

2-1. トラブルシュート

(1) OSError: PortAudio library not found
109  README_en.md
@@ -4,74 +4,19 @@
## What's New!

- v.1.5.3.10b

    - improve:
        - logger
    - bugfix:
        - RMVPE: different-device bug (root cause not found yet)
        - RVC: useIndex issue when loading a sample model

- v.1.5.3.10a

    - Improvement:
        - launch sequence
        - onnx export process
        - error handling in client
    - bugfix:
        - RMVPE for mac

- v.1.5.3.10

    - New Feature
        - Support Diffusion SVC (combo model only)
        - System audio capture (win only)
        - Support RMVPE
    - improvement
        - directml: set device id
    - some bugfixes:
        - noise suppression 2
        - etc.

- v.1.5.3.9a

    - some improvements:
        - keep f0 detector setting
        - MMVC: max chunk size for onnx
        - etc.
    - some bugfixes:
        - RVC: crepe failing to estimate f0
        - RVC: fall back from half-precision when half-precision fails
        - etc.

- v.1.5.3.9

    - New feature:
        - Add Crepe Full/Tiny (onnx)
    - some improvements:
        - server info includes python version
        - contentvec onnx support
        - etc.
    - some bugfixes:
        - server device mode stuttering
        - new model: add sample rate
        - etc.

- v.1.5.3.8a

    - Bugfix (test): force client device sample rate
    - Bugfix: server device filter

- v.1.5.3.8

    - RVC: performance improvement ([PR](https://github.com/w-okada/voice-changer/pull/371) from [nadare881](https://github.com/nadare881))

- v.1.5.3.7
- v.1.5.3.12

    - Feature:
        - server device monitor
    - Bugfix:
        - device output recorder button shown in server device mode
        - Pass through mode
    - bugfix:
        - Adapted the GUI to the number of slots.

- v.1.5.3.11

    - improve:
        - increase slot size
    - bugfix:
        - m1 mac: eliminate torchaudio

# What is VC Client

@@ -123,31 +68,13 @@ It can be used in two main ways, in order of difficulty:
- Download (When you cannot download from google drive, try [hugging_face](https://huggingface.co/wok000/vcclient000/tree/main))

| Version | OS | Framework | link | support VC | size |
| ----------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.10b | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1akrb9RicU1-cldisToBedaM08y8pFQae&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eZB0u2u0tEB1tR9mp06YiKx96x2oxgrN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1gzWuEN7oY_WdBwOEwtDdNaK2rT0nHvfN&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1_fLdFVswhOGwjRiQj4YWE-YZTO_GsnrA&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1imaTBgWBb9ICkNy9pN6NxBISI6SzhEfL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1GoijW29pjscdvxMhi8xgvPSvcHCGYwXO&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.10 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1useZ4gcI0la5OhPuvt2j94CbAhWikpV4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, Diffusion-SVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=13abR2xs4KmNIg9b5RJXFez9g6zwZqMj4&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1ZxPp-HF7vSEJ8m00WnQaGbo4bTN4LqYD&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |
| v.1.5.3.9a | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1GsPTUTUbMvwNwAA8SGvSplwsf-yui0iw&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1eKZCozh37QDfAr33ZG7lGFUOQv1tOooR&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1sxUNBPkeSPPNOE1ZknVF-0kx2jHP3kN6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| v.1.5.3.9 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1pTTcTseSdIfCyNUjB-K1mYPg9YocSYz6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1KWg-QoF6XmLbkUav-fmxc7bdAcD3844V&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3238MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1_TXUkDcofYz9mJd2L1ajAoyIBCQF29WL&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3123MB |
| v.1.5.3.8a | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1hg6lynE3wWJTNTParTa2qB2L06OL9KJ9&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1C9PCu8pdafO6jJ2yCaB7x54Ls7LcM0Xc&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1bzrGhHPc9GdaRAMxkksTGtbuRLEeBx9i&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.8 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1ptmjFCRDW7M0l80072JVRII5tJpF13__&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=19DfeACmpnzqCVH5bIoFunS2pGPABRuso&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1AYP_hMdoeacX0KiF31Vd3oEjxwdreSbM&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| v.1.5.3.7 | mac | ONNX(cpu), PyTorch(cpu,mps) | [normal](https://drive.google.com/uc?id=1HdJwgo0__vR6pAkOkekejUZJ0lu2NfDs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 794MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1JIF4PvKg-8HNUv_fMaXSM3AeYa-F_c4z&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [normal](https://drive.google.com/uc?id=1cJzRHmD3vk6av0Dvwj3v9Ef5KUsQYhKv&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC | 3122MB |
| ---------- | --- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------ |
| v.1.5.3.12 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1rC7IVpzfG68Ps6tBmdFIjSXvTNaUKBf6&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 797MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1OqxS_jve4qvj71DdSGOrhI8DGaEVRzgs&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3241MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1HhfmMovujzbOmvCi7WPuqQAuuo7jaM1o&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3126MB |
| v.1.5.3.11 | mac | ONNX(cpu), PyTorch(cpu,mps) | [google](https://drive.google.com/uc?id=1cutPICJa-PI_ww0E3ae9FCuSjY_5PnWE&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC | 795MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=1aOkc-QhtAj11gI8i335mHhNMUSESeJ5J&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3237MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | [google](https://drive.google.com/uc?id=16g33cZ925HNty_0Hly7Aw_nXlQlgqxDC&export=download), [hugging face](https://huggingface.co/wok000/vcclient000/tree/main) | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, RVC, DDSP-SVC, Diffusion-SVC | 3122MB |

(\*1) You can also download from [hugging_face](https://huggingface.co/wok000/vcclient000/tree/main)
(\*2) The developer does not have an AMD graphics card, so it has not been tested. This package only includes onnxruntime-directml.
1  client/demo/dist/assets/gui_settings/edition_dml.txt (vendored, new file)
@@ -0,0 +1 @@
onnxdirectML-cuda
2  client/demo/dist/index.js (vendored): file diff suppressed because one or more lines are too long
3012  client/demo/dist/index.js.LICENSE.txt (vendored): file diff suppressed because it is too large
1683  client/demo/package-lock.json (generated): file diff suppressed because it is too large
@@ -11,8 +11,8 @@
        "build:dev": "npm-run-all clean webpack:dev",
        "start": "webpack-dev-server --config webpack.dev.js",
        "build:mod": "cd ../lib && npm run build:dev && cd - && cp -r ../lib/dist/* node_modules/@dannadori/voice-changer-client-js/dist/",
        "build:mod_dos": "cd ../lib && npm run build:dev && cd ../demo && copy ../lib/dist/index.js node_modules/@dannadori/voice-changer-client-js/dist/",
        "build:mod_dos2": "copy ../lib/dist/index.js node_modules/@dannadori/voice-changer-client-js/dist/",
        "build:mod_dos": "cd ../lib && npm run build:dev && cd ../demo && npm-run-all build:mod_copy",
        "build:mod_copy": "XCOPY ..\\lib\\dist\\* .\\node_modules\\@dannadori\\voice-changer-client-js\\dist\\* /s /e /h /y",
        "test": "echo \"Error: no test specified\" && exit 1"
    },
    "keywords": [
@@ -26,17 +26,17 @@
        "@babel/preset-env": "^7.22.9",
        "@babel/preset-react": "^7.22.5",
        "@babel/preset-typescript": "^7.22.5",
        "@types/node": "^20.4.5",
        "@types/react": "^18.2.17",
        "@types/node": "^20.4.6",
        "@types/react": "^18.2.18",
        "@types/react-dom": "^18.2.7",
        "autoprefixer": "^10.4.14",
        "babel-loader": "^9.1.3",
        "copy-webpack-plugin": "^11.0.0",
        "css-loader": "^6.8.1",
        "eslint": "^8.45.0",
        "eslint-config-prettier": "^8.8.0",
        "eslint": "^8.46.0",
        "eslint-config-prettier": "^8.9.0",
        "eslint-plugin-prettier": "^5.0.0",
        "eslint-plugin-react": "^7.33.0",
        "eslint-plugin-react": "^7.33.1",
        "eslint-webpack-plugin": "^4.0.1",
        "html-loader": "^4.2.0",
        "html-webpack-plugin": "^5.5.3",
@@ -54,11 +54,11 @@
        "webpack-dev-server": "^4.15.1"
    },
    "dependencies": {
        "@dannadori/voice-changer-client-js": "^1.0.164",
        "@fortawesome/fontawesome-svg-core": "^6.4.0",
        "@fortawesome/free-brands-svg-icons": "^6.4.0",
        "@fortawesome/free-regular-svg-icons": "^6.4.0",
        "@fortawesome/free-solid-svg-icons": "^6.4.0",
        "@dannadori/voice-changer-client-js": "^1.0.166",
        "@fortawesome/fontawesome-svg-core": "^6.4.2",
        "@fortawesome/free-brands-svg-icons": "^6.4.2",
        "@fortawesome/free-regular-svg-icons": "^6.4.2",
        "@fortawesome/free-solid-svg-icons": "^6.4.2",
        "@fortawesome/react-fontawesome": "^0.2.0",
        "protobufjs": "^7.2.4",
        "react": "^18.2.0",
1  client/demo/public/assets/gui_settings/edition_dml.txt (new file)
@@ -0,0 +1 @@
onnxdirectML-cuda
@@ -63,6 +63,7 @@ type GuiStateAndMethod = {
    outputAudioDeviceInfo: MediaDeviceInfo[];
    audioInputForGUI: string;
    audioOutputForGUI: string;
    audioMonitorForGUI: string;
    fileInputEchoback: boolean | undefined;
    shareScreenEnabled: boolean;
    audioOutputForAnalyzer: string;
@@ -70,6 +71,7 @@ type GuiStateAndMethod = {
    setOutputAudioDeviceInfo: (val: MediaDeviceInfo[]) => void;
    setAudioInputForGUI: (val: string) => void;
    setAudioOutputForGUI: (val: string) => void;
    setAudioMonitorForGUI: (val: string) => void;
    setFileInputEchoback: (val: boolean) => void;
    setShareScreenEnabled: (val: boolean) => void;
    setAudioOutputForAnalyzer: (val: string) => void;
@@ -106,6 +108,7 @@ export const GuiStateProvider = ({ children }: Props) => {
    const [outputAudioDeviceInfo, setOutputAudioDeviceInfo] = useState<MediaDeviceInfo[]>([]);
    const [audioInputForGUI, setAudioInputForGUI] = useState<string>("none");
    const [audioOutputForGUI, setAudioOutputForGUI] = useState<string>("none");
    const [audioMonitorForGUI, setAudioMonitorForGUI] = useState<string>("none");
    const [fileInputEchoback, setFileInputEchoback] = useState<boolean>(false); // so that the initial mute takes effect; false seems fine here, undefined triggers a warning
    const [shareScreenEnabled, setShareScreenEnabled] = useState<boolean>(false);
    const [audioOutputForAnalyzer, setAudioOutputForAnalyzer] = useState<string>("default");
@@ -270,6 +273,7 @@ export const GuiStateProvider = ({ children }: Props) => {
        outputAudioDeviceInfo,
        audioInputForGUI,
        audioOutputForGUI,
        audioMonitorForGUI,
        fileInputEchoback,
        shareScreenEnabled,
        audioOutputForAnalyzer,
@@ -277,6 +281,7 @@ export const GuiStateProvider = ({ children }: Props) => {
        setOutputAudioDeviceInfo,
        setAudioInputForGUI,
        setAudioOutputForGUI,
        setAudioMonitorForGUI,
        setFileInputEchoback,
        setShareScreenEnabled,
        setAudioOutputForAnalyzer,
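These four hunks thread a new monitor-device field, `audioMonitorForGUI` with its setter, through the `GuiStateAndMethod` type, the provider state, and the context value. A minimal hedged consumer could look like the following sketch (the component is illustrative and not part of the diff):

```tsx
import React from "react";
import { useGuiState } from "./001_GuiStateProvider"; // path as used elsewhere in this diff

// Hypothetical consumer of the new monitor-device state (illustration only):
export const MonitorDeviceIndicator = () => {
    const { audioMonitorForGUI, setAudioMonitorForGUI } = useGuiState();
    // clicking resets the monitor device to "none", the same default the provider initializes with
    return <div onClick={() => setAudioMonitorForGUI("none")}>monitor: {audioMonitorForGUI}</div>;
};
```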
@@ -19,7 +19,7 @@ export const MainScreen = (props: MainScreenProps) => {
    const guiState = useGuiState();
    const messageBuilderState = useMessageBuilder();
    useMemo(() => {
        messageBuilderState.setMessage(__filename, "change_icon", { ja: "アイコン変更", en: "chage icon" });
        messageBuilderState.setMessage(__filename, "change_icon", { ja: "アイコン変更", en: "change icon" });
        messageBuilderState.setMessage(__filename, "rename", { ja: "リネーム", en: "rename" });
        messageBuilderState.setMessage(__filename, "download", { ja: "ダウンロード", en: "download" });
        messageBuilderState.setMessage(__filename, "terms_of_use", { ja: "利用規約", en: "terms of use" });
@@ -99,7 +99,7 @@ export const MainScreen = (props: MainScreenProps) => {
    const slotRow = serverSetting.serverSetting.modelSlots.map((x, index) => {
        // model icon
        const generateIconArea = (slotIndex: number, iconUrl: string, tooltip: boolean) => {
            const realIconUrl = iconUrl.length > 0 ? iconUrl : "/assets/icons/noimage.png";
            const realIconUrl = iconUrl.length > 0 ? serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + slotIndex + "/" + iconUrl.split(/[\/\\]/).pop() : "/assets/icons/noimage.png";
            const iconDivClass = tooltip ? "tooltip" : "";
            const iconClass = tooltip ? "model-slot-icon-pointable" : "model-slot-icon";
            return (
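The new `realIconUrl` expression rebuilds the icon path from the model directory, the slot index, and the icon file's base name. The same expression reappears in ModelSlotArea and CharacterArea below, so a shared helper would be a natural refactor; a hedged sketch (the helper name and fallback parameter are illustrative, not in the diff):

```typescript
// Hypothetical helper mirroring the repeated icon-path logic (not part of the diff):
const resolveIconUrl = (modelDir: string, slotIndex: number, iconFile: string, fallback: string): string => {
    if (iconFile.length === 0) return fallback;
    const basename = iconFile.split(/[\/\\]/).pop(); // strips either POSIX or Windows directory prefixes
    return modelDir + "/" + slotIndex + "/" + basename;
};
```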
@@ -3,158 +3,182 @@ import { useGuiState } from "./001_GuiStateProvider";
import { useAppState } from "../../001_provider/001_AppStateProvider";
import { MergeElement, RVCModelSlot, RVCModelType, VoiceChangerType } from "@dannadori/voice-changer-client-js";

export const MergeLabDialog = () => {
    const guiState = useGuiState()
    const guiState = useGuiState();

    const { serverSetting } = useAppState()
    const [currentFilter, setCurrentFilter] = useState<string>("")
    const [mergeElements, setMergeElements] = useState<MergeElement[]>([])
    const { serverSetting } = useAppState();
    const [currentFilter, setCurrentFilter] = useState<string>("");
    const [mergeElements, setMergeElements] = useState<MergeElement[]>([]);

    // initialization when the slots change
    const newSlotChangeKey = useMemo(() => {
        if (!serverSetting.serverSetting.modelSlots) {
            return ""
            return "";
        }
        return serverSetting.serverSetting.modelSlots.reduce((prev, cur) => {
            return prev + "_" + cur.modelFile
        }, "")
    }, [serverSetting.serverSetting.modelSlots])
            return prev + "_" + cur.modelFile;
        }, "");
    }, [serverSetting.serverSetting.modelSlots]);

    const filterItems = useMemo(() => {
        return serverSetting.serverSetting.modelSlots.reduce((prev, cur) => {
        return serverSetting.serverSetting.modelSlots.reduce(
            (prev, cur) => {
                if (cur.voiceChangerType != "RVC") {
                    return prev
                    return prev;
                }
                const curRVC = cur as RVCModelSlot
                const key = `${curRVC.modelType},${cur.samplingRate},${curRVC.embChannels}`
                const val = { type: curRVC.modelType, samplingRate: cur.samplingRate, embChannels: curRVC.embChannels }
                const existKeys = Object.keys(prev)
                const curRVC = cur as RVCModelSlot;
                const key = `${curRVC.modelType},${cur.samplingRate},${curRVC.embChannels}`;
                const val = { type: curRVC.modelType, samplingRate: cur.samplingRate, embChannels: curRVC.embChannels };
                const existKeys = Object.keys(prev);
                if (!cur.modelFile || cur.modelFile.length == 0) {
                    return prev
                    return prev;
                }
                if (curRVC.modelType == "onnxRVC" || curRVC.modelType == "onnxRVCNono") {
                    return prev
                    return prev;
                }
                if (!existKeys.includes(key)) {
                    prev[key] = val
                    prev[key] = val;
                }
                return prev
        }, {} as { [key: string]: { type: RVCModelType, samplingRate: number, embChannels: number } })
    }, [newSlotChangeKey])
                return prev;
            },
            {} as { [key: string]: { type: RVCModelType; samplingRate: number; embChannels: number } },
        );
    }, [newSlotChangeKey]);

    const models = useMemo(() => {
        return serverSetting.serverSetting.modelSlots.filter(x => {
        return serverSetting.serverSetting.modelSlots.filter((x) => {
            if (x.voiceChangerType != "RVC") {
                return
                return;
            }
            const xRVC = x as RVCModelSlot
            const filterVals = filterItems[currentFilter]
            const xRVC = x as RVCModelSlot;
            const filterVals = filterItems[currentFilter];
            if (!filterVals) {
                return false
                return false;
            }
            if (xRVC.modelType == filterVals.type && xRVC.samplingRate == filterVals.samplingRate && xRVC.embChannels == filterVals.embChannels) {
                return true
                return true;
            } else {
                return false
                return false;
            }
        })
    }, [filterItems, currentFilter])
        });
    }, [filterItems, currentFilter]);

    useEffect(() => {
        if (Object.keys(filterItems).length > 0) {
            setCurrentFilter(Object.keys(filterItems)[0])
            setCurrentFilter(Object.keys(filterItems)[0]);
        }
    }, [filterItems])
    }, [filterItems]);
    useEffect(() => {
        // models is the filtered array
        const newMergeElements = models.map((x) => {
            return { filename: x.modelFile, strength: 0 }
        })
        setMergeElements(newMergeElements)
    }, [models])
            return { slotIndex: x.slotIndex, filename: x.modelFile, strength: 0 };
        });
        setMergeElements(newMergeElements);
    }, [models]);

    const dialog = useMemo(() => {
        const closeButtonRow = (
            <div className="body-row split-3-4-3 left-padding-1">
                <div className="body-item-text">
                </div>
                <div className="body-item-text"></div>
                <div className="body-button-container body-button-container-space-around">
                    <div className="body-button" onClick={() => { guiState.stateControls.showMergeLabCheckbox.updateState(false) }} >close</div>
                    <div
                        className="body-button"
                        onClick={() => {
                            guiState.stateControls.showMergeLabCheckbox.updateState(false);
                        }}
                    >
                        close
                    </div>
                </div>
                <div className="body-item-text"></div>
            </div>
        )
        );

        const filterOptions = Object.keys(filterItems).map(x => {
            return <option key={x} value={x}>{x}</option>
        }).filter(x => x != null)

        const onMergeElementsChanged = (filename: string, strength: number) => {
            const newMergeElements = mergeElements.map((x) => {
                if (x.filename == filename) {
                    return { filename: x.filename, strength: strength }
                } else {
                    return x
                }
        const filterOptions = Object.keys(filterItems)
            .map((x) => {
                return (
                    <option key={x} value={x}>
                        {x}
                    </option>
                );
            })
            setMergeElements(newMergeElements)
            .filter((x) => x != null);

        const onMergeElementsChanged = (slotIndex: number, strength: number) => {
            const newMergeElements = mergeElements.map((x) => {
                if (x.slotIndex == slotIndex) {
                    return { slotIndex: x.slotIndex, strength: strength };
                } else {
                    return x;
                }
            });
            setMergeElements(newMergeElements);
        };

        const onMergeClicked = () => {
            const validMergeElements = mergeElements.filter((x) => {
                return x.strength > 0;
            });
            serverSetting.mergeModel({
                voiceChangerType: VoiceChangerType.RVC,
                command: "mix",
                files: mergeElements
            })
        }
                files: validMergeElements,
            });
        };

        const modelList = mergeElements.map((x, index) => {
            const name = models.find(model => { return model.modelFile == x.filename })?.name || ""
            const name =
                models.find((model) => {
                    return model.slotIndex == x.slotIndex;
                })?.name || "";

            return (
                <div key={index} className="merge-lab-model-item">
                    <div>{name}</div>
                    <div>
                        {name}
                    </div>
                    <div>
                        <input type="range" className="body-item-input-slider" min="0" max="100" step="1" value={x.strength} onChange={(e) => {
                            onMergeElementsChanged(x.filename, Number(e.target.value))
                        }}></input>
                        <input
                            type="range"
                            className="body-item-input-slider"
                            min="0"
                            max="100"
                            step="1"
                            value={x.strength}
                            onChange={(e) => {
                                onMergeElementsChanged(x.slotIndex, Number(e.target.value));
                            }}
                        ></input>
                        <span className="body-item-input-slider-val">{x.strength}</span>
                    </div>
                </div>
            )
        })
            );
        });

        const content = (
            <div className="merge-lab-container">
                <div className="merge-lab-type-filter">
                    <div>Type:</div>
                    <div>
                        Type:
                    </div>
                    <div>
                        <select value={currentFilter} onChange={(e) => { setCurrentFilter(e.target.value) }}>
                        <select
                            value={currentFilter}
                            onChange={(e) => {
                                setCurrentFilter(e.target.value);
                            }}
                        >
                            {filterOptions}
                        </select>
                    </div>
                </div>
                <div className="merge-lab-manipulator">
                    <div className="merge-lab-model-list">
                        {modelList}
                    </div>
                    <div className="merge-lab-model-list">{modelList}</div>
                    <div className="merge-lab-merge-buttons">
                        <div className="merge-lab-merge-buttons-notice">
                            The merged model is stored in the final slot. If you assign this slot, it will be overwritten.
                        </div>
                        <div className="merge-lab-merge-buttons-notice">The merged model is stored in the final slot. If you assign this slot, it will be overwritten.</div>
                        <div className="merge-lab-merge-button" onClick={onMergeClicked}>
                            merge
                        </div>
                    </div>
                </div>
            </div>
        )
        );
        return (
            <div className="dialog-frame">
                <div className="dialog-title">MergeLab</div>
@@ -166,5 +190,4 @@ export const MergeLabDialog = () => {
        );
    }, [newSlotChangeKey, currentFilter, mergeElements, models]);
    return dialog;

};
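The dialog now sends only elements with a positive strength to `serverSetting.mergeModel` with `command: "mix"`. The merge itself is not part of this diff; conceptually it is a strength-weighted average of the models' parameters, as in this hedged sketch (the types and function are assumptions for illustration, not the project's real server code):

```typescript
// Hedged sketch of what a "mix" merge plausibly computes (the real implementation
// lives in the Python server and is not shown in this diff):
type StateDict = { [key: string]: number[] };

const mixModels = (models: StateDict[], strengths: number[]): StateDict => {
    const total = strengths.reduce((a, b) => a + b, 0);
    const merged: StateDict = {};
    for (const key of Object.keys(models[0])) {
        // each parameter is the strength-weighted average of the corresponding parameters
        merged[key] = models[0][key].map((_, i) => models.reduce((sum, m, mi) => sum + m[key][i] * (strengths[mi] / total), 0));
    }
    return merged;
};
```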
@@ -1,83 +1,115 @@
import React, { useMemo } from "react"
import { useAppState } from "../../../001_provider/001_AppStateProvider"
import { useGuiState } from "../001_GuiStateProvider"
import { useMessageBuilder } from "../../../hooks/useMessageBuilder"
import React, { useMemo, useState } from "react";
import { useAppState } from "../../../001_provider/001_AppStateProvider";
import { useGuiState } from "../001_GuiStateProvider";
import { useMessageBuilder } from "../../../hooks/useMessageBuilder";
import { FontAwesomeIcon } from "@fortawesome/react-fontawesome";

export type ModelSlotAreaProps = {
}
export type ModelSlotAreaProps = {};

const SortTypes = {
    slot: "slot",
    name: "name",
} as const;
export type SortTypes = (typeof SortTypes)[keyof typeof SortTypes];

export const ModelSlotArea = (_props: ModelSlotAreaProps) => {
    const { serverSetting, getInfo } = useAppState()
    const guiState = useGuiState()
    const messageBuilderState = useMessageBuilder()
    const { serverSetting, getInfo } = useAppState();
    const guiState = useGuiState();
    const messageBuilderState = useMessageBuilder();
    const [sortType, setSortType] = useState<SortTypes>("slot");

    useMemo(() => {
        messageBuilderState.setMessage(__filename, "edit", { "ja": "編集", "en": "edit" })
    }, [])

        messageBuilderState.setMessage(__filename, "edit", { ja: "編集", en: "edit" });
    }, []);

    const modelTiles = useMemo(() => {
        if (!serverSetting.serverSetting.modelSlots) {
            return []
            return [];
        }
        return serverSetting.serverSetting.modelSlots.map((x, index) => {
        const modelSlots =
            sortType == "slot"
                ? serverSetting.serverSetting.modelSlots
                : serverSetting.serverSetting.modelSlots.slice().sort((a, b) => {
                      return a.name.localeCompare(b.name);
                  });

        return modelSlots
            .map((x, index) => {
                if (!x.modelFile || x.modelFile.length == 0) {
                    return null
                    return null;
                }
                const tileContainerClass = index == serverSetting.serverSetting.modelSlotIndex ? "model-slot-tile-container-selected" : "model-slot-tile-container"
                const name = x.name.length > 8 ? x.name.substring(0, 7) + "..." : x.name
                const iconElem = x.iconFile.length > 0 ?
                const tileContainerClass = x.slotIndex == serverSetting.serverSetting.modelSlotIndex ? "model-slot-tile-container-selected" : "model-slot-tile-container";
                const name = x.name.length > 8 ? x.name.substring(0, 7) + "..." : x.name;

                const iconElem =
                    x.iconFile.length > 0 ? (
                        <>
                            <img className="model-slot-tile-icon" src={x.iconFile} alt={x.name} />
                            <img className="model-slot-tile-icon" src={serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + x.slotIndex + "/" + x.iconFile.split(/[\/\\]/).pop()} alt={x.name} />
                            <div className="model-slot-tile-vctype">{x.voiceChangerType}</div>
                        </>
                    :
                    ) : (
                        <>
                            <div className="model-slot-tile-icon-no-entry">no image</div>
                            <div className="model-slot-tile-vctype">{x.voiceChangerType}</div>
                        </>
                    );

                const clickAction = async () => {
                    const dummyModelSlotIndex = (Math.floor(Date.now() / 1000)) * 1000 + index
                    await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, modelSlotIndex: dummyModelSlotIndex })
                    setTimeout(() => { // quick hack
                        getInfo()
                    }, 1000 * 2)
                }
                    const dummyModelSlotIndex = Math.floor(Date.now() / 1000) * 1000 + x.slotIndex;
                    await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, modelSlotIndex: dummyModelSlotIndex });
                    setTimeout(() => {
                        // quick hack
                        getInfo();
                    }, 1000 * 2);
                };

                return (
                    <div key={index} className={tileContainerClass} onClick={clickAction}>
                        <div className="model-slot-tile-icon-div">
                            {iconElem}
                        <div className="model-slot-tile-icon-div">{iconElem}</div>
                        <div className="model-slot-tile-dscription">{name}</div>
                        </div>
                        <div className="model-slot-tile-dscription">
                            {name}
                        </div>
                    </div >
                )
        }).filter(x => x != null)
    }, [serverSetting.serverSetting.modelSlots, serverSetting.serverSetting.modelSlotIndex])

                );
            })
            .filter((x) => x != null);
    }, [serverSetting.serverSetting.modelSlots, serverSetting.serverSetting.modelSlotIndex, sortType]);

    const modelSlotArea = useMemo(() => {
        const onModelSlotEditClicked = () => {
            guiState.stateControls.showModelSlotManagerCheckbox.updateState(true)
        }
            guiState.stateControls.showModelSlotManagerCheckbox.updateState(true);
        };
        const sortSlotByIdClass = sortType == "slot" ? "model-slot-sort-button-active" : "model-slot-sort-button";
        const sortSlotByNameClass = sortType == "name" ? "model-slot-sort-button-active" : "model-slot-sort-button";
        return (
            <div className="model-slot-area">
                <div className="model-slot-panel">
                    <div className="model-slot-tiles-container">{modelTiles}</div>
                    <div className="model-slot-buttons">
                        <div className="model-slot-sort-buttons">
                            <div
                                className={sortSlotByIdClass}
                                onClick={() => {
                                    setSortType("slot");
                                }}
                            >
                                <FontAwesomeIcon icon={["fas", "arrow-down-1-9"]} style={{ fontSize: "1rem" }} />
                            </div>
                            <div
                                className={sortSlotByNameClass}
                                onClick={() => {
                                    setSortType("name");
                                }}
                            >
                                <FontAwesomeIcon icon={["fas", "arrow-down-a-z"]} style={{ fontSize: "1rem" }} />
                            </div>
                        </div>
                        <div className="model-slot-button" onClick={onModelSlotEditClicked}>
                            {messageBuilderState.getMessage(__filename, "edit")}
                        </div>
                    </div>

                </div>
            </div>
        )
    }, [modelTiles])
        );
    }, [modelTiles, sortType]);

    return modelSlotArea
}
    return modelSlotArea;
};
@ -23,6 +23,26 @@ export const DiffusionSVCSettingArea = (_props: DiffusionSVCSettingAreaProps) =>
|
||||
return <></>;
|
||||
}
|
||||
|
||||
const skipDiffusionClass = serverSetting.serverSetting.skipDiffusion == 0 ? "character-area-toggle-button" : "character-area-toggle-button-active";
|
||||
|
||||
const skipDiffRow = (
|
||||
<div className="character-area-control">
|
||||
<div className="character-area-control-title">Boost</div>
|
||||
<div className="character-area-control-field">
|
||||
<div className="character-area-buttons">
|
||||
<div
|
||||
className={skipDiffusionClass}
|
||||
onClick={() => {
|
||||
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, skipDiffusion: serverSetting.serverSetting.skipDiffusion == 0 ? 1 : 0 });
|
||||
}}
|
||||
>
|
||||
skip diff
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
|
||||
const skipValues = getDivisors(serverSetting.serverSetting.kStep);
|
||||
skipValues.pop();
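The speed-up candidates offered in the GUI are the divisors of `kStep`, minus `kStep` itself (hence the trailing `pop()`), since the diffusion loop can only be thinned by an exact divisor of the step count. A sketch of a `getDivisors` helper consistent with that usage; the actual helper is defined elsewhere in the codebase, so its exact shape here is an assumption:

```typescript
// Collect the divisors of n in ascending order by scanning up to sqrt(n).
const getDivisors = (n: number): number[] => {
    const small: number[] = [];
    const large: number[] = [];
    for (let i = 1; i * i <= n; i++) {
        if (n % i === 0) {
            small.push(i);
            if (i !== n / i) {
                large.push(n / i);
            }
        }
    }
    return [...small, ...large.reverse()]; // e.g. getDivisors(20) -> [1, 2, 4, 5, 10, 20]
};
```

With the default `kStep` of 20, `skipValues` after the `pop()` would be `[1, 2, 4, 5, 10]`.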
@ -82,6 +102,7 @@ export const DiffusionSVCSettingArea = (_props: DiffusionSVCSettingAreaProps) =>
);
return (
<>
{skipDiffRow}
{kStepRow}
{speedUpRow}
</>

@ -49,7 +49,7 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
return <></>;
}

const icon = selected.iconFile.length > 0 ? selected.iconFile : "./assets/icons/human.png";
const icon = selected.iconFile.length > 0 ? serverSetting.serverSetting.voiceChangerParams.model_dir + "/" + selected.slotIndex + "/" + selected.iconFile.split(/[\/\\]/).pop() : "./assets/icons/human.png";
const selectedTermOfUseUrlLink = selected.termsOfUseUrl ? (
<a href={selected.termsOfUseUrl} target="_blank" rel="noopener noreferrer" className="portrait-area-terms-of-use-link">
[{messageBuilderState.getMessage(__filename, "terms_of_use")}]
@ -122,9 +122,13 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, serverAudioStated: 0 });
}
};
const onPassThroughClicked = async () => {
serverSetting.updateServerSettings({ ...serverSetting.serverSetting, passThrough: !serverSetting.serverSetting.passThrough });
};
const startClassName = guiState.isConverting ? "character-area-control-button-active" : "character-area-control-button-stanby";
const stopClassName = guiState.isConverting ? "character-area-control-button-stanby" : "character-area-control-button-active";

const passThruClassName = serverSetting.serverSetting.passThrough == false ? "character-area-control-passthru-button-stanby" : "character-area-control-passthru-button-active blinking";
console.log("serverSetting.serverSetting.passThrough", passThruClassName, serverSetting.serverSetting.passThrough);
return (
<div className="character-area-control">
<div className="character-area-control-buttons">
@ -134,6 +138,9 @@ export const CharacterArea = (_props: CharacterAreaProps) => {
<div onClick={onStopClicked} className={stopClassName}>
stop
</div>
<div onClick={onPassThroughClicked} className={passThruClassName}>
passthru
</div>
</div>
</div>
);

@ -41,6 +41,7 @@ export const ConvertArea = (props: ConvertProps) => {

const gpuSelect =
edition.indexOf("onnxdirectML-cuda") >= 0 ? (
<>
<div className="config-sub-area-control">
<div className="config-sub-area-control-title">GPU(dml):</div>
<div className="config-sub-area-control-field">
@ -54,7 +55,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={cpuClassName}
>
cpu
<span className="config-sub-area-button-text-small">cpu</span>
</div>
<div
onClick={async () => {
@ -65,7 +66,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu0ClassName}
>
0
<span className="config-sub-area-button-text-small">gpu0</span>
</div>
<div
onClick={async () => {
@ -76,7 +77,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu1ClassName}
>
1
<span className="config-sub-area-button-text-small">gpu1</span>
</div>
<div
onClick={async () => {
@ -87,7 +88,7 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu2ClassName}
>
2
<span className="config-sub-area-button-text-small">gpu2</span>
</div>
<div
onClick={async () => {
@ -98,11 +99,17 @@ export const ConvertArea = (props: ConvertProps) => {
}}
className={gpu3ClassName}
>
3
<span className="config-sub-area-button-text-small">gpu3</span>
</div>
<div className="config-sub-area-control">
<span className="config-sub-area-button-text-small">
<a href="https://github.com/w-okada/voice-changer/issues/410">more info</a>
</span>
</div>
</div>
</div>
</div>
</>
) : (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title">GPU:</div>

@ -2,14 +2,14 @@ import React, { useEffect, useMemo, useRef, useState } from "react";
import { useAppState } from "../../../001_provider/001_AppStateProvider";
import { fileSelectorAsDataURL, useIndexedDB } from "@dannadori/voice-changer-client-js";
import { useGuiState } from "../001_GuiStateProvider";
import { AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, INDEXEDDB_KEY_AUDIO_OUTPUT } from "../../../const";
import { AUDIO_ELEMENT_FOR_PLAY_MONITOR, AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, INDEXEDDB_KEY_AUDIO_MONITR, INDEXEDDB_KEY_AUDIO_OUTPUT } from "../../../const";
import { isDesktopApp } from "../../../const";

export type DeviceAreaProps = {};

export const DeviceArea = (_props: DeviceAreaProps) => {
const { setting, serverSetting, audioContext, setAudioOutputElementId, initializedRef, setVoiceChangerClientSetting, startOutputRecording, stopOutputRecording } = useAppState();
const { isConverting, audioInputForGUI, inputAudioDeviceInfo, setAudioInputForGUI, fileInputEchoback, setFileInputEchoback, setAudioOutputForGUI, audioOutputForGUI, outputAudioDeviceInfo, shareScreenEnabled, setShareScreenEnabled } = useGuiState();
const { setting, serverSetting, audioContext, setAudioOutputElementId, setAudioMonitorElementId, initializedRef, setVoiceChangerClientSetting, startOutputRecording, stopOutputRecording } = useAppState();
const { isConverting, audioInputForGUI, inputAudioDeviceInfo, setAudioInputForGUI, fileInputEchoback, setFileInputEchoback, setAudioOutputForGUI, setAudioMonitorForGUI, audioOutputForGUI, audioMonitorForGUI, outputAudioDeviceInfo, shareScreenEnabled, setShareScreenEnabled } = useGuiState();
const [inputHostApi, setInputHostApi] = useState<string>("ALL");
const [outputHostApi, setOutputHostApi] = useState<string>("ALL");
const [monitorHostApi, setMonitorHostApi] = useState<string>("ALL");
@ -244,10 +244,10 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
audio_echo.volume = 0;
setFileInputEchoback(false);

// original stream to play.
const audio_org = document.getElementById(AUDIO_ELEMENT_FOR_TEST_ORIGINAL) as HTMLAudioElement;
audio_org.src = url;
audio_org.pause();
// // original stream to play.
// const audio_org = document.getElementById(AUDIO_ELEMENT_FOR_TEST_ORIGINAL) as HTMLAudioElement;
// audio_org.src = url;
// audio_org.pause();
};

const echobackClass = fileInputEchoback ? "config-sub-area-control-field-wav-file-echoback-button-active" : "config-sub-area-control-field-wav-file-echoback-button";
@ -256,7 +256,7 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
<div className="config-sub-area-control-field">
<div className="config-sub-area-control-field-wav-file left-padding-1">
<div className="config-sub-area-control-field-wav-file-audio-container">
<audio id={AUDIO_ELEMENT_FOR_TEST_ORIGINAL} controls hidden></audio>
{/* <audio id={AUDIO_ELEMENT_FOR_TEST_ORIGINAL} controls hidden></audio> */}
<audio className="config-sub-area-control-field-wav-file-audio" id={AUDIO_ELEMENT_FOR_TEST_CONVERTED} controls controlsList="nodownload noplaybackrate"></audio>
<audio id={AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK} controls hidden></audio>
</div>
@ -381,7 +381,8 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
const setAudioOutput = async () => {
const mediaDeviceInfos = await navigator.mediaDevices.enumerateDevices();

[AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
// [AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_ORIGINAL, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
[AUDIO_ELEMENT_FOR_PLAY_RESULT, AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK].forEach((x) => {
const audio = document.getElementById(x) as HTMLAudioElement;
if (audio) {
if (serverSetting.serverSetting.enableServerAudio == 1) {
@ -598,7 +599,88 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
);
}, [serverSetting.serverSetting, serverSetting.updateServerSettings, serverSetting.serverSetting.enableServerAudio]);

// (6) Monitor
useEffect(() => {
const loadCache = async () => {
const key = await getItem(INDEXEDDB_KEY_AUDIO_MONITR);
if (key) {
setAudioMonitorForGUI(key as string);
}
};
loadCache();
}, []);
useEffect(() => {
const setAudioMonitor = async () => {
const mediaDeviceInfos = await navigator.mediaDevices.enumerateDevices();

[AUDIO_ELEMENT_FOR_PLAY_MONITOR].forEach((x) => {
const audio = document.getElementById(x) as HTMLAudioElement;
if (audio) {
if (serverSetting.serverSetting.enableServerAudio == 1) {
// When Server Audio is used, no sound is played from the element.
audio.volume = 0;
} else if (audioMonitorForGUI == "none") {
// @ts-ignore
audio.setSinkId("");
audio.volume = 0;
} else {
const audioOutputs = mediaDeviceInfos.filter((x) => {
return x.kind == "audiooutput";
});
const found = audioOutputs.some((x) => {
return x.deviceId == audioMonitorForGUI;
});
if (found) {
// @ts-ignore // Exceptions here apparently cannot be caught, so the ID must be checked beforehand!?
audio.setSinkId(audioMonitorForGUI);
audio.volume = 1;
} else {
console.warn("No audio output device. use default");
}
}
}
});
};
setAudioMonitor();
}, [audioMonitorForGUI, serverSetting.serverSetting.enableServerAudio]);
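As the comment in the effect notes, `HTMLMediaElement.setSinkId` can fail in ways that are awkward to catch, so the code verifies the device id against `enumerateDevices()` before calling it. The same guard, extracted as a standalone helper (a sketch, not part of this commit):

```typescript
// Route an <audio> element to a specific output device, but only after
// confirming the device id is still present; otherwise keep the default sink.
const safeSetSinkId = async (audio: HTMLAudioElement, deviceId: string): Promise<boolean> => {
    const devices = await navigator.mediaDevices.enumerateDevices();
    const found = devices.some((d) => d.kind === "audiooutput" && d.deviceId === deviceId);
    if (!found) {
        console.warn("No audio output device. use default");
        return false;
    }
    // @ts-ignore: setSinkId is missing from some TS lib.dom versions
    await audio.setSinkId(deviceId);
    return true;
};
```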
// (6-1) Client
const clientMonitorRow = useMemo(() => {
if (serverSetting.serverSetting.enableServerAudio == 1) {
return <></>;
}

return (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title left-padding-1">monitor</div>
<div className="config-sub-area-control-field">
<select
className="body-select"
value={audioMonitorForGUI}
onChange={(e) => {
setAudioMonitorForGUI(e.target.value);
setItem(INDEXEDDB_KEY_AUDIO_MONITR, e.target.value);
}}
>
{outputAudioDeviceInfo.map((x) => {
return (
<option key={x.deviceId} value={x.deviceId}>
{x.label}
</option>
);
})}
</select>
</div>
</div>
);
}, [serverSetting.serverSetting.enableServerAudio, outputAudioDeviceInfo, audioMonitorForGUI]);

useEffect(() => {
console.log("initializedRef.current", initializedRef.current);
setAudioMonitorElementId(AUDIO_ELEMENT_FOR_PLAY_MONITOR);
}, [initializedRef.current]);

// (6-2) Server
const serverMonitorRow = useMemo(() => {
if (serverSetting.serverSetting.enableServerAudio == 0) {
return <></>;
@ -675,6 +757,41 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
);
}, [monitorHostApi, serverSetting.serverSetting, serverSetting.updateServerSettings, serverSetting.serverSetting.enableServerAudio]);

const monitorGainControl = useMemo(() => {
const currentMonitorGain = serverSetting.serverSetting.enableServerAudio == 0 ? setting.voiceChangerClientSetting.monitorGain : serverSetting.serverSetting.serverMonitorAudioGain;
const monitorValueUpdatedAction =
serverSetting.serverSetting.enableServerAudio == 0
? async (val: number) => {
await setVoiceChangerClientSetting({ ...setting.voiceChangerClientSetting, monitorGain: val });
}
: async (val: number) => {
await serverSetting.updateServerSettings({ ...serverSetting.serverSetting, serverMonitorAudioGain: val });
};

return (
<div className="config-sub-area-control">
<div className="config-sub-area-control-title left-padding-2">gain</div>
<div className="config-sub-area-control-field">
<div className="config-sub-area-control-field-auido-io">
<span className="character-area-slider-control-slider">
<input
type="range"
min="0.1"
max="10.0"
step="0.1"
value={currentMonitorGain}
onChange={(e) => {
monitorValueUpdatedAction(Number(e.target.value));
}}
></input>
</span>
<span className="character-area-slider-control-val">{currentMonitorGain}</span>
</div>
</div>
</div>
);
}, [serverSetting.serverSetting, setting, setVoiceChangerClientSetting, serverSetting.updateServerSettings]);

return (
<div className="config-sub-area">
{deviceModeRow}
@ -685,10 +802,13 @@ export const DeviceArea = (_props: DeviceAreaProps) => {
{audioInputScreenRow}
{clientAudioOutputRow}
{serverAudioOutputRow}
{clientMonitorRow}
{serverMonitorRow}
{monitorGainControl}

{outputRecorderRow}
<audio hidden id={AUDIO_ELEMENT_FOR_PLAY_RESULT}></audio>
<audio hidden id={AUDIO_ELEMENT_FOR_PLAY_MONITOR}></audio>
</div>
);
};

@ -1,13 +1,15 @@

export const AUDIO_ELEMENT_FOR_PLAY_RESULT = "audio-result"
export const AUDIO_ELEMENT_FOR_TEST_ORIGINAL = "audio-test-original"
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED = "audio-test-converted"
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK = "audio-test-converted-echoback"
export const AUDIO_ELEMENT_FOR_PLAY_RESULT = "audio-result" // player for the converted output
export const AUDIO_ELEMENT_FOR_PLAY_MONITOR = "audio-monitor" // player for monitoring the converted output
export const AUDIO_ELEMENT_FOR_TEST_ORIGINAL = "audio-test-original" // ??? possibly unused
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED = "audio-test-converted" // controls for the file input
export const AUDIO_ELEMENT_FOR_TEST_CONVERTED_ECHOBACK = "audio-test-converted-echoback" // echo-back for the file input

export const AUDIO_ELEMENT_FOR_SAMPLING_INPUT = "body-wav-container-wav-input"
export const AUDIO_ELEMENT_FOR_SAMPLING_OUTPUT = "body-wav-container-wav-output"

export const INDEXEDDB_KEY_AUDIO_OUTPUT = "INDEXEDDB_KEY_AUDIO_OUTPUT"
export const INDEXEDDB_KEY_AUDIO_MONITR = "INDEXEDDB_KEY_AUDIO_MONITOR"
export const INDEXEDDB_KEY_DEFAULT_MODEL_TYPE = "INDEXEDDB_KEY_DEFALT_MODEL_TYPE"

@ -757,6 +757,18 @@ body {
max-height: 60vh;
width: 100%;
overflow-y: scroll;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}

.model-slot {
height: 5rem;
@ -1150,12 +1162,30 @@ body {
flex-direction: row;
gap: 2px;
flex-wrap: wrap;
overflow-y: scroll;
max-height: 12rem;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}

/* width: calc(30rem + 40px + 10px); */
}

.model-slot-buttons {
display: flex;
flex-direction: column-reverse;
gap: 5px;
flex-direction: column;
justify-content: space-between;
width: 4rem;
.model-slot-button {
border: solid 2px #999;
color: white;
@ -1164,10 +1194,41 @@ body {
background: #333;
cursor: pointer;
padding: 5px;
text-align: center;
width: 3rem;
}
.model-slot-button:hover {
border: solid 2px #faa;
}
.model-slot-sort-buttons {
height: 50%;
.model-slot-sort-button {
color: white;
font-size: 0.8rem;
border-radius: 4px;
background: #333;
border: solid 2px #444;
cursor: pointer;
padding: 1px;
text-align: center;
width: 3rem;
}
.model-slot-sort-button-active {
color: white;
font-size: 0.8rem;
border-radius: 4px;
background: #595;
border: solid 2px #595;
cursor: pointer;
padding: 1px;
text-align: center;
width: 3rem;
}
.model-slot-sort-button:hover {
border: solid 2px #faa;
background: #343;
}
}
}
}
}
@ -1277,6 +1338,7 @@ body {
.character-area-control {
display: flex;
gap: 3px;
align-items: center;
.character-area-control-buttons {
display: flex;
flex-direction: row;
@ -1301,6 +1363,34 @@ body {
border: solid 1px #000;
}
}
.character-area-control-passthru-button-stanby {
width: 5rem;
border: solid 1px #999;
border-radius: 15px;
padding: 2px;
background: #aba;
cursor: pointer;
font-weight: 700;
font-size: 0.8rem;
text-align: center;
&:hover {
border: solid 1px #000;
}
}
.character-area-control-passthru-button-active {
width: 5rem;
border: solid 1px #955;
border-radius: 15px;
padding: 2px;
background: #fdd;
cursor: pointer;
font-weight: 700;
font-size: 0.8rem;
text-align: center;
&:hover {
border: solid 1px #000;
}
}
}

.character-area-control-title {
@ -1344,6 +1434,35 @@ body {
.character-area-button:hover {
border: solid 2px #faa;
}
.character-area-toggle-button {
border: solid 2px #999;
color: white;
background: #666;

cursor: pointer;

font-size: 0.8rem;
border-radius: 5px;
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
}
.character-area-toggle-button:hover {
border: solid 2px #faa;
}
.character-area-toggle-button-active {
border: solid 2px #999;
color: white;
background: #844;

cursor: pointer;

font-size: 0.8rem;
border-radius: 5px;
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
}
}
}
}
@ -1443,6 +1562,10 @@ audio::-webkit-media-controls-overlay-enclosure{
height: 1.2rem;
padding-left: 2px;
padding-right: 2px;
white-space: nowrap;
}
.config-sub-area-button-text-small {
font-size: 0.5rem;
}
}
.config-sub-area-control-field-auido-io {
@ -1635,6 +1758,21 @@ audio::-webkit-media-controls-overlay-enclosure{
flex-direction: row;
.merge-lab-model-list {
width: 70%;
overflow-y: scroll;
max-height: 20rem;
&::-webkit-scrollbar {
width: 10px;
height: 10px;
}
&::-webkit-scrollbar-track {
background-color: #eee;
border-radius: 3px;
}
&::-webkit-scrollbar-thumb {
background: #f7cfec80;
border-radius: 3px;
}

.merge-lab-model-item {
display: flex;
flex-direction: row;
@ -1673,3 +1811,18 @@ audio::-webkit-media-controls-overlay-enclosure{
}
}
}

.blinking {
animation: flash 0.7s cubic-bezier(0.91, -0.14, 0, 1.4) infinite;
}

@keyframes flash {
0%,
100% {
opacity: 1;
}

50% {
opacity: 0.5;
}
}
1581
client/lib/package-lock.json
generated
File diff suppressed because it is too large
@ -1,6 +1,6 @@
{
"name": "@dannadori/voice-changer-client-js",
"version": "1.0.164",
"version": "1.0.167",
"description": "",
"main": "dist/index.js",
"directories": {
@ -26,33 +26,33 @@
"author": "wataru.okada@flect.co.jp",
"license": "ISC",
"devDependencies": {
"@types/audioworklet": "^0.0.48",
"@types/node": "^20.4.2",
"@types/react": "18.2.15",
"@types/audioworklet": "^0.0.50",
"@types/node": "^20.4.8",
"@types/react": "18.2.18",
"@types/react-dom": "18.2.7",
"eslint": "^8.45.0",
"eslint-config-prettier": "^8.8.0",
"eslint": "^8.46.0",
"eslint-config-prettier": "^9.0.0",
"eslint-plugin-prettier": "^5.0.0",
"eslint-plugin-react": "^7.32.2",
"eslint-plugin-react": "^7.33.1",
"eslint-webpack-plugin": "^4.0.1",
"npm-run-all": "^4.1.5",
"prettier": "^3.0.0",
"prettier": "^3.0.1",
"raw-loader": "^4.0.2",
"rimraf": "^5.0.1",
"ts-loader": "^9.4.4",
"typescript": "^5.1.6",
"webpack": "^5.88.1",
"webpack": "^5.88.2",
"webpack-cli": "^5.1.4",
"webpack-dev-server": "^4.15.1"
},
"dependencies": {
"@types/readable-stream": "^2.3.15",
"@types/readable-stream": "^4.0.0",
"amazon-chime-sdk-js": "^3.15.0",
"buffer": "^6.0.3",
"localforage": "^1.10.0",
"protobufjs": "^7.2.4",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"socket.io-client": "^4.7.1"
"socket.io-client": "^4.7.2"
}
}

@ -23,9 +23,11 @@ export class VoiceChangerClient {
private currentMediaStreamAudioSourceNode: MediaStreamAudioSourceNode | null = null
private inputGainNode: GainNode | null = null
private outputGainNode: GainNode | null = null
private monitorGainNode: GainNode | null = null
private vcInNode!: VoiceChangerWorkletNode
private vcOutNode!: VoiceChangerWorkletNode
private currentMediaStreamAudioDestinationNode!: MediaStreamAudioDestinationNode
private currentMediaStreamAudioDestinationMonitorNode!: MediaStreamAudioDestinationNode

private promiseForInitialize: Promise<void>
@ -72,6 +74,12 @@ export class VoiceChangerClient {
this.vcOutNode.connect(this.outputGainNode) // vc node -> output node
this.outputGainNode.connect(this.currentMediaStreamAudioDestinationNode)

this.currentMediaStreamAudioDestinationMonitorNode = ctx44k.createMediaStreamDestination() // output node
this.monitorGainNode = ctx44k.createGain()
this.monitorGainNode.gain.value = this.setting.monitorGain
this.vcOutNode.connect(this.monitorGainNode) // vc node -> monitor node
this.monitorGainNode.connect(this.currentMediaStreamAudioDestinationMonitorNode)

if (this.vfEnable) {
this.vf = await VoiceFocusDeviceTransformer.create({ variant: 'c20' })
const dummyMediaStream = createDummyMediaStream(this.ctx)
@ -185,6 +193,9 @@ export class VoiceChangerClient {
get stream(): MediaStream {
return this.currentMediaStreamAudioDestinationNode.stream
}
get monitorStream(): MediaStream {
return this.currentMediaStreamAudioDestinationMonitorNode.stream
}
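The constructor change above gives the converted signal two parallel sinks: `vcOutNode` feeds the existing output gain/destination pair and, independently, a new monitor gain/destination pair, so output and monitor volumes can differ. Stripped of the class plumbing, the resulting graph looks like this (a sketch assuming a 44.1 kHz AudioContext, with an oscillator standing in for the worklet output):

```typescript
// Two-tap output graph: one MediaStream for the main output, one for monitoring,
// each behind its own GainNode so the volumes are independent.
const ctx44k = new AudioContext({ sampleRate: 44100 });
const source = ctx44k.createOscillator(); // stand-in for the VC worklet output

const outputGain = ctx44k.createGain();
const outputDest = ctx44k.createMediaStreamDestination();
source.connect(outputGain).connect(outputDest);

const monitorGain = ctx44k.createGain();
const monitorDest = ctx44k.createMediaStreamDestination();
source.connect(monitorGain).connect(monitorDest);

outputGain.gain.value = 1.0;
monitorGain.gain.value = 0.5; // the monitor can be quieter than the main output
// outputDest.stream and monitorDest.stream correspond to `stream` / `monitorStream`.
```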
start = async () => {
await this.vcInNode.start()
@ -239,6 +250,9 @@ export class VoiceChangerClient {
if (this.setting.outputGain != setting.outputGain) {
this.setOutputGain(setting.outputGain)
}
if (this.setting.monitorGain != setting.monitorGain) {
this.setMonitorGain(setting.monitorGain)
}

this.setting = setting
if (reconstructInputRequired) {
@ -251,6 +265,9 @@ export class VoiceChangerClient {
if (!this.inputGainNode) {
return
}
if(!val){
return
}
this.inputGainNode.gain.value = val
}

@ -258,9 +275,22 @@ export class VoiceChangerClient {
if (!this.outputGainNode) {
return
}
if(!val){
return
}
this.outputGainNode.gain.value = val
}

setMonitorGain = (val: number) => {
if (!this.monitorGainNode) {
return
}
if(!val){
return
}
this.monitorGainNode.gain.value = val
}
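Note that the `if(!val){ return }` guard in these setters rejects `0` as well as `undefined`, so a gain of exactly zero is silently ignored. If muting via gain were ever wanted, an explicit undefined/NaN check would be needed; a sketch of that variant (not what this commit does):

```typescript
// Variant that still filters out undefined/NaN but lets 0 through,
// so the gain could also be used to mute.
const setGainStrict = (node: GainNode | null, val?: number) => {
    if (!node || val === undefined || Number.isNaN(val)) {
        return;
    }
    node.gain.value = val; // 0 is a valid (muted) gain here
};
```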
/////////////////////////////////////////////////////
// Component settings and operations
/////////////////////////////////////////////////////

@ -68,6 +68,7 @@ export const RVCModelType = {
export type RVCModelType = typeof RVCModelType[keyof typeof RVCModelType]

export const ServerSettingKey = {
"passThrough":"passThrough",
"srcId": "srcId",
"dstId": "dstId",
"gpu": "gpu",
@ -97,6 +98,7 @@ export const ServerSettingKey = {
"serverReadChunkSize": "serverReadChunkSize",
"serverInputAudioGain": "serverInputAudioGain",
"serverOutputAudioGain": "serverOutputAudioGain",
"serverMonitorAudioGain": "serverMonitorAudioGain",

"tran": "tran",
"noiseScale": "noiseScale",
@ -123,6 +125,7 @@ export const ServerSettingKey = {
"threshold": "threshold",

"speedUp": "speedUp",
"skipDiffusion": "skipDiffusion",

"inputSampleRate": "inputSampleRate",
"enableDirectML": "enableDirectML",
@ -131,6 +134,7 @@ export type ServerSettingKey = typeof ServerSettingKey[keyof typeof ServerSettin

export type VoiceChangerServerSetting = {
passThrough: boolean
srcId: number,
dstId: number,
gpu: number,
@ -157,6 +161,7 @@ export type VoiceChangerServerSetting = {
serverReadChunkSize: number
serverInputAudioGain: number
serverOutputAudioGain: number
serverMonitorAudioGain: number

tran: number // so-vits-svc
@ -184,13 +189,14 @@ export type VoiceChangerServerSetting = {
threshold: number// DDSP-SVC

speedUp: number // Diffusion-SVC
skipDiffusion: number // Diffusion-SVC 0:off, 1:on

inputSampleRate: InputSampleRate
enableDirectML: number
}

type ModelSlot = {
slotIndex: number
voiceChangerType: VoiceChangerType
name: string,
description: string,
@ -303,7 +309,9 @@ export type ServerInfo = VoiceChangerServerSetting & {
memory: number,
}[]
maxInputLength: number // MMVCv15
voiceChangerParams: {
model_dir: string
}
}

export type SampleModel = {
@ -339,6 +347,7 @@ export type DiffusionSVCSampleModel =SampleModel & {

export const DefaultServerSetting: ServerInfo = {
// VC Common
passThrough: false,
inputSampleRate: 48000,

crossFadeOffsetRate: 0.0,
@ -361,6 +370,7 @@ export const DefaultServerSetting: ServerInfo = {
serverReadChunkSize: 256,
serverInputAudioGain: 1.0,
serverOutputAudioGain: 1.0,
serverMonitorAudioGain: 1.0,

// VC Specific
srcId: 0,
@ -397,6 +407,7 @@ export const DefaultServerSetting: ServerInfo = {
threshold: -45,

speedUp: 10,
skipDiffusion: 1,

enableDirectML: 0,
//
@ -405,7 +416,10 @@ export const DefaultServerSetting: ServerInfo = {
serverAudioInputDevices: [],
serverAudioOutputDevices: [],

maxInputLength: 128 * 2048
maxInputLength: 128 * 2048,
voiceChangerParams: {
model_dir: ""
}
}

///////////////////////
@ -466,6 +480,7 @@ export type VoiceChangerClientSetting = {

inputGain: number
outputGain: number
monitorGain: number
}

///////////////////////
@ -496,7 +511,8 @@ export const DefaultClientSettng: ClientSetting = {
noiseSuppression: false,
noiseSuppression2: false,
inputGain: 1.0,
outputGain: 1.0
outputGain: 1.0,
monitorGain: 1.0
}
}

@ -533,7 +549,7 @@ export type OnnxExporterInfo = {

// Merge
export type MergeElement = {
filename: string
slotIndex: number
strength: number
}
export type MergeModelRequest = {

@ -47,6 +47,7 @@ export type ClientState = {
clearSetting: () => Promise<void>
// AudioOutputElement settings
setAudioOutputElementId: (elemId: string) => void
setAudioMonitorElementId: (elemId: string) => void

ioErrorCount: number
resetIoErrorCount: () => void
@ -215,6 +216,18 @@ export const useClient = (props: UseClientProps): ClientState => {
}
}

const setAudioMonitorElementId = (elemId: string) => {
if (!voiceChangerClientRef.current) {
console.warn("[voiceChangerClient] is not ready for set audio output.")
return
}
const audio = document.getElementById(elemId) as HTMLAudioElement
if (audio.paused) {
audio.srcObject = voiceChangerClientRef.current.monitorStream
audio.play()
}
}
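`setAudioMonitorElementId` mirrors the existing output wiring: the hook looks up a hidden `<audio>` element by id and hands it the client's `monitorStream`. In isolation, the attachment looks like this (a sketch; `client` stands in for the initialized VoiceChangerClient):

```typescript
// Attach the monitor MediaStream to a hidden <audio> element once,
// guarding on `paused` so an already-playing element is not restarted.
const attachMonitor = (client: { monitorStream: MediaStream }, elemId: string) => {
    const audio = document.getElementById(elemId) as HTMLAudioElement | null;
    if (!audio) {
        console.warn(`audio element ${elemId} not found`);
        return;
    }
    if (audio.paused) {
        audio.srcObject = client.monitorStream;
        audio.play();
    }
};
```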
// (2-2) Reload info
const getInfo = useMemo(() => {
return async () => {
@ -286,6 +299,7 @@ export const useClient = (props: UseClientProps): ClientState => {

// AudioOutputElement settings
setAudioOutputElementId,
setAudioMonitorElementId,

ioErrorCount,
resetIoErrorCount

@ -18,6 +18,10 @@ npm run build:docker:vcclient
bash start_docker.sh
```

Access it with a browser (only Chrome is supported) and the screen will appear.

## RUN with options

If you do not use a GPU:

```

@ -36,6 +36,10 @@ In root folder of repos.
bash start_docker.sh
```

Access it with a browser (currently only Chrome is supported) to see the GUI.

## RUN with options

Without GPU

```

@ -9,6 +9,7 @@ import argparse
from Exceptions import WeightDownladException
from downloader.SampleDownloader import downloadInitialSamples
from downloader.WeightDownloader import downloadWeight
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager

from voice_changer.utils.VoiceChangerParams import VoiceChangerParams

@ -40,19 +41,19 @@ def setupArgParser():
parser.add_argument("--httpsCert", type=str, default="ssl.cert", help="path for the cert of https")
parser.add_argument("--httpsSelfSigned", type=strtobool, default=True, help="generate self-signed certificate")

parser.add_argument("--model_dir", type=str, help="path to model files")
parser.add_argument("--model_dir", type=str, default="model_dir", help="path to model files")
parser.add_argument("--sample_mode", type=str, default="production", help="rvc_sample_mode")

parser.add_argument("--content_vec_500", type=str, help="path to content_vec_500 model(pytorch)")
parser.add_argument("--content_vec_500_onnx", type=str, help="path to content_vec_500 model(onnx)")
parser.add_argument("--content_vec_500_onnx_on", type=strtobool, default=False, help="use or not onnx for content_vec_500")
parser.add_argument("--hubert_base", type=str, help="path to hubert_base model(pytorch)")
parser.add_argument("--hubert_base_jp", type=str, help="path to hubert_base_jp model(pytorch)")
parser.add_argument("--hubert_soft", type=str, help="path to hubert_soft model(pytorch)")
parser.add_argument("--nsf_hifigan", type=str, help="path to nsf_hifigan model(pytorch)")
parser.add_argument("--crepe_onnx_full", type=str, help="path to crepe_onnx_full")
parser.add_argument("--crepe_onnx_tiny", type=str, help="path to crepe_onnx_tiny")
parser.add_argument("--rmvpe", type=str, help="path to rmvpe")
parser.add_argument("--content_vec_500", type=str, default="pretrain/checkpoint_best_legacy_500.pt", help="path to content_vec_500 model(pytorch)")
parser.add_argument("--content_vec_500_onnx", type=str, default="pretrain/content_vec_500.onnx", help="path to content_vec_500 model(onnx)")
parser.add_argument("--content_vec_500_onnx_on", type=strtobool, default=True, help="use or not onnx for content_vec_500")
parser.add_argument("--hubert_base", type=str, default="pretrain/hubert_base.pt", help="path to hubert_base model(pytorch)")
parser.add_argument("--hubert_base_jp", type=str, default="pretrain/rinna_hubert_base_jp.pt", help="path to hubert_base_jp model(pytorch)")
parser.add_argument("--hubert_soft", type=str, default="pretrain/hubert/hubert-soft-0d54a1f4.pt", help="path to hubert_soft model(pytorch)")
parser.add_argument("--nsf_hifigan", type=str, default="pretrain/nsf_hifigan/model", help="path to nsf_hifigan model(pytorch)")
parser.add_argument("--crepe_onnx_full", type=str, default="pretrain/crepe_onnx_full.onnx", help="path to crepe_onnx_full")
parser.add_argument("--crepe_onnx_tiny", type=str, default="pretrain/crepe_onnx_tiny.onnx", help="path to crepe_onnx_tiny")
parser.add_argument("--rmvpe", type=str, default="pretrain/rmvpe.pt", help="path to rmvpe")

return parser

@ -96,6 +97,8 @@ voiceChangerParams = VoiceChangerParams(
rmvpe=args.rmvpe,
sample_mode=args.sample_mode,
)
vcparams = VoiceChangerParamsManager.get_instance()
vcparams.setParams(voiceChangerParams)

printMessage(f"Booting PHASE :{__name__}", level=2)

@ -124,7 +127,8 @@ if __name__ == "MMVCServerSIO":

if __name__ == "__mp_main__":
printMessage("サーバプロセスを起動しています。", level=2)
# printMessage("サーバプロセスを起動しています。", level=2)
printMessage("The server process is starting up.", level=2)

if __name__ == "__main__":
mp.freeze_support()
@ -132,12 +136,13 @@ if __name__ == "__main__":
logger.debug(args)

printMessage(f"PYTHON:{sys.version}", level=2)
printMessage("Voice Changerを起動しています。", level=2)
# printMessage("Voice Changerを起動しています。", level=2)
printMessage("Activating the Voice Changer.", level=2)
# Download (weights)
try:
downloadWeight(voiceChangerParams)
except WeightDownladException:
printMessage("RVC用のモデルファイルのダウンロードに失敗しました。", level=2)
# printMessage("RVC用のモデルファイルのダウンロードに失敗しました。", level=2)
printMessage("failed to download weight for rvc", level=2)

# Download (samples)
@ -192,29 +197,31 @@ if __name__ == "__main__":
printMessage("-- ---- -- ", level=1)

# Display addresses
printMessage("ブラウザで次のURLを開いてください.", level=2)
printMessage("Please open the following URL in your browser.", level=2)
# printMessage("ブラウザで次のURLを開いてください.", level=2)
if args.https == 1:
printMessage("https://<IP>:<PORT>/", level=1)
else:
printMessage("http://<IP>:<PORT>/", level=1)

printMessage("多くの場合は次のいずれかのURLにアクセスすると起動します。", level=2)
# printMessage("多くの場合は次のいずれかのURLにアクセスすると起動します。", level=2)
printMessage("In many cases, it will launch when you access any of the following URLs.", level=2)
if "EX_PORT" in locals() and "EX_IP" in locals():  # launched via shell script (docker)
if args.https == 1:
printMessage(f"https://localhost:{EX_PORT}/", level=1)
printMessage(f"https://127.0.0.1:{EX_PORT}/", level=1)
for ip in EX_IP.strip().split(" "):
printMessage(f"https://{ip}:{EX_PORT}/", level=1)
else:
printMessage(f"http://localhost:{EX_PORT}/", level=1)
printMessage(f"http://127.0.0.1:{EX_PORT}/", level=1)
else:  # launched directly with python
if args.https == 1:
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect((args.test_connect, 80))
hostname = s.getsockname()[0]
printMessage(f"https://localhost:{PORT}/", level=1)
printMessage(f"https://127.0.0.1:{PORT}/", level=1)
printMessage(f"https://{hostname}:{PORT}/", level=1)
else:
printMessage(f"http://localhost:{PORT}/", level=1)
printMessage(f"http://127.0.0.1:{PORT}/", level=1)

# Start the server
if args.https:
@ -237,15 +244,15 @@ if __name__ == "__main__":
p.start()
try:
if sys.platform.startswith("win"):
process = subprocess.Popen([NATIVE_CLIENT_FILE_WIN, "--disable-gpu", "-u", f"http://localhost:{PORT}/"])
process = subprocess.Popen([NATIVE_CLIENT_FILE_WIN, "--disable-gpu", "-u", f"http://127.0.0.1:{PORT}/"])
return_code = process.wait()
logger.info("client closed.")
p.terminate()
elif sys.platform.startswith("darwin"):
process = subprocess.Popen([NATIVE_CLIENT_FILE_MAC, "--disable-gpu", "-u", f"http://localhost:{PORT}/"])
process = subprocess.Popen([NATIVE_CLIENT_FILE_MAC, "--disable-gpu", "-u", f"http://127.0.0.1:{PORT}/"])
return_code = process.wait()
logger.info("client closed.")
p.terminate()

except Exception as e:
logger.error(f"[Voice Changer] Launch Exception, {e}")
logger.error(f"[Voice Changer] Client Launch Exception, {e}")

@ -169,4 +169,4 @@ def getSampleJsonAndModelIds(mode: RVCSampleMode):

RVC_MODEL_DIRNAME = "rvc"
MAX_SLOT_NUM = 10
MAX_SLOT_NUM = 200
@ -9,6 +9,7 @@ import json

@dataclass
class ModelSlot:
slotIndex: int = -1
voiceChangerType: VoiceChangerType | None = None
name: str = ""
description: str = ""
@ -132,19 +133,26 @@ def loadSlotInfo(model_dir: str, slotIndex: int) -> ModelSlots:
if not os.path.exists(jsonFile):
return ModelSlot()
jsonDict = json.load(open(os.path.join(slotDir, "params.json")))
slotInfo = ModelSlot(**{k: v for k, v in jsonDict.items() if k in ModelSlot.__annotations__})
slotInfoKey = list(ModelSlot.__annotations__.keys())
slotInfo = ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
if slotInfo.voiceChangerType == "RVC":
return RVCModelSlot(**jsonDict)
slotInfoKey.extend(list(RVCModelSlot.__annotations__.keys()))
return RVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "MMVCv13":
return MMVCv13ModelSlot(**jsonDict)
slotInfoKey.extend(list(MMVCv13ModelSlot.__annotations__.keys()))
return MMVCv13ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "MMVCv15":
return MMVCv15ModelSlot(**jsonDict)
slotInfoKey.extend(list(MMVCv15ModelSlot.__annotations__.keys()))
return MMVCv15ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "so-vits-svc-40":
return SoVitsSvc40ModelSlot(**jsonDict)
slotInfoKey.extend(list(SoVitsSvc40ModelSlot.__annotations__.keys()))
return SoVitsSvc40ModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "DDSP-SVC":
return DDSPSVCModelSlot(**jsonDict)
slotInfoKey.extend(list(DDSPSVCModelSlot.__annotations__.keys()))
return DDSPSVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
elif slotInfo.voiceChangerType == "Diffusion-SVC":
return DiffusionSVCModelSlot(**jsonDict)
slotInfoKey.extend(list(DiffusionSVCModelSlot.__annotations__.keys()))
return DiffusionSVCModelSlot(**{k: v for k, v in jsonDict.items() if k in slotInfoKey})
else:
return ModelSlot()

@ -153,10 +161,13 @@ def loadAllSlotInfo(model_dir: str):
slotInfos: list[ModelSlots] = []
for slotIndex in range(MAX_SLOT_NUM):
slotInfo = loadSlotInfo(model_dir, slotIndex)
slotInfo.slotIndex = slotIndex  # the slot index is injected dynamically
slotInfos.append(slotInfo)
return slotInfos

def saveSlotInfo(model_dir: str, slotIndex: int, slotInfo: ModelSlots):
slotDir = os.path.join(model_dir, str(slotIndex))
json.dump(asdict(slotInfo), open(os.path.join(slotDir, "params.json"), "w"))
slotInfoDict = asdict(slotInfo)
slotInfo.slotIndex = -1  # the slot index is injected dynamically
json.dump(slotInfoDict, open(os.path.join(slotDir, "params.json"), "w"), indent=4)

@ -1,5 +1,6 @@
import json
import os
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Tuple

@ -7,7 +8,6 @@ from const import RVCSampleMode, getSampleJsonAndModelIds
from data.ModelSample import ModelSamples, generateModelSample
from data.ModelSlot import DiffusionSVCModelSlot, ModelSlot, RVCModelSlot
from mods.log_control import VoiceChangaerLogger
from voice_changer.DiffusionSVC.DiffusionSVCModelSlotGenerator import DiffusionSVCModelSlotGenerator
from voice_changer.ModelSlotManager import ModelSlotManager
from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
from downloader.Downloader import download, download_no_tqdm
@ -109,7 +109,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.modelFile = modelFilePath
slotInfo.modelFile = os.path.basename(sample.modelUrl)
line_num += 1

if targetSampleParams["useIndex"] is True and hasattr(sample, "indexUrl") and sample.indexUrl != "":
@ -124,7 +124,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.indexFile = indexPath
slotInfo.indexFile = os.path.basename(sample.indexUrl)
line_num += 1

if hasattr(sample, "icon") and sample.icon != "":
@ -139,7 +139,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.iconFile = iconPath
slotInfo.iconFile = os.path.basename(sample.icon)
line_num += 1

slotInfo.sampleId = sample.id
@ -153,6 +153,8 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
slotInfo.isONNX = slotInfo.modelFile.endswith(".onnx")
modelSlotManager.save_model_slot(targetSlotIndex, slotInfo)
elif sample.voiceChangerType == "Diffusion-SVC":
if sys.platform.startswith("darwin") is True:
continue
slotInfo: DiffusionSVCModelSlot = DiffusionSVCModelSlot()

os.makedirs(slotDir, exist_ok=True)
@ -167,7 +169,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.modelFile = modelFilePath
slotInfo.modelFile = os.path.basename(sample.modelUrl)
line_num += 1

if hasattr(sample, "icon") and sample.icon != "":
@ -182,7 +184,7 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
"position": line_num,
}
)
slotInfo.iconFile = iconPath
slotInfo.iconFile = os.path.basename(sample.icon)
line_num += 1

slotInfo.sampleId = sample.id
@ -212,14 +214,17 @@ def _downloadSamples(samples: list[ModelSamples], sampleModelIds: list[Tuple[str
logger.info("[Voice Changer] Generating metadata...")
for targetSlotIndex in slotIndex:
slotInfo = modelSlotManager.get_slot_info(targetSlotIndex)
modelPath = os.path.join(model_dir, str(slotInfo.slotIndex), os.path.basename(slotInfo.modelFile))
if slotInfo.voiceChangerType == "RVC":
if slotInfo.isONNX:
slotInfo = RVCModelSlotGenerator._setInfoByONNX(slotInfo)
slotInfo = RVCModelSlotGenerator._setInfoByONNX(modelPath, slotInfo)
else:
slotInfo = RVCModelSlotGenerator._setInfoByPytorch(slotInfo)
slotInfo = RVCModelSlotGenerator._setInfoByPytorch(modelPath, slotInfo)

modelSlotManager.save_model_slot(targetSlotIndex, slotInfo)
elif slotInfo.voiceChangerType == "Diffusion-SVC":
if sys.platform.startswith("darwin") is False:
from voice_changer.DiffusionSVC.DiffusionSVCModelSlotGenerator import DiffusionSVCModelSlotGenerator
if slotInfo.isONNX:
pass
else:
3
server/fillSlot.sh
Normal file
@ -0,0 +1,3 @@
for i in {1..199}; do
cp -r model_dir/0 model_dir/$i
done

@ -113,6 +113,8 @@ class MMVC_Rest_Fileuploader:
return JSONResponse(content=json_compatible_item_data)
except Exception as e:
print("[Voice Changer] post_merge_models ex:", e)
import traceback
traceback.print_exc()

def post_update_model_default(self):
try:

@ -6,6 +6,7 @@ import torch
from data.ModelSlot import DDSPSVCModelSlot

from voice_changer.DDSP_SVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager

if sys.platform.startswith("darwin"):
baseDir = [x for x in sys.path if x.endswith("Contents/MacOS")]
@ -69,12 +70,15 @@ class DDSP_SVC:

def initialize(self):
self.device = self.deviceManager.getDevice(self.settings.gpu)
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), "model", self.slotInfo.modelFile)
diffPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), "diff", self.slotInfo.diffModelFile)

self.svc_model = SvcDDSP()
self.svc_model.setVCParams(self.params)
self.svc_model.update_model(self.slotInfo.modelFile, self.device)
self.svc_model.update_model(modelPath, self.device)
self.diff_model = DiffGtMel(device=self.device)
self.diff_model.flush_model(self.slotInfo.diffModelFile, ddsp_config=self.svc_model.args)
self.diff_model.flush_model(diffPath, ddsp_config=self.svc_model.args)

def update_settings(self, key: str, val: int | float | str):
if key in self.settings.intData:
@ -174,5 +178,9 @@ class DDSP_SVC:
if file_path.find("DDSP-SVC" + os.path.sep) >= 0:
# print("remove", key, file_path)
sys.modules.pop(key)
except: # type:ignore
except: # type:ignore # noqa
pass

def get_model_current(self):
return [
]

@ -14,7 +14,7 @@ from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
# from voice_changer.RVC.onnxExporter.export2onnx import export2onnx
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager

from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException
from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException, PipelineNotInitializedException

logger = VoiceChangaerLogger.get_instance().getLogger()

@ -28,7 +28,6 @@ class DiffusionSVC(VoiceChangerModel):
InferencerManager.initialize(params)
self.settings = DiffusionSVCSettings()
self.params = params
self.pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)

self.pipeline: Pipeline | None = None

@ -84,6 +83,8 @@ class DiffusionSVC(VoiceChangerModel):
if self.pipeline is not None:
pipelineInfo = self.pipeline.getPipelineInfo()
data["pipelineInfo"] = pipelineInfo
else:
data["pipelineInfo"] = "None"
return data

def get_processing_sampling_rate(self):
@ -137,6 +138,9 @@ class DiffusionSVC(VoiceChangerModel):
return (self.audio_buffer, self.pitchf_buffer, self.feature_buffer, convertSize, vol)

def inference(self, receivedData: AudioInOut, crossfade_frame: int, sola_search_frame: int):
if self.pipeline is None:
logger.info("[Voice Changer] Pipeline is not initialized.")
raise PipelineNotInitializedException()
data = self.generate_input(receivedData, crossfade_frame, sola_search_frame)
audio: AudioInOut = data[0]
pitchf: PitchfInOut = data[1]
@ -176,7 +180,8 @@ class DiffusionSVC(VoiceChangerModel):
silenceFrontSec,
embOutputLayer,
useFinalProj,
protect
protect,
skip_diffusion=self.settings.skipDiffusion,
)
result = audio_out.detach().cpu().numpy()
return result
@ -211,6 +216,10 @@ class DiffusionSVC(VoiceChangerModel):
"key": "defaultTune",
"val": self.settings.tran,
},
{
"key": "dstId",
"val": self.settings.dstId,
},
{
"key": "defaultKstep",
"val": self.settings.kStep,

@ -1,14 +1,14 @@
import os
from const import EnumInferenceTypes
from dataclasses import asdict
import onnxruntime
import json

from data.ModelSlot import DiffusionSVCModelSlot, ModelSlot, RVCModelSlot
from voice_changer.DiffusionSVC.inferencer.diffusion_svc_model.diffusion.unit2mel import load_model_vocoder_from_combo
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator

def get_divisors(n):
divisors = []
for i in range(1, int(n**0.5)+1):
@ -31,6 +31,7 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):
slotInfo.name = os.path.splitext(os.path.basename(slotInfo.modelFile))[0]
# slotInfo.iconFile = "/assets/icons/noimage.png"
slotInfo.embChannels = 768
slotInfo.slotIndex = props.slot

if slotInfo.isONNX:
slotInfo = cls._setInfoByONNX(slotInfo)
@ -40,7 +41,10 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):

@classmethod
def _setInfoByPytorch(cls, slot: DiffusionSVCModelSlot):
diff_model, diff_args, naive_model, naive_args = load_model_vocoder_from_combo(slot.modelFile, device="cpu")
vcparams = VoiceChangerParamsManager.get_instance().params
modelPath = os.path.join(vcparams.model_dir, str(slot.slotIndex), os.path.basename(slot.modelFile))

diff_model, diff_args, naive_model, naive_args = load_model_vocoder_from_combo(modelPath, device="cpu")
slot.kStepMax = diff_args.model.k_step_max
slot.nLayers = diff_args.model.n_layers
slot.nnLayers = naive_args.model.n_layers
@ -52,53 +56,4 @@ class DiffusionSVCModelSlotGenerator(ModelSlotGenerator):

@classmethod
def _setInfoByONNX(cls, slot: ModelSlot):
tmp_onnx_session = onnxruntime.InferenceSession(slot.modelFile, providers=["CPUExecutionProvider"])
modelmeta = tmp_onnx_session.get_modelmeta()
try:
slot = RVCModelSlot(**asdict(slot))
metadata = json.loads(modelmeta.custom_metadata_map["metadata"])

# slot.modelType = metadata["modelType"]
slot.embChannels = metadata["embChannels"]

slot.embOutputLayer = metadata["embOutputLayer"] if "embOutputLayer" in metadata else 9
slot.useFinalProj = metadata["useFinalProj"] if "useFinalProj" in metadata else True if slot.embChannels == 256 else False

if slot.embChannels == 256:
slot.useFinalProj = True
else:
slot.useFinalProj = False

# Show the ONNX model info
if slot.embChannels == 256 and slot.embOutputLayer == 9 and slot.useFinalProj is True:
print("[Voice Changer] ONNX Model: Official v1 like")
elif slot.embChannels == 768 and slot.embOutputLayer == 12 and slot.useFinalProj is False:
print("[Voice Changer] ONNX Model: Official v2 like")
else:
print(f"[Voice Changer] ONNX Model: ch:{slot.embChannels}, L:{slot.embOutputLayer}, FP:{slot.useFinalProj}")

if "embedder" not in metadata:
slot.embedder = "hubert_base"
else:
slot.embedder = metadata["embedder"]

slot.f0 = metadata["f0"]
slot.modelType = EnumInferenceTypes.onnxRVC.value if slot.f0 else EnumInferenceTypes.onnxRVCNono.value
slot.samplingRate = metadata["samplingRate"]
slot.deprecated = False

except Exception as e:
slot.modelType = EnumInferenceTypes.onnxRVC.value
slot.embChannels = 256
slot.embedder = "hubert_base"
slot.f0 = True
slot.samplingRate = 48000
slot.deprecated = True

print("[Voice Changer] setInfoByONNX", e)
print("[Voice Changer] ############## !!!! CAUTION !!!! ####################")
print("[Voice Changer] This onnx file is deprecated. Please regenerate the onnx file.")
print("[Voice Changer] ############## !!!! CAUTION !!!! ####################")

del tmp_onnx_session
return slot
@ -13,6 +13,7 @@ class DiffusionSVCSettings:
|
||||
|
||||
kStep: int = 20
|
||||
speedUp: int = 10
|
||||
skipDiffusion: int = 1 # 0:off, 1:on
|
||||
|
||||
silenceFront: int = 1 # 0:off, 1:on
|
||||
modelSamplingRate: int = 44100
|
||||
@ -29,6 +30,7 @@ class DiffusionSVCSettings:
|
||||
"kStep",
|
||||
"speedUp",
|
||||
"silenceFront",
|
||||
"skipDiffusion",
|
||||
]
|
||||
floatData = ["silentThreshold"]
|
||||
strData = ["f0Detector"]
|
||||
|
@ -112,25 +112,27 @@ class DiffusionSVCInferencer(Inferencer):
        k_step: int,
        infer_speedup: int,
        silence_front: float,
        skip_diffusion: bool = True,
    ) -> torch.Tensor:
        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:
            gt_spec = self.naive_model_call(feats, pitch, volume, spk_id=sid, spk_mix_dict=None, aug_shift=0, spk_emb=None)
            # gt_spec = self.vocoder.extract(audio_t, 16000)
            # gt_spec = torch.cat((gt_spec, gt_spec[:, -1:, :]), 1)

        # print("[ ----Timer::1: ]", t.secs)

        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:
            if skip_diffusion == 0:
                out_mel = self.__call__(feats, pitch, volume, spk_id=sid, spk_mix_dict=None, aug_shift=0, gt_spec=gt_spec, infer_speedup=infer_speedup, method='dpm-solver', k_step=k_step, use_tqdm=False, spk_emb=None)
                gt_spec = out_mel
        # print("[ ----Timer::2: ]", t.secs)
        with Timer("pre-process") as t:  # NOQA

        with Timer("pre-process", False) as t:  # NOQA
            if self.vocoder_onnx is None:
                start_frame = int(silence_front * self.vocoder.vocoder_sample_rate / self.vocoder.vocoder_hop_size)
                out_wav = self.mel2wav(out_mel, pitch, start_frame=start_frame)
                out_wav = self.mel2wav(gt_spec, pitch, start_frame=start_frame)
                out_wav *= mask
            else:
                out_wav = self.vocoder_onnx.infer(out_mel, pitch, silence_front, mask)
                out_wav = self.vocoder_onnx.infer(gt_spec, pitch, silence_front, mask)
        # print("[ ----Timer::3: ]", t.secs)

        return out_wav.squeeze()
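Reviewer note: this commit repeatedly changes Timer("pre-process") to Timer("pre-process", False), so the second argument evidently toggles timing/reporting. A minimal sketch of a Timer compatible with these call sites (an assumption about the repo's own class, which may differ in detail):

import time

class Timer:
    def __init__(self, title: str, enable: bool = True):
        self.title = title
        self.enable = enable        # assumed meaning of the second argument
        self.secs = 0.0

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        if self.enable:
            # record elapsed seconds, as t.secs is read after the with-block
            self.secs = time.perf_counter() - self.start
        return False                # never suppress exceptions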
@ -21,11 +21,16 @@ class Inferencer(Protocol):

    def infer(
        self,
        audio_t: torch.Tensor,
        feats: torch.Tensor,
        pitch_length: torch.Tensor,
        pitch: torch.Tensor | None,
        pitchf: torch.Tensor | None,
        pitch: torch.Tensor,
        volume: torch.Tensor,
        mask: torch.Tensor,
        sid: torch.Tensor,
        k_step: int,
        infer_speedup: int,
        silence_front: float,
        skip_diffusion: bool = True,
    ) -> torch.Tensor:
        ...
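Reviewer note: Inferencer is a typing.Protocol, so implementations satisfy it structurally rather than by inheritance. A simplified sketch (the signature below is deliberately reduced; DummyInferencer is a hypothetical illustration, not repo code):

from typing import Protocol
import torch

class Inferencer(Protocol):
    def infer(self, feats: torch.Tensor, pitch: torch.Tensor) -> torch.Tensor:
        ...

class DummyInferencer:
    # No inheritance needed: a matching infer() signature is enough
    def infer(self, feats: torch.Tensor, pitch: torch.Tensor) -> torch.Tensor:
        return feats  # pass-through; a real model would synthesize audio here

def run(inf: Inferencer, feats: torch.Tensor, pitch: torch.Tensor) -> torch.Tensor:
    return inf.infer(feats, pitch)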
@ -81,23 +81,6 @@ class Pipeline(object):

    @torch.no_grad()
    def extract_volume_and_mask(self, audio: torch.Tensor, threshold: float):
        '''
        with Timer("[VolumeExt np]") as t:
            for i in range(100):
                volume = self.volumeExtractor.extract(audio)
        time_np = t.secs
        with Timer("[VolumeExt pt]") as t:
            for i in range(100):
                volume_t = self.volumeExtractor.extract_t(audio)
        time_pt = t.secs

        print("[Volume np]:", volume)
        print("[Volume pt]:", volume_t)
        print("[Perform]:", time_np, time_pt)
        # -> [Perform]: 0.030178070068359375 0.005780220031738281 (RTX4090)
        # -> [Perform]: 0.029046058654785156 0.0025115013122558594 (CPU i9 13900KF)
        # ---> For work this light, Torch on the CPU may well be faster?
        '''
        volume_t = self.volumeExtractor.extract_t(audio)
        mask = self.volumeExtractor.get_mask_from_volume_t(volume_t, self.inferencer_block_size, threshold=threshold)
        volume = volume_t.unsqueeze(-1).unsqueeze(0)
@ -116,10 +99,11 @@ class Pipeline(object):
        silence_front,
        embOutputLayer,
        useFinalProj,
        protect=0.5
        protect=0.5,
        skip_diffusion=True,
    ):
        # print("---------- pipe line --------------------")
        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:
            audio_t = torch.from_numpy(audio).float().unsqueeze(0).to(self.device)
            audio16k = self.resamplerIn(audio_t)
            volume, mask = self.extract_volume_and_mask(audio16k, threshold=-60.0)
@ -127,7 +111,7 @@ class Pipeline(object):
            n_frames = int(audio16k.size(-1) // self.hop_size + 1)
        # print("[Timer::1: ]", t.secs)

        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:
            # pitch detection
            try:
                # pitch = self.pitchExtractor.extract(
@ -157,7 +141,7 @@ class Pipeline(object):
            feats = feats.view(1, -1)
        # print("[Timer::2: ]", t.secs)

        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:

            # embedding
            with autocast(enabled=self.isHalf):
@ -175,7 +159,7 @@ class Pipeline(object):
            feats = F.interpolate(feats.permute(0, 2, 1), size=int(n_frames), mode='nearest').permute(0, 2, 1)
        # print("[Timer::3: ]", t.secs)

        with Timer("pre-process") as t:
        with Timer("pre-process", False) as t:
            # run the inference
            try:
                with torch.no_grad():
@ -191,7 +175,8 @@ class Pipeline(object):
                        sid,
                        k_step,
                        infer_speedup,
                        silence_front=silence_front
                        silence_front=silence_front,
                        skip_diffusion=skip_diffusion
                    ).to(dtype=torch.float32),
                    -1.0,
                    1.0,
@ -206,7 +191,7 @@ class Pipeline(object):
                raise e
        # print("[Timer::4: ]", t.secs)

        with Timer("pre-process") as t:  # NOQA
        with Timer("pre-process", False) as t:  # NOQA
            feats_buffer = feats.squeeze(0).detach().cpu()
            if pitch is not None:
                pitch_buffer = pitch.squeeze(0).detach().cpu()
@ -7,19 +7,23 @@ from voice_changer.DiffusionSVC.pitchExtractor.PitchExtractorManager import Pitc

from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager

import os
import torch
from torchaudio.transforms import Resample

from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager


def createPipeline(modelSlot: DiffusionSVCModelSlot, gpu: int, f0Detector: str, inputSampleRate: int, outputSampleRate: int):
    dev = DeviceManager.get_instance().getDevice(gpu)
    vcparams = VoiceChangerParamsManager.get_instance().params
    # half = DeviceManager.get_instance().halfPrecisionAvailable(gpu)
    half = False

    # create the Inferencer
    try:
        inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelSlot.modelFile, gpu)
        modelPath = os.path.join(vcparams.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))
        inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelPath, gpu)
    except Exception as e:
        print("[Voice Changer] exception! loading inferencer", e)
        traceback.print_exc()
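Reviewer note: the hunk above is one instance of the path refactor this commit applies across loaders: slot metadata keeps only a file name (or a legacy relative path), and the absolute location is rebuilt from model_dir plus the slot index. A minimal sketch of that pattern, factored out for illustration (the helper name is hypothetical):

import os

def resolve_model_path(model_dir: str, slot_index: int, model_file: str) -> str:
    # os.path.basename also normalizes legacy entries that still embed a directory
    return os.path.join(model_dir, str(slot_index), os.path.basename(model_file))

# e.g. resolve_model_path("model_dir", 3, "some/old/path/model.pth")
# -> "model_dir/3/model.pth"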
@ -20,6 +20,13 @@ AudioDeviceKind: TypeAlias = Literal["input", "output"]

logger = VoiceChangaerLogger.get_instance().getLogger()

# See https://github.com/w-okada/voice-changer/issues/620
LocalServerDeviceMode: TypeAlias = Literal[
    "NoMonitorSeparate",
    "WithMonitorStandard",
    "WithMonitorAllSeparate",
]


@dataclass
class ServerDeviceSettings:
@ -39,6 +46,7 @@ class ServerDeviceSettings:
    serverReadChunkSize: int = 256
    serverInputAudioGain: float = 1.0
    serverOutputAudioGain: float = 1.0
    serverMonitorAudioGain: float = 1.0

    exclusiveMode: bool = False

@ -59,6 +67,7 @@ EditableServerDeviceSettings = {
    "floatData": [
        "serverInputAudioGain",
        "serverOutputAudioGain",
        "serverMonitorAudioGain",
    ],
    "boolData": [
        "exclusiveMode"
@ -95,6 +104,14 @@ class ServerDevice:
        self.monQueue = Queue()
        self.performance = []

        # for detecting setting changes
        self.currentServerInputDeviceId = -1
        self.currentServerOutputDeviceId = -1
        self.currentServerMonitorDeviceId = -1
        self.currentModelSamplingRate = -1
        self.currentInputChunkNum = -1
        self.currentAudioSampleRate = -1

    def getServerInputAudioDevice(self, index: int):
        audioinput, _audiooutput = list_audio_device()
        serverAudioDevice = [x for x in audioinput if x.index == index]
@ -111,36 +128,51 @@ class ServerDevice:
        else:
            return None

    def audio_callback(self, indata: np.ndarray, outdata: np.ndarray, frames, times, status):
        try:
    ###########################################
    # Callback Section
    ###########################################

    def _processData(self, indata: np.ndarray):
        indata = indata * self.settings.serverInputAudioGain
        with Timer("all_inference_time") as t:
            unpackedData = librosa.to_mono(indata.T) * 32768.0
            unpackedData = unpackedData.astype(np.int16)
            out_wav, times = self.serverDeviceCallbacks.on_request(unpackedData)
            outputChannels = outdata.shape[1]
            outdata[:] = np.repeat(out_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
            outdata[:] = outdata * self.settings.serverOutputAudioGain
        return out_wav, times

    def _processDataWithTime(self, indata: np.ndarray):
        with Timer("all_inference_time") as t:
            out_wav, times = self._processData(indata)
        all_inference_time = t.secs
        self.performance = [all_inference_time] + times
        self.serverDeviceCallbacks.emitTo(self.performance)
        self.performance = [round(x * 1000) for x in self.performance]
        return out_wav

    def audio_callback_outQueue(self, indata: np.ndarray, outdata: np.ndarray, frames, times, status):
        try:
            out_wav = self._processDataWithTime(indata)

            self.outQueue.put(out_wav)
            outputChannels = outdata.shape[1]  # output to the monitor
            outdata[:] = np.repeat(out_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
            outdata[:] = outdata * self.settings.serverMonitorAudioGain
        except Exception as e:
            print("[Voice Changer] ex:", e)

    def audioInput_callback(self, indata: np.ndarray, frames, times, status):
    def audioInput_callback_outQueue(self, indata: np.ndarray, frames, times, status):
        try:
            indata = indata * self.settings.serverInputAudioGain
            with Timer("all_inference_time") as t:
                unpackedData = librosa.to_mono(indata.T) * 32768.0
                unpackedData = unpackedData.astype(np.int16)
                out_wav, times = self.serverDeviceCallbacks.on_request(unpackedData)
            out_wav = self._processDataWithTime(indata)
            self.outQueue.put(out_wav)
        except Exception as e:
            print("[Voice Changer][ServerDevice][audioInput_callback] ex:", e)
            # import traceback
            # traceback.print_exc()

    def audioInput_callback_outQueue_monQueue(self, indata: np.ndarray, frames, times, status):
        try:
            out_wav = self._processDataWithTime(indata)
            self.outQueue.put(out_wav)
            self.monQueue.put(out_wav)
            all_inference_time = t.secs
            self.performance = [all_inference_time] + times
            self.serverDeviceCallbacks.emitTo(self.performance)
            self.performance = [round(x * 1000) for x in self.performance]
        except Exception as e:
            print("[Voice Changer][ServerDevice][audioInput_callback] ex:", e)
            # import traceback
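Reviewer note: _processData above downmixes the device block and rescales it before handing it to the changer. The same two lines in isolation, with a fabricated input block (librosa.to_mono averages across the channel axis, which is why the (frames, channels) block is transposed first):

import librosa
import numpy as np

indata = np.random.rand(1024, 2).astype(np.float32) * 0.1   # fake stereo block from the device
mono = librosa.to_mono(indata.T) * 32768.0                  # to_mono expects (channels, frames)
unpacked = mono.astype(np.int16)                            # int16 range expected downstream
print(unpacked.shape)                                       # -> (1024,)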
@ -166,15 +198,144 @@ class ServerDevice:
            self.monQueue.get()
            outputChannels = outdata.shape[1]
            outdata[:] = np.repeat(mon_wav, outputChannels).reshape(-1, outputChannels) / 32768.0
            outdata[:] = outdata * self.settings.serverOutputAudioGain  # reuse the Output gain
            # When monitor mode is enabled, the monitor device's sampling rate takes priority, so no resampling is needed
            outdata[:] = outdata * self.settings.serverMonitorAudioGain
        except Exception as e:
            print("[Voice Changer][ServerDevice][audioMonitor_callback] ex:", e)
            # import traceback
            # traceback.print_exc()

    ###########################################
    # Main Loop Section
    ###########################################
    def checkSettingChanged(self):
        if self.settings.serverAudioStated != 1:
            print(f"serverAudioStarted Changed: {self.settings.serverAudioStated}")
            return True
        elif self.currentServerInputDeviceId != self.settings.serverInputDeviceId:
            print(f"serverInputDeviceId Changed: {self.currentServerInputDeviceId} -> {self.settings.serverInputDeviceId}")
            return True
        elif self.currentServerOutputDeviceId != self.settings.serverOutputDeviceId:
            print(f"serverOutputDeviceId Changed: {self.currentServerOutputDeviceId} -> {self.settings.serverOutputDeviceId}")
            return True
        elif self.currentServerMonitorDeviceId != self.settings.serverMonitorDeviceId:
            print(f"serverMonitorDeviceId Changed: {self.currentServerMonitorDeviceId} -> {self.settings.serverMonitorDeviceId}")
            return True
        elif self.currentModelSamplingRate != self.serverDeviceCallbacks.get_processing_sampling_rate():
            print(f"currentModelSamplingRate Changed: {self.currentModelSamplingRate} -> {self.serverDeviceCallbacks.get_processing_sampling_rate()}")
            return True
        elif self.currentInputChunkNum != self.settings.serverReadChunkSize:
            print(f"currentInputChunkNum Changed: {self.currentInputChunkNum} -> {self.settings.serverReadChunkSize}")
            return True
        elif self.currentAudioSampleRate != self.settings.serverAudioSampleRate:
            print(f"currentAudioSampleRate Changed: {self.currentAudioSampleRate} -> {self.settings.serverAudioSampleRate}")
            return True
        else:
            return False

    def runNoMonitorSeparate(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, inputExtraSetting, outputExtraSetting):
        with sd.InputStream(
            callback=self.audioInput_callback_outQueue,
            dtype="float32",
            device=self.settings.serverInputDeviceId,
            blocksize=block_frame,
            samplerate=self.settings.serverInputAudioSampleRate,
            channels=inputMaxChannel,
            extra_settings=inputExtraSetting
        ):
            with sd.OutputStream(
                callback=self.audioOutput_callback,
                dtype="float32",
                device=self.settings.serverOutputDeviceId,
                blocksize=block_frame,
                samplerate=self.settings.serverOutputAudioSampleRate,
                channels=outputMaxChannel,
                extra_settings=outputExtraSetting
            ):
                while True:
                    changed = self.checkSettingChanged()
                    if changed:
                        break
                    time.sleep(2)
                    print(f"[Voice Changer] server audio performance {self.performance}")
                    print(f"  status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
                    print(f"  input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
                    print(f"  output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
                    # print(f"  monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{self.serverMonitorAudioDevice.maxOutputChannels}")

    def runWithMonitorStandard(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, monitorMaxChannel: int, inputExtraSetting, outputExtraSetting, monitorExtraSetting):
        with sd.Stream(
            callback=self.audio_callback_outQueue,
            dtype="float32",
            device=(self.settings.serverInputDeviceId, self.settings.serverMonitorDeviceId),
            blocksize=block_frame,
            samplerate=self.settings.serverInputAudioSampleRate,
            channels=(inputMaxChannel, monitorMaxChannel),
            extra_settings=[inputExtraSetting, monitorExtraSetting]
        ):
            with sd.OutputStream(
                callback=self.audioOutput_callback,
                dtype="float32",
                device=self.settings.serverOutputDeviceId,
                blocksize=block_frame,
                samplerate=self.settings.serverOutputAudioSampleRate,
                channels=outputMaxChannel,
                extra_settings=outputExtraSetting
            ):
                while True:
                    changed = self.checkSettingChanged()
                    if changed:
                        break
                    time.sleep(2)
                    print(f"[Voice Changer] server audio performance {self.performance}")
                    print(f"  status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
                    print(f"  input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
                    print(f"  output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
                    print(f"  monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{monitorMaxChannel}")

    def runWithMonitorAllSeparate(self, block_frame: int, inputMaxChannel: int, outputMaxChannel: int, monitorMaxChannel: int, inputExtraSetting, outputExtraSetting, monitorExtraSetting):
        with sd.InputStream(
            callback=self.audioInput_callback_outQueue_monQueue,
            dtype="float32",
            device=self.settings.serverInputDeviceId,
            blocksize=block_frame,
            samplerate=self.settings.serverInputAudioSampleRate,
            channels=inputMaxChannel,
            extra_settings=inputExtraSetting
        ):
            with sd.OutputStream(
                callback=self.audioOutput_callback,
                dtype="float32",
                device=self.settings.serverOutputDeviceId,
                blocksize=block_frame,
                samplerate=self.settings.serverOutputAudioSampleRate,
                channels=outputMaxChannel,
                extra_settings=outputExtraSetting
            ):
                with sd.OutputStream(
                    callback=self.audioMonitor_callback,
                    dtype="float32",
                    device=self.settings.serverMonitorDeviceId,
                    blocksize=block_frame,
                    samplerate=self.settings.serverMonitorAudioSampleRate,
                    channels=monitorMaxChannel,
                    extra_settings=monitorExtraSetting
                ):
                    while True:
                        changed = self.checkSettingChanged()
                        if changed:
                            break
                        time.sleep(2)
                        print(f"[Voice Changer] server audio performance {self.performance}")
                        print(f"  status: started:{self.settings.serverAudioStated}, model_sr:{self.currentModelSamplingRate}, chunk:{self.currentInputChunkNum}")
                        print(f"  input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{inputMaxChannel}")
                        print(f"  output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{outputMaxChannel}")
                        print(f"  monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{monitorMaxChannel}")

    ###########################################
    # Start Section
    ###########################################
    def start(self):
        currentModelSamplingRate = -1
        self.currentModelSamplingRate = -1
        while True:
            if self.settings.serverAudioStated == 0 or self.settings.serverInputDeviceId == -1:
                time.sleep(2)
@ -183,9 +344,9 @@ class ServerDevice:
                sd._initialize()

                # Current Device ID
                currentServerInputDeviceId = self.settings.serverInputDeviceId
                currentServerOutputDeviceId = self.settings.serverOutputDeviceId
                currentServerMonitorDeviceId = self.settings.serverMonitorDeviceId
                self.currentServerInputDeviceId = self.settings.serverInputDeviceId
                self.currentServerOutputDeviceId = self.settings.serverOutputDeviceId
                self.currentServerMonitorDeviceId = self.settings.serverMonitorDeviceId

                # identify the devices
                serverInputAudioDevice = self.getServerInputAudioDevice(self.settings.serverInputDeviceId)
@ -220,17 +381,17 @@ class ServerDevice:

                # Sampling rate
                # Unify everything to one sampling rate (samples can run short during conversion; if a padding scheme becomes clear, they could be set individually)
                currentAudioSampleRate = self.settings.serverAudioSampleRate
                self.currentAudioSampleRate = self.settings.serverAudioSampleRate
                try:
                    currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
                    self.currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
                except Exception as e:
                    print("[Voice Changer] ex: get_processing_sampling_rate", e)
                    time.sleep(2)
                    continue

                self.settings.serverInputAudioSampleRate = currentAudioSampleRate
                self.settings.serverOutputAudioSampleRate = currentAudioSampleRate
                self.settings.serverMonitorAudioSampleRate = currentAudioSampleRate
                self.settings.serverInputAudioSampleRate = self.currentAudioSampleRate
                self.settings.serverOutputAudioSampleRate = self.currentAudioSampleRate
                self.settings.serverMonitorAudioSampleRate = self.currentAudioSampleRate

                # Sample Rate Check
                inputAudioSampleRateAvailable = checkSamplingRate(self.settings.serverInputDeviceId, self.settings.serverInputAudioSampleRate, "input")
@ -238,7 +399,7 @@ class ServerDevice:
                monitorAudioSampleRateAvailable = checkSamplingRate(self.settings.serverMonitorDeviceId, self.settings.serverMonitorAudioSampleRate, "output") if serverMonitorAudioDevice else True

                print("Sample Rate:")
                print(f"  [Model]: {currentModelSamplingRate}")
                print(f"  [Model]: {self.currentModelSamplingRate}")
                print(f"  [Input]: {self.settings.serverInputAudioSampleRate} -> {inputAudioSampleRateAvailable}")
                print(f"  [Output]: {self.settings.serverOutputAudioSampleRate} -> {outputAudioSampleRateAvailable}")
                if serverMonitorAudioDevice is not None:
@ -274,153 +435,51 @@ class ServerDevice:
                self.serverDeviceCallbacks.setOutputSamplingRate(self.settings.serverOutputAudioSampleRate)

                # compute the block size
                currentInputChunkNum = self.settings.serverReadChunkSize
                self.currentInputChunkNum = self.settings.serverReadChunkSize
                # block_frame = currentInputChunkNum * 128
                block_frame = int(currentInputChunkNum * 128 * (self.settings.serverInputAudioSampleRate / 48000))
                block_frame = int(self.currentInputChunkNum * 128 * (self.settings.serverInputAudioSampleRate / 48000))

                sd.default.blocksize = block_frame

                # main loop
                try:
                    with sd.InputStream(
                        callback=self.audioInput_callback,
                        dtype="float32",
                        device=self.settings.serverInputDeviceId,
                        blocksize=block_frame,
                        samplerate=self.settings.serverInputAudioSampleRate,
                        channels=serverInputAudioDevice.maxInputChannels,
                        extra_settings=inputExtraSetting
                    ):
                        with sd.OutputStream(
                            callback=self.audioOutput_callback,
                            dtype="float32",
                            device=self.settings.serverOutputDeviceId,
                            blocksize=block_frame,
                            samplerate=self.settings.serverOutputAudioSampleRate,
                            channels=serverOutputAudioDevice.maxOutputChannels,
                            extra_settings=outputExtraSetting
                        ):
                            if self.settings.serverMonitorDeviceId != -1:
                                with sd.OutputStream(
                                    callback=self.audioMonitor_callback,
                                    dtype="float32",
                                    device=self.settings.serverMonitorDeviceId,
                                    blocksize=block_frame,
                                    samplerate=self.settings.serverMonitorAudioSampleRate,
                                    channels=serverMonitorAudioDevice.maxOutputChannels,
                                    extra_settings=monitorExtraSetting
                                ):
                                    while (
                                        self.settings.serverAudioStated == 1 and
                                        currentServerInputDeviceId == self.settings.serverInputDeviceId and
                                        currentServerOutputDeviceId == self.settings.serverOutputDeviceId and
                                        currentServerMonitorDeviceId == self.settings.serverMonitorDeviceId and
                                        currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and
                                        currentInputChunkNum == self.settings.serverReadChunkSize and
                                        currentAudioSampleRate == self.settings.serverAudioSampleRate
                                    ):
                                        time.sleep(2)
                                        print(f"[Voice Changer] server audio performance {self.performance}")
                                        print(f"  status: started:{self.settings.serverAudioStated}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}")
                                        print(f"  input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{serverInputAudioDevice.maxInputChannels}")
                                        print(f"  output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{serverOutputAudioDevice.maxOutputChannels}")
                                        print(f"  monitor: id:{self.settings.serverMonitorDeviceId}, sr:{self.settings.serverMonitorAudioSampleRate}, ch:{serverMonitorAudioDevice.maxOutputChannels}")
                    # See https://github.com/w-okada/voice-changer/issues/620
                    def judgeServerDeviceMode() -> LocalServerDeviceMode:
                        if self.settings.serverMonitorDeviceId == -1:
                            return "NoMonitorSeparate"
                        else:
                            while (
                                self.settings.serverAudioStated == 1 and
                                currentServerInputDeviceId == self.settings.serverInputDeviceId and
                                currentServerOutputDeviceId == self.settings.serverOutputDeviceId and
                                currentServerMonitorDeviceId == self.settings.serverMonitorDeviceId and
                                currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and
                                currentInputChunkNum == self.settings.serverReadChunkSize and
                                currentAudioSampleRate == self.settings.serverAudioSampleRate
                            ):
                                time.sleep(2)
                                print(f"[Voice Changer] server audio performance {self.performance}")
                                print(f"  status: started:{self.settings.serverAudioStated}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}")
                                print(f"  input : id:{self.settings.serverInputDeviceId}, sr:{self.settings.serverInputAudioSampleRate}, ch:{serverInputAudioDevice.maxInputChannels}")
                                print(f"  output : id:{self.settings.serverOutputDeviceId}, sr:{self.settings.serverOutputAudioSampleRate}, ch:{serverOutputAudioDevice.maxOutputChannels}")
                            if serverInputAudioDevice.hostAPI == serverOutputAudioDevice.hostAPI and serverInputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI:  # all the same
                                return "WithMonitorStandard"
                            elif serverInputAudioDevice.hostAPI != serverOutputAudioDevice.hostAPI and serverInputAudioDevice.hostAPI != serverMonitorAudioDevice.hostAPI and serverOutputAudioDevice.hostAPI != serverMonitorAudioDevice.hostAPI:  # all different
                                return "WithMonitorAllSeparate"
                            elif serverInputAudioDevice.hostAPI == serverOutputAudioDevice.hostAPI:  # only in/out match
                                return "WithMonitorAllSeparate"
                            elif serverInputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI:  # only in/mon match
                                return "WithMonitorStandard"
                            elif serverOutputAudioDevice.hostAPI == serverMonitorAudioDevice.hostAPI:  # only out/mon match
                                return "WithMonitorAllSeparate"
                            else:
                                raise RuntimeError(f"Cannot JudgeServerMode, in:{serverInputAudioDevice.hostAPI}, mon:{serverMonitorAudioDevice.hostAPI}, out:{serverOutputAudioDevice.hostAPI}")

                    serverDeviceMode = judgeServerDeviceMode()
                    if serverDeviceMode == "NoMonitorSeparate":
                        self.runNoMonitorSeparate(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting)
                    elif serverDeviceMode == "WithMonitorStandard":
                        self.runWithMonitorStandard(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, serverMonitorAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting, monitorExtraSetting)
                    elif serverDeviceMode == "WithMonitorAllSeparate":
                        self.runWithMonitorAllSeparate(block_frame, serverInputAudioDevice.maxInputChannels, serverOutputAudioDevice.maxOutputChannels, serverMonitorAudioDevice.maxOutputChannels, inputExtraSetting, outputExtraSetting, monitorExtraSetting)
                    else:
                        raise RuntimeError(f"Unknown ServerDeviceMode: {serverDeviceMode}")

                except Exception as e:
                    print("[Voice Changer] processing, ex:", e)
                    import traceback
                    traceback.print_exc()
                    time.sleep(2)
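Reviewer note: the main loop above sizes each stream block as serverReadChunkSize chunks of 128 frames, rescaled from the 48 kHz reference rate to the device rate. A quick worked example of that arithmetic (the concrete values are illustrative):

chunk = 256                      # serverReadChunkSize
sr = 44100                       # serverInputAudioSampleRate
block_frame = int(chunk * 128 * (sr / 48000))
print(block_frame)               # -> 30105 frames, i.e. about 0.68 s at 44.1 kHz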
    def start2(self):
        # currentInputDeviceId = -1
        # currentOutputDeviceId = -1
        # currentInputChunkNum = -1
        currentModelSamplingRate = -1
        while True:
            if self.settings.serverAudioStated == 0 or self.settings.serverInputDeviceId == -1:
                time.sleep(2)
            else:
                sd._terminate()
                sd._initialize()

                sd.default.device[0] = self.settings.serverInputDeviceId
                sd.default.device[1] = self.settings.serverOutputDeviceId

                serverInputAudioDevice = self.getServerInputAudioDevice(sd.default.device[0])
                serverOutputAudioDevice = self.getServerOutputAudioDevice(sd.default.device[1])
                print("Devices:", serverInputAudioDevice, serverOutputAudioDevice)
                if serverInputAudioDevice is None or serverOutputAudioDevice is None:
                    time.sleep(2)
                    print("serverInputAudioDevice or serverOutputAudioDevice is None")
                    continue

                sd.default.channels[0] = serverInputAudioDevice.maxInputChannels
                sd.default.channels[1] = serverOutputAudioDevice.maxOutputChannels

                currentInputChunkNum = self.settings.serverReadChunkSize
                block_frame = currentInputChunkNum * 128

                # sample rate precheck (alsa cannot use 40000?)
                try:
                    currentModelSamplingRate = self.serverDeviceCallbacks.get_processing_sampling_rate()
                except Exception as e:
                    print("[Voice Changer] ex: get_processing_sampling_rate", e)
                    continue
                try:
                    with sd.Stream(
                        callback=self.audio_callback,
                        blocksize=block_frame,
                        # samplerate=currentModelSamplingRate,
                        dtype="float32",
                        # dtype="int16",
                        # channels=[currentInputChannelNum, currentOutputChannelNum],
                    ):
                        pass
                    self.settings.serverInputAudioSampleRate = currentModelSamplingRate
                    self.serverDeviceCallbacks.setInputSamplingRate(currentModelSamplingRate)
                    self.serverDeviceCallbacks.setOutputSamplingRate(currentModelSamplingRate)
                    print(f"[Voice Changer] sample rate {self.settings.serverInputAudioSampleRate}")
                except Exception as e:
                    print("[Voice Changer] ex: fallback to device default samplerate", e)
                    print("[Voice Changer] device default samplerate", serverInputAudioDevice.default_samplerate)
                    self.settings.serverInputAudioSampleRate = round(serverInputAudioDevice.default_samplerate)
                    self.serverDeviceCallbacks.setInputSamplingRate(round(serverInputAudioDevice.default_samplerate))
                    self.serverDeviceCallbacks.setOutputSamplingRate(round(serverInputAudioDevice.default_samplerate))

                sd.default.samplerate = self.settings.serverInputAudioSampleRate
                sd.default.blocksize = block_frame
                # main loop
                try:
                    with sd.Stream(
                        callback=self.audio_callback,
                        # blocksize=block_frame,
                        # samplerate=vc.settings.serverInputAudioSampleRate,
                        dtype="float32",
                        # dtype="int16",
                        # channels=[currentInputChannelNum, currentOutputChannelNum],
                    ):
                        while self.settings.serverAudioStated == 1 and sd.default.device[0] == self.settings.serverInputDeviceId and sd.default.device[1] == self.settings.serverOutputDeviceId and currentModelSamplingRate == self.serverDeviceCallbacks.get_processing_sampling_rate() and currentInputChunkNum == self.settings.serverReadChunkSize:
                            time.sleep(2)
                            print("[Voice Changer] server audio", self.performance)
                            print(f"[Voice Changer] started:{self.settings.serverAudioStated}, input:{sd.default.device[0]}, output:{sd.default.device[1]}, mic_sr:{self.settings.serverInputAudioSampleRate}, model_sr:{currentModelSamplingRate}, chunk:{currentInputChunkNum}, ch:[{sd.default.channels}]")

                except Exception as e:
                    print("[Voice Changer] ex:", e)
                    time.sleep(2)

    ###########################################
    # Info Section
    ###########################################
    def get_info(self):
        data = asdict(self.settings)
        try:
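Reviewer note: start2 above falls back to the input device's default sample rate when the model rate is rejected. A minimal sketch of probing that default with the sounddevice API (the device index 0 here is arbitrary):

import sounddevice as sd

info = sd.query_devices(0)                  # device info dict for an arbitrary device
print(round(info["default_samplerate"]))    # e.g. 48000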
@ -1,6 +1,7 @@
import sys
import os
from data.ModelSlot import MMVCv13ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager

from voice_changer.utils.VoiceChangerModel import AudioInOut

@ -63,19 +64,22 @@ class MMVCv13:

    def initialize(self):
        print("[Voice Changer] [MMVCv13] Initializing... ")
        vcparams = VoiceChangerParamsManager.get_instance().params
        configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
        modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)

        self.hps = get_hparams_from_file(self.slotInfo.configFile)
        self.hps = get_hparams_from_file(configPath)
        if self.slotInfo.isONNX:
            providers, options = self.getOnnxExecutionProvider()
            self.onnx_session = onnxruntime.InferenceSession(
                self.slotInfo.modelFile,
                modelPath,
                providers=providers,
                provider_options=options,
            )
        else:
            self.net_g = SynthesizerTrn(len(symbols), self.hps.data.filter_length // 2 + 1, self.hps.train.segment_size // self.hps.data.hop_length, n_speakers=self.hps.data.n_speakers, **self.hps.model)
            self.net_g.eval()
            load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
            load_checkpoint(modelPath, self.net_g, None)

        # other settings
        self.settings.srcId = self.slotInfo.srcId
@ -105,8 +109,10 @@ class MMVCv13:

        if key == "gpu" and self.slotInfo.isONNX:
            providers, options = self.getOnnxExecutionProvider()
            vcparams = VoiceChangerParamsManager.get_instance().params
            modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
            self.onnx_session = onnxruntime.InferenceSession(
                self.slotInfo.modelFile,
                modelPath,
                providers=providers,
                provider_options=options,
            )
@ -249,3 +255,15 @@ class MMVCv13:
                sys.modules.pop(key)
            except:  # NOQA
                pass

    def get_model_current(self):
        return [
            {
                "key": "srcId",
                "val": self.settings.srcId,
            },
            {
                "key": "dstId",
                "val": self.settings.dstId,
            }
        ]
@ -1,6 +1,7 @@
import sys
import os
from data.ModelSlot import MMVCv15ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerModel import AudioInOut

if sys.platform.startswith("darwin"):
@ -70,7 +71,11 @@ class MMVCv15:

    def initialize(self):
        print("[Voice Changer] [MMVCv15] Initializing... ")
        self.hps = get_hparams_from_file(self.slotInfo.configFile)
        vcparams = VoiceChangerParamsManager.get_instance().params
        configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
        modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)

        self.hps = get_hparams_from_file(configPath)

        self.net_g = SynthesizerTrn(
            spec_channels=self.hps.data.filter_length // 2 + 1,
@ -96,7 +101,7 @@ class MMVCv15:
            self.onxx_input_length = 8192
            providers, options = self.getOnnxExecutionProvider()
            self.onnx_session = onnxruntime.InferenceSession(
                self.slotInfo.modelFile,
                modelPath,
                providers=providers,
                provider_options=options,
            )
@ -108,7 +113,7 @@ class MMVCv15:
            self.settings.maxInputLength = self.onxx_input_length - (0.012 * self.hps.data.sampling_rate) - 1024  # for onnx the input length is fixed (the 1024 crossfade is provisional)  # NOQA
        else:
            self.net_g.eval()
            load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
            load_checkpoint(modelPath, self.net_g, None)

        # other settings
        self.settings.srcId = self.slotInfo.srcId
@ -139,8 +144,10 @@ class MMVCv15:
            setattr(self.settings, key, val)
            if key == "gpu" and self.slotInfo.isONNX:
                providers, options = self.getOnnxExecutionProvider()
                vcparams = VoiceChangerParamsManager.get_instance().params
                modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
                self.onnx_session = onnxruntime.InferenceSession(
                    self.slotInfo.modelFile,
                    modelPath,
                    providers=providers,
                    provider_options=options,
                )
@ -208,7 +215,8 @@ class MMVCv15:
        solaSearchFrame: int = 0,
    ):
        # update maxInputLength (inefficient to do here, but fine for now)
        self.settings.maxInputLength = self.onxx_input_length - crossfadeSize - solaSearchFrame  # for onnx the input length is fixed (the 1024 crossfade is provisional)  # NOQA
        if self.slotInfo.isONNX:
            self.settings.maxInputLength = self.onxx_input_length - crossfadeSize - solaSearchFrame  # for onnx the input length is fixed (the 1024 crossfade is provisional)  # NOQA value returned by get_info; not used inside this function.

        newData = newData.astype(np.float32) / self.hps.data.max_wav_value
@ -310,3 +318,19 @@ class MMVCv15:
                sys.modules.pop(key)
            except:  # NOQA
                pass

    def get_model_current(self):
        return [
            {
                "key": "srcId",
                "val": self.settings.srcId,
            },
            {
                "key": "dstId",
                "val": self.settings.dstId,
            },
            {
                "key": "f0Factor",
                "val": self.settings.f0Factor,
            }
        ]
@ -1,6 +1,7 @@
import os

from data.ModelSlot import MMVCv15ModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator

@ -15,7 +16,9 @@ class MMVCv15ModelSlotGenerator(ModelSlotGenerator):
            elif file.kind == "mmvcv15Config":
                slotInfo.configFile = file.name
            elif file.kind == "mmvcv15Correspondence":
                with open(file.name, "r") as f:
                vcparams = VoiceChangerParamsManager.get_instance().params
                filePath = os.path.join(vcparams.model_dir, str(props.slot), file.name)
                with open(filePath, "r") as f:
                    slotInfo.speakers = {}
                    while True:
                        line = f.readline()
@ -4,17 +4,17 @@ import torch
from const import UPLOAD_DIR
from voice_changer.RVC.modelMerger.MergeModel import merge_model
from voice_changer.utils.ModelMerger import ModelMerger, ModelMergerRequest
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams


class RVCModelMerger(ModelMerger):
    @classmethod
    def merge_models(cls, request: ModelMergerRequest, storeSlot: int):
        print("[Voice Changer] MergeRequest:", request)
        merged = merge_model(request)
    def merge_models(cls, params: VoiceChangerParams, request: ModelMergerRequest, storeSlot: int):
        merged = merge_model(params, request)

        # For now, store it in the upload folder (historical reasons).
        # A subsequent loadmodel call moves it into the persistent model folder.
        storeDir = os.path.join(UPLOAD_DIR, f"{storeSlot}")
        storeDir = os.path.join(UPLOAD_DIR)
        print("[Voice Changer] store merged model to:", storeDir)
        os.makedirs(storeDir, exist_ok=True)
        storeFile = os.path.join(storeDir, "merged.pth")
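Reviewer note: merge_models above delegates the actual blending to merge_model, which interpolates checkpoints by per-file strength. A generic sketch of strength-weighted state-dict averaging (illustrative only; the repo's merge_model handles extraction and metadata on top of this):

import torch

def merge_state_dicts(state_dicts, alphas):
    # Normalize strengths so the blend is a convex combination
    total = sum(alphas)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(sd[key].float() * (a / total) for sd, a in zip(state_dicts, alphas))
    return merged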
@ -5,7 +5,8 @@ import torch
import onnxruntime
import json

from data.ModelSlot import ModelSlot, RVCModelSlot
from data.ModelSlot import RVCModelSlot
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.LoadModelParams import LoadModelParams
from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator

@ -13,6 +14,7 @@ from voice_changer.utils.ModelSlotGenerator import ModelSlotGenerator
class RVCModelSlotGenerator(ModelSlotGenerator):
    @classmethod
    def loadModel(cls, props: LoadModelParams):
        vcparams = VoiceChangerParamsManager.get_instance().params
        slotInfo: RVCModelSlot = RVCModelSlot()
        for file in props.files:
            if file.kind == "rvcModel":
@ -24,17 +26,20 @@ class RVCModelSlotGenerator(ModelSlotGenerator):
        slotInfo.defaultProtect = 0.5
        slotInfo.isONNX = slotInfo.modelFile.endswith(".onnx")
        slotInfo.name = os.path.splitext(os.path.basename(slotInfo.modelFile))[0]
        print("RVC:: slotInfo.modelFile", slotInfo.modelFile)

        # slotInfo.iconFile = "/assets/icons/noimage.png"

        modelPath = os.path.join(vcparams.model_dir, str(props.slot), os.path.basename(slotInfo.modelFile))
        if slotInfo.isONNX:
            slotInfo = cls._setInfoByONNX(slotInfo)
            slotInfo = cls._setInfoByONNX(modelPath, slotInfo)
        else:
            slotInfo = cls._setInfoByPytorch(slotInfo)
            slotInfo = cls._setInfoByPytorch(modelPath, slotInfo)
        return slotInfo

    @classmethod
    def _setInfoByPytorch(cls, slot: ModelSlot):
        cpt = torch.load(slot.modelFile, map_location="cpu")
    def _setInfoByPytorch(cls, modelPath: str, slot: RVCModelSlot):
        cpt = torch.load(modelPath, map_location="cpu")
        config_len = len(cpt["config"])
        version = cpt.get("version", "v1")

@ -113,8 +118,8 @@ class RVCModelSlotGenerator(ModelSlotGenerator):
        return slot

    @classmethod
    def _setInfoByONNX(cls, slot: ModelSlot):
        tmp_onnx_session = onnxruntime.InferenceSession(slot.modelFile, providers=["CPUExecutionProvider"])
    def _setInfoByONNX(cls, modelPath: str, slot: RVCModelSlot):
        tmp_onnx_session = onnxruntime.InferenceSession(modelPath, providers=["CPUExecutionProvider"])
        modelmeta = tmp_onnx_session.get_modelmeta()
        try:
            slot = RVCModelSlot(**asdict(slot))
289
server/voice_changer/RVC/RVCr2.py
Normal file
@ -0,0 +1,289 @@
'''
For VoiceChangerV2
'''
from dataclasses import asdict
import numpy as np
import torch
from data.ModelSlot import RVCModelSlot
from mods.log_control import VoiceChangaerLogger

from voice_changer.RVC.RVCSettings import RVCSettings
from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
from voice_changer.utils.VoiceChangerModel import AudioInOut, PitchfInOut, FeatureInOut, VoiceChangerModel
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
from voice_changer.RVC.onnxExporter.export2onnx import export2onnx
from voice_changer.RVC.pitchExtractor.PitchExtractorManager import PitchExtractorManager
from voice_changer.RVC.pipeline.PipelineGenerator import createPipeline
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.pipeline.Pipeline import Pipeline

from Exceptions import DeviceCannotSupportHalfPrecisionException, PipelineCreateException, PipelineNotInitializedException
import resampy
from typing import cast

logger = VoiceChangaerLogger.get_instance().getLogger()


class RVCr2(VoiceChangerModel):
    def __init__(self, params: VoiceChangerParams, slotInfo: RVCModelSlot):
        logger.info("[Voice Changer] [RVCr2] Creating instance ")
        self.deviceManager = DeviceManager.get_instance()
        EmbedderManager.initialize(params)
        PitchExtractorManager.initialize(params)
        self.settings = RVCSettings()
        self.params = params
        # self.pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)

        self.pipeline: Pipeline | None = None

        self.audio_buffer: AudioInOut | None = None
        self.pitchf_buffer: PitchfInOut | None = None
        self.feature_buffer: FeatureInOut | None = None
        self.prevVol = 0.0
        self.slotInfo = slotInfo
        # self.initialize()

    def initialize(self):
        logger.info("[Voice Changer][RVCr2] Initializing... ")

        # create the pipeline
        try:
            self.pipeline = createPipeline(self.params, self.slotInfo, self.settings.gpu, self.settings.f0Detector)
        except PipelineCreateException as e:  # NOQA
            logger.error("[Voice Changer] pipeline create failed. check your model is valid.")
            return

        # other settings
        self.settings.tran = self.slotInfo.defaultTune
        self.settings.indexRatio = self.slotInfo.defaultIndexRatio
        self.settings.protect = self.slotInfo.defaultProtect
        logger.info("[Voice Changer] [RVC] Initializing... done")

    def setSamplingRate(self, inputSampleRate, outputSampleRate):
        self.inputSampleRate = inputSampleRate
        self.outputSampleRate = outputSampleRate
        self.initialize()

    def update_settings(self, key: str, val: int | float | str):
        logger.info(f"[Voice Changer][RVC]: update_settings {key}:{val}")
        if key in self.settings.intData:
            setattr(self.settings, key, int(val))
            if key == "gpu":
                self.deviceManager.setForceTensor(False)
                self.initialize()
        elif key in self.settings.floatData:
            setattr(self.settings, key, float(val))
        elif key in self.settings.strData:
            setattr(self.settings, key, str(val))
            if key == "f0Detector" and self.pipeline is not None:
                pitchExtractor = PitchExtractorManager.getPitchExtractor(self.settings.f0Detector, self.settings.gpu)
                self.pipeline.setPitchExtractor(pitchExtractor)
        else:
            return False
        return True

    def get_info(self):
        data = asdict(self.settings)
        if self.pipeline is not None:
            pipelineInfo = self.pipeline.getPipelineInfo()
            data["pipelineInfo"] = pipelineInfo
        else:
            data["pipelineInfo"] = "None"
        return data

    def get_processing_sampling_rate(self):
        return self.slotInfo.samplingRate

    def generate_input(
        self,
        newData: AudioInOut,
        crossfadeSize: int,
        solaSearchFrame: int,
        extra_frame: int
    ):
        # input arrives at 16k.
        inputSize = newData.shape[0]
        newData = newData.astype(np.float32) / 32768.0
        newFeatureLength = inputSize // 160  # hopsize:=160

        if self.audio_buffer is not None:
            # concatenate with past data
            self.audio_buffer = np.concatenate([self.audio_buffer, newData], 0)
            if self.slotInfo.f0:
                self.pitchf_buffer = np.concatenate([self.pitchf_buffer, np.zeros(newFeatureLength)], 0)
            self.feature_buffer = np.concatenate([self.feature_buffer, np.zeros([newFeatureLength, self.slotInfo.embChannels])], 0)
        else:
            self.audio_buffer = newData
            if self.slotInfo.f0:
                self.pitchf_buffer = np.zeros(newFeatureLength)
            self.feature_buffer = np.zeros([newFeatureLength, self.slotInfo.embChannels])

        convertSize = inputSize + crossfadeSize + solaSearchFrame + extra_frame

        if convertSize % 160 != 0:  # pad to avoid truncation at the model's output hop size
            convertSize = convertSize + (160 - (convertSize % 160))
        outSize = int(((convertSize - extra_frame) / 16000) * self.slotInfo.samplingRate)

        # pad with zeros if the buffer has not filled yet
        if self.audio_buffer.shape[0] < convertSize:
            self.audio_buffer = np.concatenate([np.zeros([convertSize]), self.audio_buffer])
            if self.slotInfo.f0:
                self.pitchf_buffer = np.concatenate([np.zeros([convertSize // 160]), self.pitchf_buffer])
            self.feature_buffer = np.concatenate([np.zeros([convertSize // 160, self.slotInfo.embChannels]), self.feature_buffer])

        # trim the unneeded portion
        convertOffset = -1 * convertSize
        featureOffset = convertOffset // 160
        self.audio_buffer = self.audio_buffer[convertOffset:]  # keep only the portion to convert
        if self.slotInfo.f0:
            self.pitchf_buffer = self.pitchf_buffer[featureOffset:]
        self.feature_buffer = self.feature_buffer[featureOffset:]

        # crop just the output portion and check its volume (TODO: make muting gradual)
        cropOffset = -1 * (inputSize + crossfadeSize)
        cropEnd = -1 * (crossfadeSize)
        crop = self.audio_buffer[cropOffset:cropEnd]
        vol = np.sqrt(np.square(crop).mean())
        vol = max(vol, self.prevVol * 0.0)
        self.prevVol = vol

        return (self.audio_buffer, self.pitchf_buffer, self.feature_buffer, convertSize, vol, outSize)
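Reviewer note: a quick worked example of the size bookkeeping in generate_input() above (the concrete frame counts and the 40000 Hz model rate are illustrative):

inputSize, crossfadeSize, solaSearchFrame, extra_frame = 4096, 2048, 1024, 8000
convertSize = inputSize + crossfadeSize + solaSearchFrame + extra_frame  # 15168
if convertSize % 160 != 0:            # round up to the 160-sample hop size
    convertSize = convertSize + (160 - (convertSize % 160))
print(convertSize)                    # -> 15200
outSize = int(((convertSize - extra_frame) / 16000) * 40000)
print(outSize)                        # -> 18000 samples at the model rate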
    def inference(self, receivedData: AudioInOut, crossfade_frame: int, sola_search_frame: int):
        if self.pipeline is None:
            logger.info("[Voice Changer] Pipeline is not initialized.")
            raise PipelineNotInitializedException()

        # processing runs at 16K (pitch, embed, (infer))
        receivedData = cast(
            AudioInOut,
            resampy.resample(
                receivedData,
                self.inputSampleRate,
                16000,
            ),
        )
        crossfade_frame = int((crossfade_frame / self.inputSampleRate) * 16000)
        sola_search_frame = int((sola_search_frame / self.inputSampleRate) * 16000)
        extra_frame = int((self.settings.extraConvertSize / self.inputSampleRate) * 16000)

        # generate the input data
        data = self.generate_input(receivedData, crossfade_frame, sola_search_frame, extra_frame)

        audio = data[0]
        pitchf = data[1]
        feature = data[2]
        convertSize = data[3]
        vol = data[4]
        outSize = data[5]

        if vol < self.settings.silentThreshold:
            return np.zeros(convertSize).astype(np.int16) * np.sqrt(vol)

        device = self.pipeline.device

        audio = torch.from_numpy(audio).to(device=device, dtype=torch.float32)
        repeat = 1 if self.settings.rvcQuality else 0
        sid = self.settings.dstId
        f0_up_key = self.settings.tran
        index_rate = self.settings.indexRatio
        protect = self.settings.protect

        if_f0 = 1 if self.slotInfo.f0 else 0
        embOutputLayer = self.slotInfo.embOutputLayer
        useFinalProj = self.slotInfo.useFinalProj

        try:
            audio_out, self.pitchf_buffer, self.feature_buffer = self.pipeline.exec(
                sid,
                audio,
                pitchf,
                feature,
                f0_up_key,
                index_rate,
                if_f0,
                # 0,
                self.settings.extraConvertSize / self.inputSampleRate if self.settings.silenceFront else 0.,  # extraDataSize in seconds, computed at the input sampling rate
                embOutputLayer,
                useFinalProj,
                repeat,
                protect,
                outSize
            )
            # result = audio_out.detach().cpu().numpy() * np.sqrt(vol)
            result = audio_out[-outSize:].detach().cpu().numpy() * np.sqrt(vol)

            result = cast(
                AudioInOut,
                resampy.resample(
                    result,
                    self.slotInfo.samplingRate,
                    self.outputSampleRate,
                ),
            )

            return result
        except DeviceCannotSupportHalfPrecisionException as e:  # NOQA
            logger.warn("[Device Manager] Device cannot support half precision. Fallback to float....")
            self.deviceManager.setForceTensor(True)
            self.initialize()
            # raise e

            return

    def __del__(self):
        del self.pipeline

        # print("---------- REMOVING ---------------")

        # remove_path = os.path.join("RVC")
        # sys.path = [x for x in sys.path if x.endswith(remove_path) is False]

        # for key in list(sys.modules):
        #     val = sys.modules.get(key)
        #     try:
        #         file_path = val.__file__
        #         if file_path.find("RVC" + os.path.sep) >= 0:
        #             # print("remove", key, file_path)
        #             sys.modules.pop(key)
        #     except Exception:  # type:ignore
        #         # print(e)
        #         pass

    def export2onnx(self):
        modelSlot = self.slotInfo

        if modelSlot.isONNX:
            logger.warn("[Voice Changer] export2onnx, No pyTorch filepath.")
            return {"status": "ng", "path": ""}

        if self.pipeline is not None:
            del self.pipeline
            self.pipeline = None

        torch.cuda.empty_cache()
        self.initialize()

        output_file_simple = export2onnx(self.settings.gpu, modelSlot)

        return {
            "status": "ok",
            "path": f"/tmp/{output_file_simple}",
            "filename": output_file_simple,
        }

    def get_model_current(self):
        return [
            {
                "key": "defaultTune",
                "val": self.settings.tran,
            },
            {
                "key": "defaultIndexRatio",
                "val": self.settings.indexRatio,
            },
            {
                "key": "defaultProtect",
                "val": self.settings.protect,
            },
        ]
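Reviewer note: inference() above resamples everything into a 16 kHz working domain with resampy and rescales frame counts the same way. The two operations in isolation (the concrete rates and frame counts are illustrative):

import numpy as np
import resampy

input_sr = 48000
audio = np.zeros(4800, dtype=np.float32)           # 100 ms at 48 kHz
audio16k = resampy.resample(audio, input_sr, 16000)
print(audio16k.shape[0])                           # -> 1600 frames (100 ms at 16 kHz)

# frame quantities are rescaled the same way before generate_input():
crossfade_frame = int((2048 / input_sr) * 16000)   # -> 682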
@ -46,7 +46,7 @@ class EmbedderManager:
                file = cls.params.content_vec_500_onnx
                return OnnxContentvec().loadModel(file, dev)
            except Exception as e:  # noqa
                print("[Voice Changer] use torch contentvec")
                print("[Voice Changer] use torch contentvec", e)
                file = cls.params.hubert_base
                return FairseqHubert().loadModel(file, dev, isHalf)
        elif embederType == "hubert-base-japanese":
@ -8,7 +8,7 @@ from voice_changer.RVC.inferencer.RVCInferencerv2 import RVCInferencerv2
from voice_changer.RVC.inferencer.RVCInferencerv2Nono import RVCInferencerv2Nono
from voice_changer.RVC.inferencer.WebUIInferencer import WebUIInferencer
from voice_changer.RVC.inferencer.WebUIInferencerNono import WebUIInferencerNono
from voice_changer.RVC.inferencer.VorasInferencebeta import VoRASInferencer
import sys


class InferencerManager:
@ -38,7 +38,11 @@ class InferencerManager:
        elif inferencerType == EnumInferenceTypes.pyTorchRVCv2 or inferencerType == EnumInferenceTypes.pyTorchRVCv2.value:
            return RVCInferencerv2().loadModel(file, gpu)
        elif inferencerType == EnumInferenceTypes.pyTorchVoRASbeta or inferencerType == EnumInferenceTypes.pyTorchVoRASbeta.value:
            if sys.platform.startswith("darwin") is False:
                from voice_changer.RVC.inferencer.VorasInferencebeta import VoRASInferencer
                return VoRASInferencer().loadModel(file, gpu)
            else:
                raise RuntimeError("[Voice Changer] VoRAS is not supported on macOS")
        elif inferencerType == EnumInferenceTypes.pyTorchRVCv2Nono or inferencerType == EnumInferenceTypes.pyTorchRVCv2Nono.value:
            return RVCInferencerv2Nono().loadModel(file, gpu)
        elif inferencerType == EnumInferenceTypes.pyTorchWebUI or inferencerType == EnumInferenceTypes.pyTorchWebUI.value:
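Reviewer note: the hunk above moves the VoRAS import out of module scope and behind a platform check, so its dependencies never load on macOS. The lazy-import pattern in isolation (the import path is the one used in this hunk):

import sys

def get_voras_inferencer():
    if sys.platform.startswith("darwin"):
        raise RuntimeError("[Voice Changer] VoRAS is not supported on macOS")
    # deferred import: only evaluated on supported platforms
    from voice_changer.RVC.inferencer.VorasInferencebeta import VoRASInferencer
    return VoRASInferencer()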
@ -1,12 +1,14 @@
from typing import Dict, Any

import os
from collections import OrderedDict
import torch
from voice_changer.ModelSlotManager import ModelSlotManager

from voice_changer.utils.ModelMerger import ModelMergerRequest
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams


def merge_model(request: ModelMergerRequest):
def merge_model(params: VoiceChangerParams, request: ModelMergerRequest):
    def extract(ckpt: Dict[str, Any]):
        a = ckpt["model"]
        opt: Dict[str, Any] = OrderedDict()
@ -34,11 +36,16 @@ def merge_model(request: ModelMergerRequest):

    weights = []
    alphas = []
    slotManager = ModelSlotManager.get_instance(params.model_dir)
    for f in files:
        strength = f.strength
        if strength == 0:
            continue
        weight, state_dict = load_weight(f.filename)
        slotInfo = slotManager.get_slot_info(f.slotIndex)

        filename = os.path.join(params.model_dir, str(f.slotIndex), os.path.basename(slotInfo.modelFile))  # prior to v.1.5.3.11, slotInfo.modelFile includes the path from model_dir.

        weight, state_dict = load_weight(filename)
        weights.append(weight)
        alphas.append(f.strength)
@ -4,7 +4,7 @@ import torch
from onnxsim import simplify
import onnx
from const import TMP_DIR, EnumInferenceTypes
from data.ModelSlot import ModelSlot
from data.ModelSlot import RVCModelSlot
from voice_changer.RVC.deviceManager.DeviceManager import DeviceManager
from voice_changer.RVC.onnxExporter.SynthesizerTrnMs256NSFsid_ONNX import (
    SynthesizerTrnMs256NSFsid_ONNX,
@ -24,10 +24,12 @@ from voice_changer.RVC.onnxExporter.SynthesizerTrnMsNSFsidNono_webui_ONNX import
from voice_changer.RVC.onnxExporter.SynthesizerTrnMsNSFsid_webui_ONNX import (
    SynthesizerTrnMsNSFsid_webui_ONNX,
)
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager


def export2onnx(gpu: int, modelSlot: ModelSlot):
    modelFile = modelSlot.modelFile
def export2onnx(gpu: int, modelSlot: RVCModelSlot):
    vcparams = VoiceChangerParamsManager.get_instance().params
    modelFile = os.path.join(vcparams.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))

    output_file = os.path.splitext(os.path.basename(modelFile))[0] + ".onnx"
    output_file_simple = os.path.splitext(os.path.basename(modelFile))[0] + "_simple.onnx"
@ -18,6 +18,7 @@ from voice_changer.RVC.inferencer.OnnxRVCInferencer import OnnxRVCInferencer
from voice_changer.RVC.inferencer.OnnxRVCInferencerNono import OnnxRVCInferencerNono

from voice_changer.RVC.pitchExtractor.PitchExtractor import PitchExtractor
from voice_changer.utils.Timer import Timer

logger = VoiceChangaerLogger.get_instance().getLogger()

@ -89,8 +90,11 @@ class Pipeline(object):
        protect=0.5,
        out_size=None,
    ):
        # Input arrives at a 16000 Hz sampling rate; everything below runs at 16000.
        # print(f"pipeline exec input, audio:{audio.shape}, pitchf:{pitchf.shape}, feature:{feature.shape}")
        # print(f"pipeline exec input, silence_front:{silence_front}, out_size:{out_size}")

        with Timer("main-process", False) as t:  # NOQA
            # Input arrives at a 16000 Hz sampling rate; everything below runs at 16000.
            search_index = self.index is not None and self.big_npy is not None and index_rate != 0
            # self.t_pad = self.sr * repeat  # 1 second
            # self.t_pad_tgt = self.targetSR * repeat  # 1 second of trimming at output (emitted at the model's sampling rate)
@ -141,6 +145,7 @@ class Pipeline(object):
            feats = feats.view(1, -1)

            # embedding
            with Timer("main-process", False) as te:
                with autocast(enabled=self.isHalf):
                    try:
                        feats = self.embedder.extractFeatures(feats, embOutputLayer, useFinalProj)
@ -153,6 +158,7 @@ class Pipeline(object):
                            raise DeviceChangingException()
                        else:
                            raise e
            # print(f"[Embedding] {te.secs}")

            # Index - feature extraction
            # if self.index is not None and self.feature is not None and index_rate != 0:
@ -240,6 +246,7 @@ class Pipeline(object):
                    raise e

            feats_buffer = feats.squeeze(0).detach().cpu()

            if pitchf is not None:
                pitchf_buffer = pitchf.squeeze(0).detach().cpu()
            else:
@ -257,6 +264,7 @@ class Pipeline(object):

        del sid
        # torch.cuda.empty_cache()
        # print("EXEC AVERAGE:", t.avrSecs)
        return audio1, pitchf_buffer, feats_buffer

    def __del__(self):
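The embedding step runs under `autocast(enabled=self.isHalf)`, so half precision is used only when the device supports it. A self-contained sketch of that guard (the `embedder` argument and the reduced `extractFeatures` signature are stand-ins for the pipeline's objects):

import torch
from torch.cuda.amp import autocast


def extract_features(embedder, feats: torch.Tensor, is_half: bool) -> torch.Tensor:
    # Mixed precision is opt-in per call site: when is_half is False the
    # context is a no-op and everything stays in float32.
    with autocast(enabled=is_half):
        return embedder.extractFeatures(feats)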
@ -9,15 +9,17 @@ from voice_changer.RVC.embedder.EmbedderManager import EmbedderManager
from voice_changer.RVC.inferencer.InferencerManager import InferencerManager
from voice_changer.RVC.pipeline.Pipeline import Pipeline
from voice_changer.RVC.pitchExtractor.PitchExtractorManager import PitchExtractorManager
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams


def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
def createPipeline(params: VoiceChangerParams, modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
    dev = DeviceManager.get_instance().getDevice(gpu)
    half = DeviceManager.get_instance().halfPrecisionAvailable(gpu)

    # Create inferencer
    try:
        inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelSlot.modelFile, gpu)
        modelPath = os.path.join(params.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.modelFile))
        inferencer = InferencerManager.getInferencer(modelSlot.modelType, modelPath, gpu)
    except Exception as e:
        print("[Voice Changer] exception! loading inferencer", e)
        traceback.print_exc()
@ -40,7 +42,8 @@ def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
    pitchExtractor = PitchExtractorManager.getPitchExtractor(f0Detector, gpu)

    # index, feature
    index = _loadIndex(modelSlot)
    indexPath = os.path.join(params.model_dir, str(modelSlot.slotIndex), os.path.basename(modelSlot.indexFile))
    index = _loadIndex(indexPath)

    pipeline = Pipeline(
        embedder,
@ -55,21 +58,17 @@ def createPipeline(modelSlot: RVCModelSlot, gpu: int, f0Detector: str):
    return pipeline


def _loadIndex(modelSlot: RVCModelSlot):
def _loadIndex(indexPath: str):
    # Load the index
    print("[Voice Changer] Loading index...")
    # None when no file is specified
    if modelSlot.indexFile is None:
        print("[Voice Changer] Index is None, not used")
        return None

    # None when a file is specified but does not exist
    if os.path.exists(modelSlot.indexFile) is not True:
    if os.path.exists(indexPath) is not True or os.path.isfile(indexPath) is not True:
        print("[Voice Changer] Index file is not found")
        return None

    try:
        print("Try loading...", modelSlot.indexFile)
        index = faiss.read_index(modelSlot.indexFile)
        print("Try loading...", indexPath)
        index = faiss.read_index(indexPath)
    except:  # NOQA
        print("[Voice Changer] load index failed. Use no index.")
        traceback.print_exc()
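`_loadIndex` now degrades gracefully: a missing or unreadable index file yields `None` instead of failing pipeline construction. A compact sketch of the same shape (`load_index_or_none` is an illustrative name; `os.path.isfile` covers both checks above):

import os
import traceback

import faiss


def load_index_or_none(index_path: str):
    # Missing or corrupt index files degrade to "no index" so the
    # rest of the pipeline can still be built.
    if not os.path.isfile(index_path):
        return None
    try:
        return faiss.read_index(index_path)
    except Exception:
        traceback.print_exc()
        return None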
|
@ -1,6 +1,7 @@
|
||||
import sys
|
||||
import os
|
||||
from data.ModelSlot import SoVitsSvc40ModelSlot
|
||||
from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
|
||||
|
||||
from voice_changer.utils.VoiceChangerModel import AudioInOut
|
||||
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams
|
||||
@ -92,13 +93,17 @@ class SoVitsSvc40:
|
||||
|
||||
def initialize(self):
|
||||
print("[Voice Changer] [so-vits-svc40] Initializing... ")
|
||||
self.hps = get_hparams_from_file(self.slotInfo.configFile)
|
||||
vcparams = VoiceChangerParamsManager.get_instance().params
|
||||
configPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.configFile)
|
||||
modelPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.modelFile)
|
||||
self.hps = get_hparams_from_file(configPath)
|
||||
self.settings.speakers = self.hps.spk
|
||||
|
||||
# cluster
|
||||
try:
|
||||
if self.slotInfo.clusterFile is not None:
|
||||
self.cluster_model = get_cluster_model(self.slotInfo.clusterFile)
|
||||
clusterPath = os.path.join(vcparams.model_dir, str(self.slotInfo.slotIndex), self.slotInfo.clusterFile)
|
||||
self.cluster_model = get_cluster_model(clusterPath)
|
||||
else:
|
||||
self.cluster_model = None
|
||||
except Exception as e:
|
||||
@ -110,7 +115,7 @@ class SoVitsSvc40:
|
||||
if self.slotInfo.isONNX:
|
||||
providers, options = self.getOnnxExecutionProvider()
|
||||
self.onnx_session = onnxruntime.InferenceSession(
|
||||
self.slotInfo.modelFile,
|
||||
modelPath,
|
||||
providers=providers,
|
||||
provider_options=options,
|
||||
)
|
||||
@ -122,7 +127,7 @@ class SoVitsSvc40:
|
||||
)
|
||||
net_g.eval()
|
||||
self.net_g = net_g
|
||||
load_checkpoint(self.slotInfo.modelFile, self.net_g, None)
|
||||
load_checkpoint(modelPath, self.net_g, None)
|
||||
|
||||
def getOnnxExecutionProvider(self):
|
||||
availableProviders = onnxruntime.get_available_providers()
|
||||
@ -379,6 +384,10 @@ class SoVitsSvc40:
|
||||
except Exception: # type:ignore
|
||||
pass
|
||||
|
||||
def get_model_current(self):
|
||||
return [
|
||||
]
|
||||
|
||||
|
||||
def resize_f0(x, target_len):
|
||||
source = np.array(x)
|
||||
|
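`getOnnxExecutionProvider` pairs a provider list with a matching `provider_options` list for `onnxruntime.InferenceSession`. A plausible sketch of that selection, assuming the usual CUDA-or-CPU split (`make_session` is a hypothetical helper; the repo's actual option values may differ):

import onnxruntime


def make_session(model_path: str, gpu: int) -> onnxruntime.InferenceSession:
    # provider_options must be index-aligned with providers: one options
    # dict per execution provider, in the same order.
    available = onnxruntime.get_available_providers()
    if gpu >= 0 and "CUDAExecutionProvider" in available:
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
        options = [{"device_id": gpu}, {}]
    else:
        providers = ["CPUExecutionProvider"]
        options = [{}]
    return onnxruntime.InferenceSession(model_path, providers=providers, provider_options=options)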
@ -37,8 +37,14 @@ class GPUInfo:
@dataclass()
class VoiceChangerManagerSettings:
    modelSlotIndex: int = -1
    passThrough: bool = False  # False: off, True: on
    # Only mutable fields are listed below
    intData: list[str] = field(default_factory=lambda: ["modelSlotIndex"])
    boolData: list[str] = field(default_factory=lambda: [
        "passThrough"
    ])
    intData: list[str] = field(default_factory=lambda: [
        "modelSlotIndex",
    ])


class VoiceChangerManager(ServerDeviceCallbacks):
@ -121,7 +127,6 @@ class VoiceChangerManager(ServerDeviceCallbacks):
    def get_instance(cls, params: VoiceChangerParams):
        if cls._instance is None:
            cls._instance = cls(params)
            # cls._instance.voiceChanger = VoiceChanger(params)
        return cls._instance

    def loadModel(self, params: LoadModelParams):
@ -147,7 +152,7 @@ class VoiceChangerManager(ServerDeviceCallbacks):
            os.makedirs(dstDir, exist_ok=True)
            logger.info(f"move to {srcPath} -> {dstPath}")
            shutil.move(srcPath, dstPath)
            file.name = dstPath
            file.name = os.path.basename(dstPath)

        # Create metadata (defined by each VC)
        if params.voiceChangerType == "RVC":
@ -188,6 +193,7 @@ class VoiceChangerManager(ServerDeviceCallbacks):
        data["modelSlots"] = self.modelSlotManager.getAllSlotInfo(reload=True)
        data["sampleModels"] = getSampleInfos(self.params.sample_mode)
        data["python"] = sys.version
        data["voiceChangerParams"] = self.params

        data["status"] = "OK"

@ -214,11 +220,18 @@ class VoiceChangerManager(ServerDeviceCallbacks):
            return
        elif slotInfo.voiceChangerType == "RVC":
            logger.info("................RVC")
            from voice_changer.RVC.RVC import RVC
            # from voice_changer.RVC.RVC import RVC

            self.voiceChangerModel = RVC(self.params, slotInfo)
            self.voiceChanger = VoiceChanger(self.params)
            # self.voiceChangerModel = RVC(self.params, slotInfo)
            # self.voiceChanger = VoiceChanger(self.params)
            # self.voiceChanger.setModel(self.voiceChangerModel)

            from voice_changer.RVC.RVCr2 import RVCr2

            self.voiceChangerModel = RVCr2(self.params, slotInfo)
            self.voiceChanger = VoiceChangerV2(self.params)
            self.voiceChanger.setModel(self.voiceChangerModel)

        elif slotInfo.voiceChangerType == "MMVCv13":
            logger.info("................MMVCv13")
            from voice_changer.MMVCv13.MMVCv13 import MMVCv13
@ -260,10 +273,16 @@ class VoiceChangerManager(ServerDeviceCallbacks):
            del self.voiceChangerModel
            return

    def update_settings(self, key: str, val: str | int | float):
    def update_settings(self, key: str, val: str | int | float | bool):
        self.store_setting(key, val)

        if key in self.settings.intData:
        if key in self.settings.boolData:
            if val == "true":
                newVal = True
            elif val == "false":
                newVal = False
            setattr(self.settings, key, newVal)
        elif key in self.settings.intData:
            newVal = int(val)
            if key == "modelSlotIndex":
                newVal = newVal % 1000
@ -283,6 +302,9 @@ class VoiceChangerManager(ServerDeviceCallbacks):
        return self.get_info()

    def changeVoice(self, receivedData: AudioInOut):
        if self.settings.passThrough is True:  # pass-through
            return receivedData, []

        if hasattr(self, "voiceChanger") is True:
            return self.voiceChanger.on_request(receivedData)
        else:
@ -299,8 +321,8 @@ class VoiceChangerManager(ServerDeviceCallbacks):
        req.files = [MergeElement(**f) for f in req.files]
        slot = len(self.modelSlotManager.getAllSlotInfo()) - 1
        if req.voiceChangerType == "RVC":
            merged = RVCModelMerger.merge_models(req, slot)
            loadParam = LoadModelParams(voiceChangerType="RVC", slot=slot, isSampleMode=False, sampleId="", files=[LoadModelParamFile(name=os.path.basename(merged), kind="rvcModel", dir=f"{slot}")], params={})
            merged = RVCModelMerger.merge_models(self.params, req, slot)
            loadParam = LoadModelParams(voiceChangerType="RVC", slot=slot, isSampleMode=False, sampleId="", files=[LoadModelParamFile(name=os.path.basename(merged), kind="rvcModel", dir="")], params={})
            self.loadModel(loadParam)
        return self.get_info()
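Note that the new `boolData` branch only maps the strings "true" and "false"; a caller that sends an actual boolean would leave `newVal` unbound. A stricter normalization sketch (hypothetical helper, not repo code):

def coerce_setting(val: str | int | float | bool, kind: str):
    # Normalize a settings value by declared kind instead of trusting
    # the wire format; unknown boolean encodings fail loudly.
    if kind == "bool":
        if isinstance(val, bool):
            return val
        if val == "true":
            return True
        if val == "false":
            return False
        raise ValueError(f"not a boolean setting value: {val!r}")
    if kind == "int":
        return int(val)
    return val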
17
server/voice_changer/VoiceChangerParamsManager.py
Normal file
@ -0,0 +1,17 @@
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams


class VoiceChangerParamsManager:
    _instance = None

    def __init__(self):
        self.params = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def setParams(self, params: VoiceChangerParams):
        self.params = params
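The new singleton gives modules such as the ONNX exporter and SoVitsSvc40 access to `VoiceChangerParams` without threading it through every signature. A usage sketch (`bootstrap` and `read_model_dir` are illustrative names):

from voice_changer.VoiceChangerParamsManager import VoiceChangerParamsManager
from voice_changer.utils.VoiceChangerParams import VoiceChangerParams


def bootstrap(params: VoiceChangerParams) -> None:
    # Wire the params in once at startup...
    VoiceChangerParamsManager.get_instance().setParams(params)


def read_model_dir() -> str:
    # ...then any module can read them without a params argument.
    return VoiceChangerParamsManager.get_instance().params.model_dir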
@ -6,7 +6,7 @@

- Supported VoiceChangerModel
    - DiffusionSVC

    - RVC
'''

from typing import Any, Union
@ -208,12 +208,13 @@ class VoiceChangerV2(VoiceChangerIF):
        block_frame = receivedData.shape[0]
        crossfade_frame = min(self.settings.crossFadeOverlapSize, block_frame)
        self._generate_strength(crossfade_frame)
        # data = self.voiceChanger.generate_input(newData, block_frame, crossfade_frame, sola_search_frame)

        audio = self.voiceChanger.inference(
            receivedData,
            crossfade_frame=crossfade_frame,
            sola_search_frame=sola_search_frame
        )

        if hasattr(self, "sola_buffer") is True:
            np.set_printoptions(threshold=10000)
            audio_offset = -1 * (sola_search_frame + crossfade_frame + block_frame)
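The `audio_offset` computed above addresses the tail of the model output that SOLA inspects: the last `sola_search_frame + crossfade_frame + block_frame` samples. A sketch of what that slice selects (illustrative helper, not repo code):

import numpy as np


def tail_region(audio: np.ndarray, sola_search_frame: int, crossfade_frame: int, block_frame: int) -> np.ndarray:
    # The negative offset selects the newest samples: the search window,
    # the crossfade overlap, and the output block itself.
    audio_offset = -1 * (sola_search_frame + crossfade_frame + block_frame)
    return audio[audio_offset:]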
@ -5,7 +5,7 @@ from dataclasses import dataclass

@dataclass
class MergeElement:
    filename: str
    slotIndex: int
    strength: int
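Each merge request entry now carries `slotIndex`, so the merger can resolve the file's real location from `model_dir` instead of trusting `filename`. A construction sketch, assuming `MergeElement` lives in `voice_changer.utils.ModelMerger` alongside `ModelMergerRequest`:

from voice_changer.utils.ModelMerger import MergeElement  # assumed module path

# Illustrative payload as it might arrive from the client:
payload = [
    {"filename": "model_a.pth", "slotIndex": 3, "strength": 50},
    {"filename": "model_b.pth", "slotIndex": 5, "strength": 50},
]
files = [MergeElement(**f) for f in payload]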
@ -1,15 +1,43 @@
import time
import inspect


class Timer(object):
    def __init__(self, title: str):
    storedSecs = {}  # Class variable

    def __init__(self, title: str, enable: bool = True):
        self.title = title
        self.enable = enable
        self.secs = 0
        self.msecs = 0
        self.avrSecs = 0

        if self.enable is False:
            return

        self.maxStores = 10

        current_frame = inspect.currentframe()
        caller_frame = inspect.getouterframes(current_frame, 2)
        frame = caller_frame[1]
        filename = frame.filename
        line_number = frame.lineno
        self.key = f"{title}_{filename}_{line_number}"
        if self.key not in self.storedSecs:
            self.storedSecs[self.key] = []

    def __enter__(self):
        if self.enable is False:
            return
        self.start = time.time()
        return self

    def __exit__(self, *_):
        if self.enable is False:
            return
        self.end = time.time()
        self.secs = self.end - self.start
        self.msecs = self.secs * 1000  # millisecs
        self.storedSecs[self.key].append(self.secs)
        self.storedSecs[self.key] = self.storedSecs[self.key][-self.maxStores:]
        self.avrSecs = sum(self.storedSecs[self.key]) / len(self.storedSecs[self.key])
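The rewritten `Timer` keys its class-level `storedSecs` by title plus caller file and line, so each call site keeps a rolling window of its last 10 timings and `avrSecs` averages only that window. A usage sketch:

import time

from voice_changer.utils.Timer import Timer

# Each distinct call site gets its own rolling average, even when the
# same title string is reused elsewhere in the codebase.
with Timer("demo", True) as t:
    time.sleep(0.01)

print(t.secs, t.msecs, t.avrSecs)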
20
signatures/version1/cla.json
Normal file
@ -0,0 +1,20 @@
{
    "signedContributors": [
        {
            "name": "w-okada",
            "id": 48346627,
            "comment_id": 1667673774,
            "created_at": "2023-08-07T11:21:42Z",
            "repoId": 527419347,
            "pullRequestNo": 661
        },
        {
            "name": "w-okada",
            "id": 48346627,
            "comment_id": 1667674735,
            "created_at": "2023-08-07T11:22:28Z",
            "repoId": 527419347,
            "pullRequestNo": 661
        }
    ]
}
@ -44,6 +44,8 @@ If you have the old version, be sure to unzip it into a separate folder.

When connecting remotely, please use the `.bat` file (win) or `.command` file (mac) in which http is replaced with https.

Access it with a browser (currently only Chrome is supported) to see the GUI.

### Console

When you run a `.bat` file (Windows) or `.command` file (Mac), a screen like the following is displayed, and various data are downloaded from the Internet on first launch. Depending on your environment, this often takes 1-2 minutes.

@ -44,6 +44,8 @@

When connecting remotely, please use the `.bat` file (win) or `.command` file (mac) in which http is replaced with https.

Access it with a browser (only Chrome is supported) to see the GUI.

### Console

When you run a `.bat` file (win) or `.command` file (mac), a screen like the following is displayed, and various data are downloaded from the Internet on first launch.