Correspondingly, if you do not plan to use acceleration, you can comment out the `--compile` parameter.
!!! info
    For GPUs that do not support bf16, you may need to use the `--half` parameter.
### 3. Generate vocals from semantic tokens:
!!! warning "Future Warning"
    The interface remains accessible from the original path (`tools/vqgan/inference.py`), but it may be removed in a subsequent release, so please update your code as soon as possible.
```bash
python fish_speech/models/dac/inference.py \
    -i "codes_0.npy"
```
## HTTP API Inference
We provide an HTTP API for inference. You can use the following command to start the server:
> If you want to speed up inference, you can add the `--compile` parameter.
!!! note
    You can save the label file and reference audio file in advance to the `references` folder in the main directory (which you need to create yourself), so that you can call them directly in the WebUI.
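
As a minimal sketch of the note above, the following creates the `references` folder and copies a reference audio file and its label file into it. The file names `my_voice.wav` and `my_voice.lab` are hypothetical placeholders, and the exact layout expected inside `references` (flat files or one subfolder per voice) is not specified here, so treat the flat copy as an assumption and adjust it to what the WebUI expects.

```bash
# Run from the main directory of the project.
# Hypothetical file names; replace them with your own files.
REF_AUDIO="my_voice.wav"
REF_LABEL="my_voice.lab"

# The folder is not created automatically, so create it if it is missing.
mkdir -p references

# Copy each file only if it exists, so the script does not abort on a typo.
for f in "$REF_AUDIO" "$REF_LABEL"; do
    if [ -f "$f" ]; then
        cp "$f" references/
    else
        echo "skipping missing file: $f" >&2
    fi
done
```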
!!! note
    You can use Gradio environment variables, such as `GRADIO_SHARE`, `GRADIO_SERVER_PORT`, and `GRADIO_SERVER_NAME`, to configure the WebUI.
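
The environment variables mentioned above could be set as follows before starting the WebUI. This is a sketch: the values shown (`0.0.0.0`, `7860`, `True`) are example assumptions, not project defaults, and Gradio generally only falls back to these variables when the launch code does not set the corresponding option explicitly.

```bash
# Example values only; adjust them to your environment.
export GRADIO_SERVER_NAME="0.0.0.0"  # listen on all network interfaces
export GRADIO_SERVER_PORT="7860"     # port the WebUI should bind to
export GRADIO_SHARE="True"           # request a public share link

# Print the effective settings, then start the WebUI from the same shell
# so that it inherits these variables.
echo "Gradio will use ${GRADIO_SERVER_NAME}:${GRADIO_SERVER_PORT} (share=${GRADIO_SHARE})"
```

Because the variables are exported, any WebUI process started from the same shell session picks them up without further changes.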