mirror of
https://github.com/remsky/Kokoro-FastAPI.git
synced 2025-04-13 09:39:17 +00:00
Update README.md
parent
c95b34d904
commit
eb2191e23d
1 changed files with 33 additions and 55 deletions
70
README.md
@@ -11,12 +11,11 @@
 
 Dockerized FastAPI wrapper for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) text-to-speech model
 - OpenAI-compatible Speech endpoint, with inline voice combination, and mapped naming/models for strict systems
-- NVIDIA GPU accelerated or CPU inference (ONNX, Pytorch)
+- NVIDIA GPU accelerated or CPU inference (ONNX or Pytorch for either)
 - very fast generation time
 - ~35x-100x+ real time speed via 4060Ti+
 - ~5x+ real time speed via M3 Pro CPU
-- streaming support & tempfile generation
-- phoneme based dev endpoints
+- streaming support & tempfile generation, phoneme based dev endpoints
 - (new) Integrated web UI on localhost:8880/web
 - (new) Debug endpoints for monitoring threads, storage, and session pools
 
@@ -36,14 +35,6 @@ docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.1.4 # CPU, or:
 docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.1.4 #NVIDIA GPU
 ```
 
-Once running, access:
-- API Documentation: http://localhost:8880/docs
-- Web Interface: http://localhost:8880/web
-
-<div align="center" style="display: flex; justify-content: center; gap: 20px;">
-<img src="assets/docs-screenshot.png" width="48%" alt="API Documentation" style="border: 2px solid #333; padding: 10px;">
-<img src="assets/webui-screenshot.png" width="48%" alt="Web UI Screenshot" style="border: 2px solid #333; padding: 10px;">
-</div>
 
 </details>
 
@@ -53,7 +44,6 @@ Once running, access:
 
 1. Install prerequisites, and start the service using Docker Compose (Full setup including UI):
 - Install [Docker](https://www.docker.com/products/docker-desktop/)
--
 - Clone the repository:
 ```bash
 git clone https://github.com/remsky/Kokoro-FastAPI.git
@@ -72,31 +62,7 @@ Once running, access:
 ./start-cpu.sh
 ./start-gpu.sh
 ```
-
-Once started:
-- The API will be available at http://localhost:8880
-- The *Web UI* can be tested at http://localhost:8880/web
-- The Gradio UI (deprecating) can be accessed at http://localhost:7860
-
-2. Run locally as an OpenAI-Compatible Speech Endpoint
-```python
-from openai import OpenAI
-client = OpenAI(
-    base_url="http://localhost:8880/v1",
-    api_key="not-needed"
-)
-
-with client.audio.speech.with_streaming_response.create(
-    model="kokoro",
-    voice="af_sky+af_bella", #single or multiple voicepack combo
-    input="Hello world!",
-    response_format="mp3"
-) as response:
-    response.stream_to_file("output.mp3")
-
-```
 </details>
-
 <details>
 <summary>Direct Run (via uv) </summary>
 
@@ -118,28 +84,40 @@ Once running, access:
 ./start-gpu.sh
 ```
 
-Once started:
-- The API will be available at http://localhost:8880
-- The *Web UI* can be tested at http://localhost:8880/web
-- The Gradio UI (deprecating) can be accessed at http://localhost:7860
-
-2. Run locally as an OpenAI-Compatible Speech Endpoint
+</details>
+<details open>
+<summary> Up and Running? </summary>
+
+Run locally as an OpenAI-Compatible Speech Endpoint
+
 ```python
 from openai import OpenAI
+
 client = OpenAI(
-    base_url="http://localhost:8880/v1",
-    api_key="not-needed"
+    base_url="http://localhost:8880/v1", api_key="not-needed"
 )
 
 with client.audio.speech.with_streaming_response.create(
     model="kokoro",
     voice="af_sky+af_bella", #single or multiple voicepack combo
-    input="Hello world!",
-    response_format="mp3"
+    input="Hello world!"
 ) as response:
     response.stream_to_file("output.mp3")
+
 ```
+
+- The API will be available at http://localhost:8880
+- API Documentation: http://localhost:8880/docs
+
+- Web Interface: http://localhost:8880/web
+- Gradio UI (deprecating) can be accessed at http://localhost:7860 if enabled in docker compose file (it is a separate image!)
+
+<div align="center" style="display: flex; justify-content: center; gap: 10px;">
+<img src="assets/docs-screenshot.png" width="40%" alt="API Documentation" style="border: 2px solid #333; padding: 10px;">
+<img src="assets/webui-screenshot.png" width="49%" alt="Web UI Screenshot" style="border: 2px solid #333; padding: 10px;">
+</div>
+
 </details>
 
 ## Features
 
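The README snippet in this diff reaches the service through the `openai` client. For readers without that package, the same call can be sketched with only the standard library. This is a minimal sketch, not the project's documented method. It assumes the server follows the OpenAI convention of serving speech at `/audio/speech` under the `/v1` base URL, and that no `Authorization` header is checked (the README passes `api_key="not-needed"`). The helper names `build_speech_request` and `save_speech` are hypothetical, and `response_format="mp3"` is carried over from the older version of the snippet.

```python
import json
import urllib.request

# Assumption: the service from the README is listening on localhost:8880 and
# exposes the OpenAI-style speech route under /v1.
BASE_URL = "http://localhost:8880/v1"


def build_speech_request(text, voice="af_sky+af_bella", base_url=BASE_URL):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": "kokoro",
        "voice": voice,  # single voice or a "+"-joined combination, as in the README
        "input": text,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, path="output.mp3", **kwargs):
    """Send the request and write the returned audio bytes to `path`."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
    return path
```

With the container running, `save_speech("Hello world!")` should write `output.mp3` in the current directory. Building the request separately from sending it makes the payload easy to inspect without a live server.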