Changelog
Notable changes to this project will be documented in this file.
[v0.1.0] - 2025-01-13
Changed
- Major Docker improvements:
  - Baked model directly into Dockerfile for improved deployment reliability
  - Switched to uv for dependency management
  - Streamlined container builds and reduced image sizes
- Dependency Management:
  - Migrated from pip/poetry to uv for faster, more reliable package management
  - Added uv.lock for deterministic builds
  - Updated dependency resolution strategy
[v0.0.5post1] - 2025-01-11
Fixed
- Docker image tagging and versioning improvements (-gpu, -cpu, -ui)
- Minor VRAM management improvements
- Gradio bug causing crashes and errant warnings
- Updated GPU and UI container configurations
[v0.0.5] - 2025-01-10
Fixed
- Stabilized image tagging and structure issues from v0.0.4
- Added automatic master to develop branch synchronization
- Improved release tagging and structures
- Initial CI/CD setup
2025-01-04
Added
- ONNX Support:
  - Added single batch ONNX support for CPU inference
  - Roughly 0.4 RTF (2.4x real-time speed)
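
For reference, real-time factor (RTF) is conventionally the time spent synthesizing divided by the duration of the audio produced, so the speed multiple is its reciprocal. The sketch below shows that arithmetic only; the function names and timing values are illustrative and are not taken from the project's benchmark scripts. An RTF of exactly 0.4 works out to 2.5x, so the 2.4x figure would correspond to an unrounded RTF of about 0.42.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speed_multiple(rtf: float) -> float:
    """Seconds of audio produced per second of compute (the reciprocal of RTF)."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical timing: 4 s of compute to produce 10 s of audio.
rtf = real_time_factor(4.0, 10.0)
print(f"RTF {rtf:.2f} -> {speed_multiple(rtf):.1f}x real time")
```

An RTF below 1.0 means audio is generated faster than it plays back.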
Modified
- Code Refactoring:
  - Work on modularizing phonemizer and tokenizer into separate services
  - Incorporated these services into a dev endpoint
- Testing and Benchmarking:
  - Cleaned up benchmarking scripts
  - Cleaned up test scripts
  - Added auto-WAV validation scripts
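
As a rough illustration of what automatic WAV validation can look like, the sketch below uses only Python's standard `wave` module. It is not the project's validation script: `validate_wav`, its `expected_rate` parameter, and the specific checks are assumptions chosen for the example, and the silence check assumes signed PCM samples.

```python
import wave
from typing import List, Optional


def validate_wav(path: str, expected_rate: Optional[int] = None) -> List[str]:
    """Return a list of problems found in the WAV file (empty list = passed)."""
    problems: List[str] = []
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            if frames == 0:
                problems.append("file contains no audio frames")
            if expected_rate is not None and rate != expected_rate:
                problems.append(f"sample rate {rate} != expected {expected_rate}")
            # Read every frame so a header that overstates the data is caught.
            data = wav.readframes(frames)
            expected_bytes = frames * wav.getnchannels() * wav.getsampwidth()
            if len(data) != expected_bytes:
                problems.append("audio data is shorter than the header claims")
            elif frames and not any(data):
                # All-zero bytes mean silence for signed PCM (an assumption here).
                problems.append("audio is entirely silent")
    except (wave.Error, EOFError) as exc:
        problems.append(f"not a readable WAV file: {exc}")
    return problems
```

A test suite could call `validate_wav` on each generated file and fail when the returned list is non-empty.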
2025-01-02
- Audio Format Support:
  - Added comprehensive audio format conversion support (mp3, wav, opus, flac)
2025-01-01
Added
- Gradio Web Interface:
  - Added simple web UI utility for audio generation from input or txt file
Modified
Configuration Changes
- Updated Docker configurations:
  - Changes to Dockerfile:
    - Improved layer caching by separating dependency and code layers
  - Updates to docker-compose.yml and docker-compose.cpu.yml:
    - Removed commit lock from model fetching to allow automatic model updates from HF
    - Added git index lock cleanup
API Changes
- Modified api/src/main.py
- Updated TTS service implementation in api/src/services/tts.py:
  - Added device management for better resource control:
    - Voices are now copied from model repository to api/src/voices directory for persistence
  - Refactored voice pack handling:
    - Removed static voice pack dictionary
    - On-demand voice loading from disk
  - Added model warm-up functionality:
    - Model now initializes with a dummy text generation
    - Uses default voice (af.pt) for warm-up
    - Model is ready for inference on first request
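
The on-demand voice loading and warm-up described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in api/src/services/tts.py: `VoiceCache`, `warm_up`, the `loader` callable, and the `synthesize` callable are all hypothetical names, and a real service would presumably pass a tensor-loading function as `loader`.

```python
from pathlib import Path
from typing import Callable, Dict


class VoiceCache:
    """Loads each voice pack from disk the first time it is requested."""

    def __init__(self, voices_dir: Path, loader: Callable[[Path], object]):
        self.voices_dir = voices_dir
        self.loader = loader  # hypothetical: whatever reads a .pt file
        self._loaded: Dict[str, object] = {}

    def get(self, name: str) -> object:
        # Replaces a static dictionary: nothing is loaded until it is needed.
        if name not in self._loaded:
            path = self.voices_dir / f"{name}.pt"
            if not path.is_file():
                raise FileNotFoundError(f"voice pack not found: {path}")
            self._loaded[name] = self.loader(path)
        return self._loaded[name]


def warm_up(
    synthesize: Callable[[str, object], object],
    voices: VoiceCache,
    default_voice: str = "af",
) -> None:
    """Run one throwaway generation so the first real request is not slow."""
    synthesize("warm-up", voices.get(default_voice))
```

Calling `warm_up` once at startup also leaves the default voice in the cache, so the first request that uses it skips the disk read.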