Kokoro-FastAPI

mirror of https://github.com/remsky/Kokoro-FastAPI.git synced 2025-04-13 09:39:17 +00:00

Author	SHA1	Message	Date
remsky	720c1fb97d	- update soundfile version - alignment with streaming standards - audio processing config settings - more comprehensive model warmup - minor model improvements - enhancing testing, benchmarking - cool ascii logo	2025-01-06 03:32:41 -07:00
remsky	4c6cd83f85	Swapped generator to preprocessing	2025-01-04 22:23:59 -07:00
remsky	0e9f77fc79	WIP: open ai compatible streaming	2025-01-04 17:55:36 -07:00
remsky	f1eb1d9590	First streaming attempt	2025-01-04 17:54:54 -07:00
remsky	76e8b07a92	Allow ONNX support optimizations for CPU inference and update benchmarking scripts; modify README for clarity on performance metrics	2025-01-04 02:46:27 -07:00
remsky	93aa205da9	Enhance ONNX optimization settings and add validation script for TTS audio files	2025-01-04 02:14:46 -07:00
remsky	7df2a68fb4	- CPU ONNX + PyTorch CUDA, functional - Incorporated text processing module as service, towards modularization and optimizations - Added text processing router for phonemization - Enhanced benchmark statistics with real-time speed metrics	2025-01-03 17:54:17 -07:00
remsky	9496a3a63f	WIP: CPU/GPU Functional, few straggling tests to fix and check.	2025-01-03 03:16:42 -07:00
remsky	e4d8e74738	WIP, Functional for CPU: Updated for ONNX runtime support, Dockerfile and TTS Service	2025-01-03 00:53:41 -07:00
remsky	40894449da	added output audio tests, validation	2025-01-02 15:36:53 -07:00
remsky	f051984805	Ruff Check + Format	2025-01-01 21:50:41 -07:00
remsky	05e1e30c47	- modified voice loading to copy on init - adjustments to the combine voices functionality - error handling and analysis	2024-12-31 18:55:26 -07:00
Emmanuel Schmidbauer	510b01cc90	add ability to combine voices	2024-12-31 10:30:12 -05:00
remsky	f800c4ecf9	Added mp3 samples	2024-12-31 03:48:26 -07:00
remsky	607df6e03b	Update README and tests to clarify audio format support and enhance documentation	2024-12-31 03:46:31 -07:00
remsky	36606f7234	Refactor Docker setup to use a dedicated model-fetcher service and update schemas for additional voice support	2024-12-31 03:41:45 -07:00
remsky	4123ab0891	Refactor TTS API and enhance testing setup with coverage and logging improvements	2024-12-31 02:55:51 -07:00
remsky	c11a6ea6ea	Enhance TTS API with logging, voice pack loading, and schema updates	2024-12-31 01:57:00 -07:00
remsky	8ce8334345	- Complete TTS endpoint replacement with OpenAI compatible - Removed output directory, and update configuration settings - Added benchmarking for entire novel	2024-12-31 01:52:16 -07:00
Emmanuel Schmidbauer	f95e526a3f	add speed	2024-12-30 13:39:35 -05:00
remsky	0fb36bb1b2	fix: update benchmark results for processing time and output length	2024-12-30 06:16:55 -07:00
remsky	79d5332c8a	feat: enabled support for stitching long outputs in TTS requests	2024-12-30 06:16:18 -07:00
remsky	aa2df45858	Update README with performance benchmarks and usage examples; add benchmark plotting script	2024-12-30 04:53:29 -07:00
remsky	ce0ef3534a	Add initial implementation of Kokoro TTS API with Docker GPU support - Set up FastAPI application with TTS service - Define API endpoints for TTS generation and voice listing - Implement Pydantic models for request and response schemas - Add Dockerfile and docker-compose.yml for containerization - Include example usage and benchmark results in README	2024-12-30 04:17:50 -07:00

24 commits