Kokoro-FastAPI

mirror of https://github.com/remsky/Kokoro-FastAPI.git synced 2025-08-05 16:48:53 +00:00

Author	SHA1	Message	Date
remsky	8a60a2b90c	Add StreamingAudioWriter class for audio format conversions and remove deprecated migration notes	2025-01-27 20:23:35 -07:00
remsky	409a9e9af3	Merge remote-tracking branch 'origin/master'	2025-01-27 15:19:28 -07:00
Josh Rosen	b8d592081e	Fix truncated playback issue in streaming WAV responses.	2025-01-26 12:40:45 -08:00
remsky	00497f8872	Refactor: Consolidate PyTorch CPU and GPU backends into a single PyTorchBackend class; remove obsolete files	2025-01-25 13:33:42 -07:00
remsky	3547d95ee6	-unified streaming implementation	2025-01-25 05:25:13 -07:00
remsky	9efb9db4d9	Fix: VoiceManager singleton instantiation	2025-01-24 05:30:56 -07:00
remsky	20658f9759	Performance: Adjust session timeout and GPU memory limit; minimize voice pre-caching and improve singleton instance management	2025-01-24 05:01:38 -07:00
remsky	ee1f7cde18	Add async audio processing and semantic chunking support; flattened static audio trimming	2025-01-24 04:06:47 -07:00
remsky	8eb3525382	Refactor configuration and enhance web interface: update GPU settings, add speed control, and improve input handling for audio generation	2025-01-23 04:54:55 -07:00
remsky	ba577d348e	Enhance web player information, adjust text chunk size, update audio wave settings, and implement OpenAI model mappings	2025-01-23 04:11:31 -07:00
remsky	8e8f120a3e	Update configuration to disable local voice saving, enhance voice validation logic, and remove deprecated test file	2025-01-23 02:00:46 -07:00
remsky	df4cc5b4b2	-Adjust testing framework for new model -Add web player support: include static file serving and HTML interface for TTS	2025-01-22 21:11:47 -07:00
remsky	66f46e82f9	Refactor ONNX GPU backend and phoneme generation: improve token handling, add chunk processing for audio generation, and introduce initial stitch options for audio chunks.	2025-01-22 17:43:38 -07:00
remsky	d50214d3be	Enable ONNX GPU support in Docker configurations and refactor model file handling	2025-01-22 05:00:38 -07:00
remsky	4a24be1605	Refactor model loading and configuration: update, adjust model loading device, add async streaming examples and remove unused warmup service.	2025-01-22 02:33:29 -07:00
remsky	21bf810f97	Enhance model inference: update documentation, add model download scripts for PyTorch and ONNX, and refactor configuration handling	2025-01-21 21:44:21 -07:00
Fireblade	53c8c9ca5d	Fixed thread leak because of creating excessive E-speak backends	2025-01-21 14:45:43 -05:00
remsky	ab28a62e86	Refactor inference architecture: remove legacy TTS model, add ONNX and PyTorch backends, and introduce model configuration schemas	2025-01-20 22:42:29 -07:00
Richard Roberson	d51d861861	add AAC audio format and test	2025-01-17 21:43:10 -07:00
Fireblade2534	eb556ec7d3	Fixed python tests so they run properly and cleaned up some unneeded files	2025-01-17 14:55:25 +00:00
remsky	d20da2f92e	Default hexxgrad voicepacks added as temporary fix	2025-01-15 09:42:27 +00:00
remsky	8bc8661930	fix: update model directory paths and improve logging in TTS services	2025-01-14 06:37:03 -07:00
remsky	cf72e4ed2b	Add interruptible streams	2025-01-13 23:25:06 -07:00
remsky	064313450e	fix: test of cicd	2025-01-13 20:18:02 -07:00
remsky	22752900e5	Ruff checks, ci fix	2025-01-13 20:15:46 -07:00
remsky	007b1a35e8	feat: merge master into core/uv-management for v0.1.0 Major changes: - Baked model directly into Dockerfile for improved deployment - Switched to uv for dependency management - Restructured Docker files into docker/cpu and docker/gpu directories - Updated configuration for better ONNX performance	2025-01-13 19:31:44 -07:00
remsky	387653050b	refactor: streamline audio normalization process and update tests	2025-01-13 18:56:49 -07:00
remsky	f4dc292440	fix: ui stability, memory safeties	2025-01-12 21:33:23 -07:00
remsky	3d0ca2a8c2	Update Dockerfiles for baked in models, adjustments to cpu/gpu environment splits	2025-01-12 05:23:02 -07:00
remsky	926ea8cecf	Refactor Docker configurations and update test mocks for development routers	2025-01-10 22:03:16 -07:00
remsky	e8c1284032	Ruff format + fix	2025-01-09 18:41:44 -07:00
remsky	4b521f9bf0	- Added GenerateFromPhonemesRequest model to text_schemas.py - Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py - Added custom logger configuration in main.py - Deprecated text_processing router -> development route	2025-01-09 07:20:14 -07:00
remsky	bd4df84410	Merge pull request #12 from fireblade2534/master Gave it a test, didn't see any issues 👍	2025-01-09 01:30:07 -07:00
Fireblade	1f22cda9be	Fix remaining slashes not being converted into text and made % be converted	2025-01-08 08:50:22 -05:00
remsky	a0a85f5ef0	-add email handling, minor additional URL processing, tests	2025-01-08 03:13:17 -07:00
remsky	e7ffcf49f5	fixed: async scandir finding voices	2025-01-07 21:36:07 -07:00
Fireblade	1625082724	Fix url parsing for urls without https, http, or www. It also allows raw ips, ports, and dashes	2025-01-07 19:34:38 -05:00
remsky	d7e8a5c953	Adjusting aiofiles implementation, testing	2025-01-07 04:30:02 -07:00
remsky	130b084cce	- Added support for combining voices via any endpoint - Updated the `process_voices` function to handle both string and list formats for voice input.	2025-01-07 03:50:08 -07:00
remsky	fddf26c905	Added tested, slight changes to regex	2025-01-07 00:18:44 -07:00
Fireblade	db2f3dd323	Made urls readable	2025-01-06 19:40:21 -05:00
remsky	720c1fb97d	-update soundfile version -alignment with streaming standards -audio processing config settings -more comprehensive model warmup -minor model improvements -enhancing testing, benchmarking -cool ascii logo	2025-01-06 03:32:41 -07:00
remsky	4c6cd83f85	Swapped generator to preprocessing	2025-01-04 22:23:59 -07:00
remsky	e799f0c7c1	WIP: basic tests on OpenAI streaming compatibility	2025-01-04 18:09:23 -07:00
remsky	0e9f77fc79	WIP: open ai compatible streaming	2025-01-04 17:55:36 -07:00
remsky	f1eb1d9590	First streaming attempt	2025-01-04 17:54:54 -07:00
remsky	93aa205da9	Enhance ONNX optimization settings and add validation script for TTS audio files	2025-01-04 02:14:46 -07:00
remsky	7df2a68fb4	- CPU ONNX + PyTorch CUDA, functional - Incorporated text processing module as service, towards modularization and optimizations - Added text processing router for phonemization - Enhanced benchmark statistics with real-time speed metrics	2025-01-03 17:54:17 -07:00
remsky	9496a3a63f	WIP: CPU/GPU Functional, few straggling tests to fix and check.	2025-01-03 03:16:42 -07:00
remsky	e4d8e74738	WIP, Functional for CPU: Updated for ONNX runtime support, Dockerfile and TTS Service	2025-01-03 00:53:41 -07:00

67 commits