Kokoro-FastAPI

mirror of https://github.com/remsky/Kokoro-FastAPI.git synced 2025-08-05 16:48:53 +00:00

Author	SHA1	Message	Date
Lukin	ab8ab7d749	Refactor audio processing and text normalization: Update audio normalization to use absolute amplitude threshold, enhance streaming audio writer with MP3 container options, and improve text normalization by stripping spaces and handling special characters to prevent audio artifacts.	2025-05-30 22:52:58 +08:00
remsky	afa879546c	CONTRIBUTING + Ruff format	2025-04-04 16:58:07 -06:00
remsky	447f9d360c	Ruff check	2025-04-04 16:50:46 -06:00
Fireblade2534	c24aeefbb2	Actually fixed tests this time	2025-03-20 19:15:07 +00:00
Fireblade2534	8f23bf53a4	Initial test commit of segfault fixes	2025-03-20 16:20:28 +00:00
Fireblade	b3d5f4de08	fixes and corrections to code that didn't cause errors but didn't really make sense	2025-03-02 21:36:34 -05:00
Fireblade	c5a3e13670	Converted the stream writer to use pyav	2025-02-19 23:10:51 -05:00
Fireblade	4ee4d36822	Fixes a couple of issues with audio trimming and prevents errors with single voice weights	2025-02-18 18:12:49 -05:00
Fireblade	e3dc959775	Simplify code so everything uses AudioChunks	2025-02-16 15:37:01 -05:00
Fireblade	0b5ec320c7	streaming word level time stamps	2025-02-14 13:37:42 -05:00
Fireblade	7772dbc2e4	fixed no stream file writing	2025-02-13 16:12:51 -05:00
Fireblade	dbf2b99026	Simplified generate_audio in tts_service, mostly working (audio conversion does not work)	2025-02-12 22:42:41 -05:00
Fireblade2534	51b6b01589	Fixed not returning enough values	2025-02-12 15:06:11 +00:00
Fireblade	5cc9d140fe	WIP	2025-02-11 22:36:19 -05:00
Fireblade	45cdb607e6	WIP	2025-02-11 22:32:10 -05:00
Fireblade	ab1c21130e	Made the API use the normalizer, fixed the wrong version of espeak, added better normalization, improved the sentence splitting, fixed some formatting	2025-02-10 21:45:52 -05:00
remsky	a91e0fe9df	Ruff check + formatting	2025-02-09 18:32:17 -07:00
remsky	f61f79981d	-Add debug endpoint for system stats -Adjust headers, generate from phonemes, etc	2025-01-30 04:44:04 -07:00
remsky	9867fc398f	WIP: v1_0_0 migration	2025-01-28 13:52:57 -07:00
remsky	75889e157d	Refactor audio processing and cleanup: remove unused chunker, enhance StreamingAudioWriter for better MP3 handling, and improve text processing compatibility.	2025-01-27 20:23:42 -07:00
remsky	8a60a2b90c	Add StreamingAudioWriter class for audio format conversions and remove deprecated migration notes	2025-01-27 20:23:35 -07:00
remsky	409a9e9af3	Merge remote-tracking branch 'origin/master'	2025-01-27 15:19:28 -07:00
Josh Rosen	b8d592081e	Fix truncated playback issue in streaming WAV responses.	2025-01-26 12:40:45 -08:00
remsky	ee1f7cde18	Add async audio processing and semantic chunking support; flattened static audio trimming	2025-01-24 04:06:47 -07:00
Richard Roberson	d51d861861	add AAC audio format and test	2025-01-17 21:43:10 -07:00
remsky	22752900e5	Ruff checks, ci fix	2025-01-13 20:15:46 -07:00
remsky	387653050b	refactor: streamline audio normalization process and update tests	2025-01-13 18:56:49 -07:00
remsky	926ea8cecf	Refactor Docker configurations and update test mocks for development routers	2025-01-10 22:03:16 -07:00
remsky	e8c1284032	Ruff format + fix	2025-01-09 18:41:44 -07:00
remsky	4b521f9bf0	- Added GenerateFromPhonemesRequest model to text_schemas.py - Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py - Added custom logger configuration in main.py - Deprecated text_processing router -> development route	2025-01-09 07:20:14 -07:00
remsky	720c1fb97d	-update soundfile version -alignment with streaming standards -audio processing config settings -more comprehensive model warmup -minor model improvements -enhancing testing, benchmarking -cool ascii logo	2025-01-06 03:32:41 -07:00
remsky	e799f0c7c1	WIP: basic tests on OpenAI streaming compatibility	2025-01-04 18:09:23 -07:00
remsky	0e9f77fc79	WIP: open ai compatible streaming	2025-01-04 17:55:36 -07:00
remsky	f1eb1d9590	First streaming attempt	2025-01-04 17:54:54 -07:00
remsky	40894449da	added output audio tests, validation	2025-01-02 15:36:53 -07:00
DINMAY KUMAR BRAHMA	8ccca1fcad	Update audio.py	2025-01-03 00:28:59 +05:30
DINMAY KUMAR BRAHMA	94b6fc22ea	Update audio.py	2025-01-01 21:11:23 +05:30
remsky	4123ab0891	Refactor TTS API and enhance testing setup with coverage and logging improvements	2024-12-31 02:55:51 -07:00
remsky	c11a6ea6ea	Enhance TTS API with logging, voice pack loading, and schema updates	2024-12-31 01:57:00 -07:00
remsky	8ce8334345	- Complete TTS endpoint replacement with OpenAI compatible -Removed output directory, and update configuration settings - Added benchmarking for entire novel	2024-12-31 01:52:16 -07:00

40 commits