Commit graph

40 commits

Author SHA1 Message Date
Lukin
ab8ab7d749 Refactor audio processing and text normalization: Update audio normalization to use absolute amplitude threshold, enhance streaming audio writer with MP3 container options, and improve text normalization by stripping spaces and handling special characters to prevent audio artifacts. 2025-05-30 22:52:58 +08:00
remsky
afa879546c CONTRIBUTING + Ruff format 2025-04-04 16:58:07 -06:00
remsky
447f9d360c Ruff check 2025-04-04 16:50:46 -06:00
Fireblade2534
c24aeefbb2 Actually fixed tests this time 2025-03-20 19:15:07 +00:00
Fireblade2534
8f23bf53a4 Initial test commit of segfault fixes 2025-03-20 16:20:28 +00:00
Fireblade
b3d5f4de08 fixes and corrections to code that didn't cause errors but didn't really make sense 2025-03-02 21:36:34 -05:00
Fireblade
c5a3e13670 Converted the stream writer to use pyav 2025-02-19 23:10:51 -05:00
Fireblade
4ee4d36822 Fixes a couple of issues with audio trimming and prevents errors with single voice weights 2025-02-18 18:12:49 -05:00
Fireblade
e3dc959775 Simplify code so everything uses AudioChunks 2025-02-16 15:37:01 -05:00
Fireblade
0b5ec320c7 streaming word level time stamps 2025-02-14 13:37:42 -05:00
Fireblade
7772dbc2e4 fixed no stream file writing 2025-02-13 16:12:51 -05:00
Fireblade
dbf2b99026 Simplified generate_audio in tts_service mostly working (audio conversion does not work) 2025-02-12 22:42:41 -05:00
Fireblade2534
51b6b01589 Fixed not returning enough values 2025-02-12 15:06:11 +00:00
Fireblade
5cc9d140fe WIP 2025-02-11 22:36:19 -05:00
Fireblade
45cdb607e6 WIP 2025-02-11 22:32:10 -05:00
Fireblade
ab1c21130e Made the api use the normalizer, fixed the wrong version of espeak, added better normalization, improved the sentence splitting, fixed some formatting 2025-02-10 21:45:52 -05:00
remsky
a91e0fe9df Ruff check + formatting 2025-02-09 18:32:17 -07:00
remsky
f61f79981d -Add debug endpoint for system stats
-Adjust headers, generate from phonemes, etc
2025-01-30 04:44:04 -07:00
remsky
9867fc398f WIP: v1_0_0 migration 2025-01-28 13:52:57 -07:00
remsky
75889e157d Refactor audio processing and cleanup: remove unused chunker, enhance StreamingAudioWriter for better MP3 handling, and improve text processing compatibility. 2025-01-27 20:23:42 -07:00
remsky
8a60a2b90c Add StreamingAudioWriter class for audio format conversions and remove deprecated migration notes 2025-01-27 20:23:35 -07:00
remsky
409a9e9af3 Merge remote-tracking branch 'origin/master' 2025-01-27 15:19:28 -07:00
Josh Rosen
b8d592081e Fix truncated playback issue in streaming WAV responses. 2025-01-26 12:40:45 -08:00
remsky
ee1f7cde18 Add async audio processing and semantic chunking support; flattened static audio trimming 2025-01-24 04:06:47 -07:00
Richard Roberson
d51d861861 add AAC audio format and test 2025-01-17 21:43:10 -07:00
remsky
22752900e5 Ruff checks, ci fix 2025-01-13 20:15:46 -07:00
remsky
387653050b refactor: streamline audio normalization process and update tests 2025-01-13 18:56:49 -07:00
remsky
926ea8cecf Refactor Docker configurations and update test mocks for development routers 2025-01-10 22:03:16 -07:00
remsky
e8c1284032 Ruff format + fix 2025-01-09 18:41:44 -07:00
remsky
4b521f9bf0 - Added GenerateFromPhonemesRequest model to text_schemas.py
- Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py
- Added custom logger configuration in main.py
- Deprecated text_processing router -> development route
2025-01-09 07:20:14 -07:00
remsky
720c1fb97d -update soundfile version
-alignment with streaming standards
-audio processing config settings
-more comprehensive model warmup
-minor model improvements
-enhancing testing, benchmarking
-cool ascii logo
2025-01-06 03:32:41 -07:00
remsky
e799f0c7c1 WIP: basic tests on OpenAI streaming compatibility 2025-01-04 18:09:23 -07:00
remsky
0e9f77fc79 WIP: open ai compatible streaming 2025-01-04 17:55:36 -07:00
remsky
f1eb1d9590 First streaming attempt 2025-01-04 17:54:54 -07:00
remsky
40894449da added output audio tests, validation 2025-01-02 15:36:53 -07:00
DINMAY KUMAR BRAHMA
8ccca1fcad Update audio.py 2025-01-03 00:28:59 +05:30
DINMAY KUMAR BRAHMA
94b6fc22ea Update audio.py 2025-01-01 21:11:23 +05:30
remsky
4123ab0891 Refactor TTS API and enhance testing setup with coverage and logging improvements 2024-12-31 02:55:51 -07:00
remsky
c11a6ea6ea Enhance TTS API with logging, voice pack loading, and schema updates 2024-12-31 01:57:00 -07:00
remsky
8ce8334345 - Complete TTS endpoint replacement with OpenAI compatible
-Removed output directory, and update configuration settings
- Added benchmarking for entire novel
2024-12-31 01:52:16 -07:00