Lukin
ab8ab7d749
Refactor audio processing and text normalization: Update audio normalization to use absolute amplitude threshold, enhance streaming audio writer with MP3 container options, and improve text normalization by stripping spaces and handling special characters to prevent audio artifacts.
2025-05-30 22:52:58 +08:00
remsky
afa879546c
CONTRIBUTING + Ruff format
2025-04-04 16:58:07 -06:00
remsky
447f9d360c
Ruff check
2025-04-04 16:50:46 -06:00
Fireblade2534
c24aeefbb2
Actually fixed tests this time
2025-03-20 19:15:07 +00:00
Fireblade2534
8f23bf53a4
Initial test commit of segfault fixes
2025-03-20 16:20:28 +00:00
Fireblade
b3d5f4de08
fixes and corrections to code that didn't cause errors but didn't really make sense
2025-03-02 21:36:34 -05:00
Fireblade
c5a3e13670
Converted the stream writer to use pyav
2025-02-19 23:10:51 -05:00
Fireblade
4ee4d36822
Fixes a couple of issues with audio trimming and prevents errors with single voice weights
2025-02-18 18:12:49 -05:00
Fireblade
e3dc959775
Simplify code so everything uses AudioChunks
2025-02-16 15:37:01 -05:00
Fireblade
0b5ec320c7
streaming word level time stamps
2025-02-14 13:37:42 -05:00
Fireblade
7772dbc2e4
Fixed non-streaming file writing
2025-02-13 16:12:51 -05:00
Fireblade
dbf2b99026
Simplified generate_audio in tts_service; mostly working (audio conversion does not work)
2025-02-12 22:42:41 -05:00
Fireblade2534
51b6b01589
Fixed not returning enough values
2025-02-12 15:06:11 +00:00
Fireblade
5cc9d140fe
WIP
2025-02-11 22:36:19 -05:00
Fireblade
45cdb607e6
WIP
2025-02-11 22:32:10 -05:00
Fireblade
ab1c21130e
Made the API use the normalizer, fixed the wrong version of espeak, added better normalization, improved the sentence splitting, fixed some formatting
2025-02-10 21:45:52 -05:00
remsky
a91e0fe9df
Ruff check + formatting
2025-02-09 18:32:17 -07:00
remsky
f61f79981d
-Add debug endpoint for system stats
...
-Adjust headers, generate from phonemes, etc
2025-01-30 04:44:04 -07:00
remsky
9867fc398f
WIP: v1_0_0 migration
2025-01-28 13:52:57 -07:00
remsky
75889e157d
Refactor audio processing and cleanup: remove unused chunker, enhance StreamingAudioWriter for better MP3 handling, and improve text processing compatibility.
2025-01-27 20:23:42 -07:00
remsky
8a60a2b90c
Add StreamingAudioWriter class for audio format conversions and remove deprecated migration notes
2025-01-27 20:23:35 -07:00
remsky
409a9e9af3
Merge remote-tracking branch 'origin/master'
2025-01-27 15:19:28 -07:00
Josh Rosen
b8d592081e
Fix truncated playback issue in streaming WAV responses.
2025-01-26 12:40:45 -08:00
remsky
ee1f7cde18
Add async audio processing and semantic chunking support; flattened static audio trimming
2025-01-24 04:06:47 -07:00
Richard Roberson
d51d861861
add AAC audio format and test
2025-01-17 21:43:10 -07:00
remsky
22752900e5
Ruff checks, ci fix
2025-01-13 20:15:46 -07:00
remsky
387653050b
refactor: streamline audio normalization process and update tests
2025-01-13 18:56:49 -07:00
remsky
926ea8cecf
Refactor Docker configurations and update test mocks for development routers
2025-01-10 22:03:16 -07:00
remsky
e8c1284032
Ruff format + fix
2025-01-09 18:41:44 -07:00
remsky
4b521f9bf0
- Added GenerateFromPhonemesRequest model to text_schemas.py
...
- Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py
- Added custom logger configuration in main.py
- Deprecated text_processing router -> development route
2025-01-09 07:20:14 -07:00
remsky
720c1fb97d
-update soundfile version
...
-alignment with streaming standards
-audio processing config settings
-more comprehensive model warmup
-minor model improvements
-enhancing testing, benchmarking
-cool ascii logo
2025-01-06 03:32:41 -07:00
remsky
e799f0c7c1
WIP: basic tests on OpenAI streaming compatibility
2025-01-04 18:09:23 -07:00
remsky
0e9f77fc79
WIP: OpenAI-compatible streaming
2025-01-04 17:55:36 -07:00
remsky
f1eb1d9590
First streaming attempt
2025-01-04 17:54:54 -07:00
remsky
40894449da
added output audio tests, validation
2025-01-02 15:36:53 -07:00
DINMAY KUMAR BRAHMA
8ccca1fcad
Update audio.py
2025-01-03 00:28:59 +05:30
DINMAY KUMAR BRAHMA
94b6fc22ea
Update audio.py
2025-01-01 21:11:23 +05:30
remsky
4123ab0891
Refactor TTS API and enhance testing setup with coverage and logging improvements
2024-12-31 02:55:51 -07:00
remsky
c11a6ea6ea
Enhance TTS API with logging, voice pack loading, and schema updates
2024-12-31 01:57:00 -07:00
remsky
8ce8334345
- Complete TTS endpoint replacement with OpenAI compatible
...
- Removed output directory and updated configuration settings
- Added benchmarking for entire novel
2024-12-31 01:52:16 -07:00