Commit graph

172 commits

Author SHA1 Message Date
Lukin
0d1dd666f2 Refactor TTS service to simplify language code determination and improve logging. Removed unnecessary comments and streamlined voice path handling for clarity. 2025-04-08 11:40:51 +08:00
Lukin
66201494d0 Refactor TTS service to improve voice combination logic and error handling. Updated voice parsing to support combined voices with weights, enhanced normalization handling, and streamlined audio generation process. Improved logging for better debugging and removed unnecessary comments for clarity. 2025-04-08 11:38:07 +08:00
Lukin
7a838ab3e8 Refactor TTS service to improve audio chunk handling and filename safety. Removed unnecessary comments, adjusted text processing for legacy backends, and enhanced error handling during audio stream generation. Updated filename regex to restrict allowed characters for safer filenames. 2025-04-08 11:17:46 +08:00
Lukin
88b9349198 Enhance StreamingAudioWriter to support MP3 encoding without Xing VBR header and conditionally set bit rate for applicable formats. Improved error handling by using self.format in exceptions. 2025-04-08 10:21:43 +08:00
Lukin
207d709de1 Refactor TTS service to improve filename safety and audio chunk handling. Updated filename regex to allow additional characters, enhanced silence chunk creation for AudioService, and ensured final audio output is consistently in int16 format. Removed premature writer closure in the finalization process, delegating responsibility to the caller. 2025-04-08 09:38:59 +08:00
Lukin
4b334beff4 Enhance test coverage for text processing and TTS service. Updated assertions in test_get_sentence_info_phenomoes to verify placeholder presence and token counts. Modified smart_split tests to unpack additional values and ensure proper handling of text and tokens. Improved clarity in test assertions for punctuation preservation. 2025-04-08 00:47:12 +08:00
Lukin
c0da571857 Refactor TTS service and text processing to enhance handling of pauses, newlines, and custom phonemes. Updated smart_split to manage pause tags and improved error logging. Adjusted audio generation logic for better performance and clarity. 2025-04-07 14:12:18 +08:00
Lukin
b31f79d8d7 Enhance TTS service to handle pauses and trailing newlines in text processing. Updated smart_split to preserve newlines and added logic for generating silence chunks during pauses. Improved error handling and logging for audio processing. 2025-04-07 13:21:49 +08:00
Fireblade2534
d004b6d304
Apply suggestions from copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-04 19:06:59 -04:00
Fireblade2534
7f0e06ea6b
Update normalizer.py 2025-04-04 19:06:13 -04:00
remsky
afa879546c CONTRIBUTING + Ruff format 2025-04-04 16:58:07 -06:00
remsky
447f9d360c Ruff check 2025-04-04 16:50:46 -06:00
remsky
65f6b979c3 Enhance temp file handling with error tracking and update Docker Compose to run as non-root user
2025-03-29 17:01:15 -06:00
Fireblade2534
d712308f98 Fixes relating to parsing money and tests. Also readme stuff 2025-03-21 18:03:09 +00:00
Fireblade2534
c24aeefbb2 Actually fixed tests this time 2025-03-20 19:15:07 +00:00
Fireblade2534
c902b2ca0d probably fix tests 2025-03-20 16:27:18 +00:00
Fireblade2534
8f23bf53a4 Initial test commit of segfault fixes 2025-03-20 16:20:28 +00:00
remsky
0d7570ab50
Merge pull request #240 from fireblade2534/fixes
2025-03-18 04:27:17 -06:00
Fireblade
9f9e9b601e Fixes not returning a download link if streaming is off and return_download_link is true 2025-03-13 16:23:49 -04:00
Fireblade2534
acb7d05515
Merge branch 'master' into master 2025-03-12 11:17:44 -04:00
remsky
e4744f5545
Merge pull request #235 from fireblade2534/fixes 2025-03-12 02:22:04 -06:00
Fireblade
aa403f2070 Adds the ability to subtract voices 2025-03-11 14:28:48 -04:00
Fireblade2534
dafc87ddef
Merge pull request #199 from blakkd/master
converted CRLF ending lines to LF ones in api/src/structures/custom_responses.py
2025-03-10 18:14:29 -04:00
Cong Nguyen
9a9bc4aca9 added support for mps on mac with apple silicon 2025-03-10 11:58:45 +11:00
Fireblade2534
f2c5bc1b71
Merge branch 'remsky:master' into fixes 2025-03-02 21:39:17 -05:00
Fireblade
b3d5f4de08 fixes and corrections to code that didn't cause errors but didn't really make sense 2025-03-02 21:36:34 -05:00
Fireblade2534
d67570ab21
Merge pull request #210 from fireblade2534/preserve-custom-phenomes
This fix allows for inputting custom pronunciations through text. For example: "This is a test of a [bla bla](/ðɪs ɪz ˈoʊnli ɐ tˈɛst/) system." It ensures that normalization does not affect custom pronunciations.
2025-03-02 14:37:07 -05:00
Fireblade2534
43576c4a76
Remove random 1 2025-03-01 12:45:41 -05:00
Fireblade
226a75e782 fixes the low quality fix not working properly 2025-02-28 21:57:33 -05:00
Fireblade
f415ce7109 don't replace brackets as that is handled in misaki 2025-02-28 21:39:12 -05:00
Fireblade
906cf77a65 preserve custom phonemes 2025-02-28 21:37:46 -05:00
Fireblade
9247bc3a12 notremoved the rate argument which apparently means bitrate 2025-02-26 21:51:00 -05:00
Fireblade
980bc5b4a8 Fix low quality because audio was being encoded at a lower bitrate 2025-02-26 20:52:38 -05:00
blakkd
3c5029f801 converted CRLF ending lines to LF ones in api/src/structures/custom_responses.py
let ruff organise the imports
2025-02-24 02:11:48 +01:00
Fireblade
5de3cace3b Fix some tests and allow running the docker container offline 2025-02-22 15:17:28 -05:00
Fireblade
c1207f085b Merge remote-tracking branch 'upstream/master' into streaming-word-timestamps 2025-02-22 14:58:28 -05:00
remsky
39cc056fe2
Merge pull request #179 from fireblade2534/normalization-changes
2025-02-21 20:00:15 -07:00
Fireblade
c5a3e13670 Converted the stream writer to use pyav 2025-02-19 23:10:51 -05:00
Fireblade
4ee4d36822 Fixes a couple of issues with audio trimming and prevents errors with single voice weights 2025-02-18 18:12:49 -05:00
Fireblade
7f15ba8fed Add a .gitattributes 2025-02-18 17:44:03 -05:00
Fireblade
f2b2f41412 fixed wrong variable name bug 2025-02-16 17:07:41 -05:00
Fireblade
cb22aab239 Fix streaming a wav file with captions not returning any captions (This is only a problem because wav streaming does not actually work) 2025-02-16 16:49:33 -05:00
Fireblade
e3dc959775 Simplify code so everything uses AudioChunks 2025-02-16 15:37:01 -05:00
Fireblade
9c0e328318 made it skip text normalization when using other languages as it only supports english 2025-02-16 14:16:18 -05:00
Fireblade
41598eb3c5 better parsing for times and phone numbers 2025-02-15 19:02:57 -05:00
Fireblade
3290bada2e changes to how money and numbers are handled 2025-02-15 17:48:12 -05:00
Fireblade
4802128943 Replaced default voice with af_heart as af doesn't exist 2025-02-15 12:36:36 -05:00
Fireblade
8c457c3292 fixed final test 2025-02-15 09:49:15 -05:00
Fireblade
1a6e7abac3 fixed a bunch of tests 2025-02-15 09:40:01 -05:00
Fireblade
1a03ac7464 Fixed some tests 2025-02-14 15:00:47 -05:00