Fireblade2534
9f67366278
Merge branch 'master' into master
2025-03-11 11:06:33 -04:00
CodePothunter
d2b93e8da1
Fix speed parameter support for TTS generation
...
- Update InstancePool to accept and process speed parameter
- Modify TTSService to pass speed to instance pool
- Update Test.py with new port and authentication
- Adjust start-gpu.sh to use port 50888
2025-03-11 20:49:41 +08:00
Fireblade2534
dafc87ddef
Merge pull request #199 from blakkd/master
...
converted CRLF ending lines to LF ones in api/src/structures/custom_responses.py
2025-03-10 18:14:29 -04:00
Fireblade2534
6edc44edf3
Update docker-compose.yml
2025-03-10 18:12:52 -04:00
Fireblade2534
4d0f72b84e
Merge pull request #232 from FotieMConstant/patch-1
...
docs: added a note for Apple Silicon users regarding GPU build
2025-03-10 18:05:35 -04:00
CodePothunter
e67264f789
Fix bugs in streaming non-WAV audio formats; improve robustness of audio container release
...
Refactor StreamingAudioWriter to improve audio encoding reliability
- Restructure audio encoding logic for better error handling
- Create a new method `_create_container()` to manage container creation
- Improve handling of different audio formats and encoding scenarios
- Add error logging for audio chunk encoding failures
- Simplify container and stream management in write_chunk method
2025-03-10 13:26:55 +08:00
fotiecodes
c3d1f0f45a
docs: added note for Apple Silicon users regarding GPU build
...
Clarified in the Docker documentation that the GPU build is CUDA-only and not supported on Apple Silicon (M1/M2/M3) to avoid confusion. Advised using the CPU build (docker/cpu) and noted that MPS support is still to come.
2025-03-10 00:20:42 +03:00
Fireblade
fbdedfb131
Combine the language code checks
2025-03-09 15:16:45 -04:00
CodePothunter
f998cf8d01
Fix bug that generated an empty file when using streaming mode.
...
The cause is that when stream=True, the audio conversion functions are not actually called.
2025-03-09 14:12:18 +08:00
Fireblade
3e6ee65482
Simple fixes and translations
2025-03-08 22:48:52 -05:00
CodePothunter
70c0d506de
Add start-gpu.sh script for GPU-enabled FastAPI deployment
...
- Create GPU-specific startup script
- Set environment variables for GPU and project configuration
- Use uv to install GPU extras and run FastAPI server
2025-03-07 20:24:07 +08:00
CodePothunter
6e79b252d0
Merge branch 'master' of https://github.com/CodePothunter/Kokoro-FastAPI
2025-03-07 20:20:18 +08:00
CodePothunter
2dc9b81ad5
Fix audio chunk concatenation and dtype conversion conflict
...
- Modify audio chunk concatenation to handle float32 audio data
- Add explicit conversion from float32 to int16 using amplitude scaling
- Remove unnecessary dtype specification in np.concatenate
2025-03-07 20:17:35 +08:00
CodePothunter
8fe85c3386
Delete start-gpu.sh
2025-03-07 15:20:04 +08:00
CodePothunter
5c8f941f06
Add API authentication and configuration improvements
...
- Implement OpenAI-compatible API key authentication
- Add configuration options for GPU instances, concurrency, and request handling
- Update README with authentication instructions
- Modify configuration and routing to support optional API key verification
- Enhance system information and debug endpoints to expose authentication status
2025-03-07 11:36:13 +08:00
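The optional API key check described in the commit above could look roughly like the following. This is a minimal sketch, not the project's actual code: it assumes the key arrives in the `Authorization: Bearer <key>` header, as OpenAI clients send it, and the function name `is_authorized` and the `valid_keys` parameter are hypothetical.

```python
import hmac


def is_authorized(authorization_header, valid_keys):
    """Return True if the header carries a known key, or if no keys are configured.

    Hypothetical helper: `valid_keys` stands in for whatever the server
    loads from its configuration.
    """
    if not valid_keys:
        # Authentication is optional: with no keys configured, allow everything.
        return True
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return any(
        hmac.compare_digest(token.encode(), key.encode()) for key in valid_keys
    )
```

A request handler would call this with the raw header value and reject the request (typically with HTTP 401) when it returns False.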
Fireblade2534
a578d22084
Merge pull request #221 from Chuui9739/fix-MediaSource-error
...
Repair the error 'Error: Error generating speech: Failed to execute '…
2025-03-06 17:52:21 -05:00
Chuui9739
d69a4c3b6e
Update AudioService.js
...
Change tab to space
2025-03-05 17:30:11 +08:00
Anthony
f4970a92f4
Repair the error 'Error: Error generating speech: Failed to execute 'endOfStream' on 'MediaSource': The 'updating' attribute is true on one or more of this MediaSource's SourceBuffers.'
2025-03-05 17:04:53 +08:00
Fireblade2534
d67570ab21
Merge pull request #210 from fireblade2534/preserve-custom-phenomes
...
This fix allows inputting custom pronunciations through text. For example: "This is a test of a [bla bla](/ðɪs ɪz ˈoʊnli ɐ tˈɛst/) system." It ensures that normalization does not affect custom pronunciations.
2025-03-02 14:37:07 -05:00
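One way to keep normalization away from custom pronunciation markup, as the pull request above describes, is to split the text around `[word](/phonemes/)` spans and normalize only the pieces in between. The sketch below is an illustration under assumptions, not the merged implementation: the regular expression, `normalize_preserving_phonemes`, and the stand-in `toy_normalize` are all hypothetical.

```python
import re

# Matches custom pronunciation markup of the form [word](/phonemes/).
# Assumption: the phoneme string contains no "/" or ")" characters.
CUSTOM_PHONEME = re.compile(r"\[[^\]]*\]\(/[^/)]*/\)")


def normalize_preserving_phonemes(text, normalize):
    """Apply `normalize` only to the text outside custom pronunciation spans."""
    parts = []
    last = 0
    for match in CUSTOM_PHONEME.finditer(text):
        parts.append(normalize(text[last:match.start()]))
        parts.append(match.group(0))  # markup is copied through verbatim
        last = match.end()
    parts.append(normalize(text[last:]))
    return "".join(parts)


def toy_normalize(text):
    # Stand-in for a real text normalizer: expands "&" to "and".
    return text.replace("&", "and")


example = "Salt & pepper [bla bla](/ðɪs & ðæt/) system."
result = normalize_preserving_phonemes(example, toy_normalize)
```

Here the `&` outside the markup is expanded while the one inside the phoneme string is left untouched.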
Fireblade2534
43576c4a76
Remove random 1
2025-03-01 12:45:41 -05:00
Fireblade2534
2a54140c46
Merge pull request #211 from fireblade2534/master
...
fixes the low quality fix not working properly
2025-02-28 22:01:00 -05:00
Fireblade
226a75e782
fixes the low quality fix not working properly
2025-02-28 21:57:33 -05:00
Fireblade
f415ce7109
don't replace brackets as that is handled in misaki
2025-02-28 21:39:12 -05:00
Fireblade
906cf77a65
preserve custom phonemes
2025-02-28 21:37:46 -05:00
remsky
9c6e72943c
Merge pull request #207 from fireblade2534/master
...
Fix low quality because audio was being encoded at a lower bitrate
2025-02-27 03:43:38 -07:00
Fireblade
9247bc3a12
notremoved the rate argument which apparently means bitrate
2025-02-26 21:51:00 -05:00
Fireblade
980bc5b4a8
Fix low quality because audio was being encoded at a lower bitrate
2025-02-26 20:52:38 -05:00
blakkd
664451e11c
added docker to video group
2025-02-24 02:16:07 +01:00
blakkd
3c5029f801
converted CRLF ending lines to LF ones in api/src/structures/custom_responses.py
...
let ruff organise the imports
2025-02-24 02:11:48 +01:00
remsky
7d73c3c7ee
Merge pull request #173 from fireblade2534/streaming-word-timestamps
...
Streaming word timestamps
2025-02-22 23:12:22 -07:00
Fireblade
e6feea78a3
Testing error
2025-02-22 15:29:26 -05:00
Fireblade
5de3cace3b
Fix some tests and allow running the docker container offline
2025-02-22 15:17:28 -05:00
Fireblade
c1207f085b
Merge remote-tracking branch 'upstream/master' into streaming-word-timestamps
2025-02-22 14:58:28 -05:00
remsky
39cc056fe2
Merge pull request #179 from fireblade2534/normalization-changes
2025-02-21 20:00:15 -07:00
remsky
3fd37b837b
Merge pull request #186 from fireblade2534/Add-.gitattribues-file
2025-02-21 19:59:53 -07:00
remsky
a6defbff18
Merge pull request #171 from randombk/pr-no-reload
2025-02-21 19:59:21 -07:00
Fireblade
c5a3e13670
Converted the stream writer to use pyav
2025-02-19 23:10:51 -05:00
Fireblade
4ee4d36822
Fixes a couple of issues with audio trimming and prevents errors with single voice weights
2025-02-18 18:12:49 -05:00
Fireblade
7f15ba8fed
Add a .gitattributes
2025-02-18 17:44:03 -05:00
Fireblade
f2b2f41412
fixed wrong variable name bug
2025-02-16 17:07:41 -05:00
Fireblade
cb22aab239
Fix streaming a wav file with captions not returning any captions (This is only a problem because wav streaming does not actually work)
2025-02-16 16:49:33 -05:00
Fireblade
e3dc959775
Simplify code so everything uses AudioChunks
2025-02-16 15:37:01 -05:00
Fireblade
9c0e328318
made it skip text normalization when using other languages as it only supports english
2025-02-16 14:16:18 -05:00
Fireblade
41598eb3c5
better parsing for times and phone numbers
2025-02-15 19:02:57 -05:00
Fireblade
3290bada2e
changes to how money and numbers are handled
2025-02-15 17:48:12 -05:00
Fireblade
4802128943
Replaced default voice with af_heart as af doesn't exist
2025-02-15 12:36:36 -05:00
Fireblade
8c457c3292
fixed final test
2025-02-15 09:49:15 -05:00
Fireblade
1a6e7abac3
fixed a bunch of tests
2025-02-15 09:40:01 -05:00
Fireblade
1a03ac7464
Fixed some tests
2025-02-14 15:00:47 -05:00
Fireblade
353fe79690
fix small error
2025-02-14 14:39:24 -05:00