Commit graph

66 commits

Author SHA1 Message Date
remsky
f11a6b3e2b
Revert "Adds support for creating weighted voice combinations" 2025-02-09 22:41:42 -07:00
remsky
d5709097e2
Merge pull request #92 from rvuyyuru2/v0.1.2-pre
Adds support for creating weighted voice combinations (reimplemented in v0.2.0)
2025-02-09 22:37:16 -07:00
remsky
00497f8872 Refactor: Consolidate PyTorch CPU and GPU backends into a single PyTorchBackend class; remove obsolete files 2025-01-25 13:33:42 -07:00
rvuyyuru2
44c62467ae Adds support for creating weighted voice combinations
Implements a new method to parse weighted voice formulas and generate combined audio outputs based on specified weights.

This enhancement allows for more diverse audio generation by letting users specify multiple voices with respective weights, improving flexibility in voice management.

Updates voice processing logic in relevant API routes to handle weighted formulas seamlessly.

Fixes #123 (if applicable, replace with the actual issue reference)
2025-01-25 20:54:21 +05:30
remsky
3547d95ee6 -unified streaming implementation 2025-01-25 05:25:13 -07:00
remsky
9efb9db4d9 Fix: VoiceManager singleton instantiation 2025-01-24 05:30:56 -07:00
remsky
20658f9759 Performance: Adjust session timeout and GPU memory limit; minimize voice pre-caching and improve singleton instance management 2025-01-24 05:01:38 -07:00
remsky
ee1f7cde18 Add async audio processing and semantic chunking support; flattened static audio trimming 2025-01-24 04:06:47 -07:00
remsky
8eb3525382 Refactor configuration and enhance web interface: update GPU settings, add speed control, and improve input handling for audio generation 2025-01-23 04:54:55 -07:00
remsky
ba577d348e Enhance web player information, adjust text chunk size, update audio wave settings, and implement OpenAI model mappings 2025-01-23 04:11:31 -07:00
remsky
8e8f120a3e Update configuration to disable local voice saving, enhance voice validation logic, and remove deprecated test file 2025-01-23 02:00:46 -07:00
remsky
df4cc5b4b2 -Adjust testing framework for new model
-Add web player support: include static file serving and HTML interface for TTS
2025-01-22 21:11:47 -07:00
remsky
66f46e82f9 Refactor ONNX GPU backend and phoneme generation: improve token handling, add chunk processing for audio generation, and introduce initial stitch options for audio chunks. 2025-01-22 17:43:38 -07:00
remsky
d50214d3be Enable ONNX GPU support in Docker configurations and refactor model file handling 2025-01-22 05:00:38 -07:00
remsky
4a24be1605 Refactor model loading and configuration: update and adjust model loading device, add async streaming examples, and remove unused warmup service. 2025-01-22 02:33:29 -07:00
remsky
21bf810f97 Enhance model inference: update documentation, add model download scripts for PyTorch and ONNX, and refactor configuration handling 2025-01-21 21:44:21 -07:00
remsky
ab28a62e86 Refactor inference architecture: remove legacy TTS model, add ONNX and PyTorch backends, and introduce model configuration schemas 2025-01-20 22:42:29 -07:00
Richard Roberson
d51d861861 add AAC audio format and test 2025-01-17 21:43:10 -07:00
Fireblade2534
eb556ec7d3 Fixed python tests so they run properly and cleaned up some unneeded files 2025-01-17 14:55:25 +00:00
remsky
d20da2f92e Default hexxgrad voicepacks added as temporary fix 2025-01-15 09:42:27 +00:00
remsky
8bc8661930 fix: update model directory paths and improve logging in TTS services 2025-01-14 06:37:03 -07:00
remsky
cf72e4ed2b Add interruptible streams 2025-01-13 23:25:06 -07:00
remsky
064313450e fix: test of cicd 2025-01-13 20:18:02 -07:00
remsky
22752900e5 Ruff checks, ci fix 2025-01-13 20:15:46 -07:00
remsky
007b1a35e8 feat: merge master into core/uv-management for v0.1.0
Major changes:
- Baked model directly into Dockerfile for improved deployment
- Switched to uv for dependency management
- Restructured Docker files into docker/cpu and docker/gpu directories
- Updated configuration for better ONNX performance
2025-01-13 19:31:44 -07:00
remsky
387653050b refactor: streamline audio normalization process and update tests 2025-01-13 18:56:49 -07:00
remsky
f4dc292440 fix: ui stability, memory safeties 2025-01-12 21:33:23 -07:00
remsky
3d0ca2a8c2 Update Dockerfiles for baked in models, adjustments to cpu/gpu environment splits 2025-01-12 05:23:02 -07:00
remsky
926ea8cecf Refactor Docker configurations and update test mocks for development routers 2025-01-10 22:03:16 -07:00
remsky
e8c1284032 Ruff format + fix 2025-01-09 18:41:44 -07:00
remsky
4b521f9bf0 - Added GenerateFromPhonemesRequest model to text_schemas.py
- Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py
- Added custom logger configuration in main.py
- Deprecated text_processing router -> development route
2025-01-09 07:20:14 -07:00
remsky
bd4df84410
Merge pull request #12 from fireblade2534/master
Gave it a test, didn't see any issues 👍
2025-01-09 01:30:07 -07:00
Fireblade
1f22cda9be Fix remaining slashes not being converted into text and made % be converted 2025-01-08 08:50:22 -05:00
remsky
a0a85f5ef0 -add email handling, minor additional URL processing, tests 2025-01-08 03:13:17 -07:00
remsky
e7ffcf49f5 fixed: async scandir finding voices 2025-01-07 21:36:07 -07:00
Fireblade
1625082724 Fix URL parsing for URLs without https, http, or www. It also allows raw IPs, ports, and dashes 2025-01-07 19:34:38 -05:00
remsky
d7e8a5c953 Adjusting aiofiles implementation, testing 2025-01-07 04:30:02 -07:00
remsky
130b084cce - Added support for combining voices via any endpoint
- Updated the `process_voices` function to handle both string and list formats for voice input.
2025-01-07 03:50:08 -07:00
remsky
fddf26c905 Added tests, slight changes to regex 2025-01-07 00:18:44 -07:00
Fireblade
db2f3dd323 Made urls readable 2025-01-06 19:40:21 -05:00
remsky
720c1fb97d -update soundfile version
-alignment with streaming standards
-audio processing config settings
-more comprehensive model warmup
-minor model improvements
-enhancing testing, benchmarking
-cool ascii logo
2025-01-06 03:32:41 -07:00
remsky
4c6cd83f85 Swapped generator to preprocessing 2025-01-04 22:23:59 -07:00
remsky
e799f0c7c1 WIP: basic tests on OpenAI streaming compatibility 2025-01-04 18:09:23 -07:00
remsky
0e9f77fc79 WIP: open ai compatible streaming 2025-01-04 17:55:36 -07:00
remsky
f1eb1d9590 First streaming attempt 2025-01-04 17:54:54 -07:00
remsky
93aa205da9 Enhance ONNX optimization settings and add validation script for TTS audio files 2025-01-04 02:14:46 -07:00
remsky
7df2a68fb4 - CPU ONNX + PyTorch CUDA, functional
- Incorporated text processing module as service, towards modularization and optimizations
- Added text processing router for phonemization
- Enhanced benchmark statistics with real-time speed metrics
2025-01-03 17:54:17 -07:00
remsky
9496a3a63f WIP: CPU/GPU Functional, few straggling tests to fix and check. 2025-01-03 03:16:42 -07:00
remsky
e4d8e74738 WIP, Functional for CPU: Updated for ONNX runtime support, Dockerfile and TTS Service 2025-01-03 00:53:41 -07:00
remsky
40894449da added output audio tests, validation 2025-01-02 15:36:53 -07:00