remsky
f11a6b3e2b
Revert "Adds support for creating weighted voice combinations"
2025-02-09 22:41:42 -07:00
remsky
d5709097e2
Merge pull request #92 from rvuyyuru2/v0.1.2-pre
...
Adds support for creating weighted voice combinations (reimplemented in v0.2.0)
2025-02-09 22:37:16 -07:00
remsky
00497f8872
Refactor: Consolidate PyTorch CPU and GPU backends into a single PyTorchBackend class; remove obsolete files
2025-01-25 13:33:42 -07:00
rvuyyuru2
44c62467ae
Adds support for creating weighted voice combinations
...
Implements a new method to parse weighted voice formulas and generate combined audio outputs based on specified weights.
This enhancement allows for more diverse audio generation by letting users specify multiple voices with respective weights, improving flexibility in voice management.
Updates voice processing logic in relevant API routes to handle weighted formulas seamlessly.
Fixes #123 (if applicable, replace with the actual issue reference)
2025-01-25 20:54:21 +05:30
remsky
3547d95ee6
-unified streaming implementation
2025-01-25 05:25:13 -07:00
remsky
9efb9db4d9
Fix: VoiceManager singleton instantiation
2025-01-24 05:30:56 -07:00
remsky
20658f9759
Performance: Adjust session timeout and GPU memory limit; minimize voice pre-caching and improve singleton instance management
2025-01-24 05:01:38 -07:00
remsky
ee1f7cde18
Add async audio processing and semantic chunking support; flattened static audio trimming
2025-01-24 04:06:47 -07:00
remsky
8eb3525382
Refactor configuration and enhance web interface: update GPU settings, add speed control, and improve input handling for audio generation
2025-01-23 04:54:55 -07:00
remsky
ba577d348e
Enhance web player information, adjust text chunk size, update audio wave settings, and implement OpenAI model mappings
2025-01-23 04:11:31 -07:00
remsky
8e8f120a3e
Update configuration to disable local voice saving, enhance voice validation logic, and remove deprecated test file
2025-01-23 02:00:46 -07:00
remsky
df4cc5b4b2
-Adjust testing framework for new model
...
-Add web player support: include static file serving and HTML interface for TTS
2025-01-22 21:11:47 -07:00
remsky
66f46e82f9
Refactor ONNX GPU backend and phoneme generation: improve token handling, add chunk processing for audio generation, and introduce initial stitch options for audio chunks.
2025-01-22 17:43:38 -07:00
remsky
d50214d3be
Enable ONNX GPU support in Docker configurations and refactor model file handling
2025-01-22 05:00:38 -07:00
remsky
4a24be1605
Refactor model loading and configuration: update and adjust model loading device, add async streaming examples, and remove unused warmup service.
2025-01-22 02:33:29 -07:00
remsky
21bf810f97
Enhance model inference: update documentation, add model download scripts for PyTorch and ONNX, and refactor configuration handling
2025-01-21 21:44:21 -07:00
remsky
ab28a62e86
Refactor inference architecture: remove legacy TTS model, add ONNX and PyTorch backends, and introduce model configuration schemas
2025-01-20 22:42:29 -07:00
Richard Roberson
d51d861861
add AAC audio format and test
2025-01-17 21:43:10 -07:00
Fireblade2534
eb556ec7d3
Fixed Python tests so they run properly and cleaned up some unneeded files
2025-01-17 14:55:25 +00:00
remsky
d20da2f92e
Default hexgrad voicepacks added as temporary fix
2025-01-15 09:42:27 +00:00
remsky
8bc8661930
fix: update model directory paths and improve logging in TTS services
2025-01-14 06:37:03 -07:00
remsky
cf72e4ed2b
Add interruptible streams
2025-01-13 23:25:06 -07:00
remsky
064313450e
fix: test of cicd
2025-01-13 20:18:02 -07:00
remsky
22752900e5
Ruff checks, ci fix
2025-01-13 20:15:46 -07:00
remsky
007b1a35e8
feat: merge master into core/uv-management for v0.1.0
...
Major changes:
- Baked model directly into Dockerfile for improved deployment
- Switched to uv for dependency management
- Restructured Docker files into docker/cpu and docker/gpu directories
- Updated configuration for better ONNX performance
2025-01-13 19:31:44 -07:00
remsky
387653050b
refactor: streamline audio normalization process and update tests
2025-01-13 18:56:49 -07:00
remsky
f4dc292440
fix: ui stability, memory safeties
2025-01-12 21:33:23 -07:00
remsky
3d0ca2a8c2
Update Dockerfiles for baked-in models, adjustments to CPU/GPU environment splits
2025-01-12 05:23:02 -07:00
remsky
926ea8cecf
Refactor Docker configurations and update test mocks for development routers
2025-01-10 22:03:16 -07:00
remsky
e8c1284032
Ruff format + fix
2025-01-09 18:41:44 -07:00
remsky
4b521f9bf0
- Added GenerateFromPhonemesRequest model to text_schemas.py
...
- Refactored TTS model initialization methods in tts_gpu.py and tts_cpu.py
- Added custom logger configuration in main.py
- Deprecated text_processing router -> development route
2025-01-09 07:20:14 -07:00
remsky
bd4df84410
Merge pull request #12 from fireblade2534/master
...
Gave it a test, didn't see any issues 👍
2025-01-09 01:30:07 -07:00
Fireblade
1f22cda9be
Fix remaining slashes not being converted into text; also convert % into text
2025-01-08 08:50:22 -05:00
remsky
a0a85f5ef0
-add email handling, minor additional URL processing, tests
2025-01-08 03:13:17 -07:00
remsky
e7ffcf49f5
fixed: async scandir finding voices
2025-01-07 21:36:07 -07:00
Fireblade
1625082724
Fix URL parsing for URLs without https, http, or www. Also allows raw IPs, ports, and dashes
2025-01-07 19:34:38 -05:00
remsky
d7e8a5c953
Adjusting aiofiles implementation, testing
2025-01-07 04:30:02 -07:00
remsky
130b084cce
- Added support for combining voices via any endpoint
...
- Updated the `process_voices` function to handle both string and list formats for voice input.
2025-01-07 03:50:08 -07:00
remsky
fddf26c905
Added tests, slight changes to regex
2025-01-07 00:18:44 -07:00
Fireblade
db2f3dd323
Made urls readable
2025-01-06 19:40:21 -05:00
remsky
720c1fb97d
-update soundfile version
...
-alignment with streaming standards
-audio processing config settings
-more comprehensive model warmup
-minor model improvements
-enhancing testing, benchmarking
-cool ascii logo
2025-01-06 03:32:41 -07:00
remsky
4c6cd83f85
Swapped generator to preprocessing
2025-01-04 22:23:59 -07:00
remsky
e799f0c7c1
WIP: basic tests on OpenAI streaming compatibility
2025-01-04 18:09:23 -07:00
remsky
0e9f77fc79
WIP: OpenAI-compatible streaming
2025-01-04 17:55:36 -07:00
remsky
f1eb1d9590
First streaming attempt
2025-01-04 17:54:54 -07:00
remsky
93aa205da9
Enhance ONNX optimization settings and add validation script for TTS audio files
2025-01-04 02:14:46 -07:00
remsky
7df2a68fb4
- CPU ONNX + PyTorch CUDA, functional
...
- Incorporated text processing module as service, towards modularization and optimizations
- Added text processing router for phonemization
- Enhanced benchmark statistics with real-time speed metrics
2025-01-03 17:54:17 -07:00
remsky
9496a3a63f
WIP: CPU/GPU Functional, few straggling tests to fix and check.
2025-01-03 03:16:42 -07:00
remsky
e4d8e74738
WIP, Functional for CPU: Updated for ONNX runtime support, Dockerfile and TTS Service
2025-01-03 00:53:41 -07:00
remsky
40894449da
added output audio tests, validation
2025-01-02 15:36:53 -07:00