mirror of https://github.com/santinic/audiblez.git synced 2025-08-05 16:48:55 +00:00

Generate audiobooks from e-books

Audiblez: Generate audiobooks from e-books

v3.0 Now with CUDA support!

Audiblez generates .m4b audiobooks from regular .epub e-books, using Kokoro's high-quality speech synthesis.

Kokoro-82M is a recently published text-to-speech model with just 82M parameters and very natural-sounding output. It is released under the Apache licence and was trained on less than 100 hours of audio. It currently supports American and British English in a bunch of very good voices.

Future support for French, Korean, Japanese and Mandarin is planned.

On a Google Colab T4 GPU via CUDA, it takes about 5 minutes to convert "Animal Farm" by Orwell (which is about 160,000 characters) to an audiobook, at a rate of about 600 characters per second.

On my M2 MacBook Pro, on CPU, it takes about 1 hour, at a rate of about 60 characters per second.
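To get a rough idea of how long your own book might take, you can divide its character count by one of the rates quoted above. This is only a back-of-the-envelope sketch: `estimate_seconds` is a hypothetical helper, not part of audiblez, and real throughput will vary with hardware and text.

```python
def estimate_seconds(num_chars: int, chars_per_second: float) -> float:
    """Rough conversion time: characters divided by throughput."""
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    return num_chars / chars_per_second


# Figures quoted in this README: ~160,000 characters, ~600 chars/s on a T4, ~60 chars/s on CPU.
BOOK_CHARS = 160_000
for label, rate in [("T4 GPU (CUDA)", 600), ("M2 CPU", 60)]:
    minutes = estimate_seconds(BOOK_CHARS, rate) / 60
    print(f"{label}: about {minutes:.0f} minutes")
```

With these inputs the estimate comes out at roughly 4.4 minutes on the GPU and 44 minutes on the CPU, in the same ballpark as the timings reported above.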

How to install and run

If you have Python 3 on your computer, you can install audiblez with pip. You also need espeak-ng installed on your machine:

pip install audiblez
sudo apt install espeak-ng  # on Ubuntu/Debian
brew install espeak-ng      # on Mac

Then, to convert an epub file into an audiobook, just run:

audiblez book.epub -v af_sky

It will first create a series of files named book_chapter_1.wav, book_chapter_2.wav, etc. in the same directory, and at the end it will produce a book.m4b file containing the whole book, which you can listen to with VLC or any audiobook player. The .m4b file is only produced if you have ffmpeg installed on your machine.
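Since the final .m4b step depends on ffmpeg, and synthesis depends on espeak-ng, it can help to check for both before starting a long conversion. The sketch below assumes both tools are exposed as executables named `espeak-ng` and `ffmpeg` on your PATH; `missing_tools` is a hypothetical helper, not something audiblez provides.

```python
import shutil


def missing_tools(tools=("espeak-ng", "ffmpeg")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("espeak-ng and ffmpeg are both available")
```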

Speed

By default the audio is generated at normal speed, but you can slow it down to half speed or speed it up to double speed by passing a speed argument between 0.5 and 2.0:

audiblez book.epub -v af_sky -s 1.5
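If you want to convert several books with the same settings, you could wrap the command line in a small script. This is a sketch that uses only the `-v` and `-s` options shown above; `build_command` and `convert_all` are hypothetical names, and treating 1.0 as the normal speed is an assumption.

```python
import subprocess
from pathlib import Path


def build_command(epub, voice="af_sky", speed=1.0):
    """Build the audiblez argument list, rejecting speeds outside 0.5 to 2.0."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be between 0.5 and 2.0, got {speed}")
    return ["audiblez", str(epub), "-v", voice, "-s", str(speed)]


def convert_all(folder, voice="af_sky", speed=1.0):
    """Run audiblez once for every .epub file found directly in `folder`."""
    for epub in sorted(Path(folder).glob("*.epub")):
        # check=True stops the batch as soon as one conversion fails
        subprocess.run(build_command(epub, voice, speed), check=True)
```

Calling `convert_all("books", speed=1.5)` would then run the equivalent of the command above for each e-book in a `books` folder, one after the other.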

Supported Voices

Use the -v option to specify the voice. The available voices are listed below; names that start with "a" are American, and names that start with "b" are British:

af_alloy, af_aoede, af_bella, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky, am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck, bf_alice, bf_emma, bf_isabella, bf_lily, bm_daniel, bm_fable, bm_george, bm_lewis

You can try them here: https://huggingface.co/spaces/hexgrad/Kokoro-TTS
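The naming convention makes it easy to filter voices in a script. The sketch below groups the list above by accent using only the first letter of each name; `group_by_accent` is a hypothetical helper, and it reads nothing else into the voice names.

```python
VOICES = (
    "af_alloy, af_aoede, af_bella, af_jessica, af_kore, af_nicole, af_nova, "
    "af_river, af_sarah, af_sky, am_adam, am_echo, am_eric, am_fenrir, am_liam, "
    "am_michael, am_onyx, am_puck, bf_alice, bf_emma, bf_isabella, bf_lily, "
    "bm_daniel, bm_fable, bm_george, bm_lewis"
).split(", ")

# First letter of the voice name, as described above.
ACCENTS = {"a": "American", "b": "British"}


def group_by_accent(voices):
    """Map each accent label to the voices whose name starts with its letter."""
    groups = {label: [] for label in ACCENTS.values()}
    for voice in voices:
        groups[ACCENTS[voice[0]]].append(voice)
    return groups


for accent, names in group_by_accent(VOICES).items():
    print(f"{accent} ({len(names)}): {', '.join(names)}")
```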

How to run on GPU

Experimental support for CUDA is available.

By default, audiblez runs on the CPU. If you pass the --cuda option, it will try to use the CUDA device via Torch.
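Before passing --cuda, you may want to confirm that Torch can actually see a CUDA device. The following is a minimal pre-check, assuming PyTorch is importable as `torch`; it does not show how audiblez itself selects the device, and `cuda_available` and `gpu_flags` are hypothetical helpers.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # No PyTorch in this environment, so there is nothing to run on the GPU.
        return False
    return torch.cuda.is_available()


def gpu_flags() -> list[str]:
    """Extra command-line arguments: --cuda when a device is visible, else none."""
    return ["--cuda"] if cuda_available() else []


print("Extra audiblez arguments:", gpu_flags())
```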

Check out this example: Audiblez running on a Google Colab Notebook with CUDA.

We don't currently support Apple Silicon, as there is not yet a Kokoro implementation in MLX. As soon as one is available, we will support it.

Author

By Claudio Santini in 2025, distributed under the MIT licence.