# Audiblez: Generate audiobooks from e-books
[![Installing via pip and running](https://github.com/santinic/audiblez/actions/workflows/pip-install.yaml/badge.svg)](https://github.com/santinic/audiblez/actions/workflows/pip-install.yaml)

Audiblez generates `.m4b` audiobooks from regular `.epub` e-books,
using Kokoro's high-quality speech synthesis.

[Kokoro v0.19](https://huggingface.co/hexgrad/Kokoro-82M) is a recently published text-to-speech model with just 82M parameters and very natural-sounding output.
It is released under the Apache license and was trained on fewer than 100 hours of audio.
It currently supports American English, British English, French, Korean, Japanese and Mandarin, and comes with a number of very good voices.

An example of the quality:

<audio controls=""><source type="audio/wav" src="https://huggingface.co/hexgrad/Kokoro-82M/resolve/main/demo/HEARME.wav"></audio>

On my M2 MacBook Pro, **it takes about 2 hours to convert The Selfish Gene by Richard Dawkins to mp3**. The book is about 100,000 words (or 600,000 characters),
which works out to a rate of about 80 characters per second.
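
The figures above can be turned into a rough estimate for other books. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the default rate of 80 characters per second is the one measured on the M2 MacBook Pro above, so other machines will differ.

```python
def estimate_conversion_hours(num_characters, chars_per_second=80.0):
    """Estimate conversion time in hours from the book length in characters."""
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    seconds = num_characters / chars_per_second
    return seconds / 3600


# The Selfish Gene example: about 600,000 characters at about 80 characters per second.
print(f"{estimate_conversion_hours(600_000):.1f} hours")  # prints "2.1 hours"
```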
## How to install and run
If you have Python 3.11 or Python 3.12 on your computer, you can install Audiblez with pip.
Be aware that it won't work with Python 3.13.
You also need to download the ONNX model and voices files into the same folder, which are about 360 MB:
```bash
pip install audiblez
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json
```
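
Before starting a long conversion, it can help to check these prerequisites. The following is a minimal sketch, not part of Audiblez: the function names are hypothetical, the supported versions are just the ones named above, and the file names are taken from the two download URLs.

```python
import sys
from pathlib import Path

# Versions this README names as working; 3.13 is named as not working.
SUPPORTED_PYTHONS = {(3, 11), (3, 12)}
# File names taken from the two download URLs above.
MODEL_FILES = ["kokoro-v0_19.onnx", "voices.json"]


def python_supported(version=None):
    """Return True if (major, minor) is one of the versions listed as working."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) in SUPPORTED_PYTHONS


def missing_model_files(folder="."):
    """Return the model files that are not present in `folder`."""
    return [name for name in MODEL_FILES if not (Path(folder) / name).is_file()]
```

For example, `missing_model_files(".")` returns an empty list once both downloads have finished in the current folder.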
Then, to convert an EPUB file into an audiobook, run:
```bash
audiblez book.epub -l en-gb -v af_sky
```
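
To convert several books in one go, the same command can be driven from a short script. This is a sketch under assumptions: `build_command` and `convert_folder` are hypothetical helpers, not part of Audiblez, and the script assumes the model files sit in the same folder as the EPUB files. It uses only the `-l` and `-v` flags shown above.

```python
import subprocess
from pathlib import Path


def build_command(epub_name, language="en-gb", voice="af_sky"):
    """Build the same command line as shown above for one EPUB file."""
    return ["audiblez", str(epub_name), "-l", language, "-v", voice]


def convert_folder(folder=".", language="en-gb", voice="af_sky"):
    """Run audiblez on each .epub file in `folder`, one after another."""
    for epub_path in sorted(Path(folder).glob("*.epub")):
        # Run inside `folder` (assumed to hold the model files);
        # check=True stops at the first failed conversion.
        subprocess.run(
            build_command(epub_path.name, language, voice),
            cwd=folder,
            check=True,
        )
```

Calling `convert_folder(".")` runs `audiblez` once per book, in alphabetical order.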
Audiblez first creates `book_chapter_1.wav`, `book_chapter_2.wav`, etc. files in the same directory,
and at the end it produces a `book.m4b` file containing the whole book, which you can listen to with VLC or any
audiobook player.
The `.m4b` file is only produced if you have `ffmpeg` installed on your machine.
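
If you want to work with the intermediate chapter files yourself, note that plain alphabetical sorting puts `book_chapter_10.wav` before `book_chapter_2.wav`. The sketch below sorts them by chapter number and checks whether `ffmpeg` is on the `PATH`. The helper names are hypothetical, and the file name pattern is assumed from the examples above.

```python
import re
import shutil
from pathlib import Path


def chapter_number(path):
    """Extract N from a name like book_chapter_N.wav, or None if it does not match."""
    match = re.search(r"_chapter_(\d+)\.wav$", Path(path).name)
    return int(match.group(1)) if match else None


def sort_chapters(paths):
    """Keep only chapter files and sort them numerically, so 10 follows 9."""
    chapters = [p for p in paths if chapter_number(p) is not None]
    return sorted(chapters, key=chapter_number)


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None
```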
## Supported Languages
- 🇺🇸 en-US
- 🇬🇧 en-GB
- 🇫🇷 fr-FR
- 🇯🇵 ja-JP
- 🇰🇷 ko-KR
- 🇨🇳 zh-CN
## Supported Voices
Available voices are `af`, `af_bella`, `af_nicole`, `af_sarah`, `af_sky`, `am_adam`, `am_michael`, `bf_emma`, `bf_isabella`, `bm_george`, `bm_lewis`.
You can try them here: [https://huggingface.co/spaces/hexgrad/Kokoro-TTS](https://huggingface.co/spaces/hexgrad/Kokoro-TTS)
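
Since a conversion can take hours, a script that wraps `audiblez` may want to catch a mistyped voice name before starting. A minimal sketch, with a hypothetical `check_voice` helper and the voice names copied from the list above:

```python
# Voice names copied from the list above.
VOICES = (
    "af", "af_bella", "af_nicole", "af_sarah", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
)


def check_voice(name):
    """Return `name` unchanged if it is a listed voice, else raise ValueError."""
    if name not in VOICES:
        raise ValueError(f"unknown voice {name!r}; choose one of: {', '.join(VOICES)}")
    return name
```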