Mirror of https://github.com/remsky/Kokoro-FastAPI.git, synced 2025-04-13 09:39:17 +00:00

Commit ee5be65596: 8 changed files with 486 additions and 67 deletions
.gitignore (vendored): 2 changes

@@ -14,3 +14,5 @@ env/
 .Python
+
+.coverage
CHANGELOG.md

@@ -2,8 +2,12 @@
 Notable changes to this project will be documented in this file.
 
-## 2024-01-09
+## 2025-01-02
+
+- Audio Format Support:
+  - Added comprehensive audio format conversion support (mp3, wav, opus, flac)
+
+## 2025-01-01
 ### Added
 - Gradio Web Interface:
   - Added simple web UI utility for audio generation from input or txt file
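The new formats can also be exercised over raw HTTP. A hedged sketch: the `/v1/audio/speech` path and JSON body follow the OpenAI-compatible convention this wrapper exposes (the same route the OpenAI client in `examples/test_audio_formats.py` below resolves to), so treat the exact payload keys as assumptions rather than documented API:

```python
# Sketch only: raw HTTP against the local server, one request per new format.
# Endpoint path and body keys assume the OpenAI-compatible speech API shape.
import requests

for fmt in ("mp3", "wav", "opus", "flac"):
    response = requests.post(
        "http://localhost:8880/v1/audio/speech",
        json={
            "model": "kokoro",
            "voice": "af",
            "input": "Hello!",
            "response_format": fmt,
        },
    )
    with open(f"hello.{fmt}", "wb") as f:
        f.write(response.content)
```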
README.md: 74 changes

@@ -3,8 +3,8 @@
 </p>
 
 # Kokoro TTS API
 [badge]()
 [badge]()
 [badge](https://huggingface.co/hexgrad/Kokoro-82M/tree/c3b0d86e2a980e027ef71c28819ea02e351c2667)
 
 Dockerized FastAPI wrapper for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) text-to-speech model
@@ -14,8 +14,7 @@ Dockerized FastAPI wrapper for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) text-to-speech model
 - automatic chunking/stitching for long texts
 - simple audio generation web ui utility
 
-<details open>
-<summary><b>Quick Start</b></summary>
+## Quick Start
 
 The service can be accessed through either the API endpoints or the Gradio web interface.
@@ -48,9 +47,10 @@ The service can be accessed through either the API endpoints or the Gradio web interface.
 <p align="center">
 <img src="ui\GradioScreenShot.png" width="80%" alt="Voice Analysis Comparison" style="border: 2px solid #333; padding: 10px;">
 </p>
-</details>
+
+## Features
 <details>
-<summary><b>OpenAI-Compatible Speech Endpoint</b></summary>
+<summary>OpenAI-Compatible Speech Endpoint</summary>
 
 ```python
 # Using OpenAI's Python library
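That Python block is truncated by the hunk. A minimal sketch of the full call, based directly on the client configuration used in `examples/test_audio_formats.py` from this same commit (local server at `http://localhost:8880/v1`, model `kokoro`, voice `af`):

```python
# Sketch only: mirrors the client setup in examples/test_audio_formats.py.
import openai

client = openai.OpenAI(
    api_key="notneeded",                  # the local endpoint ignores the key
    base_url="http://localhost:8880/v1",  # local server with v1 prefix
)

response = client.audio.speech.create(
    model="kokoro", voice="af", input="Hello world!", response_format="mp3"
)
with open("speech.mp3", "wb") as f:
    f.write(response.content)
```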
@@ -98,7 +98,10 @@ python examples/test_all_voices.py # Test all available voices
 </details>
 
 <details>
-<summary><b>Voice Combination</b></summary>
+<summary>Voice Combination</summary>
+
+- Averages model weights of any existing voicepacks
+- Saves generated voicepacks for future use
 
 Combine voices and generate audio:
 ```python
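Voice combination is plain tensor averaging. A minimal sketch of the idea, with a hypothetical helper name and file layout (not the service's actual implementation), mirroring the torch.load / torch.stack / torch.mean / torch.save flow that the new unit tests further down mock out:

```python
# Minimal sketch of voicepack averaging (hypothetical paths and helper name).
import torch

def combine_voicepacks(paths: list[str], out_path: str) -> None:
    packs = [torch.load(p) for p in paths]            # each .pt holds a voice tensor
    combined = torch.mean(torch.stack(packs), dim=0)  # element-wise average
    torch.save(combined, out_path)

combine_voicepacks(["voices/af.pt", "voices/bm_lewis.pt"], "voices/af_bm_lewis.pt")
```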
@@ -129,7 +132,23 @@ response = requests.post(
 </details>
 
 <details>
-<summary><b>Gradio Web Utility</b></summary>
+<summary>Multiple Output Audio Formats</summary>
+
+- mp3
+- wav
+- opus
+- flac
+- aac
+- pcm
+
+<p align="center">
+<img src="examples/benchmarks/format_comparison.png" width="80%" alt="Audio Format Comparison" style="border: 2px solid #333; padding: 10px;">
+</p>
+
+</details>
+
+<details>
+<summary>Gradio Web Utility</summary>
 
 Access the interactive web UI at http://localhost:7860 after starting the service. Features include:
 - Voice/format/speed selection
@@ -141,9 +160,9 @@ If you only want the API, just comment out everything in the docker-compose.yml
 Currently, voices created via the API are accessible here, but voice combination/creation has not yet been added
 </details>
 
+## Processing Details
 <details>
-<summary><b>Performance Benchmarks</b></summary>
+<summary>Performance Benchmarks</summary>
 
 Benchmarking was performed on generation via the local API using text lengths up to feature-length books (~1.5 hours output), measuring processing time and realtime factor. Tests were run on:
 - Windows 11 Home w/ WSL2
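The realtime factor reported in these benchmarks is, as commonly defined for TTS, seconds of audio produced per second of wall-clock processing (higher is faster). A quick illustration of the arithmetic with made-up numbers:

```python
# Realtime factor = seconds of audio produced per second of processing.
audio_seconds = 5400       # ~1.5 hour output (example value, not a measurement)
processing_seconds = 600   # wall-clock generation time (example value)
realtime_factor = audio_seconds / processing_seconds
print(f"{realtime_factor:.1f}x realtime")  # 9.0x realtime
```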
@@ -163,7 +182,7 @@ Key Performance Metrics:
 - Average Processing Rate: 137.67 tokens/second (cl100k_base)
 </details>
 <details>
-<summary><b>GPU Vs. CPU<b></summary>
+<summary>GPU Vs. CPU</summary>
 
 ```bash
 # GPU: Requires NVIDIA GPU with CUDA 12.1 support
@@ -172,35 +191,29 @@ docker compose up --build
 
 # CPU: ~10x slower than GPU inference
 docker compose -f docker-compose.cpu.yml up --build
 ```
-</details>
-<details>
-<summary><b>Features</b></summary>
-
-- OpenAI-compatible API endpoints (with optional Gradio Web UI)
-- GPU-accelerated inference (if desired)
-- Multiple audio formats: mp3, wav, opus, flac, (aac & pcm not implemented)
-- Natural Boundary Detection:
-  - Automatically splits and stitches at sentence boundaries to reduce artifacts and maintain performance
-- Voice Combination:
-  - Averages model weights of any existing voicepacks
-  - Saves generated voicepacks for future use
-
 *Note: CPU Inference is currently a very basic implementation, and not heavily tested*
 
 </details>
 
+<details>
+<summary>Natural Boundary Detection</summary>
+
+- Automatically splits and stitches at sentence boundaries
+- Helps to reduce artifacts and allow long-form processing, as the base model is currently configured for only about 30s of output
+</details>
+
+## Model and License
+
 <details open>
-<summary><b>Model</b></summary>
+<summary>Model</summary>
 
 This API uses the [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) model from HuggingFace.
 
 Visit the model page for more details about training, architecture, and capabilities. I have no affiliation with any of their work, and produced this wrapper for ease of use and personal projects.
 </details>
 
 <details>
-<summary><b>License</b></summary>
+<summary>License</summary>
 
 This project is licensed under the Apache License 2.0 - see below for details:
 
 - The Kokoro model weights are licensed under Apache 2.0 (see [model page](https://huggingface.co/hexgrad/Kokoro-82M))

@@ -209,3 +222,6 @@ This project is licensed under the Apache License 2.0 - see below for details:
 
 The full Apache 2.0 license text can be found at: https://www.apache.org/licenses/LICENSE-2.0
 </details>
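The Natural Boundary Detection section above describes splitting input at sentence boundaries before generation, then stitching the resulting chunks. A rough sketch of that style of chunking (hypothetical helper, not the service's actual implementation):

```python
# Illustrative sketch of sentence-boundary chunking (hypothetical helper,
# not the service's actual splitting logic).
import re

def chunk_sentences(text: str, max_chars: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks below a size budget."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```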
api/src/services/audio.py

@@ -4,7 +4,6 @@ from io import BytesIO
 
 import numpy as np
 import soundfile as sf
-import scipy.io.wavfile as wavfile
 from loguru import logger

@@ -20,7 +19,7 @@ class AudioService:
         Args:
             audio_data: Numpy array of audio samples
             sample_rate: Sample rate of the audio
-            output_format: Target format (wav, mp3, etc.)
+            output_format: Target format (wav, mp3, opus, flac, pcm)
 
         Returns:
             Bytes of the converted audio

@@ -30,46 +29,36 @@ class AudioService:
         try:
             if output_format == "wav":
                 logger.info("Writing to WAV format...")
-                wavfile.write(buffer, sample_rate, audio_data)
-                return buffer.getvalue()
+                # Ensure audio_data is in int16 format for WAV
+                audio_data_wav = (
+                    audio_data / np.abs(audio_data).max() * np.iinfo(np.int16).max
+                ).astype(np.int16)  # Normalize
+                sf.write(buffer, audio_data_wav, sample_rate, format="WAV")
             elif output_format == "mp3":
-                # For MP3, we need to convert to WAV first
                 logger.info("Converting to MP3 format...")
-                wav_buffer = BytesIO()
-                wavfile.write(wav_buffer, sample_rate, audio_data)
-                wav_buffer.seek(0)
-
-                # Convert WAV to MP3 using soundfile
-                buffer = BytesIO()
-                sf.write(buffer, audio_data, sample_rate, format="mp3")
-                return buffer.getvalue()
-
+                # soundfile can write MP3 if ffmpeg or libsox is installed
+                sf.write(buffer, audio_data, sample_rate, format="MP3")
             elif output_format == "opus":
                 logger.info("Converting to Opus format...")
-                sf.write(buffer, audio_data, sample_rate, format="ogg", subtype="opus")
-                return buffer.getvalue()
-
+                sf.write(buffer, audio_data, sample_rate, format="OGG", subtype="OPUS")
             elif output_format == "flac":
                 logger.info("Converting to FLAC format...")
-                sf.write(buffer, audio_data, sample_rate, format="flac")
-                return buffer.getvalue()
-
-            elif output_format == "aac":
-                raise ValueError(
-                    "AAC format is not currently supported. Please use wav, mp3, opus, or flac."
-                )
+                sf.write(buffer, audio_data, sample_rate, format="FLAC")
             elif output_format == "pcm":
-                raise ValueError(
-                    "PCM format is not currently supported. Please use wav, mp3, opus, or flac."
-                )
+                logger.info("Extracting PCM data...")
+                # Ensure audio_data is in int16 format for PCM
+                audio_data_pcm = (
+                    audio_data / np.abs(audio_data).max() * np.iinfo(np.int16).max
+                ).astype(np.int16)  # Normalize
+                buffer.write(audio_data_pcm.tobytes())
             else:
                 raise ValueError(
-                    f"Format {output_format} not supported. Supported formats are: wav, mp3, opus, flac."
+                    f"Format {output_format} not supported. Supported formats are: wav, mp3, opus, flac, pcm."
                 )
 
+            buffer.seek(0)
+            return buffer.getvalue()
+
         except Exception as e:
             logger.error(f"Error converting audio to {output_format}: {str(e)}")
             raise ValueError(f"Failed to convert audio to {output_format}: {str(e)}")
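Both new branches normalize with the same expression, `audio_data / np.abs(audio_data).max() * np.iinfo(np.int16).max`, which rescales whatever float range comes out of the model to full int16 swing. The arithmetic in isolation, with example values:

```python
# Standalone illustration of the int16 normalization (example values only).
import numpy as np

audio = np.array([-0.5, 0.25, 1.0], dtype=np.float32)  # float model output
scaled = (audio / np.abs(audio).max() * np.iinfo(np.int16).max).astype(np.int16)
print(scaled)  # [-16383   8191  32767]: the peak sample maps to int16 max
```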
api/tests/test_audio_service.py

@@ -51,15 +51,19 @@ def test_convert_to_flac(sample_audio):
 def test_convert_to_aac_raises_error(sample_audio):
     """Test that converting to AAC raises an error"""
     audio_data, sample_rate = sample_audio
-    with pytest.raises(ValueError, match="AAC format is not currently supported"):
+    with pytest.raises(
+        ValueError,
+        match="Format aac not supported. Supported formats are: wav, mp3, opus, flac, pcm.",
+    ):
         AudioService.convert_audio(audio_data, sample_rate, "aac")
 
 
-def test_convert_to_pcm_raises_error(sample_audio):
-    """Test that converting to PCM raises an error"""
+def test_convert_to_pcm(sample_audio):
+    """Test converting to PCM format"""
     audio_data, sample_rate = sample_audio
-    with pytest.raises(ValueError, match="PCM format is not currently supported"):
-        AudioService.convert_audio(audio_data, sample_rate, "pcm")
+    result = AudioService.convert_audio(audio_data, sample_rate, "pcm")
+    assert isinstance(result, bytes)
+    assert len(result) > 0
 
 
 def test_convert_to_invalid_format_raises_error(sample_audio):

@@ -67,3 +71,51 @@ def test_convert_to_invalid_format_raises_error(sample_audio):
     audio_data, sample_rate = sample_audio
     with pytest.raises(ValueError, match="Format invalid not supported"):
         AudioService.convert_audio(audio_data, sample_rate, "invalid")
+
+
+def test_normalization_wav(sample_audio):
+    """Test that WAV output is properly normalized to int16 range"""
+    audio_data, sample_rate = sample_audio
+    # Create audio data outside int16 range
+    large_audio = audio_data * 1e5
+    result = AudioService.convert_audio(large_audio, sample_rate, "wav")
+    assert isinstance(result, bytes)
+    assert len(result) > 0
+
+
+def test_normalization_pcm(sample_audio):
+    """Test that PCM output is properly normalized to int16 range"""
+    audio_data, sample_rate = sample_audio
+    # Create audio data outside int16 range
+    large_audio = audio_data * 1e5
+    result = AudioService.convert_audio(large_audio, sample_rate, "pcm")
+    assert isinstance(result, bytes)
+    assert len(result) > 0
+
+
+def test_invalid_audio_data():
+    """Test handling of invalid audio data"""
+    invalid_audio = np.array([])  # Empty array
+    sample_rate = 24000
+    with pytest.raises(ValueError):
+        AudioService.convert_audio(invalid_audio, sample_rate, "wav")
+
+
+def test_different_sample_rates(sample_audio):
+    """Test converting audio with different sample rates"""
+    audio_data, _ = sample_audio
+    sample_rates = [8000, 16000, 44100, 48000]
+
+    for rate in sample_rates:
+        result = AudioService.convert_audio(audio_data, rate, "wav")
+        assert isinstance(result, bytes)
+        assert len(result) > 0
+
+
+def test_buffer_position_after_conversion(sample_audio):
+    """Test that buffer position is reset after writing"""
+    audio_data, sample_rate = sample_audio
+    result = AudioService.convert_audio(audio_data, sample_rate, "wav")
+    # Convert again to ensure buffer was properly reset
+    result2 = AudioService.convert_audio(audio_data, sample_rate, "wav")
+    assert len(result) == len(result2)
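Raw PCM responses carry no container header, which is why the tests above only assert on byte length rather than decoding them. A small sketch of wrapping such bytes into a playable WAV, using only the standard library (assumes 16-bit mono at 24 kHz, the parameters assumed elsewhere in this commit):

```python
# Sketch: wrap headerless PCM bytes in a WAV container for playback.
import wave

def pcm_to_wav(pcm_bytes: bytes, out_path: str, sample_rate: int = 24000) -> None:
    with wave.open(out_path, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono
        wav_file.setsampwidth(2)       # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
```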
api/tests/test_tts_service.py

@@ -4,6 +4,7 @@ import os
 from unittest.mock import MagicMock, call, patch
 
 import numpy as np
+import torch
 import pytest
 
 from api.src.services.tts import TTSModel, TTSService

@@ -119,6 +120,78 @@ def test_generate_audio_no_chunks(
     tts_service._generate_audio("Test text", "af", 1.0)
 
 
+@patch("torch.load")
+@patch("torch.save")
+@patch("torch.stack")
+@patch("torch.mean")
+@patch("os.path.exists")
+def test_combine_voices(
+    mock_exists, mock_mean, mock_stack, mock_save, mock_load, tts_service
+):
+    """Test combining multiple voices"""
+    # Setup mocks
+    mock_exists.return_value = True
+    mock_load.return_value = torch.tensor([1.0, 2.0])
+    mock_stack.return_value = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
+    mock_mean.return_value = torch.tensor([2.0, 3.0])
+
+    # Test combining two voices
+    result = tts_service.combine_voices(["voice1", "voice2"])
+
+    assert result == "voice1_voice2"
+    mock_stack.assert_called_once()
+    mock_mean.assert_called_once()
+    mock_save.assert_called_once()
+
+
+def test_combine_voices_invalid_input(tts_service):
+    """Test combining voices with invalid input"""
+    # Test with empty list
+    with pytest.raises(ValueError, match="At least 2 voices are required"):
+        tts_service.combine_voices([])
+
+    # Test with single voice
+    with pytest.raises(ValueError, match="At least 2 voices are required"):
+        tts_service.combine_voices(["voice1"])
+
+
+@patch("os.makedirs")
+@patch("os.path.exists")
+@patch("os.listdir")
+@patch("torch.load")
+@patch("torch.save")
+@patch("os.path.join")
+def test_ensure_voices(
+    mock_join,
+    mock_save,
+    mock_load,
+    mock_listdir,
+    mock_exists,
+    mock_makedirs,
+    tts_service,
+):
+    """Test voice directory initialization"""
+    # Setup mocks
+    mock_exists.side_effect = [
+        True,
+        False,
+        False,
+    ]  # base_dir exists, voice files don't exist
+    mock_listdir.return_value = ["voice1.pt", "voice2.pt"]
+    mock_load.return_value = MagicMock()
+    mock_join.return_value = "/fake/path"
+
+    # Test voice directory initialization
+    tts_service._ensure_voices()
+
+    # Verify directory was created
+    mock_makedirs.assert_called_once()
+
+    # Verify voices were loaded and saved
+    assert mock_load.call_count == len(mock_listdir.return_value)
+    assert mock_save.call_count == len(mock_listdir.return_value)
+
+
 @patch("api.src.services.tts.TTSModel.get_instance")
 @patch("os.path.exists")
 @patch("api.src.services.tts.normalize_text")

@@ -236,7 +309,6 @@ def test_generate_audio_without_stitching(
         "Test text", "af", 1.0, stitch_long_output=False
     )
     assert isinstance(audio, np.ndarray)
-    assert isinstance(processing_time, float)
     assert len(audio) > 0
     mock_generate.assert_called_once()
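A reading aid for test_ensure_voices above: unittest.mock applies stacked @patch decorators bottom-up, so the mock arguments arrive in reverse decorator order (os.path.join, the lowest decorator, becomes the first parameter). A tiny self-contained illustration:

```python
# Stacked @patch decorators are applied bottom-up: the lowest decorator
# produces the first mock argument.
from unittest.mock import patch

@patch("os.path.exists")  # applied last -> second argument
@patch("os.makedirs")     # applied first -> first argument
def demo(mock_makedirs, mock_exists):
    print(mock_makedirs, mock_exists)

demo()  # both names are MagicMock instances inside the call
```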
examples/benchmarks/format_comparison.png: new binary file, 764 KiB (not shown)
examples/test_audio_formats.py: new file, 284 lines

"""Test script to generate and analyze different audio formats"""

import os
import time
from pathlib import Path

import numpy as np
import openai
import requests
import soundfile as sf
import matplotlib.pyplot as plt
from scipy.io import wavfile

SAMPLE_TEXT = """
That is the germ of my great discovery. But you are wrong to say that we cannot move about in Time.
"""

# Configure OpenAI client
client = openai.OpenAI(
    timeout=60,
    api_key="notneeded",  # API key not required for our endpoint
    base_url="http://localhost:8880/v1",  # Point to our local server with v1 prefix
)


def setup_plot(fig, ax, title):
    """Configure plot styling"""
    # Improve grid
    ax.grid(True, linestyle="--", alpha=0.3, color="#ffffff")

    # Set title and labels with better fonts and more padding
    ax.set_title(title, pad=40, fontsize=16, fontweight="bold", color="#ffffff")
    ax.set_xlabel(ax.get_xlabel(), fontsize=14, fontweight="medium", color="#ffffff")
    ax.set_ylabel(ax.get_ylabel(), fontsize=14, fontweight="medium", color="#ffffff")

    # Improve tick labels
    ax.tick_params(labelsize=12, colors="#ffffff")

    # Style spines
    for spine in ax.spines.values():
        spine.set_color("#ffffff")
        spine.set_alpha(0.3)
        spine.set_linewidth(0.5)

    # Set background colors
    ax.set_facecolor("#1a1a2e")
    fig.patch.set_facecolor("#1a1a2e")

    return fig, ax


def plot_format_comparison(stats: list, output_dir: str):
    """Plot audio format comparison"""
    plt.style.use("dark_background")

    # Create figure with subplots
    fig = plt.figure(figsize=(18, 16))  # Taller figure to accommodate bottom legend
    fig.patch.set_facecolor("#1a1a2e")

    # Create subplot grid with balanced spacing for waveforms
    gs_waves = plt.GridSpec(
        len(stats), 1, left=0.15, right=0.85, top=0.9, bottom=0.35, hspace=0.4
    )

    # Plot waveforms for each format
    for i, stat in enumerate(stats):
        format_name = stat["format"].upper()
        try:
            # Handle PCM format differently
            if stat["format"] == "pcm":
                # Read raw PCM data (16-bit mono)
                with open(
                    os.path.join(output_dir, f"test_audio.{stat['format']}"), "rb"
                ) as f:
                    raw_data = f.read()
                data = np.frombuffer(raw_data, dtype=np.int16)
                data = data.astype(np.float32) / 32768.0  # Convert to float [-1, 1]
                sr = 24000
            else:
                # Read other formats with soundfile
                data, sr = sf.read(
                    os.path.join(output_dir, f"test_audio.{stat['format']}")
                )

            # Plot waveform
            ax = plt.subplot(gs_waves[i])
            time = np.arange(len(data)) / sr  # (shadows the time module locally; harmless here)
            plt.plot(time, data / np.max(np.abs(data)), linewidth=0.5, color="#ff2a6d")
            ax.set_xlabel("Time (seconds)")
            ax.set_ylabel("")
            ax.set_ylim(-1.1, 1.1)
            setup_plot(fig, ax, f"Waveform: {format_name}")
        except Exception as e:
            print(f"Error plotting waveform for {format_name}: {e}")

    # Colors for formats
    colors = ["#ff2a6d", "#05d9e8", "#d1f7ff", "#ff9e00", "#8c1eff"]

    # Create three subplots for metrics with more space at bottom for legend
    gs_bottom = plt.GridSpec(
        1,
        3,
        left=0.15,
        right=0.85,
        bottom=0.15,
        top=0.25,  # More bottom space for legend
        wspace=0.3,
    )

    # File Size subplot
    ax1 = plt.subplot(gs_bottom[0])
    metrics1 = [("File Size", [s["file_size_kb"] for s in stats], "KB")]

    # Duration and Gen Time subplot
    ax2 = plt.subplot(gs_bottom[1])
    metrics2 = [
        ("Duration", [s["duration_seconds"] for s in stats], "s"),
        ("Gen Time", [s["generation_time"] for s in stats], "s"),
    ]

    # Sample Rate subplot
    ax3 = plt.subplot(gs_bottom[2])
    metrics3 = [("Sample Rate", [s["sample_rate"] / 1000 for s in stats], "kHz")]

    def plot_grouped_bars(ax, metrics, show_legend=True):
        n_groups = len(metrics)
        n_formats = len(stats)
        # Use wider bars for time metrics
        bar_width = 0.175 if metrics == metrics2 else 0.1

        indices = np.arange(n_groups)

        # Get max value for y-axis scaling
        max_val = max(max(m[1]) for m in metrics)

        for i, (stat, color) in enumerate(zip(stats, colors)):
            values = [m[1][i] for m in metrics]
            # Reduce spacing between bars for time metrics
            spacing = 1.1 if metrics == metrics2 else 1.0
            offset = (i - n_formats / 2 + 0.5) * bar_width * spacing
            bars = ax.bar(
                indices + offset,
                values,
                bar_width,
                label=stat["format"].upper(),
                color=color,
                alpha=0.8,
            )

            # Add value labels on top of bars
            for bar in bars:
                height = bar.get_height()
                ax.text(
                    bar.get_x() + bar.get_width() / 2.0,
                    height,
                    f"{height:.1f}",
                    ha="center",
                    va="bottom",
                    color="white",
                    fontsize=10,
                )

        ax.set_xticks(indices)
        ax.set_xticklabels([f"{m[0]}\n({m[2]})" for m in metrics])

        # Set y-axis limits with some padding
        ax.set_ylim(0, max_val * 1.2)

        if show_legend:
            # Place legend at the bottom
            ax.legend(
                bbox_to_anchor=(1.8, -0.8),
                loc="center",
                facecolor="#1a1a2e",
                edgecolor="#ffffff",
                ncol=len(stats),
            )  # Show all formats in one row

    # Plot all three subplots with shared legend
    plot_grouped_bars(ax1, metrics1, show_legend=True)
    plot_grouped_bars(ax2, metrics2, show_legend=False)
    plot_grouped_bars(ax3, metrics3, show_legend=False)

    # Style all subplots
    setup_plot(fig, ax1, "File Size")
    setup_plot(fig, ax2, "Time Metrics")
    setup_plot(fig, ax3, "Sample Rate")

    # Add y-axis labels
    ax1.set_ylabel("Value")
    ax2.set_ylabel("Value")
    ax3.set_ylabel("Value")

    # Save the plot
    plt.savefig(os.path.join(output_dir, "format_comparison.png"), dpi=300)
    print(f"\nSaved format comparison plot to {output_dir}/format_comparison.png")


def get_audio_stats(file_path: str) -> dict:
    """Get audio file statistics"""
    file_size = os.path.getsize(file_path)
    file_size_kb = file_size / 1024  # Convert to KB

    try:
        # Try reading with soundfile first
        data, sample_rate = sf.read(file_path)
        duration = len(data) / sample_rate
        channels = 1 if len(data.shape) == 1 else data.shape[1]

        # Calculate audio statistics
        stats = {
            "format": Path(file_path).suffix[1:],
            "file_size_kb": round(file_size_kb, 2),
            "duration_seconds": round(duration, 2),
            "sample_rate": sample_rate,
            "channels": channels,
            "min_amplitude": float(np.min(data)),
            "max_amplitude": float(np.max(data)),
            "mean_amplitude": float(np.mean(np.abs(data))),
            "rms_amplitude": float(np.sqrt(np.mean(np.square(data)))),
        }
        return stats
    except Exception:
        # For PCM, read raw bytes and estimate duration
        with open(file_path, "rb") as f:
            data = f.read()
        # Assuming 16-bit PCM mono at 24kHz
        samples = len(data) // 2  # 2 bytes per sample
        duration = samples / 24000
        return {
            "format": "pcm",
            "file_size_kb": round(file_size_kb, 2),
            "duration_seconds": round(duration, 2),
            "sample_rate": 24000,
            "channels": 1,
            "note": "PCM stats are estimated from raw bytes",
        }


def main():
    """Generate and analyze audio in different formats"""
    # Create output directory
    output_dir = Path(__file__).parent / "output" / "test_formats"
    output_dir.mkdir(exist_ok=True, parents=True)

    # First generate audio in each format using the API
    voice = "af"  # Using default voice
    formats = ["wav", "mp3", "opus", "flac", "pcm"]
    stats = []

    for fmt in formats:
        output_path = output_dir / f"test_audio.{fmt}"
        print(f"\nGenerating {fmt.upper()} audio...")

        # Generate and save
        start_time = time.time()
        response = client.audio.speech.create(
            model="kokoro", voice=voice, input=SAMPLE_TEXT, response_format=fmt
        )
        generation_time = time.time() - start_time

        with open(output_path, "wb") as f:
            f.write(response.content)

        # Get stats
        file_stats = get_audio_stats(str(output_path))
        file_stats["generation_time"] = round(generation_time, 3)
        stats.append(file_stats)

    # Generate comparison plot
    plot_format_comparison(stats, str(output_dir))

    # Print detailed statistics
    print("\nDetailed Audio Statistics:")
    print("=" * 100)
    for stat in stats:
        print(f"\n{stat['format'].upper()} Format:")
        for key, value in sorted(stat.items()):
            if key not in ["format"]:  # Skip format as it's in the header
                print(f"  {key}: {value}")


if __name__ == "__main__":
    main()
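With the service running, the script can be invoked directly with `python examples/test_audio_formats.py`; per its own paths, it writes test_audio.{wav,mp3,opus,flac,pcm} and format_comparison.png under examples/output/test_formats/.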