Cloning King Charles's voice to read audiobooks

Jun 22, 2026

I pray to the British only that I be beheaded rather than hanged for this. Project Gutenberg has more than 75,000 books in the public domain that you can read on its website. However, the public-domain audiobooks are sometimes lacking. In March, Fish Audio released S2, which provides adequate-sounding text-to-speech. Here is a sample of King Charles’s voice reading the title of South, generated using S2.

Cloning a voice is fairly simple: you just need ~30 seconds of reference speech. First, encode the reference clip into tokens:

python fish_speech/models/dac/inference.py \
  -i reference.wav \
  -o work/ref.wav \
  --checkpoint-path checkpoints/s2-pro/codec.pth \
  -d cuda

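Alongside the work/ref.wav named in the command, this will give you a file, work/ref.npy, representing the voice in tokens. That file, together with an exact transcript of the reference clip, is used to generate semantic tokens for new text, which are then decoded into an output .wav.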

python fish_speech/models/text2semantic/inference.py \
  --text "This is the new sentence I want spoken." \
  --prompt-text "Exact transcript of the reference audio." \
  --prompt-tokens work/ref.npy \
  --checkpoint-path checkpoints/s2-pro \
  --device cuda \
  --no-compile \
  --output-dir work

# Decode the generated semantic tokens (work/codes_0.npy) into audio.
python fish_speech/models/dac/inference.py \
  -i work/codes_0.npy \
  -o output.wav \
  --checkpoint-path checkpoints/s2-pro/codec.pth \
  -d cuda
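
The commands above speak one short piece of text, so a whole book has to be fed through in pieces. Here is a minimal sketch of one way to do that, not necessarily how the audiobook was actually made: it splits the text on sentence boundaries and builds the same two commands for each chunk. The helper names, the 300-character limit, and the work/chunk_NNNN directory layout are all assumptions for illustration. Only the script paths and flags come from the commands above.

```python
import re
import shlex


def split_into_chunks(text: str, max_chars: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks of roughly max_chars characters.

    max_chars=300 is an arbitrary illustrative limit, not a documented S2 value.
    """
    # Collapse whitespace so hard line wraps do not end up inside the prompt.
    text = " ".join(text.split())
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            # A single over-long sentence still becomes its own chunk.
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_commands(chunk: str, index: int, prompt_text: str) -> list[list[str]]:
    """Return the generate and decode commands for one chunk as argv lists."""
    # Hypothetical layout: one output directory per chunk.
    out_dir = f"work/chunk_{index:04d}"
    generate = [
        "python", "fish_speech/models/text2semantic/inference.py",
        "--text", chunk,
        "--prompt-text", prompt_text,
        "--prompt-tokens", "work/ref.npy",
        "--checkpoint-path", "checkpoints/s2-pro",
        "--device", "cuda",
        "--no-compile",
        "--output-dir", out_dir,
    ]
    decode = [
        "python", "fish_speech/models/dac/inference.py",
        "-i", f"{out_dir}/codes_0.npy",
        "-o", f"{out_dir}/output.wav",
        "--checkpoint-path", "checkpoints/s2-pro/codec.pth",
        "-d", "cuda",
    ]
    return [generate, decode]


if __name__ == "__main__":
    sample = "This is the first sentence. Here is a second one! And a third?"
    for i, chunk in enumerate(split_into_chunks(sample, max_chars=40)):
        for cmd in build_commands(chunk, i, "Exact transcript of the reference audio."):
            # Print shell-quoted commands instead of running them.
            print(shlex.join(cmd))
```

Each argv list could be passed to subprocess.run(cmd, check=True) in order, and the per-chunk .wav files concatenated afterwards. The sentence splitter is deliberately crude and will trip on abbreviations such as "Mr.", so real book text probably needs something sturdier.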

I clipped a speech King Charles gave and used it to generate the entire audiobook of South by Ernest Shackleton. It sounds good enough that I can listen to the whole book. You can listen to it on my audiobook YouTube channel. What public-domain book should I do next? Email me.