• implosive_sprig@beehaw.org
    arrow-up
    1
    ·
    1 day ago

    If you have a human-narrated audiobook, you can use Storyteller to synchronize those.

    AI-TTS still doesn’t do it for me. It’s either the mispronunciation of proper nouns or the cadence putting me to sleep. Maybe in a few years, I’ll try again.

    • eldavi@lemmy.ml
      English
      arrow-up
      5
      ·
      2 days ago

      i’m curious to see how much it mispronounces words like earlier iterations from different projects did.

      • Apathy Tree@lemmy.dbzer0.com
        English
        arrow-up
        4
        ·
        2 days ago

        I’d honestly probably be less annoyed by a machine mispronouncing words than I am when a human reader does it…

        I know I shouldn’t be annoyed because language is difficult and not everyone has heard every word… but you’d think they would, like, check instead of saying something wrong 1,000 times (especially since the books I listen to are mostly science communication and science history)

    • 🇨🇦 tunetardis@piefed.ca
      English
      arrow-up
      2
      ·
      edit-2
      2 days ago

      I installed it yesterday and started having it chug through the Murderbot series I got in epub format. It seemed to be taking forever, but then I checked a system monitor and discovered it was using the GPU to do most of the work. So whenever my GPU-heavy screen saver kicked in, it slowed to a crawl.

      At any rate, it was done this morning, but then I forgot to bring the files to work, so I can’t say at this point how good a job it did. It was a bit of a pain to install because it needed Python 3.12 and wouldn’t accept Python 3.14 for some reason, and pyenv on my Mac is a bit of a pain because it hates tkinter. Go figure. But I got it working in the end.

      • Sims@lemmy.ml
        arrow-up
        1
        ·
        14 hours ago

        whenever my GPU-heavy screen saver kicked in

        So, a combined “Screen-saver” / “GPU-murderer”? Neat - we can’t save everyone! ;-)

      • 🇨🇦 tunetardis@piefed.ca
        English
        arrow-up
        2
        ·
        2 days ago

        A little follow-up on this. Tonight I had a look at what it generated. It produced 2 files: a .wav and a .ass. The latter apparently contains subtitles that sync to the audio. But how do you play them together?

        After searching around online, the general consensus seemed to be that you need to make a video file that throws it all together. For the background image I used a still of the book cover art. Then I ran an ffmpeg command that looked something like this:

        ffmpeg -loop 1 -i cover.jpg -i abogen_file.wav -vf subtitles=abogen_file.ass -shortest audio_book.mov
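
For anyone who wants to run this over a whole series, here is a rough Python sketch that wraps the same command. The function names `build_ffmpeg_args` and `make_video` are made up for illustration, and it assumes ffmpeg is on your PATH and was built with subtitle (libass) support. File names containing colons, quotes, or backslashes would probably need extra escaping inside the `subtitles=` filter, which this sketch does not handle.

```python
import subprocess


def build_ffmpeg_args(cover, audio, subs, output):
    """Return the ffmpeg argument list for a still-image video with rendered subtitles.

    Hypothetical helper: it mirrors the command in the comment above.
    """
    return [
        "ffmpeg",
        "-loop", "1",                # repeat the single cover image as the video stream
        "-i", str(cover),
        "-i", str(audio),
        "-vf", f"subtitles={subs}",  # draw the .ass subtitles onto the video frames
        "-shortest",                 # stop at the end of the audio (the looped image never ends)
        str(output),
    ]


def make_video(cover, audio, subs, output):
    """Run ffmpeg; raises subprocess.CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_args(cover, audio, subs, output), check=True)
```

Calling `make_video("cover.jpg", "abogen_file.wav", "abogen_file.ass", "audio_book.mov")` should be equivalent to the one-liner. Passing the arguments as a list avoids shell quoting problems with spaces in book titles.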
        

        It sounds pretty awesome and looks like this while it’s playing!

        bUtdFKluimxbNPg.jpg

        • CagedDingo@aussie.zone
          arrow-up
          1
          ·
          19 hours ago

          If you use VLC or some other capable player it’ll automatically pick up the subtitles if they have the same name (sans extension).
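
If the two files ended up with different names, a small Python sketch like this could line them up. The helper names are hypothetical, and it only covers the simple case described above (same folder, same name, different extension); the exact matching rules of any particular player are an assumption here.

```python
from pathlib import Path


def sidecar_subtitle_path(media, subtitle_ext=".ass"):
    """Return the subtitle path that shares the media file's folder and base name."""
    return Path(media).with_suffix(subtitle_ext)


def match_subtitle_name(media, subtitle):
    """Rename the subtitle file so it sits next to the media file with the same base name."""
    subtitle = Path(subtitle)
    target = sidecar_subtitle_path(media, subtitle.suffix)
    if subtitle != target:
        subtitle.rename(target)  # moves the file if it was in a different folder
    return target
```

For example, `sidecar_subtitle_path("abogen_file.wav")` gives `abogen_file.ass`, which is the name the player would be expected to look for.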