Split an Audible audiobook into chapters using ffmpeg?

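I've been following this answer to use ffmpeg to convert and play some of my Audible audiobooks on Linux Mint. Each book is a single source file, but I've noticed that ffmpeg lists all the chapters at the start of the conversion.

Is there a way to get ffmpeg to split the book into chapters, converting each chapter into a separate file (splitting by chapter)? Preferably using ffmpeg alone, but using other programs/scripts (together with ffmpeg) is also an option...

(I've seen some other answers about splitting a DVD into even-length chunks or into chapters (using ffmpeg with a Python script), but that's not quite what I'm after, so I'm hoping there's a simpler way of doing this...)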

Answer 1

I've been doing this myself recently. As Nemo commented above, ffprobe gives you JSON output with the start and end of each chapter, using the command...

ffprobe -i fileName -print_format json -show_chapters

If you add -sexagesimal to that command, it produces more readable output (IMO), and the output can be redirected to a file for later processing.
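
If a later step needs plain seconds rather than the H:MM:SS.ffffff form that -sexagesimal prints, the conversion is simple arithmetic. The following is a minimal sketch, not part of the original answer; `to_seconds` is a hypothetical helper name, and it assumes the timestamp always has exactly three colon-separated fields.

```bash
#!/bin/bash
# Hypothetical helper: convert an H:MM:SS.ffffff timestamp
# (as printed by ffprobe -sexagesimal) into seconds.
# Assumes exactly three colon-separated fields.
to_seconds() {
  printf '%s\n' "$1" | awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# 0 h, 53 min, 26.908 s -> 53 * 60 + 26.908
to_seconds "0:53:26.908000"   # prints 3206.908
```

The result agrees with the raw "end" value of 3206908 (time base 1/1000) that ffprobe reports for the same chapter in the sample JSON further down.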

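FFmpeg needs a little help, so I also used jq and AtomicParsley: the former parses the JSON file, and the latter adds the cover image and metadata to the resulting m4b files.

The script also supports output as m4a files, or conversion to mp3 if required. Call it with $1 as the input file and (optionally) $2 as the output type, which defaults to m4b.

Using that as a basis, I created the following script...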

#!/bin/bash

# Script to split m4b (audiobook) files with embedded chapters (e.g. converted from Audible) into individual chapter files

# Required: ffmpeg; jq (JSON processor) and AtomicParsley (to embed pictures and add additional metadata to m4a/m4b AAC files)

# discover the file type (extension) of the input file
ext=${1##*.}
echo "extension: $ext"
# all files / folders are named based on the "shortname" of the input file
shortname=$(basename "$1" ".$ext")
picture=$shortname.jpg
chapterdata=$shortname.dat
metadata=$shortname.tmp
echo "shortname: $shortname"

# if an output type has been given on the command line, set parameters (used in ffmpeg command later)
if [[ $2 = "mp3" ]]; then
  outputtype="mp3"
  codec="libmp3lame"
elif [[ $2 = "m4a" ]]; then
  outputtype="m4a"
  codec="copy"
else
  outputtype="m4b"
  codec="copy"
fi
echo "outputtype: |$outputtype|"

# if it doesn't already exist, create a JSON file containing the chapter breaks (you can edit this file if you want chapters to have real names rather than the "Chapter 1", etc. that Audible uses)
[ ! -e "$chapterdata" ] && ffprobe -loglevel error \
            -i "$1" -print_format json -show_chapters -sexagesimal \
            >"$chapterdata"
read -p "Now edit the file $chapterdata if required. Press ENTER to continue."
# comment out above if you don't want the script to pause!

# read the chapters into arrays for later processing (one array element per line of jq output)
readarray -t id < <(jq -r '.chapters[].id' "$chapterdata")
readarray -t start < <(jq -r '.chapters[].start_time' "$chapterdata")
readarray -t end < <(jq -r '.chapters[].end_time' "$chapterdata")
readarray -t title < <(jq -r '.chapters[].tags.title' "$chapterdata")

# create an ffmpeg metadata file to extract additional metadata lost when splitting files - deleted afterwards
ffmpeg -loglevel error -i "$1" -f ffmetadata "$metadata"
artist_sort=$(grep -m 1 '^sort_artist' "$metadata" | sed 's/.*=\(.*\)/\1/')
album_sort=$(grep -m 1 '^sort_album' "$metadata" | sed 's/.*=\(.*\)/\1/')
rm "$metadata"

# create directory for the output
mkdir -p "$shortname"
echo -e "\fID\tStart Time\tEnd Time\tTitle\t\tFilename"
for i in "${!id[@]}"; do
  trackno=$((i + 1))
  # set the name for output - currently in format <bookname>/<track number>. <bookname> - <chapter title>
  outname="$shortname/$(printf "%02d" "$trackno"). $shortname - ${title[$i]}.$outputtype"
  #outname=$(sed -e 's/[^A-Za-z0-9._- ]/_/g' <<< $outname)
  outname=$(sed 's/:/_/g' <<< "$outname")
  echo -e "${id[$i]}\t${start[$i]}\t${end[$i]}\t${title[$i]}\n\t\t$(basename "$outname")"
  # quote every expansion so titles and paths containing spaces survive
  ffmpeg -loglevel error -i "$1" -vn -c "$codec" \
            -ss "${start[$i]}" -to "${end[$i]}" \
            -metadata title="${title[$i]}" \
            -metadata track="$trackno" \
            -map_metadata 0 -id3v2_version 3 \
            "$outname"
  [[ $outputtype == m4* ]] && AtomicParsley "$outname" \
            --artwork "$picture" --overWrite \
            --sortOrder artist "$artist_sort" \
            --sortOrder album "$album_sort" \
            > /dev/null
done
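
The script above only replaces colons in the output name; the stricter sed line is commented out, and its bracket expression appears to end in an invalid range (`_- `), which may be why. As a hedged alternative, here is a minimal sketch of a stricter sanitizer. `sanitize_name` is a hypothetical helper, not part of the original script, and the set of allowed characters is an assumption. It should be applied to the chapter title only, not to the full path, because it also replaces "/".

```bash
#!/bin/bash
# Hypothetical helper: replace every character that is not a letter, digit,
# dot, underscore, space or hyphen with an underscore.
# The hyphen is placed last in the set so that tr treats it literally.
# Note: tr works on bytes, so a multi-byte UTF-8 character becomes several underscores.
sanitize_name() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9._ -' '_'
  echo
}

sanitize_name "Chapter 1: Intro/Part 2?"   # prints Chapter 1_ Intro_Part 2_
```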

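If you want, you can edit the JSON file (the .dat file), since Audible files simply name the chapters "Chapter 1", "Chapter 2", etc.

For example, the first part of the file might originally read...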

{
    "chapters": [
        {
            "id": 0,
            "time_base": "1/1000",
            "start": 0,
            "start_time": "0:00:00.000000",
            "end": 3206908,
            "end_time": "0:53:26.908000",
            "tags": {
                "title": "Chapter 1"
            }
        },

Simply changing the relevant line to "title": "Introduction" changes the name of the resulting split file.

Answer 2

You can use this minimal approach, which depends on ffmpeg (and with it ffprobe) as well as jq:

#!/bin/bash
# Description: Split an 
# Requires: ffmpeg, jq
# Author: Hasan Arous
# License: MIT

in="$1"
out="$2"
splits=""
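# Build one set of output options per chapter; ffmpeg then writes all chapter files in a single run.
# gsub replaces every space in the title (sub would replace only the first one).
while read -r start end title; do
  splits="$splits -c copy -ss $start -to $end $out/$title.m4b"
done < <(ffprobe -i "$in" -print_format json -show_chapters \
  | jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | gsub(" "; "_"))')

# $splits is left unquoted on purpose so that it expands into separate arguments;
# this means the output directory name must not contain spaces
ffmpeg -i "$in" $splits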

https://gist.github.com/aularon/c48173f8246fa57e9c1ef7ff694ab06f

Answer 3

You can use ffprobe to get the start and end times of the chapters with the following command...

ffprobe -i fileName -print_format json -show_chapters

You can then use ffmpeg to split at the start and end times...

ffmpeg -i fileName -ss start -to end outFile

Make sure not to use "-t" here; it takes a duration. "-ss" and "-to" are time positions in the file.
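
To make the difference concrete: if you did want to use "-t", you would first have to subtract the start position from the end position. The following is a small sketch of that arithmetic, assuming both positions are already expressed in seconds; `chapter_duration` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: length of a chapter in seconds,
# given its start and end positions in seconds.
chapter_duration() {
  awk -v s="$1" -v e="$2" 'BEGIN { printf "%.3f\n", e - s }'
}

chapter_duration 10 25   # prints 15.000

# With both options placed after -i, these two commands should select the same span:
#   ffmpeg -i fileName -ss 10 -to 25 outFile
#   ffmpeg -i fileName -ss 10 -t 15 outFile
```

Since ffprobe already reports chapter positions, using "-to" avoids this extra step.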

You would have to write a script to automate this.
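
As a starting point for such a script, here is a minimal dry-run sketch that only prints the ffmpeg commands rather than running them. `print_split_cmds`, the "start end title" input format, the numbered output names and the use of "-c copy" are all assumptions made for illustration. The first sample line uses the chapter times from the JSON shown in Answer 1; the second is made up.

```bash
#!/bin/bash
# Dry run: print one ffmpeg command per chapter instead of executing it.
# stdin: one line per chapter in the form "<start> <end> <title>"
# (a hypothetical format; the title may contain spaces).
print_split_cmds() {
  in=$1
  n=0
  while read -r start end title; do
    n=$((n + 1))
    printf 'ffmpeg -i "%s" -ss %s -to %s -c copy "%02d - %s.m4b"\n' \
      "$in" "$start" "$end" "$n" "$title"
  done
}

# Sample input: the second line is illustrative only.
print_split_cmds book.m4b <<'EOF'
0:00:00.000000 0:53:26.908000 Chapter 1
0:53:26.908000 1:41:02.000000 Chapter 2
EOF
```

Once the printed commands look right, they can be piped to a shell or the printf can be swapped for the real ffmpeg call. In that case it is worth adding -nostdin to ffmpeg so that it does not consume the chapter list from standard input.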

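Answer 4

I ran into a small bug. If I use this script without the output-type argument to convert an MP3 audiobook (speech.mp3), I get a lot of empty m4b files (one per chapter, each of size 0).

I made some changes: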

#!/bin/bash

# Script to split m4b (audiobook) files with embedded chapters (e.g. converted from Audible) into individual chapter files

# Required: ffmpeg; jq (JSON processor) and AtomicParsley (to embed pictures and add additional metadata to m4a/m4b AAC files)

# discover the file type (extension) of the input file
ext=${1##*.}
echo "extension: $ext"
# all files / folders are named based on the "shortname" of the input file
shortname=$(basename "$1" ".$ext")
picture=$shortname.jpg
chapterdata=$shortname.dat
metadata=$shortname.tmp
echo "shortname: $shortname"

extension="${1##*.}"

forcemp3=0

if [ "$extension" == "mp3" ]; then
  forcemp3=1
fi

# if an output type has been given on the command line, set parameters (used in ffmpeg command later)
if [[  $2 = "mp3"  ||  $forcemp3 = 1  ]] ; then
  outputtype="mp3"
  codec="libmp3lame"
  echo mp3
elif [[ $2 = "m4a" ]]; then
  outputtype="m4a"
  codec="copy"
else
  outputtype="m4b"
  codec="copy"
fi
echo "outputtype: |$outputtype|"

# if it doesn't already exist, create a JSON file containing the chapter breaks (you can edit this file if you want chapters to have real names rather than the "Chapter 1", etc. that Audible uses)
[ ! -e "$chapterdata" ] && ffprobe -loglevel error \
            -i "$1" -print_format json -show_chapters -sexagesimal \
            >"$chapterdata"
read -p "Now edit the file $chapterdata if required. Press ENTER to continue."
# comment out above if you don't want the script to pause!

# read the chapters into arrays for later processing (one array element per line of jq output)
readarray -t id < <(jq -r '.chapters[].id' "$chapterdata")
readarray -t start < <(jq -r '.chapters[].start_time' "$chapterdata")
readarray -t end < <(jq -r '.chapters[].end_time' "$chapterdata")
readarray -t title < <(jq -r '.chapters[].tags.title' "$chapterdata")

# create an ffmpeg metadata file to extract additional metadata lost when splitting files - deleted afterwards
ffmpeg -loglevel error -i "$1" -f ffmetadata "$metadata"
artist_sort=$(grep -m 1 '^sort_artist' "$metadata" | sed 's/.*=\(.*\)/\1/')
album_sort=$(grep -m 1 '^sort_album' "$metadata" | sed 's/.*=\(.*\)/\1/')
rm "$metadata"

# create directory for the output
mkdir -p "$shortname"
echo -e "\fID\tStart Time\tEnd Time\tTitle\t\tFilename"
for i in "${!id[@]}"; do
  trackno=$((i + 1))
  # set the name for output - currently in format <bookname>/<track number>. <bookname> - <chapter title>
  outname="$shortname/$(printf "%02d" "$trackno"). $shortname - ${title[$i]}.$outputtype"
  #outname=$(sed -e 's/[^A-Za-z0-9._- ]/_/g' <<< $outname)
  outname=$(sed 's/:/_/g' <<< "$outname")
  echo -e "${id[$i]}\t${start[$i]}\t${end[$i]}\t${title[$i]}\n\t\t$(basename "$outname")"
  # quote every expansion so titles and paths containing spaces survive
  ffmpeg -loglevel error -i "$1" -vn -c "$codec" \
            -ss "${start[$i]}" -to "${end[$i]}" \
            -metadata title="${title[$i]}" \
            -metadata track="$trackno" \
            -map_metadata 0 -id3v2_version 3 \
            "$outname"
  [[ $outputtype == m4* ]] && AtomicParsley "$outname" \
            --artwork "$picture" --overWrite \
            --sortOrder artist "$artist_sort" \
            --sortOrder album "$album_sort" \
            > /dev/null
done
