Pattern matching and text processing for song file lists

Answer 1

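When you want to compare files, it is often useful to have them in sorted order. The -u option makes every line of the output unique by removing any duplicates.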

sort -u file1 > file1.sorted
sort -u file2 > file2.sorted
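As a small illustration of what sort -u does here, the sketch below runs it on a made-up three-line list (the paths and the temporary directory are assumptions, not data from the question). It pins the collation with LC_ALL=C, which is optional but keeps the ordering the same one comm will expect later if both tools run under that locale.

```shell
# Hypothetical sample list containing one duplicate entry
tmpdir=$(mktemp -d)
printf '%s\n' \
  './B/Album/02 - Two.mp3' \
  './A/Album/01 - One.mp3' \
  './B/Album/02 - Two.mp3' > "$tmpdir/file1"

# Sort and drop duplicate lines in one step
LC_ALL=C sort -u "$tmpdir/file1" > "$tmpdir/file1.sorted"

# Two lines remain, with the ./A/... entry first
cat "$tmpdir/file1.sorted"
```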

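comm can compare sorted files, but it only compares literal text. So this reduces the problem, but it filters out only exact matches. -1 removes the lines unique to the first file and -3 removes the lines common to both file arguments. That leaves us with the lines unique to the second file.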

comm -1 -3 file1.sorted file2.sorted > file2.reduced
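To see which lines survive this step, here is a minimal sketch on two invented, already sorted lists (the file contents are assumptions for illustration). Only the exact duplicate is dropped; the track that exists in file1 under a different name and format still passes through, which is why the next step is needed.

```shell
tmpdir=$(mktemp -d)

# Hypothetical sorted lists; "Two" exists in both, but with different naming
printf '%s\n' './A/x/01 - One.mp3' './A/x/02 - Two.mp3' > "$tmpdir/file1.sorted"
printf '%s\n' './A/x/01 - One.mp3' './A/x/2. Two.flac' './B/y/01 - Three.mp3' > "$tmpdir/file2.sorted"

# Suppress column 1 (only in file1) and column 3 (in both): keeps lines only in file2
LC_ALL=C comm -1 -3 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted" > "$tmpdir/file2.reduced"

reduced=$(cat "$tmpdir/file2.reduced")
printf '%s\n' "$reduced"
```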

Now we only need to do the complicated work for this (hopefully) smaller file.

sed 's#^\./##' file2.reduced | while IFS= read -r line; do
  artist_album=${line%/*}
  filename=${line##*/}
  # Strip a leading track-number prefix: "NN. ", "N - NN - " or "NN - "
  title=$(printf '%s\n' "$filename" | sed -e 's/^[0-9]\{1,3\}\. //' -e t -e 's/^[0-9]\{1,3\} - [0-9]\{1,3\} - //' -e t -e 's/^[0-9]\{1,3\} - //')
  extension=${title##*.}
  title=${title%."$extension"}
  # We use fixed strings in case there are special chars in the file name.
  # If the file names are "regex-safe" we can use one grep instead:
  # ! grep -q -E "^\./$artist_album/.*$title\.(mp3|flac)\$" file1.sorted
  if ! grep -F "./$artist_album/" file1.sorted | grep -F -e "$title." | grep -q -E '(mp3|flac)$'; then
    printf '%s\n' "./$line"
  fi
done > results
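The title extraction in the loop can be tried in isolation. The sketch below wraps it in a helper named strip_title (a hypothetical name introduced here, not part of the answer) and feeds it three invented file names, one per prefix style the sed expressions handle. It assumes the prefixes look exactly like "NN. ", "N - NN - " or "NN - "; other naming schemes would need their own expression.

```shell
# strip_title is a hypothetical helper: it removes a leading track-number
# prefix and then the file extension, mirroring the loop body.
strip_title() {
  st_title=$(printf '%s\n' "$1" | sed \
    -e 's/^[0-9]\{1,3\}\. //' -e t \
    -e 's/^[0-9]\{1,3\} - [0-9]\{1,3\} - //' -e t \
    -e 's/^[0-9]\{1,3\} - //')
  st_ext=${st_title##*.}
  # Quoting the extension keeps it from being read as a glob pattern
  printf '%s\n' "${st_title%."$st_ext"}"
}

strip_title '01. Song.mp3'
strip_title '1 - 03 - Song Name.flac'
strip_title '12 - Song.mp3'
```

Each t command skips the remaining substitutions once one prefix has been removed, so a title that itself starts with digits after the prefix is not stripped a second time.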
