Extracting specific PDFs from a zip file that contains many PDFs

Answer 1

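To extract a specific file from a zip archive (-j drops the directory path stored in the archive, -d sets the destination directory):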

unzip -j "myarchive.zip" "in/archive/file.pdf" -d "/destination/path/"

In your script, that becomes:

# Set a destination path
dest="/path/to/unzip/to"
# Temporary text file that receives the PDF's text
tempfile=$(mktemp)
# Write the archive member to stdout and convert it to text
unzip -p "$z" "$f" | pdftotext - "$tempfile"
# Extract the PDF only if its text contains the search string
if grep -q -- "$searchString" "$tempfile"; then
    unzip -j "$z" "$f" -d "$dest"
    # Report which archive and member matched
    echo "$z -> $f"
fi
rm -f -- "$tempfile"
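
The snippet above handles one archive member and assumes your script already sets $z (the zip file), $f (the member name) and $searchString. If it helps, here is a rough sketch of how the whole loop could look for one archive. The function name extract_matching_pdfs is made up for illustration, and the sketch assumes the poppler version of pdftotext, which accepts "-" for both stdin and stdout, so no temp file is needed.

```shell
# Sketch: copy every PDF whose text matches a pattern out of one zip archive.
# extract_matching_pdfs is a hypothetical helper name, not part of unzip or poppler.
extract_matching_pdfs() {
    local z="$1" searchString="$2" dest="$3"
    mkdir -p "$dest" || return 1
    # unzip -Z1 (zipinfo mode) prints one member name per line
    unzip -Z1 "$z" | while IFS= read -r f; do
        case $f in
            *.pdf|*.PDF) ;;      # only inspect PDF members
            *) continue ;;
        esac
        # Stream the member through pdftotext and search the text directly
        if unzip -p "$z" "$f" | pdftotext - - 2>/dev/null | grep -q -- "$searchString"; then
            # -o overwrites without prompting, -j drops the archived directory path
            unzip -o -j "$z" "$f" -d "$dest" >/dev/null
            echo "$z -> $f"
        fi
    done
}
```

You would call it as: extract_matching_pdfs "myarchive.zip" "some text" "/path/to/unzip/to". Two caveats: unzip treats member names as wildcard patterns, so names containing [ ] * or ? may need escaping, and because -j flattens paths, two members with the same base name will overwrite each other in the destination.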
