Recursively compress all PDF files

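I want to compress all PDF files in a given directory and its subdirectories using Ghostscript.

I am stuck on using the find command inside a loop over file names, including names that contain spaces.

Here is some sample code showing what I am after: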

pdffiles=$(find /path/to/directory -type f -name *.pdf)
for file in pdffiles; do
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH -dQUIET -sOutputFile=new_$file $file; 
  rm $file;
  mv new_$file $file;
done;

Any idea how to solve the problem with spaces? Is there a better way?

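Answer 1

A one-liner is also an option. Here the file name is passed to bash as $1 rather than pasted into the command string, the output is written next to the original, and the original is replaced only if gs succeeds:

find . -type f -name "*.pdf" -exec bash -c 'gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH -dQUIET -sOutputFile="$1.new" "$1" && mv "$1.new" "$1"' _ {} \;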
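
A possible variant of this one-liner, as a sketch only: `{} +` hands many file names to one inner shell at a time, and the preset becomes a parameter. The function name `compress_tree`, the `.tmp` suffix and the default preset are assumptions made for illustration; the `gs` options are the ones from the question.

```shell
#!/bin/sh
# compress_tree DIR [PRESET]
# Hypothetical helper: recompress every PDF below DIR in place.
# PRESET is a Ghostscript -dPDFSETTINGS value without the slash (default: screen).
compress_tree() {
    find "$1" -type f -name '*.pdf' -exec sh -c '
        preset=$1
        shift
        for f in "$@"; do
            # Write to a sibling temp file; replace the original only if gs succeeds.
            if gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
                  -dPDFSETTINGS="/$preset" -dNOPAUSE -dBATCH -dQUIET \
                  -sOutputFile="$f.tmp" "$f"
            then
                mv -- "$f.tmp" "$f"
            else
                rm -f -- "$f.tmp"
            fi
        done
    ' sh "${2:-screen}" {} +
}
```

It would be called as `compress_tree /path/to/directory ebook`. Because the names travel as positional parameters, spaces and other unusual characters in file names need no special handling.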

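Answer 2

I refactored my script based on your great replies, and it works very well :)

Here is the refactored, improved code, with logging, parameters and a few other things. I am always happy to improve my code.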

#!/bin/bash
        
## Script to compress PDF Files using Ghostscript incl. subdirs
## Copyright (C) 2016 Maximilian Fries - All Rights Reserved
## Contact: [email protected]
## Last revised 2016-07-29

# Usage
# ./pdf-compress.sh [screen|ebook|prepress|default] [verbose]

# Variables and preparation
{
  count=0
  success=0
  successlog=./success.tmp
  gain=0
  gainlog=./gain.tmp
  pdfs=$(find ./ -type f -name "*.pdf")
  # grep -c '^' reports 0 for an empty list; echo | wc -l would report 1
  total=$(printf '%s' "$pdfs" | grep -c '^')
  log=./log
  verbose="-dQUIET"
  mode="prepress"
  echo "0" | tee $successlog $gainlog > /dev/null
}

# Are there any PDFs?
if [ "$total" -gt 0 ]; then

    #Parameter Handling & Logging
    {
        echo "-- Debugging for Log START --"
        echo "Number of Parameters: $#"
        echo "Parameters are: $*"
        echo "-- Debugging for Log END   --"
    } >> $log

    # Only compression-mode set
    if [ $# -eq 1 ]; then
        mode="$1"
    fi

    # Also Verbose Level Set
    if [ $# -eq 2 ]; then
        mode="$1"
        verbose=""
    fi

    echo "$pdfs" | while read -r file
    do
        ((count++))
        echo "Processing File #$count of $total Files" | tee -a $log
        echo "Current File: $file "| tee -a $log
        gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS="/$mode" -dNOPAUSE \
                -dBATCH $verbose -sOutputFile="$file-new" "$file" | tee -a $log
    
        sizeold=$(wc -c "$file"     | cut -d' ' -f1)
        sizenew=$(wc -c "$file-new" | cut -d' ' -f1)
        difference=$((sizenew-sizeold))

        # Check if new filesize is smaller
        if [ $difference -lt 0 ]
        then
            rm "$file"
            mv "$file-new" "$file"
            printf "Compression was successful. New File is %'.f Bytes smaller\n" \
                    $((-difference)) | tee -a $log
            ((success++)) 
            echo $success > $successlog
            ((gain-=difference))
            echo $gain > $gainlog
        else
            rm "$file-new"
            echo "Compression was not necessary" | tee -a $log
        fi

    done

    # Print Statistics
    printf "Successfully compressed %'.f of %'.f files\n" $(cat $successlog) $total | tee -a $log
    printf "Saved a total of %'.f Bytes\n" $(cat $gainlog) | tee -a $log

    rm $successlog $gainlog

else
    echo "No PDF File in Directory"
fi
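
The script above passes `success` and `gain` through temporary files because its `while` loop sits on the right-hand side of a pipe, where bash runs it in a subshell and any variable changes are lost once the loop ends. A minimal sketch of one alternative feeds the loop from a here-document instead, so the counters survive. The function name `pdf_stats` and its two global variables are hypothetical, and like the script it assumes file names contain no newlines.

```shell
#!/bin/sh
# pdf_stats DIR
# Hypothetical helper: sets the globals pdf_count and pdf_bytes to the number
# of PDFs below DIR and their combined size in bytes.
pdf_stats() {
    pdf_count=0
    pdf_bytes=0
    # The here-document keeps the loop in the current shell, unlike `find | while`.
    while IFS= read -r file; do
        [ -n "$file" ] || continue      # single empty line when find matched nothing
        pdf_count=$((pdf_count + 1))
        size=$(wc -c < "$file")
        pdf_bytes=$((pdf_bytes + size))
    done <<EOF
$(find "$1" -type f -name '*.pdf')
EOF
}
```

After `pdf_stats ./` the values are available as `$pdf_count` and `$pdf_bytes` in the calling shell. In bash, `done < <(find ...)` has the same effect.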

Answer 3

Your loop is better written as

find ... | while read -r file

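But you need to make sure to quote the file name inside the loop, and to quote the pattern so the shell does not expand it before find sees it. So we end up with

find /path/to/directory -type f -name '*.pdf' | while IFS= read -r file
do
  # $file is a full path, so the temporary name gets a suffix, not a new_ prefix
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH -dQUIET -sOutputFile="$file.new" "$file" &&
    mv "$file.new" "$file"
done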

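(Also note that none of those semicolons are needed.)

Now, this loop has potential file ownership and permission problems, but that is another question :-)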
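
On the ownership and permission point: `mv` puts a file freshly created by `gs` in place of the original, so the result carries the current user's ownership and default permissions. One way around this, sketched here under the assumption that the original file is writable, is to copy the new bytes into the existing file, which keeps its owner and mode. The function name `replace_contents` is hypothetical.

```shell
#!/bin/sh
# replace_contents NEW ORIGINAL
# Overwrite ORIGINAL with the bytes of NEW, then delete NEW.
# ORIGINAL keeps its owner and mode because the file itself is never recreated.
replace_contents() {
    cat -- "$1" > "$2" && rm -f -- "$1"
}
```

In the loop it would stand in for the `mv` step, for example `replace_contents "$file.new" "$file"`. The trade-off is that the overwrite is not atomic: an interruption midway leaves a truncated original, which a rename does not.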
