Need to rename many very long file names using variables

Answer 1

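This PowerShell script will rename the files the way you need. Save it as RenameFiles.ps1 and run it from the PowerShell console.

The script accepts the following arguments:

Path: required, an existing folder on disk where your files are stored. You can provide multiple paths.
Recurse: optional switch, controls recursion. If specified, the script will rename files in all subfolders.
WhatIf: optional switch. If specified, the script will only report the old and new file names. No renaming is done.

Examples (run from the PowerShell console):

Rename all files in the folder c:\path\to\files:
```
.\RenameFiles.ps1 -Path 'c:\path\to\files'
```
Rename all pdf files in the folder c:\path\to\files:
```
.\RenameFiles.ps1 -Path 'c:\path\to\files\*.pdf'
```
Rename all pdf files in the folder c:\path\to\files, recursively:
```
.\RenameFiles.ps1 -Path 'c:\path\to\files\*.pdf' -Recurse
```

Scan files in multiple folders, recursively, report only (no renaming):
```
.\RenameFiles.ps1 -Path 'c:\path\A\*.pdf', 'c:\path\B\*.psd' -Recurse -WhatIf
```

The RenameFiles.ps1 script itself: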

```
# Arguments accepted by the script
Param
(
    # One or multiple paths, as an array of strings
    [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
    [string[]]$Path,

    # Recurse switch
    [switch]$Recurse,

    # WhatIf switch
    [switch]$WhatIf
)

# This function transforms a long file name (w/o extension) into a short one via regex
function Split-FileName
{
    [CmdletBinding()]
    Param
    (
        # Original file name
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$FileName
    )

    Begin
    {
        # You can change this block to adapt new rules for file renaming,
        # without modifying other parts of the script.

        # Regex to match; capture groups are used to build the new file name
        $Regex = '(Sample2).*(\d{2}-\d{2}-\d{4}).*(?<=[a-z]_)(\d+)(?=_\d+).*(?<=_)(\d+)$'

        # Scriptblock that builds the new file name. $Matches is a hashtable, but we need an array for the format (-f) operator.
        # So this code: @(0..($Matches.Count - 1) | ForEach-Object {$Matches[$_]}) transforms it into an array.

        # Basically, we create a new array of integers from 0 to the last $Matches key, e.g. @(0,1,2,3,4),
        # and pass it down the pipeline. Then, in the foreach loop, we output the values of the $Matches keys
        # that match the current pipeline object, e.g. $Matches[1], $Matches[2], etc.
        # $Matches[0] holds the whole matched string, the other keys hold the capture groups.

        # This would also work:
        # $NewFileName = {'{0}_{1}_{2}_{3}' -f $Matches[1], $Matches[2], $Matches[3], $Matches[4]}

        $NewFileName = {'{1}_{2}_{3}_{4}' -f @(0..($Matches.Count - 1) | ForEach-Object {$Matches[$_]})}

    }

    Process
    {
        # If the original file name matches the regex
        if($FileName -match $Regex)
        {
            # Call the scriptblock to generate the new file name
            . $NewFileName
        }
    }
}

# For each path, get all file objects
Get-ChildItem -Path $Path -Recurse:$Recurse |
    # That are not directories
    Where-Object {!$_.PsIsContainer} |
        # For each file
        ForEach-Object {
            # Try to create a new file name
            $NewBaseName = $_.BaseName | Split-FileName

            if($NewBaseName)
            {
                # If the file name matched the regex and we've got a new file name...

                # Build the full path for the file with the new name
                $NewFullName = Join-Path -Path $_.DirectoryName -ChildPath ($NewBaseName + $_.Extension)

                if(Test-Path -Path $NewFullName -PathType Leaf)
                {
                    # If such a file already exists, show an error message
                    Write-Host "File already exists: $NewFullName"
                }
                else
                {
                    # If not, rename it or just show a report, depending on the WhatIf switch
                    Rename-Item -Path $_.FullName -NewName $NewFullName -WhatIf:$WhatIf -Force
                }
            }
        }
```

The regex used in this script: https://regex101.com/r/hT2uN9/2 (note that PowerShell regex matching is case-insensitive by default). A copy of the regex explanation follows:

Regex:

(Sample2).*(\d{2}-\d{2}-\d{4}).*(?<=[a-z]_)(\d+)(?=_\d+).*(?<=_)(\d+)$

Sample2 string:

1st Capturing group (Sample2)

Sample2 matches the characters Sample2 literally (case insensitive)

Any characters (not captured and not present in the $Matches variable):

.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible,
giving back as needed [greedy]

Date:

2nd Capturing group (\d{2}-\d{2}-\d{4})

\d{2} match a digit [0-9]
Quantifier: {2} Exactly 2 times
- matches the character - literally

\d{2} match a digit [0-9]
Quantifier: {2} Exactly 2 times
- matches the character - literally

\d{4} match a digit [0-9]
Quantifier: {4} Exactly 4 times

Any characters (not captured and not present in the $Matches variable):

.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible,
giving back as needed [greedy]

Number of pages:

(?<=[a-z]_) Positive Lookbehind - Assert that the regex below can be matched

[a-z] match a single character present in the list below
a-z a single character in the range between a and z (case insensitive)
_ matches the character _ literally

3rd Capturing group (\d+)

\d+ match a digit [0-9]
Quantifier: + Between one and unlimited times, as many times as possible,
giving back as needed [greedy]

(?=_\d+) Positive Lookahead - Assert that the regex below can be matched
_ matches the character _ literally

\d+ match a digit [0-9]
Quantifier: + Between one and unlimited times, as many times as possible,
giving back as needed [greedy]

Any characters (not captured and not present in the $Matches variable):

.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible,
giving back as needed [greedy]

ID:

(?<=_) Positive Lookbehind - Assert that the regex below can be matched
_ matches the character _ literally

4th Capturing group (\d+)

\d+ match a digit [0-9]
Quantifier: + Between one and unlimited times, as many times as possible,
giving back as needed [greedy]


Answer 2

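As Karan's link shows, regex is the way to do this. I'm on Linux, so I'm not sure whether PowerShell has a suitable built-in, but if it doesn't, download sed for Windows from SourceForge. It's simply awesome.

My sed-fu is terrible, but this will reformat the original string into the new one:

```
sed -r 's/Sample1_(Sample2_)[0-9]*_(..-..-....)_.*-[A-Z_]*(_[0-9][0-9]*_)._Sample4_(.*)/\1\2\3\4/'
```

I'm sure there are simpler ways to achieve the same thing.

If you can read bash, here is an example of how to use it for renaming:

```
for i in *; do mv -- "$i" "$(printf '%s\n' "$i" | sed -r 's/Sample1_(Sample2_)[0-9]*_(..-..-....)_.*-[A-Z_]*(_[0-9][0-9]*_)._Sample4_(.*)/\1\2\3\4/')"; done
```

No doubt it is fairly simple to write a similar script in PowerShell, but that is left as an exercise for the reader :P

EDIT: fixed errors

EDIT2: Looking at what I wrote, it may be hard to understand, so I'll try to show what I was trying to do:

In general, the regex reads the whole line, and the parts we want to keep are enclosed in parentheses. These are called patterns (capture groups). Once the line has been read, everything except the selected patterns is discarded.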

```
sed -r   //-r switch is here only to allow the use of parens without escaping them. It's confusing enough without backslashes.
's/      //s is the command, stands for substitute. Syntax: s/[search pattern]/[replace pattern]/. The string matching SP is replaced with RP.
         //Here I use the command to match the whole line and save the parts I want.

Sample1_(Sample2_)  //set "Sample2_" as first pattern
[0-9]*_(..-..-....) //read onwards and skip zero or more numerals ([0-9]*) between two underscores. Read xx-xx-xxxx as second pattern, where x is any character
_.*-[A-Z_]*(_[0-9][0-9]*_) //after the underscore, skip any number of characters (.*) until you run across a dash. After that, skip any number of capital letters and underscores until you run into an underscore followed by one or more numerals and an underscore (_[0-9][0-9]*_). Save that as pat 3
._Sample4_(.*) //grab everything after Sample4_ as pat 4
/\1\2\3\4/'   //The first slash ends the search pattern for the s command and begins the replace pattern. After that, \1, \2, \3 and \4 insert the patterns we saved in the search part, discarding the rest. The final slash ends the s command.
```
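To make the breakdown concrete, here is a minimal sketch that runs the same substitution on a single name. The file name is made up for illustration (the real names are not shown in this thread), and the `-r` switch assumes GNU sed.

```shell
# Hypothetical file name; only its shape matters for the pattern.
old='Sample1_Sample2_12345_01-02-2015_foo-ABC_42_7_Sample4_789.pdf'

# Same substitution as above: keep the four saved patterns, drop the rest.
new=$(printf '%s\n' "$old" |
  sed -r 's/Sample1_(Sample2_)[0-9]*_(..-..-....)_.*-[A-Z_]*(_[0-9][0-9]*_)._Sample4_(.*)/\1\2\3\4/')

echo "$new"   # prints: Sample2_01-02-2015_42_789.pdf
```

Here pattern 1 is `Sample2_`, pattern 2 is `01-02-2015`, pattern 3 is `_42_` and pattern 4 is `789.pdf`; everything else on the line is matched but not saved, so it disappears from the result.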

Regex is hard to read, but easy to write. That also means it is easy to get wrong and hard to debug, but you can't have everything.

Here is the gist of the shell script in a basic/python/pseudocode-ish form.

```
for OLDNAME in DIRECTORY
     let NEWNAME = output of sed command with OLDNAME piped as input.
     rename OLDNAME NEWNAME
next
```
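The outline above could be fleshed out roughly as follows. This is only a sketch, not the script I actually ran: the function names `new_name` and `rename_all` and the `DRY_RUN` variable are invented for this example, GNU sed is assumed for `-r`, and files whose names do not match the pattern, or whose target name already exists, are skipped.

```shell
# Sketch only: new_name, rename_all and DRY_RUN are illustrative names.

# Print the shortened name for one file name (unchanged if the pattern does not match).
new_name() {
  printf '%s\n' "$1" |
    sed -r 's/Sample1_(Sample2_)[0-9]*_(..-..-....)_.*-[A-Z_]*(_[0-9][0-9]*_)._Sample4_(.*)/\1\2\3\4/'
}

# Rename every matching file in directory $1 (default: current directory).
# With DRY_RUN=1 the renames are only reported, nothing is moved.
rename_all() {
  local dir=${1:-.} path old new
  for path in "$dir"/*; do
    [ -f "$path" ] || continue          # skip directories and an unmatched glob
    old=${path##*/}                     # strip the directory part
    new=$(new_name "$old")
    [ "$new" != "$old" ] || continue    # pattern did not match, leave the file alone
    if [ -e "$dir/$new" ]; then
      echo "exists, skipping: $dir/$new" >&2
    elif [ "${DRY_RUN:-0}" = 1 ]; then
      echo "$old -> $new"
    else
      mv -- "$path" "$dir/$new"
    fi
  done
}
```

Running `DRY_RUN=1 rename_all /path/to/files` first lists the planned renames; calling `rename_all /path/to/files` without it performs them.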

