Script for merging files in multiple nested directories by symbolic link


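Basically, I am looking for a script to automate something in Ubuntu (see the figure below). I am thinking of a bash script, but other solutions (e.g. Python) would be fine too.

1) Suppose I have several real directories, "Folder 1" and "Folder 2", that contain subfolders and files. Assume that the files in the corresponding subfolders of Folder 1 and Folder 2 have unique names. How do I create a new merged folder in which every file is a symbolic link to the original?

2) The script should also have an option to prune broken symbolic links in the merged folder.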


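I want to do this to improve how my files are organised. For example, "Folder 1" and "Folder 2" could hold data acquired at different points in time. I would then create Merged_Folder1, Merged_Folder2 and so on for different projects, without having to copy large files.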


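Edit: This question is different from this post because I want to merge corresponding nested subfolders that have the same name. The solution in that post only links the top-level directories under the source into the target and cannot merge nested subfolders. Note that in my case none of the folders are symbolic links; only the files are.

Edit 2: I should clarify that I want the code to merge nested subfolders at any depth, not just two levels. I have therefore added "File J" and "File I" to the example figure.

[Figure: example of how I want the folder merge to work]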

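Answer 1

I think the following shell script will do what you want

  • The original folders are in main
  • There can be several levels of subfolders
  • The merged folder is links
  • The main shell script, script, is run in the directory that contains main and links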

script

#!/bin/bash

mkdir -p links

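# Step 1: recreate the subfolder tree under links/, dropping main/ and the
# first folder level below it (main/f1/s1 becomes links/s1)
find main -type d -exec bash -c \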
'for pathname do
  #echo "------------------------------${pathname} ${pathname#*/*/}"
  if [ "${pathname/*\/*\/}" != "${pathname}" ]
  then
   mkdir -p "links/${pathname#*/*/}"
  fi
 done' bash {} +

# Step 2: link every file by absolute path; ln fails quietly if the link already exists
find main -type f -exec bash -c \
'curdir=$(pwd)
for pathname do
  tpat=${pathname/main\/}
  ln -s "${curdir}/${pathname}" "links/${tpat#*/}" 2> /dev/null;
 done' bash {} +

# Step 3: prune links that "file" reports as broken
find links -type l -exec bash -c \
'for pathname do
  LANG=C
  file "$pathname"|grep -o  "$pathname: broken symbolic link" > /dev/null; \
  if [ $? -eq 0 ];then rm "$pathname";fi
 done' \
 bash {} +
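
The path mapping is done by the two parameter expansions: ${pathname#*/*/} and ${tpat#*/} drop main/ and the folder name directly below it, so main/f1/s1/a ends up at links/s1/a, while a file directly under main (such as main/asdf) keeps its name. For readers less used to bash expansions, here is a minimal Python sketch of the same mapping. It is for illustration only; merged_relpath is a hypothetical helper and not part of the script.

```python
from pathlib import PurePosixPath


def merged_relpath(pathname: str) -> str:
    """Map a file path below main/ to its path below links/ (hypothetical helper)."""
    parts = PurePosixPath(pathname).parts[1:]  # drop the leading "main"
    if len(parts) > 1:
        parts = parts[1:]  # drop the source folder (f1, f2, ...)
    return str(PurePosixPath(*parts))


if __name__ == "__main__":
    for p in ("main/f1/s1/a", "main/f2/s3/ss/i", "main/f 4/s 4/k 4", "main/asdf"):
        print(p, "->", "links/" + merged_relpath(p))
```

Because the folder name is dropped, files from different source folders land in the same merged subfolder, which is why the question's assumption of unique file names matters.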

Demo

$ \rm -r links

$ find main
main
main/f 4
main/f 4/s 4
main/f 4/s 4/k 4
main/asdf
main/f2
main/f2/s3
main/f2/s3/h
main/f2/s3/g
main/f2/s3/ss
main/f2/s3/ss/i
main/f2/s1
main/f2/s1/c
main/f2/s1/d
main/j
main/f1
main/f1/s2
main/f1/s2/x y
main/f1/s2/f
main/f1/s2/e
main/f1/s1
main/f1/s1/a
main/f1/s1/b

$ ./script  # doing it

$ find links/ -type l -exec file {} \;
links/s2/x y: symbolic link to /media/multimed-2/test/test0/matohak/main/f1/s2/x y
links/s2/f: symbolic link to /media/multimed-2/test/test0/matohak/main/f1/s2/f
links/s2/e: symbolic link to /media/multimed-2/test/test0/matohak/main/f1/s2/e
links/s3/h: symbolic link to /media/multimed-2/test/test0/matohak/main/f2/s3/h
links/s3/g: symbolic link to /media/multimed-2/test/test0/matohak/main/f2/s3/g
links/s3/ss/i: symbolic link to /media/multimed-2/test/test0/matohak/main/f2/s3/ss/i
links/s 4/k 4: symbolic link to /media/multimed-2/test/test0/matohak/main/f 4/s 4/k 4
links/asdf: symbolic link to /media/multimed-2/test/test0/matohak/main/asdf
links/s1/a: symbolic link to /media/multimed-2/test/test0/matohak/main/f1/s1/a
links/s1/c: symbolic link to /media/multimed-2/test/test0/matohak/main/f2/s1/c
links/s1/b: symbolic link to /media/multimed-2/test/test0/matohak/main/f1/s1/b
links/s1/d: symbolic link to /media/multimed-2/test/test0/matohak/main/f2/s1/d
links/j: symbolic link to /media/multimed-2/test/test0/matohak/main/j

$ ln -s main/asdf links/asdf-b  # create a broken link

$ find links/ -type l -name "asdf*" -exec file {} \;
links/asdf-b: broken symbolic link to main/asdf
links/asdf: symbolic link to /media/multimed-2/test/test0/matohak/main/asdf

$ ./script  # this time only to remove the broken link

$ find links/ -type l -name "asdf*" -exec file {} \;
links/asdf: symbolic link to /media/multimed-2/test/test0/matohak/main/asdf
$
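
The last step of the demo relies on one property: a symbolic link is broken when the link itself exists but its target does not. For comparison, here is a hedged Python sketch of the same prune pass. prune_broken_links is a hypothetical name, and the check uses os.path.islink and os.path.exists instead of parsing the output of file as the script does.

```python
import os
import tempfile


def prune_broken_links(root):
    """Remove symlinks below root whose target is missing; return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk reports broken symlinks under filenames, never under dirnames
        for name in filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows it to the target
            if os.path.islink(path) and not os.path.exists(path):
                os.remove(path)
                removed.append(path)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "asdf"), "w").close()
        os.symlink(os.path.join(tmp, "asdf"), os.path.join(tmp, "good"))
        os.symlink(os.path.join(tmp, "missing"), os.path.join(tmp, "asdf-b"))
        print(prune_broken_links(tmp))  # only the asdf-b link is reported
```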

A version that lets you specify which folders under main/* to merge

#!/bin/bash
# First argument is the target, followed by any number of source folders,
# e.g. ./script.sh links main/f1 main/f2 main/f3
# Source paths are taken relative to the current directory.

target=$1
shift

mkdir -p "$target"

for src in "$@"; do
    src=${src%/}  # drop a trailing slash so the prefix can be stripped below

    # Recreate the subfolders of this source inside the target
    find "$src" -mindepth 1 -type d -exec bash -c \
    'target=$1; src=$2; shift 2
     for pathname do
       mkdir -p "$target/${pathname#"$src"/}"
     done' bash "$target" "$src" {} +

    # Link every file by absolute path; an existing link is left untouched
    find "$src" -type f -exec bash -c \
    'target=$1; src=$2; shift 2
     curdir=$(pwd)
     for pathname do
       ln -s "${curdir}/${pathname}" "$target/${pathname#"$src"/}" 2> /dev/null
     done' bash "$target" "$src" {} +

    # Remove links in the target that "file" reports as broken
    find "$target" -type l -exec bash -c \
    'for pathname do
      LANG=C
      file "$pathname"|grep -o  "$pathname: broken symbolic link" > /dev/null; \
      if [ $? -eq 0 ];then rm "$pathname";fi
     done' \
     bash {} +

    find "$target" -type d -empty -delete  # Removes empty dir in target

done
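
Both scripts send the errors from ln to /dev/null, so if two source folders contain a file at the same relative path, the first link wins and the later one is skipped without a message. The question assumes unique names; if that is not guaranteed, a check along these lines could be run first. This is only a sketch: find_collisions is a hypothetical helper, not something the scripts above provide.

```python
import os
import sys
from collections import defaultdict


def find_collisions(sources):
    """Return {relative_path: [sources...]} for files present in more than one source."""
    seen = defaultdict(list)
    for source in sources:
        for dirpath, _dirnames, filenames in os.walk(source):
            for name in filenames:
                # Path of the file relative to its own source folder
                rel = os.path.relpath(os.path.join(dirpath, name), source)
                seen[rel].append(source)
    return {rel: srcs for rel, srcs in seen.items() if len(srcs) > 1}


if __name__ == "__main__":
    # Example call (file name is arbitrary): python check.py main/f1 main/f2
    for rel, srcs in sorted(find_collisions(sys.argv[1:]).items()):
        print(rel, "appears in", ", ".join(srcs))
```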

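Answer 2

Got it... This Python code should be able to walk through any number of nested directories and create symlinks, merged into the target directory, for all of the files. The arguments are the target directory followed by the source directories. The source directories should be given relative to the target directory. For example: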

python script.py ./merged_folder ../folder1 ../folder2 ../folder3

'''
Loops through merge_symlink.
See https://askubuntu.com/questions/1097502/script-for-merging-files-in-multiple-nested-directories-by-symbolic-link/
KH Tam Nov 2018 (matohak)

Use: python merge_symlink.py ./target ../folder1 ../folder2 ../folder3
Note that if overwrite==True and there are duplicated filenames, existing links are overwritten, so the last source argument wins
'''
import os
import sys
import time

def merge_symlink(sources, overwrite=True, remove_empty_dir=True, verbose=False):
    '''
    See https://askubuntu.com/questions/1097502/script-for-merging-files-in-multiple-nested-directories-by-symbolic-link/
    Function to be run in the target directory.

    :param sources: a list of directories where the files in the subdirectories are to be merged symbolically. Path relative to the target. eg. ["../folder1", "../folder2"]
    :param overwrite: Bool, whether to overwrite existing symbolic links
    :param remove_empty_dir: Bool, whether to remove empty directories in target.
    :param verbose: Prints stuff.
    :return: None
    '''

    # Creating symlinks and folders
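    for source in sources:
        # Strip a trailing slash before walking; otherwise the depth computed
        # below is off by one for files at the top level of the source
        source = source.rstrip("/")
        for dirName, subdirList, fileList in os.walk(source):
            # print(dirName, fileList)  # print all source dir and files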

            target_dir = dirName.replace(source, '.', 1)
            depth = dirName.count("/") - source.count("/")

            try:
                os.mkdir(os.path.join(target_dir))
            except FileExistsError:
                pass

            for file in fileList:
                targetlink = os.path.join(target_dir, file)
                try:
                    os.symlink(os.path.join("../"*depth + dirName, file), targetlink)
                except FileExistsError:
                    if overwrite and not (isvalidlink(targetlink)==2):  # Never replace a real file with a symlink!
                        os.remove(targetlink)
                        os.symlink(os.path.join("../" * depth + dirName, file), targetlink)
                        if verbose: print('overwriting {}'.format(targetlink))

    # Pruning broken links and then deleting empty folders.
    for dirName, subdirList, fileList in os.walk("./"):
        for file in fileList:
            link = os.path.join(dirName,file)
            if isvalidlink(link)==0:
                os.remove(link)
                if verbose: print("Removing broken symlink: {}".format(link))

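    if remove_empty_dir:
        # Walk bottom-up so a folder that becomes empty once its children are removed is removed too
        for dirName, subdirList, fileList in os.walk("./", topdown=False):
            if dirName != "./" and not os.listdir(dirName):
                os.rmdir(dirName)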

# Checks if file is a broken link. 0: broken link; 1: valid link; 2: not a link
def isvalidlink(path):
    if not os.path.islink(path):
        return 2
    try:
        os.stat(path)
    except os.error:
        return 0
    return 1

if __name__ == "__main__":

    target = sys.argv[1]
    sources = sys.argv[2:]      # Inputs should be relative to the target dir.
    overwrite = False
    loop = False
    looptime = 10

    os.chdir(target)
    if not loop:
        merge_symlink(sources, overwrite=overwrite)
    else:
        while loop:
            merge_symlink(sources, overwrite=overwrite)
            time.sleep(looptime)
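
The link text in merge_symlink is built as "../" * depth + dirName. The source path is given relative to the target root, so a link created depth levels below that root needs the same number of extra ../ to climb back up. This is also why the sources must be given relative to the target. A small sketch, assuming POSIX-style paths, that rebuilds the link text and resolves it lexically from the folder holding the link (link_text and resolve_from are hypothetical helpers written for this illustration):

```python
import posixpath


def link_text(source, dir_name, file_name):
    """Rebuild the link text the same way merge_symlink does."""
    depth = dir_name.count("/") - source.count("/")
    return posixpath.join("../" * depth + dir_name, file_name)


def resolve_from(link_dir, text):
    """Resolve link text lexically, starting from the directory that holds the link."""
    return posixpath.normpath(posixpath.join(link_dir, text))


if __name__ == "__main__":
    # A file two levels down in ../folder2 is linked from ./s3/ss in the target
    text = link_text("../folder2", "../folder2/s3/ss", "i")
    print(text)
    print(resolve_from("s3/ss", text))
```

Resolving the text from s3/ss gives back ../folder2/s3/ss/i, which is the source file's path as seen from the target root.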

Thanks to @JacobVlijm for the link and to @sudodus for the help!
