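This question stems from a question I asked on Stack Exchange.
I mount a Windows share with
mount -t cifs -o username=username,password=password,rw,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 //192.168.1.120/storage /mnt/storage
and then run a script against the mount point on a Debian machine.
The mount point /mnt/storage will contain a rapidly growing number of files, which are moved in batches into subdirectories and processed there.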
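My problem is that moving is relatively slow, and I fear this is because I cannot simply change the file table (that is, only update the information about where the file lives on the disk).
I am currently using shutil.move(src, dst) in Python, but I have also considered os.rename(src, dst) or calling mv through subprocess.
Is my fear justified? If so, is there a more efficient way to mount the share?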
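Edit: I just went through the documentation of shutil.move() again and read the following:
If the destination is on the current filesystem, then os.rename() is used.
Otherwise, src is copied (using shutil.copy2()) to dst and then removed.
That sounds like it could be a problem if I am not "on the current filesystem". How do I find out whether I am "on the current filesystem"?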
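One way to approach the "current filesystem" question is to compare the device IDs that os.stat() reports for the source and destination directories. The sketch below is only a heuristic: the helper name same_filesystem is made up for illustration, and the assumption that every directory under a single CIFS mount reports the same st_dev should be verified on the actual setup. In CPython, shutil.move() does not perform this check itself; it attempts os.rename() and falls back to copy-and-delete if that raises OSError.

```python
import os
import tempfile


def same_filesystem(src_dir, dst_dir):
    """Return True if both existing directories report the same device ID.

    Hypothetical helper: a matching st_dev is a strong hint that os.rename()
    can move files between the two directories without copying data.
    """
    return os.stat(src_dir).st_dev == os.stat(dst_dir).st_dev


if __name__ == "__main__":
    # Demo on a throwaway directory. On the real setup one would compare,
    # for example, /mnt/storage/source and /mnt/storage/mv instead.
    with tempfile.TemporaryDirectory() as root:
        a = os.path.join(root, "a")
        b = os.path.join(root, "b")
        os.mkdir(a)
        os.mkdir(b)
        print(same_filesystem(a, b))
```

If the two directories report different device IDs, a move between them has to copy the data, which over a network share means every byte travels to the client and back.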
Edit 2: In case anyone is interested in what I found, I am also copying and pasting my edit from Stack Overflow.
I wrote a script to test the speed difference between the various ways of moving files. The test data is created first: 1x5GB with
dd if=/dev/urandom of=/mnt/storage/source/test.file bs=100M count=50
then 100x5MB with
for i in {1..100}; do dd if=/dev/urandom of=/mnt/storage/source/file$i bs=1M count=5; done
and finally 10000x5kB with
for i in {1..10000}; do dd if=/dev/urandom of=/mnt/storage/source/file$i bs=1k count=5; done
from datetime import datetime
from os import rename
from shutil import move, rmtree  # rmtree is needed for the cleanup near the end
import os
import subprocess

# 1) One mv process per file.
print("Subprocess mv: for every file in directory..")
s = datetime.now()
for f in os.listdir("/mnt/storage/source/"):
    try:
        # shell=True expects a single command string; file names containing
        # spaces or shell metacharacters would break this call.
        subprocess.call("mv /mnt/storage/source/" + str(f) + " /mnt/storage/mv", shell=True)
    except Exception as exc:
        print(str(exc))
e = datetime.now()
print("took {}".format(e - s) + "\n")

# 2) A single mv process; the shell expands the glob.
print("Subprocessmv : directory/*..")
s = datetime.now()
try:
    subprocess.call("mv /mnt/storage/mv/* /mnt/storage/mvf", shell=True)
except Exception as exc:
    print(str(exc))
e = datetime.now()
print("took {}".format(e - s) + "\n")

# 3) shutil.move per file.
print("shutil.move: for every file file in directory..")
s = datetime.now()
for f in os.listdir("/mnt/storage/mvf/"):
    try:
        move("/mnt/storage/mvf/" + str(f), "/mnt/storage/move")
    except Exception as exc:
        print(str(exc))
e = datetime.now()
print("took {}".format(e - s) + "\n")

# 4) os.rename per file.
print("os.rename: for every file in directory..")
s = datetime.now()
for f in os.listdir("/mnt/storage/move/"):
    try:
        rename("/mnt/storage/move/" + str(f), "/mnt/storage/rename/" + str(f))
    except Exception as exc:
        print(str(exc))
e = datetime.now()
print("took {}".format(e - s) + "\n")

# 5) Rename the whole directory, then recreate an empty one in its place.
# Remove the result of a previous run so that the rename below can succeed.
if os.path.isdir("/mnt/storage/rename_new"):
    rmtree("/mnt/storage/rename_new")
print("os.rename & os.mkdir: rename source dir to destination & make new source dir..")
s = datetime.now()
rename("/mnt/storage/rename/", "/mnt/storage/rename_new")
os.mkdir("/mnt/storage/rename/")
e = datetime.now()
print("took {}".format(e - s) + "\n")
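This shows that there is not much difference between the methods. The 5GB file moved very quickly, which tells me that moving by changing the file table works. Here are the results for the 10000*5kB files (my impression is that the results depend on the current network load: for example, the first mv test took 2m 28s and a later run with the same files took 3m 22s; also, os.rename() was the fastest method most of the time):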
Subprocess mv: for every file in directory..
took 0:02:47.665174
Subprocessmv : directory/*..
took 0:01:40.087872
shutil.move: for every file file in directory..
took 0:01:48.454184
os.rename: for every file in directory..
rename took 0:02:05.597933
os.rename & os.mkdir: rename source dir to destination & make new source dir..
took 0:00:00.005704
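The last measurement above (about 5 ms for renaming the whole directory and recreating it) suggests handing over a batch by renaming the source directory instead of moving files one by one. The following is a minimal sketch of that idea, not a tested solution for the CIFS share: the function name rotate_batch and the batch directory name are made up for illustration, and the demo runs against a temporary local directory. It assumes the batch directory does not exist yet and that whatever writes new files can tolerate the source directory being missing for the short moment between the two calls.

```python
import os
import tempfile


def rotate_batch(source_dir, batch_dir):
    """Rename source_dir to batch_dir, then recreate an empty source_dir.

    Hypothetical helper. batch_dir must not exist yet. Between the two calls
    source_dir does not exist, so writers have to cope with that gap.
    """
    os.rename(source_dir, batch_dir)
    os.mkdir(source_dir)
    return batch_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        source = os.path.join(root, "source")
        os.mkdir(source)
        # Create a few small files, similar in size to the 5kB test files.
        for i in range(3):
            with open(os.path.join(source, "file{}".format(i)), "wb") as fh:
                fh.write(os.urandom(5 * 1024))
        batch = rotate_batch(source, os.path.join(root, "batch_0001"))
        print(sorted(os.listdir(batch)))
        print(os.listdir(source))
```

The cost of this approach is a single rename plus a single mkdir regardless of how many files the batch contains, which matches the pattern in the timings above where the per-file methods all took minutes.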