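In short, my problem is that when I run os.walk() I get an accurate list of files, but when I try to get information about those files (such as their last modified date or file size, or even just try to open() them), I get an error saying the file cannot be found. This only happens for some files, roughly 0.2% of them, and I can't tell why.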
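Background
At work we have a server running Windows Server 2012 R2 (I know, I know...). We want to automate moving targeted shared folders to specific shared drives in Google Drive.
The first thing I'm trying to do is get a list of the files along with their last modified dates and file sizes, for use later on. The code I wrote works fine on my laptop running Windows 11, but when I point it at several different shared folders on the server, it runs into the same problem every time.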
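Troubleshooting
I don't think this is a code problem. I have reworked my code several times to make it simpler, but the end result is the same: it works locally but can't fully traverse the shared folders.
My first thought was that the path names might be too long (the 255-character limit on older systems), but it successfully finds files whose path length is > 300 characters.
My next thought was that there might be an obvious pattern in which kinds of files can't be found, but within a given folder it will successfully find most of the PDFs and fail on one or a few others. That is just one observed example and is not specific to PDFs.
I've probably spent a total of 6-8 hours troubleshooting and researching this, and at this point I'm still stumped.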
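One way to look for a pattern more systematically is to summarize the failing paths by extension and by length. The following is only a rough sketch: summarize_failures is a hypothetical helper, it assumes the failing paths have already been collected in a list (like files_not_found in the script), and the default of 260 is the classic Windows MAX_PATH value used here as an assumption.

```python
import os
from collections import Counter


def summarize_failures(paths, max_path=260):
    """Group failing paths by extension and by length (hypothetical helper)."""
    # Count failures per lower-cased extension, e.g. ".pdf" or ".dotx".
    by_ext = Counter(os.path.splitext(p)[1].lower() for p in paths)
    # Count how many failing paths are at or beyond the assumed length limit.
    over_limit = sum(1 for p in paths if len(p) >= max_path)
    return {
        "by-extension": dict(by_ext),
        "at-or-over-max-path": over_limit,
        "under-max-path": len(paths) - over_limit,
    }
```

If the failures are spread evenly across extensions and lengths, that would support the observation that neither file type nor path length alone explains them.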
Code
do_test.py - uses the hurry.filesize package to get approximate file sizes
import os
import datetime
from hurry.filesize import size
from pprint import pprint
# Test directory
src = "//[DC]/PATH/TO/FOLDER"

def simple_file_check(src_dir):
    total_bytes = 0
    total_files = 0
    total_folders = 0
    total_not_found = 0
    files_not_found = []

    for (root, dirs, files) in os.walk(src_dir):
        # just count files and folders for now
        total_files += len(files)
        total_folders += len(dirs)

        # Get full-path file names
        fnames = [os.path.join(root, f).replace("\\", "/") for f in files]

        # Get their sizes and sum it up
        fsizes = []
        for f in fnames:
            try:
                fsizes.append(os.stat(f).st_size)
            except OSError:
                # FileNotFoundError is a subclass of OSError
                files_not_found.append(f)

        total_bytes += sum(fsizes)

    total_size = size(total_bytes)
    total_not_found += len(files_not_found)
    # Share of listed files that could not be stat'ed, as a percentage
    pct_missing = total_not_found / total_files * 100 if total_files else 0.0

    data = {
        "ttl-size": total_size,
        "ttl-files": total_files,
        "ttl-folders": total_folders,
        "ttl-not-found": total_not_found,
        "pct-missing": "{}%".format(pct_missing)
    }
    pprint(data)

def time_it_pls(func, *arg):
    begin_dt = datetime.datetime.now()
    begin = str(begin_dt)[:19]
    print("beginning execution at: {}".format(begin))

    func(*arg)

    end_dt = datetime.datetime.now()
    end = str(end_dt)[:19]
    print("ending execution at: {}".format(end))
    print("time taken: {}".format(end_dt - begin_dt))


time_it_pls(simple_file_check, src)
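
A possible variant of the walk, sketched here as an assumption rather than a known fix: according to the Python documentation, on Windows os.DirEntry.stat() can usually answer from the data returned by the directory listing itself, without looking each file up again by path. Whether that avoids the failures on this share is untested. scan_sizes is a hypothetical name, and the sketch also keeps each exception so the error code is not thrown away.

```python
import os


def scan_sizes(top):
    """Walk `top` with os.scandir and return (total_bytes, failures).

    failures is a list of (path, exception) pairs.
    """
    total_bytes = 0
    failures = []
    stack = [top]  # directories still to be visited
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        else:
                            # Uses the stat data cached on the DirEntry where available
                            total_bytes += entry.stat(follow_symlinks=False).st_size
                    except OSError as exc:
                        failures.append((entry.path, exc))
        except OSError as exc:
            # The directory itself could not be listed
            failures.append((current, exc))
    return total_bytes, failures
```

Calling scan_sizes(src) and printing exc.winerror for each failure (the attribute only exists on Windows) would show whether every failing file reports the same error code.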
Results
beginning execution at: 2023-06-21 14:50:06
{'pct-missing': '0.19806269922322284%',
'ttl-files': 193878,
'ttl-folders': 18150,
'ttl-not-found': 384,
'ttl-size': '210G'}
ending execution at: 2023-06-21 14:51:11
time taken: 0:01:05.302772
Specific error message without the try/except block
Traceback (most recent call last):
  File "C:\it_scripts\do_test.py", line 53, in <module>
    time_it_pls(simple_file_check, src)
  File "C:\it_scripts\do_test.py", line 47, in time_it_pls
    func(*arg)
  File "C:\it_scripts\do_test.py", line 25, in simple_file_check
    fsizes.append(os.stat(f).st_size)
                  ^^^^^^^^^^
FileNotFoundError: [WinError 3] The system cannot find the path specified: '//DC/PATH/TO/FILE'
- Edit -
I get a similar error when I try to use open() on just a single file in the interpreter.
>>> f = "//DC/PATH/TO/FILE" # actual path length is 267 characters long and copied from the exception in the previous example.
>>> d = open(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '//DC/PATH/TO/FILE'
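
The path above is 267 characters, which is past the classic Windows MAX_PATH of 260. Longer paths did work elsewhere in this share, so length may well not be the cause, but one thing that could be tried is the extended-length prefix (\\?\ for drive paths, \\?\UNC\ for network shares), which asks the Windows API to pass the path through without its usual length check and normalization. The helper below is a hypothetical sketch of that conversion; it assumes the input is an absolute path and only does string manipulation.

```python
def to_extended_length(path):
    r"""Return an absolute Windows path in extended-length form.

    //server/share/x  ->  \\?\UNC\server\share\x
    C:/folder/x       ->  \\?\C:\folder\x
    """
    # Extended-length paths must use backslashes only.
    normalized = path.replace("/", "\\")
    if normalized.startswith("\\\\?\\"):
        return normalized  # already in extended-length form
    if normalized.startswith("\\\\"):
        # UNC path: drop the leading two backslashes and add the UNC prefix
        return "\\\\?\\UNC\\" + normalized[2:]
    return "\\\\?\\" + normalized
```

For example, open(to_extended_length(f)) could be tried in the interpreter on one of the failing files to see whether the error changes.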
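--Edit 2--
We're getting closer! Listing the folder in PowerShell, I can see that the files exist, but if I try to run ls against a single file, I get an error. So this is not specific to Python, and it suggests something strange is going on on the Windows side.
Here is a redacted version of the PS output and error. Please understand that, given the sensitivity of these files, some amount of redaction really is necessary.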
PS C:\Users\sani> ls "\\DC\#CONSOLIDATION of Checklists, Declarations, Examples, and Documents for [REDACTED], U, and I751 Applications\Application- U Visa\Closing Letters\
Rejections\"
Directory: \\DC\#CONSOLIDATION of Checklists, Declarations, Examples, and Documents for [REDACTED], U, and I751 Applications\Application- U Visa\Closing
Letters\Rejections
Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        5/31/2019   3:06 PM          13025 Samole closing Letter - No DV Simple assault.docx
-a----       11/21/2018   3:10 PM          16232 Sample Closing Letter-Not a qualifying crime (Sp).dotx
-a----        7/26/2018  11:32 AM          13581 Sample Closing Letter-RE PC does not qualify a indirect victim.dotx
-a----       11/21/2018   3:14 PM          12908 Sample Closing Letter-RE U Cert Request Denied.dotx
-a----         7/9/2018   7:25 PM          13500 Sample Closing Letter-Unqualifying crime.dotx
-a----        7/26/2018   6:19 PM          12769 Sample Closing Ltr w Copy of File (Sp), Over Income.dotx
-a----        7/26/2018   1:24 PM          16432 Sample Rejection Letter, unqualifying crime.dotx
PS C:\Users\sani> ls "\\DC\#CONSOLIDATION of Checklists, Declarations, Examples, and Documents for [REDACTED], U, and I751 Applications\Application- U Visa\Closing Letters\
Rejections\Sample Closing Letter-RE PC does not qualify a indirect victim.dotx"
ls : Cannot find path '\\DC\#CONSOLIDATION of Checklists, Declarations, Examples, and Documents for [REDACTED], U, and I751 Applications\Application- U Visa\Closing
Letters\Rejections\Sample Closing Letter-RE PC does not qualify a indirect victim.dotx' because it does not exist.
At line:1 char:1
+ ls "\\DC\#CONSOLIDATION ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (\\DC\...ect victim.dotx:String) [Get-ChildItem], ItemNotFoundException
+ FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand
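
Since the file shows up in the listing but not when addressed by its full path, one hypothesis (not confirmed by anything in the output above) is that some file or folder name along the path ends in a space or a period, which Microsoft's naming guidance warns against because Win32 path handling can strip those characters. The sketch below, with the hypothetical name suspicious_components, flags such names in a path string so the failing paths can be checked in bulk.

```python
import re


def suspicious_components(path):
    """Return the path components that end in a space or a period."""
    # Split on either kind of separator and drop empty pieces.
    parts = [p for p in re.split(r"[\\/]+", path) if p]
    return [
        p for p in parts
        if p not in (".", "..") and p != p.rstrip(" .")
    ]
```

Running this over every entry in files_not_found and finding that all of them return a non-empty list would support the hypothesis; empty results would rule it out.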