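I have a large directory of 1000+ files from a machine learning dataset, and the files belong to different categories (for simplicity, say pictures of roses and pictures of daisies). I also have a CSV file that lists the file name of every item in the dataset together with its classification (rose or daisy). How can I read this CSV file and tell my file manager to move all the rose photos into one directory and all the daisy photos into another? Do I need a Bash script, or is this something already built into Nautilus?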
Answer 1
Well, a friend and I managed to write a Python script that solves this problem nicely.
# Import csv
import csv
# Import os
import os

# Main Function
def main():
    # Open dataset file and read every row through a csv reader
    with open('dataset.csv', newline='') as dataset:
        data = list(csv.reader(dataset))
    # Total number of rows, used by the progress counter
    lines = len(data)
    # Make sure the target directories exist before moving anything
    os.makedirs('nosymptoms', exist_ok=True)
    os.makedirs('symptoms', exist_ok=True)
    # Analyze data in dataset
    for i, row in enumerate(data, start=1):
        # Assign image name and state to variables
        image = row[0] + '.jpeg'
        state = row[1]
        # Print image information
        print('({}/{}) Processing image ({}): {}'.format(i, lines, state, image))
        # Determine action to perform (compare strings with ==, not `is`)
        if state == '0':
            # Attempt to move the file
            try:
                # Move the file to nosymptoms/
                os.rename(image, 'nosymptoms/' + image)
                # Inform the user of action being taken
                print(' -> Moved to nosymptoms/')
            except FileNotFoundError:
                # Inform the user of the failure
                print(' -> Failed to find file')
        elif state in ['1', '2', '3', '4']:
            # Attempt to move the file
            try:
                # Move the file to symptoms/
                os.rename(image, 'symptoms/' + image)
                # Inform the user of action being taken
                print(' -> Moved to symptoms/')
            except FileNotFoundError:
                # Inform the user of the failure
                print(' -> Failed to find file')

# Execute main function if name is equal to main
if __name__ == '__main__':
    main()
This approach works better for me because I have more categories to handle. I hope it is useful to anyone who runs into the same problem.
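Since the number of categories can grow, one possible generalization is to drive the moves from a mapping of labels to directory names instead of one if/elif branch per group. The following is only a sketch: the function name `sort_files`, its parameters, and the returned lists are made up for illustration, and it assumes the same CSV layout as above (file name without extension in the first column, label in the second).

```python
import csv
import shutil
from pathlib import Path


def sort_files(csv_path, source_dir, label_to_dir, extension=".jpeg"):
    """Move files listed in a CSV into per-category directories.

    Hypothetical helper: returns (moved, missing), two lists of file names.
    """
    source_dir = Path(source_dir)
    moved, missing = [], []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            name, label = row[0] + extension, row[1]
            target = label_to_dir.get(label)
            if target is None:
                continue  # label is not mapped to any directory
            source = source_dir / name
            if not source.is_file():
                missing.append(name)  # listed in the CSV but not on disk
                continue
            destination = source_dir / target
            destination.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(destination / name))
            moved.append(name)
    return moved, missing
```

With the labels from the script above, a call could look like `sort_files('dataset.csv', '.', {'0': 'nosymptoms', '1': 'symptoms', '2': 'symptoms', '3': 'symptoms', '4': 'symptoms'})`. Adding a category then only means adding an entry to the mapping.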
Answer 2
Here is a bash script that does what you want:
#!/bin/bash
fileNameIndex=0 # set to index of file name
categoryIndex=1 # set to index of category
IFS=",$IFS" # add comma so that lines are split at commas
while read -r -a tokens; do # read a line and split it into tokens at commas
    file=${tokens[$fileNameIndex]} # get the file name
    category=${tokens[$categoryIndex]} # get the category
    if [ ! -d "$category" ]; then # check whether a directory with the category name exists
        mkdir "$category" # make the category directory
    fi
    mv "$file" "$category" # move the file into the category directory
done
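Save this script in a file, say do_moves.sh, edit it to set the correct values for fileNameIndex and categoryIndex, make it executable, and then run it like this:
./do_moves.sh < data.csv
where data.csv is your CSV file. Before running this command, make sure you do not have any files with the same name as a category.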
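The name check matters because, if a regular file named like a category already exists, mkdir fails and mv then overwrites that file instead of moving into a directory. A small pre-flight check can be sketched in Python as follows; the function name `find_conflicts` and its parameters are hypothetical, and the category column is assumed to be at index 1 as in the script.

```python
import csv
from pathlib import Path


def find_conflicts(csv_path, directory=".", category_index=1):
    """Return category names that already exist as regular files.

    Hypothetical helper: an empty list means the moves are safe to run.
    """
    directory = Path(directory)
    with open(csv_path, newline="") as handle:
        # Collect the distinct category names found in the CSV
        categories = {
            row[category_index]
            for row in csv.reader(handle)
            if len(row) > category_index
        }
    # Existing directories are fine; only regular files are a problem
    return sorted(c for c in categories if (directory / c).is_file())
```

Running `find_conflicts('data.csv')` in the image directory before the script, and renaming anything it reports, avoids losing a file.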