Question: I have a folder named backup that contains many files.
3tpt_20190810_061011816.csv
3tpt_20190811_060912007.csv
3tpt_20190812_060910510.csv
3tpt_20190813_060911075.csv
3tpt_20190814_060911689.csv
3tpt_20190815_060911418.csv
3tpt_20190816_060911416.csv
3tpt_20190817_060911483.csv
3tpt_20190818_060911370.csv
3tpt_20190819_060911134.csv
3tpt_20190820_060911941.csv
3tpt_20190821_060911321.csv
3tpt_20190822_060911232.csv
3tpt_20190823_060911576.csv
3tpt_20190824_060911873.csv
3tpt_20190825_060911855.csv
3tpt_20190826_060911245.csv
3tpt_20190827_060911792.csv
3tpt_20190828_060912327.csv
3tpt_20190829_060911410.csv
3tpt_20190830_060912001.csv
3tpt_20190831_060911118.csv
3tpt_20190901_060911358.csv
3tpt_20190902_060911748.csv
3tpt_20190903_060911777.csv
3tpt_20190904_060911087.csv
3tpt_20190905_060913841.csv
Alco_Price_20190810_080007170.csv
Alco_Price_20190811_080006328.csv
Alco_Price_20190812_080006557.csv
Alco_Price_20190813_080007255.csv
Alco_Price_20190814_080006951.csv
Alco_Price_20190815_080006878.csv
Alco_Price_20190816_080006952.csv
Alco_Price_20190817_080007152.csv
Alco_Price_20190818_080006854.csv
Alco_Price_20190819_080006893.csv
Alco_Price_20190820_080006030.csv
Alco_Price_20190821_080006614.csv
Alco_Price_20190822_080008754.csv
Alco_Price_20190823_080006543.csv
Alco_Price_20190824_080007035.csv
Alco_Price_20190825_080006745.csv
Alco_Price_20190826_080006466.csv
Alco_Price_20190827_080006716.csv
Alco_Price_20190828_080006693.csv
Alco_Price_20190829_080005834.csv
Alco_Price_20190830_080006772.csv
Alco_Price_20190831_080007234.csv
Alco_Price_20190901_080008703.csv
Alco_Price_20190902_080006663.csv
Alco_Price_20190903_080007009.csv
Alco_Price_20190904_080006726.csv
Alco_Price_20190905_080006404.csv
art_info_20190813_200910412.csv
art_info_20190821_150910982.csv
art_info_20190904_200911005.csv
Brand_list_20190625_0510.csv
CIP1_20190812_111004332.csv
CIP1_20190813_105004414.csv
CIP1_20190814_110004061.csv
CIP1_20190814_111004562.csv
CIP1_20190814_112004539.csv
CIP1_20190814_135004807.csv
CIP1_20190815_094004023.csv
CIP1_20190903_122003816.csv
CIP 16_20190903_092004327.csv
CIP_20190812_105004438.csv
CIP_20190812_111004314.csv
CIP_20190812_113003892.csv
CIP_20190814_110004031.csv
CIP_20190815_101004603.csv
CIP_20190816_094004230.csv
CIP_20190816_153003821.csv
CIP_20190829_102004209.csv
CIP2_20190812_111004347.csv
CIP2_20190814_112004560.csv
CIP3_20190812_111004375.csv
CIP4_20190812_111004390.csv
CIP5_20190812_113003911.csv
CIP bel_20190829_171004879.csv
CIP dom_20190830_102004468.csv
CIP DT_20190904_105005222.csv
CIP k1_20190904_120004083.csv
CIP k_20190904_115003878.csv
CIP k_20190904_121004631.csv
CIP k2_20190904_121004653.csv
CIP k3_20190904_121004671.csv
CIP k4_20190904_121004687.csv
CIP k8_20190904_121004706.csv
CIP k9_20190904_121004721.csv
CIP kolesn_20190830_101004809.csv
CIP kost10_20190829_112005105.csv
CIP kost11_20190829_113004742.csv
CIP kost1_20190829_111003993.csv
CIP kost1_20190829_112005077.csv
CIP kost1_20190829_163005460.csv
CIP kost1_20190902_090004365.csv
CIP kost_20190829_102003983.csv
CIP kost_20190829_111003972.csv
CIP kost_20190829_121004075.csv
CIP kost_20190829_163005445.csv
CIP kost_20190829_165004177.csv
CIP kost_20190830_084003859.csv
CIP kost_20190830_100005953.csv
CIP kost_20190902_090004314.csv
CIP kost_20190903_175004956.csv
CIP kost2_20190829_112005124.csv
CIP kost2_20190829_163005470.csv
CIP kost 2_20190829_165004151.csv
CIP kost2_20190902_091004252.csv
CIP kost2_20190902_092004226.csv
CIP kost2_20190902_094005360.csv
CIP kost3_20190829_112005134.csv
CIP kost3_20190829_163005481.csv
CIP kost3_20190902_092004264.csv
CIP kost3_20190902_094005406.csv
CIP kost4_20190829_112005151.csv
CIP kost4_20190829_164004669.csv
CIP kost4_20190902_092004300.csv
CIP kost5_20190829_112005167.csv
CIP kost5_20190829_164004687.csv
CIP kost5_20190902_092004337.csv
CIP kost6_20190829_112005177.csv
CIP kost6_20190829_164004710.csv
CIP kost7_20190829_112005195.csv
CIP kost7_20190829_164004724.csv
CIP kost8_20190829_112005205.csv
CIP kost8_20190829_164004742.csv
CIP kost9_20190829_112005227.csv
CIP kost9_20190829_164004753.csv
CIP kost9_20190829_165004198.csv
CIP kostina10_20190903_173004267.csv
CIP kostina11_20190903_174004830.csv
CIP kostina1_20190902_171004547.csv
CIP kostina1_20190902_172004172.csv
CIP kostina1_20190902_173004618.csv
CIP kostina1_20190903_170004630.csv
CIP kostina1_20190903_173004253.csv
CIP kostina12_20190903_174004839.csv
CIP kostina12_20190903_181004437.csv
CIP kostina12_20190903_183004958.csv
CIP kostina13_20190903_174004847.csv
CIP kostina_20190902_170004568.csv
CIP kostina_20190902_171004525.csv
CIP kostina_20190903_170004605.csv
CIP kostina2_20190902_172004192.csv
CIP kostina2_20190903_173004277.csv
CIP kostina3_20190902_172004206.csv
CIP kostina3_20190903_173004286.csv
CIP kostina4_20190902_172004219.csv
CIP kostina4_20190903_173004296.csv
CIP kostina5_20190903_173004305.csv
CIP kostina6_20190902_172004237.csv
CIP kostina6_20190903_173004314.csv
CIP kostina7_20190902_172004250.csv
CIP kostina7_20190903_173004332.csv
CIP kostina8_20190903_173004350.csv
CIP kostina9_20190903_173004368.csv
CIP kostina wine1_20190903_174004713.csv
CIP kostina wine_20190903_170004583.csv
CIP kostina wine_20190903_174004681.csv
CIP kostina wine2_20190903_174004736.csv
CIP kostina wine3_20190903_174004750.csv
CIP kostina wine4_20190903_174004775.csv
CIP kostina wine5_20190903_174004795.csv
CIP kostina wine6_20190903_174004819.csv
CIP koval1_20190902_125004761.csv
CIP koval_20190902_123005017.csv
CIP koval_20190902_125004704.csv
CIP koval meat1_20190902_134004159.csv
CIP koval meat_20190902_130004218.csv
CIP koval meat_20190902_135004850.csv
CIP koval meat2_20190902_134004219.csv
CIP koval meat3_20190902_135004905.csv
CIP koval meat5_20190902_135004941.csv
CIP koval meat6_20190902_135004976.csv
CIP kozak 1_20190902_111004709.csv
CIP kozak 1_20190902_113004182.csv
CIP kozak_20190902_111004728.csv
CIP kozak_20190902_112003960.csv
CIP kozak2_20190902_113004232.csv
CIP kozak 3_20190902_113004203.csv
CIP kozak 4_20190902_113004223.csv
CIP kozak5_20190902_113004243.csv
CIP kozak6_20190902_113004252.csv
CIP meat_20190902_134004277.csv
CIP_new for XML_20190816_153003846.csv
CIPtaras1_20190903_142004791.csv
CIPtaras_20190903_133005037.csv
CIPtaras_20190903_142004747.csv
CIPtaras2_20190903_142004821.csv
CIPtaras3_20190903_143004418.csv
CIPtest_20190814_110004081.csv
CIP_test_20190904_152648516.csv
CIP_test_20190904_153613279.csv
merch_20190821_190906414.csv
Price_20190810_061241534.csv
Price_20190811_061131054.csv
Price_20190812_061136441.csv
Price_20190813_061143313.csv
Price_20190814_061140316.csv
Price_20190815_061147084.csv
Price_20190816_061155167.csv
Price_20190817_061144303.csv
Price_20190818_061140965.csv
Price_20190819_061138157.csv
Price_20190820_061142357.csv
Price_20190821_061142722.csv
Price_20190822_061152858.csv
Price_20190823_061151866.csv
Price_20190824_061202921.csv
Price_20190825_061148768.csv
Price_20190826_061146044.csv
Price_20190827_061146548.csv
Price_20190828_061202533.csv
Price_20190829_061137793.csv
Price_20190830_061149138.csv
Price_20190831_061215519.csv
Price_20190901_061142399.csv
Price_20190902_061145861.csv
Price_20190903_061146422.csv
Price_20190904_061155080.csv
Price_20190905_061141271.csv
Push_offers20190718_0510.csv
Push_offers20190802_1220.csv
Sample_lang_20190703_1230.csv
sample_lang_20190813_201731275.csv
sample_lang_20190904_200939502.csv
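In this folder I need to keep only the last 3 files of each group (newest by the date at the end of the file name), where a group is the set of files that share the same first 3 characters. All other files need to be moved to another folder.
Example: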
backup/3tpt_20190903_060911777.csv
backup/3tpt_20190904_060911087.csv
backup/3tpt_20190905_060913841.csv
backup/Alco_Price_20190903_080007009.csv
backup/Alco_Price_20190904_080006726.csv
backup/Alco_Price_20190905_080006404.csv
backup/art_info_20190813_200910412.csv
backup/art_info_20190821_150910982.csv
backup/art_info_20190904_200911005.csv
... etc.
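I tried the command below, but there is a problem with the files that start with CIP, because some of them have a space instead of _. The width given to cut is 10 because backup/XXX is 10 characters long.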
ls backup/*.csv > asd.txt | cut -c 1-10 | uniq | xargs -Ifile bash -c "grep 'file' asd.txt | sort -b -t '_' -k 2,3 | tail -n 3"
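How can I correct this? How can I pick the last 3 files, by the date at the end of the file name, among the files whose names start with CIP (the first 3 characters)?
Thanks.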
Answer 1
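Here is a Python script that prints mv commands so the result can be audited first (you can switch to shutil.move to actually move the files). It is not written in Bash because current Ubuntu releases ship with Python 3, the file names need some cleanup, and parsing them with a regular expression seems safer.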
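Before the full script, a minimal sketch of the parsing step on its own, to show why a regular expression copes with names that mix spaces and underscores. The helper name parse_name and the single date group are assumptions made for this illustration and are not part of the script.

```python
import re

# YYYYMMDD, an underscore, then a run of digits right before the extension dot.
DATE_RE = re.compile(r'(?P<date>\d{8})_(?P<rest>\d+)\.')


def parse_name(filename):
    """Return (group, date, rest) for a backup file name, or None if it has no date stamp."""
    match = DATE_RE.search(filename)
    if match is None:
        return None
    # Everything before the date is the descriptive name.
    # Spaces and underscores are treated alike, and case is ignored.
    name = filename[:match.start('date')].replace(' ', '_').rstrip('_').lower()
    return name[:3], match.group('date'), match.group('rest')


print(parse_name('CIP kost 2_20190829_165004151.csv'))  # ('cip', '20190829', '165004151')
print(parse_name('CIP_20190812_105004438.csv'))         # ('cip', '20190812', '105004438')
print(parse_name('Push_offers20190718_0510.csv'))       # ('pus', '20190718', '0510')
```

Because the date is located by pattern rather than by field position, it does not matter how many underscores or spaces appear in front of it.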
#!/usr/bin/env python3
import os
import re
import shlex
from collections import defaultdict

date_re = re.compile(r'(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_(?P<rest>\d+)\.')
grouped = defaultdict(list)

# Group
for line in os.listdir('.'):  # for line in Path('1199628.txt').open().read().split():
    src = line.strip()  # real name on disk, spaces included
    clean = src.replace(' ', '_')  # cleaned copy, used only for parsing
    ext = clean.split('.')[-1]
    match = date_re.search(clean)
    if not match:
        continue
    parts = match.groupdict()
    # Get group name (via the position of the first matching digit)
    name = clean[:match.start('year')].rstrip('_').lower()
    group_name = name[:3]
    # Clean up name
    normalized = "{}_{}-{}-{}_{}.{}".format(name, parts['year'], parts['month'], parts['day'], parts['rest'], ext)
    # Archive in group folders
    dst = os.path.join('archive', group_name, normalized)
    stamp = (parts['year'], parts['month'], parts['day'], parts['rest'])
    grouped[group_name].append([stamp, src, dst])

# Sort and print mv commands
for group, items in grouped.items():
    # ..by date stamp, newest first, so different names in one group are compared by date
    sorted_group = sorted(items, key=lambda x: x[0], reverse=True)
    to_archive = sorted_group[3:]
    for _, src, dst in to_archive:
        # Quote both paths so names with spaces survive the shell
        print("mv {} {}".format(shlex.quote(src), shlex.quote(dst)))  # shutil.move(src, dst)
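If you switch from printing to actually moving, note that shutil.move does not create missing destination folders. A minimal sketch of that step, assuming you have collected a list of (src, dst) pairs like the ones printed above; the function name archive_files and the dry_run flag are hypothetical and not part of the script.

```python
import os
import shutil


def archive_files(moves, dry_run=True):
    """Move each (src, dst) pair, creating the destination folder first.

    With dry_run=True nothing on disk is touched; the planned moves are only returned.
    """
    planned = []
    for src, dst in moves:
        if not dry_run:
            folder = os.path.dirname(dst)
            if folder:
                # Create e.g. archive/cip before moving files into it.
                os.makedirs(folder, exist_ok=True)
            shutil.move(src, dst)
        planned.append((src, dst))
    return planned
```

Running it once with the default dry_run=True and reading the returned list gives the same audit step as the printed mv commands; pass dry_run=False only once the list looks right.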