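I'm not fond of scripting, but with help from this forum I have managed to create a few scripts. I've run into a problem that I can't get working (and I'm not sure it is even possible).
I have a file, fileY, with this content: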
lrwxrwxrwx 1 user1 gp 35 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt
lrwxrwxrwx 1 user1 gp 35 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt
lrwxrwxrwx 1 user1 gp 35 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt
I want to output columns 3, 6, 7 and 8 and append the name of the folder that comes just before "main", as shown below:
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
How can I get the result of the sed command below as one of the {print} values of the awk command?
awk '{print $3,$6,$7,$8}' fileY
sed 's/\// /g; s/\./ /g' fileY | awk '{for(i=8;i<=NF;i++){if($i~/^main/){a=i}} print $(a-1)}'
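Answer 1
You never need sed when you're using awk. If the directory you want is always the third from the end of the path, as in your example, then all you need is the following, using any awk: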
$ awk '{print $3, $6, $7, $8, p[split($8,p,"/")-2]}' file
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
Otherwise, using GNU awk for the third argument to match():
$ awk '{match($8,"([^/]+)/main/",a); print $3, $6, $7, $8, a[1]}' file
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
Or using any awk:
$ awk '{match($8,"[^/]+/main/"); print $3, $6, $7, $8, substr($8,RSTART,RLENGTH-6)}' file
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
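The first command in this answer uses the return value of split() as an array index inside the same expression. A slightly more explicit sketch stores the count first, so it does not depend on the order in which the pieces of that expression are evaluated. The variable name n, the temporary file and the two sample lines (copied from the question) are assumptions for illustration only.

```shell
# Sample input taken from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
lrwxrwxrwx 1 user1 gp 35 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt
lrwxrwxrwx 1 user1 gp 35 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt
EOF

# n is the number of "/"-separated pieces in field 8. The piece two
# before the last one is the directory preceding "main/summary.txt".
out=$(awk '{ n = split($8, p, "/"); print $3, $6, $7, $8, p[n-2] }' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Like the original one-liner, this only works when the wanted directory is always the third item from the end of the path.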
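Answer 2
I don't really see why you would want sed there; you can do it with a single awk. Of course, this assumes that folder names never contain spaces or newlines and that we can safely use whitespace as the field separator. If that isn't the case, please edit your question and add a more comprehensive example.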
$ awk '{
    n = split($8, dirs, "/")       # n = number of path components
    dir = ""
    for (i = 1; i < n; i++) {      # walk the components in order
        if (dirs[i+1] == "main") {
            dir = dirs[i]
        }
    }
    print $3, $6, $7, $8, dir
}' fileY
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
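The trick here is that split() splits the 8th field into the dirs array, using / as the separator. We then iterate over dirs and keep the last array entry we find whose next entry is main. Note that this means that if main appears more than once, only the last occurrence will be matched.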
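To make the "last occurrence wins" behaviour concrete, here is a minimal sketch run against a made-up path in which main occurs twice. The path and the shell variable names are hypothetical and not taken from the question.

```shell
# Hypothetical path in which "main" occurs twice.
path=/folder/a/main/b/main/summary.txt

last=$(printf '%s\n' "$path" | awk '{
    n = split($1, dirs, "/")
    dir = ""
    for (i = 1; i < n; i++)
        if (dirs[i+1] == "main")
            dir = dirs[i]      # a later match overwrites an earlier one
    print dir
}')
printf '%s\n' "$last"
```

Here the loop first stores a and then overwrites it with b. Adding a break after the assignment would keep the first match instead.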
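Answer 3
Another approach is to use rev, taking advantage of the fact that the wanted folder is the third item from the end when / is used as the separator, assuming the path structure matches the given examples (<wanted folder>/main/summary.txt):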
$ rev file | awk -F'/' '{ print $3,$0 }' | rev | awk '{ print $3,$6,$7,$8,$9 }'
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4
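The pipeline is easier to follow when the intermediate stages are looked at separately. The sketch below runs the first three commands on one sample line from the question; the variable names line, stage1 and stage2 are assumptions for illustration, and rev is assumed to be installed (it ships with util-linux on most Linux systems).

```shell
line='lrwxrwxrwx 1 user1 gp 35 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt'

# Stage 1: reverse the line, then prepend the third "/"-separated field,
# which is the wanted folder name (still spelled backwards).
stage1=$(printf '%s\n' "$line" | rev | awk -F'/' '{ print $3, $0 }')

# Stage 2: reverse back, so the folder name ends up as a ninth
# whitespace-separated field at the end of the original line.
stage2=$(printf '%s\n' "$stage1" | rev)
printf '%s\n' "$stage2"
```

The final awk in the pipeline then only has to pick fields 3, 6, 7, 8 and 9 from that line.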
Answer 4
Using GNU sed with nested grouping:
$ sed -E 's|.*\s[0-9]\s+(.[^ ]*).*([0-9]{4}-.*/(.[^/]*).*/.*/.*)|\1 \2 \3|' input_file
user1 2021-09-07 2000 /folder/subfolder1/subfolder2/subfolder3/main/summary.txt subfolder3
user1 2021-09-08 1400 /folder/subfolder1/subfolder2/main/summary.txt subfolder2
user1 2021-09-09 1800 /folder/subfolder1/subfolder2/subfolder3/subfolder4/main/summary.txt subfolder4