sed: delete everything except the first and last line in many files

Answer 1

Nice, there is an awk version below. Here is the sed version:

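find . -type f -name \*.txt -print0 | xargs -0 -I xxxx sed -ni '
 # Line 2 is the first data line (line 1 is the header).
 2 {
   # Only one data line: print columns 5 and 6 together, then stop.
   $ {
     s/^[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\);\([^;]*\).*$/\1;\2/
     p
     q
   }
   # Otherwise keep column 5 and save it in the hold space.
   s/^[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\).*$/\1/
   h
 }
 # Last line: keep column 6, append it to the hold space,
 # swap, join the two values with ";" and print.
 $ {
   s/^[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\).*$/\1/
   H
   x
   s/\n/;/
   p
 }' xxxx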

Thanks to the well-known "Sed - An Introduction and Tutorial" by Bruce Barnett.

Result:

$ cat stat01.txt
18910101;19860630
$ cat stat56.txt
18980101;19990630
$ cat stat87.txt
19010101;19661229
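
The input format of the stat files is not shown here, so the following is only a sketch under assumptions: it makes up a sample file whose columns 5 and 6 hold the two dates, and runs the same sed script without -i so the hold-space steps can be checked before any real file is modified. The function name first_last and the sample data are hypothetical.

```shell
# Hypothetical sample: a header, then data lines with at least six
# semicolon-separated columns (this layout is an assumption).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
id;name;x;y;von;bis;note
1;A;0;0;18910101;18991231;first
1;A;0;0;19000101;19501231;middle
1;A;0;0;19510101;19860630;last
EOF

# Same script as above, but without -i: the result goes to stdout
# and the input file is left untouched.
first_last() {
  sed -n '
    2 {
      $ {
        s/^[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\);\([^;]*\).*$/\1;\2/
        p
        q
      }
      s/^[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\).*$/\1/
      h
    }
    $ {
      s/^[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;\([^;]*\).*$/\1/
      H
      x
      s/\n/;/
      p
    }' "$1"
}

first_last "$tmp"    # prints 18910101;19860630
```

Once the output looks right on a sample like this, the -i variant can be applied to the real files.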

----

First version, kept for reference

Based on your input, I made up a data file format and a sed script to process it.

Give it a try:

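$ find . -type f -name \*.txt -print0 | xargs -0 -I xxxx sed -ni '
 # Line 2 is the first data line (line 1 is the header).
 2 {
   # Only one data line: print columns 2 and 3 together, then stop.
   $ {
     s/^[^;]*;\([^;]*\);\([^;]*\).*$/\1;\2/
     p
     q
   }
   # Otherwise print column 2 of the first data line.
   s/^[^;]*;\([^;]*\).*$/\1/
   p
 }
 # Last line: print column 3.
 $ {
   s/^[^;]*;[^;]*;\([^;]*\).*$/\1/
   p
 }' xxxx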

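It deletes the first line, which contains the header.

It keeps only column 2 of the first data line encountered and column 3 of the last data line of the file.

If the file contains only one data line, columns 2 and 3 are kept on a single line.

Heh, it is an odd one, but I had fun with it!

Data files in the current directory: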

$ cat test01.txt
Name;Price;Amount;Description
Bread;2.1;3;healthy one
$ cat test02.txt
Name;Price;Amount;Description
Water;0.0;100;For life
Wine;10.3;1;Less than half a glass a day
$ cat test03.txt
Name;Price;Amount;Description
House;1000.0;1;home
Car;500.5;0;no need
Bike;10.3;5;Good for the planet and for me

Result:

$ cat test01.txt
2.1;3
$ cat test02.txt
0.0
1
$ cat test03.txt
1000.0
5

Please provide the contents of two short data files and the expected results, and I will adapt this answer.

Answer 2

For this, you need to loop over the files:

for file in *.txt; do
  lines=$(wc -l < "$file")
  if [ "$lines" -lt 3 ]; then
    echo "$file is short enough, not touching it."
  else
    # writes to "$file.new" for testing; afterwards you can use sed -i instead
    sed -n '1p;$p' "$file" > "$file.new"
  fi
done

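The loop is needed in case some of your files are only one line long. With the command given by thrig, that single line would be printed twice (try echo 1 | sed -n '1p;$p').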
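
The duplication is easy to reproduce, and it can also be avoided inside sed itself. The sketch below is an alternative, not the command from either answer: 1b skips the rest of the script for line 1, which is then printed automatically, and $!d deletes every remaining line that is not the last one. The function name first_and_last is made up for illustration.

```shell
# Print the first and last line of stdin, each exactly once.
first_and_last() {
  sed -e '1b' -e '$!d'
}

printf '1\n' | sed -n '1p;$p'          # prints "1" twice
printf '1\n' | first_and_last          # prints "1" once
printf 'a\nb\nc\n' | first_and_last    # prints "a" and "c"
```

With this form, files of one or two lines pass through unchanged, so the line-count check becomes optional.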

Answer 3

Gawk is a better tool than sed for this task. Reusing the find-xargs pipeline from the original approach and keeping the same output naming:

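find . -type f -name \*.txt -printf "%f\0" | xargs -0 gawk -F\; '
    # First data line of each file: remember column 5.
    FNR==2  { von = $5 }
    # End of each file: $6 is taken from the last record read.
    ENDFILE { print von FS $6 > ("cleaned" FILENAME) }
'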

The code becomes simpler, clearer, and easier to maintain.
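
ENDFILE is a gawk extension. Where only a POSIX awk is available, a per-file shell loop can give a similar result. The sketch below is an alternative under assumptions, not the command above: it assumes the files sit in the current directory, that column 5 of line 2 and column 6 of the last line are wanted, and it reuses the "cleaned" prefix. The function name clean_all and the variable bis are hypothetical.

```shell
# For every *.txt file in the current directory, write
# "<column 5 of line 2>;<column 6 of the last line>" to cleaned<name>.
clean_all() {
  for file in *.txt; do
    [ -f "$file" ] || continue                  # no match: the glob stays literal
    case $file in cleaned*) continue ;; esac    # skip output of earlier runs
    awk -F';' '
      NR == 2 { von = $5 }        # first data line
      { bis = $6 }                # remember column 6 of every line
      END { print von FS bis }    # bis now holds the value from the last line
    ' "$file" > "cleaned$file"
  done
}
```

Running clean_all in a directory containing stat01.txt would create cleanedstat01.txt next to it. Because awk is started once per file, no state carries over from one file to the next.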
