Comparing 3 lines and 2 lines in BASH

Question 1

This problem is similar to the one here, with a slight modification:

| sed '
    :1
    N    #add next line
    s/\([0-9.]\+\)\s\S\+\n.*\s\1\s\S\+$/\1 both/
    t1   #go to point 1 if exchange took place
    P    #print first line from two
    D    #remove first line, return to start
    '
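
To see what the loop does, here is a minimal sketch that feeds the same sed script a few made-up lines. The sample data and the variable names `sample` and `collapsed` are assumptions for illustration only; the script relies on GNU sed extensions (`\s`, `\S`, `\+`) and on the lines for one IP being adjacent.

```shell
# Hypothetical sample: date, time, IP, type; lines for one IP are adjacent.
sample='2014-11-24 12:59:42.169 101.0.0.0 source
2014-11-24 12:59:06.456 107.188.128.0 source
2014-11-24 12:59:05.000 107.188.128.0 destination
2014-11-24 12:59:40.375 104.156.80.0 destination'

collapsed=$(printf '%s\n' "$sample" | sed '
    :1
    N    # append the next input line to the pattern space
    s/\([0-9.]\+\)\s\S\+\n.*\s\1\s\S\+$/\1 both/
    t1   # a pair was merged: loop and try the following line too
    P    # print the first line of the pattern space
    D    # delete it and restart with what is left
    ')
printf '%s\n' "$collapsed"
```

With this input the two 107.188.128.0 lines collapse into a single line ending in "both". The merged line keeps the time stamp of the first line of the pair, so the order of the input decides which time stamp survives.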

Question 2

Does this do what you need?

 awk 'BEGIN{ip="nothing" 
    time=""
    type=""
 }
 {
    # if the currently processed ip is not the same as the line 
    # being processed then we need to print the data.
    if (ip != $3)
    {
       # if ip == nothing then this is the first line do not print.
       # otherwise we are at a line with a new ip and we should print
       # the data saved from previous lines.
       if(ip != "nothing")
       { 
          print time, ip, type
       }
       # The time stamp is not updated here; it is updated after the
       # if statement so that it is refreshed on every line. This makes
       # the printed line carry the last time stamp for each IP.
       ip=$3
       type=$4
    }
    else if (type != $4)
    {
       type="both"
    }
    # no matter what update the time stamp value so that the latest
    # time stamp is kept for any given ip. Putting it after the if
    # that handles when a new ip is found, makes sure that it does not
    # override the value printed for the old ip line.
    time=$1" "$2
 }
 END{
    # Once we reach the end of the input, we still have 
    # the last set of values to print.
    print time, ip, type
 }'

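It reads the file, and if two consecutive lines have the same IP but different types (des, src, both), it changes the type to both. Otherwise, when a new IP is found in the data, it prints the saved line for the previous IP with the type it has.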
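
Because the script only compares consecutive lines, the input has to be grouped by IP first. The following is a minimal sketch of that idea, not the original script: it sorts made-up data by IP and time stamp, then applies a condensed version of the same logic. The sample data and the names `sample` and `merged` are assumptions.

```shell
# Hypothetical sample data: date, time, IP, type (whitespace-separated).
sample='2014-11-24 12:59:42.169 101.0.0.0 source
2014-11-24 12:59:06.456 107.188.128.0 source
2014-11-24 12:59:40.375 104.156.80.0 destination
2014-11-24 12:59:33.209 107.188.128.0 destination'

# Sort by IP (field 3), then by date and time, so lines for one IP are
# adjacent and the newest one comes last within each group.
merged=$(printf '%s\n' "$sample" | LC_ALL=C sort -k3,3 -k1,2 | awk '
    $3 != ip {                      # first line of a new IP group
        if (NR > 1) print time, ip, type
        ip = $3; type = $4
    }
    $4 != type { type = "both" }    # same IP seen with a different type
    { time = $1 " " $2 }            # remember the latest time stamp
    END { if (NR > 0) print time, ip, type }
')
printf '%s\n' "$merged"
```

Here the sort is a plain byte-wise sort on the IP text, which is enough to make equal IPs adjacent; it does not order the addresses numerically.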

Question 3

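Given the input file foo.txt, the pipeline below works as follows:

sort orders the lines numerically on the first three fields,
datamash does the real work of combining the labels for each IP,
cut removes the redundant field,

and sed then replaces any combined label with "both".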

sort -r -k1n -k2n -k3n foo.txt | \
  datamash -W -f -s -g3 collapse 4 | \
  cut --complement -f4 | \
  sed 's/\t[sdb].*,.*$/\tboth/g'

Output:

2014-11-24  12:59:42.169    101.0.0.0       source
2014-11-24  12:59:40.375    104.156.80.0    destination
2014-11-24  12:59:36.729    104.219.48.0    destination
2014-11-24  12:59:40.377    104.37.160.0    source
2014-11-24  12:59:06.456    107.188.128.0   both
2014-11-24  12:59:42.043    107.192.0.0     both
2014-11-24  12:59:33.209    108.175.32.0    both
2014-11-24  12:59:55.488    111.0.0.0       both
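
Where datamash is not available, the grouping step can be approximated in plain awk. This is a stand-in sketch rather than the datamash pipeline itself: it collects one label and the newest time stamp per IP in arrays, so no sorting is needed, and it prints the IPs in first-seen order with single spaces instead of tabs. The sample data and the names `sample`, `summary`, `label`, `latest` and `order` are assumptions.

```shell
# Hypothetical sample data: date, time, IP, type; IPs need not be adjacent.
sample='2014-11-24 12:59:06.456 107.188.128.0 source
2014-11-24 12:59:42.169 101.0.0.0 source
2014-11-24 12:59:05.000 107.188.128.0 destination'

summary=$(printf '%s\n' "$sample" | awk '
    {
        stamp = $1 " " $2
        if (!($3 in label)) {            # first time this IP is seen
            order[++n] = $3
            label[$3] = $4
        } else if (label[$3] != $4) {    # seen before with another label
            label[$3] = "both"
        }
        if (stamp > latest[$3]) latest[$3] = stamp   # keep the newest time stamp
    }
    END {
        for (i = 1; i <= n; i++)
            print latest[order[i]], order[i], label[order[i]]
    }
')
printf '%s\n' "$summary"
```

The time stamps compare correctly as strings only because they share one fixed-width format.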

Question 4

I modified the code given in the OP:

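awk '{print $3}' input.txt | sort -u | while read -r line
do
    # -F: treat the IP as a fixed string, -w: match it only as a whole word
    echo -n $(grep -Fw -- "$line" input.txt | \
      sort -r | head -1 | \
      grep -oe "[^a-z]*") ' ' # print latest time stamp
    # assumes an IP listed more than once appears as both source and destination
    if [[ $(grep -cFw -- "$line" input.txt) -ge 2 ]];  then
        echo  'both'
    else
        echo $(grep -Fw -- "$line" input.txt | grep -oe "[a-z]*")
    fi
done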

Output:

2014-11-24 12:59:42.169 101.0.0.0  source
2014-11-24 12:59:40.375 104.156.80.0  destination
2014-11-24 12:59:36.729 104.219.48.0  destination
2014-11-24 12:59:40.377 104.37.160.0  source
2014-11-24 12:59:06.456 107.188.128.0  both
2014-11-24 12:59:42.043 107.192.0.0  both
2014-11-24 12:59:33.209 108.175.32.0  both
2014-11-24 12:59:55.488 111.0.0.0  both
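
When an IP address is handed to grep as a search pattern, it matters whether the dots are read as regex wildcards or as literal dots. A small sketch with made-up data (the second "IP" is deliberately bogus, and the variable names are assumptions) shows the difference:

```shell
# Hypothetical lines; the second one only resembles the IP we search for.
lines='2014-11-24 12:59:42.169 10.0.0.1 source
2014-11-24 12:59:40.375 10a0b0c1 destination'

# As a regex, each "." matches any character, so both lines are counted.
regex_hits=$(printf '%s\n' "$lines" | grep -c -- '10.0.0.1')
# With -F the pattern is a fixed string, so only the real IP is counted.
fixed_hits=$(printf '%s\n' "$lines" | grep -cF -- '10.0.0.1')
echo "regex: $regex_hits, fixed string: $fixed_hits"
```

This prints "regex: 2, fixed string: 1". Since the loop decides on "both" from a line count, an accidental extra match would change the label.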
