Extract specific fields from a file

I have a file with the following content:

ALARM: 20190507   1 10:00:09.023000000 10:00:09.023000000 ABB < 50> @160 . "AB12345" 12345 - S . ".." "" "" "POSSIBLE DISK ISSUE (N-to-A) (VAMOS)" "POSSIBLE DISK ISSUE (Y-to-Y) (VAMOS) - At 10:00:09.023, VAR 10 crossed 90 AM at (Alarm number 1213456789). A list issues related to this alarm. See the attached CSV for details (Click 'Export Attachment' at the top of the screen).==20190507.diskissue_g-to-A.12345.Aaslmeer.IssueReferences.1.csv.gz,20190507.diskissue_g-to-A.12345.Aaslmeer.IssueIds.2.csv.gz,20190507.diskissue_g-to-A.12345.Aaslmeer.IssueList.3.csv.gz FIELD*K-ALLOW_PROPRIETARY_VAMOSS-ARR*false*FIELD*K-MEMORY_THRESHOLD-ARR*1*FIELD*K-MEMORY_THRESHOLD-VAL*N/A*FIELD*K-CPU_PERCENT_THRESHOLD-ARR*0.010000000*FIELD*K-CPU_PERCENT_THRESHOLD-VAL*N/A*FIELD*K-VOLUME_LIMIT-ARR*0.000000000*FIELD*K-VOLUME_LIMIT-VAL*N/A*FIELD*K-TIME_THRESHOLD-ARR*00:00:10.000*FIELD*K-TIME_THRESHOLD-VAL*N/A*FIELD*K-PROCESSING_PERCENT_THRESHOLD-ARR*0.000000000*FIELD*K-PROCESSING_PERCENT_THRESHOLD-VAL*N/A*FIELD*K-MATCH_MULTIPLE_TESTS-ARR*false*FIELD*K-EXCLUDE_DIFFERENT_ISSUERS-ARR*false*FIELD*N-CHILD-TYPE*CURRENT*FIELD*N-CHILD-NAME*TRAVEL CURRENT PC04052*FIELD*N-ALARM-TYPE*PRESENT*FIELD*N-OFF-APPLICATION*false*FIELD*N-CTC*27.110000000*FIELD*N-CTC-CU*ABC*FIELD*N-ISSUE-NUM*1240000551*FIELD*N-KNOR-DOWN*N/A*FIELD*N-TRIGGER-TIME-BID*08:27:25.791*FIELD*N-TRIGGER-TIME-ASK*08:27:24.796*FIELD*N-BEN-AVG-DAILY-VOLUME*576525.600000000*FIELD*SYSTEM-TIME*10:02:04.451686000" "" PresentTime

ALARM: 20190507   2 10:00:09.023000000 10:00:09.023000000 ABB < 50> @160 . "LP12345" 12345 - I . ".." "" "" "POSSIBLE DISK ISSUE (Y-to-Y) (CURRENT)" "POSSIBLE DISK ISSUE (Y-to-Y) (VAMOS) - At 10:00:09.023, var 90 crossed  (issue number 12434576589). See the attached CSV for details (Click 'Export Attachment' at the top of the screen).==20190507.diskissue_g-to-A.12345.Aaslmeer.TraderReferences.1.csv.gz,20190507.diskissue_g-to-A.12345.Aaslmeer.IssueIds.2.csv.gz,20190507.diskissue_g-to-A.12345.Aaslmeer.IssueList.3.csv.gz FIELD*K-ALLOW_PROPRIETARY_VAMOSS-ARR*false*FIELD*K-MEMORY_THRESHOLD-ARR*1*FIELD*K-MEMORY_THRESHOLD-VAL*N/A*FIELD*K-CPU_PERCENT_THRESHOLD-ARR*0.010000000*FIELD*K-CPU_PERCENT_THRESHOLD-VAL*N/A*FIELD*K-VOLUME_LIMIT-ARR*0.000000000*FIELD*K-VOLUME_LIMIT-VAL*N/A*FIELD*K-TIME_THRESHOLD-ARR*00:00:10.000*FIELD*K-TIME_THRESHOLD-VAL*N/A*FIELD*K-PRESENT_PERCENT_THRESHOLD-ARR*0.000000000*FIELD*K-PROCESSING_PERCENT_THRESHOLD-VAL*N/A*FIELD*K-MATCH_MULTIPLE_TESTS-ARR*false*FIELD*K-EXCLUDE_DIFFERENT_CASES-ARR*false*FIELD*N-CHILD-TYPE*CURRENT*FIELD*N-CHILD-NAME*TRAVEL Savings PC04052*FIELD*N-ALARM-TYPE*PRESENT*FIELD*N-OFF-APPLICATION*false*FIELD*N-CTC*27.110000000*FIELD*N-CTC-CU*ABC*FIELD*N-ISSUE-NUM*1240000551*FIELD*N-KNOR-DOWN*N/A*FIELD*N-RAISED-TIME-BID*08:27:25.791*FIELD*N-RAISED-TIME-ASK*08:27:24.796*FIELD*N-GUN-AVG-DAILY-VOLUME*576525.600000000*FIELD*SYSTEM-TIME*10:02:04.451686000" "" PresentTime

The desired output is as follows:

1,K-ALLOW_PROPRIETARY_VAMOSS-ARR,false
1,K-MEMORY_THRESHOLD-ARR,1
1,K-MEMORY_THRESHOLD-VAL,N/A
1,K-CPU_PERCENT_THRESHOLD-ARR,0.010000000
1,K-CPU_PERCENT_THRESHOLD-VAL,N/A
1,K-VOLUME_LIMIT-ARR,0.000000000
1,K-VOLUME_LIMIT-VAL,N/A
1,K-TIME_THRESHOLD-ARR,00:00:10.000
1,K-TIME_THRESHOLD-VAL,N/A
1,K-PROCESSING_PERCENT_THRESHOLD-ARR,0.000000000
1,K-PROCESSING_PERCENT_THRESHOLD-VAL,N/A
1,K-MATCH_MULTIPLE_TESTS-ARR,false
1,K-EXCLUDE_DIFFERENT_ISSUERS-ARR,false
1,N-CHILD-TYPE,CURRENT
1,N-CHILD-NAME,TRAVEL CURRENT PC04052
1,N-ALARM-TYPE,PRESENT
1,N-OFF-APPLICATION,false
1,N-CTC,27.110000000
1,N-CTC-CU,ABC
1,N-ISSUE-NUM,1240000551
1,N-KNOR-DOWN,N/A
1,N-TRIGGER-TIME-BID,08:27:25.791
1,N-TRIGGER-TIME-ASK,08:27:24.796
1,N-BEN-AVG-DAILY-VOLUME,576525.600000000
1,SYSTEM-TIME,10:02:04.451686000
2,K-ALLOW_PROPRIETARY_VAMOSS-ARR,false
2,K-MEMORY_THRESHOLD-ARR,1
2,K-MEMORY_THRESHOLD-VAL,N/A
2,K-CPU_PERCENT_THRESHOLD-ARR,0.010000000
2,K-CPU_PERCENT_THRESHOLD-VAL,N/A
2,K-VOLUME_LIMIT-ARR,0.000000000
2,K-VOLUME_LIMIT-VAL,N/A
2,K-TIME_THRESHOLD-ARR,00:00:10.000
2,K-TIME_THRESHOLD-VAL,N/A
2,K-PRESENT_PERCENT_THRESHOLD-ARR,0.000000000
2,K-PROCESSING_PERCENT_THRESHOLD-VAL,N/A
2,K-MATCH_MULTIPLE_TESTS-ARR,false
2,K-EXCLUDE_DIFFERENT_CASES-ARR,false
2,N-CHILD-TYPE,CURRENT
2,N-CHILD-NAME,TRAVEL Savings PC04052
2,N-ALARM-TYPE,PRESENT
2,N-OFF-APPLICATION,false
2,N-CTC,27.110000000
2,N-CTC-CU,ABC
2,N-ISSUE-NUM,1240000551
2,N-KNOR-DOWN,N/A
2,N-RAISED-TIME-BID,08:27:25.791
2,N-RAISED-TIME-ASK,08:27:24.796
2,N-GUN-AVG-DAILY-VOLUME,576525.600000000
2,SYSTEM-TIME,10:02:04.451686000

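With the command below I can extract the name and value pairs from each line, but without field 3 in front of them. I am not sure how to go on and prepend field 3 to every output row to get the desired output.

Command tried: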

 zcat input.gz | sed 's/FIELD/\n/g' | grep '^\*' | sed 's/^*//g;s/*/,/g;s/,$//g;s/" "" PresentTime//g'

Output received:

K-ALLOW_PROPRIETARY_VAMOSS-ARR,false
K-MEMORY_THRESHOLD-ARR,1
K-MEMORY_THRESHOLD-VAL,N/A
K-CPU_PERCENT_THRESHOLD-ARR,0.010000000
K-CPU_PERCENT_THRESHOLD-VAL,N/A
K-VOLUME_LIMIT-ARR,0.000000000
K-VOLUME_LIMIT-VAL,N/A
K-TIME_THRESHOLD-ARR,00:00:10.000
K-TIME_THRESHOLD-VAL,N/A
K-PROCESSING_PERCENT_THRESHOLD-ARR,0.000000000
K-PROCESSING_PERCENT_THRESHOLD-VAL,N/A

A solution in another language (Python, for example) would also be fine.

Note: the input file is read-only.

Answer 1

Try this:

 while IFS= read -r i; do
    LineNO=$(printf '%s\n' "$i" | awk '{print $3}')
    printf '%s\n' "$i" | awk -F 'FIELD' -v a="$LineNO" '{for(j=2;j<=NF;j++) print a$j}' | awk -F '[*"]' '{print $1","$2","$3}'
 done < input

For a .gz file:

 gunzip < input.gz | while IFS= read -r i; do
    LineNO=$(printf '%s\n' "$i" | awk '{print $3}')
    printf '%s\n' "$i" | awk -F 'FIELD' -v a="$LineNO" '{for(j=2;j<=NF;j++) print a$j}' | awk -F '[*"]' '{print $1","$2","$3}'
 done
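
  • gunzip reads the compressed file and writes its contents to standard output.
  • while reads that output line by line.
  • LineNO holds the record number, that is, the third field of the line.
  • The first awk splits the line on the separator FIELD.
  • The for loop prints every piece from the second one onward.
  • a prefixes each piece with the record number.
  • The second awk reformats each piece as comma-separated output.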
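
The steps above can probably be folded into a single awk pass, which avoids starting three processes per input line. The following is only a sketch under assumptions: the function name extract_fields is made up for illustration, and it assumes every record looks like the sample in the question (record number in the third whitespace-separated field, pairs written as FIELD*NAME*VALUE*, and the last value terminated by a double quote).

```shell
# Hypothetical helper: reads ALARM lines on stdin, writes "id,name,value" rows.
extract_fields() {
    awk -F 'FIELD[*]' '
        NF > 1 {
            # The text before the first FIELD* is the header of the record;
            # its third whitespace-separated word is the record number.
            split($1, head, " ")
            id = head[3]
            for (j = 2; j <= NF; j++) {
                # Each remaining piece is NAME*VALUE followed by * or a quote.
                split($j, kv, /[*"]/)
                print id "," kv[1] "," kv[2]
            }
        }'
}
```

For the compressed, read-only input from the question this would be called as gunzip -c input.gz | extract_fields, which only reads the file and never writes to it. Splitting on * and " but not on spaces keeps values such as TRAVEL CURRENT PC04052 in one piece.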

Answer 2

$ cat tst.awk
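BEGIN { OFS="," }
NF {                          # skip blank lines
    idx = $3                  # record number: third whitespace-separated field
    sub(/.* FIELD\*/,"")      # drop the prefix through " FIELD*" (only the first FIELD follows a space)
    sub(/" .*/,"")            # drop the closing quote and everything after it
    n = split($0,f,/[*]/)     # f[] now holds name, value, "FIELD", name, value, ...
    for (i=1; i<=n; i+=3) {   # step by 3 to skip each "FIELD" marker
        print idx, f[i], f[i+1]
    }
}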

$ awk -f tst.awk file
1,K-ALLOW_PROPRIETARY_VAMOSS-ARR,false
1,K-MEMORY_THRESHOLD-ARR,1
1,K-MEMORY_THRESHOLD-VAL,N/A
1,K-CPU_PERCENT_THRESHOLD-ARR,0.010000000
1,K-CPU_PERCENT_THRESHOLD-VAL,N/A
1,K-VOLUME_LIMIT-ARR,0.000000000
1,K-VOLUME_LIMIT-VAL,N/A
1,K-TIME_THRESHOLD-ARR,00:00:10.000
1,K-TIME_THRESHOLD-VAL,N/A
1,K-PROCESSING_PERCENT_THRESHOLD-ARR,0.000000000
1,K-PROCESSING_PERCENT_THRESHOLD-VAL,N/A
1,K-MATCH_MULTIPLE_TESTS-ARR,false
1,K-EXCLUDE_DIFFERENT_ISSUERS-ARR,false
1,N-CHILD-TYPE,CURRENT
1,N-CHILD-NAME,TRAVEL CURRENT PC04052
1,N-ALARM-TYPE,PRESENT
1,N-OFF-APPLICATION,false
1,N-CTC,27.110000000
1,N-CTC-CU,ABC
1,N-ISSUE-NUM,1240000551
1,N-KNOR-DOWN,N/A
1,N-TRIGGER-TIME-BID,08:27:25.791
1,N-TRIGGER-TIME-ASK,08:27:24.796
1,N-BEN-AVG-DAILY-VOLUME,576525.600000000
1,SYSTEM-TIME,10:02:04.451686000
2,K-ALLOW_PROPRIETARY_VAMOSS-ARR,false
2,K-MEMORY_THRESHOLD-ARR,1
2,K-MEMORY_THRESHOLD-VAL,N/A
2,K-CPU_PERCENT_THRESHOLD-ARR,0.010000000
2,K-CPU_PERCENT_THRESHOLD-VAL,N/A
2,K-VOLUME_LIMIT-ARR,0.000000000
2,K-VOLUME_LIMIT-VAL,N/A
2,K-TIME_THRESHOLD-ARR,00:00:10.000
2,K-TIME_THRESHOLD-VAL,N/A
2,K-PRESENT_PERCENT_THRESHOLD-ARR,0.000000000
2,K-PROCESSING_PERCENT_THRESHOLD-VAL,N/A
2,K-MATCH_MULTIPLE_TESTS-ARR,false
2,K-EXCLUDE_DIFFERENT_CASES-ARR,false
2,N-CHILD-TYPE,CURRENT
2,N-CHILD-NAME,TRAVEL Savings PC04052
2,N-ALARM-TYPE,PRESENT
2,N-OFF-APPLICATION,false
2,N-CTC,27.110000000
2,N-CTC-CU,ABC
2,N-ISSUE-NUM,1240000551
2,N-KNOR-DOWN,N/A
2,N-RAISED-TIME-BID,08:27:25.791
2,N-RAISED-TIME-ASK,08:27:24.796
2,N-GUN-AVG-DAILY-VOLUME,576525.600000000
2,SYSTEM-TIME,10:02:04.451686000
