Processing Informatica job log files with a shell script

Raw data from the log file:

READER_1_1_1> BIGQUERYV2_10000 [2022-11-04 01:55:20.724] [INFO] Job statistics - \n Job ID [job_PsfUvYJkPeBfecxeIzUUrIIa9TEc] \n Job creation time [2022-11-04 01:54:54.724] , \n Job start time [2022-11-04 01:54:54.936], \n Job end time [2022-11-04 01:55:10.88], \n Bytes processed [4,081,564,561] .

DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [src_BQ_CONSUMER_CURRENT] (Instance Name: [src_BQ_CONSUMER_CURRENT] Instance UI Name: [src_BQ_CONSUMER_CURRENT])
     Output Rows [2104173], Affected Rows [2104173], Applied Rows [2104173], Rejected Rows [0]
DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [Account] (Instance Name: [Account] Instance UI Name: [TGT_ACCOUNT])
     Output Rows [2103334], Affected Rows [0], Applied Rows [2103334], Rejected Rows [839]
DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [EU_Delta_Account_txt] (Instance Name: [EU_Delta_Account_txt] Instance UI Name: [tgt_FILE])
     Output Rows [1], Affected Rows [1], Applied Rows [1], Rejected Rows [0]
DIRECTOR> TM_6020 [2022-11-04 03:25:53.269] Session [s_mtt_0117JZ0Z000000000047] completed at [Fri Nov 04 03:25:53 2022].

I need to capture fields such as Table:, Job start time, completed at, appended rows, and error rows into another file. These field values will then be stored in a Hadoop Hive table and used to send a notification email.
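One possible sketch using sed and awk. It assumes "appended rows" and "error rows" correspond to the log's Applied Rows and Rejected Rows fields; the file names informatica_session.log and stats.tsv are placeholders, and the sample log is rebuilt inline so the script is self-contained.

```shell
#!/bin/sh
# Sketch: extract table statistics, job start time, and session completion
# time from an Informatica session log into a TSV file (one row per target
# table) that could later be loaded into a Hive table.

LOG=informatica_session.log   # placeholder path to the real session log
OUT=stats.tsv

# Sample log lines from the question, used here only for demonstration.
cat > "$LOG" <<'EOF'
READER_1_1_1> BIGQUERYV2_10000 [2022-11-04 01:55:20.724] [INFO] Job statistics - \n Job ID [job_PsfUvYJkPeBfecxeIzUUrIIa9TEc] \n Job creation time [2022-11-04 01:54:54.724] , \n Job start time [2022-11-04 01:54:54.936], \n Job end time [2022-11-04 01:55:10.88], \n Bytes processed [4,081,564,561] .
DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [src_BQ_CONSUMER_CURRENT] (Instance Name: [src_BQ_CONSUMER_CURRENT] Instance UI Name: [src_BQ_CONSUMER_CURRENT])
     Output Rows [2104173], Affected Rows [2104173], Applied Rows [2104173], Rejected Rows [0]
DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [Account] (Instance Name: [Account] Instance UI Name: [TGT_ACCOUNT])
     Output Rows [2103334], Affected Rows [0], Applied Rows [2103334], Rejected Rows [839]
DIRECTOR> CMN_1740 [2022-11-04 03:25:53.269] Table: [EU_Delta_Account_txt] (Instance Name: [EU_Delta_Account_txt] Instance UI Name: [tgt_FILE])
     Output Rows [1], Affected Rows [1], Applied Rows [1], Rejected Rows [0]
DIRECTOR> TM_6020 [2022-11-04 03:25:53.269] Session [s_mtt_0117JZ0Z000000000047] completed at [Fri Nov 04 03:25:53 2022].
EOF

# Job start time: the bracketed value after "Job start time".
start_time=$(sed -n 's/.*Job start time \[\([^]]*\)\].*/\1/p' "$LOG" | head -1)

# Session completion time: the bracketed value after "completed at" (TM_6020 line).
completed_at=$(sed -n 's/.*completed at \[\([^]]*\)\].*/\1/p' "$LOG" | head -1)

# For each CMN_1740 table line, grab the table name, then read the following
# row-count line and pull Applied Rows / Rejected Rows out of its brackets.
awk -v start="$start_time" -v done_at="$completed_at" '
  /CMN_1740.*Table:/ {
    match($0, /Table: \[[^]]*\]/)               # "Table: [NAME]"
    table = substr($0, RSTART + 8, RLENGTH - 9) # strip "Table: [" and "]"
    getline stats                               # the row-count line follows
    split(stats, f, /[][]/)                     # bracketed numbers -> even fields
    applied = f[6]; rejected = f[8]
    print table "\t" applied "\t" rejected "\t" start "\t" done_at
  }' "$LOG" > "$OUT"

cat "$OUT"
```

The resulting TSV (table, applied rows, rejected rows, job start time, completion time) can be moved to HDFS and loaded into a Hive table with `LOAD DATA`, or read directly by an external Hive table whose `ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'` matches this layout.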