I have a question. I have a file, test.txt, with the following contents:
dn: serv=CSPS,mscId=167e48dc2b7a42d4acce611c8b477262,ou=multiSCs,dc=three
structuralObjectClass: CP1
objectClass: CP1
objectClass: CUDBServiceAuxiliary
objectClass: CP2
objectClass: CP3
objectClass: CP4
objectClass: CP5
objectClass: CP6
UNKNLOCDATECS:: FQsJ
UNKNLOCDATEPS:: FgMe
ISTTIMESTAMP:: FgMIDyI7
CSULTIME:: HgMWCzYo
CSLOCTIME:: AQQWBA0R
PSULTIME:: HgMWDBco
PSLOCTIME:: HgMWDBco
SCHAR:: AgA=
ICS: 1
CAT: 10
DBSG: 1
OFA: 1
SOCB: 1
PWD: 0000
PWDC: 0
SOCFB: 0
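Every time the text CSULTIME:: or CSLOCTIME:: is found, I want to replace the value that follows it using the function below, which decodes the timestamp into a readable format. It would be better if both values could be replaced in a single pass over the file, since we are talking about an 8 GB file. The function is the same in both cases: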
base64 -d | hexdump -v -e '1/1 "%02d" ' | awk 'BEGIN {FS = ""} {print "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'
If I echo the two values through it on Unix:
For CSULTIME the result would be 2022-03-30 11:54:40: echo -n "HgMWCzYo" | base64 -d | hexdump -v -e '1/1 "%02d" ' | awk 'BEGIN {FS = ""} {print "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'
For CSLOCTIME the result would be 2022-04-01 04:13:17: echo -n "AQQWBA0R" | base64 -d | hexdump -v -e '1/1 "%02d" ' | awk 'BEGIN {FS = ""} {print "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'
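The pipeline above can be wrapped in a small function so each value is decoded with one call. This is only a sketch: decode_ts is a made-up name, and od is used in place of hexdump on the assumption that the decoded value is always six bytes in the order day, month, year, hour, minute, second.

```shell
#!/bin/sh
# decode_ts is a hypothetical helper name, not part of the original post.
# Assumption: the base64 value decodes to exactly six bytes:
# day, month, two-digit year, hour, minute, second.
decode_ts() {
    # od prints the six bytes as unsigned decimals on one line;
    # awk reorders them into "20YY-MM-DD hh:mm:ss".
    printf '%s' "$1" | base64 -d | od -An -v -tu1 |
        awk '{ printf "20%02d-%02d-%02d %02d:%02d:%02d\n", $3, $2, $1, $4, $5, $6 }'
}

decode_ts HgMWCzYo   # CSULTIME value from the sample file
decode_ts AQQWBA0R   # CSLOCTIME value from the sample file
```

Calling it once per line of an 8 GB file would still start several processes per match, so it is more useful for spot checks than for the bulk conversion.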
So the file would end up with the following CSULTIME and CSLOCTIME values:
dn: serv=CSPS,mscId=167e48dc2b7a42d4acce611c8b477262,ou=multiSCs,dc=three
structuralObjectClass: CP1
objectClass: CP1
objectClass: CUDBServiceAuxiliary
objectClass: CP2
objectClass: CP3
objectClass: CP4
objectClass: CP5
objectClass: CP6
UNKNLOCDATECS:: FQsJ
UNKNLOCDATEPS:: FgMe
ISTTIMESTAMP:: FgMIDyI7
CSULTIME:: 2022-03-30 11:54:40
CSLOCTIME:: 2022-04-01 04:13:17
PSULTIME:: HgMWDBco
PSLOCTIME:: HgMWDBco
SCHAR:: AgA=
ICS: 1
CAT: 10
DBSG: 1
OFA: 1
SOCB: 1
PWD: 0000
PWDC: 0
SOCFB: 0
I'm completely lost, because none of the sed combinations I have tried work.
Thanks in advance!
Answer 1
I would do something like:
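# Single pass; decodes only CSULTIME and CSLOCTIME and keeps the "ATTR:: " prefix.
# The six decoded bytes are day, month, year, hour, minute, second.
perl -MMIME::Base64 -pe 's{^(?:CSULTIME|CSLOCTIME)::\s*\K(\S+)}{
my ($d, $m, $y, @t) = unpack "C*", decode_base64 $1;
sprintf "20%02d-%02d-%02d %02d:%02d:%02d", $y, $m, $d, @t}e' test.txt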
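If perl is not available, the same single-pass idea can be sketched with awk alone, decoding the base64 arithmetically so that no external process is started per line. The wrapper name decode_file and the helper decode6 are invented for this example, and it assumes every value is exactly eight unpadded base64 characters (six bytes); any other value is left unchanged.

```shell
#!/bin/sh
# decode_file is a hypothetical wrapper name; the attribute names come from the question.
decode_file() {
    awk '
    BEGIN {
        alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    }
    # Decode 8 base64 characters of s into the 6 byte values b[1]..b[6].
    function decode6(s, b,    i, j, n) {
        for (i = 0; i < 2; i++) {
            n = 0
            # four 6-bit characters form one 24-bit number
            for (j = 1; j <= 4; j++)
                n = n * 64 + index(alphabet, substr(s, 4 * i + j, 1)) - 1
            # split the 24-bit number into three bytes
            b[3 * i + 1] = int(n / 65536)
            b[3 * i + 2] = int(n / 256) % 256
            b[3 * i + 3] = n % 256
        }
    }
    # Only rewrite values that are exactly 8 base64 characters long.
    /^(CSULTIME|CSLOCTIME):: / && length($2) == 8 && $2 !~ "[^A-Za-z0-9+/]" {
        decode6($2, b)
        $0 = sprintf("%s 20%02d-%02d-%02d %02d:%02d:%02d", $1, b[3], b[2], b[1], b[4], b[5], b[6])
    }
    { print }
    ' "$@"
}
```

It could then be run as decode_file test.txt > decoded.txt, where decoded.txt is just an example output name.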
Answer 2
You can use the following approach: we first set up the machinery for Base64 decoding, then use GNU sed to process the encoded data.
#------------------
# base64 in sed
#------------------
set -u
#> present century
century=$(date '+%C')
#> format string printf
fmt='%s%s\n'
#> bit, sextet, & octet regex
bit='[01]'
sextet="${bit}{6}"
octet="${bit}{8}"; byte=$octet
#> base64 charset
b64='/[:alnum:]+'
declare -a b64_chars=({A..Z} {a..z} {0..9} + /)
#> user-defined helper function(s)
oneLine() {
# collapse stdin into one line
paste -sd'\0' -
}
esc_rhs() {
# make stdin pluggable
# on the rhs of a s///
sed -e '
s:[\/&]:\\&:g
$!s:$:\\:
' -
}
dec2bin() {
# create an array of size 2^$1
# Usage: dec2bin 6
# creates array d2b as:
# underscore for clarity only
# $d2b[0] => "000_000"
# $d2b[1] => "000_001"
# ...
# $d2b[63] => "111_111"
eval "d2b=($(yes '{0,1}' | sed "$1q" | oneLine))"
}
#> build the encoding lookup table
encode_tbl=$(printf '%s\n' "$(
dec2bin 6
i=0
for c in "${b64_chars[@]}"
do
printf "$fmt" "$c" "${d2b[$i]}"
(( i++ ))
done
)" | esc_rhs)
#> build the decoding lookup table
decode_tbl=$(printf '%s\n' "$(
dec2bin 8
for dec in {0..255}
do
hex=$(printf '%02x' "$dec")
bin=${d2b[$dec]}
printf "$fmt" "$bin" "$hex"
done
)" | esc_rhs)
#> hex to decimal
h2d=$(for i in {0..255};do
printf 'x%02x:%02d\n' "$i" "$i"
done | oneLine | esc_rhs)
#######> main()
sed -E "
s/^\s*(CSULTIME|CSLOCTIME)::\s*/&\n/;T
h;s/\n.*//
x;s/.*\n//
s/[^${b64}]//g
/^\$/d;s/^|\$/\n/g
# unencode
s/\$/$encode_tbl/
:encode
/\n\n/!{
s/((\n)[${b64}])(.*\1(${sextet}))/\4\2\3/
b encode
}
s/\n.*//;:pad
/^(${octet})+\$/!{
s/\$/0/;b pad
}
# chunk it in 8-bit portions
s/(${octet})/& /g
G;s/\$/${decode_tbl}/;t decode
:decode
s/(${octet}) (.*\n\1([[:xdigit:]]{2}))/\3\2/
t decode
s/\n.*//
#> hex to decimal
s/^|\$/\n/g
s/\$/$h2d/;t hex2int
:hex2int
s/\n([[:xdigit:]]{2})(.*\n.*x\1:([[:digit:]]+))/\3\n\2/
t hex2int
s/\n.*//
#> rearrange decoded value into
#> yyyy-mm-dd hh:mm:ss format
s/(..)(..)(..)/\3\2\1/
s/../&-/g
s/(.*)./$century\1/
s/-/:/3g
s/:/ /
#> stitch back the prefixes
#> CSULTIME, CSLOCTIME
H;z;x;s/\n//
" file
Output:
dn: serv=CSPS,mscId=167e48dc2b7a42d4acce611c8b477262,ou=multiSCs,dc=three
structuralObjectClass: CP1
objectClass: CP1
objectClass: CUDBServiceAuxiliary
objectClass: CP2
objectClass: CP3
objectClass: CP4
objectClass: CP5
objectClass: CP6
UNKNLOCDATECS:: FQsJ
UNKNLOCDATEPS:: FgMe
ISTTIMESTAMP:: FgMIDyI7
CSULTIME:: 2022-03-30 11:54:40
CSLOCTIME:: 2022-04-01 04:13:17
PSULTIME:: HgMWDBco
PSLOCTIME:: HgMWDBco
SCHAR:: AgA=
ICS: 1
CAT: 10
DBSG: 1
OFA: 1
SOCB: 1
PWD: 0000
PWDC: 0
SOCFB: 0
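The core of the sed script is regrouping 6-bit base64 values (sextets) into 8-bit bytes (octets). A minimal sketch of that step with shell arithmetic instead of lookup tables may make it easier to follow; b64_index and sextets_to_bytes are names invented for this illustration and do not appear in the script above.

```shell
#!/bin/sh
# b64_index and sextets_to_bytes are hypothetical names used only for this illustration.
alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Print the 6-bit value (0..63) of one base64 character:
# it equals the number of alphabet characters that precede it.
b64_index() {
    prefix=${alphabet%%"$1"*}
    echo "${#prefix}"
}

# Combine four base64 characters (4 x 6 bits) into three bytes (3 x 8 bits),
# printed as decimal numbers.
sextets_to_bytes() {
    n=0
    for c in "$1" "$2" "$3" "$4"; do
        n=$(( n * 64 + $(b64_index "$c") ))
    done
    echo "$(( n >> 16 )) $(( (n >> 8) & 255 )) $(( n & 255 ))"
}

sextets_to_bytes H g M W   # first half of the CSULTIME value: day, month, year
sextets_to_bytes C z Y o   # second half: hour, minute, second
```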
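Answer 3
I figured it out. I was on the right track (using sed; I don't know how to use hexdump from perl); the problem was the characters I needed to escape inside sed.
It now works perfectly, although I have to run two sed commands: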
sed 's/\(CSULTIME::\)\(.*\)/echo -n \1" ";echo \2\| base64 -d \| hexdump -v -e '\''1\/1 "%02d" '\'' \| awk '\''BEGIN {FS = ""} {print "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'\'' /ge '
sed 's/\(CSLOCTIME::\)\(.*\)/echo -n \1" ";echo \2\| base64 -d \| hexdump -v -e '\''1\/1 "%02d" '\'' \| awk '\''BEGIN {FS = ""} {print "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'\'' /ge '
#------------
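You can factor the code out into a bash function, export it, and then call it from GNU sed's s///e, which handles both attributes in a single sed command. Note that export -f is a bash feature, so this presumably only works where the sh that sed invokes for the e flag is bash: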
fx() {
printf '%s' "$2" | base64 -d |
hexdump -v -e '1/1 "%02d" ' |
awk -v temp="$1" 'BEGIN {FS = ""} {print temp "20" $5 $6 "-" $3 $4 "-" $1 $2 " " $7 $8 ":" $9 $10 ":" $11 $12}'
}
export -f fx
sed -E "s/((CSULTIME|CSLOCTIME)::\s*)(\S.*)/fx '\1' '\3'/e" file