Sed/awk/perl: reverse the order of comma-separated values, preserving the other text

I have text like this:

LABEL1
    .BYTE 01, 02, 03, 04, 05
    .BYTE 01, 02, 03

I only need to reverse the order of the comma-separated values:

LABEL1
    .BYTE 05, 04, 03, 02, 01
    .BYTE 03, 02, 01

This is what I need to process:

ITINERARY_ARRAY_01
    .BYTE <ITINERARY_00A
    .BYTE <ITINERARY_01A
    .BYTE <ITINERARY_02A
    .BYTE <ITINERARY_03A
    .BYTE <ITINERARY_04A
    .BYTE <ITINERARY_05A
    .BYTE <ITINERARY_06A
    .BYTE <ITINERARY_07A
    .BYTE <ITINERARY_08A
    .BYTE <ITINERARY_09A
    .BYTE <ITINERARY_10A
    .BYTE <ITINERARY_11A
    .BYTE <ITINERARY_12A
    .BYTE <ITINERARY_13A
    .BYTE <ITINERARY_14A
;-------------------
ITINERARY_01E
    .BYTE $03, $05, $07, $00
;-------------------
ITINERARY_01F
    .BYTE $03, $05, $07, $09, $00
;-------------------
ITINERARY_01G
    .BYTE $28, $0D, $00
;-------------------
ITINERARY_01H
    .BYTE $28, $0D, $0F, $13, $00
;-------------------
ITINERARY_01I
    .BYTE $28, $0D, $0F, $11, $00
;-------------------
ITINERARY_01J
    .BYTE $03, $05, $07, $09, $20, $1E, $00
;-------------------
ITINERARY_01K
    .BYTE $28, $0D, $0F, $13, $15, $00
;-------------------
ITINERARY_01L
    .BYTE $03, $05, $07, $09, $20, $1E, $1C, $27
    .BYTE $00
;---------------------

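Nothing needs to change except the values after ".BYTE": they must be put in reverse order, and they are in hexadecimal format with "$" as a prefix... Sorry for this edit, but I only noticed this just now. Thanks again!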

Answer 1

Here is how I did it with sed:

sed '/,/!b                                                   
s/\( *[^ ]*\)\(.*\)/\2,\n\1/;:t
s/\([^,]*,\)\(\n.*\)/\2\1/;tt
s/\n\(.*\),/\1/' <<\DATA
LABEL1
    .BYTE 01, 02, 03, 04, 05
    .BYTE 01, 02, 03        
LABEL1
    .BYTE 01, 02, 03, 04, 05
    .BYTE 01, 02, 03
DATA

Output:

LABEL1
    .BYTE 05, 04, 03, 02, 01 
    .BYTE 03, 02, 01 
LABEL1
    .BYTE 05, 04, 03, 02, 01 
    .BYTE 03, 02, 01 

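It checks whether the current line contains a comma. If there is no (!) comma, sed branches (b) out of the script and auto-prints the line. If the line does contain a comma, sed does the following: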

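  1. It first prepares the line with an s/// substitution that replaces:
    • \( *[^ ]*\) - the first occurring sequence of zero or more spaces followed by a sequence of zero or more non-space characters, referenced as \1, immediately followed by...
    • \(.*\) - everything else on the line, referenced as \2...
    • ...with \2,\n\1
    • Note - using the \n escape like this in the right-hand replacement field of s/// is not entirely portable. For a sed that does not support it, the same effect can be had by replacing the n in that statement with a literal newline.
  2. :t defines a branch/test label named t.
  3. For as long as it can, sed s///ubstitutes:
    • \([^,]*,\) - a sequence of zero or more non-comma characters followed by a single comma, referenced as \1, immediately followed by...
    • \(\n.*\) - a sequence that begins with a \newline character followed by anything and everything remaining in pattern space, referenced as \2...
    • ...with \2\1.
  4. If the previous s/// substitution tests successful, sed branches back to the :t label and tries again.
  5. Last, sed does a little cleanup and replaces:
    • \n\(.*\), - the first occurring \newline character and the last occurring comma...
    • \1 - ...with everything in between.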
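The portability note in step 1 can be sketched as follows. This is only an illustration, not the answer's original script: the function name reverse_csv is hypothetical, the \n in the replacement is swapped for a backslash followed by a literal newline, and the label and branch commands are put on their own lines, which is an assumption about what stricter sed implementations prefer. It has not been tested against every sed.

```shell
# reverse_csv: hypothetical wrapper around the answer's sed script.
# The replacement side uses backslash + literal newline instead of \n.
reverse_csv() {
    sed '/,/!b
s/\( *[^ ]*\)\(.*\)/\2,\
\1/
:t
s/\([^,]*,\)\(\n.*\)/\2\1/
tt
s/\n\(.*\),/\1/' "$@"
}
```

It reads standard input when called without arguments, for example `reverse_csv <input`, or takes file names like sed itself.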

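As sed substitutes recursively, the \newline delimiter walks backward one comma-separated field at a time. Once no comma is left in front of the \newline, that is, once it is the first character on the line, the substitution stops. Here is a look (via sed's l command) at how the recursive substitution progresses: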

 01, 02, 03, 04, 05,\n    .BYTE$
 01, 02, 03, 04,\n    .BYTE 05,$
 01, 02, 03,\n    .BYTE 05, 04,$
 01, 02,\n    .BYTE 05, 04, 03,$
 01,\n    .BYTE 05, 04, 03, 02,$
\n    .BYTE 05, 04, 03, 02, 01,$

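After the initial preparatory substitution, sed delimits on nothing but the commas and the inserted \newline character, so any kind of comma-separated values works fine. Here is the output from running it on your longer sample: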

ITINERARY_ARRAY_01
    .BYTE <ITINERARY_00A
    .BYTE <ITINERARY_01A
    .BYTE <ITINERARY_02A
    .BYTE <ITINERARY_03A
    .BYTE <ITINERARY_04A
    .BYTE <ITINERARY_05A
    .BYTE <ITINERARY_06A
    .BYTE <ITINERARY_07A
    .BYTE <ITINERARY_08A
    .BYTE <ITINERARY_09A
    .BYTE <ITINERARY_10A
    .BYTE <ITINERARY_11A
    .BYTE <ITINERARY_12A
    .BYTE <ITINERARY_13A
    .BYTE <ITINERARY_14A
;-------------------
ITINERARY_01E
    .BYTE $00, $07, $05, $03 
;-------------------
ITINERARY_01F
    .BYTE $00, $09, $07, $05, $03 
;-------------------
ITINERARY_01G
    .BYTE $00, $0D, $28 
;-------------------
ITINERARY_01H
    .BYTE $00, $13, $0F, $0D, $28 
;-------------------
ITINERARY_01I
    .BYTE $00, $11, $0F, $0D, $28 
;-------------------
ITINERARY_01J
    .BYTE $00, $1E, $20, $09, $07, $05, $03 
;-------------------
ITINERARY_01K
    .BYTE $00, $15, $13, $0F, $0D, $28 
;-------------------
ITINERARY_01L
    .BYTE $27, $1C, $1E, $20, $09, $07, $05, $03
    .BYTE $00
;---------------------

Answer 2

File revbytes2.awk:

#!/usr/bin/awk -f
BEGIN {
        FS=",? +"
}
NF>2 && match($0,"^ +\.BYTE ") {
        printf "%s", substr($0,1,RSTART+RLENGTH-1)
        for(i=NF;i>3;i--) printf "%s, ", $i
        print $3
        next
}
1

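FS=",? +" makes awk treat both the spaces after .BYTE and the comma-plus-spaces sequences between the bytes as field separators.

The script looks for lines that have more than 2 fields and that begin with spaces followed by .BYTE and a space. As a side effect of the match(...) expression, the start and length of this prefix are remembered in RSTART and RLENGTH.

If this match is found and there are more than 2 fields, the prefix is cut from the original line using RSTART and RLENGTH, and then the remaining fields are printed in reverse order.

If the spaces-plus-.BYTE-plus-space prefix is not found, or there are not more than 2 fields, the line is printed as is. A .BYTE line that defines only one byte also comes out unchanged, because there is nothing to reverse.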
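To see why the script starts counting values at $3, it can help to print the fields awk produces with this FS. The helper below is a small sketch for inspection only; the name show_fields is made up for this illustration and is not part of the answer's script.

```shell
# show_fields: hypothetical helper that prints each field awk sees
# when FS is the regex ",? +" (optional comma, then one or more spaces).
show_fields() {
    awk 'BEGIN { FS = ",? +" }
{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}
```

Because FS is a regular expression rather than a single space, the leading indentation counts as a separator and leaves $1 empty. That puts .BYTE in $2 and the first value in $3, which is why the loop in revbytes2.awk stops above 3 and $3 is printed last.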

Test run:

$ diff -u$(wc -l <input) input <(awk -f revbytes2.awk input)
--- input       2014-10-19 06:04:48.280714146 +0200
+++ /dev/fd/63  2014-10-19 22:40:01.385538235 +0200
@@ -1,42 +1,42 @@
 ITINERARY_ARRAY_01
     .BYTE <ITINERARY_00A
     .BYTE <ITINERARY_01A
     .BYTE <ITINERARY_02A
     .BYTE <ITINERARY_03A
     .BYTE <ITINERARY_04A
     .BYTE <ITINERARY_05A
     .BYTE <ITINERARY_06A
     .BYTE <ITINERARY_07A
     .BYTE <ITINERARY_08A
     .BYTE <ITINERARY_09A
     .BYTE <ITINERARY_10A
     .BYTE <ITINERARY_11A
     .BYTE <ITINERARY_12A
     .BYTE <ITINERARY_13A
     .BYTE <ITINERARY_14A
 ;-------------------
 ITINERARY_01E
-    .BYTE $03, $05, $07, $00
+    .BYTE $00, $07, $05, $03
 ;-------------------
 ITINERARY_01F
-    .BYTE $03, $05, $07, $09, $00
+    .BYTE $00, $09, $07, $05, $03
 ;-------------------
 ITINERARY_01G
-    .BYTE $28, $0D, $00
+    .BYTE $00, $0D, $28
 ;-------------------
 ITINERARY_01H
-    .BYTE $28, $0D, $0F, $13, $00
+    .BYTE $00, $13, $0F, $0D, $28
 ;-------------------
 ITINERARY_01I
-    .BYTE $28, $0D, $0F, $11, $00
+    .BYTE $00, $11, $0F, $0D, $28
 ;-------------------
 ITINERARY_01J
-    .BYTE $03, $05, $07, $09, $20, $1E, $00
+    .BYTE $00, $1E, $20, $09, $07, $05, $03
 ;-------------------
 ITINERARY_01K
-    .BYTE $28, $0D, $0F, $13, $15, $00
+    .BYTE $00, $15, $13, $0F, $0D, $28
 ;-------------------
 ITINERARY_01L
-    .BYTE $03, $05, $07, $09, $20, $1E, $1C, $27
+    .BYTE $27, $1C, $1E, $20, $09, $07, $05, $03
     .BYTE $00
 ;---------------------

Comparing mawk and gawk output:

$ diff <(mawk -f revbytes2.awk input) <(gawk -f revbytes2.awk input)
gawk: revbytes2.awk:5: warning: escape sequence `\.' treated as plain `.'

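Apparently there is no difference on standard output. Good!

The warning goes away if you write "^ +\056BYTE " instead of "^ +\.BYTE " inside the match(...) expression.

Perhaps someone who uses gawk more often knows a better way to avoid the warning.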
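One common way to avoid the warning is to pass match() a regex constant (/.../) instead of a string, since \. inside a regex constant is not subject to string escape processing. Doubling the backslash in the string form ("^ +\\.BYTE ") should have the same effect. The sketch below wraps that variant in a shell function; the name revbytes is hypothetical, and the printf calls use explicit "%s" formats so that a stray % in the data is not treated as a format specifier.

```shell
# revbytes: hypothetical wrapper; same logic as revbytes2.awk, but
# match() gets a regex constant, so no string escape is involved.
revbytes() {
    awk 'BEGIN { FS = ",? +" }
NF > 2 && match($0, /^ +\.BYTE /) {
    # print the "    .BYTE " prefix exactly as found on the line
    printf "%s", substr($0, 1, RSTART + RLENGTH - 1)
    # fields NF down to 4, each followed by ", "; $3 closes the line
    for (i = NF; i > 3; i--) printf "%s, ", $i
    print $3
    next
}
1' "$@"
}
```

Usage mirrors the test run above, for example `revbytes input`.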

Answer 3

I would do it like this:

perl -MTie::File -e'
    tie @lines,"Tie::File","your_file";
    for(@lines){
        next unless /,/; # Skip lines with no commas
        ($csv) = /(\s*[^,\s]+,.*)/; # list context, so $csv gets the captured text
        $new_csv = join ",",reverse split /,/,$csv;
        s/\Q$csv/$new_csv/;
    }'

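Disclaimer!

This modifies your file in place. If that is not what you want, work on a scratch copy of the file.

A version that does not modify the original file: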

perl -pe'
        next unless /,/; # Skip lines with no commas
        chomp;
        ($csv) = /(\s*[^,\s]+,.*)/; # list context, so $csv gets the captured text
        $new_csv = join ",",reverse split /,/,$csv;
        $new_csv .= "\n"; # The newline removed by chomp
        s/\Q$csv/$new_csv/;
    ' your_file

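Assumptions

  • You do not care about the spacing around the commas.
  • The first CSV value is offset from .BYTE by at least one space.
  • By "reverse the order" you mean reversing the order in which the values are found in the file, not sorting them in descending numeric order.

Input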

ITINERARY_ARRAY_01
    .BYTE <ITINERARY_00A
    .BYTE <ITINERARY_01A
    .BYTE <ITINERARY_02A
    .BYTE <ITINERARY_03A
    .BYTE <ITINERARY_04A
    .BYTE <ITINERARY_05A
    .BYTE <ITINERARY_06A
    .BYTE <ITINERARY_07A
    .BYTE <ITINERARY_08A
    .BYTE <ITINERARY_09A
    .BYTE <ITINERARY_10A
    .BYTE <ITINERARY_11A
    .BYTE <ITINERARY_12A
    .BYTE <ITINERARY_13A
    .BYTE <ITINERARY_14A
;-------------------
ITINERARY_01E
    .BYTE $03, $05, $07, $00
;-------------------
ITINERARY_01F
    .BYTE $03, $05, $07, $09, $00
;-------------------
ITINERARY_01G
    .BYTE $28, $0D, $00
;-------------------
ITINERARY_01H
    .BYTE $28, $0D, $0F, $13, $00
;-------------------
ITINERARY_01I
    .BYTE $28, $0D, $0F, $11, $00
;-------------------
ITINERARY_01J
    .BYTE $03, $05, $07, $09, $20, $1E, $00
;-------------------
ITINERARY_01K
    .BYTE $28, $0D, $0F, $13, $15, $00
;-------------------
ITINERARY_01L
    .BYTE $03, $05, $07, $09, $20, $1E, $1C, $27
    .BYTE $00
;---------------------

Output

ITINERARY_ARRAY_01
    .BYTE <ITINERARY_00A
    .BYTE <ITINERARY_01A
    .BYTE <ITINERARY_02A
    .BYTE <ITINERARY_03A
    .BYTE <ITINERARY_04A
    .BYTE <ITINERARY_05A
    .BYTE <ITINERARY_06A
    .BYTE <ITINERARY_07A
    .BYTE <ITINERARY_08A
    .BYTE <ITINERARY_09A
    .BYTE <ITINERARY_10A
    .BYTE <ITINERARY_11A
    .BYTE <ITINERARY_12A
    .BYTE <ITINERARY_13A
    .BYTE <ITINERARY_14A
;-------------------
ITINERARY_01E
    .BYTE $00, $07, $05, $03
;-------------------
ITINERARY_01F
    .BYTE $00, $09, $07, $05, $03
;-------------------
ITINERARY_01G
    .BYTE $00, $0D, $28
;-------------------
ITINERARY_01H
    .BYTE $00, $13, $0F, $0D, $28
;-------------------
ITINERARY_01I
    .BYTE $00, $11, $0F, $0D, $28
;-------------------
ITINERARY_01J
    .BYTE $00, $1E, $20, $09, $07, $05, $03
;-------------------
ITINERARY_01K
    .BYTE $00, $15, $13, $0F, $0D, $28
;-------------------
ITINERARY_01L
    .BYTE $27, $1C, $1E, $20, $09, $07, $05, $03
    .BYTE $00
;---------------------

Answer 4

Given your input, you can use perl:

$ perl -MText::Tabs -anle '
    BEGIN {$tabstop = 4};
    print and next if /^\S/;
    @nums = grep { $_ =~ /\d+/ } @F;
    map { s/\D//g } @nums;
    map { $_ = (pop @nums) . (@nums==0 ? "" : ",")
        if $_ =~ /\d+/ } @F;
    print expand "\t@F";
' file
LABEL1
    .BYTE 05, 04, 03, 02, 01
    .BYTE 03, 02, 01

I assume your original input is sorted. If it is not, you can use @nums = sort { $a <=> $b } grep { $_ =~ /\d+/ } @F; instead of @nums = grep { $_ =~ /\d+/ } @F;
