Hexdump of a string, starting at new lines?

Question 1

Here is one possibility, a compact solution that uses read's ability to limit the number of characters read:

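c=0                                   # running offset
while IFS= read -n16 -r line          # at most 16 characters, stopping early at a newline
do
  len=${#line}
  # a short chunk means read stopped at a newline, so put it back
  ((len<16)) && { ((len++)) ; line+=$'\n' ;}
  printf "%08x  " "$c"
  for ((i=0; i<len; i++))
  do  printf " %02x" "'${line:i:1}"   # "'X" makes printf emit the code of X
  done
  printf " %*s %s\n" $((50-3*len)) "" "'${line//[^[:print:]]/.}'"
  ((c+=len))
done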
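
One way to package the loop, as a sketch rather than the answer's own code. The function name hexlines is hypothetical, it needs bash (read -n is not POSIX), and tracking read's exit status in eof is an added assumption: the loop as written stops as soon as read hits end of input, so a final line without a trailing newline would not be printed.

```shell
#!/usr/bin/env bash
# hexlines is a hypothetical name; the body follows the loop above.
hexlines() {
  local c=0 line len i eof
  while true; do
    IFS= read -r -n16 line; eof=$?
    # Stop once read fails and left nothing behind.
    ((eof != 0)) && [[ -z $line ]] && break
    len=${#line}
    # A short chunk from a successful read was ended by a newline: restore it.
    if ((len < 16 && eof == 0)); then
      line+=$'\n'
      len=$((len + 1))
    fi
    printf '%08x  ' "$c"
    for ((i = 0; i < len; i++)); do
      printf ' %02x' "'${line:i:1}"
    done
    printf ' %*s %s\n' $((50 - 3 * len)) '' "'${line//[^[:print:]]/.}'"
    c=$((c + len))
    # An unterminated last chunk has now been printed; nothing more to read.
    ((eof != 0)) && break
  done
}

# Second line has no trailing newline and is still shown.
printf 'ab\ncd' | hexlines
```

Like the original loop, this works on characters rather than bytes, so NUL bytes and multi-byte characters are not represented faithfully.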

Question 2

Well, there's printf...

hex_split()(    unset c dump slice rad pend
        _get(){ dd bs=1024 count=1; echo .; } 2>/dev/null
        _buf()  case $((${#dump}>0)):$((${#slice}>0)) in
                (0:*)   dump=$(_get); dump=${dump%.}
                        [ -n "$dump" ] || [ -n "$slice" ];;
                (*:0)   [ "${#dump}" -lt 16 ]       &&
                        slice=${dump:-$slice} dump= && return
                        slice=${dump%"${dump#$q}"} dump=${dump#$q};;esac
        _out(){ printf "%08x%02.0s" "$rad" "$((rad+=$#/2))"
                printf "%02x %.0s" "$@"
                printf "%-$(((16-($#/2))*3))s"
                printf "%.0s%.1s" '' ' ' '' \| "$@" '' \| '' "$nl"
};      q=$(printf %016s|tr \  \?) ; IFS=\  nl='
'       rad=0 c=0 split=${split:-$nl} slice="$*"; set --
        while   [ -n "$slice" ] || _buf || ! ${1:+"_out"} "$@" &&
                c=${slice%"${slice#?}"} slice=${slice#?}                
        do      set "$@" "'$c" "${c#[![:print:]]}."
                case $#$c in    (32*|*$split)   _out "$@"; set --;;esac
        done
)

You can hand it stdin, arguments, or both. So...

echo "something
is
being
written
here" | hex_split something else besides

...the above prints...

00000000  73 6f 6d 65 74 68 69 6e 67 20 65 6c 73 65 20 62  |something else b|
00000010  65 73 69 64 65 73 00 73 6f 6d 65 74 68 69 6e 67  |esides.something|
00000020  0a                                               |.|
00000021  69 73 0a                                         |is.|
00000024  62 65 69 6e 67 0a                                |being.|
0000002a  77 72 69 74 74 65 6e 0a                          |written.|
00000032  68 65 72 65 0a                                   |here.|

To change the default split character, do something like...

split=${somechar} hex_split
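
For anyone untangling hex_split: it leans on two printf behaviors. An argument that begins with a quote is converted to the numeric code of the character that follows, and the format is reused until all arguments are consumed, so %.0s can swallow every second argument. A minimal sketch of just those two behaviors (char_hex is a hypothetical helper, not part of the function above):

```shell
# char_hex is a hypothetical helper used only for illustration.
char_hex() {
  # A leading quote makes printf use the numeric code of the next character.
  printf '%02x\n' "'$1"
}

char_hex A   # prints 41

# The format is reused for each pair of arguments; %.0s prints nothing,
# so the second argument of every pair is silently consumed.
printf '%02x %.0s' "'a" skipped "'b" skipped   # prints "61 62 "
echo
```

This is why hex_split can queue each character twice, once as a quoted code and once as its printable form, and then print the hex column and the text column from the same argument list.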

Question 3

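I needed this to compare two files with a difftool while still being able to see which non-printable characters differ.

The function adds a -n option to hexdump. If -n is specified, the output is split at newline characters; if not, the normal hexdump is called. In contrast to @Janis's answer, this is not a complete rewrite of hexdump: it calls hexdump with the other specified arguments, if any are given. hexdump is fed the input line by line using head, together with the skip option -s so that the offsets are preserved. The function works when piped to as well as when a file is specified. It does not, however, work with multiple specified files the way hexdump does.

I wanted this to be a simpler/shorter alternative answer, but guarding against all these input edge cases actually made it longer.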

hexdump()
{
    # introduces artificial line breaks in hexdump output at newline characters
    # might be useful for comparing files linewise, but still be able to
    # see the differences in non-printable characters utilizing hexdump
    # first argument must be -n else normal hexdump will be used
    local isTmpFile=0
    if [ "$1" != '-n' ]; then command hexdump "$@"; else
        if [ -p /dev/stdin ]; then
            local file="$( mktemp )" args=( "${@:2}" )
            isTmpFile=1
            cat > "$file" # save pipe to temporary file
        else
            local file="${@: -1}" args=( "${@:2:$#-2}" )
        fi
        # sed doesn't seem to work on file descriptors for some very weird reason,
        # the linelength will always be zero, so check for that, too ...
        local readfile="$( readlink -- "$file" )"
        if [ -n "$readfile" ]; then 
            # e.g. readlink might return pipe:[123456]
            if [ "${readfile::1}" != '/' ]; then 
                readfile="$( mktemp )"
                isTmpFile=1
                cat "$file" > "$readfile"
                file="$readfile"
            else
                file="$readfile"
            fi
        fi
        # we can't use read here else \x00 in the file gets ignored.
        # Plus read will ignore the last line if it does not have a \n!
        # Unfortunately using sed '<linenumber>p' prints an additional \n
        # on the last line, if it wasn't there, but I guess still better than
        # ignoring it ...
        local linelength offset nBytes="$( cat "$file" | wc -c )" line=1
        for (( offset = 0; offset < nBytes; )); do
            linelength=$( sed -n "$line{p;q}" -- "$file" | wc -c )
            (( ++line ))
            head -c $(( offset + $linelength )) -- "$file" | 
            command hexdump -s $offset "${args[@]}" | sed '$d'
            (( offset += $linelength ))
        done
        # Hexdump displays a last empty line by default showing the
        # file size, but we delete this line in the loop using sed
        # Now insert this last empty line by letting hexdump skip all input
        head -c $offset -- "$file" | command hexdump -s $offset "${args[@]}"
        if [ "$isTmpFile" -eq 1 ]; then rm "$file"; fi
    fi
}

You can try it with echo -e "test\nbbb\nomg\n" | hexdump -n -C, which prints:

00000000  74 65 73 74 0a                                    |test.|
00000005  62 62 62 0a                                       |bbb.|
00000009  6f 6d 67 0a                                       |omg.|
0000000d  0a                                                |.|
0000000e

As a bonus, here is my hexdiff function:

hexdiff()
{
    # compares two files linewise in their hexadecimal representation
    # create temporary files, because else the two 'hexdump -n' calls
    # get executed multiple times alternatingly when using named pipes:
    # colordiff <( hexdump -n -C "${@: -2:1}" ) <( hexdump -n -C "${@: -1:1}" )
    local a="$( mktemp )" b="$( mktemp )"
    hexdump -n -C "${@: -2:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$a"
    hexdump -n -C "${@: -1:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$b"
    colordiff "$a" "$b"
    rm "$a" "$b"
}

For example, test it with hexdiff <( printf "test\nbbb\x00 \nomg\nbar" ) <( printf "test\nbbb\nomg\nfoo" ), which prints:

2c2
< 62 62 62 11 20 0a                                 |bbb. .|
---
> 62 62 62 0a                                       |bbb.|
4,5c4,5
< 62 61 72                                          |bar|
< 00000012
---
> 0c 6f 6f                                          |.oo|
> 00000010

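Edit: OK, this function is not suitable for large files of, say, 8MB, and tools like comparehex or dhex are not good enough either, because they ignore newlines and therefore cannot match up the differences well. Using a combination of od and sed is much faster: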

hexlinedump()
{
    local nChars=$1 file=$2
    paste -d$'\n' -- <( od -w$( cat -- "$file" | wc -c ) -tx1 -v -An -- "$file" |
        sed 's| 0a| 0a\n|g' | sed -r 's|(.{'"$(( 3*nChars ))"'})|\1\n|g' |
        sed '/^ *$/d' ) <(
    # need to delete empty lines, because 0a might be at the end of a char
    # boundary, so that not only 0a, but also the character limit introduces
    # a line break
    sed -r 's|(.{'"$nChars"'})|\1\n|g' -- "$file" | sed -r 's|(.)| \1 |g' )
}

hexdiff()
{
    colordiff <( hexlinedump 16 "${@: -2:1}" ) <( hexlinedump 16 "${@: -1:1}" )
}
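
To see what the od and sed stage of hexlinedump contributes on its own, here is a reduced sketch. od_lines is a hypothetical name, -w is a GNU od extension, \n in the sed replacement relies on GNU sed, and the fixed width of 4096 is an assumption that only suits short inputs (the function above sizes the width from the file instead):

```shell
# od_lines is a hypothetical name; it reads stdin and assumes GNU od and GNU sed.
od_lines() {
  # One long row of hex bytes, no offsets (-An), no duplicate suppression (-v).
  od -An -v -tx1 -w4096 |
    # Break the row after every newline byte.
    sed 's| 0a| 0a\n|g' |
    # Drop the empty line left behind when the input ends in a newline.
    sed '/^ *$/d'
}

printf 'ab\ncd\n' | od_lines
# prints:
#  61 62 0a
#  63 64 0a
```

Each output line then holds the hex bytes of one input line, which is what lets diff align changes line by line.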

Answer