Bash - Put the first word of each line into an array

I am trying to put the first word of each sentence into an array.

What exactly am I doing wrong?

#!/bin/bash
file_name=$1
content=$(cat $file_name)
content=${content//"\n"/" "}
content=${content//". "/"\n"}
declare -A arr
cat $content| while read line;
do 
   line=($line)
   word=${line[0]}

if [[ ${arr[$word]} == '' ]] 
then
    arr[$word]=1
else
    let arr[$word]=${arr[$word]}+1
fi

done

Answer 1

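Unless you must do this in pure bash (for pedantic reasons?), use cut to generate the list of "first words" to loop over: for wrd in $(cut "-d " -f1 $file_name ) ; do. Or, if your word list is larger than the limits reported by xargs --no-run-if-empty --show-limits </dev/null, use cut together with xargs.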

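In your existing code, you appear to slurp all the data into content, massage it using only bash "Parameter Expansion" (abusing a bash variable), and then use the result as a file name (cat $content) in order to flatten the data and process it one line at a time. cat treats its arguments as file names, not as text, so that last step cannot work.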
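
If the goal really is to stay in pure bash, the massaging described above can be sketched roughly as follows. This is only an illustration under assumptions: the sample text and the variable names (nl, flat, sentences, words) are made up here, and a sentence is assumed to end with a period followed by a space.

```shell
#!/usr/bin/env bash
# Hypothetical sample text; the variable names are illustrative only.
content=$'One two. Three four\nfive. Six seven.'

nl=$'\n'                      # a real newline character
flat=${content//$nl/ }        # join lines: newline -> space
sentences=${flat//. /$nl}     # split sentences: ". " -> newline

words=()
# The here-string (<<<) feeds the text itself to the loop; no cat, no pipe,
# so the loop runs in the current shell and words survives it.
while IFS= read -r line; do
    read -r first _ <<< "$line"     # first whitespace-separated word
    if [[ -n $first ]]; then
        words+=("$first")
    fi
done <<< "$sentences"

echo "${words[@]}"
```

With the sample text above this prints One Three Six. The key difference from the question's code is that the newline is written as $'\n' rather than "\n", and the text is read through a here-string instead of being passed to cat.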

And, of course, read man cut, man xargs, and man bash.
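
As a rough sketch of the cut approach, the first words can also be loaded straight into an array. The helper name first_words and the sample file are assumptions for illustration, and mapfile requires bash 4 or later.

```shell
#!/usr/bin/env bash
# first_words is a hypothetical helper name, not part of the answer above.
first_words() {
    # -d ' ' makes a single space the delimiter; -f1 keeps the first field.
    # Lines that contain no space are passed through unchanged.
    cut -d ' ' -f1 "$1"
}

# Throwaway sample file so the sketch is self-contained.
sample=$(mktemp)
printf '%s\n' 'alpha beta' 'gamma delta' 'epsilon' > "$sample"

# mapfile (bash 4+) stores one line per array element, with no word splitting.
mapfile -t words < <(first_words "$sample")
rm -f "$sample"

printf '%s\n' "${words[@]}"
```

Note that cut treats every single space as a delimiter, so a line that starts with a space yields an empty first field, and blank lines become empty array elements.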

Answer 2

Create an input file named ~/Downloads/FirstWord.txt containing the following:

Sample data file --> FirstWord.txt

This is a simple little input file used
as to test the bash script FirstWord.sh

There are three paragraphs separated by
one blank line. The output from this file
is "Sample", "This", "as", "There", "one" and "is".

Create a bash script named ~/Downloads/FirstWord.sh containing the following:

#!/bin/bash
file_name=$1
content=$(cat "$file_name")
content="${content//"\n"/" "}"      # <-- "\n" here is a literal backslash + n, not a
content="${content//". "/"\n"}"     #     newline ($'\n'), so line breaks are kept.
echo "$content" > /tmp/ContentFile  # Create temporary file of massaged data.

declare -a arr      # Define regular (indexed) array.

while IFS='' read -r line || [[ -n "$line" ]]; do
#      ^ 1.       ^ 2.          ^ 3.

# 1. IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
# 2. -r prevents backslash escapes from being interpreted.
# 3. || [[ -n $line ]] prevents the last line from being ignored if it doesn't end with a \n (since read returns a non-zero exit code when it encounters EOF).

# ** Above explanation from: https://stackoverflow.com/a/10929511/6929343

    first=${line%% *}   # Extract first word from line.
    arr+=($first)       # Assign $first to next available array entry.

done < /tmp/ContentFile # <-- Redirect (no pipe) so the loop runs in this shell and arr persists.

echo ${arr[@]}          # Dump array contents to screen.
rm -f /tmp/ContentFile  # Remove temporary file.

# +============================================================================+
# | Original program with comments below                                       |
# +----------------------------------------------------------------------------+

# declare -A arr                        <-- We don't need associative array.

# cat $content| while read line;        <-- $content holds text, not a file name;
# do                                    <-- the pipe also runs the loop in a subshell.
    # line=($line) <--- Word-splits the line into an array of words.
    # word=${line[0]} < First element, i.e. the first word of the line.

    # if [[ ${arr[$word]} == '' ]]      <-- I've indented for readability.
    # then
    #     arr[$word]=1                  <-- Counts occurrences keyed by word; not
    # else                                  needed if we only want a list of words.
    #     let arr[$word]=${arr[$word]}+1
    # fi
# done                                  <-- Where input to while loop should be.

Save both files, then enter the following in the terminal:

~$ cd Downloads
~/Downloads$ chmod a+x FirstWord.sh
~/Downloads$ ./FirstWord.sh FirstWord.txt
Sample This as There one is
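
The question's code uses declare -A with a counter, which suggests the aim may have been to count how often each first word occurs. If so, a minimal sketch could look like the following. The helper name count_first_words, the counts array, and the sample file are assumptions for illustration, not part of the script above.

```shell
#!/usr/bin/env bash
# Associative array: key = first word, value = number of lines starting with it.
declare -A counts=()

# count_first_words is a hypothetical helper; it fills the global counts array.
count_first_words() {
    local line first
    while IFS= read -r line || [[ -n $line ]]; do
        read -r first _ <<< "$line"         # first whitespace-separated word
        if [[ -n $first ]]; then            # skip blank lines
            counts[$first]=$(( ${counts[$first]:-0} + 1 ))
        fi
    done < "$1"                             # redirect, not a pipe: no subshell
}

# Throwaway sample file so the sketch is self-contained.
sample=$(mktemp)
printf '%s\n' 'This is one' '' 'That is two' 'This is three' > "$sample"
count_first_words "$sample"
rm -f "$sample"

# Iteration order of an associative array is unspecified.
for word in "${!counts[@]}"; do
    printf '%s %d\n' "$word" "${counts[$word]}"
done
```

Because the loop reads from a redirect rather than a pipe, counts is still populated after the function returns, which is what the piped loop in the question cannot achieve.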
