Question

Answer 1

Try this:

find ./ -not -path "./.git/*" -type f -exec wc -l {} + |
    awk '{print tolower($0)}' |
    sed -e '$ d' | 
    sed -e "s#/.*/##g" |
    sed -e "s/\./ \./g" |
    awk '
        { if ( NF <= 2 ) { count["none"] += $1 } else { count[$NF] += $1 } }
        { next }
        END { for (group in count) printf("%d%s%s\n", count[group], OFS, group) }
    ' |
    sort -n

Breakdown:

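find ./: recursively find objects under this directory
-not -path "./.git/*": exclude .git
-type f: files, not directories
-exec wc -l {} +: run the word count utility (wc) on the files found; with -l it prints one line count per file. The count includes blank lines, so this does not meet all the requirements of the question.
awk '{print tolower($0)}': convert everything to lowercase
sed -e '$ d': delete the last line, which is the total line count across all files
sed -e "s#/.*/##g": delete the path of each file; for example, a/something.egg/blah should count as having no extension, not as having the extension .egg/blah
sed -e "s/\./ \./g": replace every . with a space followed by a ., so that the file extension becomes its own word
awk '{ if ( NF <= 2 ) { count["none"] += $1 } else { count[$NF] += $1 } } { next } END { for (group in count) printf("%d%s%s\n", count[group], OFS, group) }': this is the big one. awk is powerful, but not especially readable:
- count is a dictionary (an associative array)
- if (NF <= 2): if there are fewer than 3 "words" on the line, the file has no extension
- count["none"] += $1: increment the dictionary element whose key is the string literal none by $1, the first word on the line, which is that file's line count
- count[$NF] += $1: increment the dictionary element whose key is $NF (the last word on the line, which is the extension) by $1 (the first word on the line, which is that file's line count)
- { next }: move on to the next input line, so the rules run for every line
- for (group in count): a for loop over the dictionary keys, written inline
- printf(...): format the output as number extension; for example, 123 .abc if the files ending in .abc have 123 lines in total
sort -n: sort the results in ascending order; -n means sort numerically rather than as strings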
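
The breakdown above points out that wc -l also counts blank lines. The following is a minimal sketch of one way to leave them out, not the original author's method. It assumes that "blank" means empty or whitespace-only, and the function name count_nonblank_by_ext and the sample files are hypothetical, made up for this illustration. It lets awk do both the counting and the grouping, so the sed steps are not needed:

```shell
#!/bin/bash
# Hypothetical sample tree, created only for this demonstration.
demo=$(mktemp -d)
printf 'a\n\nb\n' > "$demo/one.py"      # 2 non-blank lines, 1 empty line
printf 'x\n   \n'  > "$demo/two.PY"     # 1 non-blank line, 1 whitespace-only line
printf 'hello\n'   > "$demo/README"     # no extension

count_nonblank_by_ext() {
    # $1: directory to scan; any .git directory is pruned, as in the answer.
    find "$1" -path '*/.git' -prune -o -type f -exec awk '
        FNR == 1 {
            # Work out the extension once per file, from the base name only.
            name = FILENAME
            sub(/.*\//, "", name)
            ext = (name ~ /\./) ? tolower(name) : "none"
            sub(/.*\./, ".", ext)
        }
        NF > 0 { count[ext]++ }          # NF > 0 means the line is not blank
        END { for (e in count) printf("%d %s\n", count[e], e) }
    ' {} + | sort -n
}

result=$(count_nonblank_by_ext "$demo")
echo "$result"
rm -rf "$demo"
```

For the three sample files this prints 1 none followed by 3 .py. As with the original pipeline, a very large tree may cause find to start the command more than once, in which case an extension can appear on more than one output line.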

Answer 2

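If I understood correctly, and my tests are right, I would suggest this (assuming you want to skip hidden directories and files; let me know if that is not the case):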

shopt -s globstar

declare -A arr
for f in test/**; do
  # skip directories
  [[ -d "$f" ]] && continue
  # extract the extension (everything after the last dot)
  ext="${f##*.}"
  # convert it to lowercase
  ext="${ext,,}"
  # if there is no dot in the file name, use "empty" as the extension
  [[ ! $(basename "$f") =~ \. ]] && ext="empty"
  # count the lines; reading from stdin keeps the file name out of wc's output
  lines=$(wc -l < "$f")
  # skip files that have no lines
  [[ $lines -eq 0 ]] && continue
  # add the line count to the running total for this extension
  arr[$ext]=$(( ${arr[$ext]:-0} + lines ))
done

# loop over the array
for n in "${!arr[@]}"; do
  echo "files $n: total lines ${arr[$n]}"
done

Output (from my sample files):

files yaml: total lines 3
files md: total lines 3
files empty: total lines 4
files csv: total lines 6
files py: total lines 5
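
To make the extension logic easier to check, here is a small sketch that applies the same three steps from the loop to a handful of paths. The helper name classify and the paths are hypothetical, chosen only to exercise the edge cases; nothing needs to exist on disk. It assumes bash 4 or later, because of the ${ext,,} expansion.

```shell
#!/bin/bash
# classify: print the extension key that the loop in the answer would use.
classify() {
    local f=$1 ext
    ext="${f##*.}"        # text after the last dot in the whole path
    ext="${ext,,}"        # lowercase (needs bash 4 or later)
    # No dot in the base name means there is no extension at all.
    [[ ! $(basename "$f") =~ \. ]] && ext="empty"
    printf '%s\n' "$ext"
}

# Hypothetical paths, chosen only to exercise the edge cases.
result=$(
    for f in test/notes.MD test/Makefile test/v1.2/data \
             test/archive.tar.gz test/.bashrc; do
        printf '%s -> %s\n' "$f" "$(classify "$f")"
    done
)
echo "$result"
```

This reports md, empty, empty, gz and bashrc, in that order. The last case matters only if hidden files are included: by default the test/** glob does not match names that start with a dot, and with shopt -s dotglob a file such as .bashrc would be grouped under the extension bashrc rather than under empty.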

Answer 3

I broke it up into functions to make it easier to understand:

#!/bin/bash

# For the next two functions, we will use "-print0", which will print out \0 instead of \n.
# This will help prevent whitespace problems when piping the filenames into xargs.

find_extension()
{
    find "$1" -type f -name "*.$2" -print0 2>/dev/null
}

find_no_extension()
{
    find "$1" -type f -regex '^.*/[^.]+$' -print0 2>/dev/null
}

concat_files()
{
    xargs -0 cat
}

delete_empty_lines()
{
    sed -E '/^[[:space:]]*$/d'
}

line_count_of_files()
{
    concat_files | delete_empty_lines | wc -l
}

print_usage()
{
    echo "Usage: $0 [EXTENSION]... [SEARCH_DIRECTORY]";
}

NUMBER_OF_EXTENSIONS=$(($# - 1))
SEARCH_DIR="${*: -1}"

if [ $# -lt 2 ];
then
    echo "Not enough parameters.";
    print_usage;
    exit 1;
fi

if ! [ -d "$SEARCH_DIR" ];
then
    echo "$SEARCH_DIR does not exist, or is not a directory."
    print_usage;
    exit 1;
fi

for EXTENSION in "${@:1:$NUMBER_OF_EXTENSIONS}";
do
    printf '.%s: %s\n' "$EXTENSION" $(find_extension "$SEARCH_DIR" "$EXTENSION" | line_count_of_files)
done


printf "No extension: %s\n" $(find_no_extension "$SEARCH_DIR" | line_count_of_files)

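This is more of a general-purpose script that lets you specify any file extensions to search for. However, it always searches for files with no extension as well.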

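You should save it to a file, give it executable permission, and put it in your PATH. Let's say you named it count_lines.sh. You could then call it like this: count_lines.sh py md yaml ~/Code. This searches the ~/Code directory for files ending in .py, .md, and .yaml, as well as for files with no extension at all. You can search for as many extensions as you like; just make sure there is at least one.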
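
To see what the script's pipeline does without installing it, here is a self-contained sketch that runs the same find, xargs, sed and wc steps inline on a few sample files. The files and the variable names md_lines and noext_lines are hypothetical, created only for this demonstration, and the -regex test is assumed to behave as it does in GNU find.

```shell
#!/bin/bash
# Hypothetical sample files, created only for this demonstration.
demo=$(mktemp -d)
printf 'title\n\nbody\n'  > "$demo/my notes.md"   # space in the name, 1 empty line
printf 'one\n \t\ntwo\n' > "$demo/todo.md"        # 1 whitespace-only line
printf 'plain\n'          > "$demo/README"        # no extension

# Same steps as find_extension piped into line_count_of_files, written inline.
md_lines=$(find "$demo" -type f -name '*.md' -print0 |
    xargs -0 cat | sed -E '/^[[:space:]]*$/d' | wc -l)

# Same steps as find_no_extension: no dot after the last slash in the path.
noext_lines=$(find "$demo" -type f -regex '^.*/[^.]+$' -print0 |
    xargs -0 cat | sed -E '/^[[:space:]]*$/d' | wc -l)

# Arithmetic expansion strips any padding that some wc builds add.
md_lines=$((md_lines))
noext_lines=$((noext_lines))
echo ".md: $md_lines"
echo "No extension: $noext_lines"
rm -rf "$demo"
```

This prints .md: 4 and No extension: 1. The file name containing a space is handled correctly because -print0 and xargs -0 separate names with a NUL byte instead of whitespace, and the empty and whitespace-only lines are dropped by the sed filter before wc counts what is left.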
