Remove duplicate lines from multiple JSON files while preserving file structure

Answer 1

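Assuming your file names contain no spaces or special characters, this should work for you. You may need to adjust the first command to get the sort order you want, since that order decides which file is processed first.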

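#!/bin/bash
# Scratch file that receives each filtered result.
temp=$(mktemp)
# printf emits one name per line so that sort can actually reorder them
# (echo would emit a single line). Relies on names without whitespace.
for file_to_dedupe in $(printf '%s\n' *.json | sort)
do
   for file_to_strip in *.json
   do
      # Never strip a file against itself.
      [ "$file_to_dedupe" == "$file_to_strip" ] && continue
      # -F fixed strings, -x whole-line matches only, -f patterns from file, -v keep non-matching lines.
      grep -x -Ff "$file_to_dedupe" -v "$file_to_strip" > "$temp"
      mv "$temp" "$file_to_strip"
   done
done

Explanation

temp=$(mktemp) creates a temporary file to work with.
for file_to_dedupe in $(printf '%s\n' *.json | sort) starts the loop over the files whose lines are used as the patterns to remove elsewhere.
for file_to_strip in *.json starts the loop over the files that duplicates are removed from.
[ "$file_to_dedupe" == "$file_to_strip" ] && continue skips the file we are currently deduping against.
grep -x -Ff "$file_to_dedupe" -v "$file_to_strip" > "$temp" removes exact duplicates, using each line of file_to_dedupe as a fixed-string pattern. -x restricts matches to whole lines; with -w a pattern would also match as a word inside a longer line.
mv "$temp" "$file_to_strip" puts the new file in place.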
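
To see what a single pass of the inner loop does, here is a minimal sketch that wraps the grep call in a function and runs it on two throwaway files. The function name strip_shared_lines, the scratch directory, and the sample contents are made up for illustration. It assumes that "exact" duplicates means whole-line matches, hence grep -x.

```shell
#!/bin/bash
# Hypothetical helper: remove from "$2" every line that also appears in "$1".
strip_shared_lines() {
   local keep_from=$1 strip_from=$2 tmp status
   tmp=$(mktemp) || return 1
   # -F fixed strings, -x whole lines only, -f read patterns from a file,
   # -v print the lines that do NOT match.
   # grep exits 1 when no line is left over; treat that as success.
   grep -v -x -Ff "$keep_from" "$strip_from" > "$tmp" || [ $? -eq 1 ]
   status=$?
   # Overwrite the contents instead of using mv, so the file keeps its permissions.
   [ "$status" -eq 0 ] && cat "$tmp" > "$strip_from"
   rm -f "$tmp"
   return "$status"
}

# Demo on throwaway files in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' '"id": 1,' '"name": "a",' > "$demo_dir/first.json"
printf '%s\n' '"id": 1,' '"name": "b",' > "$demo_dir/second.json"
strip_shared_lines "$demo_dir/first.json" "$demo_dir/second.json"
cat "$demo_dir/second.json"   # only the "name": "b", line is left
```

Writing the result back with cat rather than mv is a design choice of this sketch: mktemp creates the file with restrictive permissions, and mv would carry those over to the JSON file.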

Answer 2

perl -i.bak -ne 'print $_ unless $a{$_}++ '  *.json

Then delete the .bak files if it worked.
