Replace a word in a file - UNIX command

How can I replace the "word" key in a file with a counter? The file looks like this:

{"word":"resolucion","count":40723},{"word":"general","count":20976},
{"word":"","count":13334},{"word":"publica","count":12379},
{"word":"direccion","count":11958},{"word":"secretaria","count":9907},
{"word":"al","count":9324},{"word":"orden","count":8604},
{"word":"anuncia","count":8589},{"word":"concurso","count":6953},
{"word":"diciembre","count":6893},{"word":"adjudicacion","count":6762},
{"word":"estado","count":6154},{"word":"procedimiento","count":5694},
{"word":"julio","count":5598},{"word":"marzo","count":5440},
{"word":"-","count":5437},{"word":"convocatoria","count":5319},
{"word":"ayuntamiento","count":5259},{"word":"publico","count":5203},
{"word":"junio","count":4995},{"word":"convenio","count":4925},
{"word":"real","count":4916},{"word":"febrero","count":4896},
{"word":"proyecto","count":4826},{"word":"abierto","count":4782},

The result should look like this:

{"0":"resolucion","count":40723},{"1":"general","count":20976},
{"2":"","count":13334}, {"3":"publica","count":12379},
{"4":"direccion","count":11958},{"5":"secretaria","count":9907},
{"6":"al","count":9324},{"7":"orden","count":8604},
{"8":"anuncia","count":8589},

And so on.

Answer 1

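Here is one possible solution using our friends grep and sed. It is fine for small files; for larger ones a perl (or awk?) solution would be more efficient. This is bash syntax: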

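i=0   # the expected output numbers the entries from 0
# Count every occurrence of the whole word "word" in the file.
maxnum=$(grep -o '\<word\>' datafile | wc -l)
while (( i < maxnum )); do
  # GNU sed: the 0,/re/ address stops at the first matching line, so each
  # call replaces only the first remaining match in the whole file.
  sed -i "0,/\<word\>/s//$i/" datafile
  (( i++ ))
done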

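grep -o counts the total number of occurrences of word in datafile (this tip is from glenn). The only trick here is that sed is not used globally, so each call replaces only the first matching string. That is also why this code is so slow: it calls sed maxnum times.

Note that sed -i modifies your original data file, so make a copy first.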

Answer 2

If it is a JSON file, you can use a scripting language to modify it. For example, if you have NodeJS installed, you can run the following program:

var data = require('./data.json')

console.log(data)

data.forEach(function (obj, idx) { obj[idx] = obj['word']; delete obj.word; });

console.log(data)

I am assuming the file is named "data.json" and that it is valid JSON (yours is not quite: you are missing the wrapping [ ], and there is a spurious , at the end).
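If the file contains only the comma-separated fragment shown in the question, a minimal sketch along the same lines could wrap it in [ ] and drop the trailing comma before parsing. The helper name renumberWords is an illustrative assumption, not part of the answer above, and the sketch assumes every entry has exactly the keys "word" and "count".

```javascript
// Sketch: turn the question's fragment into valid JSON, then rename each
// "word" key to the entry's index. renumberWords is a hypothetical helper.
function renumberWords(fragment) {
  // Drop a trailing comma (and surrounding whitespace), then wrap in [ ].
  const trimmed = fragment.trim().replace(/,\s*$/, '');
  const data = JSON.parse('[' + trimmed + ']');
  return data.map(function (obj, idx) {
    const out = {};
    out[idx] = obj.word;   // the index becomes the key, the word its value
    out.count = obj.count;
    return out;
  });
}

const sample = '{"word":"resolucion","count":40723},{"word":"general","count":20976},';
console.log(JSON.stringify(renumberWords(sample)));
// prints [{"0":"resolucion","count":40723},{"1":"general","count":20976}]
```

For a real file, the fragment could be read with fs.readFileSync and the result written back with fs.writeFileSync; unlike console.log(data), JSON.stringify emits valid JSON rather than Node's object notation.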

Answer 3

This should be quite fast.

perl -0 -ne 's/"word"/q{"} . $x++ . q{"}/ge; print;' INFILE > OUTFILE
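The one-liner relies on the /e flag, which evaluates the replacement as code for every match, so the counter $x advances once per "word" key in a single pass over the file. For comparison with Answer 2, roughly the same idea can be sketched in NodeJS with a replacement callback. The name replaceWithCounter is illustrative, and the sketch assumes the quoted text "word" only ever appears as a key.

```javascript
// Sketch: single-pass textual replacement with a running counter,
// mirroring the perl s///ge approach. replaceWithCounter is hypothetical.
function replaceWithCounter(text) {
  let x = 0;
  // The callback runs once per match, from left to right.
  return text.replace(/"word"/g, function () {
    return '"' + (x++) + '"';
  });
}

const line = '{"word":"al","count":9324},{"word":"orden","count":8604},';
console.log(replaceWithCounter(line));
// prints {"0":"al","count":9324},{"1":"orden","count":8604},
```

Because this works on the raw text, it does not require the file to be valid JSON, which is the same trade-off the perl version makes.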
