I have a .csv file containing two "columns" of data, like this:
test1.ts.meta,Before Sunrise (1995)
test2.ts.meta,A Beautiful Mind (2001)
test3.ts.meta,Departures (2008)
test4.ts.meta,Love & Other Drugs (2010)
I am trying to use this command to replace line 2 of each .ts.meta file with the corresponding movie name...
cat 1TBMovie2_dotTSdotMeta.txt | while IFS=, read file moviename; do sed "2 s/^.*$/$moviename/" "$file"; done
It works fine, except when the movie name contains an ampersand (&).
For example, with the movie name Love & Other Drugs (2010), line 2 of the .ts.meta file ends up as:
Love Love Love & Other Drugs Other Drugs (2010) Other Drugs (2010)
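Similarly, the movie name Love & Mercy (2015) appears in the .ts.meta file as: Love Love Love & Mercy Mercy (2015) Mercy (2015).
What is confusing is that if I open the .ts.meta file for Love & Mercy (2015), manually delete line 2, save it, and then run the command above again, I get this on line 2: Love  Mercy (2015), with two spaces between "Love" and "Mercy".
I think I need to enclose the $moviename variable in double quotes, as I did with the $file variable? I am guessing sed treats the & character as having a special meaning?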
Here is some more information to clarify the problem.
My csv file (which I actually call updatemeta.txt):
test1.ts.meta,Carols from King's (2013)
test2.ts.meta,Before Sunrise (1995)
test3.ts.meta,Love & Other Drugs (2010)
test4.ts.meta,Departures (2008)
test1.ts.meta
1:0:19:1B1C:802:2:11A0000:0:0:0:
Carols from King's
The traditional Christmas carol concert from King's College Chapel, Cambridge. Stephen Cleobury conducts the famous chapel choir in carols old and new. [HD] [S]
1387969020
448066800
2913369072
f:0,c:00157c,c:01157e,c:02157f,c:03157c,c:050001
188
0
test2.ts.meta
1:0:1:189E:7FD:2:11A0000:0:0:0:
Before Sunrise
Romance starring Julie Delpy and Ethan Hawke. Two twentysomethings meet on a train and decide to spend a few hours together. Contains some strong language. Also in HD. [1995] [AD,S]
1392418980
550744512
2637755808
f:0,c:0013ec,c:0113ed,c:0213ef,c:0313ec
188
0
test3.ts.meta
1:0:1:2404:7F9:2:11A0000:0:0:0:
Love & Other Drugs
(2010) Fact-based adult comedy. Jake Gyllenhaal stars as a successful Viagra salesman who falls for a woman with Parkinson's (Anne Hathaway). Strong language/sexual scenes. [AD,S]
1472775840
712401799
2824257448
f:0,c:000931,c:010932,c:020934,c:030931
188
0
test4.ts.meta
1:0:1:2404:7F9:2:11A0000:0:0:0:
Departures
(2008) An Oscar-winning, whimsical look at the Japanese undertaking profession. Masahiro Motoki stars as a musician starting a new career preparing the dead for burial. Japanese/subs.
1400111580
863881200
3699150040
f:0,c:000931,c:010932,c:020934,c:030931
188
0
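I will put the .csv file in the same directory as many .ts.meta files. For every .ts.meta file in the directory, the .csv file will contain a line with the corresponding movie name.
How can I use sed, awk, or gawk to create a script that loops through each line of the .csv file and replaces the second line of the named .ts.meta file with the corresponding movie name from the .csv file?
I tried the examples given in the solutions below, but I don't understand what is going on!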
Thank you,
Flex
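Answer 1
Don't write shell loops just to manipulate text; see "Why is using a shell loop to process text considered bad practice?". When you want to work with literal strings, use a tool such as awk that understands literal strings rather than one such as sed that does not.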
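As a small illustration of the literal-string point (a sketch for a single file only, not the full solution): the helper name set_line2 and the environment variable TITLE are made up for this example. The title is handed to awk through ENVIRON rather than -v, because awk expands backslash escapes in -v assignments. It is then assigned to $0 as plain text, so &, / and \1 carry no special meaning.

```shell
#!/bin/sh
# set_line2 FILE TITLE: hypothetical helper that replaces line 2 of FILE with TITLE.
set_line2() {
    tmp=$(mktemp) || return 1
    # ENVIRON passes the title to awk as a plain string: nothing in it is
    # treated as a regex or as a replacement metacharacter.
    TITLE="$2" awk 'FNR == 2 { $0 = ENVIRON["TITLE"] } { print }' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It would be called as set_line2 test3.ts.meta 'Love & Other Drugs (2010)'.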
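You didn't provide any .ts.meta files for us to test with, so this is obviously untested, but something like this will do the job, using GNU awk for -i inplace (assuming you want to modify the original files) and ARGIND: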
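awk -i inplace -F',' '
# First file (the csv): remember each title and queue its .meta file for processing
NR == FNR {
    titles[ARGC] = $2      # title, keyed by the ARGV slot the file is about to take
    ARGV[ARGC++] = $1      # append the .meta file name to the argument list
}
# Later files: replace line 2 with the title stored for the current ARGV slot
(NR != FNR) && (FNR == 2) {
    $0 = titles[ARGIND]
}
{ print }                  # write every line back (inplace rewrites each file)
' 1TBMovie2_dotTSdotMeta.txt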
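If you really want to try doing this with sed (don't!), see "Is it possible to escape regex metacharacters reliably with sed", and note that & is not the only character you need to worry about; for example, / and \1 would also need handling.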
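To see why the ampersand matters, here is a short sketch that applies the substitution from the question to a single line on stdin instead of a file. The function name naive_replace is made up for the example. In the replacement part of s/.../.../, an unescaped & stands for the whole matched text, so the old line is spliced into the new title each time the command runs.

```shell
#!/bin/sh
new_title='Love & Other Drugs (2010)'

# Same substitution as the command in the question, applied to one line on stdin.
naive_replace() {
    sed "s/^.*$/$new_title/"
}

first=$(printf '%s\n' 'Love & Other Drugs' | naive_replace)   # first run
second=$(printf '%s\n' "$first" | naive_replace)              # run again on the result
blank=$(printf '\n' | naive_replace)                          # line 2 emptied by hand

printf '%s\n' "$first"   # Love Love & Other Drugs Other Drugs (2010)
printf '%s\n' "$second"  # Love Love Love & Other Drugs Other Drugs (2010) Other Drugs (2010)
printf '%s\n' "$blank"   # Love  Other Drugs (2010)
```

The second result is the doubled-up title reported in the question, and the third shows where the two spaces come from: the & expands to an empty line.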
Given your newly provided sample input:
$ head -n 50 update* *.meta
==> updatemeta.txt <==
test1.ts.meta,Carols from King's (2013)
test2.ts.meta,Before Sunrise (1995)
test3.ts.meta,Love & Other Drugs (2010)
test4.ts.meta,Departures (2008)
==> test1.ts.meta <==
1:0:19:1B1C:802:2:11A0000:0:0:0:
Carols from King's
The traditional Christmas carol concert from King's College Chapel, Cambridge. Stephen Cleobury conducts the famous chapel choir in carols old and new. [HD] [S]
1387969020
448066800
2913369072
f:0,c:00157c,c:01157e,c:02157f,c:03157c,c:050001
188
0
==> test2.ts.meta <==
1:0:1:189E:7FD:2:11A0000:0:0:0:
Before Sunrise
Romance starring Julie Delpy and Ethan Hawke. Two twentysomethings meet on a train and decide to spend a few hours together. Contains some strong language. Also in HD. [1995] [AD,S]
1392418980
550744512
2637755808
f:0,c:0013ec,c:0113ed,c:0213ef,c:0313ec
188
0
==> test3.ts.meta <==
1:0:1:2404:7F9:2:11A0000:0:0:0:
Love & Other Drugs
(2010) Fact-based adult comedy. Jake Gyllenhaal stars as a successful Viagra salesman who falls for a woman with Parkinson's (Anne Hathaway). Strong language/sexual scenes. [AD,S]
1472775840
712401799
2824257448
f:0,c:000931,c:010932,c:020934,c:030931
188
0
==> test4.ts.meta <==
1:0:1:2404:7F9:2:11A0000:0:0:0:
Departures
(2008) An Oscar-winning, whimsical look at the Japanese undertaking profession. Masahiro Motoki stars as a musician starting a new career preparing the dead for burial. Japanese/subs.
1400111580
863881200
3699150040
f:0,c:000931,c:010932,c:020934,c:030931
188
0
Here is the awk script running:
$ awk -i inplace -F',' '
NR == FNR {
titles[ARGC] = $2
ARGV[ARGC++] = $1
}
(NR != FNR) && (FNR == 2) {
$0 = titles[ARGIND]
}
{ print }
' updatemeta.txt
And this is what it did to your files:
$ head -n 50 update* *.meta
==> updatemeta.txt <==
test1.ts.meta,Carols from King's (2013)
test2.ts.meta,Before Sunrise (1995)
test3.ts.meta,Love & Other Drugs (2010)
test4.ts.meta,Departures (2008)
==> test1.ts.meta <==
1:0:19:1B1C:802:2:11A0000:0:0:0:
Carols from King's (2013)
The traditional Christmas carol concert from King's College Chapel, Cambridge. Stephen Cleobury conducts the famous chapel choir in carols old and new. [HD] [S]
1387969020
448066800
2913369072
f:0,c:00157c,c:01157e,c:02157f,c:03157c,c:050001
188
0
==> test2.ts.meta <==
1:0:1:189E:7FD:2:11A0000:0:0:0:
Before Sunrise (1995)
Romance starring Julie Delpy and Ethan Hawke. Two twentysomethings meet on a train and decide to spend a few hours together. Contains some strong language. Also in HD. [1995] [AD,S]
1392418980
550744512
2637755808
f:0,c:0013ec,c:0113ed,c:0213ef,c:0313ec
188
0
==> test3.ts.meta <==
1:0:1:2404:7F9:2:11A0000:0:0:0:
Love & Other Drugs (2010)
(2010) Fact-based adult comedy. Jake Gyllenhaal stars as a successful Viagra salesman who falls for a woman with Parkinson's (Anne Hathaway). Strong language/sexual scenes. [AD,S]
1472775840
712401799
2824257448
f:0,c:000931,c:010932,c:020934,c:030931
188
0
==> test4.ts.meta <==
1:0:1:2404:7F9:2:11A0000:0:0:0:
Departures (2008)
(2008) An Oscar-winning, whimsical look at the Japanese undertaking profession. Masahiro Motoki stars as a musician starting a new career preparing the dead for burial. Japanese/subs.
1400111580
863881200
3699150040
f:0,c:000931,c:010932,c:020934,c:030931
188
0
Answer 2
One approach is to bypass the regex route altogether and use sed's read (r) command.
cat 1TBMovie2_dotTSdotMeta.txt | while IFS=, read file moviename; do printf '%s\n' "$moviename" | sed -i -e '2r /dev/stdin' -e '2d' "$file"; done
Written out over multiple lines for readability, it looks like this:
cat 1TBMovie2_dotTSdotMeta.txt |
while IFS=, read file moviename
do
printf '%s\n' "$moviename" |
sed -i -e '2r /dev/stdin' -e '2d' "$file"
done
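Here we use the GNU sed feature of letting the r command read its file from stdin (/dev/stdin). With a non-GNU sed, we can save the movie name in a temporary file and use that file's name in the r command. Either way, you no longer have to worry about escaping anything.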
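A minimal sketch of the temporary-file variant, assuming mktemp is available. The helper name replace_line2 is made up for this example. Because sed -i is not available everywhere either, the sketch writes sed's output to a second temporary file and copies it back over the original.

```shell
#!/bin/sh
# replace_line2 FILE TITLE: hypothetical helper, replaces line 2 of FILE with TITLE.
replace_line2() {
    title_file=$(mktemp) || return 1
    out_file=$(mktemp) || { rm -f "$title_file"; return 1; }
    printf '%s\n' "$2" > "$title_file"
    # "2r" queues the contents of title_file to be written after line 2;
    # "2d" then deletes the original line 2, so only the new title remains.
    sed -e "2r $title_file" -e '2d' "$1" > "$out_file" && cat "$out_file" > "$1"
    status=$?
    rm -f "$title_file" "$out_file"
    return "$status"
}
```

Inside the loop above, the printf and sed pipeline would then become replace_line2 "$file" "$moviename".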
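However, if you don't want the hassle of an extra file, we need to escape the characters / \ & that are special on the rhs of sed's s/.../.../ command. The / is included because it acts as the delimiter.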
cat 1TBMovie2_dotTSdotMeta.txt |
while IFS=, read file moviename
do
moviename_esc=$(printf '%s\n' "$moviename" | sed -e 's:[\&/]:\\&:g')
sed -i -e "2 s/.*/$moviename_esc/" "$file"
done
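One way to check the escaping is to run it on a title that contains all three special characters and confirm that it comes back unchanged. This is only a sketch: the function name escape_rhs and the sample title are made up, and the sed expression is the same one used in the loop above.

```shell
#!/bin/sh
# escape_rhs STRING: hypothetical wrapper around the escaping step shown above.
escape_rhs() {
    printf '%s\n' "$1" | sed -e 's:[\&/]:\\&:g'
}

title='AC/DC & a \1 backslash'
escaped=$(escape_rhs "$title")
# Substitute the escaped title into a dummy line, as the loop does for line 2.
result=$(printf '%s\n' 'old line' | sed -e "s/.*/$escaped/")

printf '%s\n' "$escaped"
printf '%s\n' "$result"
```

The second line printed should be identical to the original title.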