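First of all, I'm new to awk, so please excuse me if this turns out to be something simple.
I'm trying to generate a file that contains paths. For this I'm using an ls -LT
listing together with an awk script.
Here is a sample of the input file: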
vagrant@precise64:/vagrant$ cat structure-of-home.cnf
/home/:
vagrant
/home/vagrant:
postinstall.sh
This would be the expected output:
/home/vagrant
/home/vagrant/postinstall.sh
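The awk script should do the following:
- Check whether the line contains a ":"
- If it does, assign the string (without the ":") to a variable ($path in my case)
- If the line is empty, print nothing
- If it is not empty and does not contain a ":", print $path followed by the current line $0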
Here is the script:
BEGIN{
    path=""
}
{
    if ($1 ~ /\:/)
    {
        sub(/\:/,"",$1)
        if (substr($1, length,1) ~ /\//)
        {
            path=$1;
        }
        else
        {
            path=$1"/"
        }
    }
    else if (length($0) == 0)
    {}
    else
        print $path$1
}
The problem is that when I run the script, I get the following mess:
vagrant@precise64:/vagrant$ awk -f format_output.awk structure-of-home.cnf
vagrantvagrant
postinstall.shpostinstall.sh
What am I doing wrong?
Answer 1
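As taliezin pointed out, your mistake was using $ to expand path when printing. Unlike bash or make, awk does not use $ to expand variable names to their values; it uses $ to refer to the fields of a line (similar to perl).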
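To see the difference directly, here is a minimal sketch that hard-codes the value path would hold after the first header line has been read (an assumption made only to keep the example self-contained). Because "/home/" does not look like a number, awk converts it to 0, so $path means $0, the whole line:

```shell
# Feed one sample line to awk; path is set by hand for illustration.
out=$(printf 'vagrant\n' | awk '{
    path = "/home/"
    print $path $1   # $path is $0 here, so the line is printed twice
    print path $1    # plain variable reference: the intended result
}')
printf '%s\n' "$out"
```

The first output line reproduces the doubled text from the question (vagrantvagrant); the second is the path that was actually wanted (/home/vagrant).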
So simply removing the $ in front of path makes your code work:
BEGIN{
    path=""
}
{
    if ($1 ~ /\:/)
    {
        sub(/\:/,"",$1)
        if (substr($1, length,1) ~ /\//)
        {
            path=$1;
        }
        else
        {
            path=$1"/"
        }
    }
    else if (length($0) == 0)
    {}
    else
        print path$1
}
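However, this is not really an awk-ish solution. First of all, there is no need to initialize path in a BEGIN rule: undefined variables default to "" or 0, depending on context.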
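A small sketch of that default behaviour follows. The variable names unset_str and unset_num are made up for this example and are never assigned anywhere:

```shell
# No input is needed: everything happens in the BEGIN rule.
out=$(awk 'BEGIN {
    print "[" unset_str "]"   # string context: behaves like ""
    print unset_num + 0       # numeric context: behaves like 0
    print length(unset_str)   # the empty string has length 0
}')
printf '%s\n' "$out"
```

This prints [] followed by 0 and 0, which is why the path="" line in the BEGIN rule can be dropped without changing the result.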
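Also, any awk script consists of patterns and actions: the former state when to do something, the latter what to do. You have a single action that is always executed (empty pattern) and that uses (nested) conditionals internally to decide what to do.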
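As a rough illustration of the pattern/action structure, the sketch below classifies the lines of the sample input with one rule per case instead of nested conditionals. The labels "dir" and "entry" are invented for this example, and the blank line in the input is an assumption added to exercise the second rule:

```shell
out=$(printf '/home/:\nvagrant\n\n/home/vagrant:\npostinstall.sh\n' | awk '
/:$/    { print "dir", $0; next }   # pattern: the line ends with a colon
NF == 0 { next }                    # pattern: the line has no fields, skip it
        { print "entry", $0 }       # no pattern: runs for every remaining line
')
printf '%s\n' "$out"
```

Each rule names its own condition, so the "when" and the "what" stay visibly separate.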
My solution would look like this:
# BEGIN is actually a pattern making the following rule run only once:
# That is, before any input is read.
BEGIN{
    # Split lines into chunks (fields) separated by ":".
    # This is done by setting the field separator (FS) variable accordingly:
    # FS=":" # this would split lines into fields by ":"
    # Additionally, if a field ends with "/",
    # we consider this part of the separator.
    # So fields should be split by a ":" that *might*
    # be preceded by a "/".
    # This can be done using a regular expression (RE) FS:
    FS="/?:" # "?" means "the previous character may occur 0 or 1 times"
    # When printing, we want to join the parts of the paths by "/".
    # That's the sole purpose of the output field separator (OFS) variable:
    OFS="/"
}
# First we want to identify records (i.e. in this [default] case: lines),
# that contain(ed) a ":".
# We can do that without any RE matching, since records are
# automatically split into fields separated by ":".
# So asking >>Does the current line contain a ":"?<< is now the same
# as asking >>Does the current record have more than 1 field?<<.
# Luckily (but not surprisingly), the number of fields (NF) variable
# keeps track of this:
NF>1{ # The following action is run only if there are >1 fields.
    # All we want to do in this case, is store everything up to the first ":",
    # without the potential final "/".
    # With our FS choice (see above), that's exactly the 1st field:
    path=$1
}
# The printing should be done only for non-empty lines not containing ":".
# In our case, that translates to a record that has neither 0 nor >1 fields:
NF==1{ # The following action is only run if there is exactly 1 field.
    # In this case, we want to print the path variable (no need for a "$" here)
    # followed by the current line, separated by a "/".
    # Since we defined the proper OFS, we can use "," to join output fields:
    print path,$1 # ($1==$0 since NF==1)
}
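That's it. Strip all the comments, shorten the variable names and move the [O]FS
definitions to command-line arguments, and all you have to write is: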
awk -F'/?:' -vOFS=\/ 'NF>1{p=$1}NF==1{print p,$1}' structure-of-home.cnf
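As a quick check, this sketch pipes the sample listing from the question into the same program. A blank line is added between the two blocks (an assumption about the real file) to show that empty records are skipped, and the option is written as -v OFS=/ with a space, which is the portable spelling:

```shell
out=$(printf '/home/:\nvagrant\n\n/home/vagrant:\npostinstall.sh\n' |
    awk -F'/?:' -v OFS=/ 'NF>1{p=$1} NF==1{print p,$1}')
printf '%s\n' "$out"
```

The output is /home/vagrant and /home/vagrant/postinstall.sh, matching the expected output in the question.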
Answer 2
awk -F: '/:/{prefix=$1;next}/./{print prefix "/" $0}'
Note that a double / in the path is not a problem.
But if you prefer, you can remove it with
awk -F: '/:/{sub("/$","",$1);prefix=$1;next}/./{print prefix "/" $0}'
or
awk -F: '/:/{prefix=$1;s="/";if(prefix~"/$")s="";next}/./{print prefix s $0}'
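To make the difference concrete, here is a small sketch that runs the first two commands on the first two lines of the sample input. The variable names input, plain and stripped are only for this illustration:

```shell
input='/home/:
vagrant'
# First variant: the trailing slash of the header is kept, giving "//".
plain=$(printf '%s\n' "$input" |
    awk -F: '/:/{prefix=$1;next}/./{print prefix "/" $0}')
# Second variant: sub() removes a trailing slash from $1 before it is stored.
stripped=$(printf '%s\n' "$input" |
    awk -F: '/:/{sub("/$","",$1);prefix=$1;next}/./{print prefix "/" $0}')
printf '%s\n%s\n' "$plain" "$stripped"
```

The first command prints /home//vagrant and the second prints /home/vagrant.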
Answer 3
I would do something like:
awk 'match($0, "/*:$") {path = substr($0, 1, RSTART-1); next}
NF {print path "/" $0}'
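For reference, a sketch of this program applied to the sample input (with an assumed blank line between the two blocks). match() sets RSTART to the position where the trailing colon, together with any slashes directly before it, begins, so substr() keeps everything in front of that:

```shell
out=$(printf '/home/:\nvagrant\n\n/home/vagrant:\npostinstall.sh\n' |
    awk 'match($0, "/*:$") {path = substr($0, 1, RSTART-1); next}
         NF {print path "/" $0}')
printf '%s\n' "$out"
```

This prints /home/vagrant and /home/vagrant/postinstall.sh; the blank line has NF equal to 0 and is skipped by the second rule.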