I have this input file:
target_id length eff_length est_counts tpm
ENST00000583162.1 1066 967 1.69899 1.18376
ENST00000583355.1 891 792 13.8057 11.7445
ENST00000582528.5 5342 5243 21.3223 2.74003
ENST00000497744.1 964 865 0 0
ENST00000482564.1 1856 1757 3.29538 1.26367
ENST00000356654.8 4351 4252 56.2725 8.91668
ENST00000396684.2 4290 4191 0.206617 0.0332162
ENST00000541029.1 855 756 3.14783 2.80537
ENST00000537488.1 899 800 2.37306 1.99857
ENST00000264010.8 3939 3840 354.642 62.2241
ENST00000401394.5 2978 2879 28.362 6.63735
ENST00000566078.1 1627 1528 4.9964 2.2031
ENST00000595290.5 1242 1143 0 0
ENST00000595330.1 692 593 0 0
ENST00000596998.2 588 489 0 0
ENST00000374514.7 1810 1711 53.7113 21.1503
I want to remove what follows the "." in the first column (the version number), like this:
target_id length eff_length est_counts tpm
ENST00000583162. 1066 967 1.69899 1.18376
ENST00000583355. 891 792 13.8057 11.7445
ENST00000582528. 5342 5243 21.3223 2.74003
ENST00000497744. 964 865 0 0
ENST00000482564. 1856 1757 3.29538 1.26367
ENST00000356654. 4351 4252 56.2725 8.91668
ENST00000396684. 4290 4191 0.206617 0.0332162
ENST00000541029. 855 756 3.14783 2.80537
ENST00000537488. 899 800 2.37306 1.99857
ENST00000264010. 3939 3840 354.642 62.2241
ENST00000401394. 2978 2879 28.362 6.63735
ENST00000566078. 1627 1528 4.9964 2.2031
ENST00000595290. 1242 1143 0 0
ENST00000595330. 692 593 0 0
ENST00000596998. 588 489 0 0
ENST00000374514. 1810 1711 53.7113 21.1503
Please tell me which sed or awk command I can use to do this.
Answer 1
The simplest approach is to remove all digits after the first "." on each line:
$ sed 's/\.[0-9]*/\./' file
target_id length eff_length est_counts tpm
ENST00000583162. 1066 967 1.69899 1.18376
ENST00000583355. 891 792 13.8057 11.7445
ENST00000582528. 5342 5243 21.3223 2.74003
ENST00000497744. 964 865 0 0
ENST00000482564. 1856 1757 3.29538 1.26367
ENST00000356654. 4351 4252 56.2725 8.91668
ENST00000396684. 4290 4191 0.206617 0.0332162
ENST00000541029. 855 756 3.14783 2.80537
ENST00000537488. 899 800 2.37306 1.99857
ENST00000264010. 3939 3840 354.642 62.2241
ENST00000401394. 2978 2879 28.362 6.63735
ENST00000566078. 1627 1528 4.9964 2.2031
ENST00000595290. 1242 1143 0 0
ENST00000595330. 692 593 0 0
ENST00000596998. 588 489 0 0
ENST00000374514. 1810 1711 53.7113 21.1503
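This finds the first "." on the line and replaces it, together with any digits after it, with just a ".". However, given that these are transcript IDs, you probably don't want the "." either, so try this instead: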
$ sed 's/\.[0-9]*//' file
target_id length eff_length est_counts tpm
ENST00000583162 1066 967 1.69899 1.18376
ENST00000583355 891 792 13.8057 11.7445
ENST00000582528 5342 5243 21.3223 2.74003
ENST00000497744 964 865 0 0
ENST00000482564 1856 1757 3.29538 1.26367
ENST00000356654 4351 4252 56.2725 8.91668
ENST00000396684 4290 4191 0.206617 0.0332162
ENST00000541029 855 756 3.14783 2.80537
ENST00000537488 899 800 2.37306 1.99857
ENST00000264010 3939 3840 354.642 62.2241
ENST00000401394 2978 2879 28.362 6.63735
ENST00000566078 1627 1528 4.9964 2.2031
ENST00000595290 1242 1143 0 0
ENST00000595330 692 593 0 0
ENST00000596998 588 489 0 0
ENST00000374514 1810 1711 53.7113 21.1503
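The decimal values in the later columns survive because s/// without the g flag replaces only the first match on each line. A minimal sketch of that behaviour, using one row from the input above (the variable names are only for illustration):

```shell
# One sample row taken from the input file above.
row='ENST00000583162.1 1066 967 1.69899 1.18376'

# No g flag: only the first ".digits" match (the transcript version) is removed.
first_only=$(printf '%s\n' "$row" | sed 's/\.[0-9]*//')

# With the g flag every ".digits" match is removed, which also truncates the decimals.
every_match=$(printf '%s\n' "$row" | sed 's/\.[0-9]*//g')

printf '%s\n' "$first_only"    # ENST00000583162 1066 967 1.69899 1.18376
printf '%s\n' "$every_match"   # ENST00000583162 1066 967 1 1
```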
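If the value in the first column contains no ".", these commands will instead modify the next column that contains a "." followed by digits. To restrict the change explicitly to the first column, you can use one of the following: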
awk
awk -v OFS='\t' '{sub(/\.[0-9]*/,"",$1)}1' file
Or, to keep the trailing ".":
awk -v OFS='\t' '{sub(/\.[0-9]*/,".",$1)}1' file
GNU sed
sed -E 's/^(\S+)\.[0-9]*/\1/' file
Or, to keep the trailing ".":
sed -E 's/^(\S+)\.[0-9]*/\1./' file
Most other sed implementations:
sed -E 's/^([^[:blank:]]*)\.[0-9]*/\1/' file
Any sed:
sed 's/^\([^[:blank:]]*\)\.[0-9]*/\1/' file
Perl
perl -pe 's/^(\S+)\.\d+/\1/' file
Or, to keep the trailing ".":
perl -pe 's/^(\S+)\.\d+/\1./' file
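To see what anchoring to the first column buys you, here is a small sketch using a hypothetical row whose ID has no version suffix (this row is an assumption for illustration and does not appear in the input file). The unanchored command alters a numeric column, while the portable anchored one leaves the line alone:

```shell
# Hypothetical row: the transcript ID has no ".N" version suffix.
row='ENST00000583162 1066 967 1.69899 1.18376'

# Unanchored: the first ".digits" on the line is in est_counts, so that value is truncated.
unanchored=$(printf '%s\n' "$row" | sed 's/\.[0-9]*//')

# Anchored: the match must lie within the first run of non-blank characters,
# so a line whose first column has no "." passes through unchanged.
anchored=$(printf '%s\n' "$row" | sed 's/^\([^[:blank:]]*\)\.[0-9]*/\1/')

printf '%s\n' "$unanchored"   # ENST00000583162 1066 967 1 1.18376
printf '%s\n' "$anchored"     # ENST00000583162 1066 967 1.69899 1.18376
```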
Answer 2
Command
awk '{sub(/\..*/,"",$1);print $0}' file.txt
Output
target_id length eff_length est_counts tpm
ENST00000583162 1066 967 1.69899 1.18376
ENST00000583355 891 792 13.8057 11.7445
ENST00000582528 5342 5243 21.3223 2.74003
ENST00000497744 964 865 0 0
ENST00000482564 1856 1757 3.29538 1.26367
ENST00000356654 4351 4252 56.2725 8.91668
ENST00000396684 4290 4191 0.206617 0.0332162
ENST00000541029 855 756 3.14783 2.80537
ENST00000537488 899 800 2.37306 1.99857
ENST00000264010 3939 3840 354.642 62.2241
ENST00000401394 2978 2879 28.362 6.63735
ENST00000566078 1627 1528 4.9964 2.2031
ENST00000595290 1242 1143 0 0
ENST00000595330 692 593 0 0
ENST00000596998 588 489 0 0
ENST00000374514 1810 1711 53.7113 21.1503
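As a variation on the awk approach, the sketch below assumes the real file is tab-delimited (the listing above does not show which separator is used) and that only a trailing ".digits" suffix should be dropped. It sets both FS and OFS to a tab so the separators survive when awk rebuilds the line, and skips the header row. The sample file and variable names are made up for the demonstration:

```shell
# Build a small tab-delimited sample (assumption: the real file uses tabs).
sample=$(mktemp)
printf 'target_id\tlength\teff_length\test_counts\ttpm\n' > "$sample"
printf 'ENST00000583162.1\t1066\t967\t1.69899\t1.18376\n' >> "$sample"
printf 'ENST00000497744.1\t964\t865\t0\t0\n' >> "$sample"

# FS = OFS = "\t" keeps tabs on output; NR > 1 skips the header;
# the $ anchor strips only a version suffix at the end of the first field.
result=$(awk 'BEGIN { FS = OFS = "\t" } NR > 1 { sub(/\.[0-9]+$/, "", $1) } 1' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

For a space-delimited file, dropping the BEGIN block falls back to awk's default whitespace splitting, at the cost of modified lines being rejoined with single spaces.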