Get URLs from text

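I created a text file with apt-get --print-uris dist-upgrade > /mnt/URIs.txt, and I want to download all the packages using the URLs in that file. Only the text between the single quotes ('') is a URL. How can I remove everything else, so that only the URLs and line breaks remain for downloading with a web browser?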

Answer 1

The output of apt-get --print-uris dist-upgrade might look like this:

Reading package lists...
Building dependency tree...
Reading state information...
Calculating upgrade...
The following packages will be upgraded:
  evolution-data-server evolution-data-server-common gir1.2-goa-1.0
  gnome-online-accounts libcamel-1.2-62 libebackend-1.2-10 libebook-1.2-20
  libebook-contacts-1.2-3 libecal-2.0-1 libedata-book-1.2-26
  libedata-cal-2.0-1 libedataserver-1.2-24 libedataserverui-1.2-2
  libgoa-1.0-0b libgoa-1.0-common libgoa-backend-1.0-1 libyelp0 linux-libc-dev
  python-apt-common python3-apt yelp
21 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 4,358 kB of archives.
After this operation, 16.4 kB of additional disk space will be used.
'http://se.archive.ubuntu.com/ubuntu/pool/main/p/python-apt/python-apt-common_2.0.0ubuntu0.20.04.5_all.deb' python-apt-common_2.0.0ubuntu0.20.04.5_all.deb 17052 MD5Sum:a9e11f5f8671c5069f5edaef32e2f620
'http://se.archive.ubuntu.com/ubuntu/pool/main/p/python-apt/python3-apt_2.0.0ubuntu0.20.04.5_amd64.deb' python3-apt_2.0.0ubuntu0.20.04.5_amd64.deb 154164 MD5Sum:8590dd473b444f2756e5c7498e00e7ec
'http://se.archive.ubuntu.com/ubuntu/pool/main/g/gnome-online-accounts/libgoa-1.0-common_3.36.1-0ubuntu1_all.deb' libgoa-1.0-common_3.36.1-0ubuntu1_all.deb 3752 MD5Sum:9252da969452bdf88527829a752ac175

(This output is truncated.)

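Assuming you want to parse the "clean" URIs out of the above, the following sed command deletes every line from the first line up to and including the line that starts with the string After. From the remaining lines, it removes everything from the first space onward, then deletes the first and last characters of the modified line (which removes the single quotes around the URI).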

sed '1,/^After/d; s/ .*//; s/.//; s/.$//'

Using it on the short example output above:

$ sed '1,/^After/d; s/ .*//; s/.//; s/.$//' file
http://se.archive.ubuntu.com/ubuntu/pool/main/p/python-apt/python-apt-common_2.0.0ubuntu0.20.04.5_all.deb
http://se.archive.ubuntu.com/ubuntu/pool/main/p/python-apt/python3-apt_2.0.0ubuntu0.20.04.5_amd64.deb
http://se.archive.ubuntu.com/ubuntu/pool/main/g/gnome-online-accounts/libgoa-1.0-common_3.36.1-0ubuntu1_all.deb
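Applied to the file from the question, the same command could be wrapped as in the minimal sketch below. The function name clean_uris and the output path in the usage comment are assumptions made for illustration; they do not come from the question or from apt.

```shell
# clean_uris: hypothetical wrapper around the sed command above.
# $1 = file holding the saved output of apt-get --print-uris
clean_uris() {
    # 1,/^After/d  drop everything up to and including the "After ..." line
    # s/ .*//      drop everything from the first space onward
    # s/.//        drop the leading single quote
    # s/.$//       drop the trailing single quote
    sed '1,/^After/d; s/ .*//; s/.//; s/.$//' "$1"
}

# Example usage (the output path is an assumption):
# clean_uris /mnt/URIs.txt > /mnt/URLs-clean.txt
```

One caveat: the range 1,/^After/ relies on a line that begins with After. If no such line exists in the input (for example, when the messages are printed in another language), the range runs to the end of the file and nothing is printed.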

Given the same input data, the command

sed -n "s,.*\(http://[^']*\).*,\1,p" file

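would also work. It tries to match any substring that starts with http:// and ends just before a single quote. It then replaces the whole line with that substring and prints the modified line. Lines that do not match are discarded.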
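The http:// pattern skips URIs that use another scheme such as https://. As a hedged variant (not part of the original answer), the sketch below keeps whatever sits between the first pair of single quotes on lines that begin with a quote, regardless of scheme. The function name extract_quoted is hypothetical.

```shell
# extract_quoted: hypothetical helper; reads apt-get --print-uris output on stdin.
# Prints only the text between the leading pair of single quotes;
# lines that do not start with a single quote are discarded (-n ... p).
extract_quoted() {
    sed -n "s/^'\([^']*\)'.*/\1/p"
}

# Example usage (the file path is taken from the question):
# extract_quoted < /mnt/URIs.txt
```

Because this variant keys on the leading quote rather than on the After line, it does not depend on the wording of the status messages that precede the URI lines.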
