Is there a way to establish a network connection using a shell script?

I want to make a network connection to a website, read the data line by line, and store it in a text file on my system using a shell script. I have already done this in Java, where I can read that particular resource using a URLConnection object.
In a shell script, is wget's spider mode the only way to do this? If not, what other approaches are there to read a text file from a website, parse it, and store it in my local directory?

EDIT

I tried wget: wget -o /home/user/Desktop/training.txt https://www.someurl.com. But the output was this:

--2014-04-15 00:39:15--  https://s3.amazonaws.com/hr-testcases/368/assets/trainingdata.txt
Resolving s3.amazonaws.com (s3.amazonaws.com)... 176.32.99.154
Connecting to s3.amazonaws.com (s3.amazonaws.com)|176.32.99.154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1554016 (1.5M) [text/plain]
Saving to: ‘trainingdata.txt.1’

     0K .......... .......... .......... .......... ..........  3% 47.5K 31s
    50K .......... .......... .......... .......... ..........  6%  129K 20s
   100K .......... .......... .......... .......... ..........  9%  136K 16s
   150K .......... .......... .......... .......... .......... 13%  149K 14s
   200K .......... .......... .......... .......... .......... 16% 1.57M 11s
   250K .......... .......... .......... .......... .......... 19%  162K 10s
   300K .......... .......... .......... .......... .......... 23%  678K 9s
   350K .......... .......... .......... .......... .......... 26%  612K 7s
   400K .......... .......... .......... .......... .......... 29%  307K 7s
   450K .......... .......... .......... .......... .......... 32%  630K 6s
   500K .......... .......... .......... .......... .......... 36%  699K 5s
   550K .......... .......... .......... .......... .......... 39%  520K 5s
   600K .......... .......... .......... .......... .......... 42%  580K 4s
   650K .......... .......... .......... .......... .......... 46%  516K 4s
   700K .......... .......... .......... .......... .......... 49%  551K 3s
   750K .......... .......... .......... .......... .......... 52%  713K 3s
   800K .......... .......... .......... .......... .......... 56%  720K 3s
   850K .......... .......... .......... .......... .......... 59%  701K 2s
   900K .......... .......... .......... .......... .......... 62%  603K 2s
   950K .......... .......... .......... .......... .......... 65%  670K 2s
  1000K .......... .......... .......... .......... .......... 69%  715K 2s
  1050K .......... .......... .......... .......... .......... 72%  671K 1s
  1100K .......... .......... .......... .......... .......... 75%  752K 1s
  1150K .......... .......... .......... .......... .......... 79%  535K 1s
  1200K .......... .......... .......... .......... .......... 82%  607K 1s
  1250K .......... .......... .......... .......... .......... 85%  675K 1s
  1300K .......... .......... .......... .......... .......... 88%  727K 1s
  1350K .......... .......... .......... .......... .......... 92%  707K 0s
  1400K .......... .......... .......... .......... .......... 95%  632K 0s
  1450K .......... .......... .......... .......... .......... 98%  785K 0s
  1500K .......... .......                                    100%  931K=4.5s

2014-04-15 00:39:23 (341 KB/s) - ‘trainingdata.txt.1’ saved [1554016/1554016]

It seems to produce only statistics such as the time taken by the download. It does not save the actual data from the URL.

Answer 1

It sounds like you want netcat.

Netcat is a featured networking utility which reads and writes data across network connections, using the TCP/IP protocol. It is designed to be a reliable "back-end" tool that can be used directly or easily driven by other programs and scripts. At the same time, it is a feature-rich network debugging and exploration tool, since it can create almost any kind of connection you would need and has several interesting built-in capabilities.

For more information, you can always man nc.
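As a minimal sketch (the host, path, and output file below are hypothetical, not from the question), a raw HTTP GET over nc could look like this. Note that plain nc speaks only cleartext TCP, so this covers http:// URLs; an https:// URL like the one in the question would need a TLS front end such as openssl s_client in place of nc.

```shell
# Strip the HTTP response headers, leaving only the body on stdout.
strip_headers() {
    sed '1,/^[[:space:]]*$/d'   # delete everything up to the first blank line
}

# Issue a raw HTTP/1.0 GET for path $2 on host $1 and print the body.
http_get() {
    printf 'GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1" |
        nc "$1" 80 | strip_headers
}

# Hypothetical usage:
#   http_get www.example.com /trainingdata.txt > /home/user/Desktop/training.txt
```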

Answer 2

The command you are running uses the -o flag, which does the following (from man wget):

   -o logfile
   --output-file=logfile
       Log all messages to logfile.  The messages are normally reported to
       standard error.

It does not actually save the target of the URL; it only saves the standard error of wget. By default, wget saves the target under the same name as the remote file. For example, running

wget http://www.foo.com/index.html

saves the file as index.html in the current directory. To give the file a different name, use -O instead (capital o, as in Oliver):

   -O file
   --output-document=file
       The documents will not be written to the appropriate files, but all
       will be concatenated together and written to file.  If - is used as
       file, documents will be printed to standard output, disabling link
       conversion.  (Use ./- to print to a file literally named -.)

       Use of -O is not intended to mean simply "use the name file instead
       of the one in the URL;" rather, it is analogous to shell
       redirection: wget -O file http://foo is intended to work like wget
       -O - http://foo > file; file will be truncated immediately, and all
       downloaded content will be written there.
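Applied to the command from the question (the path and URL are the asker's; this is a sketch, not verified against that server), the fix is to switch -o to -O. If the goal is to parse the data line by line rather than save it verbatim, wget can also write to stdout with -O - and feed a while-read loop:

```shell
# Corrected command: -O (capital o) names the saved file; -o only names a log.
#   wget -O /home/user/Desktop/training.txt https://www.someurl.com

# Read stdin and append each line to the file named in $1, so arbitrary
# per-line processing can be added inside the loop.
save_lines() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$1"
    done
}

# Hypothetical usage:
#   wget -q -O - https://www.someurl.com | save_lines /home/user/Desktop/training.txt
```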
