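Sometimes I need to perform a simple task that outputs basic HTML to the console. I'd like it minimally rendered so that it is easy to read at a glance. Is there a utility that can handle basic HTML rendering in a shell (think Lynx-style rendering, but not an actual browser)?
For example, sometimes I'll put a watch on Apache's mod_status page: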
watch -n 1 curl http://some-server/server-status
The output of the page is HTML with some minimal markup, which shows up in the shell like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html><head>
<title>Apache Status</title>
</head><body>
<h1>Apache Server Status for localhost</h1>
<dl><dt>Server Version: Apache/2.2.22 (Ubuntu) PHP/5.3.10-1ubuntu3.15 with Suhosin-Patch</dt>
<dt>Server Built: Jul 22 2014 14:35:25
</dt></dl><hr /><dl>
<dt>Current Time: Wednesday, 19-Nov-2014 15:21:40 UTC</dt>
<dt>Restart Time: Wednesday, 19-Nov-2014 15:13:02 UTC</dt>
<dt>Parent Server Generation: 1</dt>
<dt>Server uptime: 8 minutes 38 seconds</dt>
<dt>Total accesses: 549 - Total Traffic: 2.8 MB</dt>
<dt>CPU Usage: u35.77 s12.76 cu0 cs0 - 9.37% CPU load</dt>
<dt>1.06 requests/sec - 5.6 kB/second - 5.3 kB/request</dt>
<dt>1 requests currently being processed, 9 idle workers</dt>
</dl><pre>__W._______.....................................................
................................................................
................................................................
................................................................
</pre>
<p>Scoreboard Key:<br />
"<b><code>_</code></b>" Waiting for Connection,
"<b><code>S</code></b>" Starting up,
"<b><code>R</code></b>" Reading Request,<br />
"<b><code>W</code></b>" Sending Reply,
"<b><code>K</code></b>" Keepalive (read),
"<b><code>D</code></b>" DNS Lookup,<br />
"<b><code>C</code></b>" Closing connection,
"<b><code>L</code></b>" Logging,
"<b><code>G</code></b>" Gracefully finishing,<br />
"<b><code>I</code></b>" Idle cleanup of worker,
"<b><code>.</code></b>" Open slot with no current process</p>
<p />
When viewed in Lynx, the same HTML renders as:
Apache Status (p1 of 2)
Apache Server Status for localhost
Server Version: Apache/2.2.22 (Ubuntu) PHP/5.3.10-1ubuntu3.15 with Suhosin-Patch
Server Built: Jul 22 2014 14:35:25
________________________________________________________________________________________________________
Current Time: Wednesday, 19-Nov-2014 15:23:50 UTC
Restart Time: Wednesday, 19-Nov-2014 15:13:02 UTC
Parent Server Generation: 1
Server uptime: 10 minutes 48 seconds
Total accesses: 606 - Total Traffic: 3.1 MB
CPU Usage: u37.48 s13.6 cu0 cs0 - 7.88% CPU load
.935 requests/sec - 5088 B/second - 5.3 kB/request
2 requests currently being processed, 9 idle workers
_C_______W_.....................................................
................................................................
................................................................
................................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Answer 1
lynx has a "dump" mode that you can use with watch:
$ watch lynx https://www.google.com -dump
From man lynx:
-dump dumps the formatted output of the default document or those
specified on the command line to standard output. Unlike
interactive mode, all documents are processed. This can be used
in the following way:
lynx -dump http://www.subir.com/lynx.html
Files specified on the command line are formatted as HTML if
their names end with one of the standard web suffixes such as
“.htm” or “.html”. Use the -force_html option to format files
whose names do not follow this convention.
This Ask Ubuntu question has more options.
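For the mod_status case from the question, a minimal sketch might look like the following. It assumes curl and lynx are installed and reuses the question's placeholder URL; the function name html_to_text_lynx is made up for illustration. The -stdin flag makes lynx read the page from a pipe, -force_html treats that input as HTML, and -nolist drops the numbered link list that -dump otherwise appends.

```shell
# Hypothetical helper: render HTML arriving on stdin as plain text with lynx.
#   -stdin       read the document from standard input
#   -force_html  interpret the input as HTML even without a .html name
#   -dump        print the formatted page to stdout and exit
#   -nolist      omit the list of links at the end of the dump
html_to_text_lynx() {
    lynx -stdin -force_html -dump -nolist
}

# Usage with the question's placeholder URL. watch runs its command through
# sh -c, so the pipeline is quoted and spelled out rather than calling the
# function:
#   watch -n 1 'curl -s http://some-server/server-status | lynx -stdin -force_html -dump -nolist'
```

Letting lynx fetch the URL itself (lynx -dump URL) works just as well; the pipe form is mainly useful when the HTML comes from somewhere other than a URL.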
Answer 2
w3m is another program that has a -dump option.
It is also the backend of the most popular web browser for Emacs.
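As a rough sketch of how that could be applied to the question's status page, assuming w3m and curl are installed (the function name html_to_text_w3m is hypothetical, and the URL is the question's placeholder): -T text/html tells w3m that the piped input is HTML, and -cols sets the width used for wrapping.

```shell
# Hypothetical helper: render HTML arriving on stdin as plain text with w3m.
#   -dump         print the formatted page to stdout and exit
#   -T text/html  declare the type of the piped input
#   -cols 80      wrap the output at 80 columns
html_to_text_w3m() {
    w3m -dump -T text/html -cols 80
}

# Usage with the question's placeholder URL:
#   watch -n 1 'curl -s http://some-server/server-status | w3m -dump -T text/html'
# w3m can also fetch the page itself:
#   watch -n 1 w3m -dump http://some-server/server-status
```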
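Answer 3
Answer 4
For a different kind of approach, pandoc can convert between many formats, including from HTML to plain text, and you can give it a URL directly to convert the HTML to another format: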
pandoc --to plain https://example.net
Or, if you want some formatting, you can use markdown output:
pandoc --to markdown https://example.net
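When the HTML is already in a pipe, for example from the curl call in the question, pandoc can read it from standard input instead of fetching a URL. This is only a sketch under the assumption that pandoc is installed; the function name html_to_text_pandoc is made up for illustration. With no file name given, the input format has to be stated explicitly via --from.

```shell
# Hypothetical helper: convert HTML arriving on stdin to plain text with pandoc.
#   --from html  input format (required, since stdin has no file extension)
#   --to plain   plain-text output; use "--to markdown" to keep some formatting
html_to_text_pandoc() {
    pandoc --from html --to plain
}

# Usage with the question's placeholder URL:
#   watch -n 1 'curl -s http://some-server/server-status | pandoc --from html --to plain'
```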