Technology extraction using a website URL

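I would like to know whether it is possible, from Ubuntu, to find out which technologies were used to build a website given only its URL.
For example, if I have the following URL: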


PHP, HHVM, Varnish, AddThis and many others.

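Keep in mind that I have a file containing a list of websites. I want to extract the web technologies of each site and put them after its URL in the file (one site per line). Please tell me whether this can be done with an Ubuntu command or with any software available on Ubuntu.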


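I don't think this is possible on Ubuntu out of the box.

You can fetch the data with a terminal browser such as Lynx or with a command such as curl, but parsing it would be very time-consuming.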
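As a rough illustration of the curl route, the sketch below pulls out a handful of response headers that often name the server-side stack. The function name tech_from_headers and the choice of headers (Server, X-Powered-By, Via, X-Generator) are my own assumptions, not a complete fingerprint. Anything that only shows up in the HTML or JavaScript (analytics, JS libraries, the CMS) is not covered, and that is the part that makes parsing time-consuming.

```shell
#!/usr/bin/env bash
# Read raw HTTP response headers on stdin and print the values of a few
# headers that commonly reveal the server-side stack, joined by ", ".
# The header list is an illustrative assumption, not a full fingerprint.
tech_from_headers() {
    tr -d '\r' | awk -F': *' '
        { name = tolower($1) }
        name == "server" || name == "x-powered-by" || name == "via" || name == "x-generator" {
            # Take everything after the header name so values that
            # themselves contain ": " are kept whole.
            value = substr($0, length($1) + 1)
            sub(/^: */, "", value)
            # Skip values already seen (redirect chains repeat headers).
            if (!(value in seen)) {
                seen[value] = 1
                out = out (out == "" ? "" : ", ") value
            }
        }
        END { print out }
    '
}

# Example (needs network access):
#   curl -sIL --max-time 10 https://www.wikipedia.org | tech_from_headers
```

Because it only reads stdin, the function can be tried on saved headers before pointing it at live sites.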




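For example, for techcrunch.com it shows that the site:

  • Uses NGINX as its web server
  • Uses SSL certificates from GoDaddy and WordPress
  • Performs an SSL redirect
  • Uses WordPress for DNS
  • Uses Postmark, Sailthru and Google Apps for Business, and has SPF set up
  • Is hosted by WordPress.com
  • Uses WordPress VIP as its CMS
  • Uses the PHP interpreter
  • Uses a lot of analytics, advertising and tracking services
  • Uses many JS libraries (all listed on the site), including jQuery and Backbone.js
  • Uses AOL On, Tube Mogul and TidalTV for media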



Good luck :)




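You can look at the Kali or Parrot distributions for information-gathering tools.

  • nikto is one I have tried before; it provides part of the information. It is also available in the Ubuntu repositories.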

    ~$ whatis nikto
    nikto (1)            - Scan web server for known vulnerabilities
    ~$ sudo apt-get install nikto
    ~$ sudo nikto -update
    ~$ nikto -Tuning b -h www.wikipedia.org
    - Nikto v2.1.5
    + Target IP:
    + Target Hostname:    www.wikipedia.org
    + Target Port:        80
    + Start Time:         2016-11-14 09:22:30 (GMT1)
    + Server: Varnish
    + IP address found in the 'x-client-ip' header. The IP is "".
    + The anti-clickjacking X-Frame-Options header is not present.
    + Uncommon header 'x-client-ip' found, with contents:
    + Uncommon header 'x-cache' found, with contents: cp3041 int
    + Uncommon header 'x-varnish' found, with contents: 827655138
    + Uncommon header 'x-cache-status' found, with contents: int
    + Root page / redirects to: https://www.wikipedia.org/
    + No CGI Directories found (use '-C all' to force check all possible dirs)
    + Server banner has changed from 'Varnish' to 'mw1187.eqiad.wmnet' which may suggest a WAF, load balancer or proxy is in place
    + Cookie GeoIP created without the httponly flag
    + Retrieved via header: 1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4
    + Retrieved x-powered-by header: HHVM/3.3.0-static
    + Server leaks inodes via ETags, header found with file /, fields: 0xW/3b2 0x5369720eefb07 
    + Uncommon header 'x-analytics' found, with contents: nocookies=1
    + Uncommon header 'backend-timing' found, with contents: D=236 t=1478774110870502
    + 269 items checked: 0 error(s) and 12 item(s) reported on remote host
    + End Time:           2016-11-14 09:23:21 (GMT1) (51 seconds)
    + 1 host(s) tested
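  • whatweb is another tool. On Ubuntu it has an unfixed bug (an "invalid multibyte escape" error), which can be worked around as follows:

    1. Open the encoding auto-detection library file for editing

      sudo nano /usr/lib/ruby/vendor_ruby/rchardet/universaldetector.rb
    2. Add # encoding: US-ASCII as the first line of the file

    Even with this workaround, the output is not as clean as on Kali.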

    ~$ whatis whatweb
    whatweb (1)          - Web scanner to identify what websites are running.
    ~$ whatweb www.wikipedia.org
    /usr/share/whatweb/lib/tld.rb:85: warning: key "2nd_level_registration" is duplicated and overwritten on line 85
    /usr/share/whatweb/lib/tld.rb:93: warning: key "2nd_level_registration" is duplicated and overwritten on line 93
    /usr/share/whatweb/lib/tld.rb:95: warning: key "2nd_level_registration" is duplicated and overwritten on line 95
    /usr/share/whatweb/plugins/wordpress.rb:436: warning: key "2.7-beta1" is duplicated and overwritten on line 453
    /usr/share/whatweb/lib/extend-http.rb:102:in `connect': Object#timeout is deprecated, use Timeout.timeout instead.
    http://www.wikipedia.org [301] Cookies[WMF-Last-Access], Country[NETHERLANDS][NL], HTTPServer[Varnish], HttpOnly[WMF-Last-Access], IP[], RedirectLocation[https://www.wikipedia.org/], UncommonHeaders[x-varnish,x-cache-status,x-client-ip], Varnish
    /usr/share/whatweb/lib/extend-http.rb:102:in `connect': Object#timeout is deprecated, use Timeout.timeout instead.
    /usr/share/whatweb/lib/extend-http.rb:140:in `connect': Object#timeout is deprecated, use Timeout.timeout instead.
    https://www.wikipedia.org/ [200] Cookies[GeoIP,WMF-Last-Access], Country[NETHERLANDS][NL], Email[[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected]], HTML5, HTTPServer[mw1253.eqiad.wmnet], HttpOnly[WMF-Last-Access], IP[], probably MediaWiki, Script, Title[Wikipedia], UncommonHeaders[backend-timing,x-varnish,x-cache-status,strict-transport-security,x-analytics,x-client-ip], Varnish, Via-Proxy[1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4], X-Powered-By[HHVM/3.3.0-static]

    Output on Kali:

    ~# whatweb https://www.wikipedia.org
    https://www.wikipedia.org [200 OK] Cookies[GeoIP,WMF-Last-Access], Country[NETHERLANDS][NL], Email[[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected]], HTML5, HTTPServer[mw1253.eqiad.wmnet], HttpOnly[WMF-Last-Access], IP[], probably MediaWiki, Script, Strict-Transport-Security[max-age=31536000; includeSubDomains; preload], Title[Wikipedia], UncommonHeaders[backend-timing,x-varnish,x-cache-status,x-analytics,x-client-ip], Varnish, Via-Proxy[1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4], X-Powered-By[HHVM/3.3.0-static]
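
To cover the case of a file with one URL per line, something like the following could wrap whatweb. It is only a sketch. The names annotate_urls and scan_with_whatweb are made up for illustration, it assumes the installed whatweb accepts --color=never, and it writes to a separate output file rather than editing the list in place.

```shell
#!/usr/bin/env bash
# Default scanner: run whatweb on one URL and print its raw output
# (one line per HTTP response). Ruby warnings on stderr are discarded.
scan_with_whatweb() {
    whatweb --color=never "$1" 2>/dev/null
}

# annotate_urls LIST_FILE OUT_FILE [SCANNER]
# For every non-empty line in LIST_FILE, run SCANNER on it and write
# "URL<TAB>technologies" to OUT_FILE.
annotate_urls() {
    local list="$1" out="$2" scanner="${3:-scan_with_whatweb}" url tech
    : > "$out"
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue
        # Keep the last response line (the one after any redirects) and
        # drop its leading "URL [status] " prefix.
        tech=$("$scanner" "$url" < /dev/null \
               | awk 'NF { last = $0 } END { print last }' \
               | sed -E 's/^[^ ]+ \[[^]]*\] //')
        printf '%s\t%s\n' "$url" "$tech" >> "$out"
    done < "$list"
}
```

With a hypothetical list file sites.txt, running annotate_urls sites.txt sites-with-tech.txt would produce one tab-separated line per site. The scanner is passed in by name so that nikto or a curl-based function could be swapped in without touching the loop.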
