Using wget to download a file behind a login form

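I am trying to download my LaTeX project from sharelatex.com at regular intervals so that I have my own archive. Unfortunately, I cannot get past the login form. Here is my attempt; what am I doing wrong?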

#!/bin/bash
# Log in to the server.  This can be done only once.                   
wget --save-cookies cookies.txt \
     --keep-session-cookies \
     --post-data='[email protected]&password=myFancyPw' \
     --delete-after \
     --auth-no-challenge \
     https://www.sharelatex.com/project/SOME_PROJECT_NUMBER/download/zip
echo "------------------------------------------------------------------------------------------------------------------------------"
# Now grab the page or pages we care about.
wget --load-cookies cookies.txt \
     https://www.sharelatex.com/project/SOME_PROJECT_NUMBER/download/zip

Edit:

The cookie is stored, but the downloaded file zip.* contains only the HTML login page.

If I add the --verbose --debug flags to the wget commands, the output looks like this:

Setting --auth-no-challenge (authnochallenge) to 1
Setting --method (method) to POST
Setting --body-data (bodydata) to [email protected]&password=myFancyPw
DEBUG output created by Wget 1.17.1 on linux-gnu.

Reading HSTS entries from /home/USER/.wget-hsts
URI encoding = ‘UTF-8’
--2017-07-13 07:18:23--  https://www.sharelatex.com/project/SOME_PROJECT_NUMBER/download/zip
Resolving www.sharelatex.com (www.sharelatex.com)... 45.79.151.246
Caching www.sharelatex.com => 45.79.151.246
Connecting to www.sharelatex.com (www.sharelatex.com)|45.79.151.246|:443... connected.
Created socket 3.
Releasing 0x00005640d5b63b30 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 3 to SSL handle 0x00005640d5b649e0
certificate:
  subject: CN=*.sharelatex.com,OU=PositiveSSL Wildcard,OU=Domain Control Validated
  issuer:  CN=COMODO RSA Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB
X509 certificate successfully verified and matches host www.sharelatex.com

---request begin---
POST /project/SOME_PROJECT_NUMBER/download/zip HTTP/1.1
User-Agent: Wget/1.17.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: www.sharelatex.com
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 47

---request end---
[BODY data: [email protected]&password=myFancyPw]
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 403 Forbidden
Server: nginx
Date: Thu, 13 Jul 2017 05:18:24 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 9
Connection: keep-alive
X-Powered-By: Express
Vary: X-HTTP-Method-Override
ETag: W/"9-cilpV3qWyjlT6E49lJ3ugQ"
set-cookie: sharelatex_session=s%3A5y54WSx5DWpS2xwlGIQgQrZswlQgkYbu.u3gjqNtKhK%2BTQIrrG15QaWQHsEDNc%2BSI6sgOi%2BPpwsY; Domain=.sharelatex.com; Path=/; Expires=Tue, 18 Jul 2017 05:18:24 GMT; HttpOnly; Secure
X-Server-Group: green
Set-Cookie: SERVERID=sl-lin-prod-web-3; path=/

---response end---
403 Forbidden
cdm: 2 3 4 5 6 7 8
Stored cookie sharelatex.com -1 (ANY) / <permanent> <secure> [expiry 2017-07-18 07:18:24] sharelatex_session s%3A5y54WSx5DWpS2xwlGIQgQrZswlQgkYbu.u3gjqNtKhK%2BTQIrrG15QaWQHsEDNc%2BSI6sgOi%2BPpwsY

Stored cookie www.sharelatex.com -1 (ANY) / <session> <insecure> [expiry none] SERVERID sl-lin-prod-web-3
Registered socket 3 for persistent reuse.
URI content encoding = ‘utf-8’
Skipping 9 bytes of body: [Forbidden] done.
2017-07-13 07:18:24 ERROR 403: Forbidden.

Saving cookies to cookies.txt.
Done saving cookies.
Saving HSTS entries to /home/USER/.wget-hsts
------------------------------------------------------------------------------------------------------------------------------
--2017-07-13 07:18:24--  https://www.sharelatex.com/project/SOME_PROJECT_NUMBER/download/zip
Resolving www.sharelatex.com (www.sharelatex.com)... 45.79.151.246
Connecting to www.sharelatex.com (www.sharelatex.com)|45.79.151.246|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /restricted?from=%2Fproject%SOME_PROJECT_NUMBER%2Fdownload%2Fzip [following]
--2017-07-13 07:18:24--  https://www.sharelatex.com/restricted?from=%2Fproject%SOME_PROJECT_NUMBER%2Fdownload%2Fzip
Reusing existing connection to www.sharelatex.com:443.
HTTP request sent, awaiting response... 302 Found
Location: /login [following]
--2017-07-13 07:18:24--  https://www.sharelatex.com/login
Reusing existing connection to www.sharelatex.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 15737 (15K) [text/html]
Saving to: ‘zip.15’

zip.15                             100%[==============================================================>]  15.37K  --.-KB/s    in 0s      

2017-07-13 07:18:24 (362 MB/s) - ‘zip.15’ saved [15737/15737]

Answer 1

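I found a solution using Python and the requests library. I naively followed this tutorial, and both the login and the subsequent download worked without trouble: Logging in with Requests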
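
For readers who want something concrete, here is a minimal sketch of what such a requests-based login could look like. It is not the code from the tutorial. It rests on assumptions: that the login form lives at /login (the redirect target in the log above), that it takes the fields email and password, and that it may carry a hidden _csrf token. None of these was confirmed against the real site, and the helper names extract_csrf and fetch_zip are made up for illustration. The one deliberate difference from the wget script in the question is that the credentials are posted to the login URL and not to the download URL.

```python
import re

LOGIN_PATH = "/login"  # assumption: taken from the final redirect in the wget log


def extract_csrf(html):
    """Return the value of a hidden `_csrf` input, or None if there is none.

    The field name `_csrf` is an assumption about the login form.
    """
    tag = re.search(r'<input[^>]*name="_csrf"[^>]*>', html)
    if tag is None:
        return None
    value = re.search(r'value="([^"]*)"', tag.group(0))
    return value.group(1) if value else None


def fetch_zip(session, base_url, project_id, email, password):
    """Log in through `session`, then return the project archive as bytes.

    `session` is expected to behave like requests.Session: it keeps the
    cookies set by one response and sends them with the next request.
    """
    login_url = base_url + LOGIN_PATH

    # Load the login page first so the session cookie is set and a CSRF
    # token, if the form has one, can be read from the HTML.
    page = session.get(login_url)
    page.raise_for_status()

    form = {"email": email, "password": password}  # assumed field names
    token = extract_csrf(page.text)
    if token is not None:
        form["_csrf"] = token

    # Post the credentials to the login URL, not to the download URL.
    session.post(login_url, data=form).raise_for_status()

    # Same session, same cookies: the download is now an authenticated GET.
    reply = session.get(f"{base_url}/project/{project_id}/download/zip")
    reply.raise_for_status()

    # Every zip archive starts with the bytes "PK"; an HTML login page does not.
    if not reply.content.startswith(b"PK"):
        raise RuntimeError("response is not a zip archive; login probably failed")
    return reply.content


# Hypothetical usage (requires `pip install requests`):
#
#   import requests
#   data = fetch_zip(requests.Session(), "https://www.sharelatex.com",
#                    "SOME_PROJECT_NUMBER", "me@example.com", "myFancyPw")
#   with open("project.zip", "wb") as fh:
#       fh.write(data)
```

Passing the session in as a parameter keeps the function independent of the network, so it can be exercised with a stand-in object before pointing it at the real site. The check for the leading "PK" bytes guards against the failure described in the question, where the saved file turned out to be the HTML login page.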
