下载网页链接图片

下载网页链接图片

是否可以下载网页中链接的所有 .jpg 和 .png 文件?我想下载 [本论坛][1] 中每个主题的每个帖子中包含链接的图片。例如 [本帖子][2] 包含指向 [本文件][3] 的链接。

我已经尝试过使用 wget:

  wget -r -np http://www.mtgsalvation.com/forums/creativity/artwork/340782-official-digital-rendering-thread? 

它会复制该主题的所有 html 文件。虽然我不知道它为什么会从 跳转到...thread?comment=336...thread?comment=3232但它会一个接一个地跳到第 336 条评论。

答案1

尝试用这个命令:

wget -P path/where/save/result -A jpg,png -r http://www.mtgsalvation.com/forums/creativity/artwork/

根据wget 手册页

    -A acclist --accept acclist
        Specify comma-separated lists of file name suffixes or patterns to
        accept or reject (@pxref{Types of Files} for more details).
    -P prefix
        Set directory prefix to prefix.  The directory prefix is the direc‐
        tory where all other files and subdirectories will be saved to,
        i.e. the top of the retrieval tree.  The default is . (the current
        directory).
    -r
    --recursive
        Turn on recursive retrieving.

尝试这个:

    mkdir wgetDir
    wget -P wgetDir http://www.mtgsalvation.com/forums/creativity/artwork/340782-official-digital-rendering-thread?page=145

此命令将获取 html 页面并将其放入wgetDir。当我尝试此命令时,我发现了这个文件:

    340782-official-digital-rendering-thread?page=145

然后,我尝试了这个命令:

    wget -P wgetDir -A png,jpg,jpeg,gif -nd --force-html -r -i "wgetDir/340782-official-digital-rendering-thread?page=145"

它会下载图片。所以,它似乎可以工作,尽管我不知道这些图片是否是你想要下载的。

答案2

#include <stdio.h>
#include <stdlib.h> // for using system calls
#include <unistd.h> // for sleep

int main ()
{
    char  body[] = "forum-post-body-content", notes[] = "p-comment-notes", img[] = "img src=", link[200], cmd[200]={0}, file[10];
    int c, pos = 0, pos2 = 0, fin = 0, i, j, num = 0, found = 0;
    FILE *fp;

    for (i = 1; i <= 149; ++i)
    {
        sprintf(cmd,"wget -O page%d.txt 'http://www.mtgsalvation.com/forums/creativity/artwork/340782-official-digital-rendering-thread?page=%d'",i,i);
        system(cmd);
        sprintf(file, "page%d.txt", i);
        fp = fopen (file, "r");
        while ((c = fgetc(fp)) != EOF)
        {
            if (body[pos] == c)
            {
                if (pos == 22)
                {
                    pos = 0;
                    while (fin == 0)
                    {
                        c = fgetc (fp);
                        if (feof (fp))
                            break;
                        if (notes[pos] == c)
                        {
                            if (pos == 14)
                            {
                                fin = 1;
                                pos = -1;
                            }
                            ++pos;
                        }
                        else
                        {
                            if(pos > 0)
                                pos = 0;
                        }
                        if (img[pos2] == c)
                        {
                            if (pos2 == 7)
                            {
                                pos2 = 0;
                                while (found == 0)
                                {
                                    c = fgetc (fp); // get char from file
                                    link[pos2] = c;
                                    if (pos2 > 0)
                                    {
                                        if(link[pos2-1] == 'g' && link[pos2] == '\"')
                                        {
                                        found = 1;
                                        }
                                    }
                                    ++pos2;
                                }
                                --pos2;
                                found = 0;
                                char link2[pos2];
                                for (j = 1; j < pos2; ++j)
                                {
                                    link2[j - 1] = link[j];
                                }
                                link2[j - 1] = '\0';
                                sprintf(cmd, "wget -O /home/arturo/Dropbox/Digital_Renders/%d \'%s\'", ++num, link2);
                                system(cmd);
                                pos2 = -1;
                            }
                            ++pos2;
                        }
                        else
                        {
                            if(pos2 > 0)
                                pos2 = 0;
                        }
                    }
                fin = 0;
                }
                ++pos;
            }
            else
                pos = 0;
        }
        // closing file
        fclose (fp);
        if (remove (file))
            fprintf(stderr, "Can't remove file\n");
    }
}

相关内容