EADST

Obtain Links and Download Images from Webpages

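import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Return the page HTML as text, or None if the request fails."""
    try:
        res = requests.get(url, timeout=6)
        res.raise_for_status()
        # Use the encoding guessed from the body so non-ASCII pages decode correctly.
        res.encoding = res.apparent_encoding
        return res.text
    except requests.RequestException:
        return None


def main(url, save_dir="./save"):
    html = getHTMLText(url)
    if html is None:
        return
    os.makedirs(save_dir, exist_ok=True)
    soup = BeautifulSoup(html, 'html.parser')
    # Only <a> tags that actually carry an href attribute.
    a_labels = soup.find_all('a', attrs={'href': True})

    for idx, a in enumerate(a_labels):
        link = a.get('href')
        # Keep .jpg links that do not contain "res"; idx % 50 == 1 samples one link in 50.
        if "res" not in link and ".jpg" in link and idx % 50 == 1:
            img_url = urljoin(url, link)  # handles relative and absolute hrefs
            # Save under the bare file name so nested paths in the link do not break open().
            save_path = os.path.join(save_dir, os.path.basename(link))
            img = requests.get(img_url, timeout=6)
            if img.ok:
                with open(save_path, 'wb') as f:
                    f.write(img.content)


if __name__ == "__main__":
    main("http://eadst.com/")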
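
An `href` can be relative (`media/a.jpg`), root-relative (`/media/a.jpg`), or already absolute, so it has to be resolved against the page URL before the image is requested. The following is a minimal sketch of how the standard library's `urllib.parse.urljoin` treats those cases, and of one way to derive a flat local file name. The helper names `resolve_image_link` and `local_name` and the example paths are assumptions made for illustration, not part of the original script.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_image_link(page_url, href):
    """Resolve an href found on page_url into an absolute URL (hypothetical helper)."""
    return urljoin(page_url, href)


def local_name(image_url):
    """Return the last path component of a URL, ignoring any query string (hypothetical helper)."""
    return posixpath.basename(urlparse(image_url).path)


if __name__ == "__main__":
    base = "http://eadst.com/"
    # Example hrefs are made up to cover the three common forms.
    for href in ["media/a.jpg", "/media/a.jpg", "http://eadst.com/media/a.jpg?x=1"]:
        full = resolve_image_link(base, href)
        print(full, "->", local_name(full))
```

Plain string concatenation of the base URL and the `href` only gives a valid address for the first form; `urljoin` covers all three, and taking the base name of the URL path keeps the download from failing when the link contains subdirectories that do not exist locally.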