EADST

Obtain Links and Download Images from Webpages

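import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Fetch a page and return its decoded text, or None if the request fails."""
    try:
        res = requests.get(url, timeout=6)
        res.raise_for_status()
        # Guess the encoding from the body; the header value is often wrong or missing.
        res.encoding = res.apparent_encoding
        return res.text
    except requests.RequestException:
        return None


def main(url, save_dir="./save"):
    demo = getHTMLText(url)
    if demo is None:
        print("Failed to fetch", url)
        return
    soup = BeautifulSoup(demo, 'html.parser')
    # Collect every <a> tag that has an href attribute.
    a_labels = soup.find_all('a', attrs={'href': True})
    os.makedirs(save_dir, exist_ok=True)

    for idx, a in enumerate(a_labels):
        link = a.get('href')
        # Keep .jpg links that do not contain "res", sampling one link in every 50.
        if "res" not in link and ".jpg" in link and idx % 50 == 1:
            img_url = urljoin(url, link)  # handles relative and absolute hrefs
            # Use only the file name so nested paths in the link cannot break open().
            save_path = os.path.join(save_dir, os.path.basename(link))
            with open(save_path, 'wb') as f:
                f.write(requests.get(img_url, timeout=6).content)


if __name__ == "__main__":
    main("http://eadst.com/")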
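
The script builds each image URL and local path from the raw href. Below is a minimal sketch of one way to do that safely using only the standard library. The helper name `resolve_image_link` and the example hrefs are hypothetical, chosen for illustration and not taken from the site.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


def resolve_image_link(page_url, href):
    """Return (absolute_url, filename) for an href found on page_url.

    resolve_image_link is a hypothetical helper, not part of the original script.
    """
    # urljoin resolves relative hrefs against the page and leaves absolute ones unchanged.
    absolute_url = urljoin(page_url, href)
    # Take the last path segment only, ignoring any query string or directories.
    filename = PurePosixPath(urlparse(absolute_url).path).name
    return absolute_url, filename


if __name__ == "__main__":
    # Example hrefs are made up for illustration.
    for href in ("media/photo.jpg", "/media/photo.jpg", "http://example.com/a/b.jpg?x=1"):
        print(resolve_image_link("http://eadst.com/blog/", href))
```

Reducing the href to its file name keeps every download directly inside the save directory, at the cost of possible name collisions when two links share a file name.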