EADST

Obtain Links and Download Images from Webpages

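import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Return the page HTML as text, or None if the request fails."""
    try:
        res = requests.get(url, timeout=6)
        res.raise_for_status()
        # Use the encoding detected from the body; the header value can be missing or wrong.
        res.encoding = res.apparent_encoding
        return res.text
    except requests.RequestException:
        return None


def main(url, save_dir="./save"):
    html = getHTMLText(url)
    if html is None:
        print("Failed to fetch", url)
        return
    soup = BeautifulSoup(html, 'html.parser')
    # Keep only <a> tags that actually carry an href attribute.
    a_labels = soup.find_all('a', attrs={'href': True})
    os.makedirs(save_dir, exist_ok=True)

    for idx, a in enumerate(a_labels):
        link = a.get('href')
        # Skip links containing "res", keep .jpg links, and sample one link in every 50.
        if "res" not in link and ".jpg" in link and idx % 50 == 1:
            img_url = urljoin(url, link)  # handles both relative and absolute hrefs
            # Save under the file name only, so nested paths in the link cannot break open().
            save_path = os.path.join(save_dir, os.path.basename(link))
            res = requests.get(img_url, timeout=6)
            if res.ok:
                with open(save_path, 'wb') as f:
                    f.write(res.content)


url = "http://eadst.com/"
main(url)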
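
The link handling in the loop is the part most likely to go wrong, so here is a minimal sketch of it in isolation. It assumes that hrefs may be relative (`media/a.jpg`), root-relative (`/media/b.jpg`), or fully absolute, and that only the file name should be used for the local path. The helper name `resolve_image_link` and the example paths are hypothetical and are not taken from the site. Plain concatenation such as `url + link` only works for the first case, while `urljoin` from the standard library covers all three.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image_link(base_url, href, save_dir="./save"):
    """Return (absolute_url, local_path) for an href found on base_url.

    Hypothetical helper for illustration; not part of the original script.
    """
    # urljoin resolves href against base_url; an absolute href is returned unchanged.
    absolute_url = urljoin(base_url, href)
    # Take the file name from the URL path only, ignoring any query string.
    filename = os.path.basename(urlparse(absolute_url).path)
    return absolute_url, os.path.join(save_dir, filename)


if __name__ == "__main__":
    base = "http://eadst.com/"
    # Example hrefs are made up to show the three cases.
    for href in ["media/a.jpg", "/media/b.jpg", "http://example.com/c.jpg?size=large"]:
        print(resolve_image_link(base, href))
```

Dropping the directory part of the link means two images with the same file name in different folders would overwrite each other. If that matters, the URL path could be mirrored under the save directory instead, creating each parent folder with `os.makedirs` before writing.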