EADST

Convert PDFs to Images

Use Python and the pdf2image library to convert PDF documents into JPEG images, one image per page.
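pdf2image is a wrapper around the Poppler command-line tools, so the conversion fails if Poppler is not installed. A minimal sketch of a pre-flight check is shown below, assuming the `pdftoppm` executable should be discoverable on `PATH`; the helper name `poppler_available` is hypothetical and not part of pdf2image.

```python
import shutil


def poppler_available():
    """Return True if Poppler's pdftoppm executable can be found on PATH."""
    return shutil.which("pdftoppm") is not None


if not poppler_available():
    # convert_from_path also accepts a poppler_path argument that points
    # at the directory containing the Poppler binaries.
    print("pdftoppm not found: install Poppler or pass poppler_path to convert_from_path")
```

If Poppler lives outside `PATH` (common on Windows), passing `poppler_path` to `convert_from_path` is the usual alternative to editing the environment.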

from pdf2image import convert_from_path
import os

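def convert_pdf_to_images(pdf_path, output_folder):
    """Save every page of one PDF as page_<n>.jpg inside output_folder."""
    # Create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)

    # Render all pages with pdf2image; returns a list of PIL images
    images = convert_from_path(pdf_path)

    # Page numbers in the file names start at 1
    for i, image in enumerate(images):
        image_path = os.path.join(output_folder, f"page_{i+1}.jpg")
        image.save(image_path, 'JPEG')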

def process_all_pdfs(pdf_folder):
    """Convert every PDF found under pdf_folder, including subfolders."""
    for root, _dirs, files in os.walk(pdf_folder):
        for file in files:
            if file.lower().endswith('.pdf'):
                pdf_path = os.path.join(root, file)
                # Images go into a folder named after the PDF, next to it
                output_folder = os.path.join(root, os.path.splitext(file)[0])
                convert_pdf_to_images(pdf_path, output_folder)

# Replace with the folder that contains your PDFs
pdf_folder = '/your_folder_path/'
process_all_pdfs(pdf_folder)
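
The script above loads every page of a PDF into memory at once, which can be heavy for long documents. A possible variation, sketched below, converts a few pages at a time using the `first_page`, `last_page`, and `dpi` arguments of `convert_from_path`. The function names `page_batches` and `convert_pdf_in_batches` and the default values for `dpi` and `batch_size` are assumptions for illustration, not part of the original script.

```python
import os


def page_batches(total_pages, batch_size):
    """Yield (first_page, last_page) pairs, 1-based and inclusive."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def convert_pdf_in_batches(pdf_path, output_folder, dpi=150, batch_size=10):
    """Convert a PDF to JPEGs while holding only batch_size pages in memory."""
    # Imported here so page_batches() stays usable without pdf2image installed
    from pdf2image import convert_from_path, pdfinfo_from_path

    os.makedirs(output_folder, exist_ok=True)

    # pdfinfo_from_path reports document metadata, including the page count
    total_pages = int(pdfinfo_from_path(pdf_path)["Pages"])

    for first, last in page_batches(total_pages, batch_size):
        images = convert_from_path(
            pdf_path, dpi=dpi, first_page=first, last_page=last
        )
        for offset, image in enumerate(images):
            # Keep the same page_<n>.jpg naming as the original script
            image_path = os.path.join(output_folder, f"page_{first + offset}.jpg")
            image.save(image_path, "JPEG")
```

To try it, call `convert_pdf_in_batches(pdf_path, output_folder)` in place of `convert_pdf_to_images(pdf_path, output_folder)` inside `process_all_pdfs`. A lower `dpi` produces smaller images and faster conversion; a higher value gives sharper output at the cost of memory and disk space.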