EADST

Convert PDFs to Images

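Use Python and the pdf2image library to convert PDF documents into JPEG images, one image per page. The script below walks a folder recursively and, for each PDF it finds, writes the pages to a subfolder named after that PDF. pdf2image relies on Poppler, which must be installed separately.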

import os

from pdf2image import convert_from_path


def convert_pdf_to_images(pdf_path, output_folder):
    """Save every page of one PDF as page_<n>.jpg in output_folder."""
    os.makedirs(output_folder, exist_ok=True)

    # pdf2image renders each page to a PIL image (requires Poppler).
    images = convert_from_path(pdf_path)

    for i, image in enumerate(images, start=1):
        image_path = os.path.join(output_folder, f"page_{i}.jpg")
        image.save(image_path, "JPEG")


def process_all_pdfs(pdf_folder):
    """Walk pdf_folder recursively and convert every PDF found."""
    for root, _dirs, files in os.walk(pdf_folder):
        for file in files:
            if file.lower().endswith(".pdf"):
                pdf_path = os.path.join(root, file)
                # Images go into a folder next to the PDF, named after it.
                output_folder = os.path.join(root, os.path.splitext(file)[0])
                convert_pdf_to_images(pdf_path, output_folder)


if __name__ == "__main__":
    pdf_folder = "/your_folder_path/"  # replace with your own folder
    process_all_pdfs(pdf_folder)
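
For long or high-resolution PDFs, loading every page at once can use a lot of memory. One possible variation, sketched here under assumptions, converts the pages in batches using the `first_page` and `last_page` arguments of `convert_from_path` and reads the page count with `pdfinfo_from_path`. The helper names `page_batches` and `convert_pdf_in_batches`, the batch size of 10, and the 200 DPI setting are illustrative choices and not part of the original script.

```python
import os


def page_batches(total_pages, batch_size):
    """Yield (first_page, last_page) pairs, 1-based and inclusive."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def convert_pdf_in_batches(pdf_path, output_folder, batch_size=10, dpi=200):
    """Hypothetical helper: convert a PDF a few pages at a time."""
    # Imported here so page_batches can be used without pdf2image installed.
    from pdf2image import convert_from_path, pdfinfo_from_path

    os.makedirs(output_folder, exist_ok=True)
    # pdfinfo_from_path wraps Poppler's pdfinfo; "Pages" holds the page count.
    total_pages = int(pdfinfo_from_path(pdf_path)["Pages"])

    for first, last in page_batches(total_pages, batch_size):
        # Only the pages in this batch are rendered and held in memory.
        images = convert_from_path(
            pdf_path, dpi=dpi, first_page=first, last_page=last
        )
        for page_number, image in enumerate(images, start=first):
            image_path = os.path.join(output_folder, f"page_{page_number}.jpg")
            image.save(image_path, "JPEG")
```

Calling `convert_pdf_in_batches(pdf_path, output_folder)` in place of `convert_pdf_to_images` inside `process_all_pdfs` would keep at most `batch_size` rendered pages in memory at a time, at the cost of invoking Poppler once per batch.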