EADST

Save a Hugging Face Model as a Single Bin File

max_shard_size (int or str, optional, defaults to "10GB") — Only applicable for models. The maximum size for a checkpoint before being sharded. Checkpoints shard will then be each of size lower than this size. If expressed as a string, needs to be digits followed by a unit (like "5MB").

Based on this description, a model can be saved as a single bin file by setting "max_shard_size" to a value larger than the model's total size.

base_model.save_pretrained(output_dir, max_shard_size="100GB")  # base_model is a loaded LlamaForCausalLM; saves one file if the model is smaller than 100GB
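To see why a large limit yields a single file, here is a minimal sketch of the sharding idea in plain Python. The helpers `size_to_bytes` and `count_shards` are hypothetical names made up for illustration and are not part of the transformers API. The sketch assumes decimal units for "KB"/"MB"/"GB", binary units for "KiB"/"MiB"/"GiB", and a simple greedy packing of tensors in order; the library's real logic may differ in details.

```python
import re

# Hypothetical helpers for illustration only; not part of the transformers API.
# Assumption: "GB" is decimal (10**9 bytes) and "GiB" is binary (2**30 bytes).
_UNITS = {
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30,
}


def size_to_bytes(size):
    """Convert an int or a string such as "5MB" into a byte count."""
    if isinstance(size, int):
        return size
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]+)", size.strip())
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError(f"Unrecognized size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def count_shards(tensor_bytes, max_shard_size):
    """Greedily pack tensors, in order, and count the resulting shards.

    A tensor larger than the limit still ends up in a shard of its own.
    """
    limit = size_to_bytes(max_shard_size)
    shards, current = 1, 0
    for nbytes in tensor_bytes:
        # Start a new shard only if the current one already holds something.
        if current > 0 and current + nbytes > limit:
            shards += 1
            current = 0
        current += nbytes
    return shards


# Three made-up tensors of 4 GB each (12 GB in total).
tensors = [4 * 10**9] * 3
print(count_shards(tensors, "10GB"))   # 2
print(count_shards(tensors, "100GB"))  # 1
```

With the 10GB limit the third tensor no longer fits and spills into a second shard; with the 100GB limit everything fits in one.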

Reference

PreTrainedModel
