EADST

Save a Hugging Face Model as One Bin File

max_shard_size (int or str, optional, defaults to "10GB"): Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like "5MB").

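Based on the parameter description above, a model can be saved as a single bin file by setting "max_shard_size" to a value larger than the model's total size. Recent transformers releases write safetensors by default, so getting a ".bin" file may also require passing safe_serialization=False, depending on the installed version.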

base_model.save_pretrained(output_dir, max_shard_size="100GB")  # base_model is a loaded LlamaForCausalLM; saves one file if the model is smaller than 100GB
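To choose a limit that is large enough, it can help to estimate the weight size first. The following is a minimal sketch, not the library's own logic: the helper names parse_size and fits_in_one_file are hypothetical, the units are assumed to be decimal (1 GB = 10^9 bytes), and the estimate counts only parameters times bytes per parameter, ignoring buffers and file metadata.

```python
# Assumption: size strings use decimal units, e.g. "5MB" = 5 * 10**6 bytes.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(size):
    """Convert an int, or a string such as "10GB", into a number of bytes."""
    if isinstance(size, int):
        return size
    text = size.strip().upper()
    for unit, factor in _UNITS.items():
        if text.endswith(unit):
            # Digits followed by a unit: strip the unit and scale the digits.
            return int(text[: -len(unit)]) * factor
    raise ValueError(f"size must be digits followed by a unit, got {size!r}")


def fits_in_one_file(num_params, bytes_per_param, max_shard_size="10GB"):
    """Rough check: is the estimated weight size below the shard limit?"""
    estimated_bytes = num_params * bytes_per_param
    return estimated_bytes < parse_size(max_shard_size)


# Hypothetical example: 7 billion parameters stored in 16-bit floats (2 bytes each).
print(fits_in_one_file(7_000_000_000, 2, "10GB"))   # False: about 14 GB of weights
print(fits_in_one_file(7_000_000_000, 2, "100GB"))  # True
```

If the check returns False for the chosen limit, the checkpoint will most likely be split into several files, so the limit should be raised before calling save_pretrained.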

Reference

PreTrainedModel
