EADST

Sharding and SafeTensors in Hugging Face Transformers

In the Hugging Face transformers library, managing large models efficiently is crucial, especially when working with limited disk space or specific file size requirements. Two key features that help with this are sharding and the use of SafeTensors.

Sharding

Sharding is the process of splitting a large model's weights into smaller files or "shards." This is particularly useful when dealing with large models that exceed file size limits or when you want to manage storage more effectively.

Usage

To shard a model during the saving process, you can use the max_shard_size parameter in the save_pretrained method. Here's an example:

# Save the model with sharding, setting the maximum shard size to 1GB
model.save_pretrained('./model_directory', max_shard_size="1GB")

In this example, the model's weights are split across multiple files, each at most roughly 1GB (a single tensor larger than the limit still gets a shard of its own), together with an index file that maps each weight name to its shard. This makes storage and transfer more manageable, especially for large-scale models.
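To see the resulting layout, here is a minimal sketch that saves a deliberately tiny, randomly initialized GPT-2 and lists the files produced. The configuration sizes and the 200KB shard limit are chosen purely for illustration, so the example runs quickly without downloading anything:

```python
import os
import tempfile

from transformers import GPT2Config, GPT2LMHeadModel

# A deliberately tiny GPT-2 so the example needs no download;
# the sizes here are illustrative only.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=1000)
model = GPT2LMHeadModel(config)

with tempfile.TemporaryDirectory() as tmp:
    # A very small max_shard_size forces several shards even for a tiny model
    model.save_pretrained(tmp, max_shard_size="200KB")
    files = sorted(os.listdir(tmp))

print(files)
```

Alongside config.json, the directory contains numbered weight shards plus a *.index.json file that records which shard holds each parameter.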

SafeTensors

The safetensors library provides a format for storing tensors safely and efficiently. Traditional PyTorch checkpoints (.pt or .bin files) are pickle archives, and unpickling can execute arbitrary code, so loading an untrusted checkpoint is a security risk. A SafeTensors file, by contrast, is pure data (a JSON header plus raw tensor bytes): loading it never executes code, which adds an important layer of security when sharing models across different systems or with the community. The format also supports fast, memory-mapped loading.

Usage

To save a model in the SafeTensors format, pass the safe_serialization parameter to save_pretrained (in recent versions of transformers this is already the default):

# Save the model using SafeTensors format
model.save_pretrained('./model_directory', safe_serialization=True)

This will create files with the .safetensors extension, ensuring the saved tensors are stored safely.

Combining Sharding and SafeTensors

You can combine both sharding and SafeTensors to save a large model securely and efficiently:

# Save the model with sharding and SafeTensors
model.save_pretrained('./model_directory', max_shard_size="1GB", safe_serialization=True)

This setup splits the model into shards, each in the SafeTensors format, offering both manageability and security.

Conclusion

By leveraging sharding and SafeTensors, Hugging Face transformers users can handle large models more effectively. Sharding helps manage file sizes, while SafeTensors ensures the safe storage of tensor data. These features are essential for anyone working with large-scale models, providing both practical and security benefits.
