
Sharding and SafeTensors in Hugging Face Transformers

In the Hugging Face transformers library, managing large models efficiently is crucial, especially under limited disk space or strict per-file size limits. Two key features help with this: sharding and the SafeTensors format.

Sharding

Sharding is the process of splitting a large model's weights into smaller files or "shards." This is particularly useful when dealing with large models that exceed file size limits or when you want to manage storage more effectively.

Usage

To shard a model during the saving process, you can use the max_shard_size parameter in the save_pretrained method. Here's an example:

# Save the model with sharding, setting the maximum shard size to 1GB
model.save_pretrained('./model_directory', max_shard_size="1GB")

In this example, the model's weights will be divided into multiple files, each not exceeding 1GB. This can make storage and transfer more manageable, especially when dealing with large-scale models.
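Alongside the shard files, save_pretrained writes an index JSON that maps each parameter name to the shard containing it. Below is a minimal sketch for inspecting the result; the exact file names depend on the transformers version and on whether SafeTensors serialization is used:

# Inspect a sharded checkpoint directory
import json
import os

directory = './model_directory'
print(sorted(os.listdir(directory)))
# e.g. ['config.json', 'model-00001-of-00002.safetensors', ..., 'model.safetensors.index.json']

# The index file records the total size and a weight-to-shard mapping
index_file = next(f for f in os.listdir(directory) if f.endswith('.index.json'))
with open(os.path.join(directory, index_file)) as f:
    index = json.load(f)
print(index['metadata'])                      # e.g. {'total_size': ...} in bytes
print(list(index['weight_map'].items())[:3])  # first few (parameter, shard) pairs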

SafeTensors

The safetensors library provides a format for storing tensors safely and efficiently. Traditional PyTorch checkpoints (.pt or .bin files) are pickle-based, and unpickling can execute arbitrary code; SafeTensors stores only raw tensor data and metadata, so loading a file cannot run code. This is particularly important when sharing models across different systems or with the community.

Usage

To save a model using SafeTensors, set the safe_serialization parameter when saving (recent versions of transformers default to SafeTensors, but setting it explicitly makes the intent clear):

# Save the model using SafeTensors format
model.save_pretrained('./model_directory', safe_serialization=True)

This writes files with the .safetensors extension instead of pickle-based .bin files.
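To verify the contents of a saved file, you can read it back directly with the safetensors library, which never invokes pickle. A minimal sketch, assuming an unsharded save that produced the default file name model.safetensors:

# Load tensors directly from a .safetensors file (no code execution possible)
from safetensors.torch import load_file

state_dict = load_file('./model_directory/model.safetensors')
for name, tensor in list(state_dict.items())[:3]:
    print(name, tuple(tensor.shape), tensor.dtype)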

Combining Sharding and SafeTensors

You can combine both sharding and SafeTensors to save a large model securely and efficiently:

# Save the model with sharding and SafeTensors
model.save_pretrained('./model_directory', max_shard_size="1GB", safe_serialization=True)

This setup splits the model into shards, each in the SafeTensors format, offering both manageability and security.
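Loading the result back needs no extra arguments: from_pretrained reads the index file and reassembles all the shards automatically. A sketch, using AutoModel as a stand-in for whatever model class was saved above:

# Reload the sharded SafeTensors checkpoint; shards are resolved via the index file
from transformers import AutoModel

model = AutoModel.from_pretrained('./model_directory')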

Conclusion

By leveraging sharding and SafeTensors, Hugging Face transformers users can handle large models more effectively. Sharding helps manage file sizes, while SafeTensors ensures the safe storage of tensor data. These features are essential for anyone working with large-scale models, providing both practical and security benefits.
