
Sharding and SafeTensors in Hugging Face Transformers

In the Hugging Face transformers library, managing large models efficiently is crucial, especially when working with limited disk space or specific file size requirements. Two key features that help with this are sharding and the use of SafeTensors.

Sharding

Sharding is the process of splitting a large model's weights into smaller files or "shards." This is particularly useful when dealing with large models that exceed file size limits or when you want to manage storage more effectively.

Usage

To shard a model when saving it, pass the max_shard_size parameter to the save_pretrained method. In the example below, model is assumed to be an already-loaded transformers model:

# Save the model with sharding, setting the maximum shard size to 1GB
model.save_pretrained('./model_directory', max_shard_size="1GB")

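In this example, the model's weights are divided into multiple files, each targeting at most 1GB, and an index file is written alongside them to record which shard holds each weight. A single weight that is larger than the limit still goes into its own shard, which then exceeds 1GB. Smaller files make storage and transfer more manageable for large-scale models.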
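To make the splitting behaviour concrete, here is a minimal sketch of a greedy sharding plan in plain Python. It is a simplified illustration, not the code transformers actually runs: the helper names parse_size and plan_shards, the weight names, and the byte counts are all made up for the example, and only decimal units (KB, MB, GB) are handled.

```python
def parse_size(size):
    """Convert a string such as "1GB" or "200MB" into bytes (decimal units only)."""
    units = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
    for suffix, factor in units.items():
        if size.upper().endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    return int(size)


def plan_shards(weight_sizes, max_shard_size):
    """Greedily group weight names into shards, in insertion order.

    weight_sizes maps a weight name to its size in bytes. A new shard is
    started whenever adding the next weight would exceed the limit. A weight
    larger than the limit ends up alone in a shard that exceeds the limit.
    """
    limit = parse_size(max_shard_size)
    shards = [[]]
    current = 0
    for name, nbytes in weight_sizes.items():
        if shards[-1] and current + nbytes > limit:
            shards.append([])
            current = 0
        shards[-1].append(name)
        current += nbytes
    return shards


# Hypothetical weights and sizes, chosen only to exercise the logic.
sizes = {
    "embed.weight": 600_000_000,
    "layer0.weight": 300_000_000,
    "layer1.weight": 300_000_000,
    "head.weight": 1_500_000_000,  # larger than the 1GB limit on its own
}
plan = plan_shards(sizes, "1GB")
for i, names in enumerate(plan, start=1):
    print(f"shard {i}: {names}")
```

The last shard in this plan holds a single 1.5GB weight, which shows why a shard can end up larger than the requested maximum.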

SafeTensors

The safetensors library provides a format for storing tensors safely and efficiently. Pickle-based formats, such as PyTorch's .pt and .bin files, can execute arbitrary code when they are loaded. A .safetensors file contains only a JSON header and raw tensor bytes, so loading it does not run any code. This matters most when you download models from other people or share your own with the community.

Usage

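To save a model using SafeTensors, pass safe_serialization=True when saving. Recent releases of transformers already default to this format, so the explicit flag mainly matters on older versions: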

# Save the model using SafeTensors format
model.save_pretrained('./model_directory', safe_serialization=True)

This will create files with the .safetensors extension, ensuring the saved tensors are stored safely.
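To see why the format is safe to load, it helps to look at the file layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below writes and reads that layout with only the standard library. It is a hand-rolled illustration under that assumption about the layout, and the function names write_minimal_safetensors and read_header are hypothetical; real code should use the safetensors library itself.

```python
import json
import os
import struct
import tempfile


def write_minimal_safetensors(path, name, floats):
    """Write one 1-D float32 tensor using the safetensors file layout."""
    data = struct.pack(f"<{len(floats)}f", *floats)
    header = {
        name: {
            "dtype": "F32",
            "shape": [len(floats)],
            "data_offsets": [0, len(data)],  # relative to the start of the data section
        }
    }
    header_bytes = json.dumps(header).encode("utf-8")
    # Pad the header with spaces so the data section starts on an 8-byte boundary.
    header_bytes += b" " * (-len(header_bytes) % 8)
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(data)


def read_header(path):
    """Read only the JSON header; no tensor data is loaded and no code is run."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_minimal_safetensors(demo_path, "weight", [1.0, 2.0, 3.0])
    header = read_header(demo_path)

print(header)
```

Because the header is plain JSON, a tool can list tensor names, dtypes, and shapes without deserializing any Python objects.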

Combining Sharding and SafeTensors

You can combine both sharding and SafeTensors to save a large model securely and efficiently:

# Save the model with sharding and SafeTensors
model.save_pretrained('./model_directory', max_shard_size="1GB", safe_serialization=True)

This setup splits the model into shards, each in the SafeTensors format, offering both manageability and security.
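A sharded SafeTensors save also produces an index file, model.safetensors.index.json, whose weight_map entry maps each weight name to the shard file that stores it. The following sketch groups weights by shard from such an index. The function name weights_per_shard is hypothetical, and the demo index is written by hand with made-up weight names so the example runs without a real model.

```python
import json
import os
import tempfile
from collections import defaultdict


def weights_per_shard(index_path):
    """Return {shard file name: [weight names]} from a sharded-checkpoint index."""
    with open(index_path, encoding="utf-8") as f:
        index = json.load(f)
    shards = defaultdict(list)
    for weight_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(weight_name)
    return dict(shards)


# Hand-written stand-in for the index file that a sharded save produces.
demo_index = {
    "metadata": {"total_size": 1_200_000_000},
    "weight_map": {
        "embed.weight": "model-00001-of-00002.safetensors",
        "layer0.weight": "model-00001-of-00002.safetensors",
        "layer1.weight": "model-00002-of-00002.safetensors",
    },
}

with tempfile.TemporaryDirectory() as tmp:
    index_path = os.path.join(tmp, "model.safetensors.index.json")
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(demo_index, f)
    grouped = weights_per_shard(index_path)

for shard_file, names in grouped.items():
    print(shard_file, names)
```

Pointing index_path at the index file inside './model_directory' should give the same kind of summary for a real sharded save.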

Conclusion

By leveraging sharding and SafeTensors, Hugging Face transformers users can handle large models more effectively. Sharding helps manage file sizes, while SafeTensors ensures the safe storage of tensor data. These features are essential for anyone working with large-scale models, providing both practical and security benefits.
