
Sharding and SafeTensors in Hugging Face Transformers

Large models are awkward to store and move around, especially when disk space is limited or a hosting service caps the size of individual files. The Hugging Face transformers library offers two features that help with this: sharding and the SafeTensors format.

Sharding

Sharding is the process of splitting a large model's weights into smaller files or "shards." This is particularly useful when dealing with large models that exceed file size limits or when you want to manage storage more effectively.

Usage

To shard a model during the saving process, you can use the max_shard_size parameter in the save_pretrained method. Here's an example:

# `model` is any already-loaded transformers model (a PreTrainedModel instance)
# Save the model with sharding, setting the maximum shard size to 1GB
model.save_pretrained('./model_directory', max_shard_size="1GB")

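In this example, the model's weights are divided into multiple files of at most 1GB each, and an index file is written alongside them that records which shard holds each weight. One exception: a single weight that is larger than max_shard_size is placed in a shard of its own, and that shard will exceed the limit. Smaller files make storage and transfer more manageable, especially for large-scale models.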
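To make the splitting behaviour concrete, here is a minimal sketch of a greedy packing scheme of the kind described above. It is not the library's implementation: the helper names `parse_size` and `plan_shards` and the tensor sizes are made up for illustration, and decimal units (1GB = 10^9 bytes) are assumed.

```python
def parse_size(size: str) -> int:
    """Convert a size string such as "1GB" or "200MB" into bytes (decimal units assumed)."""
    units = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
    for suffix, factor in units.items():
        if size.upper().endswith(suffix):
            return int(float(size[: -len(suffix)]) * factor)
    return int(size)  # no suffix: treat the string as a plain byte count


def plan_shards(tensor_sizes: dict[str, int], max_shard_size: str) -> list[list[str]]:
    """Greedily pack tensors, in order, into shards that stay under the limit."""
    limit = parse_size(max_shard_size)
    shards: list[list[str]] = [[]]
    current = 0
    for name, nbytes in tensor_sizes.items():
        # Start a new shard when adding this tensor would overflow the current one.
        if shards[-1] and current + nbytes > limit:
            shards.append([])
            current = 0
        shards[-1].append(name)
        current += nbytes
    return shards


# Hypothetical tensor sizes in bytes; "head" alone is larger than the 1GB limit.
sizes = {
    "embed": 600_000_000,
    "layer0": 300_000_000,
    "layer1": 300_000_000,
    "head": 1_500_000_000,
}
plan = plan_shards(sizes, "1GB")
for i, names in enumerate(plan, start=1):
    print(f"model-{i:05d}-of-{len(plan):05d}.safetensors: {names}")
```

With these sizes the plan has three shards, and the oversized "head" tensor ends up alone in the last one, which is why a shard can occasionally be larger than the requested maximum.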

SafeTensors

The safetensors library provides a format for storing tensors safely and efficiently. Traditional PyTorch checkpoint files (.pt or .bin) are based on Python's pickle module, and loading a pickle file can run arbitrary code embedded in it. A SafeTensors file contains only a description of the tensors and their raw data, so loading it cannot execute code. This matters most when you download models from other people or share your own with the community.
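The risk with pickle-based files can be shown using nothing but the standard library. The sketch below is a deliberately harmless illustration: the `Payload` class is hypothetical, and the embedded call only evaluates an arithmetic expression, whereas a malicious file could invoke any importable function.

```python
import pickle


class Payload:
    """An object whose pickle stream asks the loader to call a function."""

    def __reduce__(self):
        # pickle will call eval("6 * 7") while loading. A real attack could
        # name os.system or any other importable callable here instead.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading runs the embedded call; what comes back is the call's result,
# not a Payload object.
result = pickle.loads(blob)
print(result)
```

Simply loading the bytes was enough to run code chosen by whoever produced them. This is the class of problem a data-only format avoids.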

Usage

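To save a model using SafeTensors, pass safe_serialization=True when saving. In recent releases of transformers this is already the default, so the argument mainly matters on older versions or when you want to be explicit: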

# Save the model using SafeTensors format
model.save_pretrained('./model_directory', safe_serialization=True)

This writes the weights to files with the .safetensors extension (model.safetensors for an unsharded model) instead of a pickle-based .bin file.
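A .safetensors file is simple enough to inspect by hand: an 8-byte little-endian length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below builds a tiny file following that layout and reads the header back using only the standard library. The helper name `read_safetensors_header` and the demo tensor are assumptions for illustration; for real work, use the safetensors library itself.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path: Path) -> dict:
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        return json.loads(f.read(header_len))


# Build a tiny demo file by hand: one 2x2 float32 tensor of zeros (16 bytes).
header = {"weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
header_bytes += b" " * (-len(header_bytes) % 8)  # pad the header with spaces to 8-byte alignment

demo_path = Path(tempfile.mkdtemp()) / "demo.safetensors"
demo_path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16))

parsed = read_safetensors_header(demo_path)
for name, info in parsed.items():
    if name == "__metadata__":  # optional free-form metadata entry in real files
        continue
    print(name, info["dtype"], info["shape"])
```

Everything in the header is plain data (names, dtypes, shapes, byte offsets), which is what makes the format safe to open even when the file comes from an untrusted source.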

Combining Sharding and SafeTensors

You can combine both sharding and SafeTensors to save a large model securely and efficiently:

# Save the model with sharding and SafeTensors
model.save_pretrained('./model_directory', max_shard_size="1GB", safe_serialization=True)

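This setup splits the model into shards, each stored in the SafeTensors format, together with an index file that maps every weight to its shard. Loading needs no extra steps: from_pretrained('./model_directory') reads the index and reassembles the model from the shards.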
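A sharded SafeTensors checkpoint is accompanied by a JSON index file (model.safetensors.index.json) whose "weight_map" entry records which shard stores each parameter. The following sketch groups parameter names by shard. The index contents are hypothetical sample data written to a temporary directory, and `tensors_per_shard` is an illustrative helper, not a library function.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def tensors_per_shard(index_path: Path) -> dict[str, list[str]]:
    """Group parameter names by the shard file that stores them."""
    index = json.loads(index_path.read_text())
    grouped: dict[str, list[str]] = defaultdict(list)
    for param_name, shard_file in index["weight_map"].items():
        grouped[shard_file].append(param_name)
    return dict(grouped)


# Hypothetical index contents, written to a temp directory for the demo.
sample_index = {
    "metadata": {"total_size": 32},
    "weight_map": {
        "embed.weight": "model-00001-of-00002.safetensors",
        "layer.0.weight": "model-00001-of-00002.safetensors",
        "head.weight": "model-00002-of-00002.safetensors",
    },
}
index_path = Path(tempfile.mkdtemp()) / "model.safetensors.index.json"
index_path.write_text(json.dumps(sample_index))

shard_map = tensors_per_shard(index_path)
for shard_file, params in shard_map.items():
    print(shard_file, params)
```

Pointing the same function at the index file in a real './model_directory' is a quick way to check how a saved model was split.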

Conclusion

By leveraging sharding and SafeTensors, Hugging Face transformers users can handle large models more effectively. Sharding helps manage file sizes, while SafeTensors ensures the safe storage of tensor data. These features are essential for anyone working with large-scale models, providing both practical and security benefits.
