Sharding and SafeTensors in Hugging Face Transformers
Author: XD / Published: July 29, 2024, 22:01 / Updated: July 29, 2024, 22:06 / Programming Notes / Views: 785
In the Hugging Face transformers library, managing large models efficiently is crucial, especially when working with limited disk space or specific file size requirements. Two key features that help with this are sharding and the use of SafeTensors.
Sharding
Sharding is the process of splitting a large model's weights into smaller files or "shards." This is particularly useful when dealing with large models that exceed file size limits or when you want to manage storage more effectively.
Usage
To shard a model during saving, pass the max_shard_size parameter to the save_pretrained method. Here's an example:
# Save the model with sharding, setting the maximum shard size to 1GB
model.save_pretrained('./model_directory', max_shard_size="1GB")
In this example, the model's weights will be divided into multiple files, each not exceeding 1GB. This can make storage and transfer more manageable, especially when dealing with large-scale models.
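Conceptually, sharding by max_shard_size is a greedy split of the state dict by byte size: weights are assigned to the current shard until adding the next one would exceed the limit. The sketch below illustrates that idea in plain Python; the function names and the use of raw byte counts are illustrative, not the actual transformers internals (which also handle tied weights and write an index file).

```python
# Simplified sketch of max_shard_size splitting: greedily assign each
# weight to the current shard until adding it would exceed the limit.
# (Illustrative only, not the real transformers implementation.)

def parse_size(size: str) -> int:
    """Parse strings like '1GB' or '500MB' into a byte count."""
    units = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
    for suffix, factor in units.items():
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    return int(size)  # bare number of bytes

def shard_state_dict(sizes: dict, max_shard_size: str = "1GB") -> list:
    """Split a {weight_name: byte_size} mapping into a list of shards,
    each staying at or under the requested maximum size."""
    limit = parse_size(max_shard_size)
    shards, current, current_size = [], {}, 0
    for name, nbytes in sizes.items():
        # Start a new shard if this tensor would overflow the current one.
        if current and current_size + nbytes > limit:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = nbytes
        current_size += nbytes
    if current:
        shards.append(current)
    return shards
```

Each resulting shard is then saved to its own file, named along the lines of model-00001-of-00003.safetensors.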
SafeTensors
The safetensors library provides a format for storing tensors safely and efficiently. Unlike pickle-based formats such as PyTorch's .pt files, which can execute arbitrary code when loaded, SafeTensors contains no executable code at all, offering an additional layer of security. This is particularly important when sharing models across different systems or with the community.
Usage
To save a model using SafeTensors, specify the safe_serialization parameter when saving:
# Save the model using SafeTensors format
model.save_pretrained('./model_directory', safe_serialization=True)
This will create files with the .safetensors extension, ensuring the saved tensors are stored safely.
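Part of what makes the format safe is its simplicity: a .safetensors file is just an 8-byte little-endian header length, a JSON header describing each tensor's dtype, shape, and data offsets, and then the raw tensor bytes. Parsing JSON cannot execute code, unlike unpickling. The following is a minimal pure-Python sketch of that layout for illustration; real files should of course be written and read with the safetensors library itself.

```python
import json
import struct

def write_safetensors(path, tensors):
    """Write {name: (dtype, shape, raw_bytes)} in the safetensors layout:
    8-byte little-endian header size, JSON header, then raw tensor data."""
    header, body, offset = {}, b"", 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        body += data
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # header size
        f.write(header_bytes)                          # JSON metadata
        f.write(body)                                  # raw tensor bytes

def read_safetensors(path):
    """Read the layout back. Only JSON is parsed -- no code is executed."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        body = f.read()
    return {name: body[meta["data_offsets"][0]:meta["data_offsets"][1]]
            for name, meta in header.items()}
```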
Combining Sharding and SafeTensors
You can combine both sharding and SafeTensors to save a large model securely and efficiently:
# Save the model with sharding and SafeTensors
model.save_pretrained('./model_directory', max_shard_size="1GB", safe_serialization=True)
This setup splits the model into shards, each in the SafeTensors format, offering both manageability and security.
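Alongside the shards, transformers writes an index file (model.safetensors.index.json) whose weight_map records which shard holds each weight, so from_pretrained can locate and load them transparently. A rough sketch of building such an index; the helper name build_index and the shard representation are illustrative:

```python
def build_index(shards):
    """Build a model.safetensors.index.json-style dict from a list of
    shards, where each shard is a {weight_name: byte_size} mapping.
    (Illustrative sketch, not the transformers implementation.)"""
    total = len(shards)
    weight_map, total_size = {}, 0
    for i, shard in enumerate(shards, start=1):
        # Shard files are numbered model-00001-of-0000N.safetensors.
        filename = f"model-{i:05d}-of-{total:05d}.safetensors"
        for name, nbytes in shard.items():
            weight_map[name] = filename
            total_size += nbytes
    return {"metadata": {"total_size": total_size},
            "weight_map": weight_map}
```

When loading, from_pretrained reads this index first and then opens only the shard files it needs.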
Conclusion
By leveraging sharding and SafeTensors, Hugging Face transformers users can handle large models more effectively. Sharding helps manage file sizes, while SafeTensors ensures the safe storage of tensor data. These features are essential for anyone working with large-scale models, providing both practical and security benefits.