EADST

LLAMA Model Save with INT8 Format

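from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Ask bitsandbytes to quantize the weights to INT8 while loading.
config = BitsAndBytesConfig(
    load_in_8bit=True,
)
path = "/home/llm/model/path/"  # folder containing the original checkpoint
# Note: bitsandbytes 8-bit loading has traditionally required a CUDA GPU; if
# device_map="cpu" raises an error in your setup, try device_map="auto".
model = AutoModelForCausalLM.from_pretrained(path, device_map="cpu", quantization_config=config)
# Write the quantized weights and config to a new folder.
model.save_pretrained("model_save_folder-8bit")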
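
To check the result of the save step, something like the following sketch may help. It measures the on-disk size of the output folder and reloads the checkpoint. The helper names folder_size_bytes and reload_8bit are hypothetical, not part of transformers. The sketch assumes that the quantization settings were written into the saved config.json (so no BitsAndBytesConfig is passed again) and that accelerate is installed for device_map="auto".

```python
from pathlib import Path


def folder_size_bytes(folder: str) -> int:
    """Sum the sizes of all regular files under `folder`, recursively."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def reload_8bit(folder: str):
    """Reload a checkpoint written by save_pretrained().

    Assumption: the quantization settings are stored in the folder's
    config.json, so they are picked up without passing a config again.
    """
    # Imported lazily so the size helper works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(folder, device_map="auto")


if __name__ == "__main__":
    folder = "model_save_folder-8bit"  # same folder name as in the save step
    if Path(folder).is_dir():
        print(f"{folder}: {folder_size_bytes(folder) / 1e9:.2f} GB on disk")
```

Comparing this size with that of the original model folder is a quick sanity check: an INT8 checkpoint should be noticeably smaller than an FP16 one, though the exact ratio depends on which layers were left unquantized.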