EADST

Saving a LLaMA Model in INT8 Format

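# Requires the bitsandbytes package alongside transformers.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Ask transformers to load the linear layers as 8-bit (LLM.int8) weights.
config = BitsAndBytesConfig(
    load_in_8bit=True,
)
path = "/home/llm/model/path/"  # directory containing the original checkpoint
# device_map="cpu" is kept from the original snippet; depending on the installed
# bitsandbytes version, 8-bit loading may require a CUDA GPU instead.
model = AutoModelForCausalLM.from_pretrained(path, device_map="cpu", quantization_config=config)
# Write the weights and config (including the quantization settings) to a new folder.
model.save_pretrained("model_save_folder-8bit")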
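After saving, it can be useful to confirm that the output folder really records the 8-bit setting. The following is a minimal sketch using only the standard library. It assumes that save_pretrained writes a config.json containing a "quantization_config" entry with a "load_in_8bit" field, which is how recent transformers versions serialize a BitsAndBytesConfig; the helper name is_saved_as_int8 is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def is_saved_as_int8(model_dir: str) -> bool:
    """Return True if config.json in model_dir records 8-bit quantization.

    Hypothetical helper: it assumes the "quantization_config" / "load_in_8bit"
    layout that transformers writes into config.json for bitsandbytes models.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        # No config.json means this is not a saved model folder.
        return False
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # The entry may be missing or null for a non-quantized checkpoint.
    quant = config.get("quantization_config") or {}
    return bool(quant.get("load_in_8bit", False))


if __name__ == "__main__":
    # Folder name taken from the save_pretrained call above.
    print(is_saved_as_int8("model_save_folder-8bit"))
```

If the check passes, the folder should be loadable again with AutoModelForCausalLM.from_pretrained("model_save_folder-8bit"), since the quantization settings are read back from config.json; that reload step is not exercised in this sketch.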