EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Implements 4-bit floating-point (FP4) quantization for model weights.
  • FP8 Activation Quantization: Uses 8-bit floating-point (FP8) quantization for activations, trading a small amount of precision for lower memory and compute cost.
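
The two highlights above can be illustrated with a small simulated ("fake") quantization sketch in NumPy. This is not the paper's implementation: the E2M1 layout for FP4, the E4M3 layout for FP8, the per-row scaling, and the function names `fake_quantize_fp4` and `fake_quantize_fp8_e4m3` are all assumptions chosen for illustration, and the FP8 routine ignores subnormals.

```python
import numpy as np

# Non-negative magnitudes representable in FP4 with an E2M1 layout
# (1 sign bit, 2 exponent bits, 1 mantissa bit). Layout is an assumption.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_quantize_fp4(weights: np.ndarray) -> np.ndarray:
    """Round each row of a 2-D weight matrix to the nearest scaled FP4 value."""
    # Per-row scale maps the largest magnitude in the row onto the largest FP4 value.
    max_abs = np.abs(weights).max(axis=1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / FP4_E2M1_GRID[-1], 1.0)
    scaled = np.abs(weights) / scale
    # Index of the nearest grid point for every element.
    idx = np.abs(scaled[..., None] - FP4_E2M1_GRID).argmin(axis=-1)
    return np.sign(weights) * FP4_E2M1_GRID[idx] * scale


def fake_quantize_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Round values to 4 significant bits and clip to +/-448 (E4M3 range).

    Simplification: subnormals and the minimum exponent are not modeled.
    """
    x = np.clip(x, -448.0, 448.0)
    mantissa, exponent = np.frexp(x)  # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    # Implicit leading bit plus 3 stored mantissa bits gives 4 significant bits.
    mantissa = np.round(mantissa * 16.0) / 16.0
    return np.ldexp(mantissa, exponent)


if __name__ == "__main__":
    w = np.array([[6.0, 3.0, -1.4, 0.2]])
    a = np.array([1.1, 1000.0, -0.3])
    print(fake_quantize_fp4(w))
    print(fake_quantize_fp8_e4m3(a))
```

Both functions return ordinary floating-point arrays holding the rounded values, which is enough to measure quantization error offline; real W4A8 inference would additionally need packed storage and hardware kernels, which this sketch does not cover.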