Quick Review: ZeroQuant-FP

Author: XD / Published: December 7, 2023 00:32 / Updated: December 7, 2023 00:56 / Research & Study / Views: 1892

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Paper: ZeroQuant-FP on arXiv
Code: ZeroQuant-FP on GitHub
Organization: Microsoft

Highlights:

FP4 Weight Quantization: Quantizes model weights to a 4-bit floating-point (FP4) format.
FP8 Activation Quantization: Quantizes activations to an 8-bit floating-point (FP8) format, aiming to preserve accuracy at a lower inference cost than 16-bit activations.
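
To make the two highlights concrete, here is a minimal pure-Python sketch of simulated ("fake") W4A8 floating-point quantization. It is an illustration under stated assumptions, not the paper's implementation: the E2M1 layout for FP4, the E4M3 layout with a maximum of 448 for FP8, the single max-abs scale per group, and all function names are hypothetical choices made here for the example.

```python
import math

# Magnitudes representable by FP4 in an E2M1 layout (1 sign, 2 exponent,
# 1 mantissa bit). Assumption: the review does not state the exact layout.
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

# Largest finite magnitude of an E4M3-style FP8 format (assumption).
FP8_E4M3_MAX = 448.0


def quantize_weights_fp4(weights):
    """Fake-quantize a group of weights to FP4 using one max-abs scale."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 1.0
    # Map the largest weight magnitude onto the largest FP4 magnitude.
    scale = max_abs / FP4_E2M1_GRID[-1]
    quantized = []
    for w in weights:
        # Snap the scaled magnitude to the nearest representable FP4 value.
        mag = min(FP4_E2M1_GRID, key=lambda g: abs(abs(w) / scale - g))
        quantized.append(math.copysign(mag, w) * scale)
    return quantized, scale


def round_to_fp8_e4m3(x):
    """Round x to the nearest value with 3 mantissa bits, saturating at 448."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), FP8_E4M3_MAX)
    # frexp gives a = m * 2**e with 0.5 <= m < 1, so floor(log2(a)) = e - 1.
    # Clamping at -6 (smallest normal exponent) also covers subnormals.
    exp = max(math.frexp(a)[1] - 1, -6)
    step = 2.0 ** (exp - 3)  # spacing between neighbours at this exponent
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)


def quantize_activations_fp8(acts):
    """Fake-quantize activations to FP8 using one max-abs scale."""
    max_abs = max(abs(a) for a in acts)
    if max_abs == 0.0:
        return list(acts), 1.0
    scale = max_abs / FP8_E4M3_MAX
    return [round_to_fp8_e4m3(a / scale) * scale for a in acts], scale


if __name__ == "__main__":
    w_q, w_scale = quantize_weights_fp4([3.0, -1.4, 0.2, 0.0])
    print(w_q, w_scale)  # [3.0, -1.5, 0.25, 0.0] 0.5
    print(round_to_fp8_e4m3(1.07))  # 1.125
```

The weight example shows how coarse the FP4 grid is: -1.4 lands on -1.5 and 0.2 on 0.25, because only eight magnitudes are available per scale. FP8 keeps three mantissa bits across a much wider exponent range, which is the intuition behind pairing 4-bit weights with 8-bit activations.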

Author: XD. Please credit the source when reposting: http://www.eadst.com/blog/227

This site is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Related tags: LLM, Quantization
