EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Implements 4-bit floating-point (FP4) quantization for model weights.
  • FP8 Activation Quantization: Uses 8-bit floating-point (FP8) quantization for activations, balancing inference efficiency against numerical precision.
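
To make the two highlights concrete, here is a minimal NumPy sketch of simulated ("fake") W4A8 floating-point quantization. It is an illustration under stated assumptions, not the paper's implementation: the E2M1 layout for FP4, the E4M3 layout for FP8, the per-row absmax weight scale, and the function names `fake_quant_fp4` and `fake_quant_fp8_e4m3` are all choices made here for the example.

```python
import numpy as np

# Assumption: FP4 uses 1 sign, 2 exponent, 1 mantissa bit (E2M1).
# These are the non-negative magnitudes that layout can represent.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_quant_fp4(weights):
    """Round a 2-D weight matrix to FP4 values, one absmax scale per row (assumed scheme)."""
    w = np.asarray(weights, dtype=np.float64)
    absmax = np.max(np.abs(w), axis=1, keepdims=True)
    # Map the largest magnitude in each row to the largest FP4 value (6.0).
    scale = np.where(absmax > 0, absmax / FP4_E2M1_GRID[-1], 1.0)
    scaled = np.abs(w) / scale
    # Pick the nearest representable magnitude for every element.
    idx = np.argmin(np.abs(scaled[..., None] - FP4_E2M1_GRID), axis=-1)
    return np.sign(w) * FP4_E2M1_GRID[idx] * scale


def fake_quant_fp8_e4m3(x):
    """Round values to the FP8 E4M3 grid (assumed layout: 3 mantissa bits, max 448)."""
    x = np.asarray(x, dtype=np.float64)
    mag = np.minimum(np.abs(x), 448.0)  # saturate at the largest E4M3 value
    _, e = np.frexp(mag)                # mag = m * 2**e with m in [0.5, 1)
    exp = np.maximum(e - 1, -6)         # clamp to the smallest normal exponent
    step = 2.0 ** (exp - 3)             # spacing between neighbours with 3 mantissa bits
    return np.sign(x) * np.round(mag / step) * step


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(4, 8))       # toy weight matrix
    activations = rng.normal(size=(2, 8))   # toy activation batch

    reference = activations @ weights.T
    w4a8 = fake_quant_fp8_e4m3(activations) @ fake_quant_fp4(weights).T
    print("mean abs error of simulated W4A8 matmul:", np.mean(np.abs(reference - w4a8)))
```

The sketch keeps everything in float64 and only snaps values onto the low-precision grids, which is enough to inspect quantization error but says nothing about real kernel speed or memory savings.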