EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Implements 4-bit floating-point (FP4) quantization for model weights.
  • FP8 Activation Quantization: Uses 8-bit floating-point (FP8) quantization for activations, balancing inference efficiency against numerical precision.
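
To make the two highlights concrete, here is a minimal NumPy sketch of simulated ("fake") W4A8 floating-point quantization. It is an illustration under stated assumptions, not the paper's implementation: the E2M1 layout for FP4, the E4M3 layout for FP8, per-row weight scaling, per-tensor activation scaling, and the function names `fake_quant_fp4` and `fake_quant_fp8_e4m3` are all assumptions chosen for this example.

```python
import numpy as np

# Non-negative magnitudes representable in FP4, assuming the E2M1 layout
# (1 sign bit, 2 exponent bits, 1 mantissa bit). This layout is an assumption.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_quant_fp4(w):
    """Simulate FP4 weight quantization with one scale per row (assumed scheme)."""
    w = np.asarray(w, dtype=np.float64)
    absmax = np.abs(w).max(axis=1, keepdims=True)
    # Map each row's largest magnitude onto the largest FP4 value (6.0).
    scale = np.where(absmax > 0, absmax / FP4_E2M1_GRID[-1], 1.0)
    scaled = np.abs(w) / scale
    # Snap every scaled magnitude to the nearest representable FP4 value.
    idx = np.abs(scaled[..., None] - FP4_E2M1_GRID).argmin(axis=-1)
    return np.sign(w) * FP4_E2M1_GRID[idx] * scale


def fake_quant_fp8_e4m3(x):
    """Simulate FP8 activation quantization, assuming E4M3 and one scale per tensor."""
    x = np.asarray(x, dtype=np.float64)
    absmax = np.abs(x).max()
    scale = absmax / 448.0 if absmax > 0 else 1.0  # 448 is the largest E4M3 value
    y = x / scale
    # frexp gives |y| = m * 2**e with m in [0.5, 1), so floor(log2|y|) = e - 1.
    exp = np.frexp(np.abs(y))[1] - 1
    exp = np.maximum(exp, -6)        # below 2**-6 the format is subnormal
    step = np.ldexp(1.0, exp - 3)    # 3 mantissa bits: 8 steps per power of two
    q = np.clip(np.round(y / step) * step, -448.0, 448.0)
    return q * scale


# Simulated W4A8 linear layer: FP8 activations times FP4 weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))   # hypothetical weight matrix
x = rng.normal(size=(2, 16))   # hypothetical activations
y_ref = x @ W.T
y_w4a8 = fake_quant_fp8_e4m3(x) @ fake_quant_fp4(W).T
rel_err = np.linalg.norm(y_w4a8 - y_ref) / np.linalg.norm(y_ref)
```

Both functions return ordinary floating-point arrays whose values lie on the low-precision grid, so `rel_err` gives a rough measure of how much the W4A8 rounding perturbs the layer output. A real deployment would store the packed 4-bit and 8-bit codes and rely on hardware or kernel support instead of this simulation.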