EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Quantizes model weights to a 4-bit floating-point (FP4) format after training.
  • FP8 Activation Quantization: Quantizes activations to an 8-bit floating-point (FP8) format. Together with FP4 weights this gives the W4A8 (4-bit weights, 8-bit activations) setting named in the title, trading some precision for lower memory and compute cost.
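
To make the two highlights concrete, here is a minimal NumPy sketch of simulated ("fake") W4A8 floating-point quantization. It is not the paper's implementation. The bit layouts (E2M1 for FP4, E4M3 for FP8), the per-row absmax scaling, and the helper names `fp_grid` and `fake_quant` are all assumptions made for illustration.

```python
import numpy as np

def fp_grid(exp_bits, man_bits, bias, drop_top_code=False):
    """Enumerate every value a small float format can represent (signed, sorted)."""
    vals = []
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            frac = m / 2 ** man_bits
            if e == 0:
                vals.append(frac * 2.0 ** (1 - bias))        # subnormal range
            else:
                vals.append((1 + frac) * 2.0 ** (e - bias))  # normal range
    if drop_top_code:
        vals = vals[:-1]  # E4M3 reserves its largest code for NaN
    pos = np.array(sorted(set(vals)))
    return np.concatenate([-pos[:0:-1], pos])  # mirror negatives, keep one zero

def fake_quant(x, grid):
    """Scale each row so its absmax hits the grid maximum, then round to nearest grid value."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / grid.max()
    scale = np.where(scale == 0, 1.0, scale)  # avoid dividing by zero on all-zero rows
    idx = np.abs((x / scale)[..., None] - grid).argmin(axis=-1)
    return grid[idx] * scale

# Assumed formats: FP4 as E2M1 (bias 1), FP8 as E4M3 (bias 7).
FP4_E2M1 = fp_grid(2, 1, bias=1)
FP8_E4M3 = fp_grid(4, 3, bias=7, drop_top_code=True)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8))   # toy weight matrix (out_features, in_features)
x = rng.standard_normal((2, 8))   # toy activations (batch, in_features)

w_q = fake_quant(w, FP4_E2M1)     # W4: weights snapped to the FP4 grid
x_q = fake_quant(x, FP8_E4M3)     # A8: activations snapped to the FP8 grid
y = x_q @ w_q.T                   # linear layer computed on quantized values

print("FP4 magnitudes:", FP4_E2M1[FP4_E2M1 >= 0])
print("mean abs weight error:", np.abs(w - w_q).mean())
```

The lookup-table rounding is slow and only suited to small arrays, but it shows how unevenly spaced floating-point levels are: dense near zero and sparse at large magnitudes.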