Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Author: XD / Published: 2023-12-07 00:38 / Updated: 2023-12-07 00:38 / Research Notes / Views: 1149
- Paper: AWQ on arXiv
- Code: AWQ on GitHub
- Organization: MIT
Highlight:
- Optimal alpha scaling: AWQ observes that a small fraction of weight channels is salient, as indicated by large activation magnitudes. Before quantization, it scales each input channel by s = (activation magnitude)^alpha and searches for the alpha that minimizes quantization error, protecting salient channels without keeping any weights in full precision.