EADST

Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

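  • Optimal Alpha Scaling: AWQ scales each input channel of a weight matrix by its average activation magnitude raised to a power alpha before quantization. The method searches for the alpha that minimizes the quantized layer's output error.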
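
The alpha search in the highlight can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's reference implementation: the function names `fake_quantize` and `search_alpha`, the 4-bit group-wise round-to-nearest quantizer, the group size of 32, the 20-point grid, and the scale normalization are all assumptions chosen for the example.

```python
import numpy as np


def fake_quantize(w, n_bits=4, group_size=32):
    """Asymmetric round-to-nearest quantization per group, returned as floats.

    Assumes the number of input features is divisible by group_size.
    """
    out_features, in_features = w.shape
    g = w.reshape(-1, group_size)
    w_min = g.min(axis=1, keepdims=True)
    w_max = g.max(axis=1, keepdims=True)
    q_max = 2 ** n_bits - 1
    scale = np.maximum(w_max - w_min, 1e-8) / q_max
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(g / scale) + zero, 0, q_max)
    return ((q - zero) * scale).reshape(out_features, in_features)


def search_alpha(w, x, n_grid=20, n_bits=4, group_size=32):
    """Grid-search alpha in [0, 1) for per-channel scales s = mean(|x|) ** alpha.

    w: weights of shape (out_features, in_features)
    x: calibration activations of shape (n_samples, in_features)
    Returns the alpha with the lowest mean squared output error, and that error.
    """
    act_mag = np.abs(x).mean(axis=0)  # one magnitude per input channel
    reference = x @ w.T               # full-precision layer output
    best_alpha, best_err = 0.0, np.inf
    for i in range(n_grid):
        alpha = i / n_grid
        s = np.maximum(act_mag ** alpha, 1e-4)
        s = s / np.sqrt(s.max() * s.min())  # keep scales centred around 1
        # Scale weights up, quantize, then fold the inverse scale back in.
        w_q = fake_quantize(w * s, n_bits, group_size) / s
        err = np.mean((x @ w_q.T - reference) ** 2)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(16, 64))
    x = rng.normal(size=(128, 64))
    x[:, :4] *= 20.0  # a few channels with large activations
    alpha, err = search_alpha(w, x)
    print(f"best alpha = {alpha:.2f}, output MSE = {err:.6f}")
```

Because alpha = 0 (no scaling) is part of the grid, the selected alpha can never do worse on the calibration data than plain round-to-nearest quantization under the same quantizer.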