EADST

Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

  • Optimal Alpha Scaling: AWQ searches for the optimal alpha, the exponent that controls how strongly each weight channel is scaled (according to its activation magnitude) prior to quantization.
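
To make the alpha search concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a per-input-channel scale of the form s = mean(|x|)^alpha, a simple group-wise round-to-nearest quantizer, and a grid search that keeps the alpha with the lowest output error. The function names (`quantize_rtn`, `search_alpha`), the grid size, the group size, and the scale normalization are all illustrative assumptions.

```python
import numpy as np


def quantize_rtn(w, n_bits=4, group_size=32):
    """Group-wise asymmetric round-to-nearest; returns dequantized weights.

    Hypothetical helper for illustration. Assumes in_features is a
    multiple of group_size.
    """
    out_f, in_f = w.shape
    g = w.reshape(out_f, in_f // group_size, group_size)
    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    q_max = 2 ** n_bits - 1
    step = np.maximum(w_max - w_min, 1e-8) / q_max
    q = np.clip(np.round((g - w_min) / step), 0, q_max)
    return (q * step + w_min).reshape(out_f, in_f)


def search_alpha(w, x, n_grid=20, n_bits=4, group_size=32):
    """Grid-search alpha in [0, 1] for the scale s = mean(|x|) ** alpha.

    w: (out_features, in_features) weights of a linear layer.
    x: (n_samples, in_features) calibration activations.
    Returns (best_alpha, best_error, errors_for_every_alpha).
    """
    s_x = np.maximum(np.abs(x).mean(axis=0), 1e-8)  # per-input-channel magnitude
    ref = x @ w.T                                   # full-precision output
    best_alpha, best_err, errors = 0.0, np.inf, []
    for alpha in np.linspace(0.0, 1.0, n_grid + 1):
        s = s_x ** alpha
        s = s / np.sqrt(s.max() * s.min())          # assumed normalization of the scales
        # Scale weights up, quantize, then fold the inverse scale back in.
        w_q = quantize_rtn(w * s, n_bits, group_size) / s
        err = float(np.mean((ref - x @ w_q.T) ** 2))
        errors.append(err)
        if err < best_err:
            best_alpha, best_err = float(alpha), err
    return best_alpha, best_err, errors


# Synthetic data: a few input channels carry much larger activations.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 128))
x[:, :4] *= 20.0
w = rng.normal(size=(64, 128))
alpha, err, errors = search_alpha(w, x)
print(f"best alpha = {alpha:.2f}, error = {err:.6f}, alpha=0 error = {errors[0]:.6f}")
```

With alpha = 0 every scale is 1, so the first entry of `errors` is the plain round-to-nearest baseline; the search can never do worse than that baseline because it is one of the candidates.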