EADST

Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

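  • Optimal Alpha Scaling: AWQ scales each weight channel before quantization by a factor derived from the activation magnitudes raised to a power alpha. The core of the method is searching for the alpha value that minimizes the output error after quantization.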
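
The alpha search described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pseudo_quantize` and `search_alpha`, the 4-bit per-row round-to-nearest quantizer, the 21-point grid over [0, 1], and the scale normalization are all choices made here for the example.

```python
import numpy as np

def pseudo_quantize(w, n_bits=4):
    """Round-to-nearest asymmetric quantization per output row, then dequantize.
    (Assumed quantizer for illustration; real kernels typically use groups.)"""
    q_max = 2 ** n_bits - 1
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum(w_max - w_min, 1e-8) / q_max
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale) + zero, 0, q_max)
    return (q - zero) * scale

def search_alpha(w, x, n_bits=4, n_grid=20):
    """Grid-search alpha for a linear layer y = x @ w.T.
    w: (out_features, in_features), x: (tokens, in_features)."""
    ref = x @ w.T                              # full-precision output
    act_mag = np.abs(x).mean(axis=0) + 1e-8    # per-input-channel activation magnitude
    best_alpha, best_err = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, n_grid + 1):
        s = act_mag ** alpha                   # alpha = 0 means no scaling
        s = s / np.sqrt(s.max() * s.min())     # assumed normalization to keep scales centered
        # Scale weights up, quantize, then fold the inverse scale back in.
        # Mathematically this equals dividing the activations by s.
        w_q = pseudo_quantize(w * s, n_bits) / s
        err = np.mean((x @ w_q.T - ref) ** 2)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err

# Usage: a calibration batch where one input channel carries much larger activations.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 16))
x[:, 3] *= 20.0
w = rng.normal(size=(8, 16))
alpha, err = search_alpha(w, x)
print(f"best alpha = {alpha:.2f}, output MSE = {err:.6f}")
```

Because the grid includes alpha = 0 (no scaling), the selected alpha can never do worse on the calibration batch than plain round-to-nearest quantization.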