Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

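  • Optimal Alpha Scaling: AWQ scales each input channel of the weights by s = s_X^α before quantization, where s_X is that channel's average activation magnitude, and searches for the α that minimizes the layer's output error after quantization.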
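
The alpha search can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names `pseudo_quantize` and `search_alpha`, the 4-bit per-row round-to-nearest quantizer, the grid of 21 alpha values in [0, 1], and the scale normalization are all assumptions chosen for illustration.

```python
import numpy as np


def pseudo_quantize(w, n_bits=4):
    """Asymmetric round-to-nearest quantization per output row, then dequantize.

    Illustrative quantizer only; real kernels typically quantize per group.
    """
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    q_max = 2 ** n_bits - 1
    step = np.maximum(w_max - w_min, 1e-8) / q_max
    zero = np.round(-w_min / step)
    q = np.clip(np.round(w / step) + zero, 0, q_max)
    return (q - zero) * step


def search_alpha(w, x, n_bits=4, n_grid=20):
    """Grid-search alpha in [0, 1] for scales s = act_mag ** alpha.

    w: (out_features, in_features) weights.
    x: (tokens, in_features) calibration activations.
    Returns the alpha with the lowest output MSE and that MSE.
    """
    # Average activation magnitude per input channel.
    act_mag = np.abs(x).mean(axis=0) + 1e-8
    ref = x @ w.T  # full-precision layer output

    best_alpha, best_err = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, n_grid + 1):
        s = act_mag ** alpha
        # Assumed normalization to keep the scales centered around 1.
        s = s / np.sqrt(s.max() * s.min())
        # Scale weights up, quantize, then fold the inverse scale back in
        # (equivalent to dividing the activations by s at inference time).
        w_q = pseudo_quantize(w * s, n_bits) / s
        err = np.mean((x @ w_q.T - ref) ** 2)
        if err < best_err:
            best_alpha, best_err = float(alpha), float(err)
    return best_alpha, best_err
```

Because alpha = 0 gives s = 1 for every channel, the grid always includes plain round-to-nearest quantization, so the selected alpha can never do worse than that baseline on the calibration data.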