EADST

Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

  • Optimal Alpha Scaling: AWQ searches for the alpha value that determines how strongly the weights are scaled before quantization.
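
The alpha search can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation. It assumes the per-input-channel scale is the mean absolute activation raised to the power alpha, that alpha is picked from a uniform grid on [0, 1], and that the best alpha is the one with the smallest mean squared error between the full-precision and quantized layer outputs. The function names, the 4-bit group-wise round-to-nearest quantizer, the grid size, and the scale normalization are all hypothetical choices made for this example.

```python
import numpy as np


def pseudo_quantize(w, n_bits=4, group_size=32):
    """Asymmetric round-to-nearest quantization per group, then dequantize.

    Assumes w.shape[1] is divisible by group_size.
    """
    out_features, in_features = w.shape
    g = w.reshape(-1, group_size)
    w_max = g.max(axis=1, keepdims=True)
    w_min = g.min(axis=1, keepdims=True)
    q_max = 2 ** n_bits - 1
    scale = np.maximum(w_max - w_min, 1e-8) / q_max
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(g / scale) + zero, 0, q_max)
    return ((q - zero) * scale).reshape(out_features, in_features)


def scaled_quant_error(w, x, alpha, n_bits=4, group_size=32):
    """Output MSE when weights are scaled by s = mean(|x|)**alpha before quantization."""
    act_scale = np.abs(x).mean(axis=0)           # one value per input channel
    s = np.maximum(act_scale ** alpha, 1e-4)     # avoid division by zero
    s = s / np.sqrt(s.max() * s.min())           # illustrative normalization (assumption)
    # Quantize the scaled weights, then fold the inverse scale back in.
    # Mathematically this equals applying 1/s to the activations instead.
    w_q = pseudo_quantize(w * s, n_bits, group_size) / s
    return float(np.mean((x @ w.T - x @ w_q.T) ** 2))


def search_alpha(w, x, n_grid=20, n_bits=4, group_size=32):
    """Grid-search alpha in [0, 1]; alpha = 0 is plain round-to-nearest."""
    best_alpha, best_err = 0.0, np.inf
    for i in range(n_grid + 1):
        alpha = i / n_grid
        err = scaled_quant_error(w, x, alpha, n_bits, group_size)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err


# Toy usage: a few input channels carry much larger activations than the rest.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 64))
x[:, :4] *= 20.0
w = rng.normal(size=(16, 64))
alpha, err = search_alpha(w, x)
print(f"best alpha = {alpha:.2f}, output MSE = {err:.6f}")
```

Because alpha = 0 is included in the grid, the selected alpha can never give a larger output error than unscaled round-to-nearest quantization on the calibration data used for the search.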