EADST

Quick Review: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Highlight:

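  • Optimal Alpha Scaling: AWQ searches for the exponent alpha that sets the per-channel scaling factors applied to the weights before quantization, so that weight channels paired with large activations lose less precision.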
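
To make the alpha search concrete, here is a minimal NumPy sketch, not the official AWQ implementation. It assumes the per-channel scale is the mean absolute activation raised to the power alpha, that alpha is picked by a grid search over [0, 1], and that the quantizer is a simple symmetric per-row round-to-nearest scheme (a simplification; the names `quantize_weights` and `search_alpha` are hypothetical).

```python
import numpy as np


def quantize_weights(w, n_bits=4):
    """Simulated symmetric round-to-nearest quantization, one step size per output row."""
    q_max = 2 ** (n_bits - 1) - 1
    step = np.abs(w).max(axis=1, keepdims=True) / q_max
    step = np.maximum(step, 1e-8)  # avoid division by zero for all-zero rows
    return np.clip(np.round(w / step), -q_max - 1, q_max) * step


def search_alpha(w, x, n_bits=4, n_grid=20):
    """Grid-search alpha so that scaling weights by act_mag**alpha minimizes output error.

    w: (out_features, in_features) weight matrix
    x: (n_tokens, in_features) calibration activations
    """
    reference = x @ w.T                        # full-precision layer output
    act_mag = np.abs(x).mean(axis=0) + 1e-8    # per-input-channel activation magnitude
    best_alpha, best_err = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, n_grid + 1):
        s = act_mag ** alpha                   # alpha = 0 means no scaling at all
        w_q = quantize_weights(w * s, n_bits)  # scale weight columns up, then quantize
        out = (x / s) @ w_q.T                  # inverse scale is applied to the activations
        err = np.mean((out - reference) ** 2)
        if err < best_err:
            best_alpha, best_err = float(alpha), float(err)
    return best_alpha, best_err


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(8, 16))
    x = rng.normal(size=(32, 16))
    x[:, :2] *= 20.0  # make two input channels "salient"
    alpha, err = search_alpha(w, x)
    print(f"best alpha = {alpha:.2f}, output MSE = {err:.6f}")
```

Because alpha = 0 is part of the grid and corresponds to plain round-to-nearest quantization, the selected alpha can never do worse than the unscaled baseline on the calibration data.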