Quick Review: SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Author: XD / Published: December 6, 2023, 23:57 / Updated: December 6, 2023, 23:57 / Category: Research Notes
SpQR: A Sparse-Quantized Representation for Near-Lossless Large Language Model Weight Compression
- Paper: SpQR on arXiv
- Code: SpQR on GitHub
- Organization: University of Washington
Core Approach:
- GPTQ without Outliers: SpQR identifies the small fraction of outlier weights that cause the largest quantization errors and keeps them in a higher-precision sparse format, while the remaining weights are quantized to 3-4 bits with a GPTQ-style procedure, yielding near-lossless weight compression for large language models (a minimal sketch follows below).
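The sketch below illustrates only the outlier-isolation idea and is not the authors' implementation. The names (`split_outliers`, `quantize_group`) and parameters (`bits`, `group_size`, `outlier_frac`) are invented for illustration, and plain round-to-nearest error stands in for SpQR's GPTQ-based sensitivity measure and bilevel quantization of group statistics.

```python
# A minimal, hypothetical sketch of the outlier-isolation idea (NumPy, not the
# authors' PyTorch implementation): weights whose round-to-nearest quantization
# error is largest are kept in a sparse full-precision part; the rest stay in
# the low-bit dense part.
import numpy as np

def quantize_group(w, bits=3):
    """Asymmetric round-to-nearest quantization of one weight group, returned dequantized."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (2 ** bits - 1)
    if scale == 0.0:
        return w.copy()
    q = np.round((w - lo) / scale)
    return q * scale + lo

def split_outliers(W, bits=3, group_size=16, outlier_frac=0.01):
    """Return (dense quantized part, sparse outlier part); W.size must be divisible by group_size."""
    W = W.astype(np.float32)
    groups = W.reshape(-1, group_size)
    deq = np.vstack([quantize_group(g, bits) for g in groups]).reshape(W.shape)
    err = np.abs(W - deq)
    # Keep the ~1% of weights with the largest quantization error in full precision.
    k = max(1, int(outlier_frac * W.size))
    thresh = np.partition(err.ravel(), -k)[-k]
    mask = err >= thresh
    dense = np.where(mask, 0.0, deq)       # low-bit dense part, outliers zeroed out
    outliers = np.where(mask, W, 0.0)      # sparse high-precision outlier part
    return dense, outliers

# Usage: the layer weight is reconstructed as dense + outliers.
W = np.random.randn(64, 64).astype(np.float32)
dense, outliers = split_outliers(W)
print("max reconstruction error:", np.abs(W - (dense + outliers)).max())
```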