Quick Review: Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Author: XD / Published: December 6, 2023, 23:51 / Category: Research & Learning / Views: 1057
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of Large Language Models
Paper: https://arxiv.org/abs/2309.05516
Code: https://github.com/intel/neural-compressor
Organization: Intel
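
For context, the core idea the title refers to is learning a small per-weight rounding offset (clamped to one rounding interval, [-0.5, 0.5]) and updating it with the *sign* of the gradient of a layer-output reconstruction loss. Below is a minimal, hypothetical PyTorch sketch of that idea under my own assumptions; the function names, max-abs scale choice, and hyperparameters are illustrative and are not the implementation in intel/neural-compressor.

```python
import torch

def quantize_with_offset(w, scale, v, n_bits=4):
    # Round-to-nearest with a learnable per-weight offset v in [-0.5, 0.5].
    qmax = 2 ** (n_bits - 1) - 1
    qmin = -(2 ** (n_bits - 1))
    raw = w / scale + v
    q = torch.clamp(torch.round(raw), qmin, qmax)
    # Straight-through estimator: treat rounding as identity in the backward pass.
    q = (q - raw).detach() + raw
    return q * scale

def tune_rounding(w, x, steps=200, lr=5e-3, n_bits=4):
    # Simple per-output-channel scale from the max-abs weight range (an
    # illustrative choice, not necessarily the paper's).
    scale = w.abs().amax(dim=1, keepdim=True) / (2 ** (n_bits - 1) - 1)
    v = torch.zeros_like(w, requires_grad=True)
    y_ref = x @ w.t()  # full-precision layer output as reconstruction target
    for _ in range(steps):
        wq = quantize_with_offset(w, scale, v, n_bits)
        loss = ((x @ wq.t()) - y_ref).pow(2).mean()
        loss.backward()
        with torch.no_grad():
            v -= lr * v.grad.sign()  # signed gradient descent step
            v.clamp_(-0.5, 0.5)      # keep offsets within one rounding interval
            v.grad = None
    return quantize_with_offset(w, scale, v.detach(), n_bits)

# Toy usage: tune the rounding of a random linear layer against random activations.
w = torch.randn(64, 128)
x = torch.randn(32, 128)
wq = tune_rounding(w, x)
print((w - wq).abs().mean())
```

Using only the gradient's sign keeps every update the same magnitude, which is a natural fit here: each offset only needs to move within a bounded [-0.5, 0.5] window, so a fixed step size with clamping suffices.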