EADST

QWEN7B to LLAMA GPTQ model structure

GPTQ Model Structure

The GPTQ model consists of the following layers and components, each listed with its tensor shape:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
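
The per-layer shapes above follow a regular pattern, which the sketch below reproduces from each projection's input and output sizes. It assumes 4-bit quantization with a group size of 128, and the common GPTQ packing in which quantized values are stored in 32-bit integers. Neither setting is stated in the listing; both are inferred from the shapes (4096 / 512 = 8 values per integer, 4096 / 32 = 128 rows per group). The helper name `gptq_shapes` is hypothetical.

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected tensor shapes for one GPTQ-quantized linear layer.

    Assumes int32 packing, so each packed integer holds 32 // bits values.
    bits=4 and group_size=128 are inferred from the listing, not stated in it.
    """
    pack = 32 // bits                    # quantized values per int32
    groups = in_features // group_size   # one scale/zero row per group
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }


HIDDEN, INTERMEDIATE = 4096, 11008

attn = gptq_shapes(HIDDEN, HIDDEN)            # q_proj, k_proj, v_proj, o_proj
gate_up = gptq_shapes(HIDDEN, INTERMEDIATE)   # gate_proj, up_proj
down = gptq_shapes(INTERMEDIATE, HIDDEN)      # down_proj

for name, shapes in [("attention", attn), ("gate/up", gate_up), ("down", down)]:
    print(name)
    for key, shape in shapes.items():
        print(f"  {key}: {list(shape)}")
```

Note that `g_idx` and the first dimension of `qweight` depend on the input size, which is why `down_proj` (11008 inputs) differs from the other projections, while `bias` and the second dimension of `scales` depend on the output size.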

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
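
As a rough cross-check, the full set of tensor names implied by this structure can be enumerated in plain Python. This is a sketch under one assumption: the submodule names `self_attn` and `mlp` follow the Hugging Face LLaMA naming convention and do not appear in the listing above. The function name `expected_keys` is hypothetical.

```python
PROJ_TENSORS = ("qweight", "qzeros", "scales", "g_idx", "bias")
ATTN_PROJS = ("q_proj", "k_proj", "v_proj", "o_proj")
MLP_PROJS = ("gate_proj", "up_proj", "down_proj")
NUM_LAYERS = 32  # layers 0 to 31


def expected_keys(num_layers=NUM_LAYERS):
    """Build the list of tensor names implied by the structure above.

    'self_attn' and 'mlp' are assumed submodule names (LLaMA convention).
    """
    keys = ["model.embed_tokens.weight"]
    for i in range(num_layers):
        prefix = f"model.layers.{i}"
        keys.append(f"{prefix}.input_layernorm.weight")
        for proj in ATTN_PROJS:
            keys += [f"{prefix}.self_attn.{proj}.{t}" for t in PROJ_TENSORS]
        for proj in MLP_PROJS:
            keys += [f"{prefix}.mlp.{proj}.{t}" for t in PROJ_TENSORS]
        keys.append(f"{prefix}.post_attention_layernorm.weight")
    keys += ["model.norm.weight", "lm_head.weight"]
    return keys


keys = expected_keys()
print(len(keys))   # total number of tensors
print(keys[1:4])   # first few names of layer 0
```

Comparing this list with the key set of a loaded checkpoint is one way to spot missing or unexpected tensors after a conversion.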