
QWEN7B to LLAMA GPTQ model structure


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Layer 0 to Layer 31

Each of the 32 layers (model.layers.0 through model.layers.31) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
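The per-projection shapes above follow a regular pattern, which the sketch below tries to make explicit. It assumes 4-bit quantization with a group size of 128 and values packed into 32-bit integers. These settings are not stated in the listing; they are inferred from the shapes (for example 4096 / 8 = 512 and 4096 / 128 = 32). The function name `gptq_shapes` is hypothetical and used only for illustration.

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Return the expected tensor shapes for one GPTQ-quantized linear layer.

    Assumptions (inferred from the listed shapes, not stated in the post):
    4-bit weights, group size 128, values packed into int32 words.
    """
    pack = 32 // bits                   # quantized values stored per int32 word
    groups = in_features // group_size  # quantization groups along the input dim
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }


hidden, intermediate = 4096, 11008

# (in_features, out_features) for one representative projection of each kind
projections = {
    "q_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

for name, (fin, fout) in projections.items():
    print(name, gptq_shapes(fin, fout))
```

Under these assumptions, `down_proj` (11008 inputs, 4096 outputs) gives a `qweight` of (1376, 4096) and `qzeros` of (86, 512), which matches the listing.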

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
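As a rough way to check a converted checkpoint against this structure, the sketch below builds the full list of expected tensor names. The `self_attn` and `mlp` name segments are an assumption based on the Hugging Face LLaMA naming convention; the listing above only calls them the self-attention and MLP sublayers. The helper `expected_keys` is hypothetical.

```python
QUANT_TENSORS = ("qweight", "qzeros", "scales", "g_idx", "bias")
ATTN_PROJS = ("k_proj", "o_proj", "q_proj", "v_proj")
MLP_PROJS = ("down_proj", "gate_proj", "up_proj")


def expected_keys(num_layers=32):
    """Build the tensor names implied by the structure listed above.

    Assumption: sublayers are named `self_attn` and `mlp`, as in the
    Hugging Face LLaMA layout.
    """
    keys = ["model.embed_tokens.weight"]
    for i in range(num_layers):
        prefix = f"model.layers.{i}"
        keys.append(f"{prefix}.input_layernorm.weight")
        for block, projs in (("self_attn", ATTN_PROJS), ("mlp", MLP_PROJS)):
            for proj in projs:
                # each quantized projection carries five tensors
                keys += [f"{prefix}.{block}.{proj}.{t}" for t in QUANT_TENSORS]
        keys.append(f"{prefix}.post_attention_layernorm.weight")
    keys += ["model.norm.weight", "lm_head.weight"]
    return keys


keys = expected_keys()
print(len(keys))  # 32 layers * 37 tensors per layer + 3 top-level tensors
```

Comparing `set(keys)` with the key set of a loaded state dict would show any missing or unexpected tensors, provided the checkpoint follows the same naming.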