
Qwen-7B to LLaMA GPTQ Model Structure

Below is the structure of a Qwen-7B model converted to the LLaMA layout and quantized with GPTQ, detailing each layer and component with its tensor shape:


GPTQ Model Structure

The GPTQ model consists of the following layers and components; a short script for dumping these names and shapes from the checkpoint follows the listing:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes the components below (a sketch deriving the quantized tensor shapes follows this list):

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
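
The repeated shape pattern above follows directly from how GPTQ packs 4-bit weights into int32 tensors. A minimal sketch of the shape arithmetic, assuming 4-bit quantization with a group size of 128 (neither value appears in the listing, but they are the only ones consistent with the shapes shown):

```python
# Sketch: derive the expected GPTQ tensor shapes for one quantized linear layer.
# bits=4 and group_size=128 are assumptions inferred from the listing above.

def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    pack = 32 // bits                    # 4-bit values packed per int32 (8)
    groups = in_features // group_size   # one scale/zero row per input group
    return {
        "qweight": (in_features // pack, out_features),   # packed along the input dim
        "qzeros":  (groups, out_features // pack),        # packed along the output dim
        "scales":  (groups, out_features),
        "g_idx":   (in_features,),                        # group index of each input channel
        "bias":    (out_features,),
    }

print(gptq_shapes(4096, 4096))    # q/k/v/o_proj:  qweight (512, 4096),  qzeros (32, 512)
print(gptq_shapes(4096, 11008))   # gate/up_proj:  qweight (512, 11008), qzeros (32, 1376)
print(gptq_shapes(11008, 4096))   # down_proj:     qweight (1376, 4096), qzeros (86, 512)
```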

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
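
The listing above can be reproduced by iterating over the checkpoint tensors and printing each name with its shape. A minimal sketch, assuming the converted model is stored as a single safetensors file (the directory and file name below are placeholders):

```python
# Sketch: dump every tensor name and shape from a converted GPTQ checkpoint.
from safetensors import safe_open

checkpoint = "qwen7b-llama-gptq/model.safetensors"  # placeholder path

with safe_open(checkpoint, framework="pt") as f:
    for name in sorted(f.keys()):
        tensor = f.get_tensor(name)
        print(f"{name}: {tuple(tensor.shape)}")
```

For a .bin checkpoint, `torch.load(path, map_location="cpu")` returns a plain name-to-tensor dict that can be iterated the same way.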