EADST

Qwen-7B to LLaMA GPTQ Model Structure

Below is the structure of the GPTQ-quantized model (Qwen-7B converted to the LLaMA layout), detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
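The shapes above all follow from the GPTQ packing scheme: with 4-bit weights packed eight-per-int32 word and a quantization group size of 128, `qweight` has `in_features / 8` rows, `qzeros` packs one zero-point per group into `out_features / 8` columns, and `scales` holds one FP16 scale per group per output channel. A minimal sketch (assuming 4 bits and group size 128, which is what the listed shapes imply) that reproduces these shapes:

```python
# Sketch (assumption): expected GPTQ tensor shapes for a quantized
# nn.Linear, with 4-bit values packed 8-per-int32 and group size 128.
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    pack = 32 // bits                 # values packed into one int32 word
    groups = in_features // group_size
    return {
        "qweight": (in_features // pack, out_features),  # packed int32 weights
        "qzeros": (groups, out_features // pack),        # packed zero-points
        "scales": (groups, out_features),                # per-group FP16 scales
        "g_idx": (in_features,),                         # group index per input
        "bias": (out_features,),
    }

# Attention projections (4096 -> 4096): qweight (512, 4096), qzeros (32, 512)
print(gptq_shapes(4096, 4096))
# MLP down_proj (11008 -> 4096): qweight (1376, 4096), g_idx (11008,)
print(gptq_shapes(11008, 4096))
# MLP gate_proj / up_proj (4096 -> 11008): qzeros (32, 1376)
print(gptq_shapes(4096, 11008))
```

This matches every projection in the listing, e.g. `down_proj` has 11008 / 128 = 86 groups, hence `qzeros: [86, 512]` and `scales: [86, 4096]`.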

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
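When verifying a converted checkpoint, it is handy to enumerate the full set of tensor names the structure above implies and compare it against the checkpoint's `state_dict`. A sketch, assuming the standard LLaMA module prefixes (`self_attn.*`, `mlp.*`), which the shortened names in the listing appear to follow:

```python
# Sketch (assumption): generate every tensor name implied by the listing
# above, using standard LLaMA-style module prefixes for the projections.
def expected_tensor_names(num_layers=32):
    quant_parts = ["qweight", "qzeros", "scales", "g_idx", "bias"]
    projs = ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj",
             "self_attn.o_proj", "mlp.gate_proj", "mlp.up_proj",
             "mlp.down_proj"]
    names = ["model.embed_tokens.weight"]
    for i in range(num_layers):
        base = f"model.layers.{i}."
        names.append(base + "input_layernorm.weight")
        for p in projs:
            names += [f"{base}{p}.{q}" for q in quant_parts]
        names.append(base + "post_attention_layernorm.weight")
    names += ["model.norm.weight", "lm_head.weight"]
    return names

# 3 top-level tensors + 32 layers x (2 layernorms + 7 projections x 5 tensors)
print(len(expected_tensor_names()))  # 1187
```

Diffing this list against `set(state_dict.keys())` quickly surfaces any tensor that was dropped or renamed during the Qwen-to-LLaMA conversion.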