EADST

Qwen-7B to LLaMA GPTQ Model Structure

Below is the structure of the GPTQ-quantized model (Qwen-7B converted to the LLaMA layout), detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])
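The attention projection shapes above are exactly what 4-bit GPTQ packing with group size 128 predicts; both values are inferred from the listing itself (4096 / 8 = 512 packed int32 rows, 4096 / 128 = 32 groups). A minimal sketch of the shape arithmetic, assuming this standard packing:

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected GPTQ tensor shapes for one quantized linear layer.

    qweight packs 32 // bits quantized values per int32 along the input
    dim; qzeros packs the same way along the output dim, one row per
    quantization group; scales holds one fp16 scale per group per output.
    """
    pack = 32 // bits              # 8 values per int32 at 4 bits
    groups = in_features // group_size
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),   # maps each input row to its group
        "bias": (out_features,),
    }

# All four attention projections here are 4096 -> 4096:
print(gptq_shapes(4096, 4096))
```

The same function reproduces the MLP shapes below (e.g. `gptq_shapes(11008, 4096)` for `down_proj`), which is a quick sanity check when converting checkpoints.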

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])
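The MLP shapes confirm the same packing along the larger dimension (11008 / 8 = 1376 packed rows, 11008 / 128 = 86 groups for `down_proj`). For reference, a sketch of how such tensors dequantize back to a dense weight. The `+ 1` zero-point offset is an assumption that matches older GPTQ/AutoGPTQ checkpoints, where `qzeros` is stored offset by one; drop it if your checkpoint does not use that convention:

```python
import numpy as np

def unpack_rows(packed, bits=4):
    """Unpack each int32 row into 32 // bits quantized rows (low bits first)."""
    pack, mask = 32 // bits, (1 << bits) - 1
    out = np.empty((packed.shape[0] * pack, packed.shape[1]), dtype=np.int64)
    for k in range(pack):
        out[k::pack] = (packed >> (bits * k)) & mask
    return out

def unpack_cols(packed, bits=4):
    """qzeros packs along the output (column) dim instead."""
    return unpack_rows(packed.T, bits).T

def dequantize(qweight, qzeros, scales, g_idx, bits=4):
    """Reconstruct the dense weight: W = scale[g] * (q - zero[g]).

    Assumption: the checkpoint stores qzeros with the common -1 offset,
    so we add 1 back when unpacking.
    """
    w = unpack_rows(qweight, bits)       # (in_features, out_features)
    z = unpack_cols(qzeros, bits) + 1    # (n_groups, out_features)
    return scales[g_idx] * (w - z[g_idx])  # g_idx broadcasts per input row
```

Real checkpoints store these as `torch.int32`; the sketch uses int64 NumPy arrays to sidestep signed-shift pitfalls.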

  • post_attention_layernorm.weight: torch.Size([4096])

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
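Note that `embed_tokens` and `lm_head` share a shape but are stored separately (untied), so the two fp16 embedding matrices account for a large fraction of the checkpoint. A back-of-the-envelope size estimate from the shapes above, assuming fp16 scales/biases/norms and int32 packed tensors (an illustration derived from the listing, not a measurement of the actual file):

```python
def linear_bytes(inp, out, bits=4, group=128):
    """Approximate on-disk bytes for one GPTQ linear layer:
    int32 qweight/qzeros/g_idx, fp16 scales/bias."""
    pack = 32 // bits
    groups = inp // group
    return ((inp // pack) * out * 4    # qweight
            + groups * (out // pack) * 4  # qzeros
            + groups * out * 2            # scales
            + inp * 4                     # g_idx
            + out * 2)                    # bias

hidden, inter, n_layers, vocab = 4096, 11008, 32, 151851
per_layer = (4 * linear_bytes(hidden, hidden)     # q/k/v/o projections
             + 2 * linear_bytes(hidden, inter)    # gate_proj, up_proj
             + linear_bytes(inter, hidden)        # down_proj
             + 2 * hidden * 2)                    # two layernorms (fp16)
total = (n_layers * per_layer
         + 2 * vocab * hidden * 2   # embed_tokens + lm_head (untied, fp16)
         + hidden * 2)              # final model.norm
print(f"approx {total / 2**30:.2f} GiB")
```

The embedding and output head alone contribute about 2.3 GiB of that total, which is why untied vocab matrices matter so much for a quantized 7B checkpoint.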