EADST

Qwen-7B to LLaMA: GPTQ Model Structure

Below is the structure of the GPTQ model in markdown form, detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
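All of the packed shapes above follow from 4-bit GPTQ quantization with a group size of 128, which the numbers imply (e.g. 4096 / 128 = 32 groups, and each int32 holds 32 / 4 = 8 quantized values). A minimal sketch of the shape arithmetic, assuming this bit width and group size (the helper name `gptq_shapes` is mine, not from any library):

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Shapes of the packed tensors a GPTQ linear layer stores.

    qweight packs 32 // bits quantized weights per int32 along the
    input dimension; qzeros is packed the same way along the output
    dimension, with one row per quantization group.
    """
    pack = 32 // bits                  # values packed per int32 (8 for 4-bit)
    groups = in_features // group_size # one scale/zero row per group
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }

# q/k/v/o_proj map 4096 -> 4096
print(gptq_shapes(4096, 4096)["qweight"])   # (512, 4096)
# down_proj maps 11008 -> 4096
print(gptq_shapes(11008, 4096)["qzeros"])   # (86, 512)
```

Running this against the listing reproduces every projection's shapes, including the asymmetric gate_proj/up_proj (4096 -> 11008) and down_proj (11008 -> 4096) pairs.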

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
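To make the packing concrete: each int32 in qweight holds eight 4-bit values, and a float weight is recovered per group as roughly scale * (q - zero). The exact zero-point offset varies between GPTQ kernel implementations, so treat this as an illustrative sketch under those assumptions (helper names are mine):

```python
def unpack_int32(word, bits=4):
    """Split one packed int32 into its 32 // bits quantized values,
    least-significant bits first (the common GPTQ packing order)."""
    mask = (1 << bits) - 1
    return [(word >> (bits * i)) & mask for i in range(32 // bits)]

def dequantize(q, scale, zero):
    """Recover an approximate float weight from one quantized value."""
    return scale * (q - zero)

# 0x87654321 unpacks to nibbles 1..8, least-significant first
print(unpack_int32(0x87654321))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The g_idx tensor maps each input row to its quantization group, i.e. it selects which row of scales and qzeros applies when dequantizing that row of qweight.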