Qwen-7B to LLaMA: GPTQ Model Structure

Below is the structure of the GPTQ-quantized model, detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layers 0 through 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
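The projection shapes above follow the usual 4-bit GPTQ packing with group size 128: eight 4-bit values are packed into each int32, so `qweight` has `in_features / 8` rows, and there is one `qzeros`/`scales` row per group of 128 input features. A minimal sketch that reproduces the listed shapes (the function name and defaults are illustrative, not from any library):

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected tensor shapes for one GPTQ-packed linear layer.

    With 4-bit weights, 32 // 4 = 8 quantized values fit in one int32,
    so the packing factor along the packed dimension is 32 // bits.
    """
    pack = 32 // bits                  # values packed per int32
    groups = in_features // group_size # one scale/zero row per group
    return {
        "qweight": (in_features // pack, out_features),  # packed int32
        "qzeros":  (groups, out_features // pack),       # packed int32
        "scales":  (groups, out_features),               # fp16
        "g_idx":   (in_features,),                       # group index per input feature
        "bias":    (out_features,),                      # fp16
    }

# q/k/v/o_proj: 4096 -> 4096
print(gptq_shapes(4096, 4096)["qweight"])   # (512, 4096)
# down_proj: 11008 -> 4096
print(gptq_shapes(11008, 4096)["qzeros"])   # (86, 512)
```

This also explains why `down_proj.g_idx` is 11008 while `gate_proj`/`up_proj` have 4096: `g_idx` indexes input features, and `down_proj` takes the 11008-dimensional intermediate activation as input.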

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
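These shapes also allow a back-of-the-envelope estimate of the checkpoint's size. The sketch below assumes int32 storage for `qweight`, `qzeros`, and `g_idx`, and fp16 for `scales`, biases, layernorms, `embed_tokens`, and `lm_head` (assumptions about the serialization, not read from the file):

```python
BITS, GROUP = 4, 128
PACK = 32 // BITS                        # 4-bit values per int32
H, I, V, L = 4096, 11008, 151851, 32     # hidden, intermediate, vocab, layers

def linear_bytes(in_f, out_f):
    """Approximate bytes for one GPTQ linear: int32 packed tensors + fp16 rest."""
    qweight = (in_f // PACK) * out_f * 4
    qzeros  = (in_f // GROUP) * (out_f // PACK) * 4
    scales  = (in_f // GROUP) * out_f * 2
    g_idx   = in_f * 4
    bias    = out_f * 2
    return qweight + qzeros + scales + g_idx + bias

per_layer = (
    4 * linear_bytes(H, H)      # q/k/v/o_proj
    + linear_bytes(I, H)        # down_proj
    + 2 * linear_bytes(H, I)    # gate_proj, up_proj
    + 2 * H * 2                 # the two fp16 layernorms
)
# + embed_tokens, lm_head, and the final model.norm
total = L * per_layer + 2 * V * H * 2 + H * 2
print(f"{total / 2**30:.2f} GiB")       # roughly 5.5 GiB
```

Note that the unquantized fp16 `embed_tokens` and `lm_head` alone account for about 2.3 GiB of that total, since GPTQ packs only the linear projection layers.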