EADST

Qwen-7B to LLaMA GPTQ Model Structure

Here is the GPTQ model structure of a Qwen-7B checkpoint converted to the LLaMA layout, detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
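The shapes above are not arbitrary: for 4-bit GPTQ with a group size of 128, every quantized linear layer's tensors follow directly from its `in_features` and `out_features`, since eight 4-bit values are packed into each int32. A minimal sketch (the helper name `gptq_shapes` is ours, not part of any library):

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Derive GPTQ tensor shapes for one quantized linear layer.

    Assumes int32 packing (32 // bits values per int32) and one
    scale/zero-point group per `group_size` input channels.
    """
    pack = 32 // bits                 # 8 four-bit values per int32
    groups = in_features // group_size
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }

# k_proj (4096 -> 4096) reproduces the shapes listed above:
print(gptq_shapes(4096, 4096))   # qweight (512, 4096), qzeros (32, 512), ...
# down_proj (11008 -> 4096):
print(gptq_shapes(11008, 4096))  # qweight (1376, 4096), qzeros (86, 512), ...
```

Note that `qweight` packs along the input dimension while `qzeros` packs along the output dimension, which is why the two tensors divide different axes by 8.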

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
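The listing above uses LLaMA tensor names; the original Qwen-7B checkpoint names differ (`transformer.h.{i}.attn.c_attn` etc.), so the conversion is mostly a renaming pass plus splitting the fused QKV projection. A minimal sketch of the rename table, assuming the original Qwen(-1) naming scheme; which of `mlp.w1`/`mlp.w2` maps to `gate_proj` versus `up_proj` should be verified against the Qwen modeling code before use:

```python
import re

# Top-level tensors (assumed Qwen-1 names on the left).
TOP_LEVEL = {
    "transformer.wte.weight": "model.embed_tokens.weight",
    "transformer.ln_f.weight": "model.norm.weight",
    "lm_head.weight": "lm_head.weight",
}

# Per-layer tensors. attn.c_attn is fused QKV and must additionally be
# split into q_proj/k_proj/v_proj (not shown here).
PER_LAYER = {
    "ln_1.weight": "input_layernorm.weight",
    "ln_2.weight": "post_attention_layernorm.weight",
    "attn.c_proj.weight": "self_attn.o_proj.weight",
    "mlp.w2.weight": "mlp.gate_proj.weight",  # assumed; verify in modeling code
    "mlp.w1.weight": "mlp.up_proj.weight",    # assumed; verify in modeling code
    "mlp.c_proj.weight": "mlp.down_proj.weight",
}

def rename(qwen_name: str) -> str:
    """Map one Qwen-7B tensor name to its LLaMA-layout equivalent."""
    if qwen_name in TOP_LEVEL:
        return TOP_LEVEL[qwen_name]
    m = re.match(r"transformer\.h\.(\d+)\.(.+)", qwen_name)
    if m is None:
        raise KeyError(f"unrecognized tensor name: {qwen_name}")
    layer_idx, rest = m.group(1), m.group(2)
    return f"model.layers.{layer_idx}.{PER_LAYER[rest]}"

print(rename("transformer.h.0.ln_1.weight"))
# -> model.layers.0.input_layernorm.weight
```

For the GPTQ checkpoint the same renaming applies to each of the five quantized tensors per linear (`qweight`, `qzeros`, `scales`, `g_idx`, `bias`) rather than a single `weight`.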