EADST

QWEN7B to LLAMA GPTQ model structure

This note lists the tensors in the GPTQ checkpoint of Qwen-7B in the LLaMA layout, with the shape of each tensor.


GPTQ Model Structure

The GPTQ model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Layer 0 to Layer 31

All 32 layers (model.layers.[0-31]) share the same components:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
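The per-projection shapes above follow a regular pattern. The sketch below reproduces them from the input and output dimensions of each linear layer. It assumes 4-bit quantization (8 values packed into each int32) and a group size of 128. Both values are inferred from the listed shapes (4096 / 512 = 8 and 4096 / 32 = 128), not stated in the checkpoint listing, and the helper name `gptq_shapes` is hypothetical.

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected tensor shapes for one GPTQ-quantized linear layer.

    bits=4 and group_size=128 are assumptions inferred from the shapes
    listed in this post, not values read from a config file.
    """
    pack = 32 // bits                    # quantized values stored per int32
    groups = in_features // group_size   # quantization groups along the input dim
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }


hidden, intermediate = 4096, 11008

# q_proj / k_proj / v_proj / o_proj: hidden -> hidden
print(gptq_shapes(hidden, hidden))
# gate_proj / up_proj: hidden -> intermediate
print(gptq_shapes(hidden, intermediate))
# down_proj: intermediate -> hidden
print(gptq_shapes(intermediate, hidden))
```

Under these assumptions, down_proj is the only projection whose first dimensions differ (1376 and 86 instead of 512 and 32), because its input dimension is 11008 rather than 4096.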

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
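
To check a converted checkpoint against this listing, it can help to build the full name-to-shape table in code. The following is a minimal sketch, not the conversion script itself. It assumes the usual LLaMA key prefixes `self_attn` and `mlp` (the listing above only names the sublayers), 4-bit packing, and a group size of 128. The helper names `gptq_linear` and `expected_tensors` are hypothetical.

```python
def gptq_linear(prefix, in_features, out_features, bits=4, group_size=128):
    """Name -> shape for one GPTQ linear layer (assumed 4-bit, group size 128)."""
    pack = 32 // bits                    # quantized values stored per int32
    groups = in_features // group_size   # quantization groups along the input dim
    return {
        f"{prefix}.qweight": (in_features // pack, out_features),
        f"{prefix}.qzeros": (groups, out_features // pack),
        f"{prefix}.scales": (groups, out_features),
        f"{prefix}.g_idx": (in_features,),
        f"{prefix}.bias": (out_features,),
    }


def expected_tensors(num_layers=32, hidden=4096, intermediate=11008, vocab=151851):
    """Build the name -> shape table described in this post."""
    tensors = {"model.embed_tokens.weight": (vocab, hidden)}
    for i in range(num_layers):
        layer = f"model.layers.{i}"
        tensors[f"{layer}.input_layernorm.weight"] = (hidden,)
        tensors[f"{layer}.post_attention_layernorm.weight"] = (hidden,)
        # "self_attn" and "mlp" are the usual LLaMA prefixes, assumed here.
        for name in ("q_proj", "k_proj", "v_proj", "o_proj"):
            tensors.update(gptq_linear(f"{layer}.self_attn.{name}", hidden, hidden))
        for name in ("gate_proj", "up_proj"):
            tensors.update(gptq_linear(f"{layer}.mlp.{name}", hidden, intermediate))
        tensors.update(gptq_linear(f"{layer}.mlp.down_proj", intermediate, hidden))
    tensors["model.norm.weight"] = (hidden,)
    tensors["lm_head.weight"] = (vocab, hidden)
    return tensors


manifest = expected_tensors()
print(len(manifest))  # number of tensors in the table
print(manifest["model.layers.0.mlp.down_proj.qzeros"])
```

Comparing this table with the names and shapes in a loaded state dict would flag any missing or mis-shaped tensor after conversion.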