
# Qwen-7B to LLaMA GPTQ Model Structure

Below is the structure of the GPTQ-quantized model (Qwen-7B converted to the LLaMA layout), detailing each layer and component:


## GPTQ Model Structure

The GPTQ model consists of the following layers and components:

### Embedding Layer

- model.embed_tokens.weight: torch.Size([151851, 4096])
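
A shape listing like this can be dumped straight from the checkpoint without loading the weights. A minimal sketch, assuming the quantized model is stored as a single safetensors file (the filename below is a placeholder; a sharded checkpoint would need a loop over the shard files):

```python
from safetensors import safe_open

# Print every tensor name and shape; get_slice reads only metadata,
# so the packed weights themselves never leave disk.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for name in sorted(f.keys()):
        print(name, f.get_slice(name).get_shape())
```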

### Layers

Each of the 32 layers (`model.layers.[0-31]`) includes the following components:

- input_layernorm.weight: torch.Size([4096])
- Self-Attention Sublayer:
  - q_proj, k_proj, v_proj, o_proj (identical shapes, each):
    - qweight: torch.Size([512, 4096])
    - qzeros: torch.Size([32, 512])
    - scales: torch.Size([32, 4096])
    - g_idx: torch.Size([4096])
    - bias: torch.Size([4096])
- MLP (Multi-Layer Perceptron) Sublayer:
  - down_proj:
    - qweight: torch.Size([1376, 4096])
    - qzeros: torch.Size([86, 512])
    - scales: torch.Size([86, 4096])
    - g_idx: torch.Size([11008])
    - bias: torch.Size([4096])
  - gate_proj, up_proj (identical shapes, each):
    - qweight: torch.Size([512, 11008])
    - qzeros: torch.Size([32, 1376])
    - scales: torch.Size([32, 11008])
    - g_idx: torch.Size([4096])
    - bias: torch.Size([11008])
- post_attention_layernorm.weight: torch.Size([4096])
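
These shapes follow directly from GPTQ's packing scheme: with 4-bit quantization, eight values are packed into each int32, and with a group size of 128, every 128 input channels share one scale and one zero point, with g_idx recording which group each input channel belongs to. A small helper (the function name is my own) that reproduces every shape above from a projection's in/out features:

```python
# Derive the packed GPTQ tensor shapes from a linear layer's dimensions,
# assuming 4-bit quantization, group size 128, and int32 packing.
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    pack = 32 // bits  # 8 four-bit values per int32
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (in_features // group_size, out_features // pack),
        "scales": (in_features // group_size, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }

print(gptq_shapes(4096, 4096))    # q/k/v/o_proj: qweight (512, 4096), ...
print(gptq_shapes(11008, 4096))   # down_proj: qweight (1376, 4096), ...
print(gptq_shapes(4096, 11008))   # gate_proj / up_proj: qweight (512, 11008), ...
```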

### Final Layer Normalization and Output

- model.norm.weight: torch.Size([4096])
- lm_head.weight: torch.Size([151851, 4096])
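
To make the roles of qweight, qzeros, scales, and g_idx concrete, the sketch below dequantizes one projection back to full precision. It is a minimal sketch assuming the common AutoGPTQ-style layout (eight consecutive rows packed into the low-to-high nibbles of each int32, no activation reordering); note that some checkpoint formats store zero points offset by one (qzeros holds zero − 1), an adjustment omitted here.

```python
import torch

def dequantize(qweight, qzeros, scales, g_idx, bits=4):
    """Reconstruct a full-precision weight from packed GPTQ tensors.

    Returns an [in_features, out_features] matrix, i.e. the transpose
    of the corresponding nn.Linear.weight.
    """
    mask = (1 << bits) - 1
    shifts = torch.arange(0, 32, bits, device=qweight.device)  # [0, 4, ..., 28]
    # Unpack qweight [in/8, out] -> w [in, out]: each int32 holds
    # eight consecutive input rows in its nibbles.
    w = (qweight.unsqueeze(1) >> shifts[None, :, None]) & mask
    w = w.reshape(-1, qweight.shape[1])
    # Unpack qzeros [in/G, out/8] -> z [in/G, out] along the output dim.
    z = (qzeros.unsqueeze(2) >> shifts[None, None, :]) & mask
    z = z.reshape(qzeros.shape[0], -1)
    # g_idx maps each input channel to its 128-wide quantization group.
    return scales[g_idx] * (w - z[g_idx]).to(scales.dtype)
```

Real inference kernels never materialize this full matrix; they fuse the unpacking into the matmul so the weights stay packed in memory, which is where the memory savings come from.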