
Qwen-7B to LLaMA GPTQ Model Structure

Below is the structure of the GPTQ model obtained by converting Qwen-7B to the LLaMA format, detailing each layer and component:


GPTQ Model Structure

The GPTQ model consists of the following layers and components. All tensor shapes below are consistent with 4-bit quantization (eight 4-bit values packed per int32) and a group size of 128; the shape arithmetic is sketched after the per-layer listing.

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
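
These per-projection shapes follow directly from GPTQ's packing scheme. Here is a minimal sketch of the arithmetic, assuming AutoGPTQ's convention of packing eight 4-bit values per int32 along the input dimension, with a group size of 128 (both values are inferred from the shapes above, not read from a checkpoint config):

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected GPTQ tensor shapes for one quantized linear projection."""
    pack = 32 // bits                   # 4-bit values packed per int32 -> 8
    groups = in_features // group_size  # one (scale, zero) row per group
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }

# k_proj (4096 -> 4096): qweight (512, 4096), qzeros (32, 512), scales (32, 4096)
print(gptq_shapes(4096, 4096))
# down_proj (11008 -> 4096): qweight (1376, 4096), qzeros (86, 512), g_idx (11008,)
print(gptq_shapes(11008, 4096))
```

Running this reproduces every attention and MLP shape in the listing, which is a quick way to sanity-check a converted checkpoint.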

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
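
A listing like the one above can be dumped directly from the quantized checkpoint. Here is a minimal sketch, assuming the weights are stored as a single safetensors file; the file name model.safetensors is an assumption for illustration, not taken from the original conversion:

```python
from safetensors import safe_open

# Iterate over every tensor in the checkpoint and print its name and shape.
# "model.safetensors" is a placeholder path; substitute your checkpoint file.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        print(f"{name}: {f.get_tensor(name).shape}")
```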