
QWEN7B to LLAMA7B Model Structure

Here is the LLAMA7B model structure in markdown format, detailing each layer and component. Because the weights originate from QWEN7B, the vocabulary size (151851) and the q/k/v projection biases are inherited from QWEN7B; a stock LLAMA7B uses a 32000-token vocabulary and bias-free attention projections.


LLAMA7B Model Structure

The LLAMA7B model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Each layer in the model has the following components:

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes the components below; a conversion sketch follows the list:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • q_proj.weight: torch.Size([4096, 4096])

    • k_proj.weight: torch.Size([4096, 4096])

    • v_proj.weight: torch.Size([4096, 4096])

    • q_proj.bias: torch.Size([4096])

    • k_proj.bias: torch.Size([4096])

    • v_proj.bias: torch.Size([4096])

    • o_proj.weight: torch.Size([4096, 4096])

  • post_attention_layernorm.weight: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • up_proj.weight: torch.Size([11008, 4096])

    • gate_proj.weight: torch.Size([11008, 4096])

    • down_proj.weight: torch.Size([4096, 11008])
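
QWEN7B stores the attention projections as one fused tensor, which is why the split q/k/v weights above carry biases. The sketch below maps one QWEN7B block into this LLAMA layout; the QWEN-side key names (transformer.h.{i}.attn.c_attn, mlp.w1/w2, and so on) and the w1 → up_proj / w2 → gate_proj correspondence are assumptions to verify against the actual QWEN checkpoint.

```python
import torch

def convert_layer(qwen_sd: dict, llama_sd: dict, i: int) -> None:
    """Map one QWEN7B transformer block into the LLAMA7B layout.

    Assumes QWEN7B-style key names (transformer.h.{i}.*); verify them
    against the actual checkpoint before relying on this sketch.
    """
    src = f"transformer.h.{i}"
    dst = f"model.layers.{i}"

    # Fused QKV: c_attn.weight is [3*4096, 4096]; slice it into q/k/v.
    qkv_w = qwen_sd[f"{src}.attn.c_attn.weight"]
    qkv_b = qwen_sd[f"{src}.attn.c_attn.bias"]
    q_w, k_w, v_w = torch.chunk(qkv_w, 3, dim=0)
    q_b, k_b, v_b = torch.chunk(qkv_b, 3, dim=0)
    for name, w, b in (("q_proj", q_w, q_b),
                       ("k_proj", k_w, k_b),
                       ("v_proj", v_w, v_b)):
        llama_sd[f"{dst}.self_attn.{name}.weight"] = w
        llama_sd[f"{dst}.self_attn.{name}.bias"] = b

    # Output projection and layer norms keep their shapes; only names change.
    llama_sd[f"{dst}.self_attn.o_proj.weight"] = qwen_sd[f"{src}.attn.c_proj.weight"]
    llama_sd[f"{dst}.input_layernorm.weight"] = qwen_sd[f"{src}.ln_1.weight"]
    llama_sd[f"{dst}.post_attention_layernorm.weight"] = qwen_sd[f"{src}.ln_2.weight"]

    # MLP: w1 -> up_proj, w2 -> gate_proj is the assumed correspondence.
    llama_sd[f"{dst}.mlp.up_proj.weight"] = qwen_sd[f"{src}.mlp.w1.weight"]
    llama_sd[f"{dst}.mlp.gate_proj.weight"] = qwen_sd[f"{src}.mlp.w2.weight"]
    llama_sd[f"{dst}.mlp.down_proj.weight"] = qwen_sd[f"{src}.mlp.c_proj.weight"]
```

Under the same naming assumptions, the remaining keys map one-to-one: transformer.wte.weight → model.embed_tokens.weight, transformer.ln_f.weight → model.norm.weight, and lm_head.weight keeps its name.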

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
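
To check that a converted checkpoint matches this listing, load it and print every parameter name with its shape. A minimal sketch, assuming the transformers library is installed and the converted weights live at the hypothetical local path ./llama7b-from-qwen7b:

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical path to the converted checkpoint; replace with your own.
model = AutoModelForCausalLM.from_pretrained(
    "./llama7b-from-qwen7b",
    torch_dtype=torch.float16,
)

# Print every parameter name and shape, mirroring the listing above.
for name, param in model.named_parameters():
    print(f"{name}: {tuple(param.shape)}")
```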