EADST

QWEN7B to LLAMA7B Model Structure

This post lists the parameter names and tensor shapes of QWEN7B after conversion to the LLAMA7B layout, layer by layer and component by component.


LLAMA7B Model Structure

The LLAMA7B model consists of the following layers and components:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

The model stacks 32 decoder layers with identical structure.

Layer 0 to Layer 31

Each layer (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer (self_attn):

    • q_proj.weight: torch.Size([4096, 4096])

    • k_proj.weight: torch.Size([4096, 4096])

    • v_proj.weight: torch.Size([4096, 4096])

    • q_proj.bias: torch.Size([4096])

    • k_proj.bias: torch.Size([4096])

    • v_proj.bias: torch.Size([4096])

    • o_proj.weight: torch.Size([4096, 4096])

  • post_attention_layernorm.weight: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer (mlp):

    • up_proj.weight: torch.Size([11008, 4096])

    • gate_proj.weight: torch.Size([11008, 4096])

    • down_proj.weight: torch.Size([4096, 11008])
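The per-layer listing above can be turned into a small shape table for checking a converted checkpoint. The sketch below is only illustrative: the helper names `layer_param_shapes` and `find_mismatches` are hypothetical, and the `self_attn.` and `mlp.` key prefixes assume the Hugging Face LLaMA naming convention.

```python
def layer_param_shapes(layer_idx, hidden=4096, intermediate=11008):
    """Return {parameter name: shape} for one decoder layer.

    Assumes Hugging Face LLaMA-style keys (self_attn.* and mlp.* prefixes).
    """
    prefix = f"model.layers.{layer_idx}."
    shapes = {
        "input_layernorm.weight": (hidden,),
        "post_attention_layernorm.weight": (hidden,),
        "self_attn.o_proj.weight": (hidden, hidden),
    }
    # q/k/v projections carry both a weight and a bias in this listing.
    for name in ("q_proj", "k_proj", "v_proj"):
        shapes[f"self_attn.{name}.weight"] = (hidden, hidden)
        shapes[f"self_attn.{name}.bias"] = (hidden,)
    shapes["mlp.up_proj.weight"] = (intermediate, hidden)
    shapes["mlp.gate_proj.weight"] = (intermediate, hidden)
    shapes["mlp.down_proj.weight"] = (hidden, intermediate)
    return {prefix + name: shape for name, shape in shapes.items()}


def find_mismatches(actual, expected):
    """List names that are missing from `actual` or whose shape differs."""
    return [name for name, shape in expected.items()
            if tuple(actual.get(name, ())) != shape]


expected = layer_param_shapes(0)
print(len(expected))  # number of tensors per layer
print(expected["model.layers.0.mlp.down_proj.weight"])
```

To compare against a real model, `actual` could be built from a loaded checkpoint, for example as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`, assuming `model` is a PyTorch module that is already in memory.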

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])
  • lm_head.weight: torch.Size([151851, 4096])
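As a sanity check, the shapes listed in this post can be summed to estimate the total parameter count. The following is a minimal sketch that uses only the sizes given above; the constant and function names are chosen for illustration.

```python
from math import prod

VOCAB, HIDDEN, INTERMEDIATE, NUM_LAYERS = 151851, 4096, 11008, 32

# Shapes of every tensor inside one decoder layer, as listed in this post.
per_layer_shapes = [
    (HIDDEN,),                                  # input_layernorm.weight
    (HIDDEN, HIDDEN), (HIDDEN, HIDDEN),         # q_proj.weight, k_proj.weight
    (HIDDEN, HIDDEN), (HIDDEN, HIDDEN),         # v_proj.weight, o_proj.weight
    (HIDDEN,), (HIDDEN,), (HIDDEN,),            # q_proj.bias, k_proj.bias, v_proj.bias
    (HIDDEN,),                                  # post_attention_layernorm.weight
    (INTERMEDIATE, HIDDEN),                     # up_proj.weight
    (INTERMEDIATE, HIDDEN),                     # gate_proj.weight
    (HIDDEN, INTERMEDIATE),                     # down_proj.weight
]

# Tensors outside the layer stack: embedding, final norm, output head.
top_level_shapes = [(VOCAB, HIDDEN), (HIDDEN,), (VOCAB, HIDDEN)]


def count_params(shapes):
    """Sum the number of elements over a list of tensor shapes."""
    return sum(prod(shape) for shape in shapes)


per_layer = count_params(per_layer_shapes)
total = NUM_LAYERS * per_layer + count_params(top_level_shapes)

print(f"per layer: {per_layer:,}")
print(f"total:     {total:,}")
```

With these sizes the total comes to roughly 7.72 billion parameters, of which the embedding and output head together account for about 1.24 billion because of the large vocabulary.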