EADST

Transformers Demo for DeepSeek-R1-Distill-Qwen-7B

from transformers import AutoModelForCausalLM, AutoTokenizer

# Local path to the downloaded deepseek-ai/DeepSeek-R1-Distill-Qwen-7B checkpoint
model_name = "/your_deepseek-ai_DeepSeek-R1-Distill-Qwen-7B_path"

# Load the weights; "auto" lets transformers choose the dtype and device placement
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the messages into a single prompt string using the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Drop the prompt tokens so that only the newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Keep special tokens in the decoded text so the raw model output is visible
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]

print("Question: \n", text)
print("Answer: \n", response)
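
Because the response above is decoded with special tokens kept, the printed answer typically includes the model's reasoning followed by a closing `</think>` tag before the final reply. The following is a minimal sketch of separating the two, assuming the reasoning ends with a literal `</think>` marker; `split_reasoning` is a hypothetical helper and not part of transformers, and the sample string is made up for illustration.

```python
def split_reasoning(response, close_tag="</think>", open_tag="<think>"):
    """Split a decoded response into (reasoning, answer).

    Assumption: the reasoning part ends with `close_tag`. The opening tag
    may be absent from the output if the chat template already emitted it
    as part of the prompt.
    """
    head, sep, tail = response.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", response.strip()
    # Remove one leading opening tag, if present, and trim whitespace.
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


# Made-up sample text; in the demo you would pass `response` instead.
sample = "<think>\nThe user wants a short intro.\n</think>\n\nLLMs are neural networks trained on text."
reasoning, answer = split_reasoning(sample)
print("Reasoning:\n", reasoning)
print("Answer:\n", answer)
```

Any end-of-sequence token left in the answer can be removed by decoding with `skip_special_tokens=True` instead.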