EADST

Transformers Demo for DeepSeek-R1-Distill-Qwen-7B

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "/your_deepseek-ai_DeepSeek-R1-Distill-Qwen-7B_path"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
# The DeepSeek-R1 usage notes recommend omitting the system prompt and
# putting all instructions in the user turn.
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0] # show special tokens

print("Question: \n", text)
print("Answer: \n", response)
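
Because the response is decoded with special tokens kept, the model's reasoning and its final answer arrive in one string. The following is a minimal sketch of how the two could be separated, assuming the reasoning is closed by a `</think>` tag as R1-style models usually emit. The helper name `split_reasoning` and the sample string are hypothetical and are not part of the original demo.

```python
def split_reasoning(response, open_tag="<think>", close_tag="</think>"):
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning ends with `close_tag`; the opening tag may be
    missing when the chat template already placed it in the prompt.
    """
    head, sep, tail = response.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole response as the answer.
        return "", response.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


# Hypothetical sample string standing in for `response` from the demo above.
sample = "<think>\nThe user wants a short intro.\n</think>\n\nLLMs are neural networks trained on text."
reasoning, answer = split_reasoning(sample)
print("Reasoning: \n", reasoning)
print("Final answer: \n", answer)
```

In the demo this would be called as `split_reasoning(response)`. Any end-of-sequence marker left in the answer can be avoided by decoding with `skip_special_tokens=True`, provided the tokenizer does not also treat the think tags as special tokens.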