EADST

Transformers Demo for DeepSeek-R1-Distill-Qwen-7B

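from transformers import AutoModelForCausalLM, AutoTokenizer

# Local directory (or Hugging Face Hub ID) that holds the model weights.
model_name = "/your_deepseek-ai_DeepSeek-R1-Distill-Qwen-7B_path"

# Load the model; "auto" picks the checkpoint's dtype and places layers on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
# Note: the DeepSeek-R1 model card recommends putting all instructions in the user prompt
# rather than a system prompt; the system message is kept here as in the original demo.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the chat messages into one prompt string that ends with the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# skip_special_tokens=False keeps special tokens (such as the end-of-sentence marker) in the decoded text.
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]

print("Question: \n", text)
print("Answer: \n", response)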
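
The decoded `response` from an R1-style model usually contains the chain of thought followed by a closing `</think>` tag and then the final answer. The helper below is a minimal sketch of how the two parts could be separated. `split_reasoning` is a hypothetical name, not part of transformers, and the tag strings and the optional `eos_token` argument are assumptions you should check against your tokenizer. Depending on the chat template, the opening `<think>` tag may already be part of the prompt and therefore missing from `response`, so the sketch treats it as optional.

```python
from typing import Optional, Tuple


def split_reasoning(
    response: str,
    open_tag: str = "<think>",      # assumed opening tag of the reasoning block
    close_tag: str = "</think>",    # assumed closing tag of the reasoning block
    eos_token: Optional[str] = None,
) -> Tuple[str, str]:
    """Split decoded text into (reasoning, answer).

    If no closing tag is found, the reasoning is returned as an empty string
    and the whole text is treated as the answer.
    """
    text = response.rstrip()
    # Remove a trailing end-of-sentence marker if the caller supplied one.
    if eos_token and text.endswith(eos_token):
        text = text[: -len(eos_token)]

    reasoning, sep, answer = text.partition(close_tag)
    if not sep:
        return "", text.strip()

    reasoning = reasoning.strip()
    # The opening tag is optional because some templates put it in the prompt.
    if reasoning.startswith(open_tag):
        reasoning = reasoning[len(open_tag):]
    return reasoning.strip(), answer.strip()
```

With the demo above, this could be called as `reasoning, answer = split_reasoning(response, eos_token=tokenizer.eos_token)` to print the final answer without the reasoning trace.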