EADST

Check the Index and Token from Tiktoken

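import base64

path = "/home/your_dict_path.tiktoken"

# Each line of a .tiktoken file holds "<base64-encoded token> <index>".
with open(path, "rb") as f:
    lines = f.read().splitlines()

# Print the first 21 entries (indices 0 through 20).
for line in lines[:21]:
    token_b64, index = line.split()
    print("index: ", int(index))
    print("encode: ", token_b64.decode("ascii"))
    # Tokens are raw bytes and are not always valid UTF-8, so print them as bytes.
    print("decode: ", base64.b64decode(token_b64))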
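
To look up a token by index instead of printing only the first entries, the same line format can be loaded into a dictionary. The following is a minimal sketch, assuming every non-blank line is "<base64 token> <index>" as in the loop above. The function name parse_tiktoken_lines and the three-line sample are hypothetical and used only for illustration; they do not come from a real vocabulary file.

```python
import base64


def parse_tiktoken_lines(data: bytes) -> dict[bytes, int]:
    """Map each decoded token (raw bytes) to its integer index."""
    ranks = {}
    for line in data.splitlines():
        if not line.strip():
            continue  # skip blank lines
        token_b64, index = line.split()
        ranks[base64.b64decode(token_b64)] = int(index)
    return ranks


# Hypothetical in-memory sample in the same layout as a .tiktoken file.
sample = b"IQ== 0\nIg== 1\naGVsbG8= 2\n"

ranks = parse_tiktoken_lines(sample)

# Invert the mapping to look up a token by its index.
id_to_token = {index: token for token, index in ranks.items()}
print(id_to_token[2])
```

For a real file, pass open(path, "rb").read() to the function in place of the sample bytes.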

Reference Code
