EADST

Check the Index and Token from Tiktoken

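import base64

# Each line of a .tiktoken file is "<base64-encoded token> <index>".
path = "/home/your_dict_path.tiktoken"

with open(path, "rb") as f:
    # Print the first 21 entries only, as a quick sanity check.
    for count, line in enumerate(f):
        if count > 20:
            break
        parts = line.split()
        if len(parts) != 2:  # skip blank or malformed lines
            continue
        token_b64, index = parts
        print("index: ", int(index))
        print("encode: ", token_b64.decode("ascii"))
        print("decode: ", base64.b64decode(token_b64))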
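
To look up a specific token or index instead of printing the first entries, the same file can be loaded into dictionaries. The following is a minimal sketch, assuming the file uses the same two-column layout as above (base64 token, then index). The helper names `load_tiktoken_ranks` and `invert_ranks` and the three-token sample vocabulary are made up for illustration and are not part of the tiktoken library.

```python
import base64
import os
import tempfile


def load_tiktoken_ranks(path):
    """Parse a .tiktoken file into a {token_bytes: index} dict (hypothetical helper)."""
    ranks = {}
    with open(path, "rb") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:  # skip blank or malformed lines
                continue
            token_b64, index = parts
            ranks[base64.b64decode(token_b64)] = int(index)
    return ranks


def invert_ranks(ranks):
    """Build the reverse {index: token_bytes} mapping (hypothetical helper)."""
    return {index: token for token, index in ranks.items()}


# Demo on a tiny made-up vocabulary, written to a temporary file
# so the example runs without a real tokenizer file.
sample = b"".join(
    base64.b64encode(tok) + b" " + str(i).encode() + b"\n"
    for i, tok in enumerate([b"!", b"he", b"llo"])
)
with tempfile.NamedTemporaryFile(suffix=".tiktoken", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

ranks = load_tiktoken_ranks(sample_path)
by_index = invert_ranks(ranks)
os.remove(sample_path)

print(ranks[b"he"])   # 1
print(by_index[2])    # b'llo'
```

Tokens are kept as bytes because a single token is not always valid UTF-8 on its own; decode only after concatenating the tokens of a full sequence.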

Reference Code
