EADST

Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation techniques fall into three main categories.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with that matrix.
      • Sampler: weights neighbouring values by their relative position to compute each output value.
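The grid generator and sampler steps listed above can be sketched in plain NumPy. This is only an illustration of the generic STN mechanics, not the AFDM code: the hand-written affine matrix `theta` stands in for whatever the localisation network would predict, and the function names `affine_grid` and `bilinear_sample` are hypothetical.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Grid generator: for every output pixel, compute the source
    coordinate (in normalised [-1, 1] units) given a 2x3 affine matrix."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])
    src = theta @ coords  # shape (2, height * width)
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(feature, src_x, src_y):
    """Sampler: blend the four neighbours of each source coordinate,
    weighted by relative position. Needs a map of at least 2x2;
    out-of-range coordinates are clamped to the border."""
    h, w = feature.shape
    x = (src_x + 1) * (w - 1) / 2  # normalised -> pixel units
    y = (src_y + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    dx = np.clip(x - x0, 0.0, 1.0)
    dy = np.clip(y - y0, 0.0, 1.0)
    top = feature[y0, x0] * (1 - dx) + feature[y0, x0 + 1] * dx
    bottom = feature[y0 + 1, x0] * (1 - dx) + feature[y0 + 1, x0 + 1] * dx
    return top * (1 - dy) + bottom * dy

# Assumed example: a small shear applied to a toy 2-D feature map.
feature_map = np.arange(12.0).reshape(3, 4)
theta = np.array([[1.0, 0.2, 0.0],
                  [0.0, 1.0, 0.0]])
src_x, src_y = affine_grid(theta, 3, 4)
deformed = bilinear_sample(feature_map, src_x, src_y)
```

With the identity matrix the sampler returns the feature map unchanged, which is a quick sanity check before plugging in learned parameters.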


  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

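    • They combine moving least squares (MLS) image deformation with a learnable agent that decides how to move the control points, so augmentation and recognition are optimized jointly.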

      Code
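To make the moving least squares step concrete, here is a minimal sketch of the affine MLS deformation applied to a set of 2-D points. Several things are assumptions for illustration: the affine variant is used for brevity and is not necessarily the one in the paper, the function name `mls_affine` and its parameters are hypothetical, and the control-point shifts are drawn at random where the paper would use a learned agent.

```python
import numpy as np

def mls_affine(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Map each point with the affine moving-least-squares deformation
    defined by control points src_ctrl -> dst_ctrl (each of shape (n, 2)).
    Needs at least three non-collinear control points."""
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for k, v in enumerate(points):
        # Control points closer to v get larger weights.
        dist2 = np.sum((p - v) ** 2, axis=1)
        w = 1.0 / (dist2 ** alpha + eps)
        p_star = w @ p / w.sum()  # weighted centroids
        q_star = w @ q / w.sum()
        p_hat, q_hat = p - p_star, q - q_star
        # Weighted least-squares fit of the local affine matrix M.
        a = (p_hat * w[:, None]).T @ p_hat
        b = (p_hat * w[:, None]).T @ q_hat
        m = np.linalg.solve(a, b)
        out[k] = (v - p_star) @ m + q_star
    return out

# Assumed example: jitter the corners of a 100x32 text-line box.
rng = np.random.default_rng(0)
src = np.array([[0, 0], [100, 0], [100, 32], [0, 32]], dtype=float)
dst = src + rng.uniform(-5, 5, size=src.shape)  # stand-in for the agent
moved = mls_affine([[50.0, 16.0], [10.0, 4.0]], src, dst)
```

To warp an actual image, the same mapping would be evaluated at every pixel coordinate and the image resampled at the results.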


  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They reshape the characters with random distortions drawn from a normal distribution whose parameters come from the normalization step.
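A rough sketch of this kind of distortion, assuming a grayscale image: perturb a coarse grid of control points with normally distributed offsets, interpolate the offsets to every pixel, and resample. The function name `random_grid_distortion`, the grid `spacing`, and the choice of tying the standard deviation to the spacing through `sigma_ratio` are all assumptions made here; the paper derives its parameters from its own normalization step.

```python
import numpy as np

def random_grid_distortion(image, spacing=16, sigma_ratio=0.2, rng=None):
    """Warp a 2-D grayscale image by jittering a coarse control grid
    with N(0, sigma) offsets, where sigma = sigma_ratio * spacing."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    grid_y = np.arange(0, h + spacing, spacing)  # control rows cover the image
    grid_x = np.arange(0, w + spacing, spacing)
    sigma = sigma_ratio * spacing
    # One (dy, dx) offset per control point.
    offsets = rng.normal(0.0, sigma, size=(2, len(grid_y), len(grid_x)))

    dense = np.empty((2, h, w))
    for c in range(2):
        # Linear interpolation along x for each control row, then along y.
        rows = np.stack([np.interp(np.arange(w), grid_x, offsets[c, i])
                         for i in range(len(grid_y))])
        dense[c] = np.stack([np.interp(np.arange(h), grid_y, rows[:, j])
                             for j in range(w)], axis=1)

    ys, xs = np.mgrid[0:h, 0:w]
    # Nearest-neighbour resampling keeps the sketch short.
    src_y = np.clip(np.rint(ys + dense[0]), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xs + dense[1]), 0, w - 1).astype(int)
    return image[src_y, src_x]

# Assumed example: distort a synthetic 32x96 "word image".
word = np.zeros((32, 96))
word[8:24, 10:86] = 1.0
augmented = random_grid_distortion(word, rng=np.random.default_rng(0))
```

Setting `sigma_ratio=0` returns the image unchanged, and larger values give stronger warps.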


  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They mask out a regular grid of square blocks in the image.
    • The strategy is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification

      Code
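The masking idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: the function name `grid_mask` and the parameters `unit`, `keep_ratio`, `offset_y`, and `offset_x` are hypothetical, and rotation of the grid is left out.

```python
import numpy as np

def grid_mask(image, unit=8, keep_ratio=0.5, offset_y=0, offset_x=0):
    """Zero out one square block per `unit` x `unit` cell.
    Returns the masked image and the boolean keep-mask."""
    h, w = image.shape[:2]
    hole = int(round(unit * (1 - keep_ratio)))  # side length of each dropped square
    in_band_y = (np.arange(h) + offset_y) % unit < hole
    in_band_x = (np.arange(w) + offset_x) % unit < hole
    # A pixel is dropped only where a dropped row band meets a dropped column band.
    drop = in_band_y[:, None] & in_band_x[None, :]
    out = image.copy()
    out[drop] = 0
    return out, ~drop

# Assumed example: mask a 64x64 grayscale image with a random grid offset.
rng = np.random.default_rng(0)
sample = rng.random((64, 64))
masked, keep = grid_mask(sample, unit=16, keep_ratio=0.6,
                         offset_y=int(rng.integers(16)),
                         offset_x=int(rng.integers(16)))
```

Randomizing the offsets (and the unit size) per sample keeps the network from memorizing where the holes fall.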

