Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Image data augmentation falls into three main areas:

  • Spatial transforms
  • Color changes
  • Information deletion

Here are four papers: three use spatial transforms and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses adversarial learning, in the style of Generative Adversarial Networks (GANs), to generate the transformation matrix.
      • Grid Generator: applies that matrix to a sampling grid over the feature maps.
      • Sampler: computes each output value from neighboring input positions, weighted by their relative position.
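
The grid generator and sampler steps can be sketched in NumPy. This is only a minimal illustration, assuming a plain 2x3 affine matrix as in the original STN and a single 2-D feature map; the paper's module may use a different transform. The names `affine_grid` and `bilinear_sample` are my own, and `theta` is hand-picked here, whereas in AFDM it would come from the localisation network.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Grid generator: map every output pixel to a source location.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Returns source x and y coordinates, each of shape (height, width).
    """
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    src = theta @ coords  # (2, height * width)
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(feature, src_x, src_y):
    """Sampler: bilinear interpolation on a 2-D feature map (needs h, w >= 2)."""
    h, w = feature.shape
    # Normalized [-1, 1] -> pixel coordinates, clamped to the border
    x = np.clip((src_x + 1) * (w - 1) / 2, 0, w - 1)
    y = np.clip((src_y + 1) * (h - 1) / 2, 0, h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = x - x0  # horizontal weight of the right-hand neighbor
    wy = y - y0  # vertical weight of the lower neighbor
    top = feature[y0, x0] * (1 - wx) + feature[y0, x0 + 1] * wx
    bottom = feature[y0 + 1, x0] * (1 - wx) + feature[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

feat = np.arange(12.0).reshape(3, 4)
flip = np.array([[-1.0, 0.0, 0.0],   # hand-picked theta: horizontal flip
                 [0.0, 1.0, 0.0]])
src_x, src_y = affine_grid(flip, 3, 4)
flipped = bilinear_sample(feat, src_x, src_y)
```

With the identity matrix the output equals the input, which is a quick sanity check for the sampler.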

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine image deformation based on moving least squares with a learnable agent, so that the augmentation is optimized jointly with the recognition network.
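
To get a feel for moving least squares (MLS), here is a rough NumPy sketch of the affine MLS variant applied to a set of 2-D points. It is not the paper's code: the affine variant is used here for simplicity, the function name `mls_affine` and the parameters `alpha` and `eps` are my own, and the target control points are picked by hand, whereas in the paper a learned agent decides how the control points move.

```python
import numpy as np

def mls_affine(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Move `points` (N, 2) so that src_ctrl (K, 2) is pulled onto dst_ctrl (K, 2).

    Needs at least three control points that are not collinear.
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for n, v in enumerate(points):
        # Control points closer to v get a larger weight
        w = 1.0 / (np.sum((p - v) ** 2, axis=1) ** alpha + eps)
        p_star = w @ p / w.sum()  # weighted centroid of the sources
        q_star = w @ q / w.sum()  # weighted centroid of the targets
        p_hat = p - p_star
        q_hat = q - q_star
        # Weighted least-squares 2x2 matrix m (row-vector convention)
        a = (p_hat * w[:, None]).T @ p_hat
        b = (p_hat * w[:, None]).T @ q_hat
        m = np.linalg.solve(a, b)
        out[n] = (v - p_star) @ m + q_star
    return out

# Hand-picked control points: the four corners, with one corner pulled outward
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
dst = src.copy()
dst[3] = [12.0, 11.0]
moved = mls_affine([[3.0, 4.0], [7.0, 1.0]], src, dst)
```

To warp an image, the same function would be applied to every pixel coordinate and the image resampled at the returned positions.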

      Code

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They distort the characters with random perturbations drawn from a normal distribution, whose parameters come from the normalization step.
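
A rough sketch of this kind of distortion: put control points on a regular grid, shift each one by a normally distributed offset, and warp the image accordingly. Everything specific here is an assumption for illustration: the name `grid_distort`, the `spacing` and `sigma` values (in the paper the parameters come from the normalization step), and the nearest-neighbor resampling.

```python
import numpy as np

def grid_distort(image, spacing=8, sigma=1.5, rng=None):
    """Warp a 2-D image by normally distributed offsets on a regular grid."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    # Control points on a regular grid that covers the whole image
    gy = np.arange(0, h + spacing, spacing)
    gx = np.arange(0, w + spacing, spacing)
    # One offset per control point and axis, drawn from N(0, sigma^2)
    off_y = rng.normal(0.0, sigma, size=(len(gy), len(gx)))
    off_x = rng.normal(0.0, sigma, size=(len(gy), len(gx)))

    def upsample(ctrl):
        # Separable linear interpolation from the grid to every pixel
        rows = np.stack([np.interp(np.arange(w), gx, r) for r in ctrl])
        return np.stack([np.interp(np.arange(h), gy, c) for c in rows.T], axis=1)

    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Nearest-neighbor lookup at the displaced positions, clamped to the border
    src_y = np.clip(np.rint(ys + upsample(off_y)), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xs + upsample(off_x)), 0, w - 1).astype(int)
    return image[src_y, src_x]

img = np.arange(32 * 48).reshape(32, 48)
warped = grid_distort(img, rng=np.random.default_rng(0))
```

Setting `sigma=0.0` returns the image unchanged, and a larger `sigma` gives a stronger distortion.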

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They mask out regularly spaced square blocks of the image, arranged in a grid.
    • It is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification
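
The masking idea fits in a few lines of NumPy. This is a simplified sketch and not the paper's implementation: the parameter names `unit`, `block`, `offset` and `fill` are my own, and any extra randomization of the grid is left out.

```python
import numpy as np

def grid_mask(image, unit=16, block=8, offset=(0, 0), fill=0):
    """Blank out one block x block square inside every unit x unit cell."""
    h, w = image.shape[:2]
    # Position of each row and column inside its grid cell
    ys = (np.arange(h) + offset[0]) % unit
    xs = (np.arange(w) + offset[1]) % unit
    # A pixel is dropped when both its row and its column fall in the block
    drop = (ys[:, None] < block) & (xs[None, :] < block)
    out = image.copy()
    out[drop] = fill
    return out

img = np.ones((32, 32), dtype=np.uint8)
masked = grid_mask(img)
```

With the defaults above, a quarter of each 16x16 cell is removed. Drawing `offset` at random for every sample keeps the network from always seeing the same holes.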

      Code
