EADST

Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. doi: 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN). It has three parts:

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with that matrix.
      • Sampler: computes each output value from neighboring positions, weighted by their relative position.
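To make the grid generator and sampler steps concrete, here is a minimal NumPy sketch of the generic STN-style pipeline. It is not the AFDM implementation from the paper: a fixed affine matrix stands in for whatever the localisation network would predict, and the function names `affine_grid` and `bilinear_sample` are hypothetical.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Grid generator: map every output pixel to a source location.

    theta is a 2x3 affine matrix acting on normalized (x, y, 1)
    coordinates in [-1, 1]. Returns source x and y arrays of shape
    (height, width).
    """
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, height),
                         np.linspace(-1.0, 1.0, width), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    src = theta @ coords  # (2, height * width)
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(feature, src_x, src_y):
    """Sampler: read a 2-D feature map at fractional source locations.

    Each output value is a blend of the four neighboring pixels,
    weighted by relative position. Locations outside the map are
    clamped to the border. Assumes the map is at least 2x2.
    """
    h, w = feature.shape
    # Convert normalized coordinates back to pixel coordinates.
    x = (src_x + 1.0) * (w - 1) / 2.0
    y = (src_y + 1.0) * (h - 1) / 2.0
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = (1 - wx) * feature[y0, x0] + wx * feature[y0, x0 + 1]
    bottom = (1 - wx) * feature[y0 + 1, x0] + wx * feature[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom

# Example: a horizontal flip expressed as an affine matrix.
feature_map = np.arange(12, dtype=float).reshape(3, 4)
flip = np.array([[-1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
grid_x, grid_y = affine_grid(flip, 3, 4)
flipped = bilinear_sample(feature_map, grid_x, grid_y)
```

In a trainable module the matrix would be produced by a network and the same two steps would run on feature maps inside the recognizer, so that gradients can flow back through the sampler.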

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They deform the text image with moving least squares, and a learnable agent, optimized jointly with the recognition network, decides how to deform it.
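As a rough illustration of the deformation step only, the sketch below applies the affine variant of moving least squares to a set of 2-D points. The paper may use a different variant, and the learnable agent is not modeled: the control-point targets are simply given as input. The function name `mls_affine_deform` and its parameters are hypothetical.

```python
import numpy as np

def mls_affine_deform(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Move `points` so that src_ctrl is (approximately) carried to dst_ctrl.

    points:   (N, 2) locations to deform
    src_ctrl: (K, 2) control points before deformation
    dst_ctrl: (K, 2) control points after deformation
    Needs at least three non-collinear control points.
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for n, v in enumerate(points):
        # Control points closer to v get larger weights.
        dist_sq = np.sum((p - v) ** 2, axis=1)
        w = 1.0 / (dist_sq ** alpha + eps)
        # Weighted centroids of the source and target control points.
        p_star = w @ p / w.sum()
        q_star = w @ q / w.sum()
        p_hat = p - p_star
        q_hat = q - q_star
        # Best local affine map M in the weighted least-squares sense.
        a = (p_hat * w[:, None]).T @ p_hat
        b = (p_hat * w[:, None]).T @ q_hat
        m = np.linalg.solve(a, b)
        out[n] = (v - p_star) @ m + q_star
    return out

# Example: push the top-right control point outward and deform a few points.
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 4.0], [10.0, 4.0]])
dst = src.copy()
dst[1] += [1.5, -0.5]
samples = np.array([[5.0, 2.0], [9.0, 1.0]])
moved = mls_affine_deform(samples, src, dst)
```

To warp an image with this, one would evaluate the deformation on the pixel grid and resample the image at the resulting coordinates.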

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They warp the characters with random distortions drawn from a normal distribution whose parameters come from the normalization step.
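A minimal sketch of this kind of warp, under several assumptions: control points sit on a regular grid, each is displaced by an independent normal draw, the displacement is interpolated to every pixel, and the image is resampled with nearest-neighbor lookup. The function name `random_grid_warp` and the `spacing` and `sigma` parameters are hypothetical; here `sigma` is a free parameter rather than a value derived from a normalization step.

```python
import numpy as np

def random_grid_warp(image, spacing=8, sigma=1.5, rng=None):
    """Warp a 2-D image with normally distributed control-point shifts."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    # Regular grid of control points covering the whole image.
    grid_y = np.arange(0, h + spacing, spacing)
    grid_x = np.arange(0, w + spacing, spacing)
    # One normal draw per control point and per axis.
    shift_y = rng.normal(0.0, sigma, size=(grid_y.size, grid_x.size))
    shift_x = rng.normal(0.0, sigma, size=(grid_y.size, grid_x.size))

    # Position of every pixel in units of grid cells.
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    fy, fx = yy / spacing, xx / spacing
    y0, x0 = fy.astype(int), fx.astype(int)
    ty, tx = fy - y0, fx - x0

    def upsample(shift):
        # Bilinear interpolation of the coarse shifts to pixel resolution.
        top = (1 - tx) * shift[y0, x0] + tx * shift[y0, x0 + 1]
        bottom = (1 - tx) * shift[y0 + 1, x0] + tx * shift[y0 + 1, x0 + 1]
        return (1 - ty) * top + ty * bottom

    # Nearest-neighbor lookup at the displaced locations, clamped to the image.
    src_y = np.clip(np.rint(yy + upsample(shift_y)).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xx + upsample(shift_x)).astype(int), 0, w - 1)
    return image[src_y, src_x]

# Example: warp a synthetic 32x64 "line image" reproducibly.
line_image = np.zeros((32, 64))
line_image[12:20, 8:56] = 1.0
warped = random_grid_warp(line_image, rng=np.random.default_rng(0))
```

A smaller `sigma` keeps the text close to the original, while a larger one produces stronger local stretching and squeezing.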

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that blacks out regularly spaced blocks of the image.
    • A handwriting recognition paper applies this method: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification
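The masking idea can be sketched in a few lines of NumPy. This is an illustration, not the reference implementation: the function name `grid_mask` and the parameters `unit`, `block`, `offset_x`, and `offset_y` are hypothetical, and the random sampling of grid size, offset, and rotation that an augmentation pipeline would normally add is left out.

```python
import numpy as np

def grid_mask(image, unit=8, block=4, offset_x=0, offset_y=0):
    """Zero out one block x block square inside every unit x unit cell.

    Works on (H, W) grayscale or (H, W, C) color arrays.
    """
    h, w = image.shape[:2]
    # Rows and columns that fall inside a masked stripe.
    rows = (np.arange(h) + offset_y) % unit < block
    cols = (np.arange(w) + offset_x) % unit < block
    # A pixel is dropped only where a masked row meets a masked column.
    drop = np.outer(rows, cols)
    out = image.copy()
    out[drop] = 0
    return out

# Example: mask a blank 16x16 image with 4x4 blocks on an 8-pixel grid.
blank = np.ones((16, 16))
masked = grid_mask(blank, unit=8, block=4)
```

Because the dropped blocks are evenly spread, no large region of a character is erased completely, which is the main difference from deleting a single random rectangle.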
