
Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories:

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN). It has three parts:

      • Localisation network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid generator: transforms the feature maps with that matrix.
      • Sampler: computes each output value from neighboring positions, weighted by their relative position.
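The grid generator and sampler steps can be sketched in plain Python. This is a minimal illustration of the generic STN idea with a 2x3 affine matrix and bilinear interpolation, not the AFDM implementation from the paper. The function names `affine_grid` and `bilinear_sample` are chosen here for illustration, and the matrix is passed in by hand instead of being predicted by a localisation network.

```python
import math


def affine_grid(theta, height, width):
    """Grid generator: map every output pixel to a source coordinate.

    theta is a 2x3 affine matrix acting on coordinates normalized to [-1, 1].
    """
    grid = []
    for i in range(height):
        y = (-1.0 + 2.0 * i / (height - 1)) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = (-1.0 + 2.0 * j / (width - 1)) if width > 1 else 0.0
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(feature, grid):
    """Sampler: read the feature map at each grid point from its four neighbours."""
    height, width = len(feature), len(feature[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (width - 1) / 2.0
            py = (ys + 1.0) * (height - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            value = 0.0
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    if 0 <= yy < height and 0 <= xx < width:
                        # The weight shrinks linearly with distance to the neighbour.
                        weight = (1.0 - abs(px - xx)) * (1.0 - abs(py - yy))
                        value += weight * feature[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flip = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # mirrors the map left to right
flipped = bilinear_sample(feature_map, affine_grid(flip, 2, 3))
```

Points that fall outside the feature map contribute zero here. Other padding choices are possible.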


  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares (MLS) deformation with a learnable agent to augment data.
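To make the moving least squares step concrete, here is a small sketch of the affine MLS variant, which moves one point given control points `src` and their targets `dst`. Several things here are assumptions and not taken from the paper: the affine variant itself (the paper may use a different MLS variant), the weight exponent `alpha`, and the function name `mls_affine`. The learnable agent that decides where the control points move is left out, so the targets are set by hand.

```python
def mls_affine(v, src, dst, alpha=1.0, eps=1e-8):
    """Move point v with affine moving least squares (control points src -> dst)."""
    weights = []
    for (px, py), target in zip(src, dst):
        dist2 = (px - v[0]) ** 2 + (py - v[1]) ** 2
        if dist2 < eps:
            return tuple(target)  # v sits on a control point: map it exactly
        weights.append(1.0 / dist2 ** alpha)
    total = sum(weights)

    # Weighted centroids of the source and target control points.
    pcx = sum(w * p[0] for w, p in zip(weights, src)) / total
    pcy = sum(w * p[1] for w, p in zip(weights, src)) / total
    qcx = sum(w * q[0] for w, q in zip(weights, dst)) / total
    qcy = sum(w * q[1] for w, q in zip(weights, dst)) / total

    # A = sum w * p^T p and B = sum w * p^T q, using centred row vectors p and q.
    a00 = a01 = a11 = 0.0
    b00 = b01 = b10 = b11 = 0.0
    for w, p, q in zip(weights, src, dst):
        phx, phy = p[0] - pcx, p[1] - pcy
        qhx, qhy = q[0] - qcx, q[1] - qcy
        a00 += w * phx * phx
        a01 += w * phx * phy
        a11 += w * phy * phy
        b00 += w * phx * qhx
        b01 += w * phx * qhy
        b10 += w * phy * qhx
        b11 += w * phy * qhy

    # M = A^-1 B; A is singular when all control points lie on one line.
    det = a00 * a11 - a01 * a01
    m00 = (a11 * b00 - a01 * b10) / det
    m01 = (a11 * b01 - a01 * b11) / det
    m10 = (a00 * b10 - a01 * b00) / det
    m11 = (a00 * b11 - a01 * b01) / det

    dx, dy = v[0] - pcx, v[1] - pcy
    return (dx * m00 + dy * m10 + qcx, dx * m01 + dy * m11 + qcy)


# Four control points on the corners of a 10 x 5 text image; one corner is pulled outwards.
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0), (10.0, 5.0)]
dst = [(0.0, 0.0), (12.0, -1.0), (0.0, 5.0), (10.0, 5.0)]
moved = mls_affine((5.0, 2.0), src, dst)
```

Warping a whole image would apply this mapping to every pixel (or to a coarse grid of points) and then resample the image at the moved positions.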


  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They warp the characters with perturbations drawn from a normal distribution, whose parameters come from the normalization step.
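A warp driven by normally distributed perturbations can be sketched as follows. This is an illustrative approximation, not the authors' code. A coarse grid of control points gets Gaussian offsets, the offsets are interpolated bilinearly to every pixel, and the image is resampled with nearest-neighbour lookup. The parameters `spacing` and `sigma` and the function name `random_warp` are assumptions made for this example.

```python
import random


def random_warp(image, spacing=4, sigma=1.0, seed=None):
    """Warp a 2D image (list of rows) with a randomly perturbed control grid."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    rows, cols = height // spacing + 2, width // spacing + 2
    # One (dy, dx) displacement per control point, drawn from N(0, sigma).
    disp = [[(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(cols)]
            for _ in range(rows)]

    def offset(i, j, axis):
        # Bilinearly interpolate the control-point displacements at pixel (i, j).
        gy, gx = i // spacing, j // spacing
        fy, fx = (i % spacing) / spacing, (j % spacing) / spacing
        top = (1 - fx) * disp[gy][gx][axis] + fx * disp[gy][gx + 1][axis]
        bottom = (1 - fx) * disp[gy + 1][gx][axis] + fx * disp[gy + 1][gx + 1][axis]
        return (1 - fy) * top + fy * bottom

    out = []
    for i in range(height):
        row = []
        for j in range(width):
            # Nearest-neighbour lookup, clamped to the image borders.
            sy = min(max(round(i + offset(i, j, 0)), 0), height - 1)
            sx = min(max(round(j + offset(i, j, 1)), 0), width - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out


image = [[r * 10 + c for c in range(9)] for r in range(6)]
warped = random_warp(image, spacing=3, sigma=1.5, seed=1)
```

A larger `sigma` gives stronger distortion, and a larger `spacing` gives smoother distortion. Setting `sigma=0.0` returns the image unchanged.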


  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that blanks out regularly spaced blocks of the image.
    • GridMask is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification
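The grid mask idea is simple enough to sketch in a few lines. This is a simplified version written for illustration, not the official GridMask code. Inside every `d` x `d` unit, a square block is set to zero. The parameter names `d`, `keep_ratio`, `offset_y`, and `offset_x` are assumptions for this example, and the rotation of the mask is left out.

```python
def grid_mask(height, width, d, keep_ratio, offset_y=0, offset_x=0):
    """Build a 0/1 mask in which every d x d unit contains one zeroed square block."""
    keep = int(round(d * keep_ratio))  # pixels kept along each axis inside a unit
    mask = []
    for i in range(height):
        in_block_y = (i + offset_y) % d >= keep
        row = []
        for j in range(width):
            in_block_x = (j + offset_x) % d >= keep
            # A pixel is dropped only when it lies in the block on both axes.
            row.append(0 if (in_block_y and in_block_x) else 1)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Multiply the image by the mask so that masked pixels become 0."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


image = [[9] * 8 for _ in range(8)]
mask = grid_mask(8, 8, d=4, keep_ratio=0.5)
masked = apply_mask(image, mask)
```

During training, `d` and the two offsets would normally be drawn at random for each image so that the dropped blocks land in different places.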

