
Data Augmentation for Handwriting Recognition

I read several data augmentation papers this week.

Data augmentation methods fall into three main categories.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN). It has three parts:

      • Localisation Network: generates the transformation matrix and is trained adversarially, as in Generative Adversarial Networks (GANs).
      • Grid Generator: applies the matrix to build the sampling grid used to transform the feature maps.
      • Sampler: computes each output value by interpolating neighboring positions, weighted by their relative position.
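The grid generator and sampler steps can be sketched in NumPy as below. This is only an illustration of the STN-style mechanics, not the AFDM implementation: it assumes a 2x3 affine matrix `theta` fixed by hand (in the paper the localisation network predicts the transformation), a single-channel feature map, and border clamping. The names `affine_grid` and `bilinear_sample` are chosen here for illustration.

```python
import numpy as np

def affine_grid(theta, h, w):
    """Grid generator: for every output pixel, compute the source (x, y)
    location in normalized [-1, 1] coordinates. theta is a 2x3 matrix."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (h, w, 3)
    return coords @ np.asarray(theta, dtype=float).T        # (h, w, 2)

def bilinear_sample(fmap, grid):
    """Sampler: read fmap at the (fractional) grid locations, weighting the
    four neighboring pixels by their relative position."""
    h, w = fmap.shape
    x = (grid[..., 0] + 1) * (w - 1) / 2   # back to pixel coordinates
    y = (grid[..., 1] + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = np.clip(x - x0, 0, 1)             # clamp: out-of-range reads the border
    wy = np.clip(y - y0, 0, 1)
    top = (1 - wx) * fmap[y0, x0] + wx * fmap[y0, x0 + 1]
    bot = (1 - wx) * fmap[y0 + 1, x0] + wx * fmap[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bot

# Usage: a horizontal flip expressed as an affine matrix (hand-picked example).
fmap = np.arange(12, dtype=float).reshape(3, 4)
flip = [[-1, 0, 0], [0, 1, 0]]
flipped = bilinear_sample(fmap, affine_grid(flip, 3, 4))
```

Because both steps are built from differentiable arithmetic, a framework version of the same idea lets gradients flow back to whatever network produces `theta`.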


  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares image deformation with a learnable agent, so that the augmentation is optimized jointly with the recognition network.
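As a rough sketch of the deformation side only, the function below maps a single point with the affine variant of moving least squares (Schaefer et al., 2006). Which MLS variant the paper uses, and how its agent chooses the control-point moves, are not covered here: the control points `p`, their targets `q`, and the random offsets are all hypothetical stand-ins for what a learned agent would predict.

```python
import numpy as np

def mls_affine(v, p, q, alpha=1.0, eps=1e-8):
    """Map point v given control points p (n, 2) moved to q (n, 2).
    Needs at least three non-collinear control points."""
    v, p, q = (np.asarray(a, dtype=float) for a in (v, p, q))
    d2 = np.sum((p - v) ** 2, axis=1)
    w = 1.0 / (d2 ** alpha + eps)           # nearer control points weigh more
    p_star = w @ p / w.sum()                # weighted centroids
    q_star = w @ q / w.sum()
    p_hat, q_hat = p - p_star, q - q_star
    A = (p_hat * w[:, None]).T @ p_hat      # sum_i w_i * p_hat_i^T p_hat_i
    B = (p_hat * w[:, None]).T @ q_hat      # sum_i w_i * p_hat_i^T q_hat_i
    M = np.linalg.solve(A, B)               # best local affine matrix
    return (v - p_star) @ M + q_star

# Usage: move the corner control points of a 100x32 word image by small
# random offsets (a stand-in for the agent's prediction) and map one pixel.
rng = np.random.default_rng(0)
p = np.array([[0, 0], [99, 0], [0, 31], [99, 31]], dtype=float)
q = p + rng.uniform(-5, 5, size=p.shape)
new_xy = mls_affine([50.0, 16.0], p, q)
```

Applying the function to every pixel coordinate gives a dense warp field; a real pipeline would vectorize this and resample the image with interpolation.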


  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They distort the characters with random offsets drawn from a normal distribution, whose parameters come from the normalization step.
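A minimal sketch of the sampling step, assuming a regular grid of control points whose positions are perturbed by normally distributed offsets. The function name `perturbed_grid` and the default `spacing` and `sigma` values are illustrative assumptions, not values taken from the paper, where the distribution parameters are tied to the normalization step.

```python
import numpy as np

def perturbed_grid(h, w, spacing=16, sigma=2.0, rng=None):
    """Return (src, dst) control points as (n, 2) arrays of (x, y).
    src lies on a regular grid; dst adds N(0, sigma^2) noise to each point."""
    rng = np.random.default_rng() if rng is None else rng
    ys, xs = np.meshgrid(np.arange(0, h, spacing), np.arange(0, w, spacing), indexing="ij")
    src = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(float)
    dst = src + rng.normal(0.0, sigma, size=src.shape)
    return src, dst

# Usage: control points for a 32x64 word image, seeded for reproducibility.
src, dst = perturbed_grid(32, 64, spacing=16, sigma=2.0, rng=np.random.default_rng(0))
```

The `src` to `dst` correspondences would then drive an image warp of your choice; a larger `sigma` gives stronger distortion.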


  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that blanks out regularly spaced square blocks of the image.
    • The method is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification
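The masking idea can be sketched as follows. This is a simplified version under stated assumptions, not the official code: the parameter names `d` (grid unit size), `r` (kept fraction of each unit edge), and `dx`/`dy` (grid offsets) are chosen for illustration, and the sketch skips the randomization of these values per image.

```python
import numpy as np

def grid_mask(h, w, d=32, r=0.5, dx=0, dy=0):
    """Return an (h, w) float mask: 1 keeps a pixel, 0 deletes it.
    In every d x d unit, a square of side d - round(r * d) is deleted."""
    keep_len = int(round(r * d))
    ys = (np.arange(h) + dy) % d
    xs = (np.arange(w) + dx) % d
    drop = (ys[:, None] >= keep_len) & (xs[None, :] >= keep_len)
    return (~drop).astype(np.float32)

def apply_grid_mask(img, **kwargs):
    """Multiply a grayscale (h, w) or color (h, w, c) image by the mask."""
    mask = grid_mask(img.shape[0], img.shape[1], **kwargs)
    return img * (mask if img.ndim == 2 else mask[..., None])

# Usage: mask a dummy 64x64 grayscale image; a quarter of the pixels are deleted.
img = np.ones((64, 64), dtype=np.float32)
masked = apply_grid_mask(img, d=32, r=0.5)
```

With `r=0.5` each unit loses a block covering one quarter of its area; lowering `r` deletes more of the image.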

