Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories:

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. doi: 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN). It has three parts:

      • Localisation network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid generator: transforms the feature maps with that matrix.
      • Sampler: weights neighboring positions by their relative position to compute the resampled values.
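
To make the grid generator and sampler steps concrete, here is a minimal NumPy sketch of the generic STN pipeline, not the AFDM code from the paper. It assumes a plain 2x3 affine matrix (`theta`) written by hand in place of the localisation network's output, a single-channel feature map of at least 2x2, and border clamping for out-of-range coordinates. The function names `affine_grid` and `bilinear_sample` are my own.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Grid generator: map every output pixel to a source coordinate.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Returns an array of shape (height, width, 2) holding (x, y) source coords.
    """
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, height),
                         np.linspace(-1.0, 1.0, width), indexing="ij")
    target = np.stack([xs, ys, np.ones_like(xs)], axis=-1)   # (H, W, 3)
    return target @ np.asarray(theta, dtype=float).T          # (H, W, 2)

def bilinear_sample(feature, grid):
    """Sampler: read a (H, W) feature map at the coordinates in `grid`.

    Each output value is a blend of the four nearest pixels, weighted by
    how close the sampling position is to each of them.
    """
    h, w = feature.shape
    # Convert normalized [-1, 1] coordinates to pixel positions, clamped
    # to the image border (an assumption made for this sketch).
    x = np.clip((grid[..., 0] + 1.0) * (w - 1) / 2.0, 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * (h - 1) / 2.0, 0, h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * feature[y0, x0] + wx * feature[y0, x0 + 1]
    bottom = (1 - wx) * feature[y0 + 1, x0] + wx * feature[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom

# Hand-written matrices standing in for the localisation network's output.
feature = np.arange(20, dtype=float).reshape(4, 5)
identity = [[1, 0, 0], [0, 1, 0]]
flip = [[-1, 0, 0], [0, 1, 0]]
same = bilinear_sample(feature, affine_grid(identity, 4, 5))
mirrored = bilinear_sample(feature, affine_grid(flip, 4, 5))
```

With the identity matrix the feature map comes back unchanged, and negating the x row mirrors it horizontally. In a trained model the matrix would be predicted per input rather than fixed.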


  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares with a learnable agent to augment data.
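
As a rough illustration of the moving least squares part, the sketch below deforms a set of 2D points so that source control points are pulled toward target control points. It is not the paper's code: it uses the affine variant of moving least squares for brevity, which may differ from the variant used in the paper, and random offsets stand in for the learnable agent. The function name `mls_affine`, the parameters `alpha` and `eps`, and the 100x32 text-line size are all assumptions.

```python
import numpy as np

def mls_affine(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Deform `points` so that src_ctrl is pulled toward dst_ctrl.

    points: (N, 2); src_ctrl and dst_ctrl: (K, 2) with K >= 3 control
    points that are not all on one line. Returns an (N, 2) array.
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for n, v in enumerate(points):
        # Weights fall off with distance, so nearby control points dominate.
        w = 1.0 / (np.sum((p - v) ** 2, axis=1) ** alpha + eps)
        p_star = w @ p / w.sum()          # weighted centroid of the sources
        q_star = w @ q / w.sum()          # weighted centroid of the targets
        p_hat, q_hat = p - p_star, q - q_star
        # Weighted least-squares fit of a 2x2 matrix M with p_hat @ M ~= q_hat.
        a = (p_hat * w[:, None]).T @ p_hat
        b = (p_hat * w[:, None]).T @ q_hat
        m = np.linalg.solve(a, b)
        out[n] = (v - p_star) @ m + q_star
    return out

# Hypothetical setup: control points along the top and bottom edges of a
# 100x32 text-line image, each jittered by a random offset. A random
# generator stands in here for the learned agent.
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 100.0, 4)
src = np.array([(x, y) for y in (0.0, 32.0) for x in xs])
dst = src + rng.uniform(-5.0, 5.0, size=src.shape)
corners = np.array([[0.0, 0.0], [100.0, 32.0]])
warped_corners = mls_affine(corners, src, dst)
```

To warp an actual image, one would run `mls_affine` on the pixel grid and resample the image at the deformed coordinates; that step is omitted here.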


  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They reshape the characters using values drawn from a normal distribution whose parameters come from the normalization step.


  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They mask out blocks of the image that are arranged in a regular grid.
    • It has been applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification.
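
The grid masking idea can be sketched in a few lines of NumPy. This is a simplified version written for illustration, not the authors' implementation: the parameter names (`d` for the cell size, `ratio` for the dropped square's side as a fraction of the cell, `offset_x` and `offset_y` for the grid shift) are hypothetical, the offsets are fixed instead of random, and the grid is not rotated.

```python
import numpy as np

def grid_mask(height, width, d=8, ratio=0.5, offset_x=0, offset_y=0):
    """Return a (height, width) mask: 0 inside dropped squares, 1 elsewhere.

    The image is tiled with d x d cells, and each cell drops one square
    of side round(d * ratio) in its top-left corner.
    """
    side = int(round(d * ratio))
    ys = (np.arange(height) + offset_y) % d
    xs = (np.arange(width) + offset_x) % d
    dropped = (ys[:, None] < side) & (xs[None, :] < side)
    return (~dropped).astype(np.float32)

def apply_grid_mask(image, **kwargs):
    """Zero out the dropped squares of an (H, W) or (H, W, C) image."""
    mask = grid_mask(image.shape[0], image.shape[1], **kwargs)
    if image.ndim == 3:
        mask = mask[..., None]   # broadcast the mask over channels
    return image * mask

# Example: a white 16x16 RGB image with a quarter of its pixels dropped.
image = np.ones((16, 16, 3))
masked = apply_grid_mask(image, d=8, ratio=0.5)
```

With `d=8` and `ratio=0.5`, each 8x8 cell loses a 4x4 square, so a quarter of the pixels are zeroed. During training one would typically draw the offsets and cell size at random per image.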

