EADST

Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main areas.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), an algorithm inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with that matrix.
      • Sampler: updates the weights based on the relative positions of neighboring points.
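
The grid generator and sampler stages listed above can be sketched in plain Python. This is only a minimal illustration of a generic STN-style warp, not the AFDM code from the paper. It assumes an affine transform for simplicity, the 2x3 matrix `theta` is written by hand instead of being predicted by a localisation network, and the names `affine_grid` and `bilinear_sample` are hypothetical.

```python
import math


def affine_grid(theta, height, width):
    """Grid generator: map each output pixel to a source location.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Returns a height x width list of (x_src, y_src) pairs, also normalized.
    """
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            x_src = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            y_src = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((x_src, y_src))
        grid.append(row)
    return grid


def bilinear_sample(feature, grid):
    """Sampler: read each source location as a weighted mix of its 4 neighbors."""
    height, width = len(feature), len(feature[0])

    def pixel(r, c):
        # Locations outside the map contribute zero (zero padding).
        if 0 <= r < height and 0 <= c < width:
            return feature[r][c]
        return 0.0

    out = []
    for grid_row in grid:
        out_row = []
        for x_src, y_src in grid_row:
            # Convert normalized coordinates back to pixel indices.
            col = (x_src + 1.0) * (width - 1) / 2.0
            row = (y_src + 1.0) * (height - 1) / 2.0
            c0, r0 = math.floor(col), math.floor(row)
            dc, dr = col - c0, row - r0
            # Interpolation weights depend on the distance to each neighbor.
            value = (
                pixel(r0, c0) * (1 - dr) * (1 - dc)
                + pixel(r0, c0 + 1) * (1 - dr) * dc
                + pixel(r0 + 1, c0) * dr * (1 - dc)
                + pixel(r0 + 1, c0 + 1) * dr * dc
            )
            out_row.append(value)
        out.append(out_row)
    return out


feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
flip_x = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

same = bilinear_sample(feature_map, affine_grid(identity, 3, 3))
mirrored = bilinear_sample(feature_map, affine_grid(flip_x, 3, 3))
```

With the identity matrix the feature map comes back unchanged, and negating the x scale mirrors each row. In a deep learning framework the same two stages are usually available as built-in, differentiable operations.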


  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares with a learnable agent to augment data.

      Code
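
To give a feel for the moving least squares part, here is a rough sketch of the affine variant of MLS deformation for a single point. It is not the authors' implementation: the paper may use a different MLS variant, the function name `mls_affine` and the control point layout are hypothetical, and the control points are moved by random offsets here instead of by a learnable agent.

```python
import random


def mls_affine(point, src, dst, alpha=1.0, eps=1e-8):
    """Map one (x, y) point with affine moving least squares.

    src and dst are equal-length lists of control points (before and after
    the move). At least three non-collinear control points are needed.
    """
    vx, vy = point
    # Closer control points get larger weights: w_i = 1 / |p_i - v|^(2*alpha).
    weights = [
        1.0 / (((px - vx) ** 2 + (py - vy) ** 2) ** alpha + eps)
        for px, py in src
    ]
    total = sum(weights)
    # Weighted centroids of the source and destination control points.
    pcx = sum(w * p[0] for w, p in zip(weights, src)) / total
    pcy = sum(w * p[1] for w, p in zip(weights, src)) / total
    qcx = sum(w * q[0] for w, q in zip(weights, dst)) / total
    qcy = sum(w * q[1] for w, q in zip(weights, dst)) / total

    # Accumulate A = sum w * p_hat^T p_hat and B = sum w * p_hat^T q_hat.
    a00 = a01 = a11 = 0.0
    b00 = b01 = b10 = b11 = 0.0
    for w, (px, py), (qx, qy) in zip(weights, src, dst):
        phx, phy = px - pcx, py - pcy
        qhx, qhy = qx - qcx, qy - qcy
        a00 += w * phx * phx
        a01 += w * phx * phy
        a11 += w * phy * phy
        b00 += w * phx * qhx
        b01 += w * phx * qhy
        b10 += w * phy * qhx
        b11 += w * phy * qhy

    # M = A^-1 B, with the 2x2 inverse written out by hand.
    det = a00 * a11 - a01 * a01
    m00 = (a11 * b00 - a01 * b10) / det
    m01 = (a11 * b01 - a01 * b11) / det
    m10 = (a00 * b10 - a01 * b00) / det
    m11 = (a00 * b11 - a01 * b01) / det

    # f(v) = (v - p*) M + q*
    dx, dy = vx - pcx, vy - pcy
    return (dx * m00 + dy * m10 + qcx, dx * m01 + dy * m11 + qcy)


rng = random.Random(0)
# Hypothetical control points on the top and bottom borders of a 100x32 image.
src_points = [(0.0, 0.0), (50.0, 0.0), (100.0, 0.0),
              (0.0, 32.0), (50.0, 32.0), (100.0, 32.0)]
# Random offsets stand in for the moves chosen during augmentation.
dst_points = [(x + rng.uniform(-5, 5), y + rng.uniform(-5, 5))
              for x, y in src_points]
warped = mls_affine((25.0, 16.0), src_points, dst_points)
```

Applying `mls_affine` to every pixel coordinate gives a smooth warp of the whole text image. Two quick sanity checks: if the control points do not move, every point maps to itself, and if they all shift by the same offset, every point shifts by that offset too.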


  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They warp the characters with perturbations drawn from a normal distribution, whose parameters come from the normalization step.
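
A rough sketch of this kind of warp, under my own assumptions: control points sit on a coarse grid, each one gets an offset drawn from a normal distribution, and the image is resampled with the interpolated offsets. The function name `random_grid_warp` is hypothetical, and `spacing` and `sigma` are fixed placeholder values here, whereas the paper takes its parameters from the normalization step.

```python
import random


def random_grid_warp(image, spacing=8, sigma=1.5, seed=None):
    """Warp a 2-D image (list of rows) with a randomly perturbed control grid."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    # Control points sit every `spacing` pixels; one extra row and column
    # cover the right and bottom borders.
    rows = height // spacing + 2
    cols = width // spacing + 2
    # Each control point gets an independent (dx, dy) offset from N(0, sigma^2).
    offsets = [
        [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(cols)]
        for _ in range(rows)
    ]

    warped = []
    for y in range(height):
        gy, ty = y // spacing, (y % spacing) / spacing
        out_row = []
        for x in range(width):
            gx, tx = x // spacing, (x % spacing) / spacing
            # Bilinearly interpolate the offsets of the four surrounding
            # control points to get a smooth displacement field.
            dx = dy = 0.0
            for r, c, w in (
                (gy, gx, (1 - ty) * (1 - tx)),
                (gy, gx + 1, (1 - ty) * tx),
                (gy + 1, gx, ty * (1 - tx)),
                (gy + 1, gx + 1, ty * tx),
            ):
                dx += w * offsets[r][c][0]
                dy += w * offsets[r][c][1]
            # Nearest-neighbor lookup, clamped to the image borders.
            src_x = min(max(int(round(x + dx)), 0), width - 1)
            src_y = min(max(int(round(y + dy)), 0), height - 1)
            out_row.append(image[src_y][src_x])
        warped.append(out_row)
    return warped


# Toy 16x32 "image": vertical strokes every fourth column.
strokes = [[255 if x % 4 == 0 else 0 for x in range(32)] for _ in range(16)]
augmented = random_grid_warp(strokes, spacing=8, sigma=1.5, seed=0)
```

A larger `sigma` bends the strokes more, and `sigma=0` returns the image unchanged. Nearest-neighbor lookup keeps the sketch short; a real pipeline would more likely use bilinear or bicubic resampling.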


  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy: square blocks arranged in a regular grid are masked out of the image.
    • The method is applied to handwriting recognition in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification

      Code
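
The masking idea is simple enough to sketch in a few lines. This is a simplified version written for illustration, not the released GridMask code: it leaves out details such as rotating the mask, and the name `grid_mask` and the parameters `d` (cell size) and `ratio` (kept fraction of each cell side) are my own choices.

```python
import random


def grid_mask(image, d=8, ratio=0.5, seed=None, fill=0):
    """Blank out a regular grid of square blocks in a 2-D image (list of rows)."""
    rng = random.Random(seed)
    keep = int(d * ratio)  # width of the kept stripe inside each d x d cell
    # Random offsets so the grid does not always start at the top-left corner.
    off_y, off_x = rng.randrange(d), rng.randrange(d)
    masked = []
    for y, row in enumerate(image):
        in_block_y = (y + off_y) % d >= keep
        # A pixel is dropped only when it falls outside the kept stripe
        # in both directions, which gives square holes of side d - keep.
        masked.append([
            fill if in_block_y and (x + off_x) % d >= keep else value
            for x, value in enumerate(row)
        ])
    return masked


ones = [[1] * 16 for _ in range(16)]
masked_ones = grid_mask(ones, d=8, ratio=0.5, seed=0)
kept_pixels = sum(sum(row) for row in masked_ones)
```

With `d=8` and `ratio=0.5` on a 16x16 image, each 8x8 cell loses a 4x4 block, so 192 of the 256 pixels survive whatever the random offset is.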

