
Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformations and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with this matrix.
      • Sampler: computes each output value from the neighboring positions, weighted by their relative position.

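The grid generator and sampler steps listed above can be sketched in plain Python. This is a minimal illustration of the general STN mechanism, not the paper's AFDM: the function names `affine_grid` and `bilinear_sample` are hypothetical, the 2x3 affine matrix `theta` is passed in by hand where a localisation network would predict it, and the paper's module may use a different transform family.

```python
import math


def affine_grid(theta, height, width):
    """Grid generator: map each output pixel to a source coordinate.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Assumes height >= 2 and width >= 2.
    """
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(feature, grid):
    """Sampler: read the feature map at each grid position.

    Each output value is a weighted sum of the (up to) four neighboring
    pixels; positions outside the map contribute zero.
    """
    height, width = len(feature), len(feature[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (width - 1) / 2.0
            py = (ys + 1.0) * (height - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            value = 0.0
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    if 0 <= xx < width and 0 <= yy < height:
                        weight = (1 - abs(px - xx)) * (1 - abs(py - yy))
                        value += weight * feature[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


feature = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
flip = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # mirror the map left to right
flipped = bilinear_sample(feature, affine_grid(flip, 3, 3))
```

With the identity matrix the output equals the input, and with `flip` each row comes back reversed, which is a quick way to check that the grid and the sampler agree on the coordinate convention.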

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

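    • They deform text images with a moving least squares transformation whose control-point movements come from a learnable agent, so that augmentation and recognizer training are optimized jointly.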

      Code

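To make the moving least squares step concrete, here is a small sketch of the affine variant of the deformation applied to a single point. It is an illustration under assumptions: the function name `mls_affine` and its parameters are hypothetical, the paper may use a different MLS variant, and the target control points are supplied by hand where the paper has the agent choose them.

```python
def mls_affine(v, src, dst, alpha=1.0, eps=1e-8):
    """Move point v under the MLS affine deformation that takes the
    control points src to dst. Needs at least three non-collinear points."""
    weights = []
    for p, q in zip(src, dst):
        d2 = (p[0] - v[0]) ** 2 + (p[1] - v[1]) ** 2
        if d2 < eps:
            return q  # v sits on a control point, so it moves exactly to its target
        weights.append(1.0 / d2 ** alpha)
    total = sum(weights)

    # Weighted centroids of the source and target control points.
    pcx = sum(w * p[0] for w, p in zip(weights, src)) / total
    pcy = sum(w * p[1] for w, p in zip(weights, src)) / total
    qcx = sum(w * q[0] for w, q in zip(weights, dst)) / total
    qcy = sum(w * q[1] for w, q in zip(weights, dst)) / total

    # A = sum w * p_hat^T p_hat and B = sum w * p_hat^T q_hat (both 2x2).
    a11 = a12 = a22 = 0.0
    b11 = b12 = b21 = b22 = 0.0
    for w, p, q in zip(weights, src, dst):
        phx, phy = p[0] - pcx, p[1] - pcy
        qhx, qhy = q[0] - qcx, q[1] - qcy
        a11 += w * phx * phx
        a12 += w * phx * phy
        a22 += w * phy * phy
        b11 += w * phx * qhx
        b12 += w * phx * qhy
        b21 += w * phy * qhx
        b22 += w * phy * qhy

    # M = A^-1 B, using the closed-form inverse of a symmetric 2x2 matrix.
    det = a11 * a22 - a12 * a12
    m11 = (a22 * b11 - a12 * b21) / det
    m12 = (a22 * b12 - a12 * b22) / det
    m21 = (a11 * b21 - a12 * b11) / det
    m22 = (a11 * b22 - a12 * b12) / det

    # f(v) = (v - p*) M + q*, with points treated as row vectors.
    dx, dy = v[0] - pcx, v[1] - pcy
    return (dx * m11 + dy * m21 + qcx, dx * m12 + dy * m22 + qcy)


src = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
dst = [(x + 1.0, y + 2.0) for x, y in src]  # shift every control point by (1, 2)
moved = mls_affine((1.0, 1.0), src, dst)
```

When every control point is shifted by the same offset, as in this usage example, the deformation reduces to that shift for any point. Warping a whole image would apply the function to every pixel coordinate and resample, which is left out here.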

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They reshape the characters with perturbations drawn from a normal distribution whose parameters come from the normalization step.

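One way to picture perturbations drawn from a normal distribution is to jitter a regular grid of control points laid over the text image. The sketch below only generates the perturbed control points; `perturbed_grid`, `spacing`, and `sigma` are hypothetical names, and how `sigma` would be derived from the normalization step is an assumption not spelled out here.

```python
import random


def perturbed_grid(height, width, spacing, sigma, rng=None):
    """Place control points every `spacing` pixels and move each one by an
    offset drawn from a normal distribution N(0, sigma) on both axes."""
    rng = rng or random.Random()
    points = []
    for y in range(0, height + 1, spacing):
        row = []
        for x in range(0, width + 1, spacing):
            row.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
        points.append(row)
    return points


# A 32x64 line image with control points every 16 pixels.
grid = perturbed_grid(32, 64, spacing=16, sigma=1.5, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible, and `sigma=0` returns the undistorted grid. Warping the image between the original and perturbed control points is a separate interpolation step that this sketch omits.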

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that blanks out regularly spaced blocks of the image.
    • This paper applies the method to handwriting recognition: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification

      Code

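The grid mask idea can be sketched in a few lines. This is a simplified illustration rather than the reference implementation: the names `grid_mask` and `apply_mask` are hypothetical, the parameters `d` (size of one grid unit), `r` (fraction of each unit's side that is kept), and the offsets `dx`, `dy` are assumed names, and mask rotation is left out.

```python
def grid_mask(height, width, d, r, dx=0, dy=0):
    """Build a 0/1 mask: inside every d x d unit, a square block of side
    d - int(r * d) is set to 0 (dropped) and the rest stays 1 (kept)."""
    keep = int(r * d)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            dropped = (x - dx) % d >= keep and (y - dy) % d >= keep
            row.append(0 if dropped else 1)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Zero out the masked pixels of a grayscale image given as nested lists."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


mask = grid_mask(8, 8, d=4, r=0.5)
image = [[255] * 8 for _ in range(8)]
masked = apply_mask(image, mask)
```

With `d=4` and `r=0.5`, each 4x4 unit loses a 2x2 block, so three quarters of the pixels survive. In general the kept fraction is 1 - (1 - r)^2 when `r * d` is an integer; drawing `d`, `dx`, and `dy` at random per image varies the pattern from sample to sample.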
