Data Augmentation for Handwriting Recognition

I read some data augmentation papers this week.

Data augmentation methods fall into three main categories.

  • Spatial transformation
  • Color change
  • Information deletion

Here are four papers: three use spatial transformation and one uses information deletion.

  1. Bhunia, Ayan & Das, Abhirup & Bhunia, Ankan & Perla, Sai & Roy, Partha. (2019). Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning. 10.1109/CVPR.2019.00490.

    • They propose the Adversarial Feature Deformation Module (AFDM), inspired by Spatial Transformer Networks (STN).

      • Localisation Network: uses Generative Adversarial Networks (GANs) to generate the transformation matrix.
      • Grid Generator: transforms the feature maps with that matrix.
      • Sampler: computes each output value from its neighbors, weighted by their relative positions.
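
To make the grid generator and sampler steps concrete, here is a minimal NumPy sketch of the two STN-style operations on a single 2-D feature map. It is not the AFDM implementation from the paper. The matrix `theta` is fixed by hand, whereas in the paper the localisation network produces it, and the function names `affine_grid` and `bilinear_sample` are my own choices for illustration.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Grid generator: for every output pixel, compute the source location.

    theta is a 2x3 affine matrix acting on normalized (x, y, 1) coordinates
    in [-1, 1]. Returns an array of shape (height, width, 2) holding (x, y).
    """
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ np.asarray(theta, dtype=float).T        # (H, W, 2)

def bilinear_sample(feature, grid):
    """Sampler: read the feature map at the grid locations.

    Each output value is a blend of the four nearest pixels, weighted by
    relative position. Locations outside the map are clamped to the border.
    Assumes the feature map is at least 2x2.
    """
    h, w = feature.shape
    # Convert normalized coordinates to pixel coordinates.
    x = np.clip((grid[..., 0] + 1) * (w - 1) / 2, 0, w - 1)
    y = np.clip((grid[..., 1] + 1) * (h - 1) / 2, 0, h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx, wy = x - x0, y - y0
    top = feature[y0, x0] * (1 - wx) + feature[y0, x0 + 1] * wx
    bottom = feature[y0 + 1, x0] * (1 - wx) + feature[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

# Example: a horizontal flip expressed as an affine matrix.
feature_map = np.arange(12, dtype=float).reshape(3, 4)
flip = [[-1, 0, 0], [0, 1, 0]]
flipped = bilinear_sample(feature_map, affine_grid(flip, 3, 4))
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` the output reproduces the input, which is a quick sanity check before plugging in a learned matrix.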

  2. Luo, Canjie & Zhu, Yuanzhi & Jin, Lianwen & Wang, Yongpan. (2020). Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition.

    • They combine moving least squares with a learnable agent to augment data.
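
As a rough illustration of the moving least squares part, the sketch below maps a single point with the affine variant of MLS deformation, given control points `p` and their moved positions `q`. I am assuming the affine variant for simplicity, and the paper may use a different one. The control-point movements here are set by hand, while in the paper the learnable agent decides them. The name `mls_affine` and the `alpha` and `eps` parameters are illustrative.

```python
import numpy as np

def mls_affine(v, p, q, alpha=1.0, eps=1e-8):
    """Map point v with affine moving least squares.

    p: (n, 2) original control points, q: (n, 2) moved control points.
    Needs at least three non-collinear control points.
    """
    v = np.asarray(v, dtype=float)
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Control points close to v get larger weights.
    w = 1.0 / (np.sum((p - v) ** 2, axis=1) ** alpha + eps)
    # Weighted centroids of the original and moved control points.
    p_star = w @ p / w.sum()
    q_star = w @ q / w.sum()
    p_hat = p - p_star
    q_hat = q - q_star
    # Solve the weighted least squares problem for the 2x2 matrix M
    # that minimizes sum_i w_i * |p_hat_i @ M - q_hat_i|^2.
    a = (p_hat * w[:, None]).T @ p_hat
    b = (p_hat * w[:, None]).T @ q_hat
    m = np.linalg.solve(a, b)
    return (v - p_star) @ m + q_star

# Example: move one corner of a square and see where an interior point goes.
p = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)
q = p.copy()
q[2] += [2.0, 1.0]
moved = mls_affine([5.0, 5.0], p, q)
```

To warp a whole image, one would evaluate this mapping for every pixel (or a coarse grid of pixels) and resample the image at the mapped locations.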

      Code

  3. C. Wigington, S. Stewart, B. Davis, B. Barrett, B. Price and S. Cohen, "Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 639-645, doi: 10.1109/ICDAR.2017.110.

    • They distort the characters with random perturbations drawn from a normal distribution whose parameters come from the normalization step.
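
One way to picture this kind of distortion is a regular grid of control points, each displaced by a sample from a normal distribution. The sketch below only builds the original and displaced control points. It assumes the distortion is driven by such a grid, and `spacing` and `sigma` are placeholder values of mine, not the parameters derived in the paper.

```python
import numpy as np

def perturbed_grid(height, width, spacing, sigma, seed=0):
    """Return (source, target) control points of shape (rows, cols, 2).

    source is a regular grid of (x, y) points covering the image;
    target is the same grid with N(0, sigma^2) noise added to each coordinate.
    """
    rng = np.random.default_rng(seed)
    ys = np.arange(0, height + 1, spacing, dtype=float)
    xs = np.arange(0, width + 1, spacing, dtype=float)
    gx, gy = np.meshgrid(xs, ys)
    source = np.stack([gx, gy], axis=-1)
    target = source + rng.normal(0.0, sigma, size=source.shape)
    return source, target

# Example: control points for a 32x96 word image, one every 16 pixels.
source, target = perturbed_grid(32, 96, spacing=16, sigma=1.5)
```

The two point sets can then be fed to any warping routine that moves `source` points to `target` points.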

  4. Pengguang Chen. GridMask data augmentation. arXiv preprint arXiv:2001.04086, 2020.

    • They use a grid mask strategy that blacks out blocks of the image arranged in a regular grid.
    • A handwriting recognition application appears in this paper: GridMask Based Data Augmentation for Bengali Handwritten Grapheme Classification
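
The grid mask idea can be sketched in a few lines of NumPy for a grayscale image. This is a simplified version: the parameter names `unit` (grid period in pixels) and `drop_len` (side of each dropped square) are my own, and the random rotation and offsets used in practice are left out or fixed.

```python
import numpy as np

def grid_mask(height, width, unit, drop_len, offset_y=0, offset_x=0):
    """Binary mask with 1 = keep and 0 = drop.

    One drop_len x drop_len square is dropped in every unit x unit cell.
    """
    rows = (np.arange(height) - offset_y) % unit < drop_len
    cols = (np.arange(width) - offset_x) % unit < drop_len
    # A pixel is dropped only where a dropped row band meets a dropped column band.
    dropped = rows[:, None] & cols[None, :]
    return (~dropped).astype(np.uint8)

# Example: mask an 8x8 all-ones image with a period of 4 and 2x2 dropped squares.
image = np.ones((8, 8), dtype=np.uint8)
mask = grid_mask(8, 8, unit=4, drop_len=2)
augmented = image * mask
```

With these values, a quarter of each cell is dropped, so three quarters of the pixels are kept.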

      Code
