EADST

SPIE 2020 Papers

Dong Xie and Colleen P. Bailey "Novel receipt recognition with deep learning algorithms", Proc. SPIE 11400, Pattern Recognition and Tracking XXXI, 114000B (22 April 2020); https://doi.org/10.1117/12.2558206

Abstract

We propose a new recognition method that extracts useful information from receipts by integrating deep learning algorithms from computer vision and natural language processing. The method consists of three parts. The first part isolates the effective receipt area: after removing noise and extracting the gradient of the receipt image, we determine a threshold to crop and reshape the useful receipt area. The second part detects text in the receipt image: we modify and deploy the connectionist text proposal network (CTPN), a text detection algorithm, to locate the text regions in the receipt. In the third part, we adopt connectionist temporal classification (CTC) with maximum entropy regularization as the loss function for training a convolutional recurrent neural network (CRNN) that recognizes the detected text regions, converting the receipt from an image into text. With this method, the useful information in a receipt can be integrated and utilized. We train and test our system on the data set published by the scanned receipts optical character recognition and information extraction (SROIE) challenge. The results show that our recognition system identifies receipt information quickly and accurately.
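To illustrate the first part of the method (cropping the receipt area from the image gradient), here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function name `crop_receipt_area`, the relative threshold `rel_threshold`, and the bounding-box rule are all assumptions made for illustration, and the noise removal step described in the abstract is omitted.

```python
import numpy as np


def crop_receipt_area(gray, rel_threshold=0.2):
    """Crop a grayscale image to the bounding box of its strong gradients.

    Hypothetical helper: the abstract does not say how the threshold is
    chosen, so a fraction of the peak gradient magnitude is assumed here.
    """
    gray = np.asarray(gray, dtype=np.float64)

    # Finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)

    peak = magnitude.max()
    if peak == 0:
        # Completely flat image: nothing to crop.
        return gray

    # Keep pixels whose gradient magnitude reaches the assumed threshold.
    mask = magnitude >= rel_threshold * peak

    # Bounding box of all rows and columns that contain a strong gradient.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


if __name__ == "__main__":
    # Synthetic "receipt": a bright rectangle on a dark background.
    image = np.zeros((20, 30))
    image[5:15, 8:22] = 1.0
    print(crop_receipt_area(image).shape)
```

A real pipeline would likely smooth the image first and pass the cropped region to the text detection stage. The sketch only shows how a gradient threshold can delimit the useful area.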

Paper Download

Arthur C. Depoian, Lorenzo E. Jaques, Dong Xie, Colleen P. Bailey, and Parthasarathy Guturu "Computer vision learning techniques for sports video analytics: removing overlays", Proc. SPIE 11395, Big Data II: Learning, Analytics, and Applications, 113950M (24 April 2020); https://doi.org/10.1117/12.2560888

Abstract

Big data has been driving professional sports over the last decade. In our data-driven world, it is becoming important to find additional methods for analyzing both games and athletes. There is an abundance of video taken in professional and amateur sports, and player datasets can be created from it using computer vision techniques. We propose a novel approach: an autonomous masking algorithm that accepts live or previously recorded video footage of sporting events. The procedure identifies graphical overlays to optimize further processing by tracking and text recognition algorithms for real-time analysis.

Paper Download
