EADST

SPIE 2020 Papers

Dong Xie and Colleen P. Bailey "Novel receipt recognition with deep learning algorithms", Proc. SPIE 11400, Pattern Recognition and Tracking XXXI, 114000B (22 April 2020); https://doi.org/10.1117/12.2558206

Abstract

We propose a new recognition method that extracts useful information from receipts by integrating deep learning algorithms from computer vision and natural language processing. Our method consists of three parts. The first part isolates the effective area of the receipt: by removing noise and extracting the gradient of the receipt image, we determine a threshold to crop and reshape the useful receipt area. The second part detects text in the receipt image: we modify and deploy the connectionist text proposal network (CTPN) text detection algorithm to locate the text regions in the receipt. In the third part, we adopt connectionist temporal classification (CTC) with maximum entropy regularization as the loss function for training the convolutional recurrent neural network (CRNN) that recognizes the detected text regions, converting the receipt from an image into text. With this method, the useful information in a receipt can be integrated and put to use. We train and test our system on the Scanned Receipts Optical Character Recognition and Information Extraction (SROIE) data set. The results show that our recognition system identifies receipt information quickly and accurately.
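
The third part of the abstract relies on CTC, under which the recognition network emits one label per time step, including a special blank symbol. A common way to turn those per-step outputs into a text string at inference time is greedy decoding: pick the best label at each step, collapse consecutive repeats, then drop blanks. The sketch below illustrates that standard decoding rule only; it is not the authors' implementation, and it does not cover the maximum entropy regularized loss used for training. The label set `LABELS`, the blank index, the helper `one_hot`, and the toy scores are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical label set for illustration; index 0 is assumed to be the CTC blank.
LABELS: List[str] = ["<blank>", "T", "O", "A", "L"]
BLANK = 0


def ctc_greedy_decode(scores: Sequence[Sequence[float]],
                      labels: Sequence[str],
                      blank: int = BLANK) -> str:
    """Greedy CTC decoding: best label per step, collapse repeats, drop blanks."""
    # Index of the highest score at each time step (first index wins on ties).
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]

    chars = []
    prev = None
    for label in best_path:
        # Emit a character only when the label changes and is not the blank.
        if label != prev and label != blank:
            chars.append(labels[label])
        prev = label
    return "".join(chars)


def one_hot(index: int, size: int) -> List[float]:
    """Toy per-step score vector with all mass on one label (illustration only)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy per-step outputs whose best path is: T T _ O T A A _ L
path = [1, 1, 0, 2, 1, 3, 3, 0, 4]
scores = [one_hot(i, len(LABELS)) for i in path]
decoded = ctc_greedy_decode(scores, LABELS)
print(decoded)
```

Note that the blank is what allows genuinely doubled characters to survive: a best path of `L _ L` decodes to two characters, while `L L` collapses to one.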


Arthur C. Depoian, Lorenzo E. Jaques, Dong Xie, Colleen P. Bailey, and Parthasarathy Guturu "Computer vision learning techniques for sports video analytics: removing overlays", Proc. SPIE 11395, Big Data II: Learning, Analytics, and Applications, 113950M (24 April 2020); https://doi.org/10.1117/12.2560888

Abstract

Big data has been driving professional sports over the last decade. In our data-driven world, it becomes important to find additional methods for the analysis of both games and athletes. There is an abundance of videos taken in professional and amateur sports. Player datasets can be created utilizing computer vision techniques. We propose a novel approach by creating an autonomous masking algorithm that can receive live or previously recorded video footage of sporting events. This procedure can identify graphical overlays to optimize further processing by tracking and text recognition algorithms for real-time analysis.

