EADST

SPIE 2020 Papers

Dong Xie and Colleen P. Bailey "Novel receipt recognition with deep learning algorithms", Proc. SPIE 11400, Pattern Recognition and Tracking XXXI, 114000B (22 April 2020); https://doi.org/10.1117/12.2558206

Abstract

We propose a new recognition method that extracts useful information from receipts by integrating deep learning algorithms from computer vision and natural language processing. Our method consists of three parts. The first part isolates the effective receipt area: after removing noise and extracting the gradient of the receipt image, we determine a threshold to crop and reshape the useful receipt area. The second part detects text in the receipt image: we modify and deploy the connectionist text proposal network (CTPN), a text detection algorithm, to locate the text regions in the receipt. In the third part, we adopt connectionist temporal classification (CTC) with maximum entropy regularization as the loss function for training the convolutional recurrent neural network (CRNN) that recognizes the detected text regions, converting the receipt from an image into text. With this method, the useful information in a receipt can be integrated and utilized. We train and test our system on the data set published for Scanned Receipts Optical Character Recognition and Information Extraction (SROIE). The results show that our recognition system identifies receipt information quickly and accurately.
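The first part of the pipeline (gradient extraction followed by a threshold-based crop) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the forward-difference gradient, and the `ratio` parameter are assumptions made for illustration, and the noise-removal step mentioned in the abstract is omitted. The image is a plain list of rows of grayscale values.

```python
def gradient_magnitude(img):
    """Per-pixel gradient strength from forward differences (illustrative choice)."""
    h, w = len(img), len(img[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp at the border so the last row/column compares with itself.
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            grad[y][x] = max(abs(gx), abs(gy))
    return grad


def crop_box(img, ratio=0.5):
    """Bounding box (top, bottom, left, right) of strong-gradient pixels.

    `ratio` is a hypothetical parameter: a pixel counts as an edge when its
    gradient is at least `ratio` times the strongest gradient in the image.
    Bottom and right are exclusive. Returns None for a flat image.
    """
    grad = gradient_magnitude(img)
    peak = max(max(row) for row in grad)
    if peak == 0:
        return None
    threshold = ratio * peak
    rows = [y for y, row in enumerate(grad) if max(row) >= threshold]
    cols = [x for x in range(len(grad[0]))
            if max(row[x] for row in grad) >= threshold]
    return min(rows), max(rows) + 1, min(cols), max(cols) + 1


def crop(img, box):
    """Cut the boxed region out of the image."""
    top, bottom, left, right = box
    return [row[left:right] for row in img[top:bottom]]


# Synthetic 8x10 "photo": dark background (0) with a bright receipt (200)
# occupying rows 2..5 and columns 3..6.
image = [[200 if 2 <= y <= 5 and 3 <= x <= 6 else 0 for x in range(10)]
         for y in range(8)]
box = crop_box(image)
receipt = crop(image, box)
```

Because forward differences fire on the pixel just before a rising edge, the box found here keeps a one-pixel margin above and to the left of the receipt. A real image would need denoising before this step, since isolated noisy pixels would otherwise stretch the box.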


Arthur C. Depoian, Lorenzo E Jaques, Dong Xie, Colleen P. Bailey, and Parthasarathy Guturu "Computer vision learning techniques for sports video analytics: removing overlays", Proc. SPIE 11395, Big Data II: Learning, Analytics, and Applications, 113950M (24 April 2020); https://doi.org/10.1117/12.2560888

Abstract

Big data has been driving professional sports over the last decade. In our data-driven world, it becomes important to find additional methods for the analysis of both games and athletes. There is an abundance of videos taken in professional and amateur sports. Player datasets can be created utilizing computer vision techniques. We propose a novel approach by creating an autonomous masking algorithm that can receive live or previously recorded video footage of sporting events. This procedure can identify graphical overlays to optimize further processing by tracking and text recognition algorithms for real-time analysis.

