Multi-frame knowledge based text enhancement for mobile phone captured videos

Date

2014-02-05

Author

Ozarslan, Suleyman
Eren, Pekin Erhan

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method that combines Optical Character Recognition (OCR) with our proposed Row Based Multiple Frame Integration (RB-MFI) and Knowledge Based Correction (KBC) algorithms. In this method, the trained OCR engine is first used for recognition; then the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs that OCR extracts from multiple frames of the video. After RB-MFI, the KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach, which includes the RB-MFI and KBC algorithms, increases the word recognition rate to 95% and the character recognition rate to 98%.
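The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the "most accurate" row is determined or what knowledge the correction step uses. The sketch assumes that each OCR row carries a confidence score, that rows are already aligned by index across frames, and that correction means fuzzy matching against a small lexicon of expected receipt words. The names `integrate_rows`, `correct_row`, and the `cutoff` parameter are hypothetical.

```python
from difflib import get_close_matches


def integrate_rows(frames):
    """Row-based multi-frame integration (illustrative assumption).

    frames: list of frames; each frame is a list of (text, confidence)
    tuples, one per receipt row, aligned by row index across frames.
    Returns, for every row index, the text with the highest confidence.
    """
    if not frames:
        return []
    n_rows = max(len(frame) for frame in frames)
    best_rows = []
    for i in range(n_rows):
        # Collect this row from every frame that contains it.
        candidates = [frame[i] for frame in frames if i < len(frame)]
        text, _confidence = max(candidates, key=lambda row: row[1])
        best_rows.append(text)
    return best_rows


def correct_row(row, lexicon, cutoff=0.7):
    """Knowledge-based correction (illustrative assumption).

    Replaces each word with its closest lexicon entry when the
    similarity ratio reaches `cutoff`; other words are left unchanged.
    """
    corrected = []
    for word in row.split():
        match = get_close_matches(word, lexicon, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


if __name__ == "__main__":
    # Made-up OCR output for three video frames of the same receipt.
    frames = [
        [("T0TAL 12.50", 0.62), ("CASH 20.00", 0.91)],
        [("TOTAL 12.50", 0.88), ("CA5H 20.00", 0.55)],
        [("TOTAL 12.5O", 0.70)],
    ]
    lexicon = ["TOTAL", "CASH"]
    for row in integrate_rows(frames):
        print(correct_row(row, lexicon))
```

Selecting whole rows rather than whole frames lets a blurred region in one frame be covered by a sharper capture of the same row in another frame, and the lexicon step then handles residual character confusions such as `0` for `O`.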

Subject Keywords

Multiple frame integration, OCR, Knowledge based correction

URI

https://hdl.handle.net/11511/31790

DOI

https://doi.org/10.1117/12.2040606

Collections

Graduate School of Informatics, Conference / Seminar

Suggestions

Text recognition and correction for automated data collection by mobile devices Ozarslan, Suleyman; Eren, Pekin Erhan (2014-02-06) Participatory sensing is an approach which allows mobile devices such as mobile phones to be used for data collection, analysis and sharing processes by individuals. Data collection is the first and most important part of a participatory sensing system, but it is time consuming for the participants. In this paper, we discuss automatic data collection approaches for reducing the time required for collection, and increasing the amount of collected data. In this context, we explore automated text recognition o...
Highly personalized information delivery to mobile clients Ozen, B; Kilic, O; Altinel, M; Doğaç, Asuman (Springer Science and Business Media LLC, 2004-11-01) The inherent limitations of mobile devices necessitate information to be delivered to mobile clients to be highly personalized according to their profiles. This information may be coming from a variety of resources like Web servers, company intranets, email servers. A critical issue for such systems is scalability, that is, the performance of the system should be in acceptable limits when the number of users increases dramatically. Another important issue is being able to express highly personalized informa...
Comparison of approaches for mobile document image analysis using server supported smartphones Ozarslan, Suleyman; Eren, Pekin Erhan (2014-02-05) With the recent advances in mobile technologies, new capabilities are emerging, such as mobile document image analysis. However, mobile phones are still less powerful than servers, and they have some resource limitations. One approach to overcome these limitations is performing resource-intensive processes of the application on remote servers. In mobile document image analysis, the most resource consuming process is the Optical Character Recognition (OCR) process, which is used to extract text in mobile pho...
Methods for location prediction of mobile phone users Keleş, İlkcan; Toroslu, İsmail Hakkı; Department of Computer Engineering (2014) Due to the increasing use of mobile phones and their increasing capabilities, a huge amount of usage and location data can be collected. Location prediction is an important task for mobile phone operators and smart city administrations to provide better services and recommendations. In this work, we have investigated several approaches to the location prediction problem, including clustering, classification and sequential pattern mining. We propose a sequence mining based approach for location prediction of mobil...
Column level two-step multi-slope analog to digital converter for CMOS image sensors Tunca, Can; Koçer, Fatih; Department of Electrical and Electronics Engineering (2017) In the past few years, CMOS image sensors have seen enormous technological growth, and their market has broadened with the integration of cameras into cell phones. The advancement trend continues as pixel sizes get smaller and array formats get larger. With pixels decreasing in size and growing in numbers, faster row read-out speed requirements have emerged to keep frame rates constant. Column parallel ADC architectures meet these demands as they utilize large numbers of parallel convers...

Citation Formats

S. Ozarslan and P. E. Eren, “Multi-frame knowledge based text enhancement for mobile phone captured videos,” vol. 9030, 2014. [Online]. Available: https://hdl.handle.net/11511/31790.