A character recognizer for Turkish language

Date

2003-01-01

Author

Korkmaz, SU
Akinci, GKY
Atalay, Mehmet Volkan

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This paper presents, in particular, a contextual post-processing subsystem for a Turkish machine-printed character recognition system. The subsystem is based on positional binary 3-gram statistics for the Turkish language, an error-corrector parser, and a lexicon that contains root words and their inflected forms. The error-corrector parser uses Turkish morphology to correct the character recognition (CR) alternatives.
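The abstract does not describe how the positional binary 3-gram statistics are built or applied, so the following is only a minimal sketch of one common reading of the term: a binary table that records whether a given three-letter sequence has been seen at a given position in any lexicon word, used to discard recognizer alternatives that contain an unattested combination. The function names, the toy lexicon, and the per-slot list of alternatives are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import product


def build_positional_trigrams(lexicon):
    """Collect every (position, trigram) pair that occurs in the lexicon.

    Membership in the returned set is the "binary" statistic:
    a pair is either attested or it is not; no counts are kept.
    """
    seen = set()
    for word in lexicon:
        for i in range(len(word) - 2):
            seen.add((i, word[i:i + 3]))
    return seen


def is_plausible(word, seen):
    """Accept a word only if each of its trigrams is attested at its position.

    Words shorter than three characters contain no trigram and pass trivially.
    """
    return all((i, word[i:i + 3]) in seen for i in range(len(word) - 2))


def filter_candidates(alternatives, seen):
    """Expand per-slot character alternatives and keep the plausible words.

    `alternatives` is a hypothetical recognizer output: one list of
    candidate characters for each character slot in the word image.
    """
    candidates = ("".join(chars) for chars in product(*alternatives))
    return [word for word in candidates if is_plausible(word, seen)]


# Toy lexicon (assumed); a real one would hold roots and inflected forms.
LEXICON = ["kitap", "kitaplar", "kalem", "kalemler"]
TABLE = build_positional_trigrams(LEXICON)

# The recognizer is unsure about the first and third characters.
ALTERNATIVES = [["k", "h"], ["i"], ["t", "l"], ["a"], ["p"]]
RESULT = filter_candidates(ALTERNATIVES, TABLE)
```

With this toy data, `RESULT` is `["kitap"]`: the other three combinations each begin with a trigram that never occurs at position 0 in the lexicon. In a fuller system the surviving candidates would presumably be passed on to the parser and lexicon stages, which this sketch does not attempt to model.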

Subject Keywords

Character recognition, Chromium, Statistics, Natural languages, Error correction, Image segmentation, Image converters, Error analysis, Morphology, Text recognition

URI

https://hdl.handle.net/11511/40250

DOI

https://doi.org/10.1109/icdar.2003.1227855

Conference Name

7th International Conference on Document Analysis and Recognition (ICDAR 2003)

Collections

Department of Computer Engineering, Conference / Seminar

Suggestions

A heuristic algorithm for optical character recognition of Arabic script Yarman Vural, Fatoş T.; Atici, A. Alper (1996-03-20) In this paper, a heuristic method is developed for segmentation, feature extraction and recognition of the Arabic script. The study is part of a large project for the transcription of the documents in Ottoman Archives. A geometrical and topological feature analysis method is developed for segmentation and feature extraction stages. Chain code transformation is applied to main strokes of the characters which are then classified by the hidden Markov model (HMM) in the recognition stage. Experimental results i...
A heuristic algorithm for optical character recognition of Arabic script Atici, A. Alper; Yarman Vural, Fatoş T. (1997-10-01) In this paper, a heuristic method is developed for segmentation, feature extraction and recognition of the Arabic script. The study is part of a large project for transcription of the documents in Ottoman Archives. A geometrical and topological feature analysis method is developed for segmentation and feature extraction stages. Chain code transformation is applied to main strokes of the characters, which are classified by the hidden Markov model (HMM) in the recognition stage. Experimental results indicate ...
An Efficient Part-of-Speech Tagger for Arabic Kopru, Selcuk (2011-02-26) In this paper, we present an efficient part-of-speech (POS) tagger for Arabic which is based on a Hidden Markov Model. We explore different enhancements to improve the baseline system. Despite the morphological complexity of Arabic, our approach is data-driven and does not utilize any morphological analyzer or lexicon, unlike many other Arabic POS taggers. This makes our approach simple, very efficient and valuable to be used in real-life applications and the obtained accuracy results are still compa...
An automatic geo-spatial object recognition algorithm for high resolution satellite images Ergul, Mustafa; Alatan, Abdullah Aydın (2013-09-26) This paper proposes a novel automatic geo-spatial object recognition algorithm for high resolution satellite imaging. The proposed algorithm consists of two main steps: a hypothesis generation step with a local feature-based algorithm and a verification step with a shape-based approach. In the hypothesis generation step, a set of hypotheses for possible object locations is generated using a Bag of Visual Words type approach, aiming for fewer missed detections at the cost of more false positives. In the verification s...
A Shadow based trainable method for building detection in satellite images Dikmen, Mehmet; Halıcı, Uğur; Department of Geodetic and Geographical Information Technologies (2014) The purpose of this thesis is to develop a supervised building detection and extraction algorithm with a shadow based learning method for high-resolution satellite images. First, shadow segments are identified on an over-segmented image, and then neighboring shadow segments are merged by assuming that they are cast by a single building. Next, these shadow regions are used to detect the candidate regions where buildings most likely occur. Together with this information, distance to shadows towards illuminati...

Citation Formats

S. Korkmaz, G. Akinci, and M. V. Atalay, “A character recognizer for Turkish language,” presented at the 7th International Conference on Document Analysis and Recognition (ICDAR 2003), Edinburgh, Scotland, 2003, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/40250.