TasvirEt: A Benchmark Dataset for Automatic Turkish Description Generation from Images

2016-05-19
Unal, Mesut Erhan
Citamak, Begum
Yagcioglu, Semih
Erdem, Aykut
Erdem, Erkut
İkizler Cinbiş, Nazlı
Çakıcı, Ruket
Automatically describing images with natural sentences is considered a challenging research problem that has only recently begun to be explored. Although the number of methods proposed to solve this problem has been increasing, the datasets commonly used in this field contain only English descriptions, so studies have mostly been limited to a single language, namely English. In this study, for the first time in the literature, a new dataset is proposed that enables generating Turkish descriptions from images and can serve as a benchmark for this purpose. Furthermore, two approaches are proposed, again for the first time in the literature, for image captioning in Turkish using the dataset, which we named TasvirEt. Our findings indicate that the new Turkish dataset and the approaches used here can be employed successfully for automatically describing images in Turkish.
24th Signal Processing and Communication Application Conference (SIU)
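The abstract positions TasvirEt as a benchmark for Turkish caption generation. As a rough illustration only, the sketch below scores a generated Turkish caption against hypothetical reference captions; the data layout, the example sentences, and the choice of BLEU as the metric are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: scoring a generated Turkish caption against reference captions.
# The data layout and the use of sentence-level BLEU are illustrative assumptions,
# not the evaluation protocol of the TasvirEt paper.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical benchmark entry: reference Turkish captions for one image.
references = [
    "bir köpek sahilde koşuyor".split(),
    "kumsalda koşan bir köpek".split(),
]

# Caption produced by some captioning model (placeholder).
hypothesis = "sahilde koşan bir köpek".split()

# Smoothed sentence-level BLEU; corpus-level BLEU is more common in practice.
score = sentence_bleu(references, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```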

Suggestions

Shape : representation, description, similarity and recognition
Arıca, Nafiz; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2003)
In this thesis, we study the shape analysis problem and propose new methods for shape description, similarity and recognition. Firstly, we introduce a new shape descriptor in a two-step method. In the first step, the 2-D shape information is mapped into a set of 1-D functions. The mapping is based on the beams, which originate from a boundary point and connect that point with the rest of the points on the boundary. At each point, the angle between a pair of beams is taken as a random variable to define...
Comparison of histograms of oriented optical flow based action recognition methods
Erciş, Fırat; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2012)
In the task of human action recognition in uncontrolled video, motion features are used widely in order to achieve subject and appearance invariance. We implemented three Histograms of Oriented Optical Flow based methods, which share a common motion feature extraction phase. We compute an optical flow field over each frame of the video. Then those flow vectors are histogrammed according to their angle values to represent each frame with a histogram. In order to capture local motions, the bounding box of the subject is divided...
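The suggestion above outlines a histogram-of-oriented-optical-flow pipeline: compute a flow field per frame, then histogram the flow vectors by angle. Below is a minimal sketch of that histogramming step, assuming precomputed flow components; the bin count and the magnitude weighting are illustrative choices, not the exact setup of the cited thesis.

```python
import numpy as np

def hoof_histogram(flow_x, flow_y, n_bins=8):
    """Histogram optical-flow vectors by angle, weighted by magnitude.

    flow_x, flow_y: 2-D arrays of per-pixel flow components for one frame
    (e.g. from any dense optical-flow estimator). Bin count and magnitude
    weighting are illustrative assumptions.
    """
    angles = np.arctan2(flow_y, flow_x)            # angles in [-pi, pi]
    magnitudes = np.hypot(flow_x, flow_y)
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi),
                           weights=magnitudes)
    total = hist.sum()
    return hist / total if total > 0 else hist     # normalise per frame

# Example with random flow standing in for a real flow field.
rng = np.random.default_rng(0)
fx, fy = rng.normal(size=(2, 120, 160))
print(hoof_histogram(fx, fy))
```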
DATA-DRIVEN IMAGE CAPTIONING WITH META-CLASS BASED RETRIEVAL
Kilickaya, Mert; Erdem, Erkut; Erdem, Aykut; İkizler Cinbiş, Nazlı; Çakıcı, Ruket (2014-04-25)
Automatic image captioning, the process of producing a description for an image, is a very challenging problem which has only recently received interest from the computer vision and natural language processing communities. In this study, we present a novel data-driven image captioning strategy, which, for a given image, finds the most visually similar image in a large dataset of image-caption pairs and transfers its caption as the description of the input image. Our novelty lies in employing a recently pr...
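This suggestion describes a retrieval-based strategy: find the training image most visually similar to the query and transfer its caption. The sketch below shows that transfer step using cosine similarity over precomputed feature vectors; the feature representation, similarity measure, and toy data are generic placeholders, not the meta-class based retrieval of the cited paper.

```python
import numpy as np

def transfer_caption(query_feat, train_feats, train_captions):
    """Return the caption of the visually most similar training image.

    query_feat: 1-D feature vector for the query image.
    train_feats: 2-D array, one row of features per training image.
    train_captions: list of captions aligned with the rows of train_feats.
    Cosine similarity and plain nearest neighbour are illustrative choices.
    """
    q = query_feat / (np.linalg.norm(query_feat) + 1e-8)
    t = train_feats / (np.linalg.norm(train_feats, axis=1, keepdims=True) + 1e-8)
    best = int(np.argmax(t @ q))                   # index of most similar image
    return train_captions[best]

# Toy example with random features standing in for real image descriptors.
rng = np.random.default_rng(1)
feats = rng.normal(size=(3, 64))
caps = ["a dog runs on the beach", "a man rides a bike", "two cats on a sofa"]
print(transfer_caption(feats[0] + 0.01 * rng.normal(size=64), feats, caps))
```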
Object recognition and segmentation via shape models
Altınoklu, Metin Burak; Ulusoy, İlkay; Tarı, Zehra Sibel; Department of Electrical and Electronics Engineering (2016)
In this thesis, the problem of object detection, recognition and segmentation in computer vision is addressed with shape based methods. An efficient object detection method based on a sparse skeleton has been proposed. The proposed method is an improved chamfer template matching method for recognition of articulated objects. Using a probabilistic graphical model structure, shape variation is represented in a skeletal shape model, where nodes correspond to parts consisting of lines and edges correspond to pa...
Data-driven image captioning via salient region discovery
Kilickaya, Mert; Akkuş, Burak Kerim; Çakıcı, Ruket; Erdem, Aykut; Erdem, Erkut; İkizler Cinbiş, Nazlı (Institution of Engineering and Technology (IET), 2017-09-01)
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have been proven to be highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, the authors propose to integrate an object-based semantic image r...
Citation Formats
M. E. Unal et al., “TasvirEt: A Benchmark Dataset for Automatic Turkish Description Generation from Images,” presented at the 24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Turkey, 2016, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55213.