Connectionist multi-sequence modelling and applications to multilingual neural machine translation

Fırat, Orhan
Deep (recurrent) neural networks have been shown to successfully learn complex mappings between arbitrary-length input and output sequences, a task known as sequence-to-sequence learning, within the effective framework of encoder-decoder networks. This thesis investigates extensions of sequence-to-sequence models that handle multiple sequences at the same time within a single parametric model, and proposes the first large-scale connectionist multi-sequence modeling approach. The proposed multi-sequence modeling architecture learns to map a set of input sequences into a set of output sequences through the explicit parametrization of a shared medium, an interlingua. The proposed multi-sequence modeling architectures are applied to machine translation tasks, tackling the problem of multilingual neural machine translation (MLNMT). We explore the applicability and benefits of MLNMT on (1) large-scale machine translation tasks, between ten pairs of languages within the same model; (2) low-resource language transfer problems, where the data between any given pair is scarce, measuring the transfer learning capabilities; (3) multi-source translation tasks, where multi-way parallel data is available, leveraging complementary information between input sequences while mapping them into a single output sequence; and finally (4) zero-resource translation tasks, where no aligned data is available between a pair of source-target sequences.


Optical variabilities in the Be/X-ray binary system - GRO J2058+42 (CXOU J205847.5+414637)
Kiziloglu, Ue.; Kiziloglu, N.; Baykal, Altan; Yerli, Sinan Kaan; Ozbey, M. (EDP Sciences, 2007-08-01)
Aims. We present an analysis of long-term optical monitoring observations and optical spectroscopic observations of the counterpart to CXOU J205847.5+414637 (a high-mass X-ray binary system). We search for variability in the light curve of the Be star.
Assessing the validity of a statistical distribution: some illustrative examples from dermatological research
Sürücü, Barış (Wiley, 2008-05-01)
Background. Assuming a statistical distribution is one of the key points before conducting a statistical analysis. Goodness-of-fit tests are used to assess the validity of an assumed statistical distribution. In dermatological research, the goodness-of-fit tests used are less powerful.
Optical and X-ray outbursts of Be/X-ray binary system SAX J2103.5+4545
Kiziloglu, Ue.; Ozbilgen, S.; Kiziloglu, N.; Baykal, Altan (EDP Sciences, 2009-12-01)
Aims. The main goal of this study is to investigate the relationship between the optical and X-ray behaviours of the Be/X-ray binary system SAX J2103.5+4545.
Neural network prediction of tsunami parameters in the Aegean and Marmara Seas
Erdurmaz, Muammer Sercan; Ergin, Ayşin; Department of Civil Engineering (2004)
Tsunamis are characterized as shallow water waves with long periods and wavelengths. They are caused by a sudden displacement of a volume of water. Earthquakes are one of the main causes of tsunamis. Historical data for an observation period of 3500 years starting from 1500 B.C. indicate that approximately 100 tsunamis occurred in the seas neighboring Turkey. Historical earthquake and tsunami data were collected and used to develop two artificial neural network models to forecast tsunami characteristics fo...
Reanalysis of high-resolution XMM-Newton data of V2491 Cygni using models of collisionally ionized hot absorbers
Balman, Şölen; Gamsizkan, C. (2017-02-01)
Aims. We model spectral absorption features in data of the high-resolution XMM-Newton Reflection Grating Spectrometer.
Citation Formats
O. Fırat, “Connectionist multi-sequence modelling and applications to multilingual neural machine translation,” Ph.D. - Doctoral Program, Middle East Technical University, 2017.