On integrating a language model into neural machine translation

2017-09-01
Gulcehre, Caglar
Firat, Orhan
Xu, Kelvin
Cho, Kyunghyun
Bengio, Yoshua
Recent advances in end-to-end neural machine translation models have achieved promising results on high-resource language pairs such as En -> Fr and En -> De. One of the major factors behind these successes is the availability of high-quality parallel corpora. We explore two strategies for leveraging abundant monolingual data in neural machine translation: combining the scores of a neural language model, trained only on target-side monolingual data, with those of the neural machine translation model, and fusing the hidden states of the two models. We observe improvements with both. On a low-resource language pair, Turkish -> English, we obtain improvements of up to 2 BLEU over hierarchical and phrase-based baselines. Our method was initially motivated by tasks with less parallel data, but we also show that it extends to high-resource translation tasks such as Cs -> En and De -> En, where we obtain improvements of 0.39 and 0.47 BLEU, respectively, over the neural machine translation baselines.
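The first strategy in the abstract, combining language model scores with translation model scores, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the log-linear form, the `lm_weight` parameter and its default value, and the function names `combine_scores` and `pick_next_token` are hypothetical and are not taken from the paper.

```python
import math


def combine_scores(nmt_log_probs, lm_log_probs, lm_weight=0.2):
    """Combine per-token scores from a translation model and a language model.

    Assumed form (not confirmed by the abstract):
        score(token) = log p_NMT(token) + lm_weight * log p_LM(token)
    Both inputs are dicts mapping candidate tokens to log-probabilities.
    """
    if nmt_log_probs.keys() != lm_log_probs.keys():
        raise ValueError("both models must score the same candidate tokens")
    return {
        token: nmt_log_probs[token] + lm_weight * lm_log_probs[token]
        for token in nmt_log_probs
    }


def pick_next_token(nmt_log_probs, lm_log_probs, lm_weight=0.2):
    """Return the candidate token with the highest combined score."""
    combined = combine_scores(nmt_log_probs, lm_log_probs, lm_weight)
    return max(combined, key=combined.get)


# Toy next-token distributions (made-up numbers, for illustration only).
nmt = {"the": math.log(0.45), "a": math.log(0.55)}
lm = {"the": math.log(0.9), "a": math.log(0.1)}

choice_without_lm = pick_next_token(nmt, lm, lm_weight=0.0)
choice_with_lm = pick_next_token(nmt, lm, lm_weight=0.5)
```

In this toy setting the translation model alone slightly prefers one candidate, and a sufficiently large language model weight flips the choice to the candidate the language model favors. In a real decoder the same combination would be applied to every hypothesis at each beam search step, and the weight would be tuned on held-out data.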
COMPUTER SPEECH AND LANGUAGE

Citation Formats
C. Gulcehre, O. Firat, K. Xu, K. Cho, and Y. Bengio, “On integrating a language model into neural machine translation,” COMPUTER SPEECH AND LANGUAGE, pp. 137–148, 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/68185.