Dynamic programming approach to voice transformation
Date
2006-10-01
Author
Salor, Ozgul
Demirekler, Mübeccel
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper presents a voice transformation algorithm that modifies the speech of a source speaker so that it is perceived as if spoken by a target speaker. A novel method based on dynamic programming is proposed. The system builds speaker-specific codebooks of line spectral frequencies (LSFs) for both the source and target speakers. These codebooks are used to train a mapping histogram matrix, which in turn is used to transform LSFs from one speaker to the other. The baseline system uses the maxima of the histogram matrix for LSF transformation. It has two shortcomings: the target LSF space is limited, and mapping successive frames independently causes spectral discontinuities. Both are overcome by the dynamic programming approach, which models the long-term behaviour of the target speaker's LSFs while preserving the relationship between successive frames of the source LSFs during transformation. Objective and subjective evaluations show that the dynamic programming approach improves the system's performance in terms of both speech quality and speaker similarity.
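The difference between the baseline mapping and the dynamic programming mapping can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function, so the Viterbi-style search below, the `transitions` matrix (counts of consecutive target codeword pairs, standing in for the target speaker's long-term behaviour), the `eps` smoothing constant, and all function names are assumptions made for illustration. Codebooks are assumed to be already trained and source and target frames already time-aligned.

```python
import numpy as np


def quantize(frames, codebook):
    """Return the index of the nearest codeword (Euclidean) for each frame."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def mapping_histogram(src_idx, tgt_idx, n_src, n_tgt):
    """Count how often source codeword i is aligned with target codeword j."""
    hist = np.zeros((n_src, n_tgt))
    np.add.at(hist, (src_idx, tgt_idx), 1.0)
    return hist


def baseline_map(src_idx, hist):
    """Baseline: map every frame independently to the histogram maximum."""
    return hist.argmax(axis=1)[src_idx]


def dp_map(src_idx, hist, transitions, eps=1e-6):
    """Viterbi-style search for the best target codeword sequence.

    ASSUMPTION: the score is the log of the row-normalised histogram
    (source-to-target evidence) plus the log of row-normalised target
    transition counts (frame-to-frame continuity). The paper's actual
    cost may differ.
    """
    emit = np.log((hist + eps) / (hist + eps).sum(axis=1, keepdims=True))
    trans = np.log(
        (transitions + eps) / (transitions + eps).sum(axis=1, keepdims=True)
    )
    n_frames, n_tgt = len(src_idx), hist.shape[1]

    score = emit[src_idx[0]].copy()
    back = np.zeros((n_frames, n_tgt), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + trans  # cand[prev, cur]
        back[t] = cand.argmax(axis=0)  # best predecessor for each cur
        score = cand.max(axis=0) + emit[src_idx[t]]

    # Trace the best path backwards from the best final codeword.
    path = np.empty(n_frames, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

In this sketch `baseline_map` can switch target codewords on every frame, which is the source of the spectral discontinuities mentioned above, whereas `dp_map` trades histogram evidence against continuity across the whole utterance. The returned indices would then be used to look up target LSF vectors in the target codebook.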
Subject Keywords
Linguistics and Language, Modelling and Simulation, Software, Communication, Computer Vision and Pattern Recognition, Language and Linguistics, Computer Science Applications
URI
https://hdl.handle.net/11511/52297
Journal
SPEECH COMMUNICATION
DOI
https://doi.org/10.1016/j.specom.2006.06.003
Collections
Graduate School of Natural and Applied Sciences, Article
Suggestions
The use of articulator motion information in automatic speech segmentation
Akdemir, Eren; Çiloğlu, Tolga (Elsevier BV, 2008-07-01)
The use of articulator motion information in automatic speech segmentation is investigated. Automatic speech segmentation is an essential task in speech processing applications like speech synthesis where accuracy and consistency of segmentation are firmly connected to the quality of synthetic speech. The motions of upper and lower lips are incorporated into a hidden Markov model based segmentation process. The MOCHA-TIMIT database, which involves simultaneous articulatograph and microphone recordings, was ...
The discourse connector list: a multi-genre cross-cultural corpus analysis
Kalajahi, Seyed Ali Rezvani; Abdullah, Ain Nadzimah; Neufeld, Steve (Walter de Gruyter GmbH, 2017-05-01)
This study examines the linguistic feature known as discourse connector using a corpus-informed approach. The study applies a taxonomy which classifies and describes 632 discourse connectors in eight broad classes with 17 categories. The frequency of use of each discourse connector listed was analyzed in the three different registers of spoken, non-academic and academic English in the two different cultural contexts of British and American English. The resulting data on discourse connector frequency were co...
Verb concepts from affordances
Kalkan, Sinan; Yuerueten, Onur; Borghi, Anna M.; Şahin, Erol (John Benjamins Publishing Company, 2014-01-01)
In this paper, we investigate how the interactions of a robot with its environment can be used to create concepts that are typically represented by verbs in language. Towards this end, we utilize the notion of affordances to argue that verbs typically refer to the generation of a specific type of effect rather than a specific type of action. Then, we show how a robot can form these concepts through interactions with the environment and how humans can use these concepts to ease their communication with the r...
The combinatory morphemic lexicon
Bozsahin, C (MIT Press - Journals, 2002-06-01)
Grammars that expect words from the lexicon may be at odds with the transparent projection of syntactic and semantic scope relations of smaller units. We propose a morphosyntactic framework based on Combinatory Categorial Grammar that provides flexible constituency, flexible category consistency, and lexical projection of morphosyntactic properties and attachment to grammar in order to establish a morphemic grammar-lexicon. These mechanisms provide enough expressive power in the lexicon to formulate semanti...
Language learning from the perspective of nonlinear dynamic systems
Hohenberger, Annette Edeltraud; Peltzer-Karpf, Annemarie (Walter de Gruyter GmbH, 2009-01-01)
This article outlines a nonlinear dynamic systems approach to language learning on the basis of developmental cognitive neuroscience. Language learning, on this view, is a process of experience-dependent shaping and selection of broadly defined domain-general and domain-specific genetic predispositions. The central concept of development is (neuro)cognitive growth in terms of self-organization. Linguistic structure-building is synergetic and emergent insofar as the acquisition of a critical mass of eleme...
Citation Formats
O. Salor and M. Demirekler, “Dynamic programming approach to voice transformation,” SPEECH COMMUNICATION, pp. 1262–1272, 2006, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/52297.