Improvements on automatic speech segmentation at the phonetic level

Gómez Adrian, Jon Ander; Calvo Lance, Marcos

doi:10.1007/978-3-642-25085-9_66

Files in this item

Name: iass.pdf

Size: 259.5Kb

Format: PDF

Description: Author's version.

Name: Gómez;Calvo - ...

Size: 130.6Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Gómez Adrian, Jon Ander	es_ES
dc.contributor.author	Calvo Lance, Marcos	es_ES
dc.date.accessioned	2014-05-16T09:15:15Z
dc.date.issued	2011
dc.identifier.isbn	978-3-642-25084-2
dc.identifier.issn	0302-9743
dc.identifier.uri	http://hdl.handle.net/10251/37516
dc.description.abstract	In this paper, we present some recent improvements in our automatic speech segmentation system, which needs only the speech signal and the phonetic sequence of each sentence of a corpus to be trained. It estimates a GMM from all the sentences of the training subcorpus, where each Gaussian distribution represents an acoustic class. The probability densities of these classes are combined with a set of conditional probabilities to estimate the probability densities of the states of each phonetic unit. The initial values of the conditional probabilities are obtained from a segmentation of each sentence that assigns the same number of frames to each phonetic unit. A DTW algorithm fixes the phonetic boundaries using the known phonetic sequence. This DTW is one step of an iterative process that segments the corpus and re-estimates the conditional probabilities. The results presented here demonstrate that the system has a good capacity to learn how to identify the phonetic boundaries. © 2011 Springer-Verlag.	es_ES
dc.description.sponsorship	This work was supported by the Spanish MICINN under contract TIN2008-06856-C05-02
dc.format.extent	8	es_ES
dc.language	English	es_ES
dc.publisher	Springer Verlag (Germany)	es_ES
dc.relation.ispartof	Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications	es_ES
dc.relation.ispartofseries	Lecture Notes in Computer Science;7042
dc.rights	All rights reserved	es_ES
dc.subject	Automatic speech segmentation	es_ES
dc.subject	Phoneme boundaries detection	es_ES
dc.subject	Phoneme alignment	es_ES
dc.subject	Conditional probabilities	es_ES
dc.subject	Initial values	es_ES
dc.subject	Iterative process	es_ES
dc.subject	Phonetic level	es_ES
dc.subject	Probability densities	es_ES
dc.subject	Speech signals	es_ES
dc.subject	Computer vision	es_ES
dc.subject	Estimation	es_ES
dc.subject	Image segmentation	es_ES
dc.subject	Probability distributions	es_ES
dc.subject.classification	LENGUAJES Y SISTEMAS INFORMATICOS	es_ES
dc.title	Improvements on automatic speech segmentation at the phonetic level	es_ES
dc.type	Book chapter	es_ES
dc.embargo.lift	10000-01-01
dc.embargo.terms	forever	es_ES
dc.identifier.doi	10.1007/978-3-642-25085-9_66
dc.relation.projectID	info:eu-repo/grantAgreement/MICINN//TIN2008-06856-C05-02/ES/SISTEMAS BASADOS EN LA INTERACCION ORAL DINAMICAMENTE MEJORABLES Y ADAPTABLES A NUEVOS CONTEXTOS/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Gómez Adrian, JA.; Calvo Lance, M. (2011). Improvements on automatic speech segmentation at the phonetic level. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Springer Verlag (Germany). 7042:557-564. https://doi.org/10.1007/978-3-642-25085-9_66	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.conferencename	16th Iberoamerican Congress, CIARP 2011	es_ES
dc.relation.conferencedate	November 15-18, 2011	es_ES
dc.relation.conferenceplace	Pucón, Chile	es_ES
dc.relation.publisherversion	http://link.springer.com/chapter/10.1007/978-3-642-25085-9_66	es_ES
dc.description.upvformatpinicio	557	es_ES
dc.description.upvformatpfin	564	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	7042	es_ES
dc.relation.senia	211330
dc.contributor.funder	Ministerio de Ciencia e Innovación
dc.description.references	Toledano, D.T., Hernández Gómez, L., Villarrubia Grande, L.: Automatic Phonetic Segmentation. IEEE Transactions on Speech and Audio Processing 11(6), 617–625 (2003)	es_ES
dc.description.references	Kipp, A., Wesenick, M.B., Schiel, F.: Pronunciation modelling applied to automatic segmentation of spontaneous speech. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 2013–2026 (1997)	es_ES
dc.description.references	Sethy, A., Narayanan, S.: Refined Speech Segmentation for Concatenative Speech Synthesis. In: Proceedings of ICSLP, Denver, Colorado, USA, pp. 149–152 (2002)	es_ES
dc.description.references	Jarifi, S., Pastor, D., Rosec, O.: Cooperation between global and local methods for the automatic segmentation of speech synthesis corpora. In: Proceedings of Interspeech, Pittsburgh, Pennsylvania, USA, pp. 1666–1669 (2006)	es_ES
dc.description.references	Romsdorfer, H., Pfister, B.: Phonetic Labeling and Segmentation of Mixed-Lingual Prosody Databases. In: Proceedings of Interspeech, Lisbon, Portugal, pp. 3281–3284 (2005)	es_ES
dc.description.references	Paulo, S., Oliveira, L.C.: DTW-based Phonetic Alignment Using Multiple Acoustic Features. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 309–312 (2003)	es_ES
dc.description.references	Park, S.S., Shin, J.W., Kim, N.S.: Automatic Speech Segmentation with Multiple Statistical Models. In: Proceedings of Interspeech, Pittsburgh, Pennsylvania, USA, pp. 2066–2069 (2006)	es_ES
dc.description.references	Mporas, I., Ganchev, T., Fakotakis, N.: Speech segmentation using regression fusion of boundary predictions. Computer Speech and Language 24, 273–288 (2010)	es_ES
dc.description.references	Povey, D., Woodland, P.C.: Minimum Phone Error and I-smoothing for improved discriminative training. In: Proceedings of ICASSP, Orlando, Florida, USA, pp. 105–108 (2002)	es_ES
dc.description.references	Kuo, J.W., Wang, H.M.: Minimum Boundary Error Training for Automatic Phonetic Segmentation. In: Proceedings of Interspeech, Pittsburgh, Pennsylvania, USA, pp. 1217–1220 (2006)	es_ES
dc.description.references	Huggins-Daines, D., Rudnicky, A.I.: A Constrained Baum-Welch Algorithm for Improved Phoneme Segmentation and Efficient Training. In: Proceedings of Interspeech, Pittsburgh, Pennsylvania, USA, pp. 1205–1208 (2006)	es_ES
dc.description.references	Ogbureke, K.U., Carson-Berndsen, J.: Improving initial boundary estimation for HMM-based automatic phonetic segmentation. In: Proceedings of Interspeech, Brighton, UK, pp. 884–887 (2009)	es_ES
dc.description.references	Gómez, J.A., Castro, M.J.: Automatic Segmentation of Speech at the Phonetic Level. In: Caelli, T.M., Amin, A., Duin, R.P.W., Kamel, M.S., de Ridder, D. (eds.) SPR 2002 and SSPR 2002. LNCS, vol. 2396, pp. 672–680. Springer, Heidelberg (2002)	es_ES
dc.description.references	Gómez, J.A., Sanchis, E., Castro-Bleda, M.J.: Automatic Speech Segmentation Based on Acoustical Clustering. In: Hancock, E.R., Wilson, R.C., Windeatt, T., Ulusoy, I., Escolano, F. (eds.) SSPR&SPR 2010. LNCS, vol. 6218, pp. 540–548. Springer, Heidelberg (2010)	es_ES
dc.description.references	Moreno, A., Poch, D., Bonafonte, A., Lleida, E., Llisterri, J., Mariño, J.B., Nadeu, C.: Albayzin Speech Database: Design of the Phonetic Corpus. In: Proceedings of Eurospeech, Berlin, Germany, vol. 1, pp. 653–656 (September 1993)	es_ES
dc.description.references	TIMIT Acoustic-Phonetic Continuous Speech Corpus, National Institute of Standards and Technology Speech Disc 1-1.1, NTIS Order No. PB91-5050651996 (October 1990)	es_ES
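
The abstract above describes two steps that are easy to illustrate: an initial segmentation that gives every phonetic unit the same number of frames, and a DTW pass that places the boundaries given the known phonetic sequence. The following is only a minimal sketch of those two ideas, not the authors' implementation. The function names, the `scores` matrix layout (one log-score per frame and per position in the phonetic sequence), and the constraint that every unit receives at least one frame are all assumptions made for illustration. In the paper the scores would come from the GMM class densities combined with the conditional probabilities; here they are simply taken as input.

```python
from typing import List


def uniform_segmentation(n_frames: int, n_units: int) -> List[int]:
    """Start frame of each unit when frames are split as evenly as possible.

    Hypothetical helper illustrating the equal-length initial segmentation.
    """
    return [(u * n_frames) // n_units for u in range(n_units)]


def align(scores: List[List[float]]) -> List[int]:
    """Monotonic DTW-style alignment of frames to a known unit sequence.

    scores[t][u] is an assumed log-score of frame t for the u-th unit of the
    sentence. Returns the start frame of each unit on the best path, where
    each unit gets at least one frame and units appear in the given order.
    """
    n_frames, n_units = len(scores), len(scores[0])
    if n_frames < n_units:
        raise ValueError("need at least one frame per unit")
    neg_inf = float("-inf")
    best = [[neg_inf] * n_units for _ in range(n_frames)]
    # advanced[t][u] is True when the best path enters unit u at frame t.
    advanced = [[False] * n_units for _ in range(n_frames)]
    best[0][0] = scores[0][0]
    for t in range(1, n_frames):
        for u in range(min(t, n_units - 1) + 1):
            stay = best[t - 1][u]
            move = best[t - 1][u - 1] if u > 0 else neg_inf
            if move > stay:
                best[t][u] = move + scores[t][u]
                advanced[t][u] = True
            else:
                best[t][u] = stay + scores[t][u]
    # Backtrack from the last frame of the last unit to recover boundaries.
    starts = [0] * n_units
    u = n_units - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][u]:
            starts[u] = t
            u -= 1
    return starts
```

In an iterative scheme like the one the abstract outlines, the boundaries returned by `align` would be used to re-estimate the conditional probabilities, and the alignment would then be repeated with the updated scores.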

This item appears in the following collection(s)

Articles, conferences, monographs [48344]
