García Gómez, JM.; Benedí Ruiz, JM.; Vicente Robledo, J.; Robles Viejo, M. (2005). Corpus based learning of stochastic, context-free grammars combined with Hidden Markov Models for tRNA modelling. International Journal of Bioinformatics Research and Applications. 1(3):305-318. doi:10.1504/IJBRA.2005.007908
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/45150
Title:
|
Corpus based learning of stochastic, context-free grammars combined with Hidden Markov Models for tRNA modelling
|
Author:
|
García Gómez, Juan Miguel
Benedí Ruiz, José Miguel
Vicente Robledo, Javier
Robles Viejo, Montserrat
|
UPV Unit:
|
Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
Universitat Politècnica de València. Departamento de Física Aplicada - Departament de Física Aplicada
Universitat Politècnica de València. Instituto Universitario de Aplicaciones de las Tecnologías de la Información - Institut Universitari d'Aplicacions de les Tecnologies de la Informació
|
Issued date:
|
|
Abstract:
|
[EN] In this paper, a new method for modelling tRNA secondary structures is presented. The method combines stochastic context-free grammars (SCFGs) and Hidden Markov Models (HMMs). HMMs are used to capture the local relations in the loops of the molecule (nonstructured regions), and SCFGs are used to capture the long-term relations between nucleotides of the arms (structured regions). Given annotated public databases, the HMM and SCFG models are learned by means of automatic inductive learning methods. Two SCFG learning methods have been explored. Both take advantage of the structural information associated with the training sequences: one is based on a stochastic version of the Sakakibara algorithm and the other on a corpus-based algorithm. A final model is then obtained by merging the HMM of the nonstructured regions and the SCFG of the structured regions. Finally, the experiments performed on the tRNA sequence corpus and the non-tRNA sequence corpus give significant results. Comparative experiments with another published method are also presented.
|
Subjects:
|
Hidden Markov Models (HMM), RNA, Secondary structure modelling, Language modelling, Grammatical inference, Stochastic context-free grammar (SCFG), Syntactic pattern recognition
|
Copyrights:
|
All rights reserved
|
Source:
|
International Journal of Bioinformatics Research and Applications (ISSN: 1744-5485)
|
DOI:
|
10.1504/IJBRA.2005.007908
|
Publisher:
|
Inderscience
|
Publisher version:
|
http://dx.doi.org/10.1504/IJBRA.2005.007908
|
Thanks:
|
We would like to thank Diego Linares and Joan Andreu Sanchez for answering all our questions about SCFG, as well as Satoshi Sekine for his evaluation software. We would also like to thank the Ministerio de Sanidad y Consumo of Spain for the grants to the INBIOMED consortium.
|
Type:
|
Article
|