Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition.

Tamura, Takeyuki; Akutsu, Tatsuya

Access count for this item: 214

http://hdl.handle.net/2433/159464

Files in this item:

File	Description	Size	Format
1471-2105-8-466.pdf		653.44 kB	Adobe PDF

Full metadata record

DC Field	Value	Language
dc.contributor.author	Tamura, Takeyuki	en
dc.contributor.author	Akutsu, Tatsuya	en
dc.contributor.alternative	田村, 武幸	ja
dc.date.accessioned	2012-10-03T01:09:47Z	-
dc.date.available	2012-10-03T01:09:47Z	-
dc.date.issued	2007-11-30	-
dc.identifier.issn	1471-2105	-
dc.identifier.uri	http://hdl.handle.net/2433/159464	-
dc.description.abstract	Background: Subcellular location prediction of proteins is an important and well-studied problem in bioinformatics. Given the amino acid sequence of a protein as input, the task is to predict which part of the cell the protein is transported to. The problem is becoming more important because subcellular location information helps in annotating proteins and genes, and the number of complete genomes is rapidly increasing. Since existing predictors are based on various heuristics, it is important to develop a simple method with high prediction accuracy. Results: In this paper, we propose a novel and general prediction method that combines sequence alignment techniques with feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. In fivefold cross-validation tests, the overall accuracy and average MCC were 0.9096 and 0.8655, respectively. We also applied our method to other datasets, including that of WoLF PSORT. Conclusion: Although there is a predictor that uses gene ontology information and yields higher accuracy than ours, our accuracies are higher than those of existing predictors that use only sequence information. Since information such as gene ontology can be obtained only for known proteins, our predictor is considered useful for subcellular location prediction of newly discovered proteins. Furthermore, the idea of combining alignment and amino acid frequency is novel and general, so it may be applied to other problems in bioinformatics. Our method for plant proteins is also implemented as a web system, available at http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.	en
dc.format.mimetype	application/pdf	-
dc.language.iso	eng	-
dc.publisher	BioMed Central Ltd.	en
dc.rights	© 2007 Tamura and Akutsu; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.	en
dc.subject.mesh	Algorithms	en
dc.subject.mesh	Amino Acid Sequence/physiology	en
dc.subject.mesh	Artificial Intelligence	en
dc.subject.mesh	Cluster Analysis	en
dc.subject.mesh	Computational Biology/methods	en
dc.subject.mesh	Databases, Protein	en
dc.subject.mesh	Internet	en
dc.subject.mesh	Intracellular Space/metabolism	en
dc.subject.mesh	Intracellular Space/ultrastructure	en
dc.subject.mesh	Models, Biological	en
dc.subject.mesh	Pattern Recognition, Automated/methods	en
dc.subject.mesh	Plant Proteins/metabolism	en
dc.subject.mesh	Plant Proteins/ultrastructure	en
dc.subject.mesh	Predictive Value of Tests	en
dc.subject.mesh	Protein Transport	en
dc.subject.mesh	Reproducibility of Results	en
dc.subject.mesh	Sequence Alignment/statistics & numerical data	en
dc.subject.mesh	Structure-Activity Relationship	en
dc.title	Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition.	en
dc.type	journal article	-
dc.type.niitype	Journal Article	-
dc.identifier.jtitle	BMC bioinformatics	en
dc.identifier.volume	8	-
dc.relation.doi	10.1186/1471-2105-8-466	-
dc.textversion	publisher	-
dc.identifier.artnum	466	-
dc.identifier.pmid	18047679	-
dcterms.accessRights	open access	-
Appears in Collections:	Journal Articles
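
The abstract names two ingredients that can be illustrated without further detail from the paper: feature vectors based on amino acid composition, and the Matthews correlation coefficient (MCC) used to report results. The following is a minimal sketch of both, not the authors' implementation. The function names, the fixed-length block splitting, and the `block_len` parameter are assumptions made for illustration; the abstract does not say how blocks are defined or how they are aligned, and no alignment step is shown here.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of seq.

    Residues outside the 20 standard letters (e.g. X) are ignored.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def block_compositions(seq, block_len=10):
    """Split seq into consecutive blocks and compute each block's composition.

    block_len is a hypothetical parameter; the paper's actual block
    definition is not given in the abstract.
    """
    return [composition(seq[i:i + block_len])
            for i in range(0, len(seq), block_len)]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for one class (one-vs-rest counts).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a multi-class setting such as the one the abstract describes, an "average MCC" would typically be obtained by computing `mcc` once per location class and averaging the values; whether the paper averages in exactly this way is an assumption.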


All items stored in this repository are protected by copyright.